Stat Recorder

The stat recorder is a utility class that records the episode return and episode length for each task. It can be used for logging, to track metrics for a custom curriculum, or for per-task reward normalization. If you pass record_stats=True to your curriculum, a StatRecorder will automatically be created. Each update_on_step call is then also passed to the StatRecorder, and each log_metrics call also logs per-task metrics from the StatRecorder.

To normalize rewards for each task, call the normalize method on the StatRecorder. This is particularly useful if you have a curriculum over reward functions or environment dynamics, where reward is easier to obtain in some tasks than in others. The example below normalizes rewards using a single environment for simplicity:

import gym
from syllabus.core import StatRecorder
from syllabus.curricula import DomainRandomization
from syllabus.task_space import DiscreteTaskSpace

task_space = DiscreteTaskSpace(10)
# record_stats=True makes the curriculum create its own StatRecorder.
curriculum = DomainRandomization(task_space, record_stats=True)

env = gym.make('procgen:procgen-coinrun-v0')
obs, info = env.reset()
episode_return = 0
episode_length = 0

# `agent` is assumed to be defined elsewhere and to expose an act(obs) method.
while True:
    action = agent.act(obs)
    obs, reward, term, trunc, info = env.step(action)
    episode_return += reward
    episode_length += 1
    # Train on normalized_reward; the raw reward still feeds the recorded return.
    normalized_reward = curriculum.stat_recorder.normalize(reward, info['task'])

    if term or trunc:
        curriculum.update_on_episode(episode_return, episode_length, info["task"], 0.0)
        obs, info = env.reset()
        episode_return = 0
        episode_length = 0

Stat Recorder

class syllabus.core.stat_recorder.StatMean(n: int = 0, mu: float = 0, m2: float = 0)

Bases: object

m2: float = 0

mean()

mu: float = 0

n: int = 0

reset()

result()

std()
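
The field names n, mu, and m2 suggest a running mean and variance in the style of Welford's online algorithm. The sketch below illustrates that technique under this assumption; it is not taken from the library source, and RunningStat and its update method are hypothetical names.

```python
import math
from dataclasses import dataclass


@dataclass
class RunningStat:
    """Hypothetical Welford-style accumulator (not StatMean itself)."""

    n: int = 0        # number of samples seen so far
    mu: float = 0.0   # running mean
    m2: float = 0.0   # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Welford's update: adjust the mean first, then accumulate m2
        # using the deviation from both the old and the new mean.
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)

    def mean(self) -> float:
        return self.mu

    def std(self) -> float:
        # Population standard deviation; 0.0 before any sample arrives.
        return math.sqrt(self.m2 / self.n) if self.n > 0 else 0.0


stat = RunningStat()
for episode_return in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stat.update(episode_return)
```

An accumulator of this kind needs only three numbers per task, so the statistics can be updated after every episode without storing the episode history.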

class syllabus.core.stat_recorder.StatRecorder(task_space: TaskSpace, calc_past_n=None, task_names=None)

Bases: object

Individual statistics tracking for each task.

get_metrics(log_n_tasks=1)

Log the statistics of the first log_n_tasks tasks (1 by default) to the provided tensorboard writer.

Parameters:

writer – Tensorboard summary writer.
log_n_tasks – Number of tasks to log statistics for. Use -1 to log all tasks.

normalize(reward, task)

Normalize reward by task.

Parameters:

reward – Reward to normalize
task – Task to normalize reward by
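
The reference above does not state which formula normalize applies. One common choice is to standardize each reward with the mean and standard deviation recorded for its task. The sketch below assumes that choice; normalize_by_task, task_stats, and eps are hypothetical names, and the statistics are made up for illustration.

```python
from typing import Dict, Hashable, Tuple


def normalize_by_task(
    reward: float,
    task: Hashable,
    task_stats: Dict[Hashable, Tuple[float, float]],
    eps: float = 1e-8,
) -> float:
    """Hypothetical helper: standardize a reward with its task's (mean, std)."""
    if task not in task_stats:
        # No statistics recorded for this task yet, so leave the reward unchanged.
        return reward
    mean, std = task_stats[task]
    # eps guards against division by zero when a task has zero variance.
    return (reward - mean) / (std + eps)


# Made-up per-task statistics: task 0 pays small rewards, task 1 large ones.
task_stats = {0: (1.0, 0.5), 1: (100.0, 50.0)}
easy = normalize_by_task(1.5, 0, task_stats)
hard = normalize_by_task(150.0, 1, task_stats)
```

Both rewards sit one standard deviation above their task's mean, so they map to roughly the same normalized value even though their raw scales differ by a factor of 100.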

record(episode_return: float, episode_length: int, episode_task, env_id=None)

Record the length and return of an episode for a given task.

Parameters:

episode_return – Total return for the episode
episode_length – Length of the episode, i.e. the total number of steps taken during the episode
episode_task – Identifier for the task
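
The calc_past_n constructor argument suggests that statistics may be computed over a window of recent episodes, although the reference does not spell this out. The sketch below shows one way per-task recording with such a window could work; WindowedTaskStats and its methods are hypothetical and do not reproduce the StatRecorder implementation.

```python
from collections import defaultdict, deque
from statistics import fmean
from typing import Optional


class WindowedTaskStats:
    """Hypothetical per-task tracker; not the StatRecorder implementation."""

    def __init__(self, calc_past_n: Optional[int] = None):
        # deque(maxlen=None) is unbounded, so None keeps every episode.
        self.returns = defaultdict(lambda: deque(maxlen=calc_past_n))
        self.lengths = defaultdict(lambda: deque(maxlen=calc_past_n))

    def record(self, episode_return: float, episode_length: int, episode_task) -> None:
        # Once a task's window is full, the oldest episode is dropped automatically.
        self.returns[episode_task].append(episode_return)
        self.lengths[episode_task].append(episode_length)

    def mean_return(self, task) -> float:
        return fmean(self.returns[task])

    def mean_length(self, task) -> float:
        return fmean(self.lengths[task])


stats = WindowedTaskStats(calc_past_n=2)
for ret, length in [(1.0, 10), (3.0, 20), (5.0, 30)]:
    stats.record(ret, length, episode_task=0)
stats.record(10.0, 100, episode_task=1)
```

With a window of two episodes, the first episode recorded for task 0 no longer contributes to that task's averages, and task 1 is tracked independently.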

save_statistics(output_path)

Write task-specific statistics to file.

Parameters:

output_path – Path to save the statistics file.