Stat Recorder
The stat recorder is a utility class that records the episode return and length for each task. It can be used for logging, to track metrics for a custom curriculum, or for per-task reward normalization. If you pass record_stats=True to your curriculum, a StatRecorder will be created automatically. Each update_on_step call will also be passed to the StatRecorder, and each log_metrics call will also log per-task metrics from the StatRecorder.
If you want to use the stat recorder to normalize rewards for each task, you can call the normalize method on the StatRecorder. This is particularly useful if you have a curriculum over reward functions or environment dynamics, where rewards are easier to obtain in some tasks than in others. Below is an example of how you might normalize rewards, using a single environment for simplicity:
import gym
from syllabus.core import StatRecorder
from syllabus.curricula import DomainRandomization
from syllabus.task_space import DiscreteTaskSpace

task_space = DiscreteTaskSpace(10)
curriculum = DomainRandomization(task_space, record_stats=True)
env = gym.make('procgen:procgen-coinrun-v0')

# `agent` is assumed to be defined elsewhere and to expose an act(obs) method.
obs, info = env.reset()
episode_return = 0
episode_length = 0
while True:
    action = agent.act(obs)
    obs, reward, term, trunc, info = env.step(action)
    episode_return += reward
    episode_length += 1
    # Use normalized_reward in place of reward when training the agent.
    normalized_reward = curriculum.stat_recorder.normalize(reward, info["task"])
    if term or trunc:
        # Report the finished episode so the per-task statistics are updated.
        curriculum.update_on_episode(episode_return, episode_length, info["task"], 0.0)
        obs, info = env.reset()
        episode_return = 0
        episode_length = 0
Stat Recorder
- class syllabus.core.stat_recorder.StatMean(n: int = 0, mu: float = 0, m2: float = 0)
Bases: object
- m2: float = 0
- mu: float = 0
- n: int = 0
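The field names n, mu, and m2 match the quantities tracked by Welford's online algorithm for a running mean and variance. Whether StatMean is updated this way is not stated here, so the following is only a minimal sketch under that assumption. RunningStat and its update method are hypothetical names and are not part of Syllabus.

```python
from dataclasses import dataclass
import math


@dataclass
class RunningStat:
    """Hypothetical stand-in that mirrors StatMean's fields (n, mu, m2)."""
    n: int = 0        # number of samples seen so far
    mu: float = 0.0   # running mean
    m2: float = 0.0   # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        # Welford's online update: no samples are stored.
        self.n += 1
        delta = x - self.mu
        self.mu += delta / self.n
        self.m2 += delta * (x - self.mu)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 before any sample has been seen.
        return self.m2 / self.n if self.n > 0 else 0.0

    @property
    def std(self) -> float:
        return math.sqrt(self.variance)


stat = RunningStat()
for episode_return in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stat.update(episode_return)
```

Tracking only these three numbers keeps memory use constant per task, no matter how many episodes are recorded.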
- class syllabus.core.stat_recorder.StatRecorder(task_space: TaskSpace, calc_past_n=None, task_names=None)
Bases: object
Individual statistics tracking for each task.
- get_metrics(log_n_tasks=1)
Get the statistics of the first log_n_tasks tasks for logging.
- Parameters:
log_n_tasks – Number of tasks to log statistics for. Use -1 to log all tasks.
- normalize(reward, task)
Normalize reward by task.
- Parameters:
reward – Reward to normalize
task – Task to normalize reward by
- record(episode_return: float, episode_length: int, episode_task, env_id=None)
Record the length and return of an episode for a given task.
- Parameters:
episode_return – Total return for the episode
episode_length – Length of the episode, i.e. the total number of steps taken during the episode
episode_task – Identifier for the task
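To illustrate how record and normalize might fit together, here is a minimal, self-contained sketch of a per-task recorder. It is not the Syllabus implementation. ToyTaskRecorder is a hypothetical class, the interpretation of calc_past_n as a window over the most recent episodes is an assumption, and so is the normalization scheme (dividing a reward by the standard deviation of the task's recorded returns).

```python
from collections import defaultdict, deque
from statistics import pstdev


class ToyTaskRecorder:
    """Hypothetical per-task recorder; not the Syllabus implementation."""

    def __init__(self, calc_past_n=None):
        # deque(maxlen=None) is unbounded; otherwise only the last
        # calc_past_n episodes per task are kept (assumed meaning).
        self.returns = defaultdict(lambda: deque(maxlen=calc_past_n))
        self.lengths = defaultdict(lambda: deque(maxlen=calc_past_n))

    def record(self, episode_return, episode_length, episode_task):
        self.returns[episode_task].append(episode_return)
        self.lengths[episode_task].append(episode_length)

    def normalize(self, reward, task, eps=1e-8):
        # Assumed scheme: scale by the std of this task's recorded returns.
        history = self.returns[task]
        if len(history) < 2:
            return reward  # not enough data to estimate a scale
        scale = pstdev(history)
        if scale < eps:
            return reward  # all recorded returns are (nearly) identical
        return reward / scale


recorder = ToyTaskRecorder(calc_past_n=3)
# Task 0 yields returns ten times larger than task 1.
for ret in [10.0, 20.0, 30.0, 40.0]:
    recorder.record(ret, 100, 0)
for ret in [1.0, 2.0, 3.0]:
    recorder.record(ret, 100, 1)

scaled_task0 = recorder.normalize(10.0, 0)
scaled_task1 = recorder.normalize(1.0, 1)
```

In this toy setup the two tasks differ in reward scale by a factor of ten, yet the normalized values come out the same, which is the effect per-task normalization is meant to achieve.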