Sequential Curriculum
The Sequential curriculum allows you to manually design a sequence of tasks and curricula to train on in order. It provides flexible stopping conditions for transitioning to the next curriculum, including episode count, step count, and mean return. This curriculum passes all update data to the current curriculum in the sequence and samples from it when generating tasks. Note that updates are not passed to inactive curricula in the sequence, so automatic methods will not get a head start on tracking relevant metrics before their stage begins.
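The forwarding behavior described above can be sketched with stand-in classes. This is a minimal illustration, not the Syllabus implementation: CountingCurriculum and ToySequence are hypothetical names, and the fixed episodes_per_stage rule stands in for the real stopping conditions.

```python
class CountingCurriculum:
    """Hypothetical stand-in curriculum that only counts the updates it receives."""

    def __init__(self, name):
        self.name = name
        self.episodes_seen = 0

    def update_on_episode(self, episode_return, length, task, progress, env_id=None):
        self.episodes_seen += 1

    def sample(self):
        return self.name


class ToySequence:
    """Forwards updates to the active stage and advances after a fixed episode count."""

    def __init__(self, stages, episodes_per_stage):
        self.stages = stages
        self.episodes_per_stage = episodes_per_stage
        self.index = 0
        self.stage_episodes = 0

    @property
    def current_curriculum(self):
        return self.stages[self.index]

    def update_on_episode(self, episode_return, length, task, progress, env_id=None):
        # Only the active stage sees the update; later stages stay untouched.
        self.current_curriculum.update_on_episode(
            episode_return, length, task, progress, env_id
        )
        self.stage_episodes += 1
        is_last_stage = self.index == len(self.stages) - 1
        if self.stage_episodes >= self.episodes_per_stage and not is_last_stage:
            self.index += 1
            self.stage_episodes = 0

    def sample(self):
        # Tasks always come from the active stage.
        return self.current_curriculum.sample()


stages = [CountingCurriculum("easy"), CountingCurriculum("hard")]
sequence = ToySequence(stages, episodes_per_stage=3)
for _ in range(5):
    sequence.update_on_episode(
        episode_return=1.0, length=10, task=sequence.sample(), progress=1.0
    )
```

In this sketch the second stage receives no updates until the first stage has finished, which mirrors the "no head start" note above.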
The items in the sequence can be any Curriculum object, including Constant for individual tasks. Sequential uses syntactic sugar to make it easier to define a sequence of curricula. It takes a list of curricula, curriculum_list, and a list of stopping conditions, stopping_conditions. The curriculum_list can contain any of the following objects:
A Curriculum object - will be directly added to the sequence.
A single task - will be wrapped in a Constant curriculum.
A list of tasks - will be wrapped in a DomainRandomization curriculum.
A TaskSpace object - will be wrapped in a DomainRandomization curriculum.
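The wrapping rules listed above amount to a type dispatch. The sketch below uses hypothetical Toy* stand-in classes so that it runs without Syllabus installed; the real constructors take additional arguments that are not shown here.

```python
class ToyCurriculum:
    """Stand-in base class (hypothetical, not the Syllabus Curriculum)."""


class ToyConstant(ToyCurriculum):
    def __init__(self, task):
        self.task = task


class ToyDomainRandomization(ToyCurriculum):
    def __init__(self, tasks):
        self.tasks = list(tasks)


class ToyTaskSpace:
    def __init__(self, tasks):
        self.tasks = list(tasks)


def wrap_item(item):
    """Turn one curriculum_list entry into a curriculum object."""
    if isinstance(item, ToyCurriculum):
        return item  # already a curriculum: used as-is
    if isinstance(item, ToyTaskSpace):
        return ToyDomainRandomization(item.tasks)  # sample over the whole space
    if isinstance(item, list):
        return ToyDomainRandomization(item)  # sample over the listed tasks
    return ToyConstant(item)  # a single task


curriculum_list = [ToyConstant(0), 1, [2, 3], ToyTaskSpace([4, 5, 6])]
sequence = [wrap_item(item) for item in curriculum_list]
kinds = [type(curriculum).__name__ for curriculum in sequence]
```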
Similarly, stopping conditions can be defined with a simple string format. These conditions are composed of metrics, comparison operators, the stopping value, and optional boolean operators to create composite conditions. The format supports the >, >=, =, <=, < comparison operators and the & and | boolean operators. The currently implemented metrics are:
“steps” - the number of steps taken in the environment during this stage of the sequential curriculum.
“total_steps” - the total number of steps taken in the environment during the entire sequential curriculum.
“episodes” - the number of episodes completed in the environment during this stage of the sequential curriculum.
“total_episodes” - the total number of episodes completed in the environment during the entire sequential curriculum.
“tasks” - the number of tasks completed in the environment during this stage of the sequential curriculum.
“total_tasks” - the total number of tasks completed in the environment during the entire sequential curriculum.
“episode_return” - the mean return of the environment during this stage of the sequential curriculum.
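To make the string format concrete, here is one way such conditions could be parsed and evaluated. It is a sketch under stated assumptions, not the library's parser: parse_condition is a hypothetical helper, & is assumed to bind tighter than |, and the metric values are supplied as a plain dict.

```python
import operator
import re

_COMPARATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
# Two-character operators come first in the alternation so that ">=" is not
# read as ">" followed by a stray "=".
_CLAUSE = re.compile(r"^\s*(\w+)\s*(>=|<=|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_condition(condition):
    """Return a function that maps a metrics dict to True or False."""
    # Assumption: "|" is split first, so "&" binds tighter than "|".
    if "|" in condition:
        parts = [parse_condition(part) for part in condition.split("|")]
        return lambda metrics: any(part(metrics) for part in parts)
    if "&" in condition:
        parts = [parse_condition(part) for part in condition.split("&")]
        return lambda metrics: all(part(metrics) for part in parts)
    match = _CLAUSE.match(condition)
    if match is None:
        raise ValueError(f"Cannot parse stopping condition: {condition!r}")
    metric = match.group(1)
    compare = _COMPARATORS[match.group(2)]
    value = float(match.group(3))
    return lambda metrics: compare(metrics[metric], value)


should_stop = parse_condition(
    "episode_return>=0.9&episodes>=5000|total_steps>=1000000"
)
early = should_stop({"episode_return": 0.95, "episodes": 100, "total_steps": 200})
solved = should_stop({"episode_return": 0.95, "episodes": 6000, "total_steps": 200})
timeout = should_stop({"episode_return": 0.1, "episodes": 100, "total_steps": 2000000})
```

With these inputs, the stage would end either once the return threshold and the episode minimum are both met, or once the overall step budget is spent.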
During logging, the Sequential curriculum will accurately set the probability of tasks outside the current stage’s task space to 0.
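As an illustration of what the logged distribution looks like, the sketch below expands a stage's distribution over the full task list. stage_distribution is a hypothetical helper, and it assumes uniform sampling within the stage (as a domain-randomization stage would do) and that every stage task appears in the full task list.

```python
def stage_distribution(all_tasks, stage_tasks):
    """Probability of each task in the full space while one stage is active."""
    stage = set(stage_tasks)
    if not stage:
        raise ValueError("The active stage must contain at least one task.")
    weight = 1.0 / len(stage)
    # Tasks outside the active stage are reported with probability 0.
    return {task: (weight if task in stage else 0.0) for task in all_tasks}


probs = stage_distribution(all_tasks=["a", "b", "c", "d"], stage_tasks=["b", "c"])
```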
Sequential
- class syllabus.curricula.sequential.SequentialCurriculum(curriculum_list: List[Curriculum], stopping_conditions: List[Any], *curriculum_args, return_buffer_size: int = 1000, **curriculum_kwargs)
Bases: Curriculum
Curriculum that iterates through a list of curricula based on stopping conditions.
- property current_curriculum
- log_metrics(writer, logs, step=None, log_n_tasks=1)
Log the task distribution to the provided writer.
- Parameters:
writer – Tensorboard summary writer or wandb object
logs – Cumulative list of logs to write
step – Global step number
log_n_tasks – Maximum number of tasks to log, defaults to 1. Use -1 to log all tasks.
- Returns:
Updated logs list
- property requires_step_updates
Returns whether the curriculum requires step updates from the environment.
- Returns:
True if the curriculum requires step updates, False otherwise
- update_on_episode(episode_return, length, task, progress, env_id=None)
Update the curriculum with episode results from the environment.
- Parameters:
episode_return – Episodic return
length – Length of the episode
task – Task for which the episode was completed
progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.
env_id – Environment identifier
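The episode updates feed the stage-level and total counters used by the stopping conditions, as well as the mean return. The sketch below shows one plausible bookkeeping scheme. ReturnTracker is a hypothetical class, and it assumes that return_buffer_size bounds a rolling window of recent returns and that the window is cleared when a new stage starts; the source confirms neither detail.

```python
from collections import deque


class ReturnTracker:
    """Hypothetical bookkeeping for episode counts and a rolling mean return."""

    def __init__(self, return_buffer_size=1000):
        # Assumption: only the most recent returns contribute to the mean.
        self.returns = deque(maxlen=return_buffer_size)
        self.episodes = 0        # episodes in the current stage
        self.total_episodes = 0  # episodes across all stages

    def record(self, episode_return):
        self.returns.append(episode_return)
        self.episodes += 1
        self.total_episodes += 1

    def mean_return(self):
        return sum(self.returns) / len(self.returns) if self.returns else 0.0

    def start_next_stage(self):
        # Assumption: stage-level metrics restart, totals keep accumulating.
        self.returns.clear()
        self.episodes = 0


tracker = ReturnTracker(return_buffer_size=3)
for episode_return in [0.0, 1.0, 2.0, 3.0]:
    tracker.record(episode_return)
mean_before = tracker.mean_return()  # mean of the last three returns
episodes_before = tracker.episodes
tracker.start_next_stage()
```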
- update_on_step(task, obs, rew, term, trunc, info, progress, env_id=None)
Update the curriculum with the current step results from the environment.
- Parameters:
task – Task for which the step was taken
obs – Observation from the environment
rew – Reward from the environment
term – True if the episode ended on this step, False otherwise
trunc – True if the episode was truncated on this step, False otherwise
info – Extra information from the environment
progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.
env_id – Environment identifier
- Raises:
NotImplementedError –
- update_on_step_batch(step_results, env_id=None)
Update the curriculum with a batch of step results from the environment.
This method can be overridden to provide a more efficient implementation. It is used as a convenience function and to optimize the multiprocessing message passing throughput.
- Parameters:
step_results – List of step results
env_id – Environment identifier
- update_task_progress(task, progress, env_id=None)
Update the curriculum with a task and its progress. This is used for binary tasks that can be completed mid-episode.
- Parameters:
task – Task for which progress is being updated.
progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.
env_id – Environment identifier