Sequential Curriculum

The Sequential curriculum lets you manually design a sequence of tasks and curricula to train on in order. It provides flexible stopping conditions for transitioning to the next curriculum, including episode count, step count, and mean return. It passes all update data to the current curriculum in the sequence and samples from it when generating tasks. Note that updates are not passed to inactive curricula in the sequence, so automatic methods in later stages will not get a head start on tracking relevant metrics.
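The behavior described above can be illustrated with a minimal, self-contained sketch. ToyStage and ToySequence are hypothetical names invented for this example and are not part of Syllabus. The sketch also assumes one stopping condition per transition, which the text above does not confirm.

```python
import random
from typing import Callable, List, Sequence


class ToyStage:
    """Hypothetical stand-in for one curriculum in the sequence."""

    def __init__(self, tasks: Sequence[int]):
        self.tasks = list(tasks)
        self.episodes_seen = 0  # only grows while this stage is active

    def sample(self) -> int:
        return random.choice(self.tasks)

    def update_on_episode(self, episode_return: float) -> None:
        self.episodes_seen += 1


class ToySequence:
    """Runs stages in order and advances when the active stage's condition holds."""

    def __init__(
        self,
        stages: List[ToyStage],
        should_stop: List[Callable[[ToyStage], bool]],
    ):
        # Assumption of this sketch: one condition per transition,
        # so the final stage has no stopping condition.
        assert len(should_stop) == len(stages) - 1
        self.stages = stages
        self.should_stop = should_stop
        self.index = 0

    @property
    def current(self) -> ToyStage:
        return self.stages[self.index]

    def sample(self) -> int:
        # Tasks always come from the active stage.
        return self.current.sample()

    def update_on_episode(self, episode_return: float) -> None:
        # Only the active stage receives the update; later stages see nothing.
        self.current.update_on_episode(episode_return)
        has_next = self.index < len(self.should_stop)
        if has_next and self.should_stop[self.index](self.current):
            self.index += 1


seq = ToySequence(
    stages=[ToyStage([0, 1]), ToyStage([2, 3, 4])],
    should_stop=[lambda stage: stage.episodes_seen >= 3],
)
for _ in range(3):
    assert seq.sample() in (0, 1)
    seq.update_on_episode(episode_return=1.0)
```

After the third episode the toy sequence moves to the second stage, whose counter is still zero because it received no updates while inactive.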

The items in the sequence can be any Curriculum object, including Constant for individual tasks. Sequential uses syntactic sugar to make it easier to define a sequence of curricula. It takes a list of curricula, curriculum_list, and a list of stopping conditions, stopping_conditions. The curriculum_list can contain any of the following objects:

  • A Curriculum object - will be directly added to the sequence.

  • A single task - will be wrapped in a Constant curriculum.

  • A list of tasks - will be wrapped in a DomainRandomization curriculum.

  • A TaskSpace object - will be wrapped in a DomainRandomization curriculum.
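The wrapping rules above can be sketched as a single dispatch function. Every class in this block is a simplified, hypothetical stand-in defined only for the example. The real Syllabus classes have their own constructors, and wrap_item is not a library function.

```python
from typing import Any, List


class Curriculum:
    """Hypothetical minimal base class standing in for the library's Curriculum."""


class Constant(Curriculum):
    def __init__(self, task: Any):
        self.task = task


class DomainRandomization(Curriculum):
    def __init__(self, tasks: List[Any]):
        self.tasks = list(tasks)


class TaskSpace:
    """Hypothetical stand-in that only stores the tasks it contains."""

    def __init__(self, tasks: List[Any]):
        self.tasks = list(tasks)


def wrap_item(item: Any) -> Curriculum:
    """Apply the four wrapping rules to one entry of curriculum_list."""
    if isinstance(item, Curriculum):
        return item                              # already a curriculum: used as-is
    if isinstance(item, TaskSpace):
        return DomainRandomization(item.tasks)   # task space -> DomainRandomization
    if isinstance(item, list):
        return DomainRandomization(item)         # list of tasks -> DomainRandomization
    return Constant(item)                        # anything else: a single task


existing = DomainRandomization([7, 8])
sequence = [wrap_item(x) for x in [existing, 3, [0, 1, 2], TaskSpace([4, 5])]]
kinds = [type(c).__name__ for c in sequence]
```

Checking for an existing Curriculum first keeps user-built curricula untouched, and treating everything that is not a list or task space as a single task is a choice made for this sketch.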

Similarly, stopping conditions can be defined with a simple string format. These conditions are composed of metrics, comparison operators, the stopping value, and optional boolean operators to create composite conditions. The format supports the >, >=, =, <=, < comparison operators and the & and | boolean operators. The currently implemented metrics are:

  • “steps” - the number of steps taken in the environment during this stage of the sequential curriculum.

  • “total_steps” - the total number of steps taken in the environment during the entire sequential curriculum.

  • “episodes” - the number of episodes completed in the environment during this stage of the sequential curriculum.

  • “total_episodes” - the total number of episodes completed in the environment during the entire sequential curriculum.

  • “tasks” - the number of tasks completed in the environment during this stage of the sequential curriculum.

  • “total_tasks” - the total number of tasks completed in the environment during the entire sequential curriculum.

  • “episode_return” - the mean return of the environment during this stage of the sequential curriculum.
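To make the string format concrete, here is a small illustrative parser. It is not the Syllabus implementation. The names parse_clause and parse_condition are hypothetical, and the precedence rule (& binds tighter than |, no parentheses) is an assumption of this sketch.

```python
import operator
import re
from typing import Callable, Dict

Metrics = Dict[str, float]

_COMPARE = {
    ">=": operator.ge,
    "<=": operator.le,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
}
# Two-character operators are listed first so ">=" is not read as ">" then "=".
_CLAUSE = re.compile(r"^\s*([a-z_]+)\s*(>=|<=|>|<|=)\s*(-?\d+(?:\.\d+)?)\s*$")


def parse_clause(text: str) -> Callable[[Metrics], bool]:
    """Turn one 'metric<op>value' clause into a predicate over a metrics dict."""
    match = _CLAUSE.match(text)
    if match is None:
        raise ValueError(f"Cannot parse stopping clause: {text!r}")
    name = match.group(1)
    compare = _COMPARE[match.group(2)]
    value = float(match.group(3))
    return lambda metrics: compare(metrics[name], value)


def parse_condition(text: str) -> Callable[[Metrics], bool]:
    """Assumed precedence for this sketch: '&' binds tighter than '|'."""
    any_of = [
        [parse_clause(clause) for clause in group.split("&")]
        for group in text.split("|")
    ]
    return lambda metrics: any(all(c(metrics) for c in group) for group in any_of)


should_stop = parse_condition("episode_return>=0.9&episodes>=100|steps>=5000")
early = {"steps": 1200, "episodes": 40, "episode_return": 0.95}
solved = {"steps": 3000, "episodes": 150, "episode_return": 0.92}
timeout = {"steps": 5000, "episodes": 90, "episode_return": 0.10}
```

With this condition, the early snapshot does not trigger a transition, while solved (high return over enough episodes) and timeout (step budget reached) both do.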

During logging, the Sequential curriculum will accurately set the probability of tasks outside the current stage’s task space to 0.
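As a rough illustration of that logging behavior, the sketch below expands a stage's sampling probabilities to the full task space and fills in 0 for every task the active stage cannot produce. The function name and the dictionary-based representation are assumptions made for this example, not the library's internals.

```python
from typing import Dict, Hashable, List, Sequence


def full_space_distribution(
    all_tasks: Sequence[Hashable],
    stage_probs: Dict[Hashable, float],
) -> List[float]:
    """Expand a stage's task probabilities to the full task space.

    Tasks that the active stage cannot sample get probability 0.
    """
    return [stage_probs.get(task, 0.0) for task in all_tasks]


all_tasks = ["easy", "medium", "hard", "expert"]
# Hypothetical active stage that samples two of the four tasks equally.
stage_probs = {"easy": 0.5, "medium": 0.5}
logged = full_space_distribution(all_tasks, stage_probs)
```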

Sequential

class syllabus.curricula.sequential.SequentialCurriculum(curriculum_list: List[Curriculum], stopping_conditions: List[Any], *curriculum_args, return_buffer_size: int = 1000, **curriculum_kwargs)

Bases: Curriculum

Curriculum that iterates through a list of curricula based on stopping conditions.

check_stopping_conditions()

property current_curriculum

log_metrics(writer, logs, step=None, log_n_tasks=1)

Log the task distribution to the provided writer.

Parameters:
  • writer – Tensorboard summary writer or wandb object

  • logs – Cumulative list of logs to write

  • step – Global step number

  • log_n_tasks – Maximum number of tasks to log, defaults to 1. Use -1 to log all tasks.

Returns:

Updated logs list

property requires_step_updates

Returns whether the curriculum requires step updates from the environment.

Returns:

True if the curriculum requires step updates, False otherwise

sample(k: int = 1) → List | Any

Choose the next k tasks from the list.

update_on_episode(episode_return, length, task, progress, env_id=None)

Update the curriculum with episode results from the environment.

Parameters:
  • episode_return – Episodic return

  • length – Length of the episode

  • task – Task for which the episode was completed

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.

  • env_id – Environment identifier
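One plausible way episode returns could feed the “episode_return” metric is a bounded window of recent returns. The sketch below assumes that return_buffer_size caps such a window, which the documentation above does not state explicitly. ReturnWindow is a hypothetical helper, not a Syllabus class.

```python
from collections import deque


class ReturnWindow:
    """Keeps the most recent episode returns and reports their mean."""

    def __init__(self, return_buffer_size: int = 1000):
        # A deque with maxlen silently drops the oldest entry once it is full.
        self._returns = deque(maxlen=return_buffer_size)

    def add(self, episode_return: float) -> None:
        self._returns.append(episode_return)

    def mean(self) -> float:
        # Report 0.0 before any episode has finished (a choice of this sketch).
        if not self._returns:
            return 0.0
        return sum(self._returns) / len(self._returns)


window = ReturnWindow(return_buffer_size=3)
for episode_return in [0.0, 0.0, 1.0, 1.0, 1.0]:
    window.add(episode_return)
recent_mean = window.mean()
```

Because only the last three returns are kept in this example, the two early zero-return episodes no longer affect the mean.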

update_on_step(task, obs, rew, term, trunc, info, progress, env_id=None)

Update the curriculum with the current step results from the environment.

Parameters:
  • task – Task active on this step

  • obs – Observation from the environment

  • rew – Reward from the environment

  • term – True if the episode ended on this step, False otherwise

  • trunc – True if the episode was truncated on this step, False otherwise

  • info – Extra information from the environment

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.

  • env_id – Environment identifier

Raises:

NotImplementedError

update_on_step_batch(step_results, env_id=None)

Update the curriculum with a batch of step results from the environment.

This method can be overridden to provide a more efficient implementation. It is used as a convenience function and to optimize the multiprocessing message passing throughput.

Parameters:
  • step_results – List of step results

  • env_id – Environment identifier
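A simple fallback for a batched update is to loop over the batch and reuse the single-step update. The sketch below assumes each entry of step_results is a tuple in the same argument order as update_on_step. That layout is an assumption of this example, and StepCounter is a hypothetical class rather than part of Syllabus.

```python
from typing import Any, List, Optional, Tuple

# Assumed layout of one step result: (task, obs, rew, term, trunc, info, progress)
StepResult = Tuple[Any, Any, float, bool, bool, dict, float]


class StepCounter:
    """Hypothetical curriculum fragment that only tracks step statistics."""

    def __init__(self) -> None:
        self.steps = 0
        self.reward_sum = 0.0

    def update_on_step(self, task, obs, rew, term, trunc, info, progress, env_id=None):
        self.steps += 1
        self.reward_sum += rew

    def update_on_step_batch(
        self, step_results: List[StepResult], env_id: Optional[int] = None
    ) -> None:
        # Fallback path: unpack each result and reuse the single-step update.
        for step in step_results:
            self.update_on_step(*step, env_id=env_id)


counter = StepCounter()
batch = [
    ("task_a", None, 0.5, False, False, {}, 0.0),
    ("task_a", None, 1.0, True, False, {}, 1.0),
]
counter.update_on_step_batch(batch, env_id=0)
```

A subclass that can process the whole batch at once, for example with vectorized operations, could override update_on_step_batch and skip the per-step loop.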

update_task_progress(task, progress, env_id=None)

Update the curriculum with a task and its progress. This is used for binary tasks that can be completed mid-episode.

Parameters:
  • task – Task for which progress is being updated.

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a complete task.

  • env_id – Environment identifier