Curriculum#

Syllabus’s Curriculum API is a unified interface for curriculum learning methods. Curricula following this API can be used with all of Syllabus’s infrastructure. We hope that future curriculum learning research will provide implementations following this API to encourage reproducibility and ease of use.

The full documentation for the Curriculum class can be found below.

The Curriculum class has three main jobs:

  • Maintain a sampling distribution over the task space.

  • Incorporate feedback from the environments or training process to update the sampling distribution.

  • Provide a sampling interface for the environment to draw tasks from.

In practice, the sampling distribution can be whatever the curriculum learning method requires, such as a uniform distribution, a deterministic sequence of tasks, or a single constant task.
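For instance, a curriculum that samples uniformly over a countable task space only needs a sampling rule. The sketch below is illustrative, not part of Syllabus: the class name is made up, the import path follows the module shown below, and numpy is used only for random selection.

    import numpy as np

    from syllabus.core.curriculum_base import Curriculum


    class UniformCurriculum(Curriculum):
        """Toy curriculum: sample uniformly and ignore feedback."""

        def sample(self, k: int = 1):
            # Draw k task indices uniformly over the countable task space.
            indices = np.random.randint(self.num_tasks, size=k)
            sampled = [self.tasks[i] for i in indices]
            return sampled if k > 1 else sampled[0]

        def update_on_episode(self, episode_return, length, task, progress, env_id=None):
            # A real curriculum would use this feedback to reshape its
            # sampling distribution; the uniform curriculum ignores it.
            pass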

To incorporate feedback from the environments or training process, the API provides several update methods:

Curriculum#

class syllabus.core.curriculum_base.Curriculum(task_space: TaskSpace, random_start_tasks: int = 0, task_names: Callable | None = None, record_stats: bool = False)[source]#

Bases: object

Base class and API for defining curricula to interface with Gym environments.

add_agent(agent: Agent)[source]#

Add an agent to the curriculum.

Parameters:

agent – Agent to add to the curriculum

Return agent_id:

Identifier of the added agent

get_agent(agent_id: int) Agent[source]#

Load an agent from the buffer of saved agents.

Parameters:

agent_id – Identifier of the agent to load

Returns:

Loaded agent
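Together, add_agent and get_agent let a curriculum keep a buffer of agent snapshots, which can be useful in self-play-style curricula. A usage sketch with placeholder objects:

    # Hypothetical usage: `curriculum` is any Curriculum instance and
    # `snapshot` is whatever Agent object the training code produces.
    agent_id = curriculum.add_agent(snapshot)

    # Later, e.g. when an environment needs a previously stored policy:
    opponent = curriculum.get_agent(agent_id)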

log_metrics(writer, logs: List[Dict], step: int | None = None, log_n_tasks: int = 1)[source]#

Log the task distribution to the provided writer.

Parameters:
  • writer – Tensorboard summary writer or wandb object

  • logs – Cumulative list of logs to write

  • step – Global step number

  • log_n_tasks – Maximum number of tasks to log, defaults to 1. Use -1 to log all tasks.

Returns:

Updated logs list
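A possible logging loop, assuming a TensorBoard SummaryWriter; the run directory, logging interval, and log_n_tasks value are arbitrary choices, and `curriculum` stands in for any Curriculum instance.

    from torch.utils.tensorboard import SummaryWriter

    writer = SummaryWriter("runs/curriculum_demo")  # illustrative run directory
    logs = []  # cumulative list of log dicts, passed back in on each call

    for step in range(10_000):
        # ... training and curriculum updates happen here ...
        if step % 100 == 0:
            # `curriculum` is a placeholder for any Curriculum instance.
            logs = curriculum.log_metrics(writer, logs, step=step, log_n_tasks=5)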

normalize(reward: float, task: Any) float[source]#

Normalize reward by task.

Parameters:
  • reward – Reward to normalize

  • task – Task for which the reward was received

Returns:

Normalized reward

property num_tasks: int#

Counts the number of tasks in the task space.

Returns:

Number of tasks in the task space if it is countable, -1 otherwise

property requires_step_updates: bool#

Returns whether the curriculum requires step updates from the environment.

Returns:

True if the curriculum requires step updates, False otherwise

sample(k: int = 1) List | Any[source]#

Sample k tasks from the curriculum.

Parameters:

k – Number of tasks to sample, defaults to 1

Returns:

A single task if k=1, otherwise a list of k tasks
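For example (with `curriculum` standing in for any Curriculum instance):

    task = curriculum.sample()      # k=1: the task itself is returned
    tasks = curriculum.sample(k=8)  # k>1: a list of 8 tasks, e.g. one per
                                    # parallel environment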

property tasks: List[tuple]#

List all of the tasks in the task space.

Returns:

List of tasks if the task space is enumerable, otherwise an empty list

update_on_episode(episode_return: float, length: int, task: Any, progress: float | bool, env_id: int | None = None) None[source]#

Update the curriculum with episode results from the environment.

Parameters:
  • episode_return – Episodic return

  • length – Length of the episode

  • task – Task for which the episode was completed

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a completed task.

  • env_id – Environment identifier
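
A call site might look like the sketch below; all values are illustrative, and `curriculum` and `task` are placeholders.

    # Report a finished episode back to the curriculum.
    curriculum.update_on_episode(
        episode_return=12.5,  # total reward collected over the episode
        length=200,           # number of environment steps in the episode
        task=task,            # the task the episode was played on
        progress=True,        # the task was completed
        env_id=0,             # which parallel environment produced the episode
    )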

update_on_step(task: Any, obs: Any, rew: float, term: bool, trunc: bool, info: dict, progress: float | bool, env_id: int | None = None) None[source]#

Update the curriculum with the current step results from the environment.

Parameters:
  • task – Task for which the step was taken

  • obs – Observation from the environment

  • rew – Reward from the environment

  • term – True if the episode ended on this step, False otherwise

  • trunc – True if the episode was truncated on this step, False otherwise

  • info – Extra information from the environment

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a completed task.

  • env_id – Environment identifier

Raises:

NotImplementedError
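
Because the base implementation raises NotImplementedError, curricula that consume per-step data override this method. The sketch below is purely illustrative: the class name and bookkeeping are made up, and exposing requires_step_updates as True is shown here via a property override, which may differ from how curricula in Syllabus express it.

    from syllabus.core.curriculum_base import Curriculum


    class StepCountingCurriculum(Curriculum):
        """Toy curriculum that tracks how many steps were spent on each task."""

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
            self.steps_per_task = {}  # assumes task identifiers are hashable

        @property
        def requires_step_updates(self) -> bool:
            # Signal that per-step data should be forwarded to this curriculum.
            return True

        def update_on_step(self, task, obs, rew, term, trunc, info, progress, env_id=None):
            # Only the task identity is used here; a real curriculum might
            # inspect rewards, terminations, or info to estimate progress.
            self.steps_per_task[task] = self.steps_per_task.get(task, 0) + 1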

update_on_step_batch(step_results: Tuple[List[Any], List[Any], List[int], List[bool], List[bool], List[Dict], List[int]], env_id: int | None = None) None[source]#

Update the curriculum with a batch of step results from the environment.

This method can be overridden to provide a more efficient implementation. It is provided as a convenience and to improve multiprocessing message-passing throughput.

Parameters:
  • step_results – List of step results

  • env_id – Environment identifier
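
A fallback override might simply unpack the batch and forward each transition to update_on_step, as in the sketch below; the ordering of the seven lists is inferred from the type hint rather than documented here.

    # On a Curriculum subclass; step_results is assumed to hold seven parallel
    # lists ordered like update_on_step's arguments.
    def update_on_step_batch(self, step_results, env_id=None):
        tasks, obs, rews, terms, truncs, infos, progresses = step_results
        for transition in zip(tasks, obs, rews, terms, truncs, infos, progresses):
            # Replay each transition through the per-step update.
            self.update_on_step(*transition, env_id=env_id)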

update_task_progress(task: Any, progress: float | bool, env_id: int | None = None) None[source]#

Update the curriculum with a task and its progress. This is used for binary tasks that can be completed mid-episode.

Parameters:
  • task – Task for which progress is being updated.

  • progress – Progress toward completion or success rate of the given task. 1.0 or True typically indicates a completed task.

  • env_id – Environment identifier
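
For instance (task names and values are made up for illustration, and `curriculum` is a placeholder for any Curriculum instance):

    # A binary sub-goal completed partway through an episode.
    curriculum.update_task_progress("open_door", progress=True, env_id=0)

    # Partial progress can be reported as a float, typically in [0, 1].
    curriculum.update_task_progress("collect_keys", progress=0.5, env_id=0)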