Task Wrappers#

CartPole Task Wrapper#

class syllabus.examples.task_wrappers.cartpole_task_wrapper.CartPoleTaskWrapper(env, discretize=False)[source]#

Bases: TaskWrapper

reset(**kwargs)[source]#

Calls the reset() of the wrapped env. Override this method to change the returned data.

Minigrid Task Wrapper#

Task wrapper that can select a new MiniGrid task on reset.

class syllabus.examples.task_wrappers.minigrid_task_wrapper.MinigridTaskWrapper(env: Env)[source]#

Bases: TaskWrapper

This wrapper allows you to change the task of a MiniGrid environment.

change_task(new_task: int)[source]#

Change task by directly editing environment class.

Ignores requests for unknown tasks or task changes outside of a reset.
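The "ignore" behaviour described above can be sketched in plain Python. This is an illustration only, not the Syllabus implementation: the class TaskSwitcher, the attribute known_tasks, and the _resetting flag are hypothetical names chosen for this example.

```python
class TaskSwitcher:
    """Hypothetical stand-in for a task wrapper (not the Syllabus class)."""

    def __init__(self, num_tasks):
        self.known_tasks = set(range(num_tasks))
        self.task = 0
        self._resetting = False  # True only while reset() is running

    def change_task(self, new_task):
        # Silently ignore unknown tasks and changes requested mid-episode.
        if new_task not in self.known_tasks or not self._resetting:
            return
        self.task = new_task

    def reset(self, new_task=None):
        self._resetting = True
        try:
            if new_task is not None:
                self.change_task(new_task)
        finally:
            self._resetting = False
        return self.task


switcher = TaskSwitcher(num_tasks=3)
switcher.change_task(2)                           # outside a reset: ignored
task_after_direct_call = switcher.task
task_after_reset = switcher.reset(new_task=2)     # inside a reset: applied
task_after_unknown = switcher.reset(new_task=99)  # unknown task: ignored
```

Ignoring invalid requests instead of raising keeps the current task active, so a caller that proposes a bad task does not interrupt training.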

observation(obs)[source]#

Adds the goal encoding to the observation. Override to add additional task-specific observations. Returns a modified observation.

TODO: Complete this implementation and find a way to support centralized encodings.
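One possible shape for a goal encoding is a one-hot vector over the task indices, returned alongside the original observation. This is a minimal sketch under that assumption. The function add_goal_encoding, the dictionary keys, and the one-hot choice are all hypothetical and are not taken from the wrapper's source.

```python
def add_goal_encoding(obs, task, num_tasks):
    """Return the observation together with a one-hot encoding of the task."""
    goal = [0.0] * num_tasks
    goal[task] = 1.0
    # Keys "observation" and "goal" are illustrative names only.
    return {"observation": obs, "goal": goal}


encoded = add_goal_encoding(obs=[0.5, 0.1], task=1, num_tasks=3)
```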

reset(new_task=None, **kwargs)[source]#

Resets the environment along with all available tasks, and changes the current task.

This ensures that all instance variables are reset, not just the ones for the current task. We do this efficiently by keeping track of which reset functions have already been called, since very few tasks override reset. If new_task is provided, we change the task before calling the final reset.
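The bookkeeping described above, calling each distinct reset function only once, can be sketched as follows. The task classes and the helper reset_all are hypothetical. The sketch relies on the fact that in Python 3 a class that does not override reset exposes the same function object as its parent.

```python
class BaseTask:
    def reset(self):
        self.steps = 0


class GoldTask(BaseTask):
    pass  # does not override reset


class ScoutTask(BaseTask):
    def reset(self):
        BaseTask.reset(self)
        self.explored = set()


def reset_all(env, task_classes):
    """Call every distinct reset function once on env; return how many ran."""
    called = set()
    for cls in task_classes:
        reset_fn = cls.reset  # inherited resets are the same function object
        if reset_fn in called:
            continue
        reset_fn(env)
        called.add(reset_fn)
    return len(called)


env = BaseTask()
num_distinct = reset_all(env, [BaseTask, GoldTask, ScoutTask])
```

Here GoldTask.reset is skipped because it is BaseTask.reset, while ScoutTask.reset still runs and creates the explored attribute.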

step(action)[source]#

Step through environment and update task completion.

NetHack Task Wrapper#

Task wrapper for NLE that can change tasks at reset using the NLE’s task definition format.

class syllabus.examples.task_wrappers.nethack_wrappers.NetHackCollect(*args, **kwargs)[source]#

Bases: NetHackGold

Environment for “staircase” task.

This task requires the agent to get on top of a staircase down (>). The reward function is \(I + \text{TP}\), where \(I\) is 1 if the task is successful, and 0 otherwise, and \(\text{TP}\) is the time step function as defined by NetHackScore.

reset(wizkit_items=None)[source]#

Resets the environment.

Note

We attempt to manually navigate the first few menus so that the first seen state is ready to be acted upon by the user. This might fail if NetHack is initialized with uncommon options.

Returns:

(tuple) (Observation of the state as defined by self.observation_space, Extra game state information)

class syllabus.examples.task_wrappers.nethack_wrappers.NetHackDescend(*args, penalty_mode='constant', penalty_step: float = -0.01, penalty_time: float = -0.0, **kwargs)[source]#

Bases: NetHackScore

Environment for “staircase” task.

This task requires the agent to get on top of a staircase down (>). The reward function is \(I + \text{TP}\), where \(I\) is 1 if the task is successful, and 0 otherwise, and \(\text{TP}\) is the time step function as defined by NetHackScore.
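As a worked example of the reward formula, the sketch below treats the time step term TP as a number supplied by the caller. The function name staircase_reward is hypothetical, and the value -0.01 is borrowed from the penalty_step default in the signature purely for illustration. How NetHackScore actually computes TP is not reproduced here.

```python
def staircase_reward(on_staircase, time_penalty):
    """Reward I + TP, where I is 1 on success and 0 otherwise."""
    indicator = 1.0 if on_staircase else 0.0
    return indicator + time_penalty


success_reward = staircase_reward(True, -0.01)   # I = 1
failure_reward = staircase_reward(False, -0.01)  # I = 0
```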

reset(wizkit_items=None)[source]#

Resets the environment.

Note

We attempt to manually navigate the first few menus so that the first seen state is ready to be acted upon by the user. This might fail if NetHack is initialized with uncommon options.

Returns:

(tuple) (Observation of the state as defined by self.observation_space, Extra game state information)

class syllabus.examples.task_wrappers.nethack_wrappers.NetHackSatiate(*args, penalty_mode='constant', penalty_step: float = -0.01, penalty_time: float = -0.0, **kwargs)[source]#

Bases: NetHackScore

Environment for the “eat” task.

The task is similar to the one defined by NetHackScore, but the reward uses positive changes in the character’s hunger level (e.g. by consuming comestibles or monster corpses), rather than the score.
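A reward that counts only positive changes in hunger can be sketched as below. This assumes the hunger level is a number that grows when the character eats. The function satiate_reward and its arguments are hypothetical names, and the time penalty is passed in by the caller instead of being computed.

```python
def satiate_reward(old_hunger, new_hunger, time_penalty=0.0):
    """Reward only increases in the hunger value; decreases contribute 0."""
    return max(0, new_hunger - old_hunger) + time_penalty


reward_after_eating = satiate_reward(900, 1700)
reward_while_hungry = satiate_reward(900, 899)
```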

class syllabus.examples.task_wrappers.nethack_wrappers.NetHackScoutClipped(*args, penalty_mode='constant', penalty_step: float = -0.01, penalty_time: float = -0.0, **kwargs)[source]#

Bases: NetHackScore

Environment for the “scout” task.

The task is similar to the one defined by NetHackScore, but the score is defined by the changes in glyphs discovered by the agent.
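One way to score "changes in glyphs discovered" is to count the map cells that show a glyph for the first time. The sketch below assumes a small 2D list of glyph ids and a placeholder id for unexplored cells. UNEXPLORED, scout_reward, and the toy maps are hypothetical, and any clipping applied by the real class is not modelled.

```python
UNEXPLORED = 0  # hypothetical glyph id for cells the agent has not seen


def scout_reward(seen, glyphs):
    """Return the number of newly discovered cells and record them in seen."""
    visible = {
        (r, c)
        for r, row in enumerate(glyphs)
        for c, glyph in enumerate(row)
        if glyph != UNEXPLORED
    }
    newly_seen = visible - seen
    seen |= newly_seen
    return len(newly_seen)


seen = set()
first = scout_reward(seen, [[0, 5], [0, 0]])   # one cell revealed
second = scout_reward(seen, [[7, 5], [0, 3]])  # two more cells revealed
third = scout_reward(seen, [[7, 5], [0, 3]])   # nothing new
```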

reset(*args, **kwargs)[source]#

Resets the environment.

Note

We attempt to manually navigate the first few menus so that the first seen state is ready to be acted upon by the user. This might fail if NetHack is initialized with uncommon options.

Returns:

(tuple) (Observation of the state as defined by self.observation_space, Extra game state information)

class syllabus.examples.task_wrappers.nethack_wrappers.NetHackSeed(*args, character='@', allow_all_yn_questions=True, allow_all_modes=True, penalty_mode='constant', penalty_step: float = -0.0, penalty_time: float = -0.0, max_episode_steps: int = 1000000.0, observation_keys=('glyphs', 'chars', 'colors', 'specials', 'blstats', 'message', 'inv_glyphs', 'inv_strs', 'inv_letters', 'inv_oclasses', 'tty_chars', 'tty_colors', 'tty_cursor', 'misc'), no_progress_timeout: int = 10000, **kwargs)[source]#

Bases: NetHackScore

Environment for the NetHack Challenge.

The task is an augmentation of the standard NLE task. This is the NLE Score Task but with some subtle differences:
  * the action space is fixed to include the full keyboard

  * menus and “<More>” tokens are not skipped

  * starting character is randomly assigned

reset(*args, **kwargs)[source]#

Resets the environment.

Note

We attempt to manually navigate the first few menus so that the first seen state is ready to be acted upon by the user. This might fail if NetHack is initialized with uncommon options.

Returns:

(tuple) (Observation of the state as defined by self.observation_space, Extra game state information)

class syllabus.examples.task_wrappers.nethack_wrappers.NethackSeedWrapper(env: Env, seed: int = 0, num_seeds: int = 200)[source]#

Bases: TaskWrapper

This wrapper allows you to change the task of an NLE environment.

This wrapper was designed to meet two goals.
  1. Allow us to change the task of the NLE environment at the start of an episode.

  2. Allow us to use the predefined NLE task definitions without copying/modifying their code. This makes it easier to integrate with other work on NetHack tasks or curricula.

Each task is defined as a subclass of the NLE, so you need to cast and reinitialize the environment to change its task. This wrapper manipulates the __class__ property to achieve this, but does so in a safe way. Specifically, we ensure that the instance variables needed for each task are available and reset at the start of the episode regardless of which task is active.

change_task(new_task: int)[source]#

Change task by setting the seed.
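Treating the seed as the task can be sketched with a toy environment whose reset output depends only on its random generator. SeededEnv and SeedTaskSketch are hypothetical classes written for this example. How the real wrapper uses its seed and num_seeds arguments is not covered.

```python
import random


class SeededEnv:
    """Toy environment: reset() draws a 'level' from a seeded generator."""

    def __init__(self):
        self._rng = random.Random(0)

    def seed(self, seed):
        self._rng = random.Random(seed)

    def reset(self):
        return [self._rng.randint(0, 9) for _ in range(5)]


class SeedTaskSketch:
    """Hypothetical wrapper in which the task id is the environment seed."""

    def __init__(self, env):
        self.env = env
        self.task = None

    def change_task(self, new_task):
        self.env.seed(new_task)
        self.task = new_task

    def reset(self, new_task=None):
        if new_task is not None:
            self.change_task(new_task)
        return self.env.reset()


wrapper = SeedTaskSketch(SeededEnv())
first = wrapper.reset(new_task=7)
wrapper.reset(new_task=3)
again = wrapper.reset(new_task=7)  # same task id reproduces the same level
```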

observation(observation)[source]#

Parses current inventory and new items gained this timestep from the observation. Returns a modified observation.

reset(new_task=None, **kwargs)[source]#

Resets the environment along with all available tasks, and changes the current task.

This ensures that all instance variables are reset, not just the ones for the current task. We do this efficiently by keeping track of which reset functions have already been called, since very few tasks override reset. If new_task is provided, we change the task before calling the final reset.

seed(seed)[source]#

step(action)[source]#

Step through environment and update task completion.

class syllabus.examples.task_wrappers.nethack_wrappers.NethackTaskWrapper(env: Env, additional_tasks: List[NLE] | None = None, use_default_tasks: bool = True, env_kwargs: Dict[str, Any] = {}, wrappers: List[Tuple[Wrapper, List[Any], Dict[str, Any]]] | None = None, seed: int | None = None)[source]#

Bases: TaskWrapper

This wrapper allows you to change the task of an NLE environment.

This wrapper was designed to meet two goals.
  1. Allow us to change the task of the NLE environment at the start of an episode.

  2. Allow us to use the predefined NLE task definitions without copying/modifying their code. This makes it easier to integrate with other work on NetHack tasks or curricula.

Each task is defined as a subclass of the NLE, so you need to cast and reinitialize the environment to change its task. This wrapper manipulates the __class__ property to achieve this, but does so in a safe way. Specifically, we ensure that the instance variables needed for each task are available and reset at the start of the episode regardless of which task is active.
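The __class__ technique can be illustrated with two toy task classes. This is a sketch of the general idea, not the wrapper's code. ScoreTask, GoldTask, TASKS, and switch_task are hypothetical names. Running every task's reset before the swap is what makes the instance variables of the new class available.

```python
class ScoreTask:
    def reset(self):
        self.score = 0

    def reward_name(self):
        return "score"


class GoldTask(ScoreTask):
    def reset(self):
        ScoreTask.reset(self)
        self.gold = 0  # only GoldTask needs this attribute

    def reward_name(self):
        return "gold"


TASKS = [ScoreTask, GoldTask]


def switch_task(env, task_index):
    # Reset the state of every task so no attribute is missing after the swap.
    for cls in TASKS:
        cls.reset(env)
    # Reassign the class in place; the object identity does not change.
    env.__class__ = TASKS[task_index]
    return env


env = ScoreTask()
switch_task(env, 1)
name_after_switch = env.reward_name()
has_gold = hasattr(env, "gold")
switch_task(env, 0)
name_after_switch_back = env.reward_name()
```

Because the object is never recreated, references held by outer wrappers stay valid across task changes.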

change_task(new_task: int)[source]#

Change task by directly editing environment class.

Ignores requests for unknown tasks or task changes outside of a reset.

observation(observation)[source]#

Parses current inventory and new items gained this timestep from the observation. Returns a modified observation.

reset(new_task=None, **kwargs)[source]#

Resets the environment along with all available tasks, and changes the current task.

This ensures that all instance variables are reset, not just the ones for the current task. We do this efficiently by keeping track of which reset functions have already been called, since very few tasks override reset. If new_task is provided, we change the task before calling the final reset.

seed(seed)[source]#

step(action)[source]#

Step through environment and update task completion.

Pistonball Task Wrapper#

Task wrapper for Pistonball that can change tasks at reset.

class syllabus.examples.task_wrappers.pistonball_task_wrapper.PistonballTaskWrapper(env: ParallelEnv)[source]#

Bases: PettingZooTaskWrapper

This wrapper simply changes the seed of a Pistonball environment.

reset(new_task: int | None = None, **kwargs)[source]#

Resets the environment and returns a dictionary of observations keyed by agent name.
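The return shape of a parallel multi-agent reset can be sketched without PettingZoo installed. ParallelEnvSketch is a hypothetical stand-in, and the agent names and zero-valued observations are placeholders chosen for illustration.

```python
class ParallelEnvSketch:
    """Hypothetical multi-agent environment with a task-aware reset."""

    def __init__(self, agents):
        self.possible_agents = list(agents)
        self.agents = []
        self.task = None

    def reset(self, new_task=None):
        if new_task is not None:
            self.task = new_task
        self.agents = list(self.possible_agents)
        # One observation per agent, keyed by the agent's name.
        return {agent: [0.0, 0.0] for agent in self.agents}


env = ParallelEnvSketch(["piston_0", "piston_1"])
observations = env.reset(new_task=3)
```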

Procgen Task Wrapper#

class syllabus.examples.task_wrappers.procgen_task_wrapper.ProcgenTaskWrapper(env: Env, env_id, seed=0)[source]#

Bases: TaskWrapper

This wrapper allows you to change the task of a Procgen environment.

change_task(new_task: int)[source]#

Change task by directly editing environment class.

Ignores requests for unknown tasks or task changes outside of a reset.

observation(obs)[source]#

Adds the goal encoding to the observation. Override to add additional task-specific observations. Returns a modified observation.

TODO: Complete this implementation and find a way to support centralized encodings.

reset(new_task=None, **kwargs)[source]#

Resets the environment along with all available tasks, and changes the current task.

This ensures that all instance variables are reset, not just the ones for the current task. We do this efficiently by keeping track of which reset functions have already been called, since very few tasks override reset. If new_task is provided, we change the task before calling the final reset.

seed(seed)[source]#

step(action)[source]#

Step through environment and update task completion.

SimpleTag Task Wrapper#

Task wrapper for SimpleTag that can change tasks at reset.

class syllabus.examples.task_wrappers.simpletag_task_wrapper.SimpleTagTaskWrapper(env: ParallelEnv)[source]#

Bases: PettingZooTaskWrapper

This wrapper simply changes the seed of a SimpleTag environment.

reset(new_task: int | None = None, **kwargs)[source]#

Resets the environment and returns a dictionary of observations keyed by agent name.