Runner

Objectives

The runner class aims at:

  1. facilitating the evaluation of the performance of a grid2op.Agent by automatically performing the “gymnasium loop” (see below)

  2. defining a format to store the results of the evaluation of such an agent in a standardized manner

  3. producing “agent logs” that can then be re-read by third party applications, such as grid2viz, or by internal classes to ease the study of the behaviour of such an agent, for example with the classes grid2op.Episode.EpisodeData or grid2op.Episode.EpisodeReplay

  4. allowing easy parallelization of this assessment.

Basically, the runner simplifies the assessment of the performance of some agent. This is the “usual” gymnasium code to run an agent:

import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
NB_EPISODE = 10  # assess the performance for 10 episodes, for example
for i in range(NB_EPISODE):
    reward = env.reward_range[0]
    done = False
    obs = env.reset()
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)

The above code does not store anything, cannot easily be run in parallel, and is already pretty verbose. To write shorter code that saves most of the data (and makes it easier to integrate with other applications), we can use the runner the following way:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
NB_EPISODE = 10  # assess the performance for 10 episodes, for example
NB_CORE = 2  # do it on 2 cores, for example
PATH_SAVE = "agents_log"  # and store the results in the "agents_log" folder
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
runner.run(nb_episode=NB_EPISODE, nb_process=NB_CORE, path_save=PATH_SAVE)

As we can see, with fewer lines of code, we can run a parallel assessment of our agent on 10 episodes and save the results (observations, actions, rewards, etc.) into a dedicated folder.
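
Once saved, these logs can be re-read or replayed by other tools, as mentioned in the objectives above. For instance, a minimal sketch, assuming the grid2op.Episode.EpisodeReplay class takes the save folder as agent_path and exposes a replay_episode method (check its documentation for the exact signature):

from grid2op.Episode import EpisodeReplay

# PATH_SAVE is the folder given to the runner above
ep_replay = EpisodeReplay(agent_path=PATH_SAVE)
ep_replay.replay_episode("0000", gif_name="episode_0000")  # "0000" is a placeholder episode name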

If your agent is initialized with a custom __init__ method that takes more than the action space to be built, you can also use the Runner pretty easily by passing it an instance of your agent, for example:

import grid2op
from grid2op.Runner import Runner
env = grid2op.make("l2rpn_case14_sandbox")
NB_EPISODE = 10  # assess the performance for 10 episodes, for example
NB_CORE = 2  # do it on 2 cores, for example
PATH_SAVE = "agents_log"  # and store the results in the "agents_log" folder

# initialize your agent
my_agent = FancyAgentWithCustomInitialization(env.action_space,
                                              env.observation_space,
                                              "whatever else you want"
                                              )

# and proceed as follows for the runner
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
runner.run(nb_episode=NB_EPISODE, nb_process=NB_CORE, path_save=PATH_SAVE)

Other tools are available for this runner class, for example the easy integration of progress bars. See below for more information.
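
For instance, passing pbar=True when calling run asks the runner to display a tqdm progress bar, provided the tqdm package is installed (see the description of Runner.run() below):

res = runner.run(nb_episode=NB_EPISODE, pbar=True)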

Note on parallel processing

The “Runner” class allows for parallel execution of the same agent on different scenarios. In this case, each scenario will be run in an independent process.

Depending on the platform and python version, you might end up with bugs and errors like:

AttributeError: Can't get attribute 'ActionSpace_l2rpn_case14_sandbox' on <module 'grid2op.Space.GridObjects' from '/lib/python3.8/site-packages/grid2op/Space/GridObjects.py'> Process SpawnPoolWorker-4:

or like:

File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds))

File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar return list(itertools.starmap(args[0], args[1]))

In this case, it means grid2op has a hard time dealing with the multiprocessing part. It is then recommended to disable it completely, for example by using the following code before any call to “runner.run”:

import os
from grid2op.Runner import Runner

os.environ[Runner.FORCE_SEQUENTIAL] = "1"

This will force (starting from grid2op >= 1.5) grid2op to use the sequential runner and not deal with the added complexity of multiprocessing.

This is especially handy on Windows systems in case of trouble.

For information, as of writing (March 2021):

  • macOS with Python <= 3.7 behaves like any Python version on Linux

  • Windows and macOS with Python >= 3.8 behave differently from Linux, but similarly to one another

Some common runner options:

Specify an agent instance and not a class

By default, if you specify an agent class (e.g. AgentCLS), then the runner will initialize it with:

agent = AgentCLS(env.action_space)
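
For example, a minimal sketch of a custom agent class compatible with this default initialization (the class name and its do-nothing behaviour are purely illustrative):

import grid2op
from grid2op.Agent import BaseAgent
from grid2op.Runner import Runner

class MyDoNothingAgent(BaseAgent):  # illustrative agent class
    def act(self, obs, reward, done=False):
        return self.action_space({})  # always return the "do nothing" action

env = grid2op.make("l2rpn_case14_sandbox")
runner = Runner(**env.get_params_for_runner(), agentClass=MyDoNothingAgent)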

But you might want to use an agent initialized in a more complex way. To that end, you can customize the agent instance you want to use (and not only its class) with the following code:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=10)  # assess the performance on 10 episodes, for example

Customize the scenarios

You can customize the seeds, the scenario IDs you want, the number of initial steps to skip, the maximum duration of an episode, etc. For more information, please refer to Runner.run(). But basically, you can do:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=10,

                 # nb process to use
                 nb_process=1,

                 # path where the outcome will be saved
                 path_save=None,

                 # max number of steps in an environment
                 max_iter=None,

                 # progress bar to use
                 pbar=False,

                 # seeds to use for the environment
                 env_seeds=None,

                 # seeds to use for the agent
                 agent_seeds=None,

                 # ids of the time series to use
                 episode_id=None,

                 # whether to add the outcome (EpisodeData) as a result of this function
                 add_detailed_output=False,

                 # whether to keep track of the number of calls to the "high resolution simulator"
                 # (e.g. obs.simulate or obs.get_forecast_env)
                 add_nb_highres_sim=False,

                 # which initial state you want the grid to be in
                 init_states=None,

                 # options passed  in `env.reset(..., options=XXX)`
                 reset_options=None,
                 )
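
For example, a minimal sketch (reusing the runner above) that plays the first two time series with fixed seeds and at most 100 steps per episode (the ids and seed values are just placeholders):

res = runner.run(nb_episode=2,
                 episode_id=[0, 1],      # play the first two time series
                 max_iter=100,           # at most 100 steps per episode
                 env_seeds=[0, 1],       # one seed per episode for the environment
                 agent_seeds=[42, 43])   # one seed per episode for the agent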

Retrieve what has happened

You can also easily retrieve the grid2op.Episode.EpisodeData representing your runs with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 add_detailed_output=True)
for *_, ep_data in res:
    # ep_data are the EpisodeData you can use to do whatever
    ...
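
Each ep_data exposes the recorded trajectory. A minimal sketch, assuming the EpisodeData attributes observations and actions (check the grid2op.Episode.EpisodeData documentation for the exact interface):

for *_, ep_data in res:
    for obs, act in zip(ep_data.observations, ep_data.actions):
        # inspect what the agent saw and did at each step
        ...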

Save the results

You can save the results in a standardized format with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(),
                agentClass=None,
                agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 path_save="A/PATH/SOMEWHERE")  # e.g. "/home/user/you/grid2op_results/this_run"
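
The saved episodes can later be re-read, for example with grid2op.Episode.EpisodeData. A minimal sketch, assuming the EpisodeData.list_episode and EpisodeData.from_disk helpers (check the EpisodeData documentation for the exact signatures):

from grid2op.Episode import EpisodeData

# list the episodes stored by the runner and reload one of them
all_episodes = EpisodeData.list_episode("A/PATH/SOMEWHERE")
full_path, episode_name = all_episodes[0]
ep_data = EpisodeData.from_disk("A/PATH/SOMEWHERE", episode_name)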

Multi processing

You can also (on some platforms) easily make the evaluation faster by using the “multi processing” python package with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(),
                agentClass=None,
                agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 nb_process=2)

Customize the multi processing

And, as of grid2op 1.10.3, you can now customize the multiprocessing context you want to use to evaluate your agent, like this:

import multiprocessing as mp
import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)

ctx = mp.get_context('spawn')  # or "fork" or "forkserver"
runner = Runner(**env.get_params_for_runner(),
                agentClass=None,
                agentInstance=agent_instance,
                mp_context=ctx)
res = runner.run(nb_episode=2,
                 nb_process=2)

If you set this, the multiprocessing Pool used to evaluate your agents will be made with:

with mp_context.Pool(nb_process) as p:
    ....

Otherwise the default “Pool” is used:

with Pool(nb_process) as p:
    ....
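
As noted in the section on parallel processing, Linux typically defaults to the “fork” start method while Windows and recent macOS versions use “spawn”. If you are unsure which one your platform uses by default, you can check it before building the runner:

import multiprocessing as mp

print(mp.get_start_method())  # e.g. "fork" on Linux, "spawn" on Windows and recent macOS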

Detailed Documentation by class

Classes:

Runner(init_env_path, init_grid_path, path_chron)

A runner is a utility tool that allows you to run simulations more easily.

class grid2op.Runner.Runner(init_env_path: str, init_grid_path: str, path_chron, n_busbar=2, name_env='unknown', parameters_path=None, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, envClass=<class 'grid2op.Environment.environment.Environment'>, other_env_kwargs=None, gridStateclass=<class 'grid2op.Chronics.gridStateFromFile.GridStateFromFile'>, backendClass=<class 'grid2op.Backend.pandaPowerBackend.PandaPowerBackend'>, backend_kwargs=None, agentClass=<class 'grid2op.Agent.doNothing.DoNothingAgent'>, agentInstance=None, verbose=False, gridStateclass_kwargs={}, voltageControlerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, thermal_limit_a=None, max_iter=-1, other_rewards={}, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, opponent_kwargs={}, grid_layout=None, with_forecast=True, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget=None, has_attention_budget=False, logger=None, use_compact_episode_data=False, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, mp_context=None, _read_from_local_dir=None, _is_test=False, _local_dir_cls=None, _overload_name_multimix=None)[source]

A runner is a utility tool that allows you to run simulations more easily.

It is a more convenient way to execute the following loops:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")
nb_episode = 5

# use of a Runner
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
res = runner.run(nb_episode=nb_episode)

###############
# the "equivalent" gym loops
agent = RandomAgent(env.action_space)
for i in range(nb_episode):
    obs = env.reset()
    done = False
    reward = env.reward_range[0]
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
# but this loop does not handle the seeding, does not save the results,
# does not store anything related to the run you made, etc.
# the Runner can do that with simple calls (see below)
###############

The main purpose of this specific class is to evaluate the performance of a trained grid2op.Agent.BaseAgent, rather than to train it.

It also has the useful property of being able to save the results of an experiment in a standardized manner, described in grid2op.Episode.EpisodeData.

NB we do not recommend creating a runner from scratch by providing all the arguments. We strongly encourage you to use grid2op.Environment.Environment.get_params_for_runner() to create a runner.

You can customize the agent instance you want with the following code:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=10)  # for example

You can customize the seeds, the scenario IDs you want, the number of initial steps to skip, the maximum duration of an episode, etc. For more information, please refer to Runner.run().

You can also easily retrieve the grid2op.Episode.EpisodeData representing your runs with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 add_detailed_output=True)
for *_, ep_data in res:
    # ep_data are the EpisodeData you can use to do whatever
    ...

You can save the results in a standardized format with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 path_save="A/PATH/SOMEWHERE")  # e.g. "/home/user/you/grid2op_results/this_run"

You can also (on some platforms) easily make the evaluation faster by using the “multi processing” python package with:

import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=agent_instance)
res = runner.run(nb_episode=2,
                 nb_process=2)

And, as of grid2op 1.10.3, you can now customize the multiprocessing context you want to use to evaluate your agent, like this:

import multiprocessing as mp
import grid2op
from grid2op.Agent import RandomAgent # for example...
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")

agent_instance = RandomAgent(env.action_space)

ctx = mp.get_context('spawn')  # or "fork" or "forkserver"
runner = Runner(**env.get_params_for_runner(),
                agentClass=None,
                agentInstance=agent_instance,
                mp_context=ctx)
res = runner.run(nb_episode=2,
                 nb_process=2)

If you set this, the multiprocessing Pool used to evaluate your agents will be made with:

with mp_context.Pool(nb_process) as p:
    ....

Otherwise the default “Pool” is used:

with Pool(nb_process) as p:
    ....
envClass

The type of the environment used for the game. The class should be given, and not an instance (object) of this class. The default is grid2op.Environment. If modified, it should derive from this class.

Type:

type

other_env_kwargs

Other kwargs used to build the environment (None for “nothing”)

Type:

dict

actionClass

The type of action that can be performed by the agent / bot / controller. The class should be given, and not an instance of this class. This type should derive from grid2op.BaseAction. The default is grid2op.TopologyAction.

Type:

type

observationClass

This type represents the class that will be used to build the grid2op.BaseObservation visible to the grid2op.BaseAgent. As for Runner.actionClass, this should be a type, and not an instance (object) of this type. This type should derive from grid2op.BaseObservation. The default is grid2op.CompleteObservation.

Type:

type

rewardClass

Represents the type used to build the rewards that are given to the BaseAgent. As for Runner.actionClass, this should be a type, and not an instance (object) of this type. This type should derive from grid2op.BaseReward. The default is grid2op.ConstantReward, which should not be used to train or evaluate an agent, but rather for debugging purposes.

Type:

type

gridStateclass

This type controls the mechanisms to read chronics and assign data to the powergrid. Like every “.*Class” attribute, the type should be passed and not an instance (object) of this type. Its default is grid2op.GridStateFromFile and it must be a subclass of grid2op.GridValue.

Type:

type

legalActClass

This type controls the mechanisms to assess whether a grid2op.BaseAction is legal. Like every “.*Class” attribute, the type should be passed and not an instance (object) of this type. Its default is grid2op.AlwaysLegal and it must be a subclass of grid2op.BaseRules.

Type:

type

backendClass

This type controls the backend, i.e. the software that computes the power flows. Like every “.*Class” attribute, the type should be passed and not an instance (object) of this type. Its default is grid2op.PandaPowerBackend and it must be a subclass of grid2op.Backend.

Type:

type

backend_kwargs

Optional arguments used to build the backend. These arguments will not be copied when creating the backend used by the runner. They might be required to be pickleable on some platforms when using multiprocessing.

Type:

dict

agentClass

This type controls the type of BaseAgent, i.e. the bot / controller that will take grid2op.BaseAction and avoid cascading failures. Like every “.*Class” attribute, the type should be passed and not an instance (object) of this type. Its default is grid2op.DoNothingAgent and it must be a subclass of grid2op.BaseAgent.

Type:

type

logger

An object that can be used to log information, either in a text file or by printing it to the command prompt.

init_grid_path

This attribute stores the path where the powergrid data are located. If a relative path is given, it will be extended to an absolute path.

Type:

str

names_chronics_to_backend

See the description of grid2op.ChronicsHelper.initialize() for more information about this dictionary.

Type:

dict

parameters_path

Where to look for the grid2op.Environment grid2op.Parameters. It defaults to None which corresponds to using default values.

Type:

str, optional

parameters

The parameters used in the game. This is an instance (object) of type grid2op.Parameters, initialized from Runner.parameters_path.

Type:

grid2op.Parameters

path_chron

Path indicating where to look for temporal data.

Type:

str

chronics_handler

Initialized from Runner.gridStateclass and Runner.path_chron it represents the input data used to generate grid state by the Runner.env

Type:

grid2op.ChronicsHandler

backend

Used to compute the powerflow. This object has the type given by Runner.backendClass

Type:

grid2op.Backend

env

Represents the environment which the agent / bot / controller must control through actions. It is initialized from Runner.envClass.

Type:

grid2op.Environment

agent

Represents the agent / bot / controller that takes actions on an environment (the powergrid) to maximize a certain reward.

Type:

grid2op.Agent

verbose

If True, then a detailed output of each step is written.

Type:

bool

gridStateclass_kwargs

Additional keyword arguments used to build the Runner.chronics_handler

Type:

dict

thermal_limit_a

The thermal limit for the environment (if any).

Type:

numpy.ndarray

opponent_action_class

The action class used for the opponent. The opponent will not be able to use actions that are invalid for the given action class. It defaults to grid2op.Action.DontAct, which forbids any type of action.

Type:

type, optional

opponent_class

The opponent class to use. The default class is grid2op.Opponent.BaseOpponent, which is a type of opponent that does nothing.

Type:

type, optional

opponent_init_budget

The initial budget of the opponent. It defaults to 0.0 which means the opponent cannot perform any action if this is not modified.

Type:

float, optional

opponent_budget_per_ts

The budget increase of the opponent per time step

Type:

float, optional

opponent_budget_class

The class used to compute the attack cost.

Type:

type, optional

grid_layout

The layout of the grid (position of each substation), useful if you need to plot things, for example.

Type:

dict, optional

TODO
_attention_budget_cls=LinearAttentionBudget,
_kwargs_attention_budget=None,
_has_attention_budget=False

Examples

Different examples are shown in the description of the main method Runner.run().

Notes

The Runner does not necessarily behave normally when “nb_process” is not 1 on some platforms (Windows and some versions of macOS). Please read the documentation, and especially the Note on parallel processing, for more information and possible ways to disable this feature.

Methods:

__init__(init_env_path, init_grid_path, ...)

Initialize the Runner.

_clean_up()

INTERNAL

_run_parrallel(nb_episode[, nb_process, ...])

INTERNAL

_run_sequential(nb_episode[, path_save, ...])

INTERNAL

init_env()

INTERNAL

reset()

INTERNAL

run(nb_episode, *[, nb_process, path_save, ...])

Main method of the Runner class.

run_one_episode([indx, path_save, pbar, ...])

INTERNAL

Attributes:

__weakref__

list of weak references to the object (if defined)

__init__(init_env_path: str, init_grid_path: str, path_chron, n_busbar=2, name_env='unknown', parameters_path=None, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, envClass=<class 'grid2op.Environment.environment.Environment'>, other_env_kwargs=None, gridStateclass=<class 'grid2op.Chronics.gridStateFromFile.GridStateFromFile'>, backendClass=<class 'grid2op.Backend.pandaPowerBackend.PandaPowerBackend'>, backend_kwargs=None, agentClass=<class 'grid2op.Agent.doNothing.DoNothingAgent'>, agentInstance=None, verbose=False, gridStateclass_kwargs={}, voltageControlerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, thermal_limit_a=None, max_iter=-1, other_rewards={}, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, opponent_kwargs={}, grid_layout=None, with_forecast=True, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget=None, has_attention_budget=False, logger=None, use_compact_episode_data=False, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, mp_context=None, _read_from_local_dir=None, _is_test=False, _local_dir_cls=None, _overload_name_multimix=None)[source]

Initialize the Runner.

Parameters:
  • init_grid_path (str) – Mandatory, used to initialize Runner.init_grid_path.

  • path_chron (str) – Mandatory, where to look for chronics data, used to initialize Runner.path_chron.

  • parameters_path (str or dict, optional) – Used to initialize Runner.parameters_path. If it’s a string, this will suppose parameters are located at this path, if it’s a dictionary, this will use the parameters converted from this dictionary.

  • names_chronics_to_backend (dict, optional) – Used to initialize Runner.names_chronics_to_backend.

  • actionClass (type, optional) – Used to initialize Runner.actionClass.

  • observationClass (type, optional) – Used to initialize Runner.observationClass.

  • rewardClass (type, optional) – Used to initialize Runner.rewardClass. Defaults to grid2op.ConstantReward, which should not be used to train or evaluate an agent, but rather for debugging purposes.

  • legalActClass (type, optional) – Used to initialize Runner.legalActClass.

  • envClass (type, optional) – Used to initialize Runner.envClass.

  • gridStateclass (type, optional) – Used to initialize Runner.gridStateclass.

  • backendClass (type, optional) – Used to initialize Runner.backendClass.

  • agentClass (type, optional) – Used to initialize Runner.agentClass.

  • agentInstance (grid2op.Agent.Agent) – Used to initialize the agent. Note that only one of agentClass or agentInstance can be used at the same time. If both of them are None, or both of them are not None, an error is thrown.

  • verbose (bool, optional) – Used to initialize Runner.verbose.

  • thermal_limit_a (numpy.ndarray) – The thermal limit for the environment (if any).

  • voltagecontrolerClass (grid2op.VoltageControler.ControlVoltageFromFile, optional) – The controller that will change the voltage setpoints of the generators.

  • use_compact_episode_data (bool, optional) – Whether to use grid2op.Episode.CompactEpisodeData instead of grid2op.Episode.EpisodeData to store Episode to disk (allows it to be replayed later). Defaults to False.

  • # TODO documentation on the opponent parameters

  • # TODO documentation on the attention budget parameters

__weakref__

list of weak references to the object (if defined)

_clean_up()[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

close the environment if it has been created

_run_parrallel(nb_episode, nb_process=1, path_save=None, env_seeds=None, agent_seeds=None, max_iter=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False, init_states=None, reset_options=None) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]][source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

This method runs the nb_episode episodes in parallel, independently, over nb_process processes.

In case the agent cannot be cloned using copy.copy, nb_process is set to 1.

Note that it completely restarts the Runner.backend and Runner.env if the computation is actually performed with more than 1 core (nb_process > 1).

It uses the python multiprocessing module, and especially multiprocessing.Pool, to perform the computations. This implies that all runs are completely independent (they happen in different processes) and that the memory consumption can be big. Testing may be recommended if the amount of RAM is low.

It has the same return type as Runner._run_sequential().

Parameters:
  • nb_episode (int) – Number of episode to simulate

  • nb_process (int, optional) – Number of process used to play the nb_episode. Default to 1.

  • path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module Runner for more information

  • env_seeds (list) – An iterable of the seeds used for the environment. By default None, meaning no seeds are set. If provided, its size should match nb_episode.

  • agent_seeds (list) – An iterable that contains the seeds used for the agent. By default None, meaning no seeds are set. If provided, its size should match nb_episode. The agent will be seeded at the beginning of each scenario BEFORE calling agent.reset().

  • add_detailed_output – See Runner.run() method

  • init_states – See Runner.run() method

Returns:

res

List of tuples. Each tuple has the following elements:

  • ”i” unique identifier of the episode (compared to Runner._run_sequential(), the elements of the returned list are not necessarily sorted by this value)

  • ”cum_reward” the cumulative reward obtained by the Runner.BaseAgent on this episode i

  • ”nb_time_step”: the number of time steps played in this episode.

  • ”max_ts” : the maximum number of time steps of the chronics

  • ”episode_data” : The EpisodeData corresponding to this episode run

Return type:

list

_run_sequential(nb_episode, path_save=None, pbar=False, env_seeds=None, agent_seeds=None, max_iter=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False, init_states=None, reset_options=None) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]][source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

This method is called to see how well an agent performed on a sequence of episodes.

Parameters:
  • nb_episode (int) – Number of episode to play.

  • path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module Runner for more information

  • pbar (bool or type or object) –

    How to display the progress bar, understood as follows:

    • if pbar is None, nothing is done.

    • if pbar is a boolean and true, a tqdm pbar is used, provided the tqdm package is available and installed on the system. If it's false, it's equivalent to pbar being None.

    • if pbar is a type (a class), it is used to build a progress bar at the highest level (episode) and at the lower levels (steps during the episode). If it's a type, it must accept the arguments “total” and “desc” when being built, and the closing is ensured by this method.

    • if pbar is an object (an instance of a class), it is used to make a progress bar at this highest level (episode) but not at lower levels (steps during the episode)

  • env_seeds (list) – An iterable of the seed used for the experiments. By default None, no seeds are set. If provided, its size should match nb_episode.

  • episode_id (list) – For each of the nb_episode episodes you want to compute, it specifies the id of the chronics that will be used. By default None, meaning no specific id is set. If provided, its size should match nb_episode.

  • add_detailed_output – see Runner.run() method

  • init_states – see Runner.run() method

Returns:

res

List of tuples. Each tuple has the following elements:

  • ”id_chron” unique identifier of the episode

  • ”name_chron” name of chronics

  • ”cum_reward” the cumulative reward obtained by the Runner.BaseAgent on this episode i

  • ”nb_time_step”: the number of time steps played in this episode.

  • ”max_ts” : the maximum number of time steps of the chronics

  • ”episode_data” : The EpisodeData corresponding to this episode run

Return type:

list

init_env() BaseEnv[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Function used to initialize the environment and the agent. It is called by Runner.reset().

reset()[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Used to reset an environment. This method is called at the beginning of each new episode. If the environment is not initialized, then it initializes it with Runner.init_env().

run(nb_episode, *, nb_process=1, path_save=None, max_iter=None, pbar=False, env_seeds=None, agent_seeds=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False, init_states=None, reset_options=None) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]][source]

Main method of the Runner class. It will either call Runner._run_sequential() if “nb_process” is 1 or Runner._run_parrallel() if nb_process >= 2.

Parameters:
  • nb_episode (int) – Number of episode to simulate

  • nb_process (int, optional) – Number of processes used to play the nb_episode episodes. Defaults to 1. NB multiprocessing is deactivated on Windows based platforms (it was not fully supported so we decided to remove it)

  • path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module Runner for more information

  • max_iter (int) –

    Maximum number of iteration you want the runner to perform.

    Warning

    (only for grid2op >= 1.10.3) If set, this parameter will erase any value that may be present in the reset_options kwargs (key “max step”)

  • pbar (bool or type or object) –

    How to display the progress bar, understood as follows:

    • if pbar is None, nothing is done.

    • if pbar is a boolean and true, a tqdm pbar is used, provided the tqdm package is available and installed on the system. If it's false, it's equivalent to pbar being None.

    • if pbar is a type (a class), it is used to build a progress bar at the highest level (episode) and at the lower levels (steps during the episode). If it's a type, it must accept the arguments “total” and “desc” when being built, and the closing is ensured by this method.

    • if pbar is an object (an instance of a class), it is used to make a progress bar at this highest level (episode) but not at lower levels (steps during the episode)

  • env_seeds (list) – An iterable of the seeds used for the environment. By default None, meaning no seeds are set. If provided, its size should match nb_episode.

  • agent_seeds (list) – An iterable that contains the seeds used for the agent. By default None, meaning no seeds are set. If provided, its size should match nb_episode. The agent will be seeded at the beginning of each scenario BEFORE calling agent.reset().

  • episode_id (list) –

    For each of the nb_episode episodes you want to compute, it specifies the id of the chronics that will be used. By default None, meaning no specific id is set. If provided, its size should match nb_episode.

    Warning

    (only for grid2op >= 1.10.3) If set, this parameter will erase any value that may be present in the reset_options kwargs (key “time serie id”).

    Danger

    As of now, computing the same episode_id more than once using the runner is not properly handled (more specifically, the computation will happen but the files might not be saved correctly on the hard drive, because all the results are written to the same location). We do not advise doing it.

  • add_detailed_output (bool) – A flag to add an EpisodeData object to the results, containing a lot of information about the run

  • add_nb_highres_sim (bool) – Whether to add an estimated number of calls to the “high resolution simulator” performed by the agent (either by obs.simulate, by obs.get_forecast_env or by obs.get_simulator)

  • init_states

    (added in grid2op 1.10.2) Possibility to set the initial state of the powergrid (when calling env.reset). It should either be:

    • a dictionary describing the initial state (it will be passed to env.reset through the “init state” option),

    • a grid2op action (an instance of grid2op.Action.BaseAction),

    • a list / tuple of one of the above, with the same size as the number of episodes you want to compute.

    If you provide a dictionary or a grid2op action, then this element will be used for all scenarios you want to run.

    Warning

    (only for grid2op >= 1.10.3) If set in this parameters, it will erase all values that may be present in the reset_options kwargs (key “init state”).

  • reset_options

    (added in grid2op 1.10.3) Possibility to customize the call to env.reset made internally by the Runner. More specifically, it will pass a custom options when the runner calls env.reset(…, options=XXX).

    It should either be:

    • a dictionary that can be used directly by grid2op.Environment.Environment.reset(). In this case the same dictionary will be used for all the episodes computed by the runner.

    • a list / tuple of one of the above with the same size as the number of episodes you want to compute, which allows a full customization for each episode.

    Warning

    If the kwargs max_iter is present when calling runner.run function, then the key max step will be ignored in all the reset_options dictionary.

    Warning

    If the kwargs episode_id is present when calling runner.run function, then the key time serie id will be ignored in all the reset_options dictionary.

    Warning

    If the kwargs init_states is present when calling runner.run function, then the key init state will be ignored in all the reset_options dictionary.

    Danger

    If you provide the key “time serie id” in one of the reset_options dictionaries, we recommend you do it for all of them, otherwise you might not end up computing the correct episodes.

    Danger

    As of now, computing the same time serie more than once using the runner is not properly handled (more specifically, the computation will happen but the files might not be saved correctly on the hard drive, because all the results are written to the same location). We do not advise doing it.

Returns:

res

List of tuples. Each tuple has the following elements:

  • ”id_chron” unique identifier of the episode

  • ”name_chron” name of the time series (usually it is the path where it is stored)

  • ”cum_reward” the cumulative reward obtained by the Runner.Agent on this episode i

  • ”nb_time_step”: the number of time steps played in this episode.

  • ”total_step”: the total number of time steps possible in this episode.

  • ”episode_data” : [Optional] The EpisodeData corresponding to this episode run only if add_detailed_output=True

  • ”add_nb_highres_sim”: [Optional] The estimated number of calls to high resolution simulator made by the agent. Only present if add_nb_highres_sim=True in the kwargs

Return type:

list

Examples

You can use the runner this way:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
res = runner.run(nb_episode=1)
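
The elements of res can then be unpacked following the Returns section above, for example:

for id_chron, name_chron, cum_reward, nb_time_step, max_ts in res:
    print(f"{name_chron}: cumulative reward {cum_reward:.2f}, survived {nb_time_step}/{max_ts} steps")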

If you would rather provide an agent instance (and not a class), you can do it this way:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
my_agent = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
res = runner.run(nb_episode=1)

Finally, in the presence of a stochastic environment or a stochastic agent, you might want to set seeds to ensure reproducible experiments: you can seed both the environment and your agent by passing the env_seeds and agent_seeds parameters (in the example below, the agent is seeded with 42 and the environment with 0).

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
my_agent = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
res = runner.run(nb_episode=1, agent_seeds=[42], env_seeds=[0])

Since grid2op 1.10.2 you can also set the initial state of the grid when calling the runner. You can do that with the kwargs init_states, for example like this:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
my_agent = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
res = runner.run(nb_episode=1,
                 agent_seeds=[42],
                 env_seeds=[0],
                 init_states=[{"set_line_status": [(0, -1)]}]
                 )

Note

We recommend that you provide init_states as a list having a length of nb_episode. Each episode will be initialized with the provided element of the list. However, if you provide only one element, then all episodes you want to compute will be initialized with this same action.

Note

At the beginning of each episode, if an init_state is set, the environment is reset with a call like: env.reset(options={"init state": init_state})

This is why we recommend you use a dictionary to set the initial state, so that you can control exactly what is done (e.g. set the “method”). More information about this is available in the documentation of the grid2op.Environment.Environment.reset() function.
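
For instance, a minimal sketch (mirroring the reset_options example below) that runs two episodes, each starting from its own initial state:

res = runner.run(nb_episode=2,
                 init_states=[{"set_line_status": [(0, -1)]},
                              {"set_line_status": [(1, -1)]}]
                 )
# each episode starts with a different powerline disconnected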

Since grid2op 1.10.3 you can also customize the way the runner will “reset” the environment with the kwargs reset_options.

Concretely, if you specify runner.run(…, reset_options=XXX) then the environment will be reset with a call to env.reset(options=reset_options).

As for the init_states kwarg, reset_options can be either a dictionary, in which case the same dict will be used for all the episodes run by the runner, or a list / tuple of dictionaries with the same size as the nb_episode kwarg.

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
my_agent = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
res = runner.run(nb_episode=2,
                 agent_seeds=[42, 43],
                 env_seeds=[0, 1],
                 reset_options={"init state": {"set_line_status": [(0, -1)]}}
                 )
# the same initial state will be used for the two episodes

res2 = runner.run(nb_episode=2,
                  agent_seeds=[42, 43],
                  env_seeds=[0, 1],
                  reset_options=[{"init state": {"set_line_status": [(0, -1)]}},
                                 {"init state": {"set_line_status": [(1, -1)]}}]
                  )
# two different initial states will be used: the first one for the
# first episode and the second one for the second

Note

In case of conflicting inputs, for example when you specify:

runner.run(...,
           init_states=XXX,
           reset_options={"init state": YYY}
           )

or

runner.run(...,
           max_iter=XXX,
           reset_options={"max step": YYY}
           )

or

runner.run(...,
           episode_id=XXX,
           reset_options={"time serie id": YYY}
           )

Then: 1) a warning is issued to inform you that you might have done something wrong and 2) the value in XXX above (i.e. the value provided in the runner.run kwargs) is always used instead of the value YYY (i.e. the value present in the reset_options).

In other words, the arguments of the runner.run have the priority over the arguments passed to the reset_options.

Danger

If you provide the key “time serie id” in one of the reset_options dictionaries, we recommend you do it for all of them, otherwise you might not end up computing the correct episodes.

run_one_episode(indx=0, path_save=None, pbar=False, env_seed=None, max_iter=None, agent_seed=None, episode_id=None, detailed_output=False, add_nb_highres_sim=False, init_state=None, reset_options=None) Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int][source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Function used to run one episode of the Runner.agent and see how it performs in the Runner.env.

Parameters:
  • indx (int) – The index of the episode to run (ignored if episode_id is not None)

  • path_save (str, optional) – Path where to save the data. See the description of grid2op.Runner for the structure of the saved file.

  • detailed_output – See descr. of Runner.run() method

  • add_nb_highres_sim – See descr. of Runner.run() method

Returns:

  • TODO DEPRECATED DOC

  • cum_reward (np.float32) – The cumulative reward obtained by the agent during this episode

  • time_step (int) – The number of timesteps that have been played before the end of the episode (because of a “game over” or because there were no more data)

Still having trouble finding the information? Do not hesitate to open a GitHub issue about the documentation at this link: Documentation issue template