Runner¶
Objectives¶
The runner class aims at:

- facilitating the evaluation of the performance of a grid2op.Agent by automatically performing the "open ai gym loop" (see below)
- defining a format to store the results of such an evaluation in a standardized manner; these "agent logs" can then be re-read by third party applications, such as grid2viz, or by internal classes that ease the study of the agent's behaviour, for example grid2op.Episode.EpisodeData or grid2op.Episode.EpisodeReplay
- allowing easy parallelization of this assessment.
Basically, the runner simplifies the assessment of the performance of some agent. This is the “usual” gym code to run an agent:
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
NB_EPISODE = 10  # assess the performance for 10 episodes, for example
for i in range(NB_EPISODE):
    reward = env.reward_range[0]
    done = False
    obs = env.reset()
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
The above code does not store anything, cannot easily be run in parallel, and is already pretty verbose. To write shorter code that saves most of the data (and makes it easier to integrate with other applications), we can use the runner as follows:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent
env = grid2op.make()
NB_EPISODE = 10 # assess the performance for 10 episodes, for example
NB_CORE = 2 # do it on 2 cores, for example
PATH_SAVE = "agents_log" # and store the results in the "agents_log" folder
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
runner.run(nb_episode=NB_EPISODE, nb_process=NB_CORE, path_save=PATH_SAVE)
As we can see, with fewer lines of code, we can run a parallel assessment of our agent on 10 episodes and save the results (observations, actions, rewards, etc.) in a dedicated folder.
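These saved logs can then be read back, for example with grid2op.Episode.EpisodeData. A minimal sketch, assuming the results were saved in the "agents_log" folder as above:

from grid2op.Episode import EpisodeData

# list the episodes the runner stored in the "agents_log" folder
for _, episode_name in EpisodeData.list_episode("agents_log"):
    episode = EpisodeData.from_disk("agents_log", episode_name)
    # the logged actions, observations, rewards, etc. are available here
    print(episode_name, len(episode.actions))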
If your agent is initialized with a custom __init__ method that takes more than the action space to be built, you can still use the Runner easily by passing it an instance of your agent, for example:
import grid2op
from grid2op.Runner import Runner
env = grid2op.make()
NB_EPISODE = 10 # assess the performance for 10 episodes, for example
NB_CORE = 2 # do it on 2 cores, for example
PATH_SAVE = "agents_log" # and store the results in the "agents_log" folder
# initialize your agent
my_agent = FancyAgentWithCustomInitialization(env.action_space,
                                              env.observation_space,
                                              "whatever else you want")
# and proceed as follows for the runner
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
runner.run(nb_episode=NB_EPISODE, nb_process=NB_CORE, path_save=PATH_SAVE)
Other tools are available for this runner class, for example the easy integration of progress bars. See below for more information.
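For instance, a minimal sketch of enabling a progress bar (the pbar argument is detailed in the Runner.run() documentation below):

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make()
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
# pbar=True displays a tqdm progress bar, provided the tqdm package is installed
runner.run(nb_episode=10, pbar=True)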
Note on parallel processing¶
The "Runner" class allows for parallel execution of the same agent on different scenarios. In this case, each scenario will be run in an independent process.

Depending on the platform and python version, you might end up with bugs and errors like:

AttributeError: Can't get attribute 'ActionSpace_l2rpn_case14_sandbox' on <module 'grid2op.Space.GridObjects' from '/lib/python3.8/site-packages/grid2op/Space/GridObjects.py'> Process SpawnPoolWorker-4:

or like:

File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/pool.py", line 125, in worker result = (True, func(*args, **kwds))
File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/multiprocessing/pool.py", line 51, in starmapstar return list(itertools.starmap(args[0], args[1]))
This means grid2op is having a hard time dealing with the multiprocessing part. In that case, it is recommended to disable it completely, for example by using the following code before any call to "runner.run":
import os
from grid2op.Runner import Runner
os.environ[Runner.FORCE_SEQUENTIAL] = "1"
This will force grid2op (starting from version 1.5) to use the sequential runner instead of dealing with the added complexity of multiprocessing. This is especially handy on Windows systems in case of trouble.

For information, as of writing (March 2021):

- macOS with python <= 3.7 will behave like any python version on linux
- windows and macOS with python >= 3.8 will behave differently from linux, but similarly to one another
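In addition, on platforms where python uses the "spawn" start method (Windows, and macOS with python >= 3.8), the standard multiprocessing advice applies: protect the entry point of your script with an if __name__ == "__main__" guard. A minimal sketch:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

if __name__ == "__main__":
    # with the "spawn" start method, child processes re-import this module,
    # so the code below must not run at import time
    env = grid2op.make()
    runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
    runner.run(nb_episode=10, nb_process=2)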
Detailed Documentation by class¶
Classes:

Runner: A runner is a utility tool that allows you to run simulations more easily.
- class grid2op.Runner.Runner(init_env_path: str, init_grid_path: str, path_chron, name_env='unknown', parameters_path=None, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, envClass=<class 'grid2op.Environment.environment.Environment'>, other_env_kwargs=None, gridStateclass=<class 'grid2op.Chronics.gridStateFromFile.GridStateFromFile'>, backendClass=<class 'grid2op.Backend.pandaPowerBackend.PandaPowerBackend'>, backend_kwargs=None, agentClass=<class 'grid2op.Agent.doNothing.DoNothingAgent'>, agentInstance=None, verbose=False, gridStateclass_kwargs={}, voltageControlerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, thermal_limit_a=None, max_iter=-1, other_rewards={}, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, opponent_kwargs={}, grid_layout=None, with_forecast=True, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget=None, has_attention_budget=False, logger=None, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, _read_from_local_dir=False, _is_test=False)[source]¶
A runner is a utility tool that allows you to run simulations more easily.
It is a more convenient way to execute the following loops:
import grid2op
from grid2op.Agent import RandomAgent  # for example...
from grid2op.Runner import Runner

env = grid2op.make()
agent = RandomAgent(env.action_space)

###############
# the gym loops
nb_episode = 5
for i in range(nb_episode):
    obs = env.reset()
    done = False
    reward = env.reward_range[0]
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)

###############
# equivalent with use of a Runner
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
res = runner.run(nb_episode=nb_episode)
The main purpose of this specific class is to evaluate the performance of a trained grid2op.Agent.BaseAgent rather than to train it. It also has the useful property of being able to save the results of an experiment in the standardized manner described in grid2op.Episode.EpisodeData.

NB: we do not recommend creating a runner from scratch by providing all its arguments. We strongly encourage you to use grid2op.Environment.Environment.get_params_for_runner() to create a runner.

- envClass¶
The type of the environment used for the game. The class should be given, not an instance (object) of this class. The default is grid2op.Environment. If modified, it should derive from this class.
- Type: type
- other_env_kwargs¶
Other kwargs used to build the environment (None for "nothing").
- Type: dict
- actionClass¶
The type of action that can be performed by the agent / bot / controler. The class should be given, not an instance of this class. This type should derive from grid2op.BaseAction. The default is grid2op.TopologyAction.
- Type: type
- observationClass¶
This type represents the class that will be used to build the grid2op.BaseObservation visible by the grid2op.BaseAgent. As for Runner.actionClass, this should be a type, not an instance (object) of this type. This type should derive from grid2op.BaseObservation. The default is grid2op.CompleteObservation.
- Type: type
- rewardClass¶
Represents the type used to build the rewards that are given to the BaseAgent. As for Runner.actionClass, this should be a type, not an instance (object) of this type. This type should derive from grid2op.BaseReward. The default is grid2op.ConstantReward, which should not be used to train or evaluate an agent, but rather for debugging purposes.
- Type: type
- gridStateclass¶
This type controls the mechanisms to read chronics and assign data to the powergrid. Like every ".*Class" attribute, the type should be passed, not an instance (object) of this type. Its default is grid2op.GridStateFromFile and it must be a subclass of grid2op.GridValue.
- Type: type
- legalActClass¶
This type controls the mechanisms to assess whether a grid2op.BaseAction is legal. Like every ".*Class" attribute, the type should be passed, not an instance (object) of this type. Its default is grid2op.AlwaysLegal and it must be a subclass of grid2op.BaseRules.
- Type: type
- backendClass¶
This type controls the backend, e.g. the software that computes the powerflows. Like every ".*Class" attribute, the type should be passed, not an instance (object) of this type. Its default is grid2op.PandaPowerBackend and it must be a subclass of grid2op.Backend.
- Type: type
- backend_kwargs¶
Optional arguments used to build the backend. These arguments will not be copied to create the backend used by the runner. They might be required to be picklable on some platforms when using multiprocessing.
- Type: dict
- agentClass¶
This type controls the type of BaseAgent, e.g. the bot / controler that will take grid2op.BaseAction and avoid cascading failures. Like every ".*Class" attribute, the type should be passed, not an instance (object) of this type. Its default is grid2op.DoNothingAgent and it must be a subclass of grid2op.BaseAgent.
- Type: type
- logger¶
An object that can be used to log information, either in a text file or by printing it to the command prompt.
- init_grid_path¶
This attribute stores the path where the powergrid data are located. If a relative path is given, it will be extended to an absolute path.
- Type: str
- names_chronics_to_backend¶
See the description of grid2op.ChronicsHelper.initialize() for more information about this dictionary.
- Type: dict
- parameters_path¶
Where to look for the grid2op.Environment grid2op.Parameters. It defaults to None, which corresponds to using default values.
- Type: str, optional
- parameters¶
Type of _parameters used. This is an instance (object) of type grid2op.Parameters initialized from Runner.parameters_path.
- Type: grid2op.Parameters
- path_chron¶
Path indicating where to look for temporal data.
- Type: str
- chronics_handler¶
Initialized from Runner.gridStateclass and Runner.path_chron, it represents the input data used to generate grid states by the Runner.env.
- Type: grid2op.ChronicsHandler
- backend¶
Used to compute the powerflow. This object has the type given by Runner.backendClass.
- Type: grid2op.Backend
- env¶
Represents the environment which the agent / bot / controler must control through actions. It is initialized from the Runner.envClass.
- Type: grid2op.Environment
- agent¶
Represents the agent / bot / controler that takes actions performed on an environment (the powergrid) to maximize a certain reward.
- Type: grid2op.Agent.BaseAgent
- verbose¶
If True, then a detailed output of each step is written.
- Type: bool
- gridStateclass_kwargs¶
Additional keyword arguments used to build the Runner.chronics_handler.
- Type: dict
- thermal_limit_a¶
The thermal limit for the environment (if any).
- Type: numpy.ndarray
- opponent_action_class¶
The action class used for the opponent. The opponent will not be able to use actions that are invalid for the given action class. It defaults to grid2op.Action.DontAct, which forbids any type of action.
- Type: type, optional
- opponent_class¶
The opponent class to use. The default class is grid2op.Opponent.BaseOpponent, which is a type of opponent that does nothing.
- Type: type, optional
- opponent_init_budget¶
The initial budget of the opponent. It defaults to 0.0, which means the opponent cannot perform any action unless this is modified.
- Type: float, optional
- opponent_budget_per_ts¶
The budget increase of the opponent per time step.
- Type: float, optional
- opponent_budget_class¶
The class used to compute the attack cost.
- Type: type, optional
- grid_layout¶
The layout of the grid (position of each substation), useful if you need to plot some things for example.
- Type: dict, optional
- TODO¶
- _attention_budget_cls=LinearAttentionBudget,
- _kwargs_attention_budget=None,
- _has_attention_budget=False
Examples
Different examples are shown in the description of the main method Runner.run().
Notes
Runner does not necessarily behave normally when "nb_process" is not 1 on some platforms (Windows and some versions of macOS). Please read the documentation, and especially the Note on parallel processing above, for more information and possible ways to disable this feature.
Methods:

__init__(init_env_path, init_grid_path, ...): Initialize the Runner.
_clean_up(): INTERNAL
_run_parrallel(nb_episode[, nb_process, ...]): INTERNAL
_run_sequential(nb_episode[, path_save, ...]): INTERNAL
init_env(): INTERNAL
reset(): INTERNAL
run(nb_episode[, nb_process, path_save, ...]): Main method of the Runner class.
run_one_episode([indx, path_save, pbar, ...]): INTERNAL

Attributes:

__weakref__: list of weak references to the object (if defined)
- __init__(init_env_path: str, init_grid_path: str, path_chron, name_env='unknown', parameters_path=None, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, envClass=<class 'grid2op.Environment.environment.Environment'>, other_env_kwargs=None, gridStateclass=<class 'grid2op.Chronics.gridStateFromFile.GridStateFromFile'>, backendClass=<class 'grid2op.Backend.pandaPowerBackend.PandaPowerBackend'>, backend_kwargs=None, agentClass=<class 'grid2op.Agent.doNothing.DoNothingAgent'>, agentInstance=None, verbose=False, gridStateclass_kwargs={}, voltageControlerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, thermal_limit_a=None, max_iter=-1, other_rewards={}, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, opponent_kwargs={}, grid_layout=None, with_forecast=True, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget=None, has_attention_budget=False, logger=None, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, _read_from_local_dir=False, _is_test=False)[source]¶
Initialize the Runner.
- Parameters:
  - init_grid_path (str) – Mandatory, used to initialize Runner.init_grid_path.
  - path_chron (str) – Mandatory, where to look for chronics data, used to initialize Runner.path_chron.
  - parameters_path (str or dict, optional) – Used to initialize Runner.parameters_path. If it's a string, parameters are assumed to be located at this path; if it's a dictionary, the parameters are converted from this dictionary.
  - names_chronics_to_backend (dict, optional) – Used to initialize Runner.names_chronics_to_backend.
  - actionClass (type, optional) – Used to initialize Runner.actionClass.
  - observationClass (type, optional) – Used to initialize Runner.observationClass.
  - rewardClass (type, optional) – Used to initialize Runner.rewardClass. Defaults to grid2op.ConstantReward, which should *not* be used to train or evaluate an agent, but rather for debugging purposes.
  - legalActClass (type, optional) – Used to initialize Runner.legalActClass.
  - envClass (type, optional) – Used to initialize Runner.envClass.
  - gridStateclass (type, optional) – Used to initialize Runner.gridStateclass.
  - backendClass (type, optional) – Used to initialize Runner.backendClass.
  - agentClass (type, optional) – Used to initialize Runner.agentClass.
  - agentInstance (grid2op.Agent.Agent) – Used to initialize the agent. Note that only one of agentClass or agentInstance can be used at the same time: if both of them are None, or both of them are not None, an error is thrown.
  - verbose (bool, optional) – Used to initialize Runner.verbose.
  - thermal_limit_a (numpy.ndarray) – The thermal limit for the environment (if any).
  - voltagecontrolerClass (grid2op.VoltageControler.ControlVoltageFromFile, optional) – The controler that will change the voltage setpoints of the generators.
  - opponent – # TODO documentation on the opponent
  - budget – # TODO doc for the attention budget
- __weakref__¶
list of weak references to the object (if defined)
- _clean_up()[source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Close the environment if it has been created.
- _run_parrallel(nb_episode, nb_process=1, path_save=None, env_seeds=None, agent_seeds=None, max_iter=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]] [source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This method will run the nb_episode episodes in parallel, independently, over nb_process processes.

In case the agent cannot be cloned using copy.copy, nb_process is set to 1.

Note that it completely restarts the Runner.backend and the Runner.env if the computation is actually performed with more than 1 core (nb_process > 1).

It uses the python multiprocessing module, and especially multiprocessing.Pool, to perform the computations. This implies that all runs are completely independent (they happen in different processes) and that the memory consumption can be high. Tests may be recommended if the amount of RAM is low.

It has the same return type as Runner._run_sequential().

- Parameters:
  - nb_episode (int) – Number of episodes to simulate.
  - nb_process (int, optional) – Number of processes used to play the nb_episode. Defaults to 1.
  - path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module (Runner) for more information.
  - env_seeds (list) – An iterable of the seeds used for the environment. By default None, no seeds are set. If provided, its size should match nb_episode.
  - agent_seeds (list) – An iterable that contains the seeds used for the agent. By default None means no seeds are set. If provided, its size should match nb_episode. The agent will be seeded at the beginning of each scenario BEFORE calling agent.reset().
  - add_detailed_output – See the Runner.run() method.

- Returns:
  res – List of tuples. Each tuple contains:
  - "i": unique identifier of the episode (compared to Runner._run_sequential(), the elements of the returned list are not necessarily sorted by this value)
  - "cum_reward": the cumulative reward obtained by the agent on this episode
  - "nb_time_step": the number of time steps played in this episode
  - "max_ts": the maximum number of time steps of the chronics
  - "episode_data": the EpisodeData corresponding to this episode run

- Return type: list
- _run_sequential(nb_episode, path_save=None, pbar=False, env_seeds=None, agent_seeds=None, max_iter=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]] [source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This method is called to see how well an agent performs on a sequence of episodes.

- Parameters:
  - nb_episode (int) – Number of episodes to play.
  - path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module (Runner) for more information.
  - pbar (bool or type or object) – How to display the progress bar, understood as follows:
    - if pbar is None, nothing is done.
    - if pbar is a boolean, a tqdm progress bar is used, if the tqdm package is available and installed on the system [if True]. If it's False, it's equivalent to pbar being None.
    - if pbar is a type (a class), it is used to build a progress bar at the highest level (episode) and at the lower levels (each step within an episode). In this case the type must accept the arguments "total" and "desc" when being built, and its closing is ensured by this method.
    - if pbar is an object (an instance of a class), it is used to make a progress bar at the highest level (episode) but not at lower levels (each step within an episode).
  - env_seeds (list) – An iterable of the seeds used for the environment. By default None, no seeds are set. If provided, its size should match nb_episode.
  - episode_id (list) – For each of the nb_episode you want to compute, it specifies the id of the chronics that will be used. By default None. If provided, its size should match nb_episode.
  - add_detailed_output – See the Runner.run() method.

- Returns:
  res – List of tuples. Each tuple contains:
  - "id_chron": unique identifier of the episode
  - "name_chron": name of the chronics
  - "cum_reward": the cumulative reward obtained by the agent on this episode
  - "nb_time_step": the number of time steps played in this episode
  - "max_ts": the maximum number of time steps of the chronics
  - "episode_data": the EpisodeData corresponding to this episode run

- Return type: list
- init_env() BaseEnv [source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Function used to initialize the environment and the agent. It is called by Runner.reset().
- reset()[source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Used to reset an environment. This method is called at the beginning of each new episode. If the environment is not initialized, then it initializes it with Runner.make_env().
- run(nb_episode, nb_process=1, path_save=None, max_iter=None, pbar=False, env_seeds=None, agent_seeds=None, episode_id=None, add_detailed_output=False, add_nb_highres_sim=False) List[Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int]] [source]¶
Main method of the Runner class. It will either call Runner._run_sequential() if "nb_process" is 1, or Runner._run_parrallel() if nb_process >= 2.

- Parameters:
  - nb_episode (int) – Number of episodes to simulate.
  - nb_process (int, optional) – Number of processes used to play the nb_episode. Defaults to 1. NB: multiprocessing is deactivated on Windows-based platforms (it was not fully supported, so we decided to remove it).
  - path_save (str, optional) – If not None, it specifies where to store the data. See the description of this module (Runner) for more information.
  - max_iter (int) – Maximum number of iterations you want the runner to perform.
  - pbar (bool or type or object) – How to display the progress bar, understood as follows:
    - if pbar is None, nothing is done.
    - if pbar is a boolean, a tqdm progress bar is used, if the tqdm package is available and installed on the system [if True]. If it's False, it's equivalent to pbar being None.
    - if pbar is a type (a class), it is used to build a progress bar at the highest level (episode) and at the lower levels (each step within an episode). In this case the type must accept the arguments "total" and "desc" when being built, and its closing is ensured by this method.
    - if pbar is an object (an instance of a class), it is used to make a progress bar at the highest level (episode) but not at lower levels (each step within an episode).
  - env_seeds (list) – An iterable of the seeds used for the environment. By default None, no seeds are set. If provided, its size should match nb_episode.
  - agent_seeds (list) – An iterable that contains the seeds used for the agent. By default None means no seeds are set. If provided, its size should match nb_episode. The agent will be seeded at the beginning of each scenario BEFORE calling agent.reset().
  - episode_id (list) – For each of the nb_episode you want to compute, it specifies the id of the chronics that will be used. By default None. If provided, its size should match nb_episode.
  - add_detailed_output (bool) – A flag to add an EpisodeData object to the results, containing a lot of information about the run.
  - add_nb_highres_sim (bool) – Whether to add an estimated number of calls to the "high resolution simulator" performed by the agent (either by obs.simulate, by obs.get_forecast_env or by obs.get_simulator).

- Returns:
  res – List of tuples. Each tuple contains:
  - "id_chron": unique identifier of the episode (when run in parallel, the elements of the returned list are not necessarily sorted by this value)
  - "name_chron": name of the chronics
  - "cum_reward": the cumulative reward obtained by the agent on this episode
  - "nb_time_step": the number of time steps played in this episode
  - "max_ts": the maximum number of time steps of the chronics
  - "episode_data": [optional] the EpisodeData corresponding to this episode run, only if add_detailed_output=True
  - "nb_highres_sim": [optional] the estimated number of calls to the high resolution simulator made by the agent, only if add_nb_highres_sim=True

- Return type: list
Examples
You can use the runner this way:
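A minimal sketch, consistent with the usage shown at the top of this page:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make()
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
res = runner.run(nb_episode=2)
# each tuple contains the episode id and name, the cumulative reward,
# the number of steps played and the maximum number of steps
for id_chron, name_chron, cum_reward, nb_time_step, max_ts in res:
    print(f"{name_chron}: reward {cum_reward:.2f}, {nb_time_step}/{max_ts} steps")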
If you would rather provide an agent instance (and not a class), you can do it this way:
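Again a sketch, mirroring the agentInstance example from the top of this page:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make()
my_agent = RandomAgent(env.action_space)
# pass agentClass=None and give the instance through agentInstance
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
res = runner.run(nb_episode=2)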
Finally, in the presence of a stochastic environment or a stochastic agent, you might want to set seeds to ensure reproducible experiments: you can seed both the environment and your agent by passing the env_seeds and agent_seeds parameters (in the example below, the agent is seeded with 42 and the environment with 0).
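A sketch (one seed per episode, with list sizes matching nb_episode):

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

env = grid2op.make()
my_agent = RandomAgent(env.action_space)
runner = Runner(**env.get_params_for_runner(), agentClass=None, agentInstance=my_agent)
# the agent is seeded with 42, the environment with 0
res = runner.run(nb_episode=1, agent_seeds=[42], env_seeds=[0])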
- run_one_episode(indx=0, path_save=None, pbar=False, env_seed=None, max_iter=None, agent_seed=None, episode_id=None, detailed_output=False, add_nb_highres_sim=False) Tuple[str, str, float, int, int] | Tuple[str, str, float, int, int, EpisodeData] | Tuple[str, str, float, int, int, EpisodeData, int] [source]¶
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Function used to run one episode of the Runner.agent and see how it performs in the Runner.env.

- Parameters:
  - indx (int) – The number of episodes previously run.
  - path_save (str, optional) – Path where to save the data. See the description of grid2op.Runner for the structure of the saved file.
  - detailed_output – See the description of the Runner.run() method.
  - add_nb_highres_sim – See the description of the Runner.run() method.

- Returns:
  TODO DEPRECATED DOC
  - cum_reward (np.float32) – The cumulative reward obtained by the agent during this episode.
  - time_step (int) – The number of time steps that have been played before the end of the episode (because of a "game over" or because there were no more data).
Still having trouble finding the information? Do not hesitate to open a github issue about the documentation at this link: Documentation issue template