Environment
This page is organized as follows:
Objectives
This module defines the Environment, the higher level representation of the world with which a grid2op.Agent.BaseAgent will interact.
The environment receives a grid2op.Action.BaseAction from the grid2op.Agent.BaseAgent in Environment.step() and returns a grid2op.Observation.BaseObservation that the grid2op.Agent.BaseAgent will use to perform the next action.
An environment is best used inside a grid2op.Runner.Runner, mainly because runners abstract the interaction between environment and agent and ensure the environment is properly reset after each episode.
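For instance, a minimal sketch of such usage (the environment name, agent class and output folder are only examples to adapt) could look like this:
import grid2op
from grid2op.Agent import RandomAgent
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")
# build a runner that will evaluate a RandomAgent on this environment
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
# run 2 episodes and save the results in "./runner_logs"
res = runner.run(nb_episode=2, path_save="./runner_logs")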
Usage
In this section we present some ways to use the Environment class.
Basic Usage
This example is adapted from the gymnasium documentation (available at gym random_agent.py):
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    obs = env.reset()
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))
What happens here is the following:
obs = env.reset() will reset the environment to be usable again. It will load, by default, the next “chronics” (you can imagine chronics as the graphics of a video game: it tells where the enemies are located, where the walls are, the ground etc. - each chronics can be thought of as a different “game level”).
action = agent.act(obs, reward, done) will choose an action based on the observation obs. This action should be of type grid2op.Action.BaseAction (or one of its derived classes). In the case of a video game, that would be you receiving an observation (usually displayed on the screen) and acting on a controller. For example you could choose to go “left” / “right” / “up” or “down”. Of course in the case of the powergrid the actions are more complicated than that.
obs, reward, done, info = env.step(action) is the call to go to the next step. You can imagine it as the next “frame”. To continue the parallel with video games, at the previous line you asked “pacman” to go left (for example) and then the next frame is displayed (here returned as a new observation obs).
You might want to customize this general behaviour in multiple ways:
you might want to study only one chronics (equivalent to only one level of a video game) see Study always the same time serie
you might want to loop through the chronics, but not always in the same order. If that is the case you might want to consult the section Shuffle the chronics order
you might also have spotted some chronics that have bad properties. In this case, you can “remove” them from the environment (they will be ignored). This is explained in Skipping some chronics
you might also want to select, at random, the next chronic you will use. This allows some compromise between all the above solutions. Instead of ignoring some chronics you can select them less frequently, instead of always using the same one you can sample it more often, and of course, because the sampling is done randomly, it is unlikely that the order will remain the same. To use that you can check the section Sampling the chronics
In a different scenario, you might also want to skip the first time steps of the chronics, which would be equivalent to starting in the “middle” of a video game. If that is the case, the subsection Skipping some time steps is made for you.
Finally, you might have noticed that each call to “env.reset” might take a while. This can dramatically increase the training time, especially at the beginning. This is due to the fact that each time env.reset is called, the whole chronics is read from the hard drive. If you want to lower this impact then you might consult the Optimize the data pipeline page of the doc.
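As a quick illustration (a minimal sketch only; the dedicated page covers more options), you can for example lower the amount of data read from the hard drive at each reset by setting a small “chunk size” on the chronics handler:
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
# read the time series by chunks of 100 steps instead of loading
# a whole scenario in memory at each call to env.reset()
env.chronics_handler.set_chunk_size(100)
obs = env.reset()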
Go to the next scenario
Starting with grid2op 1.9.8, we attempt to offer an easier user experience for the selection of time series, seed, initial state of the grid, etc.
All of the above can be done when calling the env.reset() function.
For customizing the seed, you can for example do:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(seed=0)
For customizing the time series id you want to use:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(options={"time serie id": 1}) # time serie by id (sorted alphabetically)
# or
obs = env.reset(options={"time serie id": "0001"}) # time serie by name (folder name)
For customizing the initial state of the grid, for example forcing the powerline 0 to be disconnected in the initial observation:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
init_state_dict = {"set_line_status": [(0, -1)]}
obs = env.reset(options={"init state": init_state_dict})
Feel free to consult the documentation of the Environment.reset() function for more information (this page might be outdated; the docstring of the function should be more up to date with the code).
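Note that, if needed, these customizations should be combinable in a single call (a sketch reusing the options shown above; check the Environment.reset() documentation for the exact behaviour):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
# seed, time serie and initial state customized in a single reset call
obs = env.reset(seed=0,
                options={"time serie id": "0001",
                         "init state": {"set_line_status": [(0, -1)]}})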
Note
In the near future (next few releases) we will also attempt to make the customization of the parameters, the number of steps to skip and the maximum duration of the scenarios available through the env.reset() options.
Time series Customization
Study always the same time serie
If you spotted a particularly interesting chronics, or if you want, for some reason, your agent to see only one chronics, you can do this rather easily with grid2op.
All chronics are given a unique persistent ID (it means that, as long as the data is not modified, the same chronics will always have the same ID each time you load the environment). The environment has a “set_id” method that allows you to use it. Just add “env.set_id(THE_ID_YOU_WANT)” before the call to “env.reset”. This gives the following code:
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

###################################
THE_CHRONIC_ID = 42
###################################

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ###################################
    # with recent grid2op
    obs = env.reset(options={"time serie id": THE_CHRONIC_ID})
    ###################################

    ###################################
    # 'old method (oldest grid2op version)'
    # env.set_id(THE_CHRONIC_ID)
    # obs = env.reset()
    ###################################

    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Shuffle the chronics order
In some other use cases, you might want to go through the whole set of chronics, and then loop again through them, but in a different order (remember that by default it will always loop in the same order 0, 1, 2, 3, …, 0, 1, 2, 3, …, 0, 1, 2, 3, …).
Again, doing so with grid2op is rather easy. To that end you can use the chronics_handler.shuffle function, which does exactly that. You can use it like this:
import numpy as np
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# total number of episode
total_episode = len(env.chronics_handler.subpaths)

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ###################################
    if i % total_episode == 0:
        # I shuffle each time i need to
        env.chronics_handler.shuffle()
    ###################################
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Skipping some chronics
Some chronics might be too hard to start a training (“learn to walk before running”) and conversely some chronics might be too easy after a while (you can basically solve them without doing anything). This is why grid2op allows you to have some control over which chronics will be used by the environment.
For this purpose you can use the chronics_handler.set_filter function. This function takes a “filtering function” as argument. This “filtering function” takes as argument the full path of the chronics and should return True / False depending on whether or not you want to keep it. Here is an example:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

###################################
# this is the only line of code to add
# here i select only the chronics that start by "00"
env.chronics_handler.set_filter(lambda path: re.match(".*00[0-9].*", path) is not None)
kept = env.chronics_handler.reset()  # if you don't do that it will not have any effect
print(kept)  # i print the chronics kept
###################################

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Sampling the chronics
Finally, for even more flexibility, you can choose to sample which chronics will be used next. To achieve that you can call the chronics_handler.sample_next_chronics function. This function takes a vector of probabilities as input (if not provided it assumes all probabilities are equal) and will select an id based on this probability vector.
In the following example we assume that the vector of probabilities is always the same and that we want, for some reason, to oversample the first 10 chronics and undersample the last 10:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

###################################
# total number of episode
total_episode = len(env.chronics_handler.subpaths)
probas = np.ones(total_episode)
# oversample the first 10 episode
probas[:10] *= 5
# undersample the last 10 episode
probas[-10:] /= 5
###################################

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    ###################################
    _ = env.chronics_handler.sample_next_chronics(probas)  # this is added
    ###################################
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
NB: here we have a constant vector of probabilities, but you might imagine adapting it during training, for example to oversample the scenarios your agent has trouble solving.
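As an illustration only (a hypothetical sketch, assuming sample_next_chronics returns the id of the selected chronic, as suggested by the example above), such an adaptive scheme could look like:
import numpy as np
import grid2op
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
total_episode = len(env.chronics_handler.subpaths)
# number of steps survived the last time each chronic was played (1 to start with)
survival = np.ones(total_episode)

for i in range(100):
    # chronics on which the agent survived fewer steps are sampled more often
    probas = 1.0 / survival
    chron_id = env.chronics_handler.sample_next_chronics(probas)
    obs = env.reset()
    reward, done, nb_step = 0.0, False, 0
    while not done:
        obs, reward, done, info = env.step(agent.act(obs, reward, done))
        nb_step += 1
    survival[chron_id] = max(nb_step, 1)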
Skipping some time steps
Another way to customize which data your agent will face is to make as if the chronics started at a different date and time. This might be handy in case a scenario is hard at the beginning but less hard at the end, or if you want your agent to learn to start controlling the grid at any date and time (in grid2op most of the chronics data provided start at midnight for example).
To achieve this goal, you can use the BaseEnv.fast_forward_chronics() function. This function skips a given number of steps. In the following example, we always skip the first 42 time steps before starting the episode:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    obs = env.reset()

    ###################################
    # below are the two lines added
    env.fast_forward_chronics(42)
    obs = env.get_obs()
    ###################################

    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Generating chronics that are always new
New in version 1.6.6: This functionality is only available for some environments, for example “l2rpn_wcci_2022”
Warning
A much better alternative to this class is to have a “process” generate the data, thanks to the grid2op.Environment.Environment.generate_data() function, and then to reload the data in a (separate) training script.
This is explained in section Generate and use an “infinite” data of the documentation.
Though it is not recommended at all (for performance reasons), you have the possibility, starting from grid2op 1.6.6 (and using a compatible environment, eg “l2rpn_wcci_2022”), to generate a possibly infinite amount of data thanks to the grid2op.Chronics.FromChronix2grid class.
The data generation process is rather slow for different reasons. The main one is that the data needs to meet a lot of “constraints” to be realistic, some of which are given in the Elements modeled in this environment and their main properties module. On our machines, it takes roughly 40-50 seconds to generate a weekly scenario for the l2rpn_wcci_2022 environment (usually an agent will fail in 1 or 2s… this is why we do not recommend using it).
To generate data “on the fly” you simply need to create the environment with the right chronics class as follows:
import os
import grid2op
from grid2op.Chronics import FromChronix2grid
env_nm = "l2rpn_wcci_2022"  # only compatible environment at time of writing
env = grid2op.make(env_nm,
                   chronics_class=FromChronix2grid,
                   data_feeding_kwargs={"env_path": os.path.join(grid2op.get_current_local_dir(), env_nm),
                                        "with_maintenance": True,  # whether to include maintenance (optional)
                                        "max_iter": 2 * 288,  # duration (in number of steps) of the data generated (optional)
                                        }
                   )
And this is it. Each time you call env.reset() it will internally call the chronix2grid package to generate new data for this environment (this is why env.reset() will take roughly 50s…).
Warning
For this class to be available, you need to have the “chronix2grid” package installed and working.
Please install it with pip install grid2op[chronix2grid] and make sure to have the coinor-cbc solver available on your system (more information at https://github.com/bdonnot/chronix2grid#installation)
Warning
Because I know from experience warnings are skipped half of the time: please consult Generate and use an “infinite” data for a better way to generate infinite data !
Generate and use an “infinite” data
New in version 1.6.6.
Warning
For this class to be available, you need to have the “chronix2grid” package installed and working.
Please install it with pip install grid2op[chronix2grid] and make sure to have the coinor-cbc solver available on your system (more information at https://github.com/bdonnot/chronix2grid#installation)
In this section we present a way to generate a possibly infinite amount of data for training your agent (in case the data shipped with the environment is too limited).
One way to do this is to split the data “generation” process into one python script, and the data “consumption” process (for example training an agent) into another one.
This is much more efficient than using the grid2op.Chronics.FromChronix2grid class, because you will not spend 50s waiting for the data to be generated at each call to env.reset() after the episode is over.
First, create a script to generate all the data that you want. For example in the script “generation.py”:
import grid2op
env_name = "l2rpn_wcci_2022" # only compatible with what comes next (at time of writing)
env = grid2op.make(env_name)
nb_year = 50 # or any "big" number...
env.generate_data(nb_year=nb_year) # generates 50 years of data
# (takes roughly 50s per week, around 45mins per year, in this case 50 * 45 mins = 37.5 hours)
Then create a script to “consume” your data, for example by training an agent (say “train.py”) [we demonstrate it with l2rpn_baselines but you can use whatever you want]:
import os
import grid2op
from lightsim2grid import LightSimBackend  # highly recommended for speed !

env_name = "l2rpn_wcci_2022"  # only compatible with what comes next (at time of writing)
env = grid2op.make(env_name, backend=LightSimBackend())

# now train an agent
# see the l2rpn_baselines package for more information, for example
# l2rpn-baselines.readthedocs.io/
from l2rpn_baselines.PPO_SB3 import train
nb_iter = 10000  # train for that many iterations
agent_name = "WhateverIWant"  # or any other name
agent_path = os.path.expanduser("~")  # or anywhere else on your computer
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path)
# this agent will be trained only on the data available at the creation of the environment

# the training loop will take some time, so more data will be generated when it's over
# reload them
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# and retrain your agent including the data you just generated
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path,
                      load_path=agent_path)

# once it's over, more time has passed, and more data are available
# reload them
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# and retrain your agent
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path,
                      load_path=agent_path)

# well you got the idea
# etc. etc.
Warning
This way of doing things will always increase the size of the data on your hard drive. We do recommend deleting some of the data from time to time.
Deleting the data should be done before the call to env.chronics_handler.init_subpath(), for example:
import os
import shutil
import grid2op

### delete the folders you want to get rid of
names_folder_to_delete = ...
# To build `names_folder_to_delete`
# you could for example:
# - remove the `nth` oldest directories
#   see: https://stackoverflow.com/questions/47739262/find-remove-oldest-file-in-directory
# - or keep only the `kth` most recent directories
# - or keep only `k` folders at random among the ones in `grid2op.get_current_local_dir()`
# - or delete all the oldest files and keep your directory at a fixed size
#   see: https://gist.github.com/ginz/1ba7de8b911651cfc9c85a82a723f952
# etc.
for nm in names_folder_to_delete:
    shutil.rmtree(os.path.join(grid2op.get_current_local_dir(), nm))
####

# reload the remaining data:
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# continue normally
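As an illustration of one of the strategies listed in the comments above (a sketch only: the location of the generated scenarios, assumed here to be the “chronics” subdirectory of the l2rpn_wcci_2022 environment, and the number of folders kept are assumptions to adapt to your setup):
import os
import shutil
import grid2op

k = 20  # keep only the k most recent scenario folders (hypothetical value)
data_dir = os.path.join(grid2op.get_current_local_dir(), "l2rpn_wcci_2022", "chronics")
folders = [os.path.join(data_dir, nm) for nm in os.listdir(data_dir)
           if os.path.isdir(os.path.join(data_dir, nm))]
folders.sort(key=os.path.getmtime, reverse=True)  # most recent first
for path in folders[k:]:
    shutil.rmtree(path)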
Splitting into training, validation, test scenarios
In machine learning the “training / validation / test” framework is particularly useful to avoid overfitting and to develop models that are as performant as possible.
Grid2op allows for such usage at the environment level. There is the possibility to “split” an environment into training / validation and test (ie using only some chronics for training, some others for validation and some others for testing).
This can be done with:
import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# extract 1% of the "chronics" to be used in the validation environment and
# 1% to be used in the test environment. The remaining 98% are kept for training.
nm_env_train, nm_env_val, nm_env_test = env.train_val_split_random(pct_val=1., pct_test=1.)

# and now you can use the training set only to train your agent:
print(f"The name of the training environment is \"{nm_env_train}\"")
print(f"The name of the validation environment is \"{nm_env_val}\"")
print(f"The name of the test environment is \"{nm_env_test}\"")
env_train = grid2op.make(nm_env_train)
You can then use, in the above case:
import grid2op
env_name = "l2rpn_case14_sandbox" # matching above
env_train = grid2op.make(env_name+"_train") # to only use the "training chronics"
# do whatever you want with env_train
And then, at time of validation:
import grid2op
env_name = "l2rpn_case14_sandbox" # matching above
env_val = grid2op.make(env_name+"_val") # to only use the "validation chronics"
# do whatever you want with env_val
# and of course
env_test = grid2op.make(env_name+"_test")
Customization
Environments can be customized in three major ways:
Backend: you change the solver that computes the state of the power grid, typically to make it faster or more realistic
Parameters: you change the behaviour of the Environment. For example you can prevent powerlines from being disconnected when too much current flows on them, etc.
Rules: you change the operational constraints that your agent must meet. For example you can allow it to affect more or fewer powerlines in the same action, etc. (a sketch for this one is given after the code block below)
You can do these at creation time:
import grid2op
env_name = "l2rpn_case14_sandbox" # or any other name
# create the regular environment:
env_reg = grid2op.make(env_name)
# to change the backend
# (here using the lightsim2grid faster backend)
from lightsim2grid import LightSimBackend
env_faster = grid2op.make(env_name, backend=LightSimBackend())
# to change the parameters, for example
# to prevent line disconnect when there is overflow
param = env_reg.parameters
param.NO_OVERFLOW_DISCONNECTION = True
env_easier = grid2op.make(env_name, param=param)
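For the rules, a minimal sketch (assuming you want to lift all operational constraints by using the AlwaysLegal rule class shipped with grid2op) could be:
import grid2op
from grid2op.Rules import AlwaysLegal

env_name = "l2rpn_case14_sandbox"
# every action is considered legal: no limit on the number of
# substations / powerlines affected in a single action
env_no_rules = grid2op.make(env_name, gamerules_class=AlwaysLegal)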
Of course you can combine everything. More examples are given in section Customize your environment.
Detailed Documentation by class
Classes:
- BaseEnv: INTERNAL
- BaseMultiProcessEnvironment: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- Environment: This class is the grid2op implementation of the "Environment" entity in the RL framework.
- MaskedEnvironment: This class is the grid2op implementation of a "masked" environment: lines not in the lines_of_interest mask will NOT be deactivated by the environment if the flow is too high (or moderately high for too long).
- MultiEnvMultiProcess: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- MultiMixEnvironment: This class represents a single powergrid configuration, backed by multiple environment parameters and chronics.
- SingleEnvMultiProcess: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- TimedOutEnvironment: This class is the grid2op implementation of a "timed out environment" entity in the RL framework.
- class grid2op.Environment.BaseEnv(init_env_path: ~os.PathLike, init_grid_path: ~os.PathLike, parameters: ~grid2op.Parameters.Parameters, voltagecontrolerClass: type, name='unknown', thermal_limit_a: ~numpy.ndarray | None = None, epsilon_poly: float = 0.0001, tol_poly: float = 0.01, other_rewards: dict | None = None, with_forecast: bool = True, opponent_space_type: type = <class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class: type = <class 'grid2op.Action.dontAct.DontAct'>, opponent_class: type = <class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget: float = 0.0, opponent_budget_per_ts: float = 0.0, opponent_budget_class: type = <class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration: int = 0, opponent_attack_cooldown: int = 99999, kwargs_opponent: dict | None = None, has_attention_budget: bool = False, attention_budget_cls: type = <class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget: dict | None = None, logger: ~logging.Logger | None = None, kwargs_observation: dict | None = None, observation_bk_class=None, observation_bk_kwargs=None, highres_sim_counter=None, update_obs_after_reward=False, n_busbar=2, _is_test: bool = False, _init_obs: ~grid2op.Observation.baseObservation.BaseObservation | None = None, _local_dir_cls=None, _read_from_local_dir=None, _raw_backend_class=None)[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This class represents some useful abstraction that is re-used by Environment and grid2op.Observation._ObsEnv for example.
The documentation is shown here to document the common attributes of a “BaseEnvironment”.
Notes
Note on environment data ownership
Danger
A non pythonic decision has been implemented in grid2op for various reasons: an environment owns everything created from it.
This means that if you (or the python interpreter) delete the environment, you might not be able to use some data generated with this environment.
More precisely, you cannot do something like:
import grid2op
env = grid2op.make("l2rpn_case14_sandbox")

saved_obs = []

obs = env.reset()
saved_obs.append(obs)
obs2, reward, done, info = env.step(env.action_space())
saved_obs.append(obs2)

saved_obs[0].simulate(env.action_space())  # works
del env
saved_obs[0].simulate(env.action_space())  # DOES NOT WORK
It will raise an error like Grid2OpException EnvError “This environment is closed. You cannot use it anymore.”
This will also happen if you do things inside functions, for example like this:
import grid2op

def foo(manager):
    env = grid2op.make("l2rpn_case14_sandbox")
    obs = env.reset()
    manager.append(obs)
    obs2, reward, done, info = env.step(env.action_space())
    manager.append(obs2)
    manager[0].simulate(env.action_space())  # works
    return manager

manager = []
manager = foo(manager)
manager[0].simulate(env.action_space())  # DOES NOT WORK
The same error is raised because the environment env is automatically deleted by python when the function foo ends (well, it might work in some cases, if the function is called before the variable env is actually deleted, but you should not rely on this behaviour).
- parameters
The parameters of the game (to expose more control on what is being simulated)
- with_forecast
Whether the chronics allow to have some kind of “forecast”. See
BaseEnv.activate_forceast() for more information
- Type:
bool
- logger
TO BE DONE: a way to log what is happening (currently not implemented)
- time_stamp
The actual time stamp of the current observation.
- Type:
datetime.datetime
- nb_time_step
Number of time steps played in the current environment
- Type:
int
- current_obs
The current observation (or None if it’s not initialized)
- backend
The backend used to compute the powerflows.
- Type:
- done
Whether the environment is “done”. If True you need to call Environment.reset() in order to continue.
- Type:
bool
- current_reward
The last computed reward (reward of the current step)
- Type:
float
- other_rewards
Dictionary with key being the name (identifier) and value being some RewardHelper. At each time step, all the values will be computed by the Environment and the information about it will be returned in the “reward” key of the “info” dictionary of Environment.step().
- Type:
dict
- chronics_handler
The object in charge managing the “chronics”, which store the information about load and generator for example.
- reward_range
For open ai gym compatibility. It represents the range of the rewards: reward min, reward max
- Type:
tuple
- _viewer
For open ai gym compatibility.
- viewer_fig
For open ai gym compatibility.
- _gen_activeprod_t
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Should be initialized at 0. for “step” to properly recognize it’s the first time step of the game
- _no_overflow_disconnection
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Whether or not cascading failures are computed or not (TRUE = the powerlines above their thermal limits will not be disconnected). This is initialized based on the attribute grid2op.Parameters.Parameters.NO_OVERFLOW_DISCONNECTION.
- Type:
bool
- _timestep_overflow
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of consecutive timesteps each powerline has been on overflow.
- Type:
numpy.ndarray
, dtype: int
- _nb_timestep_overflow_allowed
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of consecutive timesteps each powerline can be on overflow. It is usually read from grid2op.Parameters.Parameters.NB_TIMESTEP_POWERFLOW_ALLOWED.
- Type:
numpy.ndarray
, dtype: int
- _hard_overflow_threshold
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of timesteps before a grid2op.BaseAgent.BaseAgent can reconnect a powerline that has been disconnected by the environment due to an overflow.
- Type:
float
- _env_dc
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Whether the environment computes the powerflow using the DC approximation or not. It is usually read from grid2op.Parameters.Parameters.ENV_DC.
- Type:
bool
- _names_chronics_to_backend
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Configuration file used to associate the name of the objects in the backend (both extremities of powerlines, load or production for example) with the same object in the data (Environment.chronics_handler). The idea is that, usually, data generation comes from a different software that does not take into account the powergrid infrastructure. Hence, the same “object” can have a different name. This mapping is present to avoid the need to rename the “object” when providing data. A more detailed description is available at grid2op.ChronicsHandler.GridValue.initialize().
- Type:
dict
- _env_modification
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Representation of the actions of the environment for the modification of the powergrid.
- Type:
grid2op.Action.Action
- _rewardClass
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Type of reward used. Should be a subclass of
grid2op.BaseReward.BaseReward
- Type:
type
- _init_grid_path
Warning
/!\ Internal, do not use unless you know what you are doing /!\
The path where the description of the powergrid is located.
- Type:
str
- _game_rules
Warning
/!\ Internal, do not use unless you know what you are doing /!\
The rules of the game (define which actions are legal and which are not)
- _action_space
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to manipulate more easily the actions given to / provided by the
grid2op.Agent.BaseAgent
(player)
- _helper_action_env
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to manipulate more easily the actions given to / provided by the environment to the backend.
- _observation_space
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to generate the observation that will be given to the
grid2op.BaseAgent
- _reward_helper
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper that is called to compute the reward at each time step.
- Type:
grid2p.BaseReward.RewardHelper
- kwargs_observation
TODO
- Type:
dict
- # TODO add the units (eg MW, MWh, MW/time step,etc.) in the redispatching related attributes
Attributes:
- KEYS_RESET_OPTIONS: these are the keys of the dictionary options that can be used when calling env.reset(..., options={})
- action_space: this represents a view on the action space
- observation_space: this represents a view on the observation space
- parameters: Return a deepcopy of the parameters used by the environment
Methods:
- attach_layout(grid_layout): Compared to the method of the base class, this one performs a check.
- change_forecast_parameters(new_parameters): Allows to change the parameters of a “forecast environment”, that is, the environment used by grid2op.Observation.BaseObservation.simulate() and grid2op.Observation.BaseObservation.get_forecast_env()
- change_parameters(new_parameters): Allows to change the parameters of an environment.
- change_reward(new_reward_func): Change the reward function used for the environment.
- classes_are_in_files(): Whether the classes created when this environment has been made are stored on the hard drive (will return True) or not.
- close(): close an environment: this will attempt to free as much memory as possible.
- deactivate_forecast(): This function will have the effect to deactivate the obs.simulate, the forecast will not be updated in the observation space.
- fast_forward_chronics(nb_timestep): This method allows you to skip some time steps at the beginning of the chronics.
- generate_classes(*[, local_dir_id, _guard, ...]): Use with care, but can be incredibly useful !
- get_current_line_status(): INTERNAL
- get_obs([_update_state, _do_copy]): Return the observations of the current environment made by the grid2op.Agent.BaseAgent.
- get_path_env(): Get the path that allows to create this environment.
- get_reward_instance(): INTERNAL
- get_thermal_limit(): Get the current thermal limit in amps registered for the environment.
- load_alarm_data(): Internal
- load_alert_data(): Internal
- reactivate_forecast(): This function will have the effect to reactivate the obs.simulate, the forecast will be updated in the observation space.
- reset(*[, seed, options]): Reset the base environment (set the appropriate variables to correct initialization).
- seed([seed, _seed_me]): Set the seed of this Environment for a better control and to ease reproducible experiments.
- set_thermal_limit(thermal_limit): Set the thermal limit effectively.
- step(action): Run one timestep of the environment's dynamics.
- KEYS_RESET_OPTIONS = {'init state', 'init ts', 'max step', 'time serie id'}
these are the keys of the dictionary options that can be used when calling env.reset(…, options={})
- property action_space: ActionSpace
this represents a view on the action space
- attach_layout(grid_layout)[source]
Compared to the method of the base class, this one performs a check. This method must be called after initialization.
- Parameters:
grid_layout (
dict
) – The layout of the grid (i.e the coordinates (x,y) of all substations). The keys should be the substation names, and the values a tuple (with two float) representing the coordinate of the substation.
Examples
Here is an example on how to attach a layout for an environment:
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

# assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
env.attach_layout(layout)
- change_forecast_parameters(new_parameters)[source]
Allows to change the parameters of a “forecast environment”, that is, the environment used by the methods grid2op.Observation.BaseObservation.simulate() and grid2op.Observation.BaseObservation.get_forecast_env()
Notes
This only affects the environment AFTER env.reset() has been called.
This only affects the “forecast env” and NOT the env itself.
- Parameters:
new_parameters (
grid2op.Parameters.Parameters
) – The new parameters you want the environment to get.
Examples
This can be used like:
import grid2op

env_name = "l2rpn_case14_sandbox"  # or any other name
env = grid2op.make(env_name)

param = env.parameters
param.NO_OVERFLOW_DISCONNECTION = True  # or any other properties of the environment
env.change_forecast_parameters(param)
# at this point this has no impact.

obs = env.reset()
# now, after the reset, the right parameters are used
sim_obs, sim_reward, sim_done, sim_info = obs.simulate(env.action_space())
# the new parameters `param` are used for this
# and also for
forecasted_env = obs.get_forecast_env()
- change_parameters(new_parameters)[source]
Allows to change the parameters of an environment.
Notes
This only affects the environment AFTER env.reset() has been called.
This only affects the environment and NOT the forecast.
- Parameters:
new_parameters (
grid2op.Parameters.Parameters
) – The new parameters you want the environment to get.
Examples
You can use this function like:
import grid2op
from grid2op.Parameters import Parameters

env_name = "l2rpn_case14_sandbox"  # or any other name
env = grid2op.make(env_name)
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> False

new_param = Parameters()
new_param.A_MEMBER = A_VALUE  # eg new_param.NO_OVERFLOW_DISCONNECTION = True
env.change_parameters(new_param)
obs = env.reset()
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> True
- change_reward(new_reward_func)[source]
Change the reward function used for the environment.
TODO examples !
- Parameters:
new_reward_func – Either an object of class BaseReward, or a subclass of BaseReward: the new reward function to use
Notes
This only affects the environment AFTER env.reset() has been called.
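As the example is still marked TODO above, here is a hedged sketch (assuming you want to switch to the L2RPNReward class shipped with grid2op):
import grid2op
from grid2op.Reward import L2RPNReward

env = grid2op.make("l2rpn_case14_sandbox")
env.change_reward(L2RPNReward)
obs = env.reset()  # the new reward function is used from this reset onwards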
- classes_are_in_files() bool [source]
Whether the classes created when this environment has been made are stored on the hard drive (will return True) or not.
Note
This will become the default behaviour in future grid2op versions.
See Pickle issues for more information.
- close()[source]
close an environment: this will attempt to free as much memory as possible. Note that after an environment is closed, you will not be able to use it anymore.
Any attempt to use a closed environment might result in non deterministic behaviour.
- deactivate_forecast()[source]
This function will have the effect to deactivate the obs.simulate, the forecast will not be updated in the observation space.
This will most likely lead to some performance increase (~10-15% faster) if you don’t use the obs.simulate function.
Notes
If you really don’t want to use the obs.simulate functionality, you should rather disable it at the creation of the environment. For example, if you use the recommended make function, you can pass an argument that will ignore the chronics even when reading it (using GridStateFromFile instead of GridStateFromFileWithForecast for example) this would give something like:
import grid2op
from grid2op.Chronics import GridStateFromFile

# tell grid2op not to read the "forecast"
env = grid2op.make("l2rpn_case14_sandbox",
                   data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the programm stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR
- fast_forward_chronics(nb_timestep)[source]
This method allows you to skip some time step at the beginning of the chronics.
This is useful at the beginning of the training, if you want your agent to learn on more diverse scenarios. Indeed, the data provided in the chronics usually starts always at the same date and time (for example Jan 1st at 00:00). This can lead to suboptimal exploration, as during this phase only a few time steps are managed by the agent, so in general these few time steps will correspond to grid states around Jan 1st at 00:00.
See also
From grid2op version 1.10.3, a similar objective can be obtained directly by calling
grid2op.Environment.Environment.reset()
with “init ts” as option, for example like obs = env.reset(options={“init ts”: 12})Danger
The usage of both
BaseEnv.fast_forward_chronics()
andEnvironment.set_max_iter()
is not recommended at all and might not behave correctly. Please use env.reset with obs = env.reset(options={“max step”: xxx, “init ts”: yyy}) for a correct behaviour.- Parameters:
nb_timestep (
int
) – Number of time step to “fast forward”
Examples
From grid2op version 1.10.3 we recommend not to use this function (which will be deprecated) but to use the grid2op.Environment.Environment.reset() function with the “init ts” option:

import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(options={"init ts": 123})
For the legacy usage, this can be used like this:
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

# skip the first 150 steps of the chronics
env.fast_forward_chronics(150)
done = env.is_done
if not done:
    obs = env.get_obs()
    # do something
else:
    # there was a "game over"
    # you need to reset the env (which will "cancel" the fast_forward)
    pass
    # do something else
Notes
This method can put the environment in a ‘game over’ state (done=True), for example if the chronics last xxx time steps and you ask to “fast forward” more than xxx steps. This is why we advise to check the state of the environment after the call to this method if you use it (see the “Examples” paragraph)
- generate_classes(*, local_dir_id=None, _guard=None, _is_base_env__=True, sys_path=None)[source]
Use with care, but can be incredibly useful !
If you get into trouble like :
AttributeError: Can't get attribute 'ActionSpace_l2rpn_icaps_2021_small' on <module 'grid2op.Space.GridObjects' from /home/user/Documents/grid2op_dev/grid2op/Space/GridObjects.py'>
You might want to call this function and that MIGHT solve your problem.
This function will create a subdirectory into the env directory, that will be accessed when loading the classes used for the environment.
The default behaviour is to build the class on the fly which can cause some issues when using pickle or multiprocessing for example.
Examples
Here is how to best leverage this functionality:
First step, generated the classes once and for all.
Warning
You need to redo this step each time you customize the environment. This customization includes, but is not limited to:
change the backend type: grid2op.make(…, backend=…)
change the action class: grid2op.make(…, action_class=…)
change observation class: grid2op.make(…, observation_class=…)
change the volagecontroler_class
change the grid_path
change the opponent_action_class
etc.
import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other name

env = grid2op.make(env_name, ...)  # again: redo this step each time you customize "..."
# for example if you change the `action_class` or the `backend` etc.

env.generate_classes()
Then, next time you want to use the SAME environment, you can do:
import grid2op
env_name = SAME NAME AS ABOVE
env = grid2op.make(env_name,
                   experimental_read_from_local_dir=True,
                   SAME ENV CUSTOMIZATION AS ABOVE)
And it should (this is experimental for now, and we expect feedback on the matter) solve the issues involving pickle.
Again, if you customize your environment (see above for more information) you’ll have to redo this step !
- get_current_line_status()[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
prefer using
grid2op.Observation.BaseObservation.line_status
This method allows to retrieve the line status.
- get_obs(_update_state=True, _do_copy=True)[source]
Return the observations of the current environment made by the
grid2op.Agent.BaseAgent
.Note
This function is called twice when the env is reset, otherwise once per step
- _do_copy :
Whether or not to make a copy of the returned observation. By default it will do one. Be aware that this might cause trouble if used incorrectly.
- Returns:
res – The current observation usually given to the
grid2op.Agent.BaseAgent
/ bot / controler.- Return type:
Examples
This function can be use at any moment, even if the actual observation is not present.
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

# have a big piece of code
obs2 = env.get_obs()

# obs2 and obs are identical.
- get_path_env()[source]
Get the path that allows to create this environment.
It can be used for example in
grid2op.utils.EpisodeStatistics()
to save the information directly inside the environment data.
- get_reward_instance()[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Returns the instance of the object that is used to compute the reward.
- get_thermal_limit()[source]
Get the current thermal limit in amps registered for the environment.
Examples
It can be used like this:
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")
thermal_limits = env.get_thermal_limit()
- load_alarm_data()[source]
Internal
Warning
/!\ Only valid with “l2rpn_icaps_2021” environment /!\
Notes
This is called when the environment class is not created, so i need to read the data of the grid from the backend.
I cannot use “self.name_line” for example.
This function update the backend INSTANCE. The backend class is then updated in the
BaseEnv._init_backend()
function with a call to self.backend.assert_grid_correct()
- load_alert_data()[source]
Internal
Notes
This is called to get the alertable lines when the warning is raised “by line”
- property observation_space: ObservationSpace
this represents a view on the observation space
- property parameters
Return a deepcopy of the parameters used by the environment
It is a deepcopy, so modifying it will have absolutely no effect on the environment.
If you want to change the parameters of an environment, please use either grid2op.Environment.BaseEnv.change_parameters() to change the parameters of this environment, or grid2op.Environment.BaseEnv.change_forecast_parameters() to change the parameters of the environment used by grid2op.Observation.BaseObservation.simulate() or grid2op.Observation.BaseObservation.get_forecast_env()
Danger
To modify the environment parameters you need to do:
params = env.parameters
params.WHATEVER = NEW_VALUE
env.change_parameters(params)
env.reset()
If you simply do:
env.params.WHATEVER = NEW_VALUE  # no effect !
This will have absolutely no impact.
- reactivate_forecast()[source]
This function will have the effect to reactivate the obs.simulate, the forecast will be updated in the observation space.
This will most likely lead to some performance decrease but you will be able to use obs.simulate function.
Warning
Forecast are deactivated by default (and cannot be reactivated) if the backend cannot be copied.
Warning
You need to call ‘env.reset()’ for this function to work properly. It is NOT recommended to reactivate forecasts in the middle of an episode.
Notes
You can use this function as followed:
import grid2op
from grid2op.Chronics import GridStateFromFile

# tell grid2op not to read the "forecast"
env = grid2op.make("l2rpn_case14_sandbox",
                   data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the programm stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR

env.reactivate_forecast()
obs = env.reset()  # you need to reset the env for this function to have any effects
obs, reward, done, info = env.step(do_nothing_action)

# and now forecast are available again
simobs, sim_r, sim_d, sim_info = obs.simulate(do_nothing_action)
- reset(*, seed: int | None = None, options: Dict[Literal['time serie id'], int] | Dict[Literal['init state'], Dict[Literal['set_line_status', 'change_line_status', 'set_bus', 'change_bus', 'redispatch', 'set_storage', 'curtail', 'raise_alarm', 'raise_alert', 'injection', 'hazards', 'maintenance', 'shunt'], Any]] | Dict[Literal['init ts'], int] | Dict[Literal['max step'], int] | None = None)[source]
Reset the base environment (set the appropriate variables to correct initialization). It is (and must be) overloaded in the other classes of the grid2op.Environment module.
- seed(seed=None, _seed_me=True)[source]
Set the seed of this Environment for a better control and to ease reproducible experiments.
See also
The function Environment.reset() for extra information.
Changed in version 1.9.8: Starting from version 1.9.8 you can directly set the seed when calling reset.
Warning
It is preferable to call this function just before a call to env.reset() otherwise the seeding might not work properly (especially if some non standard “time serie generators” aka chronics are used)
- Parameters:
seed (int) – The seed to set.
_seed_me (bool) – Whether to seed this instance or just the other things. Used internally only.
- Returns:
seed (tuple) – The seed used to set the prng (pseudo random number generator) for the environment
seed_chron (tuple) – The seed used to set the prng for the chronics_handler (if any), otherwise None
seed_obs (tuple) – The seed used to set the prng for the observation space (if any), otherwise None
seed_action_space (tuple) – The seed used to set the prng for the action space (if any), otherwise None
seed_env_modif (tuple) – The seed used to set the prng for the modification of the environment (if any, otherwise None)
seed_volt_cont (tuple) – The seed used to set the prng for the voltage controler (if any, otherwise None)
seed_opponent (tuple) – The seed used to set the prng for the opponent (if any, otherwise None)
Examples
Seeding an environment should be done with:
import grid2op
env = grid2op.make("l2rpn_case14_sandbox")
env.seed(0)
obs = env.reset()
As long as the environment instance (variable env in the above code) is not reset the env.seed has no real effect (but can have side effect).
For a full control on the seed mechanism it is more than advised to reset it after it has been seeded.
- set_thermal_limit(thermal_limit)[source]
Set the thermal limit effectively.
- Parameters:
thermal_limit (
numpy.ndarray
) –The new thermal limit. It must be a numpy ndarray vector (or convertible to it). For each powerline it gives the new thermal limit.
Alternatively, this can be a dictionary mapping the line names (keys) to its thermal limits (values). In that case, all thermal limits for all powerlines should be specified (this is a safety measure to reduce the odds of misuse).
Examples
This function can be used like this:
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox", test=True)

# i set the thermal limit of each powerline to 20000 amps
env.set_thermal_limit([20000 for _ in range(env.n_line)])
Notes
As of grid2op > 1.5.0, it is possible to set the thermal limit by using a dictionary with the keys being the name of the powerline and the values the thermal limits.
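For instance, a small sketch of this dictionary variant (here every line gets the same limit; adapt the values to your needs):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox", test=True)
# map each powerline name to its new thermal limit (in amps)
new_limits = {line_name: 20000. for line_name in env.name_line}
env.set_thermal_limit(new_limits)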
- step(action: BaseAction) Tuple[BaseObservation, float, bool, Dict[Literal['disc_lines', 'is_illegal', 'is_ambiguous', 'is_dispatching_illegal', 'is_illegal_reco', 'reason_alarm_illegal', 'reason_alert_illegal', 'opponent_attack_line', 'opponent_attack_sub', 'exception', 'detailed_infos_for_cascading_failures', 'rewards', 'time_series_id'], Any]] [source]
Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, done, info).
If the grid2op.BaseAction.BaseAction is illegal or ambiguous, the step is performed, but the action is replaced with a “do nothing” action.
- Parameters:
action (grid2op.Action.Action) – an action provided by the agent that is applied on the underlying grid through the backend.
- Returns:
observation (grid2op.Observation.Observation) – agent’s observation of the current environment
reward (float) – amount of reward returned after previous action
done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning). It is a dictionary with keys:
”disc_lines”: a numpy array (or None) saying, for each powerline, if it has been disconnected due to overflow (if not disconnected it will be -1, otherwise it will be a positive integer: 0 meaning that it is one of the causes of the cascading failure, 1 meaning that it is disconnected just after, 2 that it is disconnected just after, etc.)
”is_illegal” (bool) whether the action given as input was illegal
”is_ambiguous” (bool) whether the action given as input was ambiguous.
”is_dispatching_illegal” (bool) was the action illegal due to redispatching
”is_illegal_reco” (bool) was the action illegal due to a powerline reconnection
”reason_alarm_illegal” (None or Exception) reason for which the alarm is illegal (it’s None if no alarm is raised or if the alarm feature is not used)
”reason_alert_illegal” (None or Exception) reason for which the alert is illegal (it’s None if no alert is raised or if the alert feature is not used)
”opponent_attack_line” (np.ndarray, bool) for each powerline, say if the opponent attacked it (True) or not (False).
”opponent_attack_sub” (np.ndarray, bool) for each substation, say if the opponent attacked it (True) or not (False).
”opponent_attack_duration” (int) the duration of the current attack (if any)
”exception” (list of Exceptions.Exceptions.Grid2OpException if an exception was raised, or [] if everything was fine.)
”detailed_infos_for_cascading_failures” (optional, only if the backend has been created with detailed_infos_for_cascading_failures=True) the list of the intermediate steps computed during the simulation of the “cascading failures”.
”rewards”: dictionary of all “other_rewards” provided when the env was built.
”time_series_id”: id of the time series used (if any, similar to a call to env.chronics_handler.get_id())
Examples
This is used like:
import grid2op
from grid2op.Agent import RandomAgent

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")

# define an agent here, this is an example
agent = RandomAgent(env.action_space)

# environment need to be "reset" before usage:
obs = env.reset()
reward = env.reward_range[0]
done = False

# now run through each steps like this
while not done:
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)
Notes
If the flag done=True is raised (ie this is the end of the episode) then the observation is NOT properly updated and should not be used at all.
Actually, it will be in a “game over” state (see
grid2op.Observation.BaseObservation.set_game_over
).
- class grid2op.Environment.BaseMultiProcessEnvironment(envs, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It uses the python “multiprocessing” framework to work, and thus is suitable only on a single machine with multiple cores (cpu / thread). We do not recommend to use this method on a cluster of different machines.
This class uses the following representation:
a grid2op.BaseAgent.BaseAgent lives in a main process
different environments live in different processes
a call to MultiEnv.step() will perform one step per environment, in parallel, using a Pipe to transfer data to and from the main process from each individual environment process. It is a synchronous function. It means it will wait for every environment to finish the step before returning all the information.
There are some limitations. For example, even if forecasts are available, it’s not possible to use forecasts of the observations. This implies that grid2op.Observation.BaseObservation.simulate() is not available when using MultiEnvironment.
Compared to regular Environments, MultiEnvironment simply stacks everything. You need to send not a single grid2op.Action.BaseAction but as many actions as there are underlying environments. You receive not one single grid2op.Observation.BaseObservation but as many observations as the number of underlying environments.
A broader support of regular grid2op environment capabilities, as well as support for the grid2op.Observation.BaseObservation.simulate() call, might be added in the future.
call might be added in the future.NB As opposed to
Environment.step()
a call toBaseMultiProcessEnvironment.step()
or any of its derived class (SingleEnvMultiProcess
orMultiEnvMultiProcess
) if a sub environment is “done” then it is automatically reset. This means entails that you can callBaseMultiProcessEnvironment.step()
without worrying about having to reset.- envs
A list of environments for which the evaluation will be made in parallel.
- Type:
list::grid2op.Environment.Environment
- nb_env
Number of parallel underlying environments that will be handled. It is also the size of the list of actions that needs to be provided in MultiEnvironment.step() and the return size of the list returned by this same function.
- Type:
int
- obs_as_class
Whether to convert the observations back to
grid2op.Observation
objects or to leave them as numpy arrays. The default (obs_as_class=True) is to send them as observation objects, but it is slower.- Type:
bool
- return_info
Whether to return the information dictionary or not (might speed up computation)
- Type:
bool
Methods:
close
()Close all the environments and all the processes.
Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments
get_obs
()implement the get_obs function that is "broken" if you use the __getattr__
Get the parameters of each sub environments
Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments
Get the seeds used to initialize each sub environments.
Get the computation time (corresponding to sub_env._time_step) of each sub environments
reset
()Reset all the environments, and return all the associated observation.
set_chunk_size
(new_chunk_size)Dynamically adapt the amount of data read from the hard drive.
set_ff
([ff_max])This method is primarily used for training.
set_filter
(filter_funs)Set a filter_fun for each of the underlying environment.
set_id
(id_)Set a chronics id for each of the underlying environment to be used for each of the sub_env.
simulate
(actions)Perform the equivalent of obs.simulate in all the underlying environment
step
(actions)Perform a step in all the underlying environments.
- get_comp_time()[source]
Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments
- get_powerflow_time()[source]
Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments
- get_step_time()[source]
Get the computation time (corresponding to sub_env._time_step) of each sub environments
- reset()[source]
Reset all the environments, and return all the associated observation.
NB Except in some specific occasions, there is no need to call this reset function. Indeed, when a sub environment is “done”, it is automatically restarted in the BaseMultiProcessEnvironment.step() function.
- Returns:
res – The list of all observations. This list counts
MultiEnvironment.nb_env
elements, each one being angrid2op.Observation.BaseObservation
.- Return type:
list
- set_chunk_size(new_chunk_size)[source]
Dynamically adapt the amount of data read from the hard drive. Useful to set it to a low integer value (e.g. 10 or 100) at the beginning of the learning process, when the agent fails pretty quickly.
This takes effect only after a reset has been performed.
- Parameters:
new_chunk_size (
int
) – The new chunk size (positive integer)
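For illustration, a minimal sketch of a possible usage (the chunk size of 100 is arbitrary):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# read the time series 100 steps at a time in every sub environment
multi_env.set_chunk_size(100)
obss = multi_env.reset()  # takes effect only after a reset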
- set_ff(ff_max=2016.0)[source]
This method is primarily used for training.
The problem this method aims at solving is the following: most grid2op environments start on a Monday at 00:00. This method will “fast forward” each environment by a random number of time steps between 0 and
ff_max
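A hedged sketch of a possible usage (2016 steps is roughly one week at 5 minutes per step; whether the fast forward is applied immediately or at the next reset should be checked against the implementation):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# each sub environment starts somewhere at random within the first week
multi_env.set_ff(ff_max=2016.)
obss = multi_env.reset()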
- set_filter(filter_funs)[source]
Set a filter_fun for each of the underlying environment.
See
grid2op.Chronics.MultiFolder.set_filter()
for more information
Examples
TODO usage example
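Until that example is written, here is a hedged sketch of what a usage could look like (it assumes each element of filter_funs is a callable receiving a chronics path and returning True to keep it; check grid2op.Chronics.MultiFolder.set_filter() for the exact signature):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# hypothetical filter: keep only the chronics whose folder name contains "january"
keep_january = lambda path: "january" in str(path)
multi_env.set_filter([keep_january, keep_january])
obss = multi_env.reset()  # the filter is taken into account at reset time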
- set_id(id_)[source]
Set a chronics id for each of the underlying environment to be used for each of the sub_env.
See
grid2op.Environment.Environment.set_id()
for more informationExamples
TODO usage example
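Until that example is written, here is a hedged sketch (it assumes the same id is applied to every sub environment; check Environment.set_id() for the exact semantics):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

multi_env.set_id(0)       # use the time series with id 0 in every sub environment
obss = multi_env.reset()  # takes effect at the next reset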
- simulate(actions)[source]
Perform the equivalent of obs.simulate in all the underlying environment
- Parameters:
actions (
list
) – List of all actions to simulate- Returns:
sim_obs – The observation resulting from the simulation
sim_rews – The reward resulting from the simulation
sim_dones – For each simulation, whether or not the simulated action led to a game over
sim_infos – Additional information for each simulated actions.
Examples
You can use this feature like:
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env_name = "l2rpn_case14_sandbox"  # or any other name
env1 = grid2op.make(env_name)
env2 = grid2op.make(env_name)

multi_env = BaseMultiProcessEnvironment([env1, env2])
obss = multi_env.reset()

# simulate
actions = [env1.action_space(), env2.action_space()]
sim_obss, sim_rs, sim_ds, sim_is = multi_env.simulate(actions)
- step(actions)[source]
Perform a step in all the underlying environments. If one or more of the underlying environments encounters a game over, it is automatically restarted.
The observation sent back to the user is the observation after the
grid2op.Environment.Environment.reset()
has been called. As opposed to
Environment.step
, a call to this function will automatically reset any of the underlying environments in case one of them is “done”. This is performed the following way. In case one underlying environment is over (due to a game over or to the end of the chronics), then: the corresponding “done” is returned as
True
the corresponding observation returned is not the observation of the last time step (corresponding to the underlying environment that is game over) but is the first observation after reset.
At the next call to step, the flag done will be set (if no game over arises) to
False
and the corresponding observation is the next observation of this underlying environment: everything works as usual in this case. We did that because restarting the game over environment (on the user side) added unnecessary complexity.
- Parameters:
actions (
list
) – List of MultiEnvironment.nb_env
grid2op.Action.BaseAction
. Each action will be executed in the corresponding underlying environment.- Returns:
obs (
list
) – List of all the observations returned by each underlying environment.rews (
list
) – List of all the rewards returned by each underlying environment.dones (
list
) – List of all the “done” flags returned by each underlying environment. If one of these values is “True”, it means the corresponding environment encountered a game over.infos (
list
) – List of the corresponding information dictionaries returned by each underlying environment.
Examples
You can use this class as followed:
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")  # create an environment of your choosing
env2 = grid2op.make("l2rpn_case14_sandbox")  # create another environment of your choosing
multi_env = BaseMultiProcessEnvironment([env1, env2])

obss = multi_env.reset()
obs1, obs2 = obss  # here i extract the observation of the first environment and of the second one
# note that you cannot do obs1.simulate().
# this is equivalent to a call to
# obs1 = env1.reset(); obs2 = env2.reset()

# then you can do regular steps
action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])

# if you define
# obs1, obs2 = obss
# r1, r2 = rewards
# done1, done2 = dones
# info1, info2 = infos

# in this case, it is equivalent to calling
# obs1, r1, done1, info1 = env1.step(action_env1)
# obs2, r2, done2, info2 = env2.step(action_env2)
Let us now focus on the “automatic” reset part.
# see above for the creation of a multi_env and the proper imports
multi_env = BaseMultiProcessEnvironment([env1, env2])

action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])

# say dones[0] is ``True``
# in this case if you define
# obs1 = obss[0]
# r1 = rewards[0]
# done1 = dones[0]
# info1 = infos[0]

# in that case it is equivalent to the "single processed" code
# obs1_tmp, r1_tmp, done1_tmp, info1_tmp = env1.step(action_env1)
# done1 = done1_tmp
# r1 = r1_tmp
# info1 = info1_tmp
# obs1_aux = env1.reset()
# obs1 = obs1_aux
# CAREFUL: in this case, obs1 is NOT obs1_tmp but is really obs1_aux,
# the first observation after the automatic reset.
- class grid2op.Environment.Environment(init_env_path: str, init_grid_path: str, chronics_handler, backend, parameters, name='unknown', n_busbar: int | ~typing.List[int] | ~typing.Dict[str, int] = 2, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, voltagecontrolerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, other_rewards={}, thermal_limit_a=None, with_forecast=True, epsilon_poly=0.0001, tol_poly=0.01, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, kwargs_opponent={}, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget={}, has_attention_budget=False, logger=None, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, highres_sim_counter=None, _update_obs_after_reward=True, _init_obs=None, _raw_backend_class=None, _compat_glop_version=None, _read_from_local_dir=None, _is_test=False, _allow_loaded_backend=False, _local_dir_cls=None, _overload_name_multimix=None)[source]
This class is the grid2op implementation of the “Environment” entity in the RL framework.
Danger
Long story short, once a environment is deleted, you cannot use anything it “holds” including, but not limited to the capacity to perform obs.simulate(…) even if the obs is still referenced.
See Notes (first danger block).
- name
The name of the environment
- Type:
str
- action_space
Another name for
Environment.helper_action_player
for gym compatibility.
- observation_space
Another name for
Environment.helper_observation
for gym compatibility.
- reward_range
The range of the reward function
- Type:
(float, float)
- metadata
For gym compatibility, do not use
- Type:
dict
- spec
For Gym compatibility, do not use
- Type:
None
- _viewer
Used to display the powergrid. Currently properly supported.
- Type:
object
Methods:
add_text_logger
([logger])Add a text logger to this
Environment
attach_renderer
([graph_layout])This function will attach a renderer, necessary to use for plotting capabilities.
copy
()Performs a deep copy of the environment
generate_data
([nb_year, nb_core, seed])This function uses the chronix2grid package to generate more data that will then be available locally.
get_kwargs
([with_backend, ...])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Return the maximum duration (in number of steps) of the current episode.
render
([mode])Render the state of the environment on the screen, using matplotlib Also returns the Matplotlib figure
reset
(*[, seed, options])Reset the environment to a clean state.
reset_grid
([init_act_opt, method])INTERNAL
set_chunk_size
(new_chunk_size)For an efficient data pipeline, it can be usefull to not read all part of the input data (for example for load_p, prod_p, load_q, prod_v).
set_id
(id_)Set the id that will be used at the next call to
Environment.reset()
.set_max_iter
(max_iter)Set the maximum duration of an episode for all the next episodes.
simulate
(action)Another method to call obs.simulate to ensure compatibility between multi environment and regular one.
train_val_split
(val_scen_id[, ...])This function is used as
Environment.train_val_split_random()
.train_val_split_random
([pct_val, ...])By default a grid2op environment contains multiple "scenarios" containing values for all the producers and consumers representing multiple days.
- add_text_logger(logger=None)[source]
Add a text logger to this
Environment
Logging is for now an incomplete feature, really incomplete (not used)
- Parameters:
logger – The logger to use
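A minimal sketch of how a logger could be attached (the logger name is arbitrary; as stated above, logging remains an incomplete feature):
import logging
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")

logger = logging.getLogger("grid2op_env")  # any standard python logger works here
logger.setLevel(logging.INFO)
env.add_text_logger(logger)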
- attach_renderer(graph_layout=None)[source]
This function will attach a renderer, necessary to use for plotting capabilities.
- Parameters:
graph_layout (
dict
) – Here for backward compatibility. Currently not used.
If you want to set a specific layout call
BaseEnv.attach_layout()
If
None
this class will use the default substations layout provided when the environment was created. Otherwise it will use the data provided.
Examples
Here is how to use the function
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

if False:
    # if you want to change the default layout of the powergrid
    # assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
    layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
    env.attach_layout(layout)
    # NB again, this code will make everything look super ugly !!!! Don't change the
    # default layout unless you have a reason to.

# and if you want to use the renderer
env.attach_renderer()

# and now you can "render" (plot) the state of the grid
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    env.render()
    action = agent.act(obs, reward, done)  # "agent" is any grid2op agent defined beforehand
    obs, reward, done, info = env.step(action)
- copy() Environment [source]
Performs a deep copy of the environment
Unless you have a reason to, it is not advised to make copy of an Environment.
Examples
It should be used as follow:
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
cpy_of_env = env.copy()
- generate_data(nb_year=1, nb_core=1, seed=None, **kwargs)[source]
This function uses the chronix2grid package to generate more data that will then be available locally. You need to install it independently (see https://github.com/BDonnot/ChroniX2Grid#installation for more information)
It also requires the lightsim2grid simulator.
This is only available for some environments (only the environments released after 2022).
Generating data takes some time (around 1 - 2 minutes to generate a weekly scenario) and this is why we recommend doing it “offline” and then using the generated data for training or evaluation.
Warning
You should not start this function twice. Before starting a new run, make sure the previous one has terminated (otherwise you might erase some previously generated scenario)
Examples
The recommended process when you want to use this function is to first generate some more data:
import grid2op env = grid2op.make("l2rpn_wcci_2022") env.generate_data(nb_year=XXX) # replace XXX by the amount of data you want. If you put 1 you will have 52 different # scenarios
Then, later on, you can use it as you please, transparently:
import grid2op env = grid2op.make("l2rpn_wcci_2022") obs = env.reset() # obs might come from the data you have generated
- Parameters:
nb_year (int, optional) – the number of “year” you want to generate. Each “year” is made of 52 weeks meaning that if you ask to generate one year, you have 52 more scenarios, by default 1
nb_core (int, optional) – number of computer cores to use, by default 1.
seed (int, optional) – If the same seed is given, then the same data will be generated.
**kwargs – key word arguments passed to add_data function of chronix2grid.grid2op_utils module
- get_kwargs(with_backend=True, with_chronics_handler=True, with_backend_kwargs=False)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- max_episode_duration()[source]
Return the maximum duration (in number of steps) of the current episode.
Notes
For possibly infinite episode, the duration is returned as np.iinfo(np.int32).max which corresponds to the maximum 32 bit integer (usually 2147483647)
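For illustration:
import numpy as np
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

n_steps = env.max_episode_duration()
if n_steps == np.iinfo(np.int32).max:
    print("this episode can (in theory) last forever")
else:
    print(f"the current episode lasts at most {n_steps} steps")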
- render(mode='rgb_array')[source]
Render the state of the environment on the screen, using matplotlib Also returns the Matplotlib figure
Examples
Rendering need first to define a “renderer” which can be done with the following code:
import grid2op # create the environment env = grid2op.make("l2rpn_case14_sandbox") # if you want to use the renderer env.attach_renderer() # and now you can "render" (plot) the state of the grid obs = env.reset() done = False reward = env.reward_range[0] while not done: env.render() # this piece of code plot the grid action = agent.act(obs, reward, done) obs, reward, done, info = env.step(action)
- reset(*, seed: int | None = None, options: Dict[Literal['time serie id'], int] | Dict[Literal['init state'], Dict[Literal['set_line_status', 'change_line_status', 'set_bus', 'change_bus', 'redispatch', 'set_storage', 'curtail', 'raise_alarm', 'raise_alert', 'injection', 'hazards', 'maintenance', 'shunt'], Any]] | Dict[Literal['init ts'], int] | Dict[Literal['max step'], int] | None = None) BaseObservation [source]
Reset the environment to a clean state. It will reload the next chronics if any. And reset the grid to a clean state.
This triggers a full reloading of both the chronics (if they are stored as files) and of the powergrid, to ensure the episode is fully over.
This method should be called only at the end of an episode.
- Parameters:
seed (int) – The seed to use (new in version 1.9.8), see examples for more details. Ignored if not set (meaning no seeds will be used, experiments might not be reproducible)
options (dict) –
Some options to “customize” the reset call. For example specifying the “time serie id” (grid2op >= 1.9.8) to use or the “initial state of the grid” (grid2op >= 1.10.2) or to start the episode at some specific time in the time series (grid2op >= 1.10.3) with the “init ts” key.
See examples for more information about this. Ignored if not set.
Examples
The standard “gym loop” can be done with the following code:
import grid2op

# create the environment
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)

# start a new episode
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    action = agent.act(obs, reward, done)  # "agent" is any grid2op agent defined beforehand
    obs, reward, done, info = env.step(action)
New in version 1.9.8: It is now possible to set the seed and the time series you want to use at the new episode by calling env.reset(seed=…, options={“time serie id”: …})
Before version 1.9.8, if you wanted to use a fixed seed, you would need to (see doc of
grid2op.Environment.BaseEnv.seed()
seed = ...
env.seed(seed)
obs = env.reset()
...
Starting from version 1.9.8 you can do this in one call:
seed = ...
obs = env.reset(seed=seed)
For the “time series id” it is the same concept. Before you would need to do (see doc of
Environment.set_id()
for more information ):time_serie_id = ... env.set_id(time_serie_id) obs = env.reset() ...
And now (from version 1.9.8) you can more simply do:
time_serie_id = ...
obs = env.reset(options={"time serie id": time_serie_id})
...
New in version 1.10.2.
Another feature has been added in version 1.10.2, which is the possibility to set the grid to a given “topological” state at the first observation (before this version, you could only retrieve an observation with everything connected together).
In grid2op 1.10.2, you can do that by using the key “init state” in the “options” kwargs of the reset function. The value associated to this key should be a dictionary that can be converted to a non ambiguous grid2op action using an “action space”.
Note
The “action space” used here is not the action space of the agent. It’s an “action space” that uses a
grid2op.Action.Action.BaseAction()
class, meaning you can do any type of action, on shunts, on topology, on line status etc. even if the agent is not allowed to. Likewise, nothing checks whether this action is legal or not.
You can use it like this:
# to start an episode with a line disconnected, you can do:
init_state_dict = {"set_line_status": [(0, -1)]}
obs = env.reset(options={"init state": init_state_dict})
obs.line_status[0] is False

# to start an episode with a different topology
init_state_dict = {"set_bus": {"lines_or_id": [(0, 2)], "lines_ex_id": [(3, 2)]}}
obs = env.reset(options={"init state": init_state_dict})
Note
Since grid2op version 1.10.2, there is also the possibility to set the “initial state” of the grid directly in the time series. The priority is always given to the argument passed in the “options” value.
Concretely, if the “time series” (formerly called “chronics”) provides an action that would change the topology of substations 1 and 2 (for example) and you provide an action that disconnects line 6, then the initial state will see substations 1 and 2 changed (as in the time series) and line 6 disconnected.
Another example in this case: if the action you provide would change the topology of substations 2 and 4, then the initial state (after env.reset) will give:
substation 1 as in the time serie
substation 2 as in “options”
substation 4 as in “options”
Note
Concerning the previously described behaviour, if you want to ignore the data in the time series, you can add : “method”: “ignore” in the dictionary describing the action. In this case the action in the time series will be totally ignored and the initial state will be fully set by the action passed in the “options” dict.
An example is:
init_state_dict = {"set_line_status": [(0, -1)], "method": "force"} obs = env.reset(options={"init state": init_state_dict}) obs.line_status[0] is False
New in version 1.10.3.
Another feature has been added in version 1.10.3: the possibility to skip some steps of the time series and start at a given step.
The time series often start on a given day of the week (e.g. Monday) and at a given time (e.g. midnight). But for some reason you might notice that your agent performs poorly on other days of the week or times of the day. This might be because it has seen much more data from Monday at midnight than from any other day and hour of the day.
To alleviate this issue, you can now easily reset an episode and ask grid2op to start this episode after xxx steps have “passed”.
Concretely, you can do it with:
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset(options={"init ts": 1})
Doing that your agent will start its episode not at midnight (which is the case for this environment), but at 00:05
If you do:
obs = env.reset(options={"init ts": 12})
In this case, you start the episode at 01:00 and not at midnight (you start at what would have been the 12th steps)
If you want to start the “next day”, you can do:
obs = env.reset(options={"init ts": 288})
etc.
Note
On this feature, if a powerline is on soft overflow (meaning its flow is above the limit but below the
grid2op.Parameters.Parameters.HARD_OVERFLOW_THRESHOLD
* the limit) then it is still connected (of course) and the counter grid2op.Observation.BaseObservation.timestep_overflow
is at 0. If a powerline is on “hard overflow” (meaning its flow would be above
grid2op.Parameters.Parameters.HARD_OVERFLOW_THRESHOLD
* the limit), then, as it is the case for a “normal” (without options) reset, this line is disconnected, but can be reconnected directly (grid2op.Observation.BaseObservation.time_before_cooldown_line
== 0)
See also
The function
Environment.fast_forward_chronics()
for an alternative usage (that will be deprecated at some point)
Yet another feature has been added in grid2op version 1.10.3 in this env.reset function. It is the capacity to limit the duration of an episode.
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset(options={"max step": 288})
This will limit the duration to 288 steps (1 day), meaning your agent will have successfully managed the entire episode if it manages to keep the grid in a safe state for a whole day (depending on the environment you are using the default duration is either one week - roughly 2016 steps or 4 weeks)
Note
This option only affects the current episode. It will have no impact on the next episode (after reset).
For example:
obs = env.reset()
obs.max_step == 8064  # default for this environment

obs = env.reset(options={"max step": 288})
obs.max_step == 288  # specified by the option

obs = env.reset()
obs.max_step == 8064  # retrieve the default behaviour
See also
The function
Environment.set_max_iter()
for an alternative usage, with the difference that set_max_iter is permanent: it impacts all future episodes and not only the next one.
- reset_grid(init_act_opt: BaseAction | None = None, method: Literal['combine', 'ignore'] = 'combine')[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This is automatically called when using env.reset
Reset the backend to a clean state by reloading the powergrid from the hard drive. This might take some time.
If the thermal limits have been modified, it also sets them in the new backend.
- set_chunk_size(new_chunk_size)[source]
For an efficient data pipeline, it can be useful not to read all parts of the input data (for example for load_p, prod_p, load_q, prod_v). Grid2Op supports reading large chronics by “chunks” of a given size.
Reading data in chunks can also reduce the memory footprint, which is useful in the case of a multiprocessing environment with large chronics.
It is critical to set a small chunk_size when training a machine learning algorithm (reinforcement learning agent): at the beginning, when the agent performs poorly, the software might otherwise spend most of its time loading the data.
NB this has no effect if the chronics does not support this feature.
NB The environment need to be reset for this to take effect (it won’t affect the chronics already loaded)
- Parameters:
new_chunk_size (
int
orNone
) – The new chunk size (positive integer)
Examples
Here is an example on how to use this function
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox", test=True)
env.set_chunk_size(100)
env.reset()  # otherwise chunk size has no effect !

# and now data will be read from the hard drive 100 time steps per 100 time steps
# instead of the whole episode at once.
- set_id(id_: int | str) None [source]
Set the id that will be used at the next call to
Environment.reset()
.NB this has no effect if the chronics does not support this feature.
NB The environment need to be reset for this to take effect.
Changed in version 1.6.4: id_ can now be a string instead of an integer. You can call something like env.set_id(“0000”) or env.set_id(“Scenario_april_000”) or env.set_id(“2050-01-03_0”) (depending on your environment) to use the right time series.
See also
function
Environment.reset()
for extra informationChanged in version 1.9.8: Starting from version 1.9.8 you can directly set the time serie id when calling reset.
Warning
If the “time serie generator” you use is not standard (e.g. it is random in some sense) and if you want fully reproducible results, you should first call env.set_id(…) and then call env.seed(…) (and of course env.reset())
Calling env.seed(…) and then env.set_id(…) might not behave the way you want.
In this case, it is much better to use the function reset(seed=…, options={“time serie id”: …}) directly.
- Parameters:
id (
int
) – the id of the chronics used.
Examples
Here an example that will loop 10 times through the same chronics (always using the same injection then):
import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("l2rpn_case14_sandbox")  # create an environment
agent = DoNothingAgent(env.action_space)  # create a BaseAgent

for i in range(10):
    env.set_id(0)  # tell the environment you simply want to use the chronics with ID 0
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
And here you have an example on how you can loop through the scenarios in a given order:
import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("l2rpn_case14_sandbox")  # create an environment
agent = DoNothingAgent(env.action_space)  # create a BaseAgent
scenario_order = [1, 2, 3, 4, 5, 10, 8, 6, 5, 7, 78, 8]

for id_ in scenario_order:
    env.set_id(id_)  # tell the environment you want to use the chronics with this ID
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
- set_max_iter(max_iter)[source]
Set the maximum duration of an episode for all the next episodes.
See also
The option max step when calling the
Environment.reset()
function used like obs = env.reset(options={“max step”: 288}) (see examples of env.reset for more information)
Note
The real maximum duration of an episode depends on this parameter but also on the size of the time series used. For example, if you use an environment with time series lasting 8064 steps and you call env.set_max_iter(9000), the maximum number of iterations will still be 8064.
Warning
It only has an impact on future episodes. Said differently, it only has an impact AFTER env.reset has been called.
Danger
The usage of both
BaseEnv.fast_forward_chronics()
andEnvironment.set_max_iter()
is not recommended at all and might not behave correctly. Please use env.reset with obs = env.reset(options={“max step”: xxx, “init ts”: yyy}) for a correct behaviour.- Parameters:
max_iter (
int
) – The maximum number of iterations you can do before reaching the end of the episode. Set it to “-1” for possibly infinite episode duration.
Examples
It can be used like this:
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset() obs.max_step == 8064 # default for this environment env.set_max_iter(288) # no impact here obs = env.reset() obs.max_step == 288 # the limitation still applies to the next episode obs = env.reset() obs.max_step == 288
If you want to “unset” your limitation, you can do:
env.set_max_iter(-1)
obs = env.reset()
obs.max_step == 8064
Finally, you cannot limit it to something larger than the duration of the time series of the environment:
env.set_max_iter(9000)
obs = env.reset()
obs.max_step == 8064  # the call to env.set_max_iter has no impact here
Notes
Maximum length of the episode can depend on the chronics used. See
Environment.chronics_handler
for more information
- simulate(action)[source]
Another method to call obs.simulate to ensure compatibility between multi environment and regular one.
- Parameters:
action – A grid2op action
- Returns:
Same return type as
grid2op.Environment.BaseEnv.step()
or
Notes
Prefer using obs.simulate if possible, it will be faster than this function.
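A minimal sketch of a call (the return mirrors env.step, as stated above):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

do_nothing = env.action_space()  # the "do nothing" action
# same return as env.step(), but the real environment state is not affected
sim_obs, sim_reward, sim_done, sim_info = env.simulate(do_nothing)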
- train_val_split(val_scen_id, add_for_train='train', add_for_val='val', add_for_test=None, test_scen_id=None, remove_from_name=None, deep_copy=False)[source]
This function is used as
Environment.train_val_split_random()
Please refer to the help of
Environment.train_val_split_random()
for more information about this function.- Parameters:
val_scen_id (
list
) – List of the scenario names that will be placed in the validation settest_scen_id (
list
) –New in version 2.6.5.
List of the scenario names that will be placed in the test set (only used if add_for_test is not None - and mandatory in this case)
add_for_train (
str
) – SeeEnvironment.train_val_split_random()
for more informationadd_for_val (
str
) – SeeEnvironment.train_val_split_random()
for more informationadd_for_test (
str
) –New in version 2.6.5.
See
Environment.train_val_split_random()
for more informationremove_from_name (
str
) – SeeEnvironment.train_val_split_random()
for more informationdeep_copy (
bool
) –New in version 2.6.5.
See
Environment.train_val_split_random()
for more information
- Returns:
nm_train (
str
) – SeeEnvironment.train_val_split_random()
for more informationnm_val (
str
) – SeeEnvironment.train_val_split_random()
for more informationnm_test (
str
, optionnal) – .. versionadded:: 2.6.5See
Environment.train_val_split_random()
for more information
Examples
A full example on a training / validation / test split with explicit specification of which chronics goes in which scenarios is:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_train, nm_env_val, nm_env_test = env.train_val_split(test_scen_id=chron_names[-10:],  # last 10 in test set
                                                            add_for_test="test",
                                                            val_scen_id=chron_names[-20:-10],  # last 20 to last 10 in val set
                                                            )

env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")
For a more simple example, with less parametrization and with random assignment (recommended), please refer to the help of
Environment.train_val_split_random()
NB read the “Notes” of this section for possible “unexpected” behaviour of the code snippet above.
On Some windows based platform, if you don’t have an admin account nor a “developer” account (see https://docs.python.org/3/library/os.html#os.symlink) you might need to do:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_train, nm_env_val, nm_env_test = env.train_val_split(test_scen_id=chron_names[-10:],  # last 10 in test set
                                                            add_for_test="test",
                                                            val_scen_id=chron_names[-20:-10],  # last 20 to last 10 in val set
                                                            deep_copy=True)
Warning
The above code will use much more memory on your hard drive than the version using symbolic links. It will also be significantly slower !
As an “historical curiosity”, this is what you needed to do in grid2op version < 1.6.5:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_trainval, nm_env_test = env.train_val_split(val_scen_id=chron_names[-10:],
                                                   add_for_val="test",
                                                   add_for_train="trainval")

# now splitting again the training set into training and validation, keeping the last 10 chronics
# of this environment for validation
env_trainval = grid2op.make(nm_env_trainval)  # create the "trainval" environment
full_path_data = env_trainval.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]
nm_env_train, nm_env_val = env_trainval.train_val_split(val_scen_id=chron_names[-10:],
                                                        remove_from_name="_trainval$")

# and now you can use the following code to load the environments:
env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")
Notes
We don’t recommend you to use this function. It provides a great level of control on which scenarios goes into which dataset, which is nice, but “with great power comes great responsibilities”.
Keep in mind that scenarios might be “sorted” by having some “month” in their names. For example, the first k scenarios might be called “April_XXX” and the last k ones having names with “September_XXX”.
In general, we would not consider good practice to have all validation (or test) scenarios coming from the same months. Keep that in mind if you use the code snippet above.
- train_val_split_random(pct_val=10.0, add_for_train='train', add_for_val='val', add_for_test=None, pct_test=None, remove_from_name=None, deep_copy=False)[source]
By default a grid2op environment contains multiple “scenarios” containing values for all the producers and consumers representing multiple days. In a “game like” environment, you can think of the scenarios as being different “game levels”: different mazes in pacman, different levels in mario etc.
We recommend to train your agent on some of these “chronics” (aka levels) and test the performance of your agent on some others, to avoid overfitting.
This function allows to easily split an environment into different parts. This is most commonly used in machine learning where part of a dataset is used for training and another part is used for assessing the performance of the trained model.
This function relies on “symbolic links” and will not duplicate data.
New created environments will behave like regular grid2op environment and will be accessible with “make” just like any others (see the examples section for more information).
This function will make the split at random. If you want more control over which scenarios to use for training and which for validation, use the
Environment.train_val_split()
that allows to specify which scenarios goes in the validation environment (and the others go in the training environment).- Parameters:
pct_val (
float
) – Percentage of chronics that will go to the validation set. For 10% of the chronics, set it to 10. and NOT to 0.1.add_for_train (
str
) – Suffix that will be added to the name of the environment for the training set. We don’t recommend to modify the default value (“train”)add_for_val (
str
) – Suffix that will be added to the name of the environment for the validation set. We don’t recommend to modify the default value (“val”)add_for_test (
str
, (optional)) –New in version 2.6.5.
Suffix that will be added to the name of the environment for the test set. By default, it only splits into training and validation, so this is ignored. We recommend to assign it to “test” if you want to split into training / validation and test. If it is set, then the pct_test must also be set.
pct_test (
float
, (optional)) –New in version 2.6.5.
Percentage of chronics that will go to the test set. For 10% of the chronics, set it to 10. and NOT to 0.1. (If you set it, you need to set the add_for_test argument.)
remove_from_name (
str
) – If you “split” an environment multiple times, this allows you to keep “short” names (for example you will be able to call grid2op.make(env_name+”_train”) instead of grid2op.make(env_name+”_train_train”))deep_copy (
bool
) –New in version 2.6.5.
A flag specifying whether to “copy” the elements of the original environment to the created ones. By default (deep_copy=False) it will save as much memory as possible using symbolic links rather than performing copies.
Note
If set to
True
the new environment will take much more space on the hard drive, and the execution of this function will be much slower !Warning
On windows based systems, you will most likely run into issues if you don’t set this parameter. Indeed, Windows does not always support symbolic links (https://docs.python.org/3/library/os.html#os.symlink). In this case, you can use
deep_copy=True
and it will work fine (examples in the function Environment.train_val_split()
)
- Returns:
nm_train (
str
) – Complete name of the “training” environmentnm_val (
str
) – Complete name of the “validation” environmentnm_test (
str
, optionnal) – .. versionadded:: 2.6.5Complete name of the “test” environment. It is only returned if add_for_test and pct_test are not None.
Examples
This function can be used like:
import grid2op env_name = "l2rpn_case14_sandbox" # or any other... env = grid2op.make(env_name) # extract 1% of the "chronics" to be used in the validation environment. The other 99% will # be used for test nm_env_train, nm_env_val = env.train_val_split_random(pct_val=1.) # and now you can use the training set only to train your agent: print(f"The name of the training environment is \"{nm_env_train}\"") print(f"The name of the validation environment is \"{nm_env_val}\"") env_train = grid2op.make(nm_env_train)
And even after you close the python session, you can still use this environment for training. If you used the exact code above that will look like:
import grid2op

env_name_train = "l2rpn_case14_sandbox_train"  # depending on the option you passed above
env_train = grid2op.make(env_name_train)
New in version 2.6.5: Possibility to create a training, validation AND test set.
If you have grid2op version >= 1.6.5, you can also use the following:
import grid2op env_name = "l2rpn_case14_sandbox" # or any other... env = grid2op.make(env_name) # extract 1% of the "chronics" to be used in the validation environment. The other 99% will # be used for test nm_env_train, nm_env_val, nm_env_test = env.train_val_split_random(pct_val=1., pct_test=1.) # and now you can use the training set only to train your agent: print(f"The name of the training environment is \"{nm_env_train}\"") print(f"The name of the validation environment is \"{nm_env_val}\"") print(f"The name of the test environment is \"{nm_env_test}\"") env_train = grid2op.make(nm_env_train)
Warning
In this case this function returns 3 elements and not 2 !
Notes
This function will fail if an environment already exists with one of the name that would be given to the training environment or the validation environment (or test environment).
- class grid2op.Environment.MaskedEnvironment(grid2op_env: Environment | dict, lines_of_interest)[source]
This class is the grid2op implementation of a “masked” environment: lines not in the lines_of_interest mask will NOT be deactivated by the environment if the flow is too high (or moderately high for too long).
Warning
This class might not behave normally if used with TimeOutEnvironment, MultiEnv, MultiMixEnv etc.
Warning
At time of writing, the behaviour of “obs.simulate” is not modified
Examples
We recommend you build such an environment with:
import numpy as np
import grid2op
from grid2op.Environment import MaskedEnvironment

env_name = "l2rpn_case14_sandbox"
lines_of_interest = np.array([True, True, True, True, True, True,
                              False, False, False, False, False, False,
                              False, False, False, False, False, False,
                              False, False])
env = MaskedEnvironment(grid2op.make(env_name),
                        lines_of_interest=lines_of_interest)
In particular, make sure to use grid2op.make(…) when creating the MaskedEnvironment and not to use another environment.
Methods:
get_kwargs
([with_backend, with_chronics_handler])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.- get_kwargs(with_backend=True, with_chronics_handler=True)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- class grid2op.Environment.MultiEnvMultiProcess(envs, nb_envs, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It is a kind of
BaseMultiProcessEnvironment
. For more information you can consult the documentation of this parent class. This class allows to interact at the same time with different copies of possibly different environments in parallel- envs
A list of environments for which the evaluation will be made in parallel.
- Type:
list:grid2op.Environment.Environment
- nb_envs
Number of parallel copies of each underlying environment that will be handled. MUST be the same length as the parameter envs. The total number of subprocesses will be the sum of this list.
- Type:
list:int
Examples
This class can be used as:
import grid2op
from grid2op.Environment import MultiEnvMultiProcess

env0 = grid2op.make("l2rpn_case14_sandbox")  # create an environment
env1 = grid2op.make("l2rpn_case14_sandbox")  # create a second environment, that can be similar, or not
# it is recommended to filter or create the environments with different parameters, otherwise this class
# is of little interest

envs = [env0, env1]  # list of all environments created
nb_envs = [1, 7]  # number of "copies" of each environment that will be made.
# in this case the first one will be copied only once, and the second one 7 times.
# the total number of environments used in the multi env will be the sum(nb_envs), here 8.

multi_env = MultiEnvMultiProcess(envs=envs, nb_envs=nb_envs)

# and now you can use it like any other grid2op environment (almost)
observations = multi_env.reset()
- class grid2op.Environment.MultiMixEnvironment(envs_dir, logger=None, experimental_read_from_local_dir=None, n_busbar=2, _add_to_name='', _compat_glop_version=None, _test=False, **kwargs)[source]
This class represents a single powergrid configuration, backed by multiple environment parameters and chronics.
It implements most of the
BaseEnv
public interface: so it can be used as a more classic environment.MultiMixEnvironment environments behave like a superset of the environment: they are made of sub environments (called mixes) that are grid2op regular
Environment
. You might think the MultiMixEnvironment as a dictionary ofEnvironment
that implements some of theBaseEnv
interface such asBaseEnv.step()
orBaseEnv.reset()
.By default, each time you call the “step” function a different mix is used. Mixes, by default are looped through always in the same order. You can see the Examples section for information about control of these
Examples
In this section we present some common use of the MultiMix environment.
Basic Usage
You can think of a MultiMixEnvironment as any
Environment
. So this is a perfectly valid way to use a MultiMix:
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# and now you can do the open ai gym loop
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = multimix_env.reset()  # each time "reset" is called, another mix is used.
    reward = multimix_env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = multimix_env.step(act)
Use each mix one after the other
In case you want to study each mix independently, you can iterate through the MultiMix in a pythonic way. This makes it easy to perform, for example, 10 episode for a given mix before passing to the next one.
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
agent = RandomAgent(multimix_env.action_space)

NB_EPISODE = 10
for mix in multimix_env:
    # mix is a regular environment, you can do whatever you want with it
    # for example
    for i in range(NB_EPISODE):
        obs = mix.reset()
        reward = mix.reward_range[0]
        done = False
        while not done:
            act = agent.act(obs, reward, done)
            obs, reward, done, info = mix.step(act)
Selecting a given Mix
Sometimes it might be interesting to study only a given mix. For that you can use the [] operator to select only a given mix (which is a grid2op environment) and use it as you would.
This can be done with:
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# list all available mixes:
mixes_names = list(multimix_env.keys())

# and now suppose we want to study only the first one
mix = multimix_env[mixes_names[0]]

# and now you can do the open ai gym loop, or anything you want with it
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = mix.reset()
    reward = mix.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = mix.step(act)
Using the Runner
For MultiMixEnvironment using the
grid2op.Runner.Runner
cannot be done in a straightforward manner. Here we give an example on how to do it.
import os
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# you can use the runner as following
PATH = "PATH/WHERE/YOU/WANT/TO/SAVE/THE/RESULTS"
for mix in multimix_env:
    runner = Runner(**mix.get_params_for_runner(), agentClass=RandomAgent)
    runner.run(nb_episode=1, path_save=os.path.join(PATH, mix.name))
Methods:
attach_layout
(grid_layout)INTERNAL
Get the path that allows to create this environment.
seed
([seed])Set the seed of this
Environment
for a better control and to ease reproducible experiments.set_thermal_limit
(thermal_limit)Set the thermal limit effectively.
- attach_layout(grid_layout)[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\ We do not recommend to “attach layout” outside of the environment. Please refer to the function
grid2op.Environment.BaseEnv.attach_layout()
for more information. grid_layout is a dictionary whose keys are the names of the substations and whose values are the tuples of coordinates of each substation. No check is made to ensure it is correct.
- Parameters:
grid_layout (
dict
) – See definition ofGridObjects.grid_layout
for more information.
- get_path_env()[source]
Get the path that allows to create this environment.
It can be used for example in grid2op.utils.underlying_statistics to save the information directly inside the environment data.
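For illustration, a short sketch:
import grid2op

multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
path_to_env_data = multimix_env.get_path_env()
print(path_to_env_data)  # folder from which this environment was created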
- seed(seed=None)[source]
Set the seed of this
Environment
for a better control and to ease reproducible experiments.- Parameters:
seed (
int
) – The seed to set.- Returns:
seeds – The seed used to set the prng (pseudo random number generator) for all environments, and each environment
tuple
seeds- Return type:
list
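A short hedged sketch (the seed value is arbitrary):
import grid2op

multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
seeds = multimix_env.seed(42)  # one entry per underlying mix
obs = multimix_env.reset()     # experiments are reproducible from now on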
- class grid2op.Environment.SingleEnvMultiProcess(env, nb_env, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It is a kind of
BaseMultiProcessEnvironment
. For more information you can consult the documentation of this parent class. It allows to interact at the same time with different copies of the (same) environment in parallel- env
A list of environments for which the evaluation will be made in parallel.
- Type:
list::grid2op.Environment.Environment
- nb_env
Number of parallel underlying environments that will be handled. It is also the size of the list of actions that needs to be provided in
MultiEnvironment.step()
and the size of the lists returned by this same function.- Type:
int
Examples
An example on how you can best leverage this class is given in the getting_started notebooks. Another simple example is:
from grid2op.Agent import DoNothingAgent
from grid2op.MakeEnv import make
from grid2op.Environment import SingleEnvMultiProcess

# create a simple environment
env = make("l2rpn_case14_sandbox")
# number of parallel environments
nb_env = 2  # change that to adapt to your system
NB_STEP = 100  # number of steps for each environment

# create a simple agent
agent = DoNothingAgent(env.action_space)

# create the multi environment class
multi_envs = SingleEnvMultiProcess(env=env, nb_env=nb_env)

# making it usable
obs = multi_envs.reset()
rews = [env.reward_range[0] for i in range(nb_env)]
dones = [False for i in range(nb_env)]

# performs the appropriate steps
for i in range(NB_STEP):
    acts = [None for _ in range(nb_env)]
    for env_act_id in range(nb_env):
        acts[env_act_id] = agent.act(obs[env_act_id], rews[env_act_id], dones[env_act_id])
    obs, rews, dones, infos = multi_envs.step(acts)

    # DO SOMETHING WITH THE AGENT IF YOU WANT

# close the environments
multi_envs.close()

# close the initial environment
env.close()
- class grid2op.Environment.TimedOutEnvironment(grid2op_env: Environment | dict, time_out_ms: int = 1000.0)[source]
This class is the grid2op implementation of a “timed out environment” entity in the RL framework.
This class is very similar to the standard environment. They only differ in the behaviour of the step function.
For more information, see the documentation of
TimedOutEnvironment.step()
Warning
This class might not behave normally if used with MaskedEnvironment, MultiEnv, MultiMixEnv etc.
- name
The name of the environment
- Type:
str
- time_out_ms
maximum duration before performing a do_nothing action and updating to the next time_step.
- Type:
int
- action_space
Another name for
Environment.helper_action_player
for gym compatibility.
- observation_space
Another name for
Environment.helper_observation
for gym compatibility.
- reward_range
The range of the reward function
- Type:
(float, float)
- metadata
For gym compatibility, do not use
- Type:
dict
- spec
For Gym compatibility, do not use
- Type:
None
- _viewer
Used to display the powergrid. Currently properly supported.
- Type:
object
Methods:
get_kwargs
([with_backend, with_chronics_handler])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.reset
(*[, seed, options])Reset the environment.
step
(action)This function allows to pass to the next step for the action.
- get_kwargs(with_backend=True, with_chronics_handler=True)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- reset(*, seed: int | None = None, options: Dict[str | Literal['time serie id'], int | str] | None = None) BaseObservation [source]
Reset the environment.
See also
The doc of
Environment.reset()
for more information- Returns:
The first observation of the new episode.
- Return type:
- step(action: BaseAction) Tuple[BaseObservation, float, bool, dict] [source]
This function allows to pass to the next step for the action.
Provided the action the agent wants to do, it will perform the action on the grid and return the typical “observation, reward, done, info” tuple.
Compared to
BaseEnvironment.step()
this function will emulate the “time that passes” supposing that the duration between each step should be time_out_ms. Indeed, in reality, there are only 5 minutes to take an action between two grid states separated by 5 minutes.
More precisely:
If your agent takes less than time_out_ms to chose its action then this function behaves normally.
If your agent takes between time_out_ms and 2 x time_out_ms to provide an action then a “do nothing” action is performed and then the provided action is performed.
If your agent takes between 2 x time_out_ms and 3 x time_out_ms to provide an action, then 2 “do nothing” actions are performed before your action.
Note
It is possible that the environment “fails” before the action of the agent is implemented on the grid.
- Parameters:
action (grid2op.Action.BaseAction) – The action the agent wish to perform.
- Returns:
_description_
- Return type:
Tuple[BaseObservation, float, bool, dict]
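For illustration, a hedged sketch of the behaviour described above (the 200 ms budget and the sleep duration are arbitrary; the exact number of “do nothing” actions inserted depends on timing):
import time
import grid2op
from grid2op.Environment import TimedOutEnvironment
from grid2op.Agent import DoNothingAgent

env = TimedOutEnvironment(grid2op.make("l2rpn_case14_sandbox"), time_out_ms=200)
agent = DoNothingAgent(env.action_space)

obs = env.reset()
reward, done = env.reward_range[0], False

action = agent.act(obs, reward, done)
time.sleep(0.5)  # pretend the agent needed 500 ms to decide
# roughly two "do nothing" actions are played before `action` is applied here
obs, reward, done, info = env.step(action)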
If you still can’t find what you’re looking for, try in one of the following pages:
Still trouble finding the information ? Do not hesitate to send a github issue about the documentation at this link: Documentation issue template