Environment
This page is organized as follows:
Objectives
This module defines the Environment, the higher level representation of the world with which a grid2op.Agent.BaseAgent will interact.
The environment receives a grid2op.Action.BaseAction from the grid2op.Agent.BaseAgent in Environment.step() and returns a grid2op.Observation.BaseObservation that the grid2op.Agent.BaseAgent will use to perform the next action.
An environment is best used inside a grid2op.Runner.Runner, mainly because runners abstract the interaction between environment and agent and ensure the environment is properly reset after each episode.
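For instance, a minimal sketch of such usage (the environment name, agent class and output folder are only examples to adapt) could look like this:
import grid2op
from grid2op.Agent import RandomAgent
from grid2op.Runner import Runner

env = grid2op.make("l2rpn_case14_sandbox")
# build a runner that will evaluate a RandomAgent on this environment
runner = Runner(**env.get_params_for_runner(), agentClass=RandomAgent)
# run 2 episodes and save the results in "./runner_logs"
res = runner.run(nb_episode=2, path_save="./runner_logs")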
Usage
In this section we present some ways to use the Environment class.
Basic Usage
This example is adapted from the gymnasium documentation (available at gym random_agent.py):
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    obs = env.reset()
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))
What happens here is the following:
obs = env.reset() will reset the environment to be usable again. It will load, by default, the next “chronics” (you can imagine chronics as the graphics of a video game: it tells where the enemies are located, where the walls are, the ground etc. - each chronics can be thought of as a different “game level”).
action = agent.act(obs, reward, done) will choose an action based on the observation obs. This action should be of type grid2op.Action.BaseAction (or one of its derived classes). In the case of a video game, that would be you receiving an observation (usually displayed on the screen) and acting on a controller. For example you could choose to go “left” / “right” / “up” or “down”. Of course in the case of the powergrid the actions are more complicated than that.
obs, reward, done, info = env.step(action) is the call to go to the next step. You can imagine it as the next “frame”. To continue the parallel with video games, at the previous line you asked “pacman” to go left (for example) and then the next frame is displayed (here returned as a new observation obs).
You might want to customize this general behaviour in multiple ways:
you might want to study only one chronics (equivalent to only one level of a video game) see Study always the same time serie
you might want to loop through the chronics, but not always in the same order. If that is the case you might want to consult the section Shuffle the chronics order
you might also have spotted some chronics that have bad properties. In this case, you can “remove” them from the environment (they will be ignored). This is explained in Skipping some chronics
you might also want to select, at random, the next chronic you will use. This allows some compromise between all the above solutions. Instead of ignoring some chronics you can select them less frequently, instead of always using the same one you can sample it more often, and of course, because the sampling is done randomly, it is unlikely that the order will remain the same. To use that you can check the section Sampling the chronics
In a different scenario, you might also want to skip the first time steps of the chronics, which would be equivalent to starting in the “middle” of a video game. If that is the case, the subsection Skipping some time steps is made for you.
Finally, you might have noticed that each call to “env.reset” might take a while. This can dramatically increase the training time, especially at the beginning. This is due to the fact that each time env.reset is called, the whole chronics is read from the hard drive. If you want to lower this impact then you might consult the Optimize the data pipeline page of the doc.
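As a quick illustration (a minimal sketch only; the dedicated page covers more options), you can for example lower the amount of data read from the hard drive at each reset by setting a small “chunk size” on the chronics handler:
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
# read the time series by chunks of 100 steps instead of loading
# a whole scenario in memory at each call to env.reset()
env.chronics_handler.set_chunk_size(100)
obs = env.reset()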
Go to the next scenario
Starting with grid2op 1.9.8, we attempt to offer an easier user experience for the selection of time series, seed, initial state of the grid, etc.
All of the above can be done when calling the env.reset() function.
For customizing the seed, you can for example do:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(seed=0)
For customizing the time series id you want to use:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(options={"time serie id": 1}) # time serie by id (sorted alphabetically)
# or
obs = env.reset(options={"time serie id": "0001"}) # time serie by name (folder name)
For customizing the initial state of the grid, for example forcing the powerline 0 to be disconnected in the initial observation:
import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
init_state_dict = {"set_line_status": [(0, -1)]}
obs = env.reset(options={"init state": init_state_dict})
Feel free to consult the documentation of the Environment.reset() function for more information (this page might be outdated; the docstring of the function should be more up to date with the code).
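Note that, if needed, these customizations should be combinable in a single call (a sketch reusing the options shown above; check the Environment.reset() documentation for the exact behaviour):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
# seed, time serie and initial state customized in a single reset call
obs = env.reset(seed=0,
                options={"time serie id": "0001",
                         "init state": {"set_line_status": [(0, -1)]}})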
Note
In the near future (next few releases) we will also attempt to make the customization of the parameters, the number of steps to skip and the maximum duration of the scenarios available through the env.reset() options.
Time series Customization
Study always the same time serie
If you spotted a particularly interesting chronics, or if you want, for some reason, your agent to see only one chronics, you can do this rather easily with grid2op.
All chronics are given a unique persistent ID (it means that, as long as the data is not modified, the same chronics will always have the same ID each time you load the environment). The environment has a “set_id” method that allows you to use it. Just add “env.set_id(THE_ID_YOU_WANT)” before the call to “env.reset”. This gives the following code:
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

###################################
THE_CHRONIC_ID = 42
###################################

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ###################################
    # with recent grid2op
    obs = env.reset(options={"time serie id": THE_CHRONIC_ID})
    ###################################

    ###################################
    # 'old method (oldest grid2op version)'
    # env.set_id(THE_CHRONIC_ID)
    # obs = env.reset()
    ###################################

    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Shuffle the chronics order
In some other use cases, you might want to go through the whole set of chronics, and then loop again through them, but in a different order (remember that by default it will always loop in the same order 0, 1, 2, 3, …, 0, 1, 2, 3, …, 0, 1, 2, 3, …).
Again, doing so with grid2op is rather easy. To that end you can use the chronics_handler.shuffle function, which does exactly that. You can use it like this:
import numpy as np
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# total number of episode
total_episode = len(env.chronics_handler.subpaths)

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ###################################
    if i % total_episode == 0:
        # I shuffle each time i need to
        env.chronics_handler.shuffle()
    ###################################
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Skipping some chronics
Some chronics might be too hard to start a training (“learn to walk before running”) and conversely some chronics might be too easy after a while (you can basically solve them without doing anything). This is why grid2op allows you to have some control over which chronics will be used by the environment.
For this purpose you can use the chronics_handler.set_filter function. This function takes a “filtering function” as argument. This “filtering function” takes as argument the full path of the chronics and should return True / False depending on whether or not you want to keep it. Here is an example:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

###################################
# this is the only line of code to add
# here i select only the chronics that start by "00"
env.chronics_handler.set_filter(lambda path: re.match(".*00[0-9].*", path) is not None)
kept = env.chronics_handler.reset()  # if you don't do that it will not have any effect
print(kept)  # i print the chronics kept
###################################

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Sampling the chronics
Finally, for even more flexibility, you can choose to sample which chronics will be used next. To achieve that you can call the chronics_handler.sample_next_chronics function. This function takes a vector of probabilities as input (if not provided it assumes all probabilities are equal) and will select an id based on this probability vector.
In the following example we assume that the vector of probabilities is always the same and that we want, for some reason, to oversample the first 10 chronics and undersample the last 10:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

###################################
# total number of episode
total_episode = len(env.chronics_handler.subpaths)
probas = np.ones(total_episode)
# oversample the first 10 episode
probas[:10] *= 5
# undersample the last 10 episode
probas[-10:] /= 5
###################################

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    ###################################
    _ = env.chronics_handler.sample_next_chronics(probas)  # this is added
    ###################################
    obs = env.reset()
    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
NB: here we have a constant vector of probabilities, but you might imagine adapting it during training, for example to oversample the scenarios your agent has trouble solving.
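As an illustration only (a hypothetical sketch, assuming sample_next_chronics returns the id of the selected chronic, as suggested by the example above), such an adaptive scheme could look like:
import numpy as np
import grid2op
from grid2op.Agent import RandomAgent

env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
total_episode = len(env.chronics_handler.subpaths)
# number of steps survived the last time each chronic was played (1 to start with)
survival = np.ones(total_episode)

for i in range(100):
    # chronics on which the agent survived fewer steps are sampled more often
    probas = 1.0 / survival
    chron_id = env.chronics_handler.sample_next_chronics(probas)
    obs = env.reset()
    reward, done, nb_step = 0.0, False, 0
    while not done:
        obs, reward, done, info = env.step(agent.act(obs, reward, done))
        nb_step += 1
    survival[chron_id] = max(nb_step, 1)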
Skipping some time steps
Another way to customize which data your agent will face is to make as if the chronics started at a different date and time. This might be handy in case a scenario is hard at the beginning but less hard at the end, or if you want your agent to learn to start controlling the grid at any date and time (in grid2op most of the chronics data provided start at midnight for example).
To achieve this goal, you can use the BaseEnv.fast_forward_chronics() function. This function skips a given number of steps. In the following example, we always skip the first 42 time steps before starting the episode:
import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make("l2rpn_case14_sandbox")
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    obs = env.reset()

    ###################################
    # below are the two lines added
    env.fast_forward_chronics(42)
    obs = env.get_obs()
    ###################################

    # now play the episode as usual
    while True:
        action = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
(as always, the lines added compared to the base code are highlighted: they are “circled” with #####)
Generating chronics that are always new
New in version 1.6.6: This functionality is only available for some environments, for example “l2rpn_wcci_2022”
Warning
A much better alternative to this class is to have a “process” generate the data, thanks to the grid2op.Environment.Environment.generate_data() function, and then to reload the data in a (separate) training script.
This is explained in section Generate and use an “infinite” data of the documentation.
Though it is not recommended at all (for performance reasons), you have the possibility, starting from grid2op 1.6.6 (and using a compatible environment, eg “l2rpn_wcci_2022”), to generate a possibly infinite amount of data thanks to the grid2op.Chronics.FromChronix2grid class.
The data generation process is rather slow for different reasons. The main one is that the data needs to meet a lot of “constraints” to be realistic, some of which are given in the Elements modeled in this environment and their main properties module. On our machines, it takes roughly 40-50 seconds to generate a weekly scenario for the l2rpn_wcci_2022 environment (usually an agent will fail in 1 or 2s… this is why we do not recommend using it).
To generate data “on the fly” you simply need to create the environment with the right chronics class as follows:
import os
import grid2op
from grid2op.Chronics import FromChronix2grid
env_nm = "l2rpn_wcci_2022"  # only compatible environment at time of writing
env = grid2op.make(env_nm,
                   chronics_class=FromChronix2grid,
                   data_feeding_kwargs={"env_path": os.path.join(grid2op.get_current_local_dir(), env_nm),
                                        "with_maintenance": True,  # whether to include maintenance (optional)
                                        "max_iter": 2 * 288,  # duration (in number of steps) of the data generated (optional)
                                        }
                   )
And this is it. Each time you call env.reset() it will internally call the chronix2grid package to generate new data for this environment (this is why env.reset() will take roughly 50s…).
Warning
For this class to be available, you need to have the “chronix2grid” package installed and working.
Please install it with pip install grid2op[chronix2grid] and make sure to have the coinor-cbc solver available on your system (more information at https://github.com/bdonnot/chronix2grid#installation)
Warning
Because I know from experience warnings are skipped half of the time: please consult Generate and use an “infinite” data for a better way to generate infinite data !
Generate and use an “infinite” data
New in version 1.6.6.
Warning
For this class to be available, you need to have the “chronix2grid” package installed and working.
Please install it with pip install grid2op[chronix2grid] and make sure to have the coinor-cbc solver available on your system (more information at https://github.com/bdonnot/chronix2grid#installation)
In this section we present a way to generate a possibly infinite amount of data for training your agent (in case the data shipped with the environment is too limited).
One way to do this is to split the data “generation” process into one python script, and the data “consumption” process (for example training an agent) into another one.
This is much more efficient than using the grid2op.Chronics.FromChronix2grid class, because you will not spend 50s waiting for the data to be generated at each call to env.reset() after the episode is over.
First, create a script to generate all the data that you want. For example in the script “generation.py”:
import grid2op
env_name = "l2rpn_wcci_2022" # only compatible with what comes next (at time of writing)
env = grid2op.make(env_name)
nb_year = 50 # or any "big" number...
env.generate_data(nb_year=nb_year) # generates 50 years of data
# (takes roughly 50s per week, around 45mins per year, in this case 50 * 45 mins = 37.5 hours)
Then create a script to “consume” your data, for example by training an agent (say “train.py”) [we demonstrate it with l2rpn_baselines but you can use whatever you want]:
import os
import grid2op
from lightsim2grid import LightSimBackend  # highly recommended for speed !

env_name = "l2rpn_wcci_2022"  # only compatible with what comes next (at time of writing)
env = grid2op.make(env_name, backend=LightSimBackend())

# now train an agent
# see the l2rpn_baselines package for more information, for example
# l2rpn-baselines.readthedocs.io/
from l2rpn_baselines.PPO_SB3 import train
nb_iter = 10000  # train for that many iterations
agent_name = "WhateverIWant"  # or any other name
agent_path = os.path.expanduser("~")  # or anywhere else on your computer
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path)
# this agent will be trained only on the data available at the creation of the environment

# the training loop will take some time, so more data will be generated when it's over
# reload them
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# and retrain your agent including the data you just generated
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path,
                      load_path=agent_path)

# once it's over, more time has passed, and more data are available
# reload them
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# and retrain your agent
trained_agent = train(env,
                      iterations=nb_iter,
                      name=agent_name,
                      save_path=agent_path,
                      load_path=agent_path)

# well you got the idea
# etc. etc.
Warning
This way of doing things will always increase the size of the data on your hard drive. We do recommend deleting some of the data from time to time.
Deleting the data should be done before the call to env.chronics_handler.init_subpath(), for example:
import os
import shutil
import grid2op

### delete the folders you want to get rid of
names_folder_to_delete = ...
# To build `names_folder_to_delete`
# you could for example:
# - remove the `nth` oldest directories
#   see: https://stackoverflow.com/questions/47739262/find-remove-oldest-file-in-directory
# - or keep only the `kth` most recent directories
# - or keep only `k` folders at random among the ones in `grid2op.get_current_local_dir()`
# - or delete all the oldest files and keep your directory at a fixed size
#   see: https://gist.github.com/ginz/1ba7de8b911651cfc9c85a82a723f952
# etc.
for nm in names_folder_to_delete:
    shutil.rmtree(os.path.join(grid2op.get_current_local_dir(), nm))
####

# reload the remaining data:
env.chronics_handler.init_subpath()
env.chronics_handler.reset()

# continue normally
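As an illustration of one of the strategies listed in the comments above (a sketch only: the location of the generated scenarios, assumed here to be the “chronics” subdirectory of the l2rpn_wcci_2022 environment, and the number of folders kept are assumptions to adapt to your setup):
import os
import shutil
import grid2op

k = 20  # keep only the k most recent scenario folders (hypothetical value)
data_dir = os.path.join(grid2op.get_current_local_dir(), "l2rpn_wcci_2022", "chronics")
folders = [os.path.join(data_dir, nm) for nm in os.listdir(data_dir)
           if os.path.isdir(os.path.join(data_dir, nm))]
folders.sort(key=os.path.getmtime, reverse=True)  # most recent first
for path in folders[k:]:
    shutil.rmtree(path)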
Splitting into training, validation, test scenarios
In machine learning the “training / validation / test” framework is particularly useful to avoid overfitting and to develop models that are as performant as possible.
Grid2op allows for such usage at the environment level. There is the possibility to “split” an environment into training / validation and test (ie using only some chronics for training, some others for validation and some others for testing).
This can be done with:
import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# extract 1% of the "chronics" to be used in the validation environment and
# 1% to be used in the test environment. The remaining 98% are kept for training.
nm_env_train, nm_env_val, nm_env_test = env.train_val_split_random(pct_val=1., pct_test=1.)

# and now you can use the training set only to train your agent:
print(f"The name of the training environment is \"{nm_env_train}\"")
print(f"The name of the validation environment is \"{nm_env_val}\"")
print(f"The name of the test environment is \"{nm_env_test}\"")
env_train = grid2op.make(nm_env_train)
You can then use, in the above case:
import grid2op
env_name = "l2rpn_case14_sandbox" # matching above
env_train = grid2op.make(env_name+"_train") # to only use the "training chronics"
# do whatever you want with env_train
And then, at time of validation:
import grid2op
env_name = "l2rpn_case14_sandbox" # matching above
env_val = grid2op.make(env_name+"_val") # to only use the "validation chronics"
# do whatever you want with env_val
# and of course
env_test = grid2op.make(env_name+"_test")
Customization
Environments can be customized in three major ways:
Backend: you change the solver that computes the state of the power grid, typically to make it faster or more realistic
Parameters: you change the behaviour of the Environment. For example you can prevent powerlines from being disconnected when too much current flows on them, etc.
Rules: you change the operational constraints that your agent must meet. For example you can allow it to affect more or fewer powerlines in the same action, etc. (a sketch for this one is given after the code block below)
You can do these at creation time:
import grid2op
env_name = "l2rpn_case14_sandbox" # or any other name
# create the regular environment:
env_reg = grid2op.make(env_name)
# to change the backend
# (here using the lightsim2grid faster backend)
from lightsim2grid import LightSimBackend
env_faster = grid2op.make(env_name, backend=LightSimBackend())
# to change the parameters, for example
# to prevent line disconnect when there is overflow
param = env_reg.parameters
param.NO_OVERFLOW_DISCONNECTION = True
env_easier = grid2op.make(env_name, param=param)
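For the rules, a minimal sketch (assuming you want to lift all operational constraints by using the AlwaysLegal rule class shipped with grid2op) could be:
import grid2op
from grid2op.Rules import AlwaysLegal

env_name = "l2rpn_case14_sandbox"
# every action is considered legal: no limit on the number of
# substations / powerlines affected in a single action
env_no_rules = grid2op.make(env_name, gamerules_class=AlwaysLegal)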
Of course you can combine everything. More examples are given in section Customize your environment.
Detailed Documentation by class
Classes:
- BaseEnv: INTERNAL
- BaseMultiProcessEnvironment: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- Environment: This class is the grid2op implementation of the "Environment" entity in the RL framework.
- MaskedEnvironment: This class is the grid2op implementation of a "masked" environment: lines not in the lines_of_interest mask will NOT be deactivated by the environment if the flow is too high (or moderately high for too long).
- MultiEnvMultiProcess: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- MultiMixEnvironment: This class represents a single powergrid configuration, backed by multiple environment parameters and chronics.
- SingleEnvMultiProcess: This class allows to evaluate a single agent instance on multiple environments running in parallel.
- TimedOutEnvironment: This class is the grid2op implementation of a "timed out environment" entity in the RL framework.
- class grid2op.Environment.BaseEnv(init_env_path: ~os.PathLike, init_grid_path: ~os.PathLike, parameters: ~grid2op.Parameters.Parameters, voltagecontrolerClass: type, name='unknown', thermal_limit_a: ~numpy.ndarray | None = None, epsilon_poly: float = 0.0001, tol_poly: float = 0.01, other_rewards: dict | None = None, with_forecast: bool = True, opponent_space_type: type = <class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class: type = <class 'grid2op.Action.dontAct.DontAct'>, opponent_class: type = <class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget: float = 0.0, opponent_budget_per_ts: float = 0.0, opponent_budget_class: type = <class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration: int = 0, opponent_attack_cooldown: int = 99999, kwargs_opponent: dict | None = None, has_attention_budget: bool = False, attention_budget_cls: type = <class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget: dict | None = None, logger: ~logging.Logger | None = None, kwargs_observation: dict | None = None, observation_bk_class=None, observation_bk_kwargs=None, highres_sim_counter=None, update_obs_after_reward=False, n_busbar=2, _is_test: bool = False, _init_obs: ~grid2op.Observation.baseObservation.BaseObservation | None = None, _local_dir_cls=None, _read_from_local_dir=None, _raw_backend_class=None)[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This class represents some useful abstraction that is re-used by Environment and grid2op.Observation._ObsEnv for example.
The documentation is shown here to document the common attributes of a “BaseEnvironment”.
Notes
Note on environment data ownership
Danger
A non pythonic decision has been implemented in grid2op for various reasons: an environment owns everything created from it.
This means that if you (or the python interpreter) delete the environment, you might not be able to use some data generated with this environment.
More precisely, you cannot do something like:
import grid2op
env = grid2op.make("l2rpn_case14_sandbox")

saved_obs = []

obs = env.reset()
saved_obs.append(obs)
obs2, reward, done, info = env.step(env.action_space())
saved_obs.append(obs2)

saved_obs[0].simulate(env.action_space())  # works
del env
saved_obs[0].simulate(env.action_space())  # DOES NOT WORK
It will raise an error like Grid2OpException EnvError “This environment is closed. You cannot use it anymore.”
This will also happen if you do things inside functions, for example like this:
import grid2op

def foo(manager):
    env = grid2op.make("l2rpn_case14_sandbox")
    obs = env.reset()
    manager.append(obs)
    obs2, reward, done, info = env.step(env.action_space())
    manager.append(obs2)
    manager[0].simulate(env.action_space())  # works
    return manager

manager = []
manager = foo(manager)
manager[0].simulate(env.action_space())  # DOES NOT WORK
The same error is raised because the environment env is automatically deleted by python when the function foo ends (well, it might work in some cases, if the function is called before the variable env is actually deleted, but you should not rely on this behaviour).
- parameters
The parameters of the game (to expose more control on what is being simulated)
- with_forecast
Whether the chronics allow to have some kind of “forecast”. See
BaseEnv.activate_forceast() for more information
- Type:
bool
- logger
TO BE DONE: a way to log what is happening (currently not implemented)
- time_stamp
The actual time stamp of the current observation.
- Type:
datetime.datetime
- nb_time_step
Number of time steps played in the current environment
- Type:
int
- current_obs
The current observation (or None if it’s not initialized)
- backend
The backend used to compute the powerflows.
- Type:
- done
Whether the environment is “done”. If True you need to call Environment.reset() in order to continue.
- Type:
bool
- current_reward
The last computed reward (reward of the current step)
- Type:
float
- other_rewards
Dictionary with key being the name (identifier) and value being some RewardHelper. At each time step, all the values will be computed by the Environment and the information about it will be returned in the “reward” key of the “info” dictionary of Environment.step().
- Type:
dict
- chronics_handler
The object in charge managing the “chronics”, which store the information about load and generator for example.
- reward_range
For open ai gym compatibility. It represents the range of the rewards: reward min, reward max
- Type:
tuple
- _viewer
For open ai gym compatibility.
- viewer_fig
For open ai gym compatibility.
- _gen_activeprod_t
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Should be initialized at 0. for “step” to properly recognize it’s the first time step of the game
- _no_overflow_disconnection
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Whether or not cascading failures are computed or not (TRUE = the powerlines above their thermal limits will not be disconnected). This is initialized based on the attribute grid2op.Parameters.Parameters.NO_OVERFLOW_DISCONNECTION.
- Type:
bool
- _timestep_overflow
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of consecutive timesteps each powerline has been on overflow.
- Type:
numpy.ndarray
, dtype: int
- _nb_timestep_overflow_allowed
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of consecutive timesteps each powerline can be on overflow. It is usually read from grid2op.Parameters.Parameters.NB_TIMESTEP_POWERFLOW_ALLOWED.
- Type:
numpy.ndarray
, dtype: int
- _hard_overflow_threshold
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Number of timesteps before a grid2op.BaseAgent.BaseAgent can reconnect a powerline that has been disconnected by the environment due to an overflow.
- Type:
float
- _env_dc
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Whether the environment computes the powerflow using the DC approximation or not. It is usually read from grid2op.Parameters.Parameters.ENV_DC.
- Type:
bool
- _names_chronics_to_backend
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Configuration file used to associate the name of the objects in the backend (both extremities of powerlines, load or production for example) with the same object in the data (Environment.chronics_handler). The idea is that, usually, data generation comes from a different software that does not take into account the powergrid infrastructure. Hence, the same “object” can have a different name. This mapping is present to avoid the need to rename the “object” when providing data. A more detailed description is available at grid2op.ChronicsHandler.GridValue.initialize().
- Type:
dict
- _env_modification
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Representation of the actions of the environment for the modification of the powergrid.
- Type:
grid2op.Action.Action
- _rewardClass
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Type of reward used. Should be a subclass of
grid2op.BaseReward.BaseReward
- Type:
type
- _init_grid_path
Warning
/!\ Internal, do not use unless you know what you are doing /!\
The path where the description of the powergrid is located.
- Type:
str
- _game_rules
Warning
/!\ Internal, do not use unless you know what you are doing /!\
The rules of the game (define which actions are legal and which are not)
- _action_space
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to manipulate more easily the actions given to / provided by the
grid2op.Agent.BaseAgent
(player)
- _helper_action_env
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to manipulate more easily the actions given to / provided by the environment to the backend.
- _observation_space
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper used to generate the observation that will be given to the
grid2op.BaseAgent
- _reward_helper
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Helper that is called to compute the reward at each time step.
- Type:
grid2p.BaseReward.RewardHelper
- kwargs_observation
TODO
- Type:
dict
- # TODO add the units (eg MW, MWh, MW/time step,etc.) in the redispatching related attributes
Attributes:
- KEYS_RESET_OPTIONS: these are the keys of the dictionary options that can be used when calling env.reset(..., options={})
- action_space: this represents a view on the action space
- observation_space: this represents a view on the observation space
- parameters: Return a deepcopy of the parameters used by the environment
Methods:
- attach_layout(grid_layout): Compared to the method of the base class, this one performs a check.
- change_forecast_parameters(new_parameters): Allows to change the parameters of a “forecast environment”, that is, the environment used by grid2op.Observation.BaseObservation.simulate() and grid2op.Observation.BaseObservation.get_forecast_env()
- change_parameters(new_parameters): Allows to change the parameters of an environment.
- change_reward(new_reward_func): Change the reward function used for the environment.
- classes_are_in_files(): Whether the classes created when this environment has been made are stored on the hard drive (will return True) or not.
- close(): close an environment: this will attempt to free as much memory as possible.
- deactivate_forecast(): This function will have the effect to deactivate the obs.simulate, the forecast will not be updated in the observation space.
- fast_forward_chronics(nb_timestep): This method allows you to skip some time steps at the beginning of the chronics.
- generate_classes(*[, local_dir_id, _guard, ...]): Use with care, but can be incredibly useful !
- get_current_line_status(): INTERNAL
- get_obs([_update_state, _do_copy]): Return the observations of the current environment made by the grid2op.Agent.BaseAgent.
- get_path_env(): Get the path that allows to create this environment.
- get_reward_instance(): INTERNAL
- get_thermal_limit(): Get the current thermal limit in amps registered for the environment.
- load_alarm_data(): Internal
- load_alert_data(): Internal
- reactivate_forecast(): This function will have the effect to reactivate the obs.simulate, the forecast will be updated in the observation space.
- reset(*[, seed, options]): Reset the base environment (set the appropriate variables to correct initialization).
- seed([seed, _seed_me]): Set the seed of this Environment for a better control and to ease reproducible experiments.
- set_thermal_limit(thermal_limit): Set the thermal limit effectively.
- step(action): Run one timestep of the environment's dynamics.
- KEYS_RESET_OPTIONS = {'init state', 'init ts', 'max step', 'time serie id'}
these are the keys of the dictionary options that can be used when calling env.reset(…, options={})
- property action_space: ActionSpace
this represents a view on the action space
- attach_layout(grid_layout)[source]
Compared to the method of the base class, this one performs a check. This method must be called after initialization.
- Parameters:
grid_layout (
dict
) – The layout of the grid (i.e the coordinates (x,y) of all substations). The keys should be the substation names, and the values a tuple (with two float) representing the coordinate of the substation.
Examples
Here is an example on how to attach a layout for an environment:
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

# assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
env.attach_layout(layout)
- change_forecast_parameters(new_parameters)[source]
Allows to change the parameters of a “forecast environment”, that is, the environment used by the methods grid2op.Observation.BaseObservation.simulate() and grid2op.Observation.BaseObservation.get_forecast_env()
Notes
This only affects the environment AFTER env.reset() has been called.
This only affects the “forecast env” and NOT the env itself.
- Parameters:
new_parameters (
grid2op.Parameters.Parameters
) – The new parameters you want the environment to get.
Examples
This can be used like:
import grid2op

env_name = "l2rpn_case14_sandbox"  # or any other name
env = grid2op.make(env_name)

param = env.parameters
param.NO_OVERFLOW_DISCONNECTION = True  # or any other properties of the environment
env.change_forecast_parameters(param)
# at this point this has no impact.

obs = env.reset()
# now, after the reset, the right parameters are used
sim_obs, sim_reward, sim_done, sim_info = obs.simulate(env.action_space())
# the new parameters `param` are used for this
# and also for
forecasted_env = obs.get_forecast_env()
- change_parameters(new_parameters)[source]
Allows to change the parameters of an environment.
Notes
This only affects the environment AFTER env.reset() has been called.
This only affects the environment and NOT the forecast.
- Parameters:
new_parameters (
grid2op.Parameters.Parameters
) – The new parameters you want the environment to get.
Examples
You can use this function like:
import grid2op
from grid2op.Parameters import Parameters

env_name = "l2rpn_case14_sandbox"  # or any other name
env = grid2op.make(env_name)
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> False

new_param = Parameters()
new_param.A_MEMBER = A_VALUE  # eg new_param.NO_OVERFLOW_DISCONNECTION = True
env.change_parameters(new_param)
obs = env.reset()
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> True
- change_reward(new_reward_func)[source]
Change the reward function used for the environment.
TODO examples !
- Parameters:
new_reward_func – Either an object of class BaseReward, or a subclass of BaseReward: the new reward function to use
Notes
This only affects the environment AFTER env.reset() has been called.
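As the example is still marked TODO above, here is a hedged sketch (assuming you want to switch to the L2RPNReward class shipped with grid2op):
import grid2op
from grid2op.Reward import L2RPNReward

env = grid2op.make("l2rpn_case14_sandbox")
env.change_reward(L2RPNReward)
obs = env.reset()  # the new reward function is used from this reset onwards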
- classes_are_in_files() bool [source]
Whether the classes created when this environment has been made are stored on the hard drive (will return True) or not.
Note
This will become the default behaviour in future grid2op versions.
See Pickle issues for more information.
- close()[source]
close an environment: this will attempt to free as much memory as possible. Note that after an environment is closed, you will not be able to use it anymore.
Any attempt to use a closed environment might result in non deterministic behaviour.
- deactivate_forecast()[source]
This function will have the effect to deactivate the obs.simulate, the forecast will not be updated in the observation space.
This will most likely lead to some performance increase (~10-15% faster) if you don’t use the obs.simulate function.
Notes
If you really don’t want to use the obs.simulate functionality, you should rather disable it at the creation of the environment. For example, if you use the recommended make function, you can pass an argument that will ignore the chronics even when reading it (using GridStateFromFile instead of GridStateFromFileWithForecast for example) this would give something like:
import grid2op
from grid2op.Chronics import GridStateFromFile

# tell grid2op not to read the "forecast"
env = grid2op.make("l2rpn_case14_sandbox",
                   data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the programm stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR
- fast_forward_chronics(nb_timestep)[source]
This method allows you to skip some time step at the beginning of the chronics.
This is useful at the beginning of the training, if you want your agent to learn on more diverse scenarios. Indeed, the data provided in the chronics usually starts always at the same date and time (for example Jan 1st at 00:00). This can lead to suboptimal exploration, as during this phase only a few time steps are managed by the agent, so in general these few time steps will correspond to grid states around Jan 1st at 00:00.
See also
From grid2op version 1.10.3, a similar objective can be obtained directly by calling
grid2op.Environment.Environment.reset()
with “init ts” as option, for example like obs = env.reset(options={“init ts”: 12})Danger
The usage of both
BaseEnv.fast_forward_chronics()
andEnvironment.set_max_iter()
is not recommended at all and might not behave correctly. Please use env.reset with obs = env.reset(options={“max step”: xxx, “init ts”: yyy}) for a correct behaviour.- Parameters:
nb_timestep (
int
) – Number of time step to “fast forward”
Examples
From grid2op version 1.10.3 we recommend not to use this function (which will be deprecated) but to use the grid2op.Environment.Environment.reset() function with the “init ts” option:

import grid2op
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)
obs = env.reset(options={"init ts": 123})
For the legacy usage, this can be used like this:
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

# skip the first 150 steps of the chronics
env.fast_forward_chronics(150)
done = env.is_done
if not done:
    obs = env.get_obs()
    # do something
else:
    # there was a "game over"
    # you need to reset the env (which will "cancel" the fast_forward)
    pass
    # do something else
Notes
This method can put the environment in a ‘game over’ state (done=True), for example if the chronics last xxx time steps and you ask to “fast forward” more than xxx steps. This is why we advise to check the state of the environment after the call to this method if you use it (see the “Examples” paragraph)
- generate_classes(*, local_dir_id=None, _guard=None, _is_base_env__=True, sys_path=None)[source]
Use with care, but can be incredibly useful !
If you get into trouble like :
AttributeError: Can't get attribute 'ActionSpace_l2rpn_icaps_2021_small' on <module 'grid2op.Space.GridObjects' from /home/user/Documents/grid2op_dev/grid2op/Space/GridObjects.py'>
You might want to call this function and that MIGHT solve your problem.
This function will create a subdirectory into the env directory, that will be accessed when loading the classes used for the environment.
The default behaviour is to build the class on the fly which can cause some issues when using pickle or multiprocessing for example.
Examples
Here is how to best leverage this functionality:
First step, generated the classes once and for all.
Warning
You need to redo this step each time you customize the environment. This customization includes, but is not limited to:
change the backend type: grid2op.make(…, backend=…)
change the action class: grid2op.make(…, action_class=…)
change observation class: grid2op.make(…, observation_class=…)
change the volagecontroler_class
change the grid_path
change the opponent_action_class
etc.
import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other name

env = grid2op.make(env_name, ...)  # again: redo this step each time you customize "..."
# for example if you change the `action_class` or the `backend` etc.

env.generate_classes()
Then, next time you want to use the SAME environment, you can do:
import grid2op
env_name = SAME NAME AS ABOVE
env = grid2op.make(env_name,
                   experimental_read_from_local_dir=True,
                   SAME ENV CUSTOMIZATION AS ABOVE)
And it should (this is experimental for now, and we expect feedback on the matter) solve the issues involving pickle.
Again, if you customize your environment (see above for more information) you’ll have to redo this step !
- get_current_line_status()[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
prefer using
grid2op.Observation.BaseObservation.line_status
This method allows to retrieve the line status.
- get_obs(_update_state=True, _do_copy=True)[source]
Return the observations of the current environment made by the
grid2op.Agent.BaseAgent
.Note
This function is called twice when the env is reset, otherwise once per step
- _do_copy :
Whether or not to make a copy of the returned observation. By default it will do one. Be aware that this might cause trouble if used incorrectly.
- Returns:
res – The current observation usually given to the
grid2op.Agent.BaseAgent
/ bot / controler.- Return type:
Examples
This function can be use at any moment, even if the actual observation is not present.
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

# have a big piece of code
obs2 = env.get_obs()

# obs2 and obs are identical.
- get_path_env()[source]
Get the path that allows to create this environment.
It can be used for example in
grid2op.utils.EpisodeStatistics()
to save the information directly inside the environment data.
- get_reward_instance()[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
Returns the instance of the object that is used to compute the reward.
- get_thermal_limit()[source]
Get the current thermal limit in amps registered for the environment.
Examples
It can be used like this:
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")
thermal_limits = env.get_thermal_limit()
- load_alarm_data()[source]
Internal
Warning
/!\ Only valid with “l2rpn_icaps_2021” environment /!\
Notes
This is called when the environment class is not created, so i need to read the data of the grid from the backend.
I cannot use “self.name_line” for example.
This function update the backend INSTANCE. The backend class is then updated in the
BaseEnv._init_backend()
function with a call to self.backend.assert_grid_correct()
- load_alert_data()[source]
Internal
Notes
This is called to get the alertable lines when the warning is raised “by line”
- property observation_space: ObservationSpace
this represents a view on the observation space
- property parameters
Return a deepcopy of the parameters used by the environment
It is a deepcopy, so modifying it will have absolutely no effect on the environment.
If you want to change the parameters of an environment, please use either grid2op.Environment.BaseEnv.change_parameters() to change the parameters of this environment, or grid2op.Environment.BaseEnv.change_forecast_parameters() to change the parameters of the environment used by grid2op.Observation.BaseObservation.simulate() or grid2op.Observation.BaseObservation.get_forecast_env()
Danger
To modify the environment parameters you need to do:
params = env.parameters
params.WHATEVER = NEW_VALUE
env.change_parameters(params)
env.reset()
If you simply do:
env.params.WHATEVER = NEW_VALUE  # no effect !
This will have absolutely no impact.
- reactivate_forecast()[source]
This function will have the effect to reactivate the obs.simulate, the forecast will be updated in the observation space.
This will most likely lead to some performance decrease but you will be able to use obs.simulate function.
Warning
Forecast are deactivated by default (and cannot be reactivated) if the backend cannot be copied.
Warning
You need to call ‘env.reset()’ for this function to work properly. It is NOT recommended to reactivate forecasts in the middle of an episode.
Notes
You can use this function as followed:
import grid2op
from grid2op.Chronics import GridStateFromFile

# tell grid2op not to read the "forecast"
env = grid2op.make("l2rpn_case14_sandbox",
                   data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the programm stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR

env.reactivate_forecast()
obs = env.reset()  # you need to reset the env for this function to have any effects
obs, reward, done, info = env.step(do_nothing_action)

# and now forecast are available again
simobs, sim_r, sim_d, sim_info = obs.simulate(do_nothing_action)
- reset(*, seed: int | None = None, options: Dict[Literal['time serie id'], int] | Dict[Literal['init state'], Dict[Literal['set_line_status', 'change_line_status', 'set_bus', 'change_bus', 'redispatch', 'set_storage', 'curtail', 'raise_alarm', 'raise_alert', 'injection', 'hazards', 'maintenance', 'shunt'], Any]] | Dict[Literal['init ts'], int] | Dict[Literal['max step'], int] | None = None)[source]
Reset the base environment (set the appropriate variables to correct initialization). It is (and must be) overloaded in the other classes of the grid2op.Environment module.
- seed(seed=None, _seed_me=True)[source]
Set the seed of this Environment for a better control and to ease reproducible experiments.
See also
The function Environment.reset() for extra information.
Changed in version 1.9.8: Starting from version 1.9.8 you can directly set the seed when calling reset.
Warning
It is preferable to call this function just before a call to env.reset() otherwise the seeding might not work properly (especially if some non standard “time serie generators” aka chronics are used)
- Parameters:
seed (int) – The seed to set.
_seed_me (bool) – Whether to seed this instance or just the other things. Used internally only.
- Returns:
seed (tuple) – The seed used to set the prng (pseudo random number generator) for the environment
seed_chron (tuple) – The seed used to set the prng for the chronics_handler (if any), otherwise None
seed_obs (tuple) – The seed used to set the prng for the observation space (if any), otherwise None
seed_action_space (tuple) – The seed used to set the prng for the action space (if any), otherwise None
seed_env_modif (tuple) – The seed used to set the prng for the modification of the environment (if any, otherwise None)
seed_volt_cont (tuple) – The seed used to set the prng for the voltage controler (if any, otherwise None)
seed_opponent (tuple) – The seed used to set the prng for the opponent (if any, otherwise None)
Examples
Seeding an environment should be done with:
import grid2op
env = grid2op.make("l2rpn_case14_sandbox")
env.seed(0)
obs = env.reset()
As long as the environment instance (variable env in the above code) is not reset the env.seed has no real effect (but can have side effect).
For a full control on the seed mechanism it is more than advised to reset it after it has been seeded.
- set_thermal_limit(thermal_limit)[source]
Set the thermal limit effectively.
- Parameters:
thermal_limit (
numpy.ndarray
) –The new thermal limit. It must be a numpy ndarray vector (or convertible to it). For each powerline it gives the new thermal limit.
Alternatively, this can be a dictionary mapping the line names (keys) to its thermal limits (values). In that case, all thermal limits for all powerlines should be specified (this is a safety measure to reduce the odds of misuse).
Examples
This function can be used like this:
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox", test=True)

# i set the thermal limit of each powerline to 20000 amps
env.set_thermal_limit([20000 for _ in range(env.n_line)])
Notes
As of grid2op > 1.5.0, it is possible to set the thermal limit by using a dictionary with the keys being the name of the powerline and the values the thermal limits.
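For instance, a small sketch of this dictionary variant (here every line gets the same limit; adapt the values to your needs):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox", test=True)
# map each powerline name to its new thermal limit (in amps)
new_limits = {line_name: 20000. for line_name in env.name_line}
env.set_thermal_limit(new_limits)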
- step(action: BaseAction) Tuple[BaseObservation, float, bool, Dict[Literal['disc_lines', 'is_illegal', 'is_ambiguous', 'is_dispatching_illegal', 'is_illegal_reco', 'reason_alarm_illegal', 'reason_alert_illegal', 'opponent_attack_line', 'opponent_attack_sub', 'exception', 'detailed_infos_for_cascading_failures', 'rewards', 'time_series_id'], Any]] [source]
Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, done, info).
If the grid2op.BaseAction.BaseAction is illegal or ambiguous, the step is performed, but the action is replaced with a “do nothing” action.
- Parameters:
action (grid2op.Action.Action) – an action provided by the agent that is applied on the underlying grid through the backend.
- Returns:
observation (grid2op.Observation.Observation) – agent’s observation of the current environment
reward (float) – amount of reward returned after previous action
done (bool) – whether the episode has ended, in which case further step() calls will return undefined results
info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning). It is a dictionary with keys:
”disc_lines”: a numpy array (or None) saying, for each powerline, if it has been disconnected due to overflow (if not disconnected it will be -1, otherwise it will be a positive integer: 0 meaning that it is one of the causes of the cascading failure, 1 meaning that it is disconnected just after, 2 that it is disconnected just after, etc.)
”is_illegal” (bool) whether the action given as input was illegal
”is_ambiguous” (bool) whether the action given as input was ambiguous.
”is_dispatching_illegal” (bool) was the action illegal due to redispatching
”is_illegal_reco” (bool) was the action illegal due to a powerline reconnection
”reason_alarm_illegal” (None or Exception) reason for which the alarm is illegal (it’s None if no alarm is raised or if the alarm feature is not used)
”reason_alert_illegal” (None or Exception) reason for which the alert is illegal (it’s None if no alert is raised or if the alert feature is not used)
”opponent_attack_line” (np.ndarray, bool) for each powerline, say if the opponent attacked it (True) or not (False).
”opponent_attack_sub” (np.ndarray, bool) for each substation, say if the opponent attacked it (True) or not (False).
”opponent_attack_duration” (int) the duration of the current attack (if any)
”exception” (list of Exceptions.Exceptions.Grid2OpException if an exception was raised, or [] if everything was fine.)
”detailed_infos_for_cascading_failures” (optional, only if the backend has been created with detailed_infos_for_cascading_failures=True) the list of the intermediate steps computed during the simulation of the “cascading failures”.
”rewards”: dictionary of all “other_rewards” provided when the env was built.
”time_series_id”: id of the time series used (if any, similar to a call to env.chronics_handler.get_id())
Examples
This is used like:
import grid2op
from grid2op.Agent import RandomAgent

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox")

# define an agent here, this is an example
agent = RandomAgent(env.action_space)

# environment need to be "reset" before usage:
obs = env.reset()
reward = env.reward_range[0]
done = False

# now run through each steps like this
while not done:
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)
Notes
If the flag done=True is raised (ie this is the end of the episode) then the observation is NOT properly updated and should not be used at all.
Actually, it will be in a “game over” state (see
grid2op.Observation.BaseObservation.set_game_over
).
- class grid2op.Environment.BaseMultiProcessEnvironment(envs, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It uses the python “multiprocessing” framework to work, and thus is suitable only on a single machine with multiple cores (cpu / thread). We do not recommend to use this method on a cluster of different machines.
This class uses the following representation:
a grid2op.BaseAgent.BaseAgent lives in a main process
different environments live in different processes
a call to MultiEnv.step() will perform one step per environment, in parallel, using a Pipe to transfer data to and from the main process from each individual environment process. It is a synchronous function. It means it will wait for every environment to finish the step before returning all the information.
There are some limitations. For example, even if forecasts are available, it’s not possible to use forecasts of the observations. This implies that grid2op.Observation.BaseObservation.simulate() is not available when using MultiEnvironment.
Compared to regular Environments, MultiEnvironment simply stacks everything. You need to send not a single grid2op.Action.BaseAction but as many actions as there are underlying environments. You receive not one single grid2op.Observation.BaseObservation but as many observations as the number of underlying environments.
A broader support of regular grid2op environment capabilities, as well as support for the grid2op.Observation.BaseObservation.simulate() call, might be added in the future.
call might be added in the future.NB As opposed to
Environment.step()
a call toBaseMultiProcessEnvironment.step()
or any of its derived class (SingleEnvMultiProcess
orMultiEnvMultiProcess
) if a sub environment is “done” then it is automatically reset. This means entails that you can callBaseMultiProcessEnvironment.step()
without worrying about having to reset.- envs
A list of environments for which the evaluation will be made in parallel.
- Type:
list::grid2op.Environment.Environment
- nb_env
Number of parallel underlying environments that will be handled. It is also the size of the list of actions that needs to be provided in MultiEnvironment.step() and the return size of the list returned by this same function.
- Type:
int
- obs_as_class
Whether to convert the observations back to
grid2op.Observation
objects or to leave them as numpy arrays. The default (obs_as_class=True) is to send them as observation objects, but it is slower.- Type:
bool
- return_info
Whether to return the information dictionary or not (might speed up computation)
- Type:
bool
Methods:
close
()Close all the environments and all the processes.
Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments
get_obs
()implement the get_obs function that is "broken" if you use the __getattr__
Get the parameters of each sub environments
Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments
Get the seeds used to initialize each sub environments.
Get the computation time (corresponding to sub_env._time_step) of each sub environments
reset
()Reset all the environments, and return all the associated observation.
set_chunk_size
(new_chunk_size)Dynamically adapt the amount of data read from the hard drive.
set_ff
([ff_max])This method is primarily used for training.
set_filter
(filter_funs)Set a filter_fun for each of the underlying environment.
set_id
(id_)Set a chronics id for each of the underlying environment to be used for each of the sub_env.
simulate
(actions)Perform the equivalent of obs.simulate in all the underlying environment
step
(actions)Perform a step in all the underlying environments.
- get_comp_time()[source]
Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments
- get_powerflow_time()[source]
Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments
- get_step_time()[source]
Get the computation time (corresponding to sub_env._time_step) of each sub environments
- reset()[source]
Reset all the environments, and return all the associated observation.
NB Except in some specific occasions, there is no need to call this reset function. Indeed, when a sub environment is “done”, it is automatically restarted in the BaseMultiProcessEnvironment.step() function.
- Returns:
res – The list of all observations. This list counts
MultiEnvironment.nb_env
elements, each one being angrid2op.Observation.BaseObservation
.- Return type:
list
- set_chunk_size(new_chunk_size)[source]
Dynamically adapt the amount of data read from the hard drive. Useful to set it to a low integer value (e.g. 10 or 100) at the beginning of the learning process, when the agent fails pretty quickly.
This takes effect only after a reset has been performed.
- Parameters:
new_chunk_size (
int
) – The new chunk size (positive integer)
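For illustration, a minimal sketch of a possible usage (the chunk size of 100 is arbitrary):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# read the time series 100 steps at a time in every sub environment
multi_env.set_chunk_size(100)
obss = multi_env.reset()  # takes effect only after a reset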
- set_ff(ff_max=2016.0)[source]
This method is primarily used for training.
The problem this method aims at solving is the following: most grid2op environments start on a Monday at 00:00. This method will “fast forward” each environment by a random number of time steps between 0 and
ff_max
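A hedged sketch of a possible usage (2016 steps is roughly one week at 5 minutes per step; whether the fast forward is applied immediately or at the next reset should be checked against the implementation):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# each sub environment starts somewhere at random within the first week
multi_env.set_ff(ff_max=2016.)
obss = multi_env.reset()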
- set_filter(filter_funs)[source]
Set a filter_fun for each of the underlying environment.
See
grid2op.Chronics.MultiFolder.set_filter()
for more information
Examples
TODO usage example
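Until that example is written, here is a hedged sketch of what a usage could look like (it assumes each element of filter_funs is a callable receiving a chronics path and returning True to keep it; check grid2op.Chronics.MultiFolder.set_filter() for the exact signature):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

# hypothetical filter: keep only the chronics whose folder name contains "january"
keep_january = lambda path: "january" in str(path)
multi_env.set_filter([keep_january, keep_january])
obss = multi_env.reset()  # the filter is taken into account at reset time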
- set_id(id_)[source]
Set a chronics id for each of the underlying environment to be used for each of the sub_env.
See
grid2op.Environment.Environment.set_id()
for more informationExamples
TODO usage example
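Until that example is written, here is a hedged sketch (it assumes the same id is applied to every sub environment; check Environment.set_id() for the exact semantics):
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")
env2 = grid2op.make("l2rpn_case14_sandbox")
multi_env = BaseMultiProcessEnvironment([env1, env2])

multi_env.set_id(0)       # use the time series with id 0 in every sub environment
obss = multi_env.reset()  # takes effect at the next reset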
- simulate(actions)[source]
Perform the equivalent of obs.simulate in all the underlying environment
- Parameters:
actions (
list
) – List of all actions to simulate- Returns:
sim_obs – The observation resulting from the simulation
sim_rews – The reward resulting from the simulation
sim_dones – For each simulation, whether or not the simulated action led to a game over
sim_infos – Additional information for each simulated actions.
Examples
You can use this feature like:
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env_name = "l2rpn_case14_sandbox"  # or any other name
env1 = grid2op.make(env_name)
env2 = grid2op.make(env_name)

multi_env = BaseMultiProcessEnvironment([env1, env2])
obss = multi_env.reset()

# simulate
actions = [env1.action_space(), env2.action_space()]
sim_obss, sim_rs, sim_ds, sim_is = multi_env.simulate(actions)
- step(actions)[source]
Perform a step in all the underlying environments. If one or more of the underlying environments encounters a game over, it is automatically restarted.
The observation sent back to the user is the observation after the
grid2op.Environment.Environment.reset()
has been called. As opposed to
Environment.step
, a call to this function will automatically reset any of the underlying environments in case one of them is “done”. This is performed the following way. In case one underlying environment is over (due to a game over or to the end of the chronics), then: the corresponding “done” is returned as
True
the corresponding observation returned is not the observation of the last time step (corresponding to the underlying environment that is game over) but is the first observation after reset.
At the next call to step, the flag done will be set (if no game over arises) to
False
and the corresponding observation is the next observation of this underlying environment: everything works as usual in this case. We did that because restarting the game over environment (on the user side) added unnecessary complexity.
- Parameters:
actions (
list
) – List of MultiEnvironment.nb_env
grid2op.Action.BaseAction
. Each action will be executed in the corresponding underlying environment.- Returns:
obs (
list
) – List of all the observations returned by each underlying environment.rews (
list
) – List of all the rewards returned by each underlying environment.dones (
list
) – List of all the “done” flags returned by each underlying environment. If one of these values is “True”, it means the corresponding environment encountered a game over.infos (
list
) – List of the corresponding information dictionaries returned by each underlying environment.
Examples
You can use this class as followed:
import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env1 = grid2op.make("l2rpn_case14_sandbox")  # create an environment of your choosing
env2 = grid2op.make("l2rpn_case14_sandbox")  # create another environment of your choosing
multi_env = BaseMultiProcessEnvironment([env1, env2])

obss = multi_env.reset()
obs1, obs2 = obss  # here i extract the observation of the first environment and of the second one
# note that you cannot do obs1.simulate().
# this is equivalent to a call to
# obs1 = env1.reset(); obs2 = env2.reset()

# then you can do regular steps
action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])

# if you define
# obs1, obs2 = obss
# r1, r2 = rewards
# done1, done2 = dones
# info1, info2 = infos

# in this case, it is equivalent to calling
# obs1, r1, done1, info1 = env1.step(action_env1)
# obs2, r2, done2, info2 = env2.step(action_env2)
Let us now focus on the “automatic” reset part.
# see above for the creation of a multi_env and the proper imports
multi_env = BaseMultiProcessEnvironment([env1, env2])

action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])

# say dones[0] is ``True``
# in this case if you define
# obs1 = obss[0]
# r1 = rewards[0]
# done1 = dones[0]
# info1 = infos[0]

# in that case it is equivalent to the "single processed" code
# obs1_tmp, r1_tmp, done1_tmp, info1_tmp = env1.step(action_env1)
# done1 = done1_tmp
# r1 = r1_tmp
# info1 = info1_tmp
# obs1_aux = env1.reset()
# obs1 = obs1_aux
# CAREFUL: in this case, obs1 is NOT obs1_tmp but is really obs1_aux,
# the first observation after the automatic reset.
- class grid2op.Environment.Environment(init_env_path: str, init_grid_path: str, chronics_handler, backend, parameters, name='unknown', n_busbar: int | ~typing.List[int] | ~typing.Dict[str, int] = 2, names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.topologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.completeObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.flatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, voltagecontrolerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, other_rewards={}, thermal_limit_a=None, with_forecast=True, epsilon_poly=0.0001, tol_poly=0.01, opponent_space_type=<class 'grid2op.Opponent.opponentSpace.OpponentSpace'>, opponent_action_class=<class 'grid2op.Action.dontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.baseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.neverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, kwargs_opponent={}, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget={}, has_attention_budget=False, logger=None, kwargs_observation=None, observation_bk_class=None, observation_bk_kwargs=None, highres_sim_counter=None, _update_obs_after_reward=True, _init_obs=None, _raw_backend_class=None, _compat_glop_version=None, _read_from_local_dir=None, _is_test=False, _allow_loaded_backend=False, _local_dir_cls=None, _overload_name_multimix=None)[source]
This class is the grid2op implementation of the “Environment” entity in the RL framework.
Danger
Long story short, once a environment is deleted, you cannot use anything it “holds” including, but not limited to the capacity to perform obs.simulate(…) even if the obs is still referenced.
See Notes (first danger block).
- name
The name of the environment
- Type:
str
- action_space
Another name for
Environment.helper_action_player
for gym compatibility.
- observation_space
Another name for
Environment.helper_observation
for gym compatibility.
- reward_range
The range of the reward function
- Type:
(float, float)
- metadata
For gym compatibility, do not use
- Type:
dict
- spec
For Gym compatibility, do not use
- Type:
None
- _viewer
Used to display the powergrid. Currently properly supported.
- Type:
object
Methods:
add_text_logger
([logger])Add a text logger to this
Environment
attach_renderer
([graph_layout])This function will attach a renderer, necessary to use for plotting capabilities.
copy
()Performs a deep copy of the environment
generate_data
([nb_year, nb_core, seed])This function uses the chronix2grid package to generate more data that will then be available locally.
get_kwargs
([with_backend, ...])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Return the maximum duration (in number of steps) of the current episode.
render
([mode])Render the state of the environment on the screen, using matplotlib Also returns the Matplotlib figure
reset
(*[, seed, options])Reset the environment to a clean state.
reset_grid
([init_act_opt, method])INTERNAL
set_chunk_size
(new_chunk_size)For an efficient data pipeline, it can be usefull to not read all part of the input data (for example for load_p, prod_p, load_q, prod_v).
set_id
(id_)Set the id that will be used at the next call to
Environment.reset()
.set_max_iter
(max_iter)Set the maximum duration of an episode for all the next episodes.
simulate
(action)Another method to call obs.simulate to ensure compatibility between multi environment and regular one.
train_val_split
(val_scen_id[, ...])This function is used as
Environment.train_val_split_random()
.train_val_split_random
([pct_val, ...])By default a grid2op environment contains multiple "scenarios" containing values for all the producers and consumers representing multiple days.
- add_text_logger(logger=None)[source]
Add a text logger to this
Environment
Logging is for now an incomplete feature, really incomplete (not used)
- Parameters:
logger – The logger to use
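A minimal sketch of how a logger could be attached (the logger name is arbitrary; as stated above, logging remains an incomplete feature):
import logging
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")

logger = logging.getLogger("grid2op_env")  # any standard python logger works here
logger.setLevel(logging.INFO)
env.add_text_logger(logger)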
- attach_renderer(graph_layout=None)[source]
This function will attach a renderer, necessary to use for plotting capabilities.
- Parameters:
graph_layout (
dict
) – Here for backward compatibility. Currently not used.
If you want to set a specific layout call
BaseEnv.attach_layout()
If
None
this class will use the default substations layout provided when the environment was created. Otherwise it will use the data provided.
Examples
Here is how to use the function
import grid2op

# create the environment
env = grid2op.make("l2rpn_case14_sandbox")

if False:
    # if you want to change the default layout of the powergrid
    # assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
    layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
    env.attach_layout(layout)
    # NB again, this code will make everything look super ugly !!!! Don't change the
    # default layout unless you have a reason to.

# and if you want to use the renderer
env.attach_renderer()

# and now you can "render" (plot) the state of the grid
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    env.render()
    action = agent.act(obs, reward, done)  # "agent" is any grid2op agent defined beforehand
    obs, reward, done, info = env.step(action)
- copy() Environment [source]
Performs a deep copy of the environment
Unless you have a reason to, it is not advised to make copy of an Environment.
Examples
It should be used as follow:
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
cpy_of_env = env.copy()
- generate_data(nb_year=1, nb_core=1, seed=None, **kwargs)[source]
This function uses the chronix2grid package to generate more data that will then be available locally. You need to install it independently (see https://github.com/BDonnot/ChroniX2Grid#installation for more information)
It also requires the lightsim2grid simulator.
This is only available for some environments (only the environments released after 2022).
Generating data takes some time (around 1 - 2 minutes to generate a weekly scenario) and this is why we recommend doing it “offline” and then using the generated data for training or evaluation.
Warning
You should not start this function twice. Before starting a new run, make sure the previous one has terminated (otherwise you might erase some previously generated scenario)
Examples
The recommended process when you want to use this function is to first generate some more data:
import grid2op env = grid2op.make("l2rpn_wcci_2022") env.generate_data(nb_year=XXX) # replace XXX by the amount of data you want. If you put 1 you will have 52 different # scenarios
Then, later on, you can use it as you please, transparently:
import grid2op env = grid2op.make("l2rpn_wcci_2022") obs = env.reset() # obs might come from the data you have generated
- Parameters:
nb_year (int, optional) – the number of “year” you want to generate. Each “year” is made of 52 weeks meaning that if you ask to generate one year, you have 52 more scenarios, by default 1
nb_core (int, optional) – number of computer cores to use, by default 1.
seed (int, optional) – If the same seed is given, then the same data will be generated.
**kwargs – key word arguments passed to add_data function of chronix2grid.grid2op_utils module
- get_kwargs(with_backend=True, with_chronics_handler=True, with_backend_kwargs=False)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- max_episode_duration()[source]
Return the maximum duration (in number of steps) of the current episode.
Notes
For possibly infinite episode, the duration is returned as np.iinfo(np.int32).max which corresponds to the maximum 32 bit integer (usually 2147483647)
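For illustration:
import numpy as np
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

n_steps = env.max_episode_duration()
if n_steps == np.iinfo(np.int32).max:
    print("this episode can (in theory) last forever")
else:
    print(f"the current episode lasts at most {n_steps} steps")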
- render(mode='rgb_array')[source]
Render the state of the environment on the screen, using matplotlib Also returns the Matplotlib figure
Examples
Rendering need first to define a “renderer” which can be done with the following code:
import grid2op # create the environment env = grid2op.make("l2rpn_case14_sandbox") # if you want to use the renderer env.attach_renderer() # and now you can "render" (plot) the state of the grid obs = env.reset() done = False reward = env.reward_range[0] while not done: env.render() # this piece of code plot the grid action = agent.act(obs, reward, done) obs, reward, done, info = env.step(action)
- reset(*, seed: int | None = None, options: Dict[Literal['time serie id'], int] | Dict[Literal['init state'], Dict[Literal['set_line_status', 'change_line_status', 'set_bus', 'change_bus', 'redispatch', 'set_storage', 'curtail', 'raise_alarm', 'raise_alert', 'injection', 'hazards', 'maintenance', 'shunt'], Any]] | Dict[Literal['init ts'], int] | Dict[Literal['max step'], int] | None = None) BaseObservation [source]
Reset the environment to a clean state. It will reload the next chronics if any. And reset the grid to a clean state.
This triggers a full reloading of both the chronics (if they are stored as files) and of the powergrid, to ensure the episode is fully over.
This method should be called only at the end of an episode.
- Parameters:
seed (int) – The seed to use (new in version 1.9.8), see examples for more details. Ignored if not set (meaning no seeds will be used, experiments might not be reproducible)
options (dict) –
Some options to “customize” the reset call. For example specifying the “time serie id” (grid2op >= 1.9.8) to use or the “initial state of the grid” (grid2op >= 1.10.2) or to start the episode at some specific time in the time series (grid2op >= 1.10.3) with the “init ts” key.
See examples for more information about this. Ignored if not set.
Examples
The standard “gym loop” can be done with the following code:
import grid2op

# create the environment
env_name = "l2rpn_case14_sandbox"
env = grid2op.make(env_name)

# start a new episode
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    action = agent.act(obs, reward, done)  # "agent" is any grid2op agent defined beforehand
    obs, reward, done, info = env.step(action)
New in version 1.9.8: It is now possible to set the seed and the time series you want to use at the new episode by calling env.reset(seed=…, options={“time serie id”: …})
Before version 1.9.8, if you wanted to use a fixed seed, you would need to (see doc of
grid2op.Environment.BaseEnv.seed()
seed = ...
env.seed(seed)
obs = env.reset()
...
Starting from version 1.9.8 you can do this in one call:
seed = ...
obs = env.reset(seed=seed)
For the “time series id” it is the same concept. Before you would need to do (see doc of
Environment.set_id()
for more information ):time_serie_id = ... env.set_id(time_serie_id) obs = env.reset() ...
And now (from version 1.9.8) you can more simply do:
time_serie_id = ...
obs = env.reset(options={"time serie id": time_serie_id})
...
New in version 1.10.2.
Another feature has been added in version 1.10.2, which is the possibility to set the grid to a given “topological” state at the first observation (before this version, you could only retrieve an observation with everything connected together).
In grid2op 1.10.2, you can do that by using the key “init state” in the “options” kwargs of the reset function. The value associated to this key should be a dictionary that can be converted to a non ambiguous grid2op action using an “action space”.
Note
The “action space” used here is not the action space of the agent. It’s an “action space” that uses a
grid2op.Action.Action.BaseAction()
class, meaning you can do any type of action, on shunts, on topology, on line status etc. even if the agent is not allowed to. Likewise, nothing checks whether this action is legal or not.
You can use it like this:
# to start an episode with a line disconnected, you can do:
init_state_dict = {"set_line_status": [(0, -1)]}
obs = env.reset(options={"init state": init_state_dict})
obs.line_status[0] is False

# to start an episode with a different topology
init_state_dict = {"set_bus": {"lines_or_id": [(0, 2)], "lines_ex_id": [(3, 2)]}}
obs = env.reset(options={"init state": init_state_dict})
Note
Since grid2op version 1.10.2, there is also the possibility to set the “initial state” of the grid directly in the time series. The priority is always given to the argument passed in the “options” value.
Concretely, if the “time series” (formerly called “chronics”) provides an action that would change the topology of substations 1 and 2 (for example) and you provide an action that disconnects line 6, then the initial state will see substations 1 and 2 changed (as in the time series) and line 6 disconnected.
Another example in this case: if the action you provide would change the topology of substations 2 and 4, then the initial state (after env.reset) will give:
substation 1 as in the time serie
substation 2 as in “options”
substation 4 as in “options”
Note
Concerning the previously described behaviour, if you want to ignore the data in the time series, you can add : “method”: “ignore” in the dictionary describing the action. In this case the action in the time series will be totally ignored and the initial state will be fully set by the action passed in the “options” dict.
An example is:
init_state_dict = {"set_line_status": [(0, -1)], "method": "force"} obs = env.reset(options={"init state": init_state_dict}) obs.line_status[0] is False
New in version 1.10.3.
Another feature has been added in version 1.10.3: the possibility to skip some steps of the time series and start at a given step.
The time series often start on a given day of the week (e.g. Monday) and at a given time (e.g. midnight). But for some reason you might notice that your agent performs poorly on other days of the week or times of the day. This might be because it has seen much more data from Monday at midnight than from any other day and hour of the day.
To alleviate this issue, you can now easily reset an episode and ask grid2op to start this episode after xxx steps have “passed”.
Concretely, you can do it with:
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset(options={"init ts": 1})
Doing that your agent will start its episode not at midnight (which is the case for this environment), but at 00:05
If you do:
obs = env.reset(options={"init ts": 12})
In this case, you start the episode at 01:00 and not at midnight (you start at what would have been the 12th steps)
If you want to start the “next day”, you can do:
obs = env.reset(options={"init ts": 288})
etc.
Note
On this feature, if a powerline is on soft overflow (meaning its flow is above the limit but below the
grid2op.Parameters.Parameters.HARD_OVERFLOW_THRESHOLD
* the limit) then it is still connected (of course) and the counter grid2op.Observation.BaseObservation.timestep_overflow
is at 0. If a powerline is on “hard overflow” (meaning its flow would be above
grid2op.Parameters.Parameters.HARD_OVERFLOW_THRESHOLD
* the limit), then, as it is the case for a “normal” (without options) reset, this line is disconnected, but can be reconnected directly (grid2op.Observation.BaseObservation.time_before_cooldown_line
== 0)
See also
The function
Environment.fast_forward_chronics()
for an alternative usage (that will be deprecated at some point)
Yet another feature has been added in grid2op version 1.10.3 in this env.reset function. It is the capacity to limit the duration of an episode.
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset(options={"max step": 288})
This will limit the duration to 288 steps (1 day), meaning your agent will have successfully managed the entire episode if it manages to keep the grid in a safe state for a whole day (depending on the environment you are using the default duration is either one week - roughly 2016 steps or 4 weeks)
Note
This option only affects the current episode. It will have no impact on the next episode (after reset).
For example:
obs = env.reset()
obs.max_step == 8064  # default for this environment

obs = env.reset(options={"max step": 288})
obs.max_step == 288  # specified by the option

obs = env.reset()
obs.max_step == 8064  # retrieve the default behaviour
See also
The function
Environment.set_max_iter()
for an alternative usage, with the difference that set_max_iter is permanent: it impacts all future episodes and not only the next one.
- reset_grid(init_act_opt: BaseAction | None = None, method: Literal['combine', 'ignore'] = 'combine')[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\
This is automatically called when using env.reset
Reset the backend to a clean state by reloading the powergrid from the hard drive. This might take some time.
If the thermal limits have been modified, it also sets them in the new backend.
- set_chunk_size(new_chunk_size)[source]
For an efficient data pipeline, it can be useful not to read all parts of the input data (for example for load_p, prod_p, load_q, prod_v). Grid2Op supports reading large chronics by “chunks” of a given size.
Reading data in chunks can also reduce the memory footprint, which is useful in the case of a multiprocessing environment with large chronics.
It is critical to set a small chunk_size when training a machine learning algorithm (reinforcement learning agent): at the beginning, when the agent performs poorly, the software might otherwise spend most of its time loading the data.
NB this has no effect if the chronics does not support this feature.
NB The environment need to be reset for this to take effect (it won’t affect the chronics already loaded)
- Parameters:
new_chunk_size (
int
orNone
) – The new chunk size (positive integer)
Examples
Here is an example on how to use this function
import grid2op

# I create an environment
env = grid2op.make("l2rpn_case14_sandbox", test=True)
env.set_chunk_size(100)
env.reset()  # otherwise chunk size has no effect !

# and now data will be read from the hard drive 100 time steps per 100 time steps
# instead of the whole episode at once.
- set_id(id_: int | str) None [source]
Set the id that will be used at the next call to
Environment.reset()
.NB this has no effect if the chronics does not support this feature.
NB The environment need to be reset for this to take effect.
Changed in version 1.6.4: id_ can now be a string instead of an integer. You can call something like env.set_id(“0000”) or env.set_id(“Scenario_april_000”) or env.set_id(“2050-01-03_0”) (depending on your environment) to use the right time series.
See also
function
Environment.reset()
for extra informationChanged in version 1.9.8: Starting from version 1.9.8 you can directly set the time serie id when calling reset.
Warning
If the “time serie generator” you use is not standard (e.g. it is random in some sense) and if you want fully reproducible results, you should first call env.set_id(…) and then call env.seed(…) (and of course env.reset())
Calling env.seed(…) and then env.set_id(…) might not behave the way you want.
In this case, it is much better to use the function reset(seed=…, options={“time serie id”: …}) directly.
- Parameters:
id (
int
) – the id of the chronics used.
Examples
Here an example that will loop 10 times through the same chronics (always using the same injection then):
import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("l2rpn_case14_sandbox")  # create an environment
agent = DoNothingAgent(env.action_space)  # create a BaseAgent

for i in range(10):
    env.set_id(0)  # tell the environment you simply want to use the chronics with ID 0
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
And here you have an example on how you can loop through the scenarios in a given order:
import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("l2rpn_case14_sandbox")  # create an environment
agent = DoNothingAgent(env.action_space)  # create a BaseAgent
scenario_order = [1, 2, 3, 4, 5, 10, 8, 6, 5, 7, 78, 8]

for id_ in scenario_order:
    env.set_id(id_)  # tell the environment you want to use the chronics with this ID
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
- set_max_iter(max_iter)[source]
Set the maximum duration of an episode for all the next episodes.
See also
The option max step when calling the
Environment.reset()
function used like obs = env.reset(options={“max step”: 288}) (see examples of env.reset for more information)
Note
The real maximum duration of an episode depends on this parameter but also on the size of the time series used. For example, if you use an environment with time series lasting 8064 steps and you call env.set_max_iter(9000), the maximum number of iterations will still be 8064.
Warning
It only has an impact on future episodes. Said differently, it only has an impact AFTER env.reset has been called.
Danger
The usage of both
BaseEnv.fast_forward_chronics()
andEnvironment.set_max_iter()
is not recommended at all and might not behave correctly. Please use env.reset with obs = env.reset(options={“max step”: xxx, “init ts”: yyy}) for a correct behaviour.- Parameters:
max_iter (
int
) – The maximum number of iterations you can do before reaching the end of the episode. Set it to “-1” for possibly infinite episode duration.
Examples
It can be used like this:
import grid2op env_name = "l2rpn_case14_sandbox" env = grid2op.make(env_name) obs = env.reset() obs.max_step == 8064 # default for this environment env.set_max_iter(288) # no impact here obs = env.reset() obs.max_step == 288 # the limitation still applies to the next episode obs = env.reset() obs.max_step == 288
If you want to “unset” your limitation, you can do:
env.set_max_iter(-1)
obs = env.reset()
obs.max_step == 8064
Finally, you cannot limit it to something larger than the duration of the time series of the environment:
env.set_max_iter(9000)
obs = env.reset()
obs.max_step == 8064  # the call to env.set_max_iter has no impact here
Notes
Maximum length of the episode can depend on the chronics used. See
Environment.chronics_handler
for more information
- simulate(action)[source]
Another method to call obs.simulate to ensure compatibility between multi environment and regular one.
- Parameters:
action – A grid2op action
- Returns:
Same return type as
grid2op.Environment.BaseEnv.step()
or
Notes
Prefer using obs.simulate if possible, it will be faster than this function.
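A minimal sketch of a call (the return mirrors env.step, as stated above):
import grid2op

env = grid2op.make("l2rpn_case14_sandbox")
obs = env.reset()

do_nothing = env.action_space()  # the "do nothing" action
# same return as env.step(), but the real environment state is not affected
sim_obs, sim_reward, sim_done, sim_info = env.simulate(do_nothing)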
- train_val_split(val_scen_id, add_for_train='train', add_for_val='val', add_for_test=None, test_scen_id=None, remove_from_name=None, deep_copy=False)[source]
This function is used as
Environment.train_val_split_random()
Please refer to the help of
Environment.train_val_split_random()
for more information about this function.- Parameters:
val_scen_id (
list
) – List of the scenario names that will be placed in the validation settest_scen_id (
list
) –New in version 2.6.5.
List of the scenario names that will be placed in the test set (only used if add_for_test is not None - and mandatory in this case)
add_for_train (
str
) – SeeEnvironment.train_val_split_random()
for more informationadd_for_val (
str
) – SeeEnvironment.train_val_split_random()
for more informationadd_for_test (
str
) –New in version 2.6.5.
See
Environment.train_val_split_random()
for more informationremove_from_name (
str
) – SeeEnvironment.train_val_split_random()
for more informationdeep_copy (
bool
) –New in version 2.6.5.
See
Environment.train_val_split_random()
for more information
- Returns:
nm_train (
str
) – SeeEnvironment.train_val_split_random()
for more informationnm_val (
str
) – SeeEnvironment.train_val_split_random()
for more informationnm_test (
str
, optionnal) – .. versionadded:: 2.6.5See
Environment.train_val_split_random()
for more information
Examples
A full example on a training / validation / test split with explicit specification of which chronics goes in which scenarios is:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_train, nm_env_val, nm_env_test = env.train_val_split(test_scen_id=chron_names[-10:],  # last 10 in test set
                                                            add_for_test="test",
                                                            val_scen_id=chron_names[-20:-10],  # last 20 to last 10 in val set
                                                            )

env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")
For a more simple example, with less parametrization and with random assignment (recommended), please refer to the help of
Environment.train_val_split_random()
NB read the “Notes” of this section for possible “unexpected” behaviour of the code snippet above.
On Some windows based platform, if you don’t have an admin account nor a “developer” account (see https://docs.python.org/3/library/os.html#os.symlink) you might need to do:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_train, nm_env_val, nm_env_test = env.train_val_split(test_scen_id=chron_names[-10:],  # last 10 in test set
                                                            add_for_test="test",
                                                            val_scen_id=chron_names[-20:-10],  # last 20 to last 10 in val set
                                                            deep_copy=True)
Warning
The above code will use much more memory on your hard drive than the version using symbolic links. It will also be significantly slower !
As an “historical curiosity”, this is what you needed to do in grid2op version < 1.6.5:
import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics for the test set
nm_env_trainval, nm_env_test = env.train_val_split(val_scen_id=chron_names[-10:],
                                                   add_for_val="test",
                                                   add_for_train="trainval")

# now splitting again the training set into training and validation, keeping the last 10 chronics
# of this environment for validation
env_trainval = grid2op.make(nm_env_trainval)  # create the "trainval" environment
full_path_data = env_trainval.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]
nm_env_train, nm_env_val = env_trainval.train_val_split(val_scen_id=chron_names[-10:],
                                                        remove_from_name="_trainval$")

# and now you can use the following code to load the environments:
env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")
Notes
We don’t recommend you to use this function. It provides a great level of control on which scenarios goes into which dataset, which is nice, but “with great power comes great responsibilities”.
Keep in mind that scenarios might be “sorted” by having some “month” in their names. For example, the first k scenarios might be called “April_XXX” and the last k ones having names with “September_XXX”.
In general, we would not consider good practice to have all validation (or test) scenarios coming from the same months. Keep that in mind if you use the code snippet above.
- train_val_split_random(pct_val=10.0, add_for_train='train', add_for_val='val', add_for_test=None, pct_test=None, remove_from_name=None, deep_copy=False)[source]
By default a grid2op environment contains multiple “scenarios” containing values for all the producers and consumers representing multiple days. In a “game like” environment, you can think of the scenarios as being different “game levels”: different mazes in pacman, different levels in mario etc.
We recommend to train your agent on some of these “chronics” (aka levels) and test the performance of your agent on some others, to avoid overfitting.
This function allows to easily split an environment into different parts. This is most commonly used in machine learning where part of a dataset is used for training and another part is used for assessing the performance of the trained model.
This function relies on “symbolic links” and will not duplicate data.
New created environments will behave like regular grid2op environment and will be accessible with “make” just like any others (see the examples section for more information).
This function will make the split at random. If you want more control over which scenarios to use for training and which for validation, use the
Environment.train_val_split()
that allows to specify which scenarios goes in the validation environment (and the others go in the training environment).- Parameters:
pct_val (
float
) – Percentage of chronics that will go to the validation set. For 10% of the chronics, set it to 10. and NOT to 0.1.add_for_train (
str
) – Suffix that will be added to the name of the environment for the training set. We don’t recommend to modify the default value (“train”)add_for_val (
str
) – Suffix that will be added to the name of the environment for the validation set. We don’t recommend to modify the default value (“val”)add_for_test (
str
, (optional)) –New in version 2.6.5.
Suffix that will be added to the name of the environment for the test set. By default, it only splits into training and validation, so this is ignored. We recommend to assign it to “test” if you want to split into training / validation and test. If it is set, then the pct_test must also be set.
pct_test (
float
, (optional)) –New in version 2.6.5.
Percentage of chronics that will go to the test set. For 10% of the chronics, set it to 10. and NOT to 0.1. (If you set it, you need to set the add_for_test argument.)
remove_from_name (
str
) – If you “split” an environment multiple times, this allows you to keep “short” names (for example you will be able to call grid2op.make(env_name+”_train”) instead of grid2op.make(env_name+”_train_train”))deep_copy (
bool
) –New in version 2.6.5.
A flag specifying whether to “copy” the elements of the original environment to the created ones. By default (deep_copy=False) it will save as much memory as possible using symbolic links rather than performing copies.
Note
If set to
True
the new environment will take much more space on the hard drive, and the execution of this function will be much slower !Warning
On windows based systems, you will most likely run into issues if you don’t set this parameter. Indeed, Windows does not always support symbolic links (https://docs.python.org/3/library/os.html#os.symlink). In this case, you can use
deep_copy=True
and it will work fine (examples in the function Environment.train_val_split()
)
- Returns:
nm_train (
str
) – Complete name of the “training” environmentnm_val (
str
) – Complete name of the “validation” environmentnm_test (
str
, optionnal) – .. versionadded:: 2.6.5Complete name of the “test” environment. It is only returned if add_for_test and pct_test are not None.
Examples
This function can be used like:
import grid2op env_name = "l2rpn_case14_sandbox" # or any other... env = grid2op.make(env_name) # extract 1% of the "chronics" to be used in the validation environment. The other 99% will # be used for test nm_env_train, nm_env_val = env.train_val_split_random(pct_val=1.) # and now you can use the training set only to train your agent: print(f"The name of the training environment is \"{nm_env_train}\"") print(f"The name of the validation environment is \"{nm_env_val}\"") env_train = grid2op.make(nm_env_train)
And even after you close the python session, you can still use this environment for training. If you used the exact code above that will look like:
import grid2op

env_name_train = "l2rpn_case14_sandbox_train"  # depending on the option you passed above
env_train = grid2op.make(env_name_train)
New in version 2.6.5: Possibility to create a training, validation AND test set.
If you have grid2op version >= 1.6.5, you can also use the following:
import grid2op env_name = "l2rpn_case14_sandbox" # or any other... env = grid2op.make(env_name) # extract 1% of the "chronics" to be used in the validation environment. The other 99% will # be used for test nm_env_train, nm_env_val, nm_env_test = env.train_val_split_random(pct_val=1., pct_test=1.) # and now you can use the training set only to train your agent: print(f"The name of the training environment is \"{nm_env_train}\"") print(f"The name of the validation environment is \"{nm_env_val}\"") print(f"The name of the test environment is \"{nm_env_test}\"") env_train = grid2op.make(nm_env_train)
Warning
In this case this function returns 3 elements and not 2 !
Notes
This function will fail if an environment already exists with one of the name that would be given to the training environment or the validation environment (or test environment).
- class grid2op.Environment.MaskedEnvironment(grid2op_env: Environment | dict, lines_of_interest)[source]
This class is the grid2op implementation of a “masked” environment: lines not in the lines_of_interest mask will NOT be deactivated by the environment if the flow is too high (or moderately high for too long).
Warning
This class might not behave normally if used with TimeOutEnvironment, MultiEnv, MultiMixEnv etc.
Warning
At time of writing, the behaviour of “obs.simulate” is not modified
Examples
We recommend you build such an environment with:
import numpy as np
import grid2op
from grid2op.Environment import MaskedEnvironment

env_name = "l2rpn_case14_sandbox"
lines_of_interest = np.array([True, True, True, True, True, True,
                              False, False, False, False, False, False,
                              False, False, False, False, False, False,
                              False, False])
env = MaskedEnvironment(grid2op.make(env_name),
                        lines_of_interest=lines_of_interest)
In particular, make sure to use grid2op.make(…) when creating the MaskedEnvironment and not to use another environment.
Methods:
get_kwargs
([with_backend, with_chronics_handler])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.- get_kwargs(with_backend=True, with_chronics_handler=True)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- class grid2op.Environment.MultiEnvMultiProcess(envs, nb_envs, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It is a kind of
BaseMultiProcessEnvironment
. For more information you can consult the documentation of this parent class. This class allows to interact at the same time with different copies of possibly different environments in parallel- envs
A list of environments for which the evaluation will be made in parallel.
- Type:
list:grid2op.Environment.Environment
- nb_envs
Number of parallel copies of each underlying environment that will be handled. MUST be the same length as the parameter envs. The total number of subprocesses will be the sum of this list.
- Type:
list:int
Examples
This class can be used as:
import grid2op
from grid2op.Environment import MultiEnvMultiProcess

env0 = grid2op.make("l2rpn_case14_sandbox")  # create an environment
env1 = grid2op.make("l2rpn_case14_sandbox")  # create a second environment, that can be similar, or not
# it is recommended to filter or create the environments with different parameters, otherwise this class
# is of little interest

envs = [env0, env1]  # list of all environments created
nb_envs = [1, 7]  # number of "copies" of each environment that will be made.
# in this case the first one will be copied only once, and the second one 7 times.
# the total number of environments used in the multi env will be the sum(nb_envs), here 8.

multi_env = MultiEnvMultiProcess(envs=envs, nb_envs=nb_envs)

# and now you can use it like any other grid2op environment (almost)
observations = multi_env.reset()
- class grid2op.Environment.MultiMixEnvironment(envs_dir, logger=None, experimental_read_from_local_dir=None, n_busbar=2, _add_to_name='', _compat_glop_version=None, _test=False, **kwargs)[source]
This class represents a single powergrid configuration, backed by multiple environment parameters and chronics.
It implements most of the
BaseEnv
public interface: so it can be used as a more classic environment.MultiMixEnvironment environments behave like a superset of the environment: they are made of sub environments (called mixes) that are grid2op regular
Environment
. You might think the MultiMixEnvironment as a dictionary ofEnvironment
that implements some of theBaseEnv
interface such asBaseEnv.step()
orBaseEnv.reset()
.By default, each time you call the “step” function a different mix is used. Mixes, by default are looped through always in the same order. You can see the Examples section for information about control of these
Examples
In this section we present some common use of the MultiMix environment.
Basic Usage
You can think of a MultiMixEnvironment as any
Environment
. So this is a perfectly valid way to use a MultiMix:
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# and now you can do the open ai gym loop
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = multimix_env.reset()  # each time "reset" is called, another mix is used.
    reward = multimix_env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = multimix_env.step(act)
Use each mix one after the other
In case you want to study each mix independently, you can iterate through the MultiMix in a pythonic way. This makes it easy to perform, for example, 10 episode for a given mix before passing to the next one.
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
agent = RandomAgent(multimix_env.action_space)

NB_EPISODE = 10
for mix in multimix_env:
    # mix is a regular environment, you can do whatever you want with it
    # for example
    for i in range(NB_EPISODE):
        obs = mix.reset()
        reward = mix.reward_range[0]
        done = False
        while not done:
            act = agent.act(obs, reward, done)
            obs, reward, done, info = mix.step(act)
Selecting a given Mix
Sometimes it might be interesting to study only a given mix. For that you can use the [] operator to select only a given mix (which is a grid2op environment) and use it as you would.
This can be done with:
import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# list all available mixes:
mixes_names = list(multimix_env.keys())

# and now suppose we want to study only the first one
mix = multimix_env[mixes_names[0]]

# and now you can do the open ai gym loop, or anything you want with it
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = mix.reset()
    reward = mix.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = mix.step(act)
Using the Runner
For MultiMixEnvironment using the
grid2op.Runner.Runner
cannot be done in a straightforward manner. Here we give an example on how to do it.
import os
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with the grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# you can use the runner as following
PATH = "PATH/WHERE/YOU/WANT/TO/SAVE/THE/RESULTS"
for mix in multimix_env:
    runner = Runner(**mix.get_params_for_runner(), agentClass=RandomAgent)
    runner.run(nb_episode=1, path_save=os.path.join(PATH, mix.name))
Methods:
attach_layout
(grid_layout)INTERNAL
Get the path that allows to create this environment.
seed
([seed])Set the seed of this
Environment
for a better control and to ease reproducible experiments.set_thermal_limit
(thermal_limit)Set the thermal limit effectively.
- attach_layout(grid_layout)[source]
INTERNAL
Warning
/!\ Internal, do not use unless you know what you are doing /!\ We do not recommend to “attach layout” outside of the environment. Please refer to the function
grid2op.Environment.BaseEnv.attach_layout()
for more information. grid_layout is a dictionary whose keys are the names of the substations and whose values are the tuples of coordinates of each substation. No check is made to ensure it is correct.
- Parameters:
grid_layout (
dict
) – See definition ofGridObjects.grid_layout
for more information.
- get_path_env()[source]
Get the path that allows to create this environment.
It can be used for example in grid2op.utils.underlying_statistics to save the information directly inside the environment data.
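For illustration, a short sketch:
import grid2op

multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
path_to_env_data = multimix_env.get_path_env()
print(path_to_env_data)  # folder from which this environment was created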
- seed(seed=None)[source]
Set the seed of this
Environment
for a better control and to ease reproducible experiments.- Parameters:
seed (
int
) – The seed to set.- Returns:
seeds – The seed used to set the prng (pseudo random number generator) for all environments, and each environment
tuple
seeds- Return type:
list
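A short hedged sketch (the seed value is arbitrary):
import grid2op

multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)
seeds = multimix_env.seed(42)  # one entry per underlying mix
obs = multimix_env.reset()     # experiments are reproducible from now on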
- class grid2op.Environment.SingleEnvMultiProcess(env, nb_env, obs_as_class=True, return_info=True, logger=None)[source]
This class allows to evaluate a single agent instance on multiple environments running in parallel.
It is a kind of
BaseMultiProcessEnvironment
. For more information you can consult the documentation of this parent class. It allows to interact at the same time with different copies of the (same) environment in parallel- env
A list of environments for which the evaluation will be made in parallel.
- Type:
list::grid2op.Environment.Environment
- nb_env
Number of parallel underlying environments that will be handled. It is also the size of the list of actions that needs to be provided in
MultiEnvironment.step()
and the size of the lists returned by this same function.- Type:
int
Examples
An example on how you can best leverage this class is given in the getting_started notebooks. Another simple example is:
from grid2op.Agent import DoNothingAgent
from grid2op.MakeEnv import make
from grid2op.Environment import SingleEnvMultiProcess

# create a simple environment
env = make("l2rpn_case14_sandbox")
# number of parallel environments
nb_env = 2  # change that to adapt to your system
NB_STEP = 100  # number of steps for each environment

# create a simple agent
agent = DoNothingAgent(env.action_space)

# create the multi environment class
multi_envs = SingleEnvMultiProcess(env=env, nb_env=nb_env)

# making it usable
obs = multi_envs.reset()
rews = [env.reward_range[0] for i in range(nb_env)]
dones = [False for i in range(nb_env)]

# performs the appropriate steps
for i in range(NB_STEP):
    acts = [None for _ in range(nb_env)]
    for env_act_id in range(nb_env):
        acts[env_act_id] = agent.act(obs[env_act_id], rews[env_act_id], dones[env_act_id])
    obs, rews, dones, infos = multi_envs.step(acts)

    # DO SOMETHING WITH THE AGENT IF YOU WANT

# close the environments
multi_envs.close()

# close the initial environment
env.close()
- class grid2op.Environment.TimedOutEnvironment(grid2op_env: Environment | dict, time_out_ms: int = 1000.0)[source]
This class is the grid2op implementation of a “timed out environment” entity in the RL framework.
This class is very similar to the standard environment. They only differ in the behaviour of the step function.
For more information, see the documentation of
TimedOutEnvironment.step()
Warning
This class might not behave normally if used with MaskedEnvironment, MultiEnv, MultiMixEnv etc.
- name
The name of the environment
- Type:
str
- time_out_ms
maximum duration before performing a do_nothing action and updating to the next time_step.
- Type:
int
- action_space
Another name for
Environment.helper_action_player
for gym compatibility.
- observation_space
Another name for
Environment.helper_observation
for gym compatibility.
- reward_range
The range of the reward function
- Type:
(float, float)
- metadata
For gym compatibility, do not use
- Type:
dict
- spec
For Gym compatibility, do not use
- Type:
None
- _viewer
Used to display the powergrid. Currently properly supported.
- Type:
object
Methods:
get_kwargs
([with_backend, with_chronics_handler])This function allows to make another Environment with the same parameters as the one that have been used to make this one.
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.reset
(*[, seed, options])Reset the environment.
step
(action)This function allows to pass to the next step for the action.
- get_kwargs(with_backend=True, with_chronics_handler=True)[source]
This function allows to make another Environment with the same parameters as the ones that have been used to make this one.
This is useful especially in cases where an Environment is not picklable (for example if some non picklable c++ code is used) but you still want to make parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process make a copy of
self
NB This function should not be used to make a copy of an environment. Prefer using
Environment.copy()
for such purpose.- Returns:
res – A dictionary that helps build an environment like
self
(which is NOT a copy of self) but rather an instance of an environment with the same properties.- Return type:
dict
Examples
It should be used as follow:
import grid2op
from grid2op.Environment import Environment

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())

# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. For example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
- get_params_for_runner()[source]
This method is used to initialize a proper
grid2op.Runner.Runner
to use this specific environment.Examples
It should be used as followed:
import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example

env = grid2op.make("l2rpn_case14_sandbox")  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
- reset(*, seed: int | None = None, options: Dict[str | Literal['time serie id'], int | str] | None = None) BaseObservation [source]
Reset the environment.
See also
The doc of
Environment.reset()
for more information- Returns:
The first observation of the new episode.
- Return type:
- step(action: BaseAction) Tuple[BaseObservation, float, bool, dict] [source]
This function allows to pass to the next step for the action.
Provided the action the agent wants to do, it will perform the action on the grid and return the typical “observation, reward, done, info” tuple.
Compared to
BaseEnvironment.step()
this function will emulate the “time that passes” supposing that the duration between each step should be time_out_ms. Indeed, in reality, there are only 5 minutes to take an action between two grid states separated by 5 minutes.
More precisely:
If your agent takes less than time_out_ms to chose its action then this function behaves normally.
If your agent takes between time_out_ms and 2 x time_out_ms to provide an action then a “do nothing” action is performed and then the provided action is performed.
If your agent takes between 2 x time_out_ms and 3 x time_out_ms to provide an action, then 2 “do nothing” actions are performed before your action.
Note
It is possible that the environment “fails” before the action of the agent is implemented on the grid.
- Parameters:
action (grid2op.Action.BaseAction) – The action the agent wish to perform.
- Returns:
_description_
- Return type:
Tuple[BaseObservation, float, bool, dict]
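For illustration, a hedged sketch of the behaviour described above (the 200 ms budget and the sleep duration are arbitrary; the exact number of “do nothing” actions inserted depends on timing):
import time
import grid2op
from grid2op.Environment import TimedOutEnvironment
from grid2op.Agent import DoNothingAgent

env = TimedOutEnvironment(grid2op.make("l2rpn_case14_sandbox"), time_out_ms=200)
agent = DoNothingAgent(env.action_space)

obs = env.reset()
reward, done = env.reward_range[0], False

action = agent.act(obs, reward, done)
time.sleep(0.5)  # pretend the agent needed 500 ms to decide
# roughly two "do nothing" actions are played before `action` is applied here
obs, reward, done, info = env.step(action)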
If you still can’t find what you’re looking for, try in one of the following pages:
Still trouble finding the information ? Do not hesitate to send a github issue about the documentation at this link: Documentation issue template