Environment

This page is organized as follows:

Objectives

This module defines the Environment, the higher-level representation of the world with which a grid2op.Agent.BaseAgent will interact.

The environment receives a grid2op.Action.BaseAction from the grid2op.Agent.BaseAgent in Environment.step() and returns a grid2op.Observation.BaseObservation that the grid2op.Agent.BaseAgent will use to choose its next action.

An environment is best used inside a grid2op.Runner.Runner, mainly because runners abstract the interaction between environment and agent, and ensure the environment is properly reset after each episode.

Usage

In this section we present some ways to use the Environment class.

Basic Usage

This example is adapted from the gym documentation (available at gym random_agent.py):

import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ob = env.reset()
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))

What happens here is the following:

  • ob = env.reset() will reset the environment to be usable again. By default, it will load the next “chronics” (you can imagine chronics as the level design of a video game: it tells where the enemies are located, where the walls and the ground are, etc. Each chronics can be thought of as a different “game level”).

  • action = agent.act(ob, reward, done) will choose an action given the observation ob. This action should be of type grid2op.Action.BaseAction (or one of its derived classes). In a video game, that would be you receiving an observation (usually displayed on the screen) and acting on a controller. For example you could choose to go “left” / “right” / “up” or “down”. Of course, in the case of the powergrid the actions are more complicated than that.

  • ob, reward, done, info = env.step(action) is the call to go to the next step. You can imagine it as being the next “frame”. To continue the parallel with video games, at the previous line you asked “pacman” to go left (for example) and then the next frame is displayed (here returned as a new observation ob).
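The three calls described above can be wrapped in a small helper. The sketch below is only an illustration: `run_episode` and its `max_steps` argument are hypothetical names that are not part of grid2op, and the code only assumes that the environment exposes `reset()` and `step()` and that the agent exposes `act()`, as in the example above. Unlike the loop above, it resets `reward` and `done` at the start of each episode, so the agent never receives `done=True` left over from the previous episode.

```python
def run_episode(env, agent, max_steps=None):
    """Play one episode and return (cumulated reward, number of steps).

    Hypothetical helper, not part of grid2op: it only relies on the
    reset() / step() / act() calls used in the example above.
    """
    ob = env.reset()
    reward, done = 0.0, False  # fresh values for every episode
    total_reward, nb_steps = 0.0, 0
    while not done and (max_steps is None or nb_steps < max_steps):
        action = agent.act(ob, reward, done)
        ob, reward, done, info = env.step(action)
        total_reward += reward
        nb_steps += 1
    return total_reward, nb_steps
```

With the objects of the basic example, `total, nb_steps = run_episode(env, agent)` plays one chronics until it is over.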

You might want to customize this general behaviour in multiple ways:

  • you might want to study only one chronics (equivalent to only one level of a video game) see Study always the same chronics

  • you might want to loop through the chronics, but not always in the same order. If that is the case you might want to consult the section Shuffle the chronics order

  • you might also have spotted some chronics that have bad properties. In this case, you can “remove” them from the environment (they will be ignored). This is explained in Skipping some chronics

  • you might also want to select the next chronics at random. This allows some compromise between all the above solutions. Instead of ignoring some chronics you might want to select them less frequently; instead of always using the same one, you can sample it more often; and of course, because the sampling is done randomly, it is unlikely that the order will remain the same. To use that you can check the section Sampling the chronics

In a different scenario, you might also want to skip the first time steps of the chronics, which would be equivalent to starting in the “middle” of a video game. If that is the case, the subsection Skipping some time steps is made for you.

Finally, you might have noticed that each call to “env.reset” might take a while. This can dramatically increase the training time, especially at the beginning. This is due to the fact that each time env.reset is called, the whole chronics is read from the hard drive. If you want to lower this impact then you might consult the Optimize the data pipeline section.

Chronics Customization

Study always the same chronics

If you spotted a particularly interesting chronics, or if you want, for some reason, your agent to see only one chronics, you can do this rather easily with grid2op.

All chronics are given a unique persistent ID (this means that, as long as the data is not modified, the same chronics will always have the same ID each time you load the environment). The environment has a “set_id” method that allows you to use it. Just add “env.set_id(THE_ID_YOU_WANT)” before the call to “env.reset”. This gives the following code:

import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 100  # i want to make 100 episodes

###################################
THE_CHRONIC_ID = 42
###################################

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):
    ###################################
    env.set_id(THE_CHRONIC_ID)
    ###################################

    ob = env.reset()

    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

# Close the env and write monitor result info to disk
env.close()
print("The total reward was {:.2f}".format(total_reward))

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)

Shuffle the chronics order

In some other use cases, you might want to go through the whole set of chronics, and then loop again through them, but in a different order (remember that by default it will always loop in the same order 0, 1, 2, 3, …, 0, 1, 2, 3, …, 0, 1, 2, 3, …).

Again, doing so with grid2op is rather easy. To that end you can use the chronics_handler.shuffle function that will do exactly that. You can use it like this:

import numpy as np
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments
episode_count = 10000  # i want to make lots of episode

# total number of episode
total_episode = len(env.chronics_handler.subpaths)

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
for i in range(episode_count):

    ###################################
    if i % total_episode == 0:
        # I shuffle each time i need to
        env.chronics_handler.shuffle()
    ###################################

    ob = env.reset()
    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)
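To picture what this loop achieves, the pure-Python sketch below reproduces the visiting order on plain integers. It does not call grid2op: `visiting_order` is a hypothetical helper, and `random.shuffle` merely stands in for `chronics_handler.shuffle()`, whose internals may differ.

```python
import random

def visiting_order(nb_chronics, nb_episodes, seed=0):
    """Return the chronics ids visited when the order is reshuffled
    at the start of every pass over the whole set (illustration only)."""
    rng = random.Random(seed)
    order = list(range(nb_chronics))
    visited = []
    for i in range(nb_episodes):
        if i % nb_chronics == 0:
            rng.shuffle(order)  # new order for the next pass
        visited.append(order[i % nb_chronics])
    return visited

print(visiting_order(nb_chronics=4, nb_episodes=8))
```

Each group of 4 consecutive ids is a permutation of 0, 1, 2, 3: every chronics is still seen once per pass, only the order changes.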

Skipping some chronics

Some chronics might be too hard to start training on (“learn to walk before running”) and conversely some chronics might become too easy after a while (you can basically solve them without doing anything). This is why grid2op gives you some control over which chronics will be used by the environment.

For this purpose you can use the chronics_handler.set_filter function. This function takes a “filtering function” as argument. This “filtering function” takes as argument the full path of the chronics and should return True or False depending on whether or not you want to keep the chronics. Here is an example:

import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments


###################################
# this is the only line of code to add
# here i select only the chronics whose path contains "00" followed by a digit
env.chronics_handler.set_filter(lambda path: re.match(".*00[0-9].*", path) is not None)
kept = env.chronics_handler.reset()  # if you don't do that it will not have any effect
print(kept)  # i print the chronics kept
###################################

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only use the chronics selected
for i in range(episode_count):
    ob = env.reset()
    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)
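Before handing a filtering function to `set_filter`, it can help to try it on a few paths. The sketch below does so with the regular expression of the example above; the paths are made up for illustration and do not correspond to any real dataset.

```python
import re

def keep_chronics(path):
    """Same predicate as the lambda above: keep the paths that
    contain "00" followed by a digit."""
    return re.match(".*00[0-9].*", path) is not None

# hypothetical paths, for illustration only
paths = ["/data/chronics/0001", "/data/chronics/0012", "/data/chronics/0100"]
for path in paths:
    print(path, keep_chronics(path))
```

Note that this expression matches "00" followed by a digit anywhere in the path, including in a parent directory name, so checking it on real paths first avoids surprises.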

Sampling the chronics

Finally, for even more flexibility, you can choose to sample which chronics will be used next. To achieve that you can call the chronics_handler.sample_next_chronics function. This function takes a vector of probabilities as input (if not provided it assumes all probabilities are equal) and will select an id based on this probability vector.

In the following example we assume that the vector of probabilities is always the same and that we want, for some reason, to oversample the first 10 chronics and undersample the last 10:

import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

###################################
# total number of episode
total_episode = len(env.chronics_handler.subpaths)
probas = np.ones(total_episode)
# oversample the first 10 episode
probas[:10]*= 5
# undersample the last 10 episode
probas[-10:] /= 5
###################################

# and now the loop starts
# it will only used the chronics selected
for i in range(episode_count):

    ###################################
    _ = env.chronics_handler.sample_next_chronics(probas)  # this is added
    ###################################
    ob = env.reset()

    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)

NB: here we have a constant vector of probabilities, but you might imagine adapting it during training, for example to oversample the scenarios your agent has trouble solving.
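One possible way to adapt the vector is sketched below. This is only an illustration under simple assumptions: `update_weights`, `to_probabilities` and the `boost` factor are hypothetical and not part of grid2op, and a chronics counts as "failed" whenever you decide so in your own training loop (for example when the episode ends in a game over).

```python
def update_weights(weights, chronics_id, failed, boost=1.5):
    """Return a new list of weights where the chronics that has just
    been played is boosted if the agent failed on it, and damped
    otherwise. Weights do not need to sum to 1 here."""
    new_weights = list(weights)
    if failed:
        new_weights[chronics_id] *= boost
    else:
        new_weights[chronics_id] /= boost
    return new_weights

def to_probabilities(weights):
    """Normalize non-negative weights so that they sum to 1."""
    total = sum(weights)
    return [w / total for w in weights]

weights = [1.0, 1.0, 1.0, 1.0]
weights = update_weights(weights, chronics_id=2, failed=True)
print(to_probabilities(weights))
```

The resulting list can then be converted to a numpy array and used as the probability vector for the next call to `sample_next_chronics`.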

Skipping some time steps

Another way to customize which data your agent will face is to act as if the chronics started at a different date and time. This might be handy in case a scenario is hard at the beginning but less hard at the end, or if you want your agent to learn to start controlling the grid at any date and time (in grid2op most of the chronics data provided start at midnight, for example).

To achieve this goal, you can use the BaseEnv.fast_forward_chronics() function. This function skips a given number of steps. In the following example, we always skip the first 42 time steps before starting the episode:

import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only used the chronics selected
for i in range(episode_count):
    ob = env.reset()

    ###################################
    # below are the two lines added
    env.fast_forward_chronics(42)
    ob = env.get_obs()
    ###################################

    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)
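If the goal is to start from a different date at each episode, the number of skipped steps can be drawn at random. The helper below is a sketch, not a grid2op function: `reset_with_random_offset` and `max_skip` are hypothetical names, and the code only uses the `reset`, `fast_forward_chronics` and `get_obs` calls shown above. It assumes `max_skip` is smaller than the length of the chronics; otherwise the fast forward may end the episode.

```python
import random

def reset_with_random_offset(env, max_skip, rng=random):
    """Reset `env`, skip between 0 and `max_skip` steps (inclusive)
    and return (observation, number of steps skipped)."""
    ob = env.reset()
    nb_skip = rng.randint(0, max_skip)
    if nb_skip > 0:
        env.fast_forward_chronics(nb_skip)
        ob = env.get_obs()
    return ob, nb_skip
```

Passing a seeded `random.Random` instance as `rng` keeps the offsets reproducible from one run to the next.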

Optimize the data pipeline

Optimizing the data pipeline can be crucial if you want to learn fast, especially at the beginning of the training. There are multiple ways to perform this task.

First, let’s start with a summary of the timings. For this test, I ran the following code on my personal computer to compare the different methods.

import re
import time
import grid2op
from grid2op.Chronics import MultifolderWithCache


##############################
# this part changes depending on the method
env = grid2op.make("l2rpn_neurips_2020_track1_small")
env.chronics_handler.set_filter(lambda path: re.match(".*37.*", path) is not None)
kept = env.chronics_handler.reset()  # if you don't do that it will not have any effect
##############################

episode_count = 100
reward = 0
done = False
total_reward = 0

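# only the time of the following loop is measured
# (the original measurement used the "%%time" notebook magic)
start = time.perf_counter()
for i in range(episode_count):
    ob = env.reset()
    if i % 10 == 0:
        print("10 more")
    while True:
        action = env.action_space.sample()
        ob, reward, done, info = env.step(action)
        total_reward += reward
        if done:
            # in this case the episode is over
            break
print("Elapsed time: {:.1f}s".format(time.perf_counter() - start))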

Results are reported in the table below:

Method used                   Memory footprint    Time to perform (s)
----------------------------  ------------------  -------------------
Nothing (see Basic Usage)     low                 44.6
set_chunk (see Chunk size)    ultra low           26.8
MultifolderWithCache          high                11.0

As you can see, the default usage uses relatively little memory but takes a while to compute (almost 45s to perform the 100 episodes). By contrast, the Chunk size method uses even less memory and is about 40% faster. Storing all data in memory using MultifolderWithCache leads to a large memory footprint, but is also significantly faster. On this benchmark, it is 75% faster (it takes only 25% of the initial time) than the original method.
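These percentages follow directly from the timings reported in the table. As a quick check (plain arithmetic, nothing specific to grid2op; the dictionary keys are just labels chosen for this sketch):

```python
timings = {"default": 44.6, "chunk size": 26.8, "MultifolderWithCache": 11.0}

baseline = timings["default"]
for name, seconds in timings.items():
    ratio = seconds / baseline      # fraction of the default time
    saved = 100.0 * (1.0 - ratio)   # time saved, in percent
    print(f"{name}: {ratio:.2f} of the default time ({saved:.0f}% less)")
```

The ratios come out at roughly 0.60 and 0.25 of the default time, which is where the "about 40%" and "75%" figures come from.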

Chunk size

The first thing you can do, without changing anything else in the code, is to ask grid2op to read the input grid data by “chunk”. This means that, when you call “env.reset”, instead of reading all the data representing a full month, you will read only a subset of it, thus speeding up the IO time by a large amount. In the following example we read data by “chunk” of 100: the hard drive is accessed to read data 100 time steps at a time, instead of reading the full dataset at once. Note that this “technique” can also be used to reduce the memory footprint (less RAM taken).

import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
env = grid2op.make()
agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

###################################
env.chronics_handler.set_chunk_size(100)
###################################

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only used the chronics selected
for i in range(episode_count):
    ob = env.reset()

    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)
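The idea of reading by chunks can be pictured with a generator that hands out the time steps of a scenario in blocks of fixed size. This sketch is independent of grid2op and does not reflect how grid2op reads its files internally; `iter_chunks` is a hypothetical helper working on an in-memory list.

```python
def iter_chunks(time_series, chunk_size):
    """Yield successive slices of `time_series`, each with at most
    `chunk_size` elements."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer")
    for start in range(0, len(time_series), chunk_size):
        yield time_series[start:start + chunk_size]

# a fake scenario of 250 time steps read 100 steps at a time
scenario = list(range(250))
print([len(chunk) for chunk in iter_chunks(scenario, 100)])
```

The fake scenario is split into blocks of 100, 100 and 50 steps. Only the block currently in use needs to be held in memory, which is the intuition behind the lower memory footprint mentioned above.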

MultifolderWithCache

Another way is to use a dedicated class that stores the data in memory. This is particularly useful to avoid long and inefficient I/O, which is replaced by reading the complete dataset once and storing it in memory.

This can be achieved with:

import numpy as np
import re
import grid2op
from grid2op.Agent import RandomAgent
from grid2op.Chronics import MultifolderWithCache

###################################
env = grid2op.make(chronics_class=MultifolderWithCache)
# I select only part of the data, it's unlikely the whole dataset can fit into memory...
env.chronics_handler.set_filter(lambda path: re.match(".*00[0-9].*", path) is not None)
# you need to do that
kept = env.chronics_handler.real_data.reset()
###################################

agent = RandomAgent(env.action_space)
env.seed(0)  # for reproducible experiments

episode_count = 10000  # i want to make lots of episode

# i initialize some useful variables
reward = 0
done = False
total_reward = 0

# and now the loop starts
# it will only used the chronics selected
for i in range(episode_count):
    ob = env.reset()

    # now play the episode as usual
    while True:
       action = agent.act(ob, reward, done)
       ob, reward, done, info = env.step(action)
       total_reward += reward
       if done:
           # in this case the episode is over
           break

(As always, the lines added compared to the base code are highlighted: they are surrounded by ##### markers.)

Note that by default the MultifolderWithCache class will only load the first chronics it sees. You need to filter it and call env.chronics_handler.real_data.reset() for it to work properly.

Splitting into training, validation, test scenarios

In machine learning the “training / validation / test” framework is particularly useful to avoid overfitting and to develop models that perform as well as possible.

Grid2op allows for such usage at the environment level. There is the possibility to “split” an environment into training / validation and test (ie using only some chronics for training, some others for validation and some others for testing).

This can be done with:

import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# extract 1% of the "chronics" to be used in the validation environment. The other 99% will
# be used for training
nm_env_train, nm_env_val = env.train_val_split_random(pct_val=1.)

# and now you can use the training set only to train your agent:
print(f'The name of the training environment is "{nm_env_train}"')
print(f'The name of the validation environment is "{nm_env_val}"')
env_train = grid2op.make(nm_env_train)

You can then use, in the above case:

import grid2op
env_name = "l2rpn_case14_sandbox"  # matching above

env_train = grid2op.make(env_name+"_train")  # to only use the "training chronics"
# do whatever you want with env_train

And then, at time of validation:

import grid2op
env_name = "l2rpn_case14_sandbox"  # matching above

env_val = grid2op.make(env_name+"_val") # to only use the "validation chronics"
# do whatever you want with env_val

As of now, grid2op does not support, “from the API”, splitting an environment a second time with convenient names. If you want to do a “train / validation / test” split, we recommend that you:

  1. make a training / test split (see below)

  2. split again the training set into training / validation (see below)

  3. you will then have, locally on your computer, an environment named “trainval”. This directory will not weigh more than a few kilobytes.
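The two-stage split in the steps above can be pictured on a plain list of chronics names. The sketch below is not grid2op code: `three_way_split` is a hypothetical helper and the names are invented. It simply holds out the last names for test, then the last of the remaining ones for validation.

```python
def three_way_split(chron_names, nb_test, nb_val):
    """Split a list of names into (train, val, test), holding out the
    last `nb_test` names for test, then the last `nb_val` of the rest
    for validation."""
    if nb_test <= 0 or nb_val <= 0 or nb_test + nb_val >= len(chron_names):
        raise ValueError("not enough chronics for the requested split")
    trainval, test = chron_names[:-nb_test], chron_names[-nb_test:]
    train, val = trainval[:-nb_val], trainval[-nb_val:]
    return train, val, test

names = [f"{i:04d}" for i in range(30)]  # invented chronics names
train, val, test = three_way_split(names, nb_test=10, nb_val=10)
print(len(train), len(val), len(test))
```

The three lists are disjoint and together cover all the names, which is the property the intermediate “trainval” environment is there to guarantee.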

The example below is not really convenient at the moment; please file a feature request if that is a problem for you:

import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics to the test set
nm_env_trainval, nm_env_test = env.train_val_split(val_scen_id=chron_names[-10:],
                                                   add_for_val="test",
                                                   add_for_train="trainval")

# now splitting again the training set into training and validation, keeping the last 10 chronics
# of this environment for validation
env_trainval = grid2op.make(nm_env_trainval)  # create the "trainval" environment
full_path_data = env_trainval.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]
nm_env_train, nm_env_val = env_trainval.train_val_split(val_scen_id=chron_names[-10:],
                                                        remove_from_name="_trainval$")

And later on, you can do, if you followed the names above:

import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")

And you can also, if you want, delete the folder “l2rpn_case14_sandbox_trainval” from your machine:

import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env_trainval = grid2op.make(env_name+"_trainval")
print(f"You can safely delete, if you want, the folder: \n\t\"{env_trainval.get_path_env()}\" \nnow useless.")

Customization

Environments can be customized in three major ways:

  • Backend: you can change the solver that computes the state of the power grid, to make it faster or more realistic.

  • Parameters: you can change the behaviour of the Environment. For example you can prevent the powerlines from being disconnected when too much current flows on them, etc.

  • Rules: you can change the operational constraints that your agent must meet. For example you can allow it to act on more or fewer powerlines in the same action, etc.

TODO

Detailed Documentation by class

Classes:

BaseEnv(init_grid_path, parameters, ...[, ...])

INTERNAL

BaseMultiProcessEnvironment(envs[, ...])

This class allows evaluating a single agent instance on multiple environments running in parallel.

Environment(init_grid_path, ...[, name, ...])

This class is the grid2op implementation of the "Environment" entity in the RL framework.

MultiEnvMultiProcess(envs, nb_envs[, ...])

This class allows evaluating a single agent instance on multiple environments running in parallel.

MultiMixEnvironment(envs_dir[, logger, ...])

This class represents a single powergrid configuration, backed by multiple environments, parameters and chronics.

SingleEnvMultiProcess(env, nb_env[, ...])

This class allows to evaluate a single agent instance on multiple environments running in parallel.

class grid2op.Environment.BaseEnv(init_grid_path, parameters, voltagecontrolerClass, thermal_limit_a=None, epsilon_poly=0.0001, tol_poly=0.01, other_rewards={}, with_forecast=True, opponent_action_class=<class 'grid2op.Action.DontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.BaseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.NeverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, kwargs_opponent={}, has_attention_budget=False, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget={}, logger=None)

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

This class represents some useful abstraction that is reused by Environment and grid2op.Observation._Obsenv for example.

The documentation is shown here to document the common attributes of a “BaseEnvironment”.

parameters

The parameters of the game (to expose more control on what is being simulated)

Type

grid2op.Parameters.Parameters

with_forecast

Whether the chronics provide some kind of “forecast”. See BaseEnv.activate_forceast() for more information

Type

bool

logger

TO BE DONE: a way to log what is happening (currently not implemented)

time_stamp

The actual time stamp of the current observation.

Type

datetime.datetime

nb_time_step

Number of time steps played in the current environment

Type

int

current_obs

The current observation (or None if it’s not initialized)

Type

grid2op.Observation.BaseObservation

backend

The backend used to compute the powerflows and cascading failures.

Type

grid2op.Backend.Backend

done

Whether the environment is “done”. If True you need to call Environment.reset() in order to continue.

Type

bool

current_reward

The last computed reward (reward of the current step)

Type

float

other_rewards

Dictionary with key being the name (identifier) and value being some RewardHelper. At each time step, all the values will be computed by the Environment and the information about them will be returned in the “reward” key of the “info” dictionary of Environment.step().

Type

dict

chronics_handler

The object in charge of managing the “chronics”, which store the information about loads and generators for example.

Type

grid2op.Chronics.ChronicsHandler

reward_range

For OpenAI Gym compatibility. It represents the range of the rewards: (reward min, reward max)

Type

tuple

viewer

For open ai gym compatibility.

viewer_fig

For open ai gym compatibility.

_gen_activeprod_t

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Should be initialized at 0. for “step” to properly recognize it’s the first time step of the game

_no_overflow_disconnection

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Whether cascading failures are disabled (True = the powerlines above their thermal limits will not be disconnected). This is initialized based on the attribute grid2op.Parameters.Parameters.NO_OVERFLOW_DISCONNECTION.

Type

bool

_timestep_overflow

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Number of consecutive timesteps each powerline has been on overflow.

Type

numpy.ndarray, dtype: int

_nb_timestep_overflow_allowed

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Number of consecutive timesteps each powerline can be on overflow. It is usually read from grid2op.Parameters.Parameters.NB_TIMESTEP_OVERFLOW_ALLOWED.

Type

numpy.ndarray, dtype: int

_hard_overflow_threshold

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Number of timesteps before a grid2op.Agent.BaseAgent can reconnect a powerline that has been disconnected by the environment due to an overflow.

Type

float

_env_dc

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Whether the environment computes the powerflow using the DC approximation or not. It is usually read from grid2op.Parameters.Parameters.ENV_DC.

Type

bool

_names_chronics_to_backend

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Configuration file used to associate the name of the objects in the backend (both extremities of powerlines, load or production for example) with the same object in the data (Environment.chronics_handler). The idea is that, usually, data generation comes from a different software that does not take into account the powergrid infrastructure. Hence, the same “object” can have a different name. This mapping is present to avoid the need to rename the “object” when providing data. A more detailed description is available at grid2op.ChronicsHandler.GridValue.initialize().

Type

dict

_env_modification

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Representation of the actions of the environment for the modification of the powergrid.

Type

grid2op.Action.Action

_rewardClass

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Type of reward used. Should be a subclass of grid2op.BaseReward.BaseReward

Type

type

_init_grid_path

Warning

/!\ Internal, do not use unless you know what you are doing /!\

The path where the description of the powergrid is located.

Type

str

_game_rules

Warning

/!\ Internal, do not use unless you know what you are doing /!\

The rules of the game (define which actions are legal and which are not)

Type

grid2op.Rules.RulesChecker

_action_space

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Helper used to manipulate more easily the actions given to / provided by the grid2op.Agent.BaseAgent (player)

Type

grid2op.Action.ActionSpace

_helper_action_env

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Helper used to manipulate more easily the actions given to / provided by the environment to the backend.

Type

grid2op.Action.ActionSpace

_observation_space

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Helper used to generate the observation that will be given to the grid2op.BaseAgent

Type

grid2op.Observation.ObservationSpace

_reward_helper

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Helper that is called to compute the reward at each time step.

Type

grid2op.BaseReward.RewardHelper

# TODO add the units (eg MW, MWh, MW/time step,etc.) in the redispatching related attributes

Attributes:

Methods:

property action_space

This represents a view on the action space.

attach_layout(grid_layout)

Compared to the method of the base class, this one performs a check. This method must be called after initialization.

Parameters

grid_layout (dict) – The layout of the grid (i.e. the coordinates (x, y) of all substations). The keys should be the substation names, and the values a tuple (with two floats) representing the coordinates of the substation.

Examples

Here is an example on how to attach a layout for an environment:

import grid2op

# create the environment
env = grid2op.make()

# assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
env.attach_layout(layout)
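
A slightly less degenerate layout places the substations on a circle. The sketch below only builds the dictionary; `circular_layout` is a hypothetical helper and the substation names are invented (with a real environment you would pass `env.name_sub`, as in the example above).

```python
import math

def circular_layout(sub_names, radius=100.0):
    """Return {substation name: (x, y)} with the substations evenly
    spaced on a circle of the given radius centered on (0, 0)."""
    nb_sub = len(sub_names)
    layout = {}
    for i, name in enumerate(sub_names):
        angle = 2.0 * math.pi * i / nb_sub
        layout[name] = (radius * math.cos(angle), radius * math.sin(angle))
    return layout

layout = circular_layout(["sub_0", "sub_1", "sub_2", "sub_3"])
print(layout["sub_0"])
```

The returned dictionary has the shape expected for `grid_layout`: substation names as keys and a tuple of two floats as values.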

change_forecast_parameters(new_parameters)

Allows changing the parameters of a “forecast environment”.

Notes

This only affects the environment AFTER env.reset() has been called.

This only affects the “forecast env” and NOT the env itself.

Parameters

new_parameters (grid2op.Parameters.Parameters) – The new parameters you want the environment to get.

change_parameters(new_parameters)

Allows changing the parameters of an environment.

Notes

This only affects the environment AFTER env.reset() has been called.

This only affects the environment and NOT the forecast.

Parameters

new_parameters (grid2op.Parameters.Parameters) – The new parameters you want the environment to get.

Examples

You can use this function like:

import grid2op
from grid2op.Parameters import Parameters
env_name = ...

env = grid2op.make(env_name)
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> False

new_param = Parameters()
new_param.A_MEMBER = A_VALUE  # eg new_param.NO_OVERFLOW_DISCONNECTION = True
env.change_parameters(new_param)
obs = env.reset()
env.parameters.NO_OVERFLOW_DISCONNECTION  # -> True

change_reward(new_reward_func)

Change the reward function used for the environment.

Parameters

new_reward_func – Either an object of class BaseReward, or a subclass of BaseReward: the new reward function to use

Notes

This only affects the environment AFTER env.reset() has been called.

close()

Close an environment: this will attempt to free as much memory as possible. Note that after an environment is closed, you will not be able to use it anymore.

Any attempt to use a closed environment might result in non-deterministic behaviour.

deactivate_forecast()

This function deactivates obs.simulate: the forecasts will no longer be updated in the observation space.

This will most likely lead to some performance increase (~10-15% faster) if you don’t use the obs.simulate function.

Notes

If you really don’t want to use the obs.simulate functionality, you should rather disable it when the environment is created. For example, if you use the recommended make function, you can pass an argument that makes grid2op ignore the forecast when reading the chronics (using GridStateFromFile instead of GridStateFromFileWithForecast, for example). This would give something like:

import grid2op
from grid2op.Chronics import GridStateFromFile
# tell grid2op not to read the "forecast"
env = grid2op.make("rte_case14_realistic", data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the program stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR
fast_forward_chronics(nb_timestep)[source]

This method allows you to skip some time steps at the beginning of the chronics.

This is useful at the beginning of training, if you want your agent to learn on more diverse scenarios. Indeed, the data provided in the chronics usually always start at the same date and time (for example Jan 1st at 00:00). This can lead to suboptimal exploration: during this phase the agent only manages a few time steps per episode, so in general these few time steps will correspond to grid states around Jan 1st at 00:00.

Parameters

nb_timestep (int) – Number of time steps to “fast forward”

Examples

This can be used like this:

import grid2op

# create the environment
env = grid2op.make()

# skip the first 150 steps of the chronics
env.fast_forward_chronics(150)
done = env.is_done
if not done:
    obs = env.get_obs()
    # do something
else:
    # there was a "game over"
    # you need to reset the env (which will "cancel" the fast_forward)
    pass
    # do something else

Notes

This method can leave the environment in a ‘game over’ state (done=True), for example if the chronics last xxx time steps and you ask to “fast forward” more than xxx steps. This is why we advise checking the state of the environment after calling this method (see the “Examples” paragraph).
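
The check advised in these notes can be wrapped in a small helper. This is only a minimal sketch: fast_forward_or_reset is a hypothetical name (it is not part of grid2op), and it relies solely on the fast_forward_chronics, is_done, get_obs and reset members shown in this section.

```python
def fast_forward_or_reset(env, nb_timestep):
    """Skip `nb_timestep` steps, falling back to a reset on game over.

    Hypothetical helper (not part of grid2op). Returns a tuple
    (obs, skipped): `skipped` is False when the fast forward ended in a
    "game over" and the environment had to be reset instead.
    """
    env.fast_forward_chronics(nb_timestep)
    if env.is_done:
        # the fast forward went past the end of the chronics (or the grid
        # collapsed): resetting the environment cancels the fast forward
        return env.reset(), False
    return env.get_obs(), True
```

It would be called as obs, skipped = fast_forward_or_reset(env, 150); when skipped is False the episode simply starts from the beginning of the chronics.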

generate_classes(_guard=None, _is_base_env__=True)[source]

Use with extra care! If you run into an error like:

AttributeError: Can’t get attribute ‘ActionSpace_l2rpn_icaps_2021_small’ on <module ‘grid2op.Space.GridObjects’ from ‘/home/benjamin/Documents/grid2op_dev/grid2op/Space/GridObjects.py’>

You might want to call this function, and that MIGHT solve your problem.

This function will create a subdirectory in the env directory, which will be accessed when loading the classes used for the environment. The default behaviour is to build the classes on the fly.

Examples

Here is how to best leverage this functionality:

First step: generate the classes once and for all.

NB you need to redo this step each time you customize the environment. This customization includes, but is not limited to:

  • change the backend type: grid2op.make(…, backend=…)

  • change the action class: grid2op.make(…, action_class=…)

  • change observation class: grid2op.make(…, observation_class=…)

  • change the voltagecontroler_class

  • change the grid_path

  • change the opponent_action_class

  • etc.

import grid2op
env_name = ...

env = grid2op.make(env_name, ...)  # again: redo this step each time you customize "..."
# for example if you change the `action_class` or the `backend` etc.

env.generate_classes()

Then, next time you want to use the SAME environment, you can do:

import grid2op
env_name = ...  # SAME NAME AS ABOVE
env = grid2op.make(env_name,
                   experimental_read_from_local_dir=True,
                   # SAME ENV CUSTOMIZATION AS ABOVE
                   )

And it should solve the issues involving pickle (this is experimental for now, and we expect feedback on the matter).

Again, if you customize your environment (see above for more information) you’ll have to redo this step!

get_current_line_status()[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Prefer using grid2op.BaseObservation.line_status.

This method retrieves the line status.

get_obs(_update_state=True)[source]

Return the observation of the current environment, as seen by the grid2op.BaseAgent.BaseAgent.

Returns

res – The current BaseObservation given to the grid2op.BaseAgent.BaseAgent / bot / controller.

Return type

grid2op.Observation.Observation

Examples

This function can be used at any moment, even if you no longer have the latest observation at hand.

import grid2op

# I create an environment
env = grid2op.make()

obs = env.reset()

# have a big piece of code
obs2 = env.get_obs()

# obs2 and obs are identical.
get_path_env()[source]

Get the path that allows to create this environment.

It can be used for example in grid2op.utils.underlying_statistics to save the information directly inside the environment data.

get_reward_instance()[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

Returns the instance of the object that is used to compute the reward.

get_thermal_limit()[source]

Get the current thermal limit in amps registered for the environment.

Examples

It can be used like this:

import grid2op

# I create an environment
env = grid2op.make()

thermal_limits = env.get_thermal_limit()
load_alarm_data()[source]

Internal

Notes

This is called when the environment class is not yet created, so the data of the grid must be read from the backend.

“self.name_line”, for example, cannot be used here.

This function updates the backend INSTANCE. The backend class is then updated in the env._init_backend(…) function with a call to self.backend.assert_grid_correct()

property observation_space

This represents a view on the observation space.

property parameters

Return a deepcopy of the parameters used by the environment.

It is a deepcopy, so modifying it will have absolutely no effect on the environment.

If you want to change the parameters of an environment, please use either grid2op.Environment.BaseEnv.change_parameters() to change the parameters of this environment or grid2op.Environment.BaseEnv.change_forecast_parameters() to change the parameter of the environment used by simulate.
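
The behaviour described here (the property returns a copy, and changes go through change_parameters() followed by a reset) can be pictured with toy classes. ToyParameters and ToyEnv below are hypothetical stand-ins written for this illustration only: they are not grid2op classes and merely mimic the documented behaviour.

```python
import copy


class ToyParameters:
    """Toy stand-in for grid2op.Parameters.Parameters (illustration only)."""

    def __init__(self):
        self.NO_OVERFLOW_DISCONNECTION = False


class ToyEnv:
    """Toy stand-in for an environment (illustration only)."""

    def __init__(self):
        self._parameters = ToyParameters()
        self._pending = None

    @property
    def parameters(self):
        # every access returns a fresh deep copy
        return copy.deepcopy(self._parameters)

    def change_parameters(self, new_parameters):
        # stored now, applied at the next reset
        self._pending = new_parameters

    def reset(self):
        if self._pending is not None:
            self._parameters = self._pending
            self._pending = None


env = ToyEnv()
env.parameters.NO_OVERFLOW_DISCONNECTION = True  # modifies a throwaway copy
unchanged = env.parameters.NO_OVERFLOW_DISCONNECTION  # still False

new_param = ToyParameters()
new_param.NO_OVERFLOW_DISCONNECTION = True
env.change_parameters(new_param)
before_reset = env.parameters.NO_OVERFLOW_DISCONNECTION  # still False
env.reset()
after_reset = env.parameters.NO_OVERFLOW_DISCONNECTION  # now True
```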

reactivate_forecast()[source]

This function reactivates obs.simulate: the forecast will again be updated in the observation space.

This will most likely lead to some performance decrease, but you will be able to use the obs.simulate function.

Notes

You can use this function as follows:

import grid2op
from grid2op.Chronics import GridStateFromFile
# tell grid2op not to read the "forecast"
env = grid2op.make("rte_case14_realistic", data_feeding_kwargs={"gridvalueClass": GridStateFromFile})

do_nothing_action = env.action_space()

# improve speed ups to not even try to use forecast
env.deactivate_forecast()

# this is normal behavior
obs = env.reset()

# but this will make the program stop working
# obs.simulate(do_nothing_action)  # DO NOT RUN IT RAISES AN ERROR

env.reactivate_forecast()
obs, reward, done, info = env.step(do_nothing_action)

# and now forecast are available again
simobs, sim_r, sim_d, sim_info = obs.simulate(do_nothing_action)
reset()[source]

Reset the base environment (set the appropriate variables to correct initialization). It is (and must be) overloaded in other grid2op.Environment

seed(seed=None)[source]

Set the seed of this Environment for a better control and to ease reproducible experiments.

Parameters

seed (int) – The seed to set.

Returns

  • seed (tuple) – The seed used to set the prng (pseudo random number generator) for the environment

  • seed_chron (tuple) – The seed used to set the prng for the chronics_handler (if any), otherwise None

  • seed_obs (tuple) – The seed used to set the prng for the observation space (if any), otherwise None

  • seed_action_space (tuple) – The seed used to set the prng for the action space (if any), otherwise None

  • seed_env_modif (tuple) – The seed used to set the prng for the modification of the environment (if any), otherwise None

  • seed_volt_cont (tuple) – The seed used to set the prng for the voltage controller (if any), otherwise None

  • seed_opponent (tuple) – The seed used to set the prng for the opponent (if any), otherwise None

Examples

Seeding an environment should be done with:

import grid2op
env = grid2op.make()
env.seed(0)
obs = env.reset()

As long as the environment instance (variable env in the above code) is not reset, env.seed has no real effect (but it can have side effects).

For full control over the seeding mechanism, it is strongly advised to reset the environment right after it has been seeded.
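
To keep track of which seed was given to which component, the returned values can be paired with the names of the “Returns” list. This is a minimal sketch: label_seeds is a hypothetical helper (not part of grid2op), and it assumes the values come back as one sequence, in the order documented above.

```python
SEED_NAMES = (
    "seed", "seed_chron", "seed_obs", "seed_action_space",
    "seed_env_modif", "seed_volt_cont", "seed_opponent",
)


def label_seeds(seeds):
    """Pair each value returned by `env.seed(...)` with its documented name.

    Hypothetical helper: it assumes the values come back in the order of
    the "Returns" list of `seed`.
    """
    if len(seeds) != len(SEED_NAMES):
        raise ValueError(
            "expected {} values, got {}".format(len(SEED_NAMES), len(seeds))
        )
    return dict(zip(SEED_NAMES, seeds))
```

A typical call would be labelled = label_seeds(env.seed(0)), followed by obs = env.reset() so that the seeds actually take effect.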

set_thermal_limit(thermal_limit)[source]

Set the thermal limit effectively.

Parameters

thermal_limit (numpy.ndarray) –

The new thermal limit. It must be a numpy ndarray vector (or convertible to it). For each powerline it gives the new thermal limit.

Alternatively, this can be a dictionary mapping the line names (keys) to its thermal limits (values). In that case, all thermal limits for all powerlines should be specified (this is a safety measure to reduce the odds of misuse).

Examples

This function can be used like this:

import grid2op

# I create an environment
env = grid2op.make("rte_case5_example", test=True)

# i set the thermal limit of each powerline to 20000 amps
env.set_thermal_limit([20000 for _ in range(env.n_line)])

Notes

As of grid2op > 1.5.0, it is possible to set the thermal limit by using a dictionary with the keys being the name of the powerline and the values the thermal limits.
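
Since the dictionary form requires a value for every powerline, it can be convenient to start from a default and override a few lines only. The sketch below is one possible way to do so; complete_thermal_limits is a hypothetical helper (not part of grid2op) and the line names used to try it out would be placeholders.

```python
def complete_thermal_limits(line_names, default_amps, overrides=None):
    """Build a {line name: thermal limit} dict covering every powerline.

    Hypothetical helper (not part of grid2op). `overrides` maps some line
    names to a specific limit; every other line gets `default_amps`.
    """
    overrides = dict(overrides or {})
    unknown = sorted(set(overrides) - set(line_names))
    if unknown:
        # refuse silently ignored typos in the line names
        raise KeyError("unknown powerline name(s): {}".format(unknown))
    return {name: float(overrides.get(name, default_amps)) for name in line_names}
```

Assuming the powerline names are available as env.name_line, the result could then be passed directly to env.set_thermal_limit(...).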

step(action)[source]

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state. Accepts an action and returns a tuple (observation, reward, done, info).

If the grid2op.BaseAction.BaseAction is illegal or ambiguous, the step is performed, but the action is replaced with a “do nothing” action.

Parameters

action (grid2op.Action.Action) – an action provided by the agent, applied to the underlying powergrid through the backend.

Returns

  • observation (grid2op.Observation.Observation) – agent’s observation of the current environment

  • reward (float) – amount of reward returned after previous action

  • done (bool) – whether the episode has ended, in which case further step() calls will return undefined results

  • info (dict) – contains auxiliary diagnostic information (helpful for debugging, and sometimes learning). It is a dictionary with keys:

    • ”disc_lines”: a numpy array (or None) saying, for each powerline, whether it has been disconnected due to overflow. It is -1 if the powerline has not been disconnected, and a non-negative integer otherwise: 0 means the powerline is one of the causes of the cascading failure, 1 that it has been disconnected just after, 2 that it has been disconnected at the following step, etc.

    • ”is_illegal” (bool) whether the action given as input was illegal

    • ”is_ambiguous” (bool) whether the action given as input was ambiguous.

    • ”is_dispatching_illegal” (bool) was the action illegal due to redispatching

    • ”is_illegal_reco” (bool) was the action illegal due to a powerline reconnection

    • ”reason_alarm_illegal” (None or Exception) reason for which the alarm is illegal (it’s None if no alarm is raised or if the alarm feature is not used)

    • ”opponent_attack_line” (np.ndarray, bool) for each powerline, say if the opponent attacked it (True) or not (False).

    • ”opponent_attack_sub” (np.ndarray, bool) for each substation, say if the opponent attacked it (True) or not (False).

    • ”opponent_attack_duration” (int) the duration of the current attack (if any)

    • ”exception” (list of Exceptions.Exceptions.Grid2OpException) the exceptions raised during the step, or [] if everything was fine.

    • ”detailed_infos_for_cascading_failures” (optional, only if the backend has been created with detailed_infos_for_cascading_failures=True) the list of the intermediate steps computed during the simulation of the “cascading failures”.
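
The keys listed above can be condensed into a short report, for instance for logging. This is a minimal sketch: summarize_step_info is a hypothetical helper (not part of grid2op) that only reads keys documented in the list above and treats a missing key as “nothing to report”.

```python
def summarize_step_info(info):
    """Condense the `info` dict returned by `env.step` into a short report.

    Hypothetical helper (not part of grid2op).
    """
    disc_lines = info.get("disc_lines")
    if disc_lines is None:
        disconnected = []
    else:
        # -1 means "not disconnected"; 0, 1, 2... give the order in the cascade
        disconnected = sorted(
            (int(order), line_id)
            for line_id, order in enumerate(disc_lines)
            if order >= 0
        )
    return {
        # an illegal or ambiguous action is replaced by "do nothing"
        "action_replaced_by_do_nothing": bool(
            info.get("is_illegal", False) or info.get("is_ambiguous", False)
        ),
        "lines_disconnected_in_order": [line_id for _, line_id in disconnected],
        "nb_exceptions": len(info.get("exception", [])),
    }
```

It would typically be called right after obs, reward, done, info = env.step(action), as report = summarize_step_info(info).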

Examples

As with any OpenAI gym environment, this is used as follows:

import grid2op
from grid2op.Agent import RandomAgent

# I create an environment
env = grid2op.make()

# define an agent here, this is an example
agent = RandomAgent(env.action_space)

# environment need to be "reset" before usage:
obs = env.reset()
reward = env.reward_range[0]
done = False

# now run through each steps like this
while not done:
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)

Notes

If the flag done=True is raised (ie this is the end of the episode) then the observation is NOT properly updated and should not be used at all.

Actually, it will be in a “game over” state (see grid2op.Observation.BaseObservation.set_game_over).

class grid2op.Environment.BaseMultiProcessEnvironment(envs, obs_as_class=True, return_info=True, logger=None)[source]

This class makes it possible to evaluate a single agent instance on multiple environments running in parallel.

It uses the python “multiprocessing” framework to work, and thus is suitable only on a single machine with multiple cores (cpu / thread). We do not recommend to use this method on a cluster of different machines.

This class uses the following representation:

  • a grid2op.BaseAgent.BaseAgent: lives in a main process

  • different environments live in different processes

  • a call to MultiEnv.step() will perform one step per environment, in parallel, using a Pipe to transfer data between the main process and each individual environment process. It is a synchronous function: it waits for every environment to finish its step before returning all the information.

There are some limitations. For example, even if forecasts are available, it’s not possible to use the forecasts of the observations. This implies that grid2op.Observation.BaseObservation.simulate() is not available when using MultiEnvironment.

Compared to regular Environments, MultiEnvironment simply stacks everything. You need to send not a single grid2op.Action.BaseAction but as many actions as there are underlying environments. You receive not a single grid2op.Observation.BaseObservation but as many observations as there are underlying environments.

A broader support of regular grid2op environment capabilities as well as support for grid2op.Observation.BaseObservation.simulate() call might be added in the future.

NB As opposed to Environment.step(), when calling BaseMultiProcessEnvironment.step() (or the step() of any of its derived classes, SingleEnvMultiProcess or MultiEnvMultiProcess), a sub environment that is “done” is automatically reset. This means that you can call BaseMultiProcessEnvironment.step() without worrying about having to reset.
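
The “stacking” and automatic reset described above can be pictured with a sequential stand-in. This is only an illustration of the contract (one action per environment in, four lists out): stacked_step is a hypothetical function, it runs the environments one after the other in a single process, and it is not how the class is implemented.

```python
def stacked_step(envs, actions):
    """Sequential stand-in for a multi-environment step (illustration only).

    One action per environment goes in, four lists come out. A "done"
    environment is reset right away and its first observation after the
    reset is returned in place of the terminal one.
    """
    if len(actions) != len(envs):
        raise ValueError("expected one action per environment")
    obss, rewards, dones, infos = [], [], [], []
    for env, action in zip(envs, actions):
        obs, reward, done, info = env.step(action)
        if done:
            # automatic reset: the caller never has to call reset itself
            obs = env.reset()
        obss.append(obs)
        rewards.append(reward)
        dones.append(done)
        infos.append(info)
    return obss, rewards, dones, infos
```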

envs

A list of environments for which the evaluation will be made in parallel.

Type

list::grid2op.Environment.Environment

nb_env

Number of underlying environments handled in parallel. It is also the size of the list of actions that needs to be provided to MultiEnvironment.step() and the size of each list returned by this function.

Type

int

obs_as_class

Whether to convert the observations back to grid2op.Observation objects or to leave them as numpy arrays. Default (obs_as_class=True) is to send them as observation objects, but it’s slower.

Type

bool

return_info

Whether to return the information dictionary or not (might speed up computation)

Type

bool

Methods:

close()

Close all the environments and all the processes.

get_comp_time()

Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments

get_obs()

implement the get_obs function that is "broken" if you use the __getattr__

get_parameters()

Get the parameters of each sub environments

get_powerflow_time()

Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments

get_seeds()

Get the seeds used to initialize each sub environments.

get_step_time()

Get the computation time (corresponding to sub_env._time_step) of each sub environments

reset()

Reset all the environments, and return all the associated observation.

set_chunk_size(new_chunk_size)

Dynamically adapt the amount of data read from the hard drive.

set_ff([ff_max])

This method is primarily used for training.

set_filter(filter_funs)

Set a filter_fun for each of the underlying environment.

set_id(id_)

Set a chronics id for each of the underlying environment to be used for each of the sub_env.

simulate(actions)

Perform the equivalent of obs.simulate in all the underlying environment

step(actions)

Perform a step in all the underlying environments.

close()[source]

Close all the environments and all the processes.

get_comp_time()[source]

Get the computation time (only of the step part, corresponds to sub_env.comp_time) of each sub environments

get_obs()[source]

implement the get_obs function that is “broken” if you use the __getattr__

get_parameters()[source]

Get the parameters of each sub environments

get_powerflow_time()[source]

Get the computation time (corresponding to sub_env.backend.comp_time) of each sub environments

get_seeds()[source]

Get the seeds used to initialize each sub environments.

get_step_time()[source]

Get the computation time (corresponding to sub_env._time_step) of each sub environments

reset()[source]

Reset all the environments, and return all the associated observation.

NB Except on some specific occasions, there is no need to call reset. Indeed, when a sub environment is “done”, it is automatically restarted in the BaseMultiProcessEnvironment.step() function.

Returns

res – The list of all observations. This list has MultiEnvironment.nb_env elements, each one being a grid2op.Observation.BaseObservation.

Return type

list

set_chunk_size(new_chunk_size)[source]

Dynamically adapt the amount of data read from the hard drive. It is useful to set it to a low integer value (eg 10 or 100) at the beginning of the learning process, when the agent fails pretty quickly.

This takes effect only after a reset has been performed.

Parameters

new_chunk_size (int) – The new chunk size (positive integer)

set_ff(ff_max=2016.0)[source]

This method is primarily used for training.

The problem this method aims at solving is the following: most grid2op environments start on a Monday at 00:00. This method will “fast forward” an environment for a random number of time steps between 0 and ff_max.
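
The random draw described above can be sketched as follows. draw_fast_forward is a hypothetical function written for this illustration; in particular, whether ff_max itself can be drawn is an assumption of the sketch, not something stated in this page. With 5-minute time steps (also an assumption, the resolution depends on the environment), the default of 2016 steps covers exactly one week (7 × 24 × 12).

```python
import random


def draw_fast_forward(ff_max=2016.0, rng=random):
    """Draw how many time steps to skip, between 0 and `ff_max`.

    Illustration only: including `ff_max` itself in the range is an
    assumption of this sketch.
    """
    return rng.randint(0, int(ff_max))


rng = random.Random(0)  # seeded, so that the draws are reproducible
skips = [draw_fast_forward(2016.0, rng) for _ in range(5)]
```

Each value could then be given to fast_forward_chronics of a freshly reset environment, so that episodes do not all start on a Monday at 00:00.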

set_filter(filter_funs)[source]

Set a filter_fun for each of the underlying environment.

See grid2op.Chronics.MultiFolder.set_filter() for more information

Examples

TODO usage example

set_id(id_)[source]

Set a chronics id for each of the underlying environment to be used for each of the sub_env.

See grid2op.Environment.Environment.set_id() for more information

Examples

TODO usage example

simulate(actions)[source]

Perform the equivalent of obs.simulate in all the underlying environment

Parameters

actions (list) – List of all action to simulate

Returns

  • sim_obs – The observation resulting from the simulation

  • sim_rews – The reward resulting from the simulation

  • sim_dones – For each simulation, whether or not the simulated action led to a game over

  • sim_infos – Additional information for each simulated actions.

Examples

You can use this feature like:

import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment

env_name = ...  # for example "l2rpn_case14_sandbox"
env1 = grid2op.make(env_name)
env2 = grid2op.make(env_name)

multi_env = BaseMultiProcessEnvironment([env1, env2])
obss = multi_env.reset()

# simulate
actions = [env1.action_space(), env2.action_space()]
sim_obss, sim_rs, sim_ds, sim_is = multi_env.simulate(actions)
step(actions)[source]

Perform a step in all the underlying environments. If one or more of the underlying environments encounters a game over, it is automatically restarted.

In that case, the observation sent back to the user is the one obtained after grid2op.Environment.Environment.reset() has been called.

As opposed to Environment.step, a call to this function will automatically reset any underlying environment that is “done”. This works as follows. If one underlying environment is over (due to a game over or to the end of the chronics), then:

  • the corresponding “done” is returned as True

  • the corresponding observation returned is not the observation of the last time step (corresponding to the underlying environment that is game over) but is the first observation after reset.

At the next call to step, the flag done will be set to False (if no game over arises) and the corresponding observation is the next observation of this underlying environment: everything works as usual in this case.

We did that because asking the user to restart the environments that are game over added unnecessary complexity.
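
One practical consequence of this automatic reset is that per-episode statistics have to be closed when “done” is True, since the returned observation already belongs to the next episode. The sketch below shows one possible bookkeeping of cumulated rewards; accumulate_returns is a hypothetical helper (not part of grid2op).

```python
def accumulate_returns(running, rewards, dones):
    """Update per-environment running returns after one multi-env step.

    `running` holds the cumulated reward of the episode in progress in each
    sub environment. Returns (new_running, finished), where `finished` lists
    the (env index, episode return) pairs of the episodes that just ended.
    """
    new_running, finished = [], []
    for i, (total, reward, done) in enumerate(zip(running, rewards, dones)):
        total += reward
        if done:
            # the sub environment has already been reset: the reward belongs
            # to the episode that ended, the observation to the new one
            finished.append((i, total))
            total = 0.0
        new_running.append(total)
    return new_running, finished
```

It would be called after each obss, rewards, dones, infos = multi_env.step(actions), starting from running = [0.0] * multi_env.nb_env.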

Parameters

actions (list) – List of MultiEnvironment.nb_env grid2op.Action.BaseAction. Each action will be executed in the corresponding underlying environment.

Returns

  • obs (list) – List all the observations returned by each underlying environment.

  • rews (list) – List all the rewards returned by each underlying environment.

  • dones (list) – List all the “done” returned by each underlying environment. If one of these values is True, the corresponding environment encountered a game over.

  • infos (list) – List of the “info” dictionaries returned by each underlying environment.

Examples

You can use this class as follows:

import grid2op
from grid2op.Environment import BaseMultiProcessEnvironment
env1 = grid2op.make()  # create an environment of your choosing
env2 = grid2op.make()  # create another environment of your choosing

multi_env = BaseMultiProcessEnvironment([env1, env2])
obss = multi_env.reset()
obs1, obs2 = obss  # here i extract the observation of the first environment and of the second one
# note that you cannot do obs1.simulate().
# this is equivalent to a call to
# obs1 = env1.reset(); obs2 = env2.reset()

# then you can do regular steps
action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])
# if you define
# obs1, obs2 = obss
# r1, r2 = rewards
# done1, done2 = dones
# info1, info2 = infos
# in this case, it is equivalent to calling
# obs1, r1, done1, info1 = env1.step(action_env1)
# obs2, r2, done2, info2 = env2.step(action_env2)

Let us now focus on the “automatic” reset part.

# see above for the creation of a multi_env and the proper imports
multi_env = BaseMultiProcessEnvironment([env1, env2])
action_env1 = env1.action_space()
action_env2 = env2.action_space()
obss, rewards, dones, infos = multi_env.step([action_env1, action_env2])

# say dones[0] is ``True``
# in this case if you define
# obs1 = obss[0]
# r1 = rewards[0]
# done1 = dones[0]
# info1 = infos[0]
# then it is equivalent to the "single processed" code
# obs1_tmp, r1_tmp, done1_tmp, info1_tmp = env1.step(action_env1)
# done1 = done1_tmp
# r1 = r1_tmp
# info1 = info1_tmp
# obs1_aux = env1.reset()
# obs1 = obs1_aux
# CAREFUL: in this case, obs1 is NOT obs1_tmp but is really obs1_aux
class grid2op.Environment.Environment(init_grid_path: str, chronics_handler, backend, parameters, name='unknown', names_chronics_to_backend=None, actionClass=<class 'grid2op.Action.TopologyAction.TopologyAction'>, observationClass=<class 'grid2op.Observation.CompleteObservation.CompleteObservation'>, rewardClass=<class 'grid2op.Reward.FlatReward.FlatReward'>, legalActClass=<class 'grid2op.Rules.AlwaysLegal.AlwaysLegal'>, voltagecontrolerClass=<class 'grid2op.VoltageControler.ControlVoltageFromFile.ControlVoltageFromFile'>, other_rewards={}, thermal_limit_a=None, with_forecast=True, epsilon_poly=0.0001, tol_poly=0.01, opponent_action_class=<class 'grid2op.Action.DontAct.DontAct'>, opponent_class=<class 'grid2op.Opponent.BaseOpponent.BaseOpponent'>, opponent_init_budget=0.0, opponent_budget_per_ts=0.0, opponent_budget_class=<class 'grid2op.Opponent.NeverAttackBudget.NeverAttackBudget'>, opponent_attack_duration=0, opponent_attack_cooldown=99999, kwargs_opponent={}, attention_budget_cls=<class 'grid2op.operator_attention.attention_budget.LinearAttentionBudget'>, kwargs_attention_budget={}, has_attention_budget=False, logger=None, _raw_backend_class=None, _compat_glop_version=None, _read_from_local_dir=True)[source]

This class is the grid2op implementation of the “Environment” entity in the RL framework.

name

The name of the environment

Type

str

action_space

Another name for Environment.helper_action_player for gym compatibility.

Type

grid2op.Action.ActionSpace

observation_space

Another name for Environment.helper_observation for gym compatibility.

Type

grid2op.Observation.ObservationSpace

reward_range

The range of the reward function

Type

(float, float)

metadata

For gym compatibility, do not use

Type

dict

spec

For Gym compatibility, do not use

Type

None

viewer

Used to display the powergrid. Currently not supported.

Type

object

Methods:

add_text_logger([logger])

Add a text logger to this Environment

attach_renderer([graph_layout])

This function will attach a renderer, necessary to use for plotting capabilities.

copy()

Performs a deep copy of the environment

get_kwargs([with_backend])

This function allows to make another Environment with the same parameters as the one that have been used to make this one.

get_params_for_runner()

This method is used to initialize a proper grid2op.Runner.Runner to use this specific environment.

max_episode_duration()

Return the maximum duration (in number of steps) of the current episode.

render([mode])

Render the state of the environment on the screen, using matplotlib. Also returns the Matplotlib figure.

reset()

Reset the environment to a clean state.

reset_grid()

INTERNAL

set_chunk_size(new_chunk_size)

For an efficient data pipeline, it can be useful not to read all the input data at once (for example for load_p, prod_p, load_q, prod_v).

set_id(id_)

Set the id that will be used at the next call to Environment.reset().

set_max_iter(max_iter)

param max_iter

The maximum number of iteration you can do before reaching the end of the episode. Set it to "-1" for

simulate(action)

Another method to call obs.simulate to ensure compatibility between multi environment and regular one.

train_val_split(val_scen_id[, ...])

This function is used as Environment.train_val_split_random().

train_val_split_random([pct_val, ...])

By default a grid2op environment contains multiple "scenarios" containing values for all the producers and consumers representing multiple days.

add_text_logger(logger=None)[source]

Add a text logger to this Environment

Logging is for now an incomplete feature, really incomplete (not used)

Parameters

logger – The logger to use

attach_renderer(graph_layout=None)[source]

This function will attach a renderer, necessary to use for plotting capabilities.

Parameters

graph_layout (dict) –

Here for backward compatibility. Currently not used.

If you want to set a specific layout call BaseEnv.attach_layout()

If None this class will use the default substations layout provided when the environment was created. Otherwise it will use the data provided.

Examples

Here is how to use the function

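import grid2op
from grid2op.Agent import RandomAgent

# create the environment
env = grid2op.make()

# create an agent (a random one, as an example)
agent = RandomAgent(env.action_space)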

if False:
    # if you want to change the default layout of the powergrid
    # assign coordinates (0., 0.) to all substations (this is a dummy thing to do here!)
    layout = {sub_name: (0., 0.) for sub_name in env.name_sub}
    env.attach_layout(layout)
    # NB again, this code will make everything look super ugly !!!! Don't change the
    # default layout unless you have a reason to.

# and if you want to use the renderer
env.attach_renderer()

# and now you can "render" (plot) the state of the grid
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    env.render()
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)
copy()[source]

Performs a deep copy of the environment

Unless you have a reason to, it is not advised to make copy of an Environment.

Examples

It should be used as follows:

import grid2op
env = grid2op.make()
cpy_of_env = env.copy()
get_kwargs(with_backend=True)[source]

This function makes it possible to create another Environment with the same parameters as those used to make this one.

This is useful especially in cases where the Environment is not picklable (for example if some non-picklable C++ code is used) but you still want to do parallel processing using the “multiprocessing” module. In that case, you can send this dictionary to each child process, and have each child process build its own instance of an environment like self.

NB This function should not be used to make a copy of an environment. Prefer using Environment.copy() for such purpose.

Returns

res – A dictionary that helps build an environment like self: NOT a copy of self, but rather an instance of an environment with the same properties.

Return type

dict

Examples

It should be used as follows:

import grid2op
from grid2op.Environment import Environment
env = grid2op.make()  # create the environment of your choice
copy_of_env = Environment(**env.get_kwargs())
# And you can use this one as you would any other environment.
# NB this is not a "proper" copy. for example it will not be at the same step, it will possibly be
# seeded with a different seed.
# use `env.copy()` to make a proper copy of an environment.
get_params_for_runner()[source]

This method is used to initialize a proper grid2op.Runner.Runner to use this specific environment.

Examples

It should be used as follows:

import grid2op
from grid2op.Runner import Runner
from grid2op.Agent import DoNothingAgent  # for example
env = grid2op.make()  # create the environment of your choice

# create the proper runner
runner = Runner(**env.get_params_for_runner(), agentClass=DoNothingAgent)

# now you can run
runner.run(nb_episode=1)  # run for 1 episode
max_episode_duration()[source]

Return the maximum duration (in number of steps) of the current episode.

Notes

For possibly infinite episodes, the duration is returned as np.iinfo(np.int32).max, which corresponds to the maximum 32 bit integer (2147483647).
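
Code that needs a finite horizon (a progress bar, a pre-allocated buffer) may want to detect this sentinel value first. A minimal sketch, where is_unbounded is a hypothetical helper (not part of grid2op):

```python
INT32_MAX = 2**31 - 1  # same value as np.iinfo(np.int32).max


def is_unbounded(duration):
    """Return True if `duration` is the sentinel used for infinite episodes."""
    return int(duration) >= INT32_MAX
```

For example, is_unbounded(env.max_episode_duration()) tells whether the current episode has a known maximum length.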

render(mode='human')[source]

Render the state of the environment on the screen, using matplotlib. Also returns the Matplotlib figure.

Examples

Rendering first requires defining a “renderer”, which can be done with the following code:

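import grid2op
from grid2op.Agent import RandomAgent

# create the environment
env = grid2op.make()

# create an agent (a random one, as an example)
agent = RandomAgent(env.action_space)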

# if you want to use the renderer
env.attach_renderer()

# and now you can "render" (plot) the state of the grid
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    env.render()  # this piece of code plot the grid
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)
reset()[source]

Reset the environment to a clean state. It will reload the next chronics, if any, and reset the grid to a clean state.

This triggers a full reloading of both the chronics (if they are stored as files) and of the powergrid, to ensure the episode is fully over.

This method should be called only at the end of an episode.

Examples

The standard “gym loop” can be done with the following code:

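import grid2op
from grid2op.Agent import RandomAgent

# create the environment
env = grid2op.make()

# create an agent (a random one, as an example)
agent = RandomAgent(env.action_space)

# and now you can run a full episode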
obs = env.reset()
done = False
reward = env.reward_range[0]
while not done:
    action = agent.act(obs, reward, done)
    obs, reward, done, info = env.step(action)
reset_grid()[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

This is automatically called when using env.reset

Reset the backend to a clean state by reloading the powergrid from the hard drive. This might take some time.

If the thermal limits have been modified, they are also set in the new backend.

set_chunk_size(new_chunk_size)[source]

For an efficient data pipeline, it can be useful not to read all of the input data at once (for example for load_p, prod_p, load_q, prod_v). Grid2Op supports reading large chronics by “chunks” of a given size.

Reading data in chunks can also reduce the memory footprint, which is useful when multiprocessing environments use large chronics.

Setting a small chunk_size is critical when training a machine learning algorithm (reinforcement learning agent): at the beginning of training, when the agent performs poorly and episodes end early, the software might otherwise spend most of its time loading data.

NB this has no effect if the chronics does not support this feature.

NB The environment needs to be reset for this to take effect (it won’t affect the chronics already loaded).

Parameters

new_chunk_size (int or None) – The new chunk size (positive integer)

Examples

Here is an example of how to use this function:

import grid2op

# I create an environment
env = grid2op.make("rte_case5_example", test=True)
env.set_chunk_size(100)
env.reset()  # otherwise chunk size has no effect !
# and now data will be read from the hard drive 100 time steps per 100 time steps
# instead of the whole episode at once.
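To illustrate why a small chunk size matters when episodes end early, here is a plain-Python sketch of chunked reading. It does not use grid2op: `iter_chunks` and `steps_loaded` are hypothetical helpers, and the episode length of 1000 steps is an arbitrary value chosen for the illustration.

```python
def iter_chunks(n_steps, chunk_size):
    """Yield (start, stop) index ranges covering n_steps, one chunk at a time.

    chunk_size=None means "read everything at once".
    """
    if chunk_size is None:
        yield (0, n_steps)
        return
    if chunk_size <= 0:
        raise ValueError("chunk_size must be a positive integer or None")
    for start in range(0, n_steps, chunk_size):
        yield (start, min(start + chunk_size, n_steps))


def steps_loaded(episode_length, n_steps, chunk_size):
    """Number of time steps read if the episode stops after episode_length steps."""
    loaded = 0
    for _, stop in iter_chunks(n_steps, chunk_size):
        loaded = stop
        if stop >= episode_length:
            break  # the episode ended: later chunks are never read
    return loaded


# A poorly performing agent that fails after 30 steps of a 1000-step scenario:
print(steps_loaded(30, 1000, chunk_size=100))   # only the first chunk is read
print(steps_loaded(30, 1000, chunk_size=None))  # the whole scenario is read
```

With chunking, the cost of loading data scales with how long the agent actually survives rather than with the full length of the scenario.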
set_id(id_)[source]

Set the id that will be used at the next call to Environment.reset().

NB this has no effect if the chronics does not support this feature.

NB The environment needs to be reset for this to take effect.

Parameters

id_ (int) – The id of the chronics to use.

Examples

Here is an example that loops 10 times through the same chronics (thus always using the same injections):

import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("rte_case14_realistic")  # create an environment
agent = DoNothingAgent(env.action_space)  # create an agent

for i in range(10):
    env.set_id(0)  # tell the environment you simply want to use the chronics with ID 0
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)

And here you have an example on how you can loop through the scenarios in a given order:

import grid2op
from grid2op import make
from grid2op.Agent import DoNothingAgent

env = make("rte_case14_realistic")  # create an environment
agent = DoNothingAgent(env.action_space)  # create an agent
scenario_order = [1, 2, 3, 4, 5, 10, 8, 6, 5, 7, 78, 8]
for id_ in scenario_order:
    env.set_id(id_)  # tell the environment to use the chronics with ID id_
    obs = env.reset()  # it is necessary to perform a reset
    reward = env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = env.step(act)
set_max_iter(max_iter)[source]
Parameters

max_iter (int) – The maximum number of iterations you can do before reaching the end of the episode. Set it to “-1” for a possibly infinite episode duration.

Notes

The maximum length of the episode can depend on the chronics used. See Environment.chronics_handler for more information.

simulate(action)[source]

Alternative way to call obs.simulate, provided to ensure compatibility between multi environments and regular ones.

Parameters

action – A grid2op action

Returns

Notes

Prefer using obs.simulate if possible, it will be faster than this function.

train_val_split(val_scen_id, add_for_train='train', add_for_val='val', remove_from_name=None)[source]

This function is used as Environment.train_val_split_random().

Please refer to the help of Environment.train_val_split_random() for more information about this function.

Parameters
Returns

Examples

A full example of a training / validation / test split, with explicit specification of which chronics go in which set, is:

import grid2op
import os

env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# retrieve the names of the chronics:
full_path_data = env.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]

# splitting into training / test, keeping the "last" 10 chronics to the test set
nm_env_trainval, nm_env_test = env.train_val_split(val_scen_id=chron_names[-10:],
                                                   add_for_val="test",
                                                   add_for_train="trainval")

# now splitting again the training set into training and validation, keeping the last 10 chronics
# of this environment for validation
env_trainval = grid2op.make(nm_env_trainval)  # create the "trainval" environment
full_path_data = env_trainval.chronics_handler.subpaths
chron_names = [os.path.split(el)[-1] for el in full_path_data]
nm_env_train, nm_env_val = env_trainval.train_val_split(val_scen_id=chron_names[-10:],
                                                        remove_from_name="_trainval$")

# and now you can use the following code to load the environments:
env_train = grid2op.make(env_name+"_train")
env_val = grid2op.make(env_name+"_val")
env_test = grid2op.make(env_name+"_test")

For a simpler example, with fewer parameters and with random assignment (recommended), please refer to the help of Environment.train_val_split_random().

NB read the “Notes” of this section for possible “unexpected” behaviour of the code snippet above.

Notes

We don’t recommend using this function. It provides a great level of control over which scenarios go into which dataset, which is nice, but “with great power comes great responsibility”.

Keep in mind that scenarios might be “sorted” by having a “month” in their names. For example, the first k scenarios might be called “April_XXX” and the last k ones might be called “September_XXX”.

In general, we do not consider it good practice to have all validation (or test) scenarios coming from the same months. Keep that in mind if you use the code snippet above.
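One possible way to avoid drawing every validation scenario from the same month is to pick a few names from each name prefix. The sketch below is plain Python and rests on an assumption: that scenario names look like “April_XXX”, i.e. a prefix, an underscore, then a suffix. The helper `pick_val_per_group` is hypothetical and not part of grid2op.

```python
from collections import defaultdict


def pick_val_per_group(chron_names, n_per_group=1, sep="_"):
    """Hypothetical helper: keep the last n_per_group names of every prefix."""
    groups = defaultdict(list)
    for name in chron_names:
        groups[name.split(sep)[0]].append(name)
    val_names = []
    for prefix in sorted(groups):
        val_names.extend(sorted(groups[prefix])[-n_per_group:])
    return val_names


# made-up scenario names, for illustration only
names = ["April_01", "April_02", "April_03", "September_01", "September_02"]
val_scen_id = pick_val_per_group(names)
print(val_scen_id)  # ['April_03', 'September_02']
```

The resulting list could then be passed as the val_scen_id argument, in place of chron_names[-10:] in the snippet above.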

train_val_split_random(pct_val=10.0, add_for_train='train', add_for_val='val', remove_from_name=None)[source]

By default a grid2op environment contains multiple “scenarios” containing values for all the producers and consumers representing multiple days. In a “game like” environment, you can think of the scenarios as being different “game levels”: different mazes in pacman, different levels in mario etc.

We recommend training your agent on some of these “chronics” (aka levels) and testing its performance on some others, to avoid overfitting.

This function makes it easy to split an environment into different parts. This is most commonly used in machine learning, where part of a dataset is used for training and another part is used for assessing the performance of the trained model.

This function relies on symbolic links and will not duplicate data.

Newly created environments will behave like regular grid2op environments and will be accessible with “make” just like any others (see the examples section for more information).

This function will make the split at random. If you want more control over which scenarios to use for training and which for validation, use Environment.train_val_split(), which allows you to specify which scenarios go in the validation environment (the others go in the training environment).

Parameters
  • pct_val (float) – Percentage of chronics that will go to the validation set. For 10% of the chronics, set it to 10. and NOT to 0.1.

  • add_for_train (str) – Suffix that will be added to the name of the environment for the training set. We don’t recommend to modify the default value (“train”)

  • add_for_val (str) – Suffix that will be added to the name of the environment for the validation set. We don’t recommend to modify the default value (“val”)

  • remove_from_name (str) – If you “split” an environment multiple times, this allows you to keep “short” names (for example you will be able to call grid2op.make(env_name+"_train") instead of grid2op.make(env_name+"_train_train"))

Returns

  • nm_train (str) – Complete name of the “training” environment

  • nm_val (str) – Complete name of the “validation” environment
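Judging from the examples in this section, the returned names appear to be the environment name followed by an underscore and the chosen suffix, with remove_from_name acting as a regular expression (for instance "_trainval$") that is stripped first. The sketch below only mirrors that naming pattern in plain Python. `split_names` is a hypothetical helper written under this assumption, not the library's implementation.

```python
import re


def split_names(env_name, add_for_train="train", add_for_val="val",
                remove_from_name=None):
    """Hypothetical helper mimicking the naming seen in the examples."""
    base = env_name
    if remove_from_name is not None:
        # assumption: remove_from_name is a regular expression
        base = re.sub(remove_from_name, "", env_name)
    return f"{base}_{add_for_train}", f"{base}_{add_for_val}"


print(split_names("l2rpn_case14_sandbox"))
print(split_names("l2rpn_case14_sandbox_trainval", remove_from_name="_trainval$"))
```

Both calls yield names ending in "_train" and "_val" on top of "l2rpn_case14_sandbox", which is why the second split can keep “short” names.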

Examples

This function can be used like:

import grid2op
env_name = "l2rpn_case14_sandbox"  # or any other...
env = grid2op.make(env_name)

# extract 1% of the "chronics" to be used in the validation environment. The other 99% will
# be used for training
nm_env_train, nm_env_val = env.train_val_split_random(pct_val=1.)

# and now you can use the training set only to train your agent:
print(f"The name of the training environment is \"{nm_env_train}\"")
print(f"The name of the validation environment is \"{nm_env_val}\"")
env_train = grid2op.make(nm_env_train)

And even after you close the python session, you can still use this environment for training. If you used the exact code above that will look like:

import grid2op
env_name_train = "l2rpn_case14_sandbox_train"  # depending on the option you passed above
env_train = grid2op.make(env_name_train)

Notes

This function will fail if an environment already exists with one of the names that would be given to the training environment or the validation environment.

class grid2op.Environment.MultiEnvMultiProcess(envs, nb_envs, obs_as_class=True, return_info=True, logger=None)[source]

This class allows evaluating a single agent instance on multiple environments running in parallel.

It is a kind of BaseMultiProcessEnvironment. For more information you can consult the documentation of this parent class. This class allows interacting at the same time with different copies of possibly different environments in parallel.

envs

A list of environments for which the evaluation will be made in parallel.

Type

list:grid2op.Environment.Environment

nb_envs

Number of parallel copies that will be handled for each environment in envs. MUST have the same length as the parameter envs. The total number of subprocesses will be the sum of this list.

Type

list:int

Examples

This class can be used as:

import grid2op
from grid2op.Environment import MultiEnvMultiProcess
env0 = grid2op.make()  # create an environment
env1 = grid2op.make()  # create a second environment, that can be similar, or not
# it is recommended to filter or create the environment with different parameters, otherwise this class
# is of little interest
envs = [env0, env1]  # list of all environments created
nb_envs = [1, 7]  # number of "copies" of each environment that will be made.
# in this case the first one will be copied only once, and the second one 7 times.
# the total number of environments used in the multi env will be the sum(nb_envs), here 8.

multi_env = MultiEnvMultiProcess(envs=envs, nb_envs=nb_envs)
# and now you can use it like any other grid2op environment (almost)
observations = multi_env.reset()
class grid2op.Environment.MultiMixEnvironment(envs_dir, logger=None, experimental_read_from_local_dir=False, _add_to_name='', _compat_glop_version=None, _test=False, **kwargs)[source]

This class represents a single powergrid configuration, backed by multiple sets of environment parameters and chronics.

It implements most of the BaseEnv public interface, so it can be used like a more classic environment.

MultiMixEnvironment environments behave like a superset of the environment: they are made of sub environments (called mixes) that are regular grid2op Environment. You can think of the MultiMixEnvironment as a dictionary of Environment that implements some of the BaseEnv interface, such as BaseEnv.step() or BaseEnv.reset().

By default, each time you call the “reset” function a different mix is used. Mixes, by default, are looped through always in the same order. See the Examples section for information about how to control this.
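The “dictionary of environments” picture can be sketched with a toy class. It is not the grid2op class, and the mix names and string “environments” are made up. As the comments in the examples of this section indicate, the switch to the next mix is assumed to happen when reset is called.

```python
from itertools import cycle


class ToyMultiMix:
    """Toy stand-in (not the grid2op class): a dict of mixes cycled on reset."""

    def __init__(self, mixes):
        self._mixes = dict(mixes)
        self._order = cycle(self._mixes)  # mix names, always in the same order
        self.current_name = None

    def reset(self):
        # move on to the next mix, wrapping around at the end
        self.current_name = next(self._order)
        return self._mixes[self.current_name]

    def keys(self):
        return self._mixes.keys()

    def __getitem__(self, name):
        return self._mixes[name]

    def __iter__(self):
        return iter(self._mixes.values())


toy = ToyMultiMix({"mix_a": "env A", "mix_b": "env B"})
visited = []
for _ in range(5):
    toy.reset()
    visited.append(toy.current_name)
print(visited)  # ['mix_a', 'mix_b', 'mix_a', 'mix_b', 'mix_a']
```

The toy also supports keys(), the [] operator and iteration, which correspond to the access patterns shown in the examples of this section.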

Examples

In this section we present some common uses of the MultiMix environment.

Basic Usage

You can think of a MultiMixEnvironment as any Environment. So this is a perfectly valid way to use a MultiMix:

import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# and now you can do the open ai gym loop
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = multimix_env.reset()
    # each time "reset" is called, another mix is used.
    reward = multimix_env.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = multimix_env.step(act)

Use each mix one after the other

In case you want to study each mix independently, you can iterate through the MultiMix in a pythonic way. This makes it easy to perform, for example, 10 episodes for a given mix before passing to the next one.

import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

NB_EPISODE = 10
for mix in multimix_env:
    # mix is a regular environment, you can do whatever you want with it
    # for example, create an agent for it and run a few episodes
    agent = RandomAgent(mix.action_space)
    for i in range(NB_EPISODE):
        obs = mix.reset()  # "mix" is reset, so the same mix is used again
        reward = mix.reward_range[0]
        done = False
        while not done:
            act = agent.act(obs, reward, done)
            obs, reward, done, info = mix.step(act)

Selecting a given Mix

Sometimes it might be interesting to study only a given mix. For that you can use the [] operator to select only a given mix (which is a grid2op environment) and use it as you would any other environment.

This can be done with:

import grid2op
from grid2op.Agent import RandomAgent

# we use an example of a multimix dataset attached with grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# define an agent like in any environment
agent = RandomAgent(multimix_env.action_space)

# list all available mixes:
mixes_names = list(multimix_env.keys())

# and now supposes we want to study only the first one
mix = multimix_env[mixes_names[0]]

# and now you can do the open ai gym loop, or anything you want with it
NB_EPISODE = 10
for i in range(NB_EPISODE):
    obs = mix.reset()
    # "mix" is a regular environment: every episode uses this same mix
    reward = mix.reward_range[0]
    done = False
    while not done:
        act = agent.act(obs, reward, done)
        obs, reward, done, info = mix.step(act)

Using the Runner

For MultiMixEnvironment using the grid2op.Runner.Runner cannot be done in a straightforward manner. Here we give an example on how to do it.

import os
import grid2op
from grid2op.Agent import RandomAgent
from grid2op.Runner import Runner

# we use an example of a multimix dataset attached with grid2op package
multimix_env = grid2op.make("l2rpn_neurips_2020_track2", test=True)

# you can use the runner as follows
PATH = "PATH/WHERE/YOU/WANT/TO/SAVE/THE/RESULTS"
for mix in multimix_env:
    runner = Runner(**mix.get_params_for_runner(), agentClass=RandomAgent)
    runner.run(nb_episode=1,
               path_save=os.path.join(PATH, mix.name))

Methods:

attach_layout(grid_layout)

INTERNAL

get_path_env()

Get the path that allows to create this environment.

seed([seed])

Set the seed of this Environment for a better control and to ease reproducible experiments.

set_thermal_limit(thermal_limit)

Set the thermal limit effectively.

attach_layout(grid_layout)[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\ We do not recommend to “attach layout” outside of the environment. Please refer to the function grid2op.Environment.BaseEnv.attach_layout() for more information.

grid_layout is a dictionary whose keys are the names of the substations and whose values are the tuples of coordinates of each substation. No check is made to ensure it is correct.

Parameters

grid_layout (dict) – See definition of GridObjects.grid_layout for more information.

get_path_env()[source]

Get the path that allows to create this environment.

It can be used for example in grid2op.utils.underlying_statistics to save the information directly inside the environment data.

seed(seed=None)[source]

Set the seed of this Environment for a better control and to ease reproducible experiments.

Parameters

seed (int) – The seed to set.

Returns

seeds – The seed used to set the prng (pseudo random number generator) for all environments, and the tuple of seeds of each environment.

Return type

list

set_thermal_limit(thermal_limit)[source]

Set the thermal limit effectively. This will propagate to all underlying mixes.

class grid2op.Environment.SingleEnvMultiProcess(env, nb_env, obs_as_class=True, return_info=True, logger=None)[source]

This class allows to evaluate a single agent instance on multiple environments running in parallel.

It is a kind of BaseMultiProcessEnvironment. For more information you can consult the documentation of this parent class. It allows interacting at the same time with different copies of the (same) environment in parallel.

env

A list of environments for which the evaluation will be made in parallel.

Type

list::grid2op.Environment.Environment

nb_env

Number of parallel underlying environments that will be handled. It is also the size of the list of actions that must be provided to MultiEnvironment.step(), and the size of each list returned by this same function.

Type

int

Examples

An example on how you can best leverage this class is given in the getting_started notebooks. Another simple example is:

from grid2op.Agent import DoNothingAgent
from grid2op.MakeEnv import make
from grid2op.Environment import SingleEnvMultiProcess

# create a simple environment
env = make()
# number of parallel environments
nb_env = 2  # change that to adapt to your system
NB_STEP = 100  # number of step for each environment

# create a simple agent
agent = DoNothingAgent(env.action_space)

# create the multi environment class
multi_envs = SingleEnvMultiProcess(env=env, nb_env=nb_env)

# making it usable
obs = multi_envs.reset()
rews = [env.reward_range[0] for i in range(nb_env)]
dones = [False for i in range(nb_env)]

# perform the appropriate steps
for i in range(NB_STEP):
    acts = [None for _ in range(nb_env)]
    for env_act_id in range(nb_env):
        acts[env_act_id] = agent.act(obs[env_act_id], rews[env_act_id], dones[env_act_id])
    obs, rews, dones, infos = multi_envs.step(acts)

    # DO SOMETHING WITH THE AGENT IF YOU WANT

# close the environments
multi_envs.close()
# close the initial environment
env.close()
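The list-in, list-out contract of step described for nb_env can be pictured with a toy class in plain Python. `ToyParallelEnv` is a hypothetical stand-in that runs no subprocess and is not part of grid2op. It only shows that one action per underlying environment goes in and that each returned list has nb_env entries.

```python
class ToyParallelEnv:
    """Toy stand-in (not grid2op): nb_env step counters advanced in lockstep."""

    def __init__(self, nb_env, max_step=3):
        self.nb_env = nb_env
        self.max_step = max_step
        self._steps = [0] * nb_env

    def reset(self):
        self._steps = [0] * self.nb_env
        return list(self._steps)

    def step(self, actions):
        # exactly one action per underlying environment is required
        if len(actions) != self.nb_env:
            raise ValueError(f"expected {self.nb_env} actions, got {len(actions)}")
        self._steps = [t + 1 for t in self._steps]
        obs = list(self._steps)
        rews = [float(a) for a in actions]  # toy reward: the action itself
        dones = [t >= self.max_step for t in self._steps]
        infos = [{} for _ in actions]
        return obs, rews, dones, infos


toy = ToyParallelEnv(nb_env=2)
obs = toy.reset()
obs, rews, dones, infos = toy.step([1, 0])
print(len(obs), len(rews), len(dones), len(infos))  # 2 2 2 2
```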
