Opponent Modeling


Objectives

Power systems are a really important tool today, and they should be as resilient as possible to avoid potentially dramatic consequences.

In grid2op, we chose to enforce this property by implementing an “Opponent” (modeled by the BaseOpponent class) that can take malicious actions to deteriorate the state of the powergrid and make the Agent (grid2op.Agent) fail. Making the agent “game over” is really easy: for example, the opponent could isolate a load by forcing the disconnection of all the powerlines that feed it. This would not be fair, which is why the Opponent has a dedicated budget (modeled by BaseActionBudget).

The class OpponentSpace has the delicate role to:

  • send the necessary information for the Opponent to attack properly

  • make sure the attack performed by the opponent is legal

  • compute the cost of such an attack

  • make sure this cost is not too high for the opponent budget.
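The responsibilities listed above can be pictured with a minimal, self-contained sketch. It does not use grid2op: the ToyOpponentSpace class, its cost rule (one unit per disconnected line) and the allowed_lines whitelist are all assumptions made for illustration, not the actual OpponentSpace implementation.

```python
class ToyOpponentSpace:
    """Hypothetical stand-in for OpponentSpace (not the grid2op class)."""

    def __init__(self, allowed_lines, init_budget):
        self.allowed_lines = set(allowed_lines)
        self.budget = float(init_budget)

    def compute_cost(self, attack):
        # assumption: an attack costs one unit per disconnected line
        return float(len(attack))

    def is_legal(self, attack):
        # assumption: an attack is legal if it only targets whitelisted lines
        return set(attack) <= self.allowed_lines

    def filter_attack(self, attack):
        """Return the attack if it is legal and affordable, else None (do nothing)."""
        if not self.is_legal(attack):
            return None
        cost = self.compute_cost(attack)
        if cost > self.budget:
            return None
        self.budget -= cost
        return attack


space = ToyOpponentSpace(allowed_lines=["1_3_3", "1_4_4"], init_budget=1.0)
first = space.filter_attack(["1_3_3"])    # legal and affordable
second = space.filter_attack(["1_4_4"])   # legal, but the budget is now empty
third = space.filter_attack(["9_10_12"])  # not in the whitelist
```

In this toy version a rejected attack is simply replaced by None, which plays the role of a "do nothing".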

Relation with N-1 security

The “opponent” modeling in grid2op is closely related to the N-1 security criterion used in operation by many TSOs.

Definition of N-1 security

There are different definitions of “N-1 security” depending on the TSO. In this part of the documentation, we define it clearly to avoid misunderstandings.

First, a grid is said to be “in security” if no threshold is violated for any equipment on the grid. In the context of grid2op, this most often means that all powerlines have a flow under a certain threshold (referred to as the “thermal limit” in grid2op).

Important

A grid is N-1 safe if, for any “contingency” (e.g. the disconnection of a powerline) from a given list (in general this list is “all the lines / transformers on the grid”), the grid would still be “in security” (as defined in the above paragraph) after this contingency occurred.
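As a rough illustration of this definition, the sketch below checks a toy grid described only by made-up post-contingency flows. The function names and the numbers are hypothetical, and no powerflow is computed here: in practice the flows would come from a simulator.

```python
def in_security(flows, thermal_limits):
    """A grid state is "in security" if every flow is under its thermal limit."""
    return all(flow < limit for flow, limit in zip(flows, thermal_limits))


def is_n1_safe(flows_after, thermal_limits):
    """flows_after maps each contingency to the line flows once it has occurred."""
    return all(in_security(flows, thermal_limits) for flows in flows_after.values())


thermal_limits = [100.0, 100.0, 100.0]
# hypothetical post-contingency flows (a disconnected line carries 0.)
flows_after = {
    "line_0": [0.0, 80.0, 70.0],
    "line_1": [90.0, 0.0, 60.0],
    "line_2": [120.0, 95.0, 0.0],  # line 0 is overloaded when line 2 is lost
}
unsafe = [c for c, flows in flows_after.items() if not in_security(flows, thermal_limits)]
```

Here a single contingency that leaves one line above its limit is enough for the toy grid to fail the N-1 criterion.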

N-1 with corrective actions

This security criterion might not be met at all times, yet the grid could still be “secure” 100% of the time.

Indeed, some actions have a very short “implementation time” (say a nearly instantaneous action, like the opening or the closing of a switch). Such an action can be implemented only when a contingency occurs, to prevent the grid from going out of security.

This would give the following definition of N-1 security:

Important

A grid is N-1 secure (with corrective action) if and only if, for any contingency (among a given list), it is possible to find at least one corrective action (understand: an action that can be quickly implemented after the contingency, including the “I do nothing” action) that would bring the grid back to security.

Note

If a grid is N-1 safe (as in the first definition above, without corrective action), it is also “N-1 secure (with corrective action)”, because “I do nothing” counts as a corrective action.

The converse is not true.
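The definition with corrective actions, and the fact that it is weaker than the plain N-1 criterion, can be illustrated with a small sketch. The outcome table and the simulator function are hypothetical stand-ins for a real simulation.

```python
def is_n1_secure_with_corrective(contingencies, actions, is_secure_after):
    """True if every contingency has at least one action restoring security.

    `is_secure_after(contingency, action)` is a user-supplied (hypothetical) simulator.
    """
    return all(
        any(is_secure_after(contingency, action) for action in actions)
        for contingency in contingencies
    )


# toy outcome table: which (contingency, action) pairs leave the grid in security
outcomes = {
    ("line_0", "do_nothing"): True,
    ("line_1", "do_nothing"): False,
    ("line_1", "open_switch"): True,
}


def simulator(contingency, action):
    return outcomes.get((contingency, action), False)


contingencies = ["line_0", "line_1"]
without_corrective = is_n1_secure_with_corrective(contingencies, ["do_nothing"], simulator)
with_corrective = is_n1_secure_with_corrective(
    contingencies, ["do_nothing", "open_switch"], simulator
)
```

With only "do_nothing" available the check reduces to the plain N-1 criterion and fails, while adding one fast action makes the same toy grid secure.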

Extension of the N-1 security, with corrective actions, on a time interval

Grid2op is especially suited to study how the grid will evolve around a given time interval (say a few hours or a few days).

The N-1 criterion can then be “extended” to include the time dependence. The natural definition would be:

Important

A grid is N-1 secure across the time interval [beginning, end] if (and only if), for any contingency (among a given list) occurring during this time interval, there exists a list of corrective actions that can ensure the grid remains in security at any time until the end of this interval.

Note

If the grid state is not modified by any action, assessing whether or not the grid is “N-1 secure on a given time interval” is equivalent to assessing whether or not, for each step within the interval, the grid is “N-1 secure”.

This is not the case if there are “feedback loops” on the grid, such as delayed protection, presence of corrective actions or storage units on the grid, and in general anything that can affect the state of the grid and be “non instantly reversible”.

In this case, it is possible to find examples where grids are “N-1 secure at every step within the interval”, but the grid is not “N-1 secure across the time interval”. For example, imagine the case where a storage unit should be emptied as a “corrective measure”. In this case, the grid is “N-1 secure” at each step (we assume you can always empty it for one step). But the grid is not safe during the whole interval if the initial capacity of the storage unit is not sufficient to last the whole duration of the contingency.
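The storage example can be made concrete with a few lines of Python. All numbers and function names below are made up for illustration: the step-wise check looks at each step in isolation, whereas the interval check keeps track of the energy left in the storage unit.

```python
def secure_at_each_step(n_steps, needed_per_step, max_discharge_per_step):
    """Step-wise check: each step is assessed alone, with storage assumed available."""
    return all(needed_per_step <= max_discharge_per_step for _ in range(n_steps))


def secure_across_interval(n_steps, needed_per_step, max_discharge_per_step, initial_energy):
    """Interval check: the stored energy is tracked from one step to the next."""
    energy = initial_energy
    for _ in range(n_steps):
        if needed_per_step > max_discharge_per_step or needed_per_step > energy:
            return False
        energy -= needed_per_step
    return True


# hypothetical numbers: the contingency lasts 5 steps and the storage unit
# must deliver 1 unit of energy per step, but it only holds 3 units
stepwise = secure_at_each_step(5, 1.0, 2.0)
interval = secure_across_interval(5, 1.0, 2.0, initial_energy=3.0)
```

The two checks disagree only because the storage level couples consecutive steps, which is the kind of “feedback loop” mentioned above.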

Translation with the grid2op framework

Grid2op makes it easy to assess whether a grid is “N-1 secure, with (or without) corrective action, across a given time interval”.

For the version without corrective actions, you can imagine something like this:

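import numpy as np

# let's suppose env is a grid2op environment
# with the correct data for the "time interval" of interest:
# env.reset() sets the state at the beginning of the time interval
# and when env.step() returns done=True it is the end of the time
# interval
# LIST_OF_ALL_CONTINGENCIES (to be defined by the user) holds the ids
# of the lines to disconnect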

is_secure = np.zeros(len(LIST_OF_ALL_CONTINGENCIES), dtype=bool)
for c_num, contingency_id in enumerate(LIST_OF_ALL_CONTINGENCIES):
  # reset the environment and disconnect the
  # given contingency_id
  init_obs = env.reset(options={"init state": {"set_line_status": [(contingency_id, -1)]}})
  obs = init_obs
  done = False
  terminated = False
  while not done:
    if (obs.rho >= 1.).any():
      # there is an overflow for this contingency_id for this step
      # grid is not secure
      is_secure[c_num] = False
      terminated = True
      break

    act = env.action_space() # replace this by a "corrective action" from an agent if needed
    obs, reward, done, info = env.step(act)

    if done and obs.current_step < obs.max_step:
      # episode terminated prematurely
      is_secure[c_num] = False
      terminated = True
      break

  if not terminated:
    is_secure[c_num] = True

# At this point the grid is N-1 safe for this time interval
# if and only if:
assert is_secure.all()

You can, almost as easily, assess if the grid is “N-1 secure for a given time interval with corrective actions” by letting an agent that is able to take corrective actions choose the action at each step (replace the act variable above).

Note

If you do that, you must also make sure that the agent cannot reconnect the contingency (eg the disconnected powerline) for the entire duration of the episode.

Reformulation with an opponent

Evaluating if a grid is N-1 secure for a given time interval can be reformulated as evaluating if a grid is secure against an opponent that will try to disconnect a powerline, for a certain duration, preventing its reconnection. And you “run” this opponent for each contingency that you need to simulate.

And if an agent manages to find a “correct list of actions” for all of these contingencies, then you can say that this agent “makes the grid secure”.

In terms of code, this reformulation “hides” the call init_obs = env.reset(options={"init state": {"set_line_status": [(contingency_id, -1)]}}) inside the opponent, which “lives” inside the environment.

The sketch of the above code would look like:

is_secure = np.zeros(len(LIST_OF_ALL_CONTINGENCIES), dtype=bool)
for c_num, contingency_id in enumerate(LIST_OF_ALL_CONTINGENCIES):
  env.set_opponent(Opponent_Attack_Line(contingency_id))
  # NB: this `set_opponent` is not implemented, it only illustrates the equivalence
  # NB: Opponent_Attack_Line(contingency_id) does not exist either
  init_obs = env.reset()
  # init_obs has the proper "attack", which is "disconnection of the line contingency_id"

  ... # rest is unchanged

And now, to limit computation time (for example during training), you can imagine an opponent that would always find the worst possible line to disconnect (an “attack” in grid2op terms) for a given agent, for example by simulating the previous for loop “in its head” before actually disconnecting the line. If such an opponent exists, the following is sufficient: “if the agent manages to overcome the single attack of this ‘oracle opponent’, then the agent makes the grid N-1 secure for the entire episode with corrective actions.”

Of course, in practice this ‘oracle opponent’ is really hard to find. So we decided to approximate the “worst possible line to disconnect” for a given agent with a heuristic. This led to the different opponents available in grid2op today.
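To make the idea of an “oracle opponent” more tangible, here is a toy sketch of its selection rule. The simulate_max_rho argument is a hypothetical function standing for a full replay of the episode with one line disconnected, and the numbers are made up. This is not one of the opponents shipped with grid2op.

```python
def worst_contingency(contingencies, simulate_max_rho):
    """Return the contingency giving the highest simulated loading for a given agent.

    `simulate_max_rho(contingency)` is a hypothetical function that replays the
    episode with this line disconnected and returns the largest rho observed.
    """
    return max(contingencies, key=simulate_max_rho)


# toy results of the "in its head" simulations (made-up numbers)
simulated = {"line_0": 0.82, "line_1": 1.07, "line_2": 0.95}
target = worst_contingency(list(simulated), simulated.get)
# the agent overcomes the oracle only if no overflow occurs for the worst line
agent_survives_oracle = simulated[target] < 1.0
```

The cost of such an oracle is one full simulation per candidate line, which is why a cheaper heuristic is attractive in practice.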

How to create an opponent in any environment

This section is a work in progress. It only covers how to set up one type of opponent, and supposes that you already know which lines you want to attack, at which frequency, etc.

More detailed information about the opponent will be provided in the future.

The opponent in the “l2rpn_neurips_track1” environment is set up with the following configuration:

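from grid2op.Action import PowerlineSetAction
from grid2op.Opponent import WeightedRandomOpponent, BaseActionBudget

lines_attacked = ["62_58_180", "62_63_160", "48_50_136", "48_53_141", "41_48_131", "39_41_121",
                  "43_44_125", "44_45_126", "34_35_110", "54_58_154"]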
rho_normalization = [0.45, 0.45, 0.6, 0.35, 0.3, 0.2,
                     0.55, 0.3, 0.45, 0.55]
opponent_attack_cooldown = 12*24  # 24 hours, 1 hour being 12 time steps
opponent_attack_duration = 12*4  # 4 hours
opponent_budget_per_ts = 0.16667  # opponent_attack_duration / opponent_attack_cooldown + epsilon
opponent_init_budget = 144.  # no need to attack right away, it can attack starting at midday the first day
config = {
    "opponent_attack_cooldown": opponent_attack_cooldown,
    "opponent_attack_duration": opponent_attack_duration,
    "opponent_budget_per_ts": opponent_budget_per_ts,
    "opponent_init_budget": opponent_init_budget,
    "opponent_action_class": PowerlineSetAction,
    "opponent_class": WeightedRandomOpponent,
    "opponent_budget_class": BaseActionBudget,
    'kwargs_opponent': {"lines_attacked": lines_attacked,
                        "rho_normalization": rho_normalization,
                        "attack_period": opponent_attack_cooldown}
}
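The comment next to opponent_budget_per_ts can be checked with a short calculation. The sketch below assumes that an ongoing attack costs one unit of budget per step, which is an assumption made here for illustration and is not stated in the text above.

```python
steps_per_hour = 12  # "1 hour being 12 time steps", as in the configuration above
cooldown = steps_per_hour * 24   # opponent_attack_cooldown
duration = steps_per_hour * 4    # opponent_attack_duration

# budget earned per step so that one cooldown period pays for one full attack
break_even = duration / cooldown
budget_per_ts = round(break_even, 5)   # 0.16667, slightly above 1/6 (the "epsilon")

# assumption (not stated in the text): an ongoing attack costs 1 unit per step
earned_per_cooldown = budget_per_ts * cooldown
can_afford_one_attack_per_day = earned_per_cooldown >= duration
```

Under that assumption, rounding 1/6 up to 0.16667 is what lets the opponent regenerate just enough budget for one 4 hour attack every 24 hours.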

To create an opponent in the same way on the case14 grid (here a RandomLineOpponent), you can do:

import grid2op
from grid2op.Action import PowerlineSetAction
from grid2op.Opponent import RandomLineOpponent, BaseActionBudget
env_name = "l2rpn_case14_sandbox"

env_with_opponent = grid2op.make(env_name,
                                 opponent_attack_cooldown=12*24,
                                 opponent_attack_duration=12*4,
                                 opponent_budget_per_ts=0.5,
                                 opponent_init_budget=0.,
                                 opponent_action_class=PowerlineSetAction,
                                 opponent_class=RandomLineOpponent,
                                 opponent_budget_class=BaseActionBudget,
                                 kwargs_opponent={"lines_attacked":
                                      ["1_3_3", "1_4_4", "3_6_15", "9_10_12", "11_12_13", "12_13_14"]}
                                 )
# and now you have an opponent on the l2rpn_case14_sandbox
# you can for example
obs = env_with_opponent.reset()

act = ...  # choose an action here
obs, reward, done, info = env_with_opponent.step(act)

And for the track2 of neurips, if you want to make it even more complicated, you can add an opponent in the same fashion:

import grid2op
from grid2op.Action import PowerlineSetAction
from grid2op.Opponent import RandomLineOpponent, BaseActionBudget
env_name = "l2rpn_neurips_2020_track2_small"

env_with_opponent = grid2op.make(env_name,
                                 opponent_attack_cooldown=12*24,
                                 opponent_attack_duration=12*4,
                                 opponent_budget_per_ts=0.5,
                                 opponent_init_budget=0.,
                                 opponent_action_class=PowerlineSetAction,
                                 opponent_class=RandomLineOpponent,
                                 opponent_budget_class=BaseActionBudget,
                                 kwargs_opponent={"lines_attacked":
                                                     ["26_31_106",
                                                      "21_22_93",
                                                      "17_18_88",
                                                      "4_10_162",
                                                      "12_14_68",
                                                      "14_32_108",
                                                      "62_58_180",
                                                      "62_63_160",
                                                      "48_50_136",
                                                      "48_53_141",
                                                      "41_48_131",
                                                      "39_41_121",
                                                      "43_44_125",
                                                      "44_45_126",
                                                      "34_35_110",
                                                      "54_58_154",
                                                      "74_117_81",
                                                      "80_79_175",
                                                      "93_95_43",
                                                      "88_91_33",
                                                      "91_92_37",
                                                      "99_105_62",
                                                      "102_104_61"]}
                                 )
# and now you have an opponent on the l2rpn_neurips_2020_track2_small
# you can for example
obs = env_with_opponent.reset()

act = ...  # choose an action here
obs, reward, done, info = env_with_opponent.step(act)

To summarize what is going on here:

  • opponent_attack_cooldown: gives the minimum number of steps between two attacks (here 12*24 steps, so at most 1 attack per day)

  • opponent_attack_duration: duration for each attack (when a line is attacked, it will not be possible to reconnect it for that many steps). In the example it’s 4h (so 48 steps)

  • opponent_action_class: type of the action the opponent will perform (in this case PowerlineSetAction)

  • opponent_class: type of the opponent. Change it at your own risk.

  • opponent_budget_class: each attack costs the opponent some budget. Without budget, the opponent cannot attack. This class specifies how the budget is computed. Do not change it.

  • opponent_budget_per_ts: increase of the budget of the opponent per step. The higher this number, the faster the opponent will regenerate its budget.

  • opponent_init_budget: initial opponent budget. It is set to 0 to “give” the agent a bit of time before the opponent is triggered.

  • kwargs_opponent: additional information for the opponent. In this case we provide for each grid the powerline it can attack.

Note

This is only valid for the RandomLineOpponent, which disconnects powerlines randomly (but not uniformly!). For other types of Opponent, we don’t provide any information in the documentation at this stage. Feel free to submit a GitHub issue if this is a problem for you.

How to deactivate an opponent in an environment

If you come across an environment with an “opponent” already present but, for some reason, you want to deactivate it, you can do this by customizing the call to “grid2op.make” like this:

import grid2op
from grid2op.Action import DontAct
from grid2op.Opponent import BaseOpponent, NeverAttackBudget
env_name = ...


# if you want to disable the opponent you can do (grid2op >= 1.9.4)
kwargs_no_opp = grid2op.Opponent.get_kwargs_no_opponent()
env_no_opp = grid2op.make(env_name, **kwargs_no_opp)
# and there the opponent is disabled

# or, in a more complex fashion (or for older grid2op version <= 1.9.3)
env_without_opponent = grid2op.make(env_name,
                                    opponent_attack_cooldown=999999,
                                    opponent_attack_duration=0,
                                    opponent_budget_per_ts=0,
                                    opponent_init_budget=0,
                                    opponent_action_class=DontAct,
                                    opponent_class=BaseOpponent,
                                    opponent_budget_class=NeverAttackBudget,
                                    # ... other arguments passed to the "make" function
                                    )

Note

Currently it’s not possible to deactivate an opponent once the environment is created.

If you want this feature, you can comment on the issue https://github.com/Grid2Op/grid2op/issues/426

Detailed Documentation by class

Classes:

BaseActionBudget(action_space)

INTERNAL

BaseOpponent(action_space)

FromEpisodeDataOpponent(action_space)

GeometricOpponent(action_space)

This opponent will disconnect lines randomly among the attackable lines lines_attacked.

GeometricOpponentMultiArea(action_space)

This opponent is a combination of several similar opponents (of kind GeometricOpponent at this stage) attacking different areas.

NeverAttackBudget(action_space)

This class defines an unlimited budget for the opponent.

OpponentSpace(compute_budget, init_budget, ...)

Is similar to the action space, but for the opponent.

RandomLineOpponent(action_space)

An opponent that disconnects at random any powerline among a specified list given at initialization.

UnlimitedBudget(action_space)

This class defines an unlimited budget for the opponent.

WeightedRandomOpponent(action_space)

This opponent will disconnect lines randomly among the attackable lines lines_attacked.

class grid2op.Opponent.BaseActionBudget(action_space)[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\

This is the base class representing the action budget. It makes sure the opponent uses the correct type of “action”, and computes the budget associated with it.

class grid2op.Opponent.BaseOpponent(action_space)[source]

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "act" for a regular agent.

get_state()

This function should return the internal state of the Opponent.

init(partial_env, **kwargs)

Generic function used to initialize the derived classes.

reset(initial_budget)

This function is called at the end of an episode, when the episode is over.

set_state(my_state)

This function is used to set the internal state of the Opponent.

tell_attack_continues(observation, ...)

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “act” for a regular agent.

In this framework, the opponent can have more information than a regular agent (in particular, it can view time step t+1), has access to its current budget, etc.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • opp_reward (float) – The opponent “reward” (equivalent to the agent reward, but for the opponent). TODO in the source: not in the current signature, may be added back.

  • done (bool) – Whether the game ended or not. TODO in the source: not in the current signature, may be added back.

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

  • previous_fails (bool) – Whether the previous attack failed (due to budget or ambiguous action)

Returns:

  • attack (grid2op.Action.Action) – The attack performed by the opponent. In this case, a do nothing, all the time.

  • duration (int) – The duration of the attack

get_state()[source]

This function should return the internal state of the Opponent.

This means that after a call to opponent.set_state(opponent.get_state()) the opponent should behave exactly as it would have without these calls.
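The contract between get_state and set_state can be sketched with a toy opponent. ToyOpponent below is hypothetical, its attack method takes no argument (unlike the real signature), and it only shows one way to make the round trip exact: every piece of data that influences future attacks, including the random generator, goes into the state.

```python
import random


class ToyOpponent:
    """Hypothetical opponent (not a grid2op class) with a restorable internal state."""

    def __init__(self, lines, seed=0):
        self.lines = list(lines)
        self._rng = random.Random(seed)
        self._n_attacks = 0

    def attack(self):
        self._n_attacks += 1
        return self._rng.choice(self.lines)

    def get_state(self):
        # everything that influences future attacks must be part of the state
        return (self._rng.getstate(), self._n_attacks)

    def set_state(self, my_state):
        rng_state, n_attacks = my_state
        self._rng.setstate(rng_state)
        self._n_attacks = n_attacks


opponent = ToyOpponent(["1_3_3", "1_4_4", "3_6_15"])
saved = opponent.get_state()
first_run = [opponent.attack() for _ in range(5)]
opponent.set_state(saved)
second_run = [opponent.attack() for _ in range(5)]
```

After restoring the saved state, the toy opponent replays exactly the same sequence of attacks.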

init(partial_env, **kwargs)[source]

Generic function used to initialize the derived classes. For example, if an opponent reads from a file, the path where the file is located should be passed with this method.

reset(initial_budget)[source]

This function is called at the end of an episode, when the episode is over. It resets the opponent and prepares it for a new episode.

Parameters:

initial_budget (float) – The initial budget the opponent has

set_state(my_state)[source]

This function is used to set the internal state of the Opponent.

Parameters:

my_state

tell_attack_continues(observation, agent_action, env_action, budget)[source]

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

At every time step, either “attack” or “tell_attack_continues” is called exactly once.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).
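The rule that exactly one of “attack” or “tell_attack_continues” is called at every step can be sketched with a toy driver loop. The CountingOpponent class, the simplified method signatures and the run function are hypothetical. The real loop lives inside the environment and also handles budget and cooldown, which are left out here.

```python
class CountingOpponent:
    """Hypothetical opponent that only records which method was called."""

    def __init__(self, duration):
        self.duration = duration
        self.calls = []

    def attack(self, step):
        # start a new attack every time it is asked to act
        self.calls.append((step, "attack"))
        return "disconnect", self.duration

    def tell_attack_continues(self, step):
        self.calls.append((step, "tell_attack_continues"))


def run(opponent, n_steps):
    remaining = 0  # number of steps left in the ongoing attack
    for step in range(n_steps):
        if remaining > 0:
            opponent.tell_attack_continues(step)
        else:
            _, remaining = opponent.attack(step)
        remaining -= 1


opponent = CountingOpponent(duration=3)
run(opponent, n_steps=7)
names = [name for _, name in opponent.calls]
```

Each step appears once in opponent.calls, with a new “attack” call only when the previous attack has run out.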

class grid2op.Opponent.FromEpisodeDataOpponent(action_space)[source]

Warning

This can only be used if your environment uses the grid2op.Chronics.FromOneEpisodeData or XXX (from a list of episode data or directory) class, otherwise it will NOT work.

New in version 1.9.4.

Examples

Provided that you stored some data in path_agent using a grid2op.Runner.Runner for example you can use this class with:

import grid2op
from grid2op.Chronics import FromOneEpisodeData
from grid2op.Opponent import FromEpisodeDataOpponent
from grid2op.Episode import EpisodeData

path_agent = ...  # same as above
env_name = ...  # same as above

# path_agent is the path where data coming from a grid2op runner are stored
# NB it should come from a do nothing agent, or at least
# an agent that does not modify the injections (no redispatching, curtailment, storage)
li_episode = EpisodeData.list_episode(path_agent)
ep_data = li_episode[0]

env = grid2op.make(env_name,
                   chronics_class=FromOneEpisodeData,  # super important
                   data_feeding_kwargs={"ep_data": ep_data},  # super important
                   opponent_class=FromEpisodeDataOpponent,  # important
                   opponent_attack_cooldown=1,  # super important
              )
# ep_data can be either a tuple of 2 elements (like above)
# or a full path to a saved episode
# or directly an object of type EpisodeData

obs = env.reset()

# and now you can use "env" as any grid2op environment.

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "act" for a regular agent.

get_state()

This function should return the internal state of the Opponent.

init(partial_env, **kwargs)

Generic function used to initialize the derived classes.

reset(initial_budget)

This function is called at the end of an episode, when the episode is over.

set_state(state)

This function is used to set the internal state of the Opponent.

tell_attack_continues(observation, ...)

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “act” for a regular agent.

In this framework, the opponent can have more information than a regular agent (in particular, it can view time step t+1), has access to its current budget, etc.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • opp_reward (float) – The opponent “reward” (equivalent to the agent reward, but for the opponent). TODO in the source: not in the current signature, may be added back.

  • done (bool) – Whether the game ended or not. TODO in the source: not in the current signature, may be added back.

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

  • previous_fails (bool) – Whether the previous attack failed (due to budget or ambiguous action)

Returns:

  • attack (grid2op.Action.Action) – The attack performed by the opponent. In this case, a do nothing, all the time.

  • duration (int) – The duration of the attack

get_state()[source]

This function should return the internal state of the Opponent.

This means that after a call to opponent.set_state(opponent.get_state()) the opponent should behave exactly as it would have without these calls.

init(partial_env, **kwargs)[source]

Generic function used to initialize the derived classes. For example, if an opponent reads from a file, the path where the file is located should be passed with this method.

reset(initial_budget)[source]

This function is called at the end of an episode, when the episode is over. It resets the opponent and prepares it for a new episode.

Parameters:

initial_budget (float) – The initial budget the opponent has

set_state(state)[source]

This function is used to set the internal state of the Opponent.

Parameters:

state

tell_attack_continues(observation, agent_action, env_action, budget)[source]

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

At every time step, either “attack” or “tell_attack_continues” is called exactly once.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

class grid2op.Opponent.GeometricOpponent(action_space)[source]

This opponent will disconnect lines randomly among the attackable lines lines_attacked. The sampling is done according to the lines’ load factor (ratio of <current going through the line> to <thermal limit of the line>) (see init for more details).

The time of the attack is sampled according to a geometric distribution.
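To give an intuition of sampling “according to the load factor”, the sketch below draws a line with a probability proportional to a made-up rho value. The proportional rule is an assumption chosen for simplicity: the actual weighting used by GeometricOpponent (see init and pmax_pmin_ratio) may differ.

```python
import random


def sample_line(lines, rho, rng):
    """Pick one attackable line with a probability proportional to its load factor."""
    return rng.choices(lines, weights=rho, k=1)[0]


rng = random.Random(0)
lines = ["1_3_3", "1_4_4", "3_6_15"]
rho = [0.9, 0.3, 0.1]  # made-up load factors (flow / thermal limit)

counts = {line: 0 for line in lines}
for _ in range(3000):
    counts[sample_line(lines, rho, rng)] += 1
```

Over many draws the most loaded line is attacked most often, but every attackable line keeps a non-zero chance of being selected.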

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "act" for a regular agent.

get_state()

This function should return the internal state of the Opponent.

init(partial_env[, lines_attacked, ...])

Generic function used to initialize the derived classes.

reset(initial_budget)

This function is called at the end of an episode, when the episode is over.

set_state(my_state)

This function is used to set the internal state of the Opponent.

tell_attack_continues(observation, ...)

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “act” for a regular agent.

In this framework, the opponent can have more information than a regular agent (in particular, it can view time step t+1), has access to its current budget, etc.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • opp_reward (float) – The opponent “reward” (equivalent to the agent reward, but for the opponent). TODO in the source: not in the current signature, may be added back.

  • done (bool) – Whether the game ended or not. TODO in the source: not in the current signature, may be added back.

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

  • previous_fails (bool) – Whether the previous attack failed (due to budget or ambiguous action)

Returns:

  • attack (grid2op.Action.Action) – The attack performed by the opponent. In this case, a do nothing, all the time.

  • duration (int) – The duration of the attack (if None then the attack will be made for the longest allowed time)

get_state()[source]

This function should return the internal state of the Opponent.

This means that after a call to opponent.set_state(opponent.get_state()) the opponent should behave exactly as it would have without these calls.

init(partial_env, lines_attacked=(), attack_every_xxx_hour=24, average_attack_duration_hour=4, minimum_attack_duration_hour=2, pmax_pmin_ratio=4, **kwargs)[source]

Generic function used to initialize the derived classes. For example, if an opponent reads from a file, the path where the file is located should be passed with this method.

Parameters:
  • partial_env (grid2op Environment) – A pointer to the environment that initializes the opponent

  • lines_attacked (list) – The list of lines that the opponent should be able to disconnect

  • attack_every_xxx_hour (float) –

    Provides the average duration between two attacks. Note that this should be greater than average_attack_duration_hour as, for now, the opponent can only perform one attack at a time. You should provide it in “number of hours” and not in “number of steps”.

    It is used to compute the attack_hazard_rate. Attack times are sampled with a duration distribution. For this opponent, we use the simplest of these distributions: the geometric distribution https://en.wikipedia.org/wiki/Geometric_distribution (the discrete time counterpart of the exponential distribution). The attack_hazard_rate is the main parameter of this distribution. It can be seen as the (constant) probability of having an attack in the next step. It is also the inverse of the expectation of the time to an attack.

  • average_attack_duration_hour (float) –

    Gives, in number of hours, the average attack duration. This should be greater than recovery_minimum_duration_hour.

    Used to compute the recovery_rate: recovery times are random, or at least should have a random part. In our case, the recovery time is equal to a fixed time (safety procedure time) plus a random time (investigations and repair operations) sampled according to a geometric distribution.

  • minimum_attack_duration_hour (int) – Minimum duration of an attack (in hours)

  • pmax_pmin_ratio (float) – Ratio between the probability of the most likely line to be disconnected and the least likely one.
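The role of the geometric distribution in the parameters above can be sketched with the standard library only. The conversion to steps (12 steps per hour) and the way the average duration is split into a fixed minimum plus a random part are assumptions made for this illustration, not a description of the actual implementation.

```python
import math
import random


def sample_geometric(p, rng):
    """Number of steps until the first success, each step succeeding with probability p."""
    u = 1.0 - rng.random()  # uniform in (0, 1], avoids log(0)
    return int(math.log(u) / math.log(1.0 - p)) + 1


steps_per_hour = 12                      # assumption: 5 minute steps
attack_every_xxx_hour = 24
minimum_attack_duration_hour = 2
average_attack_duration_hour = 4

# hazard rate: inverse of the expected time (in steps) to the next attack
attack_hazard_rate = 1.0 / (attack_every_xxx_hour * steps_per_hour)
# random part of the duration: average minus the fixed minimum (assumed split)
random_part_steps = (average_attack_duration_hour - minimum_attack_duration_hour) * steps_per_hour
recovery_rate = 1.0 / random_part_steps

rng = random.Random(42)
times_to_attack = [sample_geometric(attack_hazard_rate, rng) for _ in range(20000)]
durations = [
    minimum_attack_duration_hour * steps_per_hour + sample_geometric(recovery_rate, rng)
    for _ in range(20000)
]
mean_time = sum(times_to_attack) / len(times_to_attack)
mean_duration = sum(durations) / len(durations)
```

With these settings the empirical mean time to an attack should be close to 288 steps (24 hours) and the mean duration close to 48 steps (4 hours), while no attack is shorter than the fixed minimum.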

reset(initial_budget)[source]

This function is called at the end of an episode, when the episode is over. It resets the opponent and prepares it for a new episode.

Parameters:

initial_budget (float) – The initial budget the opponent has

set_state(my_state)[source]

This function is used to set the internal state of the Opponent.

Parameters:

my_state

tell_attack_continues(observation, agent_action, env_action, budget)[source]

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

At every time step, either “attack” or “tell_attack_continues” is called exactly once.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

class grid2op.Opponent.GeometricOpponentMultiArea(action_space)[source]

This opponent is a combination of several similar opponents (of kind GeometricOpponent at this stage) attacking different areas. The difference between unitary opponents is mainly the attackable lines (which belong to different pre-identified areas).

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "attack" for a regular agent.

get_state()

This function should return the internal state of the Opponent.

init(partial_env[, lines_attacked, ...])

Generic function used to initialize the derived classes.

reset(initial_budget)

This function is called at the end of an episode, when the episode is over.

seed(seed)

INTERNAL

set_state(my_state)

This function is used to set the internal state of the Opponent.

tell_attack_continues(observation, ...)

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “attack” for a regular agent. In this framework, an opponent can have more information than a regular agent (in particular it can view time step t+1), it has access to its current budget, etc. If several unitary opponents attack at the same time, their attacks are combined, and the attack duration is the minimum of the durations of these simultaneous attacks.

Parameters:
  • observation (grid2op.Observation.Observation) – see the GeometricOpponent::attack documentation

  • opp_reward (float) – see the GeometricOpponent::attack documentation

  • done (bool) – see the GeometricOpponent::attack documentation

  • agent_action (grid2op.Action.Action) – see the GeometricOpponent::attack documentation

  • env_action (grid2op.Action.Action) – see the GeometricOpponent::attack documentation

  • budget (float) – see the GeometricOpponent::attack documentation

  • previous_fails (bool) – see the GeometricOpponent::attack documentation

Returns:

  • attack (grid2op.Action.Action) – see the GeometricOpponent::attack documentation

  • duration (int) – see the GeometricOpponent::attack documentation
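The combination rule described for this method (merge simultaneous unitary attacks, keep the shortest duration) can be sketched as below. This is an assumption-laden illustration: attacks are modelled as sets of line names rather than grid2op actions, the function combine_attacks is hypothetical, and returning (None, None) when nobody attacks is a convention of this sketch only.

```python
def combine_attacks(unit_results):
    """Merge the (attack, duration) pairs returned by the unitary opponents.

    Each attack is modelled as a set of line names (a stand-in for a grid2op
    action). None means that this unitary opponent does not attack now.
    """
    active = [(lines, duration) for lines, duration in unit_results
              if lines is not None]
    if not active:
        return None, None
    combined = set()
    for lines, _ in active:
        combined |= lines  # union of all the lines attacked at this step
    # Keep the shortest duration among the simultaneous attacks.
    duration = min(d for _, d in active)
    return combined, duration
```

For example, if one area attacks a line for 5 steps while another area attacks a different line for 3 steps, the sketch returns both lines with a duration of 3 steps.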

get_state()[source]

This function should return the internal state of the Opponent.

This means that after a call to opponent.set_state(opponent.get_state()), the opponent should behave exactly as it would have without these calls.
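The contract between get_state and set_state can be illustrated with a toy class. The ToyOpponent below is hypothetical and does not derive from any grid2op class; it only shows that the state must include everything that influences future decisions, here a pseudo random number generator and a counter.

```python
import random


class ToyOpponent:
    """Hypothetical opponent whose behaviour depends on a random generator."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self._next_attack_in = 0

    def step(self):
        # Draw the number of steps before the next attack.
        self._next_attack_in = self._rng.randint(1, 10)
        return self._next_attack_in

    def get_state(self):
        # Everything needed to reproduce the future behaviour.
        return (self._rng.getstate(), self._next_attack_in)

    def set_state(self, my_state):
        rng_state, self._next_attack_in = my_state
        self._rng.setstate(rng_state)
```

Saving the state, stepping a few times, restoring the state and stepping again gives the same sequence twice, which is the property described above.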

init(partial_env, lines_attacked=None, attack_every_xxx_hour=24, average_attack_duration_hour=4, minimum_attack_duration_hour=2, pmax_pmin_ratio=4, **kwargs)[source]

Generic function used to initialize the derived classes. For example, if an opponent reads from a file, the path where the file is located should be passed with this method. This is based on init from GeometricOpponent; the only difference is that the parameter lines_attacked becomes a list of lists.

Parameters:
  • partial_env (grid2op Environment) – see the GeometricOpponent::init documentation

  • lines_attacked (list(list)) – The lists of lines attacked by each unitary opponent. This is a list of lists: the size of the outer list is the number of underlying opponents (the number of areas), and each inner list gives the names of the lines to attack in that area.

  • attack_every_xxx_hour (float) – see the GeometricOpponent::init documentation

  • average_attack_duration_hour (float) – see the GeometricOpponent::init documentation

  • minimum_attack_duration_hour (int) – see the GeometricOpponent::init documentation

  • pmax_pmin_ratio (float) – see the GeometricOpponent::init documentation
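The shape expected for lines_attacked can be sketched as follows. The line names are made up, the dictionary init_kwargs and the helper check_lines_attacked are hypothetical, and the other values simply repeat the defaults shown in the signature above.

```python
# Hypothetical line names, grouped by area; replace them with names from your grid.
lines_attacked = [
    ["line_0_1", "line_1_2"],  # lines attackable by the opponent of area 0
    ["line_5_6", "line_6_7"],  # lines attackable by the opponent of area 1
]

# Keyword arguments mirroring the parameters documented above.
init_kwargs = {
    "lines_attacked": lines_attacked,
    "attack_every_xxx_hour": 24,
    "average_attack_duration_hour": 4,
    "minimum_attack_duration_hour": 2,
    "pmax_pmin_ratio": 4,
}


def check_lines_attacked(lines_per_area):
    """Check the list-of-lists shape and return the number of areas."""
    if not lines_per_area:
        raise ValueError("at least one area is required")
    for area in lines_per_area:
        if not isinstance(area, (list, tuple)):
            raise TypeError("each area must be a list of line names")
        if not area or not all(isinstance(name, str) for name in area):
            raise ValueError("each area must be a non-empty list of str")
    return len(lines_per_area)
```

A flat list of names, as accepted by the single-area opponent, is rejected by this check, because each element of the outer list must itself be a list.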

reset(initial_budget)[source]

This function is called at the end of an episode, when the episode is over. It resets the opponent and prepares it for a new episode.

Parameters:

initial_budget (float) – The initial budget the opponent has

seed(seed)[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\ We do not recommend using this function outside of the two examples given in the description of this class.

Set the seeds of the pseudo random number generators used by the unitary opponents.

Parameters:

seed (int) – The root seed to be set for the random number generator.

Returns:

seeds – The associated list of seeds used.

Return type:

list
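One way to derive a list of seeds from a single root seed is sketched below. This is not how grid2op necessarily does it: the function derive_seeds and the upper bound on the seed values are assumptions made for the example.

```python
import random


def derive_seeds(root_seed, n_opponents):
    """Derive one seed per unitary opponent from a single root seed.

    Hypothetical helper: the actual derivation in grid2op may differ.
    """
    rng = random.Random(root_seed)
    max_seed = 2**31 - 1  # arbitrary bound chosen for this sketch
    return [rng.randint(0, max_seed) for _ in range(n_opponents)]
```

The same root seed always yields the same list, so an experiment with several unitary opponents can be reproduced from a single integer.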

set_state(my_state)[source]

This function is used to set the internal state of the Opponent.

Parameters:

my_state

tell_attack_continues(observation, agent_action, env_action, budget)[source]

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

At every time step, either “attack” or “tell_attack_continues” is called exactly once.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

class grid2op.Opponent.NeverAttackBudget(action_space)[source]

This class defines an unlimited budget for the opponent.

It SHOULD NOT be used if the opponent is allowed to take any actions!

class grid2op.Opponent.OpponentSpace(compute_budget, init_budget, opponent, attack_duration, attack_cooldown, budget_per_timestep=0.0, action_space=None, _local_dir_cls=None)[source]

Is similar to the action space, but for the opponent.

This class is used to express some “constraints” on the opponent attack. The opponent is free to attack whatever it wants, for how long it wants and when it wants. This class ensures that the opponent does not break any rules.

action_space

The action space defining which actions the Opponent is allowed to take

Type:

grid2op.Action.ActionSpace

init_budget

The initial budget of the opponent

Type:

float

compute_budget

The tool used to compute the budget

Type:

grid2op.Opponent.ActionBudget

opponent

The agent that will take malicious actions.

Type:

grid2op.Opponent.BaseOpponent

previous_fails

Whether the last attack of the opponent failed or not

Type:

bool

budget_per_timestep

The increase of the opponent budget per time step (if any)

Type:

float

Methods:

attack(observation, agent_action, env_action)

This function calls the attack from the opponent.

close()

If this has a reference to a backend, you need to close it for grid2op to work properly.

has_failed()

This signal is sent by the environment and indicates that the opponent attack could not be implemented on the powergrid, most likely because the attack was ambiguous.

init_opponent(partial_env, **kwargs)

Generic function used to initialize the opponent.

reset()

Reset the state of the Opponent to its original state, in particular reassign the proper budget to it.

attack(observation, agent_action, env_action)[source]

This function calls the attack from the opponent.

It checks whether the budget is consistent with the attack (the budget should be more than the cost associated with the attack). If the attack costs too much, then it is replaced by a “do nothing” action. Otherwise, the attack will be implemented by the environment.

Note that if the attack is “ambiguous” it will fail (the environment will replace it by a “do nothing” action), but the budget will still be consumed.

NB it is expected that this function updates the OpponentSpace.last_attack attribute with None if the opponent chooses not to attack, or with the attack of the opponent otherwise.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

Returns:

res – The attack the opponent wants to perform (or “do nothing” if the attack was too costly), or None if no action is taken

Return type:

grid2op.Action.Action or NoneType
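The budget check described for this method can be sketched with a toy class. Everything here is a simplification made for illustration: ToyOpponentSpace is hypothetical, attacks are modelled as sets of line names, the order in which the budget is increased and checked is an assumption, and setting last_attack to None for a rejected attack is a choice of this sketch.

```python
DO_NOTHING = frozenset()  # stand-in for a "do nothing" action


class ToyOpponentSpace:
    """Hypothetical, simplified version of the budget bookkeeping."""

    def __init__(self, init_budget, budget_per_timestep, cost_of):
        self.budget = init_budget
        self.budget_per_timestep = budget_per_timestep
        self.cost_of = cost_of  # callable mapping an attack to its cost
        self.last_attack = None

    def attack(self, proposed_attack):
        # The budget increases at every time step (if any increase is set).
        self.budget += self.budget_per_timestep
        if proposed_attack is None:
            self.last_attack = None
            return None
        cost = self.cost_of(proposed_attack)
        if cost > self.budget:
            # Too costly: the attack is replaced by a "do nothing".
            self.last_attack = None
            return DO_NOTHING
        self.budget -= cost
        self.last_attack = proposed_attack
        return proposed_attack
```

With a cost equal to the number of disconnected lines, an initial budget of 1 and an increase of 0.5 per step, a two-line attack at the first step is rejected (budget 1.5), while a one-line attack at the next step is accepted and leaves a budget of 1.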

close()[source]

If this has a reference to a backend, you need to close it for grid2op to work properly. Do not forget to do it.

has_failed()[source]

This signal is sent by the environment and indicates that the opponent attack could not be implemented on the powergrid, most likely because the attack was ambiguous.

init_opponent(partial_env, **kwargs)[source]

Generic function used to initialize the opponent. For example, if an opponent reads from a file, the path where the file is located should be passed with this method.

reset()[source]

Reset the state of the Opponent to its original state, in particular reassign the proper budget to it.

class grid2op.Opponent.RandomLineOpponent(action_space)[source]

An opponent that disconnects powerlines at random, among a list of lines specified at initialization.

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "attack" for a regular agent.

init(partial_env[, lines_attacked])

INTERNAL

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “attack” for a regular agent.

In this framework, an opponent can have more information than a regular agent (in particular it can view time step t+1), it has access to its current budget, etc.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • opp_reward (float) – The opponent “reward” (equivalent to the agent reward, but for the opponent). TODO: not in the current signature, may be added back.

  • done (bool) – Whether the game ended or not. TODO: not in the current signature, may be added back.

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

  • previous_fails (bool) – Whether the previous attack failed (due to budget or ambiguous action)

Returns:

  • attack (grid2op.Action.Action) – The attack performed by the opponent.

  • duration (int) – The duration of the attack (if None then the attack will be made for the longest allowed time)

init(partial_env, lines_attacked=[], **kwargs)[source]

INTERNAL

Warning

/!\ Internal, do not use unless you know what you are doing /!\ Used when the opponent is created.

Parameters:
  • partial_env

  • lines_attacked

  • kwargs

class grid2op.Opponent.UnlimitedBudget(action_space)[source]

This class defines an unlimited budget for the opponent.

It SHOULD NOT be used if the opponent is allowed to take any actions!

class grid2op.Opponent.WeightedRandomOpponent(action_space)[source]

This opponent will disconnect lines randomly among the attackable lines lines_attacked. The sampling is weighted by each line's current usage rate divided by some factor rho_normalization (see init for more details).

When an attack becomes possible, the time of the attack will be sampled uniformly in the next attack_period steps (see init).

Methods:

attack(observation, agent_action, ...)

This method is the equivalent of "attack" for a regular agent.

init(partial_env[, lines_attacked, ...])

Generic function used to initialize the derived classes.

reset(initial_budget)

This function is called at the end of an episode, when the episode is over.

tell_attack_continues(observation, ...)

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

attack(observation, agent_action, env_action, budget, previous_fails)[source]

This method is the equivalent of “attack” for a regular agent.

In this framework, an opponent can have more information than a regular agent (in particular it can view time step t+1), it has access to its current budget, etc.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • opp_reward (float) – The opponent “reward” (equivalent to the agent reward, but for the opponent). TODO: not in the current signature, may be added back.

  • done (bool) – Whether the game ended or not. TODO: not in the current signature, may be added back.

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

  • previous_fails (bool) – Whether the previous attack failed (due to budget or ambiguous action)

Returns:

  • attack (grid2op.Action.Action) – The attack performed by the opponent.

  • duration (int) – The duration of the attack

init(partial_env, lines_attacked=[], rho_normalization=[], attack_period=288, **kwargs)[source]

Generic function used to initialize the derived classes. For example, if an opponent reads from a file, the path where the file is located should be passed with this method.

Parameters:
  • lines_attacked (list) – The list of lines that the WeightedRandomOpponent should be able to disconnect

  • rho_normalization (list) – The list of mean usage rates for the attackable lines. Should have the same length as lines_attacked. If no value is given, no normalization will be performed. The weights for sampling the attacked line are rho / rho_normalization.

  • attack_period (int) – The number of steps among which the attack may happen. If attack_period=10, then whenever an attack can be made, it will happen in the 10 next steps.
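The weighting rho / rho_normalization and the uniform choice of the attack time can be sketched with the standard library. The helpers sample_line and sample_attack_step are hypothetical, and rho is assumed to hold the usage rates of the attackable lines only, in the same order as lines_attacked.

```python
import random


def sample_line(lines_attacked, rho, rho_normalization=None, rng=None):
    """Pick one attackable line, weighted by rho / rho_normalization."""
    if rng is None:
        rng = random.Random()
    if len(rho) != len(lines_attacked):
        raise ValueError("rho must have one value per attackable line")
    if rho_normalization:
        if len(rho_normalization) != len(lines_attacked):
            raise ValueError("rho_normalization must match lines_attacked")
        weights = [r / n for r, n in zip(rho, rho_normalization)]
    else:
        # No normalization given: the raw usage rates are the weights.
        weights = list(rho)
    # random.choices raises ValueError if all the weights are zero.
    return rng.choices(lines_attacked, weights=weights, k=1)[0]


def sample_attack_step(attack_period, rng=None):
    """Pick uniformly in how many steps, within attack_period, the attack starts."""
    if rng is None:
        rng = random.Random()
    return rng.randrange(attack_period)
```

Dividing by the mean usage rate keeps lines that are always heavily loaded from being attacked almost every time: a line is favoured when it is loaded more than usual, not merely when it is loaded.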

reset(initial_budget)[source]

This function is called at the end of an episode, when the episode is over. It resets the opponent and prepares it for a new episode.

Parameters:

initial_budget (float) – The initial budget the opponent has

tell_attack_continues(observation, agent_action, env_action, budget)[source]

The purpose of this method is to tell the opponent that its attack is being continued and to indicate the current state of the grid.

At every time step, either “attack” or “tell_attack_continues” is called exactly once.

Parameters:
  • observation (grid2op.Observation.Observation) – The last observation (at time t)

  • agent_action (grid2op.Action.Action) – The action that the agent took

  • env_action (grid2op.Action.Action) – The modification that the environment will take.

  • budget (float) – The current remaining budget (if an action is above this budget, it will be replaced by a do nothing).

Copyright © Grid2Op a Series of LF Projects, LLC For website terms of use, trademark policy and other project policies please see https://lfprojects.org.