
Changes from main repository

This fork contains the modified code used as the base for experimenting with different neural network architectures and reward shaping.

Most notably, we:

  • experimented with shared vs. separate actor-critic networks using MLPs (see the illustrative sketch below)
  • crafted new reward functions (a waiting-time simple moving average reward, a flow reward, and finally an urgency reward), ultimately leading to a 47% decrease in waiting time and a 10% increase in average speed
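The snippet below is a generic illustration of that architectural choice, not the fork's actual code: a shared MLP trunk feeding actor and critic heads versus two fully independent MLPs. Class names and layer sizes are made up for the example.

import torch.nn as nn

class SharedActorCritic(nn.Module):
    """One MLP trunk feeds both the policy (actor) head and the value (critic) head."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, hidden), nn.ReLU())
        self.actor_head = nn.Linear(hidden, n_actions)   # logits over green phases
        self.critic_head = nn.Linear(hidden, 1)          # state-value estimate

    def forward(self, obs):
        h = self.trunk(obs)
        return self.actor_head(h), self.critic_head(h)

class SeparateActorCritic(nn.Module):
    """Actor and critic are independent MLPs; no parameters are shared."""
    def __init__(self, obs_dim, n_actions, hidden=64):
        super().__init__()
        self.actor = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                   nn.Linear(hidden, n_actions))
        self.critic = nn.Sequential(nn.Linear(obs_dim, hidden), nn.ReLU(),
                                    nn.Linear(hidden, 1))

    def forward(self, obs):
        return self.actor(obs), self.critic(obs)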

Check out the project post & paper here

To get started (ideally inside a virtual environment, e.g. one created with conda):

  • run pip3 install -r requirements.txt
  • install SUMO
  • run pip3 install -e . in this directory

Training

Edit config.yml, then from the project root directory run, for example, python experiments/sb3_train.py.
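For reference, a rough sketch of what such a training run can look like using the single-agent interface and stable-baselines3 DQN. This is not the contents of sb3_train.py; file paths and hyperparameters are placeholders.

from stable_baselines3 import DQN
from sumo_rl import SumoEnvironment

env = SumoEnvironment(
    net_file="path/to/your_net.net.xml",        # placeholder paths
    route_file="path/to/your_routes.rou.xml",
    single_agent=True,
    use_gui=False,
    num_seconds=3600,
)

model = DQN("MlpPolicy", env, learning_rate=1e-3, verbose=1)
model.learn(total_timesteps=100_000)
model.save("dqn_traffic_signal")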


Project Status: Active – The project has reached a stable, usable state and is being actively developed.

SUMO-RL

SUMO-RL provides a simple interface to instantiate Reinforcement Learning environments with SUMO for Traffic Signal Control.

The main class SumoEnvironment behaves like a MultiAgentEnv from RLlib. If instantiated with the parameter single_agent=True, it behaves like a regular Gym Env from OpenAI. Call env or parallel_env for PettingZoo environment support. The TrafficSignal class is responsible for retrieving information and actuating traffic lights via the TraCI API.
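A minimal sketch of the single-agent usage described above (gym-style API; file paths are placeholders and the random action is only a stand-in for a real policy):

from sumo_rl import SumoEnvironment

env = SumoEnvironment(
    net_file="path/to/your_net.net.xml",
    route_file="path/to/your_routes.rou.xml",
    single_agent=True,
    use_gui=False,
    num_seconds=3600,
)

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()          # replace with your agent's policy
    obs, reward, done, info = env.step(action)
env.close()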

Goals of this repository:

  • Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO
  • Support Multiagent RL
  • Compatibility with gym.Env and popular RL libraries such as stable-baselines3 and RLlib
  • Easy customisation: state and reward definitions are easily modifiable

Install

Install the latest version of SUMO:

sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc

Don't forget to set the SUMO_HOME variable (the default SUMO installation path is /usr/share/sumo):

echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc

Important: for a huge performance boost (~8x) with Libsumo, you can declare the variable:

export LIBSUMO_AS_TRACI=1

Notice that you will not be able to run with sumo-gui or with multiple simulations in parallel if this is active (more details).

Install SUMO-RL

The stable release version is available through pip:

pip install sumo-rl

Alternatively, you can install the latest (unreleased) version from source:

git clone https://github.com/LucasAlegre/sumo-rl
cd sumo-rl
pip install -e .

MDP - Observations, Actions and Rewards

Observation

The default observation for each traffic signal agent is a vector:

    obs = [phase_one_hot, min_green, lane_1_density,...,lane_n_density, lane_1_queue,...,lane_n_queue]
  • phase_one_hot is a one-hot encoded vector indicating the current active green phase
  • min_green is a binary variable indicating whether min_green seconds have already passed in the current phase
  • lane_i_density is the number of vehicles in incoming lane i divided by the total capacity of the lane
  • lane_i_queue is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane

You can define your own observation by changing the compute_observation method of TrafficSignal.
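For example, a sketch of a reduced observation that keeps only the per-lane queues. The helper name get_lanes_queue is an assumption based on the default features listed above; adapt it (and the environment's observation space) to your version.

import numpy as np
from sumo_rl.environment.traffic_signal import TrafficSignal

def queue_only_observation(self):
    queue = self.get_lanes_queue()   # queued vehicles / lane capacity, per incoming lane
    return np.array(queue, dtype=np.float32)

TrafficSignal.compute_observation = queue_only_observation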

Actions

The action space is discrete. Every 'delta_time' seconds, each traffic signal agent can choose the next green phase configuration.

E.g.: in the 2-way single intersection there are |A| = 4 discrete actions, one per available green phase configuration.
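Using the single-agent environment from the sketch above, the action space can be inspected directly:

print(env.action_space)              # e.g. Discrete(4) for the 2-way single intersection
action = env.action_space.sample()   # stand-in for a learned policy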

Important: every time a phase change occurs, the next phase is preceded by a yellow phase lasting yellow_time seconds.

Rewards

The default reward function is the change in cumulative vehicle delay:
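In symbols (a sketch of the default, where D_t is the sum of the accumulated waiting times of all approaching vehicles at step t; the sign convention rewards a decrease in delay):

    r_t = D_{t-1} - D_t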

That is, the reward is how much the total delay (sum of the waiting times of all approaching vehicles) changed in relation to the previous time-step.

You can define your own reward function by changing the compute_reward method of TrafficSignal.
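For example, a sketch of a queue-based penalty (again, get_lanes_queue is an assumed helper; substitute whatever metric your TrafficSignal version exposes):

from sumo_rl.environment.traffic_signal import TrafficSignal

def queue_penalty_reward(self):
    return -sum(self.get_lanes_queue())   # fewer queued vehicles -> higher reward

TrafficSignal.compute_reward = queue_penalty_reward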

Examples

PettingZoo API

import sumo_rl

env = sumo_rl.env(net_file='sumo_net_file.net.xml',
                  route_file='sumo_route_file.rou.xml',
                  use_gui=True,
                  num_seconds=3600)
env.reset()
for agent in env.agent_iter():
    observation, reward, done, info = env.last()
    action = policy(observation) if not done else None  # done agents must receive None
    env.step(action)
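The parallel (simultaneous-action) interface mentioned above can be used similarly. A sketch with a random policy, following the PettingZoo parallel API of the same era as the example above (dict layouts may differ in newer releases):

import sumo_rl

env = sumo_rl.parallel_env(net_file='sumo_net_file.net.xml',
                           route_file='sumo_route_file.rou.xml',
                           use_gui=False,
                           num_seconds=3600)
observations = env.reset()
done = False
while not done:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, dones, infos = env.step(actions)
    done = all(dones.values())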

RESCO Benchmarks

In the folder nets/RESCO you can find the network and route files from RESCO (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. See their paper for results.

Experiments

Check experiments to see how to instantiate an environment and use it with your RL algorithm.

Q-learning in a one-way single intersection:

python3 experiments/ql_single-intersection.py

RLlib A3C multiagent in a 4x4 grid:

python3 experiments/a3c_4x4grid.py

stable-baselines3 DQN in a 2-way single intersection:

python3 experiments/dqn_2way-single-intersection.py

Plotting results:

python3 outputs/plot.py -f outputs/2way-single-intersection/a3c

Citation

If you use this repository in your research, please cite:

@misc{sumorl,
    author = {Lucas N. Alegre},
    title = {{SUMO-RL}},
    year = {2019},
    publisher = {GitHub},
    journal = {GitHub repository},
    howpublished = {\url{https://github.com/LucasAlegre/sumo-rl}},
}
