<img src="docs/_static/logo.png" align="right" width="30%"/>


SUMO-RL

<!-- start intro -->

SUMO-RL provides a simple interface to instantiate Reinforcement Learning (RL) environments with SUMO for Traffic Signal Control.

Goals of this repository:

  • Provide a simple interface to work with Reinforcement Learning for Traffic Signal Control using SUMO
  • Support Multiagent RL
  • Compatibility with gymnasium.Env and popular RL libraries such as stable-baselines3 and RLlib
  • Easy customisation: state and reward definitions are easily modifiable

The main class is SumoEnvironment. If instantiated with the parameter single_agent=True, it behaves like a regular Gymnasium Env. For multi-agent environments, use env or parallel_env to instantiate a PettingZoo environment with the AEC or Parallel API, respectively. TrafficSignal is responsible for retrieving information from, and actuating, the traffic lights via the TraCI API.
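
For instance, a single-agent environment can be instantiated directly from SumoEnvironment (a minimal sketch; the file paths are placeholders, and the keyword names follow the constructor described above):

```python
from sumo_rl import SumoEnvironment

# Single-agent setup; the net/route file paths are placeholders.
env = SumoEnvironment(
    net_file="path_to_your_network.net.xml",
    route_file="path_to_your_routefile.rou.xml",
    single_agent=True,  # behaves like a regular gymnasium.Env
    use_gui=False,
    num_seconds=3600,
)
obs, info = env.reset()
```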

For more details, check the documentation online.

<!-- end intro -->

Install

<!-- start install -->

Install the latest version of SUMO:

```bash
sudo add-apt-repository ppa:sumo/stable
sudo apt-get update
sudo apt-get install sumo sumo-tools sumo-doc
```

Don't forget to set the SUMO_HOME environment variable (the default SUMO installation path is /usr/share/sumo):

```bash
echo 'export SUMO_HOME="/usr/share/sumo"' >> ~/.bashrc
source ~/.bashrc
```
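
In Python scripts, a common pattern from the SUMO documentation is to make SUMO's bundled tools importable and fail early when SUMO_HOME is missing (a minimal sketch; SUMO-RL itself does not require the tools path):

```python
import os
import sys

# Make SUMO's bundled Python tools importable; abort early if SUMO_HOME is unset.
if "SUMO_HOME" in os.environ:
    sys.path.append(os.path.join(os.environ["SUMO_HOME"], "tools"))
else:
    sys.exit("Please declare the environment variable 'SUMO_HOME'")
```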

Important: for a huge performance boost (~8x) with Libsumo, you can declare the variable:

```bash
export LIBSUMO_AS_TRACI=1
```

Note that you will not be able to run with sumo-gui or with multiple simulations in parallel if this is active.
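
If you prefer enabling Libsumo from Python, the variable must be set before sumo-rl/traci is imported, since the backend is chosen at import time (a sketch, assuming your SUMO build ships libsumo):

```python
import os

# Must be set before importing sumo_rl / traci, or the regular TraCI backend is used.
os.environ["LIBSUMO_AS_TRACI"] = "1"

import sumo_rl  # noqa: E402  (deliberately imported after setting the variable)
```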

Install SUMO-RL

The stable release version is available through pip:

```bash
pip install sumo-rl
```

Alternatively, you can install the latest (unreleased) version from source:

```bash
git clone https://github.com/LucasAlegre/sumo-rl
cd sumo-rl
pip install -e .
```
<!-- end install -->

MDP - Observations, Actions and Rewards

Observation

<!-- start observation -->

The default observation for each traffic signal agent is a vector:

```python
obs = [phase_one_hot, min_green, lane_1_density, ..., lane_n_density, lane_1_queue, ..., lane_n_queue]
```

  • phase_one_hot is a one-hot encoded vector indicating the current active green phase
  • min_green is a binary variable indicating whether min_green seconds have already passed in the current phase
  • lane_i_density is the number of vehicles in incoming lane i divided by the total capacity of the lane
  • lane_i_queue is the number of queued (speed below 0.1 m/s) vehicles in incoming lane i divided by the total capacity of the lane

You can define your own observation by implementing a class that inherits from ObservationFunction and passing it to the environment constructor.
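
For example, an observation that exposes only the per-lane queues could look like the sketch below (the import path and the TrafficSignal helper names are taken from the sumo-rl source tree; double-check them against the version you installed):

```python
import numpy as np
from gymnasium import spaces

from sumo_rl.environment.observations import ObservationFunction


class QueueObservationFunction(ObservationFunction):
    """Hypothetical observation: only the normalized queue of each incoming lane."""

    def __call__(self):
        # TrafficSignal.get_lanes_queue() returns one value in [0, 1] per incoming lane.
        return np.array(self.ts.get_lanes_queue(), dtype=np.float32)

    def observation_space(self):
        n_lanes = len(self.ts.lanes)
        return spaces.Box(low=np.zeros(n_lanes, dtype=np.float32),
                          high=np.ones(n_lanes, dtype=np.float32))


# Usage: env = SumoEnvironment(..., observation_class=QueueObservationFunction)
```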

<!-- end observation -->

Action

<!-- start action -->

The action space is discrete. Every delta_time seconds, each traffic signal agent chooses the next green phase configuration.

E.g.: In the 2-way single intersection there are |A| = 4 discrete actions, corresponding to the following green phase configurations:

<p align="center"> <img src="docs/_static/actions.png" align="center" width="75%"/> </p>

Important: every time a phase change occurs, the next phase is preceded by a yellow phase lasting yellow_time seconds.
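
These timings are controlled by constructor parameters of SumoEnvironment; a sketch with the default values as found in the sumo-rl source (verify them for your installed version):

```python
from sumo_rl import SumoEnvironment

# Timing-related parameters (defaults as in the sumo-rl source).
env = SumoEnvironment(
    net_file="path_to_your_network.net.xml",
    route_file="path_to_your_routefile.rou.xml",
    delta_time=5,    # seconds between two consecutive actions of an agent
    yellow_time=2,   # duration of the yellow phase inserted on every phase change
    min_green=5,     # minimum green time of a phase
    max_green=50,    # maximum green time of a phase
)
```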

<!-- end action -->

Rewards

<!-- start reward -->

The default reward function is the change in cumulative vehicle delay:

<p align="center"> <img src="docs/_static/reward.png" align="center" width="25%"/> </p>

That is, the reward is how much the total delay (sum of the waiting times of all approaching vehicles) changed in relation to the previous time-step.
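
In symbols (one way to write the figure above): r_t = D_{t-1} - D_t, where D_t is the sum of the accumulated waiting times of all approaching vehicles at step t, so the reward is positive when the total delay decreases.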

You can choose a different reward function (see the ones implemented in TrafficSignal) with the parameter reward_fn in the SumoEnvironment constructor.

It is also possible to implement your own reward function:

```python
def my_reward_fn(traffic_signal):
    return traffic_signal.get_average_speed()

env = SumoEnvironment(..., reward_fn=my_reward_fn)
```
<!-- end reward -->

APIs (Gymnasium and PettingZoo)

Gymnasium Single-Agent API

<!-- start gymnasium -->

If your network only has ONE traffic light, then you can instantiate a standard Gymnasium env (see Gymnasium API):

```python
import gymnasium as gym
import sumo_rl

env = gym.make('sumo-rl-v0',
               net_file='path_to_your_network.net.xml',
               route_file='path_to_your_routefile.rou.xml',
               out_csv_name='path_to_output.csv',
               use_gui=True,
               num_seconds=100000)
obs, info = env.reset()
done = False
while not done:
    next_obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    done = terminated or truncated
```
<!-- end gymnasium -->

PettingZoo Multi-Agent API

<!-- start pettingzoo -->

For multi-agent environments, you can use the PettingZoo API (see Petting Zoo API):

```python
import sumo_rl

env = sumo_rl.parallel_env(net_file='nets/RESCO/grid4x4/grid4x4.net.xml',
                           route_file='nets/RESCO/grid4x4/grid4x4_1.rou.xml',
                           use_gui=True,
                           num_seconds=3600)
observations = env.reset()
while env.agents:
    # this is where you would insert your policy
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
```
<!-- end pettingzoo -->

RESCO Benchmarks

In the folder nets/RESCO you can find the network and route files from RESCO (Reinforcement Learning Benchmarks for Traffic Signal Control), which was built on top of SUMO-RL. See their paper for results.

<p align="center"> <img src="sumo_rl/nets/RESCO/maps.png" align="center" width="60%"/> </p>

Experiments

Check experiments for examples on how to instantiate an environment and train your RL agent.

Q-learning in a one-way single intersection:

```bash
python experiments/ql_single-intersection.py
```

RLlib PPO multiagent in a 4x4 grid:

```bash
python experiments/ppo_4x4grid.py
```

stable-baselines3 DQN in a 2-way single intersection:

Note: you need to install stable-baselines3 with pip install "stable_baselines3[extra]>=2.0.0a9" for Gymnasium compatibility.

```bash
python experiments/dqn_2way-single-intersection.py
```

Plotting results:

```bash
python outputs/plot.py -f outputs/4x4grid/ppo_conn0_ep2
```

<p align="center"> <img src="outputs/result.png" align="center" width="50%"/> </p>

Citing

<!-- start citation -->

If you use this repository in your research, please cite:

```bibtex
@misc{sumorl,
  author = {Lucas N. Alegre},
  title = {{SUMO-RL}},
  year = {2019},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/LucasAlegre/sumo-rl}},
}
```
<!-- end citation -->

<!-- start list of publications -->

List of publications that use SUMO-RL (please open a pull request to add missing entries):

<!-- end list of publications -->
