Thanks to visit codestin.com
Credit goes to github.com

Skip to content

spateria/ISEMO

Repository files navigation

ISEMO: Multi-agent Reinforcement Learning in Spatial Domain Tasks using Inter Subtask Empowerment Rewards.

This is the official code for the following paper published in IEEE Symposium Series on Computational Intelligence (SSCI), 2019: Multi-agent Reinforcement Learning in Spatial Domain Tasks using Inter Subtask Empowerment Rewards

In the complex multi-agent tasks, various agents must cooperate to distribute relevant subtasks among each other to achieve joint task objectives. An agent's choice of the relevant subtask changes over time with the changes in the task environment state. Multi-agent Hierarchical Reinforcement Learning (MAHRL) provides an approach for learning to select the subtasks in response to the environment states, by using the joint task rewards to train various agents. When the joint task involves complex inter-agent dependencies, only a subset of agents might be capable of reaching the rewarding task states while other agents take precursory or intermediate roles. The delayed task reward might not be sufficient in such tasks to learn the coordinating policies for various agents. In this paper, we introduce a novel approach of MAHRL called Inter-Subtask Empowerment based Multi-agent Options (ISEMO) in which an Inter-Subtask Empowerment Reward (ISER) is given to an agent which enables the precondition(s) of other agents' subtasks. ISER is given in addition to the domain task reward in order to improve the inter-agent coordination. ISEMO also incorporates options model that can learn parameterized subtask termination functions and relax the limitations posed by hand-crafted termination conditions. Experiments in a spatial Search and Rescue domain show that ISEMO can learn the subtask selection policies of various agents grounded in the inter-dependencies among the agents, as well as learn the subtask termination conditions, and perform better than the standard MAHRL technique.

Baseline method is Cooperative HRL (CoHRL)

Ghavamzadeh, Mohammad, Sridhar Mahadevan, and Rajbala Makar. "Hierarchical multi-agent reinforcement learning." Autonomous Agents and Multi-Agent Systems 13.2 (2006): 197-229.

This code includes Python implementation of CoHRL.

Dependencies

Python >= 3.5.0

scikit-learn == 0.19.1

scipy == 1.0.0

opencv-python == 4.1.1.26

Setup before training

Before training, it is required to make the World objects. A World object contains attributes and configuration of the simulated Search & Rescue environment in which the multi-agent team is trained.
To make the World objects, give the following command:

python main.py --make

The result will be saved in files named as MA-World-{i}.pl, where {i} ranges from 0 to nruns-1. Here, nruns is defined in the args class in main.py. In each run, the configuration of the World changes (such as locations and/or numbers of certain objects)

Training mode

To run the software in the training mode, give the following command:

python main.py

By default, this runs ISEMO. To run CoHRL instead, give the following command:

python main.py −−runCoHRL

During training, data is saved in files with the names as: historyISEMO_testingFalse_.npy when using ISEMO and historyCoHRL_testingFalse_.npy. You can check the list of recorded data items in ISEMO.py (refer to the multi-dimensional array named history).

The learned models for the Q-value functions and the termination functions are saved in the models folder.

Testing mode

To run the software in thetesting mode, give the following command:

python main.py −−testing −−testID {i}

Here, testID {i} is the index of the saved World object (MA-World-{i}.pl) to be used for testing.

Further Details

Please refer to ISEMO_SW_Doc.pdf for more details.

About

Multi-agent Reinforcement Learning in Spatial Domain Tasks using Inter Subtask Empowerment Rewards (IEEE SSCI 2019)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages