About the Competition
Welcome to the inaugural MOASEI Competition, bringing together students, researchers, and professionals from around the world interested in addressing the challenges of Open Agent Systems (OASYS)! MOASEI is the first event to provide dedicated benchmarks for evaluating the effectiveness of Multiagent Reinforcement Learning (MARL) in OASYS. What makes MOASEI unique is its focus on OASYS, which are characterized by agents and/or tasks that can join or leave the system at any time. The competition features a series of benchmark domains, organized into tracks, that capture the key challenges of OASYS, such as robustness and adaptability. Participants will submit their MARL checkpoints, which will be evaluated on their performance across one or more of these tracks. The competition will be held leading up to and culminating at the AAMAS conference, providing a unique opportunity for participants to showcase their work to the broader multiagent systems community. We look forward to your participation and hope to see you at the competition!
Open Systems Background
Making correct decisions to accomplish tasks and goals is often challenging for AI in real-world applications due to the uncertainty caused by incomplete information and nondeterministic environment behavior. For example, when the set of agents changes over time (due to agent openness), reasoning becomes more complex than in a closed environment because an agent must not only predict what actions other agents will perform, but also which other agents are even present to take actions at all. Likewise, when the set of tasks changes dynamically over time (due to task openness), agents are less certain that the actions they take to complete existing tasks will remain optimal in the long run as new tasks are introduced or existing tasks disappear.
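To make this concrete, here is a minimal, purely illustrative Python sketch (not tied to any competition API; all names are made up) of a decision loop in which the set of active agents and open tasks changes between steps, so a policy cannot assume a fixed roster of teammates or a fixed task list.

```python
import random

# Illustrative only: the names and structure below are assumptions, not the competition API.
active_agents = {"agent_0", "agent_1", "agent_2"}   # agents currently in the system
open_tasks = {"task_a", "task_b"}                    # tasks currently available

def my_policy(agents_present, tasks_present):
    """A policy in an open system must condition on who and what is present right now."""
    if not tasks_present:
        return "wait"
    # e.g., choose a task, knowing the number of available teammates may have changed
    return f"work_on:{sorted(tasks_present)[0]}"

for step in range(5):
    # Agent openness: teammates may leave (e.g., run out of resources) at any time.
    if len(active_agents) > 1 and random.random() < 0.3:
        active_agents.discard(random.choice(sorted(active_agents - {"agent_0"})))
    # Task openness: new tasks may appear and existing tasks may disappear.
    if random.random() < 0.5:
        open_tasks.add(f"task_{step}")
    if open_tasks and random.random() < 0.2:
        open_tasks.discard(random.choice(sorted(open_tasks)))

    action = my_policy(active_agents, open_tasks)
    print(step, sorted(active_agents), sorted(open_tasks), action)
```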
Track #1: Cybersecurity (Agent Openness Only)
Two teams of multiple agents (attackers vs. defenders) compete to either infiltrate or protect a network infrastructure. Attacker agents frequently disappear to avoid detection, and defender agents (controlled by participants) can be taken offline as the equipment they use is disrupted by network infection.
Cybersecurity Configurations
3v3 agents with 4 subnetworks.
Configurations: Linear, Star, or Fully Connected network.
Seeds: Used only for stochastic transitions.
CS1: Fully connected topology.
CS2: Star connected topology.
CS3: Linear connected topology.
Track #2: Rideshare (Task Openness Only)
Agents operating autonomous cars within a ridesharing application decide how to prioritize dynamically appearing passengers as tasks.
Rideshare Configurations
Agents always start in the same position.
Configurations: Different numbers of passengers entering the system (with a different number guaranteed at the start).
Seeds: Different fares, starts, destinations, and entry times.
DR1: 11 initial passengers, with 1 additional appearing (stochastic).
DR2: 6 initial passengers, 6 additional appear at various times (stochastic).
DR3: 1 initial passenger, 11 new passengers appear at various times (stochastic).
Track #3: Wildfire (Both Agent and Task Openness)
Agents decide how to use limited suppressant resources to collaboratively put out wildfire tasks that appear both spontaneously and due to realistic fire-spread mechanics. Agents must temporarily disengage when they run out of limited suppressant to recharge before rejoining the firefighting efforts.
Wildfire Configurations
Agents always start in the same position.
Configurations: Varying initial agent presence, fire counts/locations, and fire-spread probability.
Seeds: Used only for stochastic transitions.
WS1: All agents present; two medium fires; slow new fire creation.
WS2: All agents present; two medium fires; fast new fire creation.
WS3: Only 2 agents present; four medium fires; fast new fire creation.
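As a rough illustration of what these scenario parameters cover, the following sketch models a wildfire configuration as a plain Python dataclass; the field names and structure are assumptions for exposition and do not reflect the actual Free-Range Zoo configuration schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class WildfireScenario:
    """Illustrative only; not the actual Free-Range Zoo configuration schema."""
    name: str
    all_agents_present: bool        # whether every agent starts in the environment
    agents_present: Optional[int]   # explicit agent count when not all are present
    initial_fires: int              # number of medium-intensity fires at the start
    fast_fire_spawning: bool        # whether new fires are created quickly
    seed: int = 0                   # seeds are used only for stochastic transitions

# Rough encodings of the three shared wildfire training scenarios listed above.
WS1 = WildfireScenario("WS1", all_agents_present=True, agents_present=None, initial_fires=2, fast_fire_spawning=False)
WS2 = WildfireScenario("WS2", all_agents_present=True, agents_present=None, initial_fires=2, fast_fire_spawning=True)
WS3 = WildfireScenario("WS3", all_agents_present=False, agents_present=2, initial_fires=4, fast_fire_spawning=True)
```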
Evaluation
Solutions will be evaluated both on the initial set of shared scenarios available for training and on a novel set of 2-3 configurations that will not be shared (nor will the random seeds used) before the conclusion of the competition. Solutions that fail to complete will be evaluated with respect to the tasks they complete before failing.

Domain-independent MARL Performance Measures: All solutions in all tracks will be evaluated according to the cumulative rewards earned by agents making decisions while operating in test scenarios of the domain using the submitted MARL solution. By using the same performance measures across tracks/domains, we can also try to establish general trends that could generalize to other OASYS domains and characterize the solutions submitted by competitors.

Domain-specific Task Performance Measures: Ultimately, the success of MAS solutions for real-world applications, and our ability as researchers to convince the general public of the usefulness of AI, is measured by the ability of agents to accomplish domain-specific tasks. Thus, we will also measure the performance of solutions based on task-specific measures. In the Cybersecurity Defense domain, these include the amount of time that different resources in the network remained free from infiltration by attackers, as well as the average amount of infiltration of each resource. In the Ridesharing domain, these include the number of passengers successfully transported to their destination, the average wait time before passengers were picked up, and the average time a passenger rode in a car before arriving at their destination. In the Wildfire Suppression domain, these include the total number of fires extinguished vs. burned out, the average duration of each fire, and the efficiency of limited suppressant usage by agents.

Determining winners: The winner of each track will be determined based on a comprehensive evaluation of each of the above performance measures in each of the scenarios within the track. In particular, for each performance measure in each scenario, the highest scoring submission will receive n points (where n is the number of submissions to that track), the second highest scoring will receive n-1 points, and so on. The points across all performance measures and scenarios will be summed, and the highest scoring team in each track will be the winner of that track.
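To make the point-assignment rule concrete, here is a small Python sketch of the ranking scheme described above; the data layout (a nested dict of scores) and the example values are assumptions for illustration, not the organizers' actual evaluation tooling.

```python
from collections import defaultdict

def track_points(scores):
    """Compute total points per team for one track.

    `scores[scenario][measure][team]` is that team's score on one performance
    measure in one scenario (higher is assumed better here; measures where lower
    is better would be negated first). The best team receives n points, the
    second best n-1, and so on, where n is the number of submissions.
    """
    totals = defaultdict(int)
    for scenario, measures in scores.items():
        for measure, team_scores in measures.items():
            n = len(team_scores)
            ranked = sorted(team_scores, key=team_scores.get, reverse=True)
            for rank, team in enumerate(ranked):
                totals[team] += n - rank   # n points for 1st, n-1 for 2nd, ...
    return dict(totals)

# Hypothetical example with two teams and one scenario:
example = {"WS1": {"cumulative_reward": {"team_a": 12.0, "team_b": 9.5},
                   "fires_extinguished": {"team_a": 3, "team_b": 5}}}
print(track_points(example))   # {'team_a': 3, 'team_b': 3}
```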
Registration Guidelines
Participants must register their teams using the following form: Register Here. After registration, the MOASEI committee will confirm your participation and provide directions for your submission.
Rules
To ensure a fair and competitive environment, we have established the following rules:
- Both individuals and teams are allowed to register for the competition and may submit solutions to one, two, or all three tracks of their choice.
- Participants must register for the competition.
- Submitted solutions are expected to run without errors on our computing platform before the submission deadline. Participants will have access to the simulators in advance to thoroughly test that their solutions operate without error.
- We will provide a shared set of 2-3 scenarios within each track (e.g., different configurations of attacker and defender agents in the Cybersecurity Defense track, different numbers and challenges of fires in the Wildfire Suppression track) that participants will use for developing and training their solutions. Participants will be able to randomly generate episodes for each scenario in each track using different random seeds (or repeat the same episode using the same seed).
- We encourage MARL solutions within the competition; however, solutions utilizing other decision-making paradigms (heuristics, planning, game theory) are permissible to ensure the broadest participation and reflect the full range of decision making possible in OASYS. At the same time, our simulation environments provide information most compatible with MARL solutions, such as observations of current states and rewards after taking each action. The APIs of the simulation environments will not provide access to the models of the underlying state transition, observation, and reward dynamics.
- Communication between agents will not be supported nor permitted.
- Solution submissions will consist of both (1) the source code needed to operate an agent within the simulators of the three domains (following a provided API structure for standardization), and (2) any serialized data files needed to operate the solutions (e.g., serialized neural networks for deep MARL solutions); a minimal illustrative sketch of such a submission follows these rules. We will impose a reasonable storage size limit on solutions (e.g., 10GB) for the sake of competition management. Trained models must be submitted; we will not have the computational capacity to retrain models that require extensive computation time for training.
- Timing of decision making will be enforced so that agents do not have unlimited time to make each decision; a reasonable amount of time will be allowed per decision, reflecting real-world operations.
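As a rough sketch of what a submitted solution might look like, the following Python skeleton wraps a trained PyTorch network behind a simple per-agent act() interface; the class and method names are assumptions for illustration only, since the official submission API structure will be provided to registered participants.

```python
import torch

class SubmittedPolicy:
    """Illustrative skeleton of a submission; the real API structure will be provided."""

    def __init__(self, checkpoint_path: str, device: str = "cpu"):
        # Load the serialized model that accompanies the source code in the submission.
        self.device = torch.device(device)
        self.model = torch.load(checkpoint_path, map_location=self.device)
        self.model.eval()

    @torch.no_grad()
    def act(self, observation: torch.Tensor) -> int:
        # Decisions must be fast: each action is subject to a per-decision time limit.
        logits = self.model(observation.to(self.device).unsqueeze(0))
        return int(logits.argmax(dim=-1).item())
```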
Simulation Platform
The competition will be conducted within the Python programming language. We will provide simulation environments for the three tracks, implemented within our Free-Range Zoo MARL testbed (an OASYS variant of the popular PettingZoo MARL testbed) with a consistent API for all environments; documentation can be found here. Solutions can utilize popular frameworks for implementing reinforcement learning solutions, including the TensorFlow and PyTorch libraries for deep learning. In addition to user-supplied code, we will permit any external libraries, so long as they can be downloaded from PyPI with pip using a requirements.txt file. Solutions submitted by participants will be evaluated on research infrastructure hosted by the organizers' institutions. This includes servers with up to 32 cores, 512GB of RAM, and an NVIDIA RTX 6000 Ada GPU (with 48GB of VRAM).
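Because Free-Range Zoo is a variant of PettingZoo, an agent's interaction loop will likely resemble the standard PettingZoo AEC pattern. The sketch below uses that pattern with a hypothetical environment constructor (make_env) and a random policy as stand-ins; the exact constructor and environment names should be taken from the Free-Range Zoo documentation rather than from this example.

```python
# Illustrative PettingZoo-style interaction loop; `make_env` is a hypothetical
# stand-in for the actual Free-Range Zoo environment constructors.
def evaluate_random_policy(make_env, seed: int = 0) -> float:
    env = make_env()
    env.reset(seed=seed)            # seeds control only the stochastic transitions
    total_reward = 0.0
    for agent in env.agent_iter():  # agents may enter/leave as the episode unfolds
        observation, reward, termination, truncation, info = env.last()
        total_reward += reward
        if termination or truncation:
            action = None           # PettingZoo convention for finished agents
        else:
            action = env.action_space(agent).sample()  # replace with a trained policy
        env.step(action)
    env.close()
    return total_reward
```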
Timeline
- February 20th: Software environments will be available for download by registered participants
- March 14th: Online "office hours" (2-3pm EST) to answer questions from participants before the final registration deadline
- April 1st: Deadline to register for the competition
- April 4th: Online "office hours" (2-3pm EST) to answer questions from participants before the solution submission deadline
- April 11th: Deadline for solution submission (both code and trained models)
- May 7th: Notification of finalists
- May 20th: In-person conclusion of the competition at AAMAS 2025
Organizers
- Adam Eck is the David H. and Margaret W. Barker Associate Professor of Computer Science at Oberlin College where he leads the Social Intelligence Lab. Adam's research interests include decision making for intelligent agents and multiagent systems in complex environments, as well as interdisciplinary applications of artificial intelligence and machine learning in public health and computational social science.
- Leen-Kiat Soh is the Charles Bessey Professor with the School of Computing at the University of Nebraska, Lincoln, NE, where he leads the Intelligent Agents and Multiagent Systems (IAMAS) group. Leen-Kiat's research interests include multiagent team formation, learning, and modeling; Computer Science education; image processing; and intelligent data analytics.
- Prashant Doshi is a Professor of Computer Science with the School of Computing at the University of Georgia and directs the THINC Lab. Prashant's research interests lie in AI and robotics where his research has spanned two decades. In AI, his research has contributed computational algorithms and frameworks toward automated planning and learning in multiagent systems under uncertainty, with a recent emphasis on open agent systems.
Contact
For any queries or further information, please reach out to us at:
Email: Tbillings4@huskers.unl.edu
Acknowledgements
This research was supported by collaborative NSF Grants #IIS-2312657 (to P.D.), #IIS-2312658 (to L.K.S.), and #IIS-2312659 (to A.E.). Additionally, this work was completed utilizing the Holland Computing Center of the University of Nebraska, which receives support from the UNL Office of Research and Economic Development and the Nebraska Research Initiative. Finally, we thank the graduate and undergraduate students who have contributed to the development and testing of MOASEI: Ceferino Patino, Daniel Redder, Alireza Saleh Abadi, and Tyler Billings.