Pufferquaticus — APM_5DA01_TP

This is a fork of PufferLib for the class APM_5DA01_TP on multiagent systems.

This repository includes puffer_ctf, a custom C version of a Capture-the-Flag environment for autonomous marine vehicles inspired by pyquaticus. Two teams (blue and red) compete to grab the opposing team's flag and return it to their base, while tagging opponents who venture into their territory.
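The rules above (each team defends its half, tags opponents who cross into it, and tries to grab the opposing flag) can be sketched in plain Python. This is an illustrative model only; the names (`Agent`, `can_tag`, `FIELD_WIDTH`) and the field geometry are assumptions, not the actual C implementation in puffer_ctf:

```python
from dataclasses import dataclass

FIELD_WIDTH = 160.0          # assumed field size for illustration
HALF = FIELD_WIDTH / 2       # blue defends x < HALF, red defends x >= HALF

@dataclass
class Agent:
    team: str   # "blue" or "red"
    x: float
    y: float

def in_own_territory(agent: Agent) -> bool:
    """Blue territory is the left half of the field, red the right half."""
    return agent.x < HALF if agent.team == "blue" else agent.x >= HALF

def can_tag(tagger: Agent, target: Agent, radius: float = 5.0) -> bool:
    """A tagger may tag a nearby opponent who has crossed into the tagger's half."""
    if tagger.team == target.team:
        return False
    close = (tagger.x - target.x) ** 2 + (tagger.y - target.y) ** 2 <= radius ** 2
    return close and in_own_territory(tagger) and not in_own_territory(target)
```

The real environment implements this logic in C for speed; the sketch is only meant to pin down the game rules.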

Setup

uv venv
uv pip install -e .
source .venv/bin/activate
python setup.py build_ext --inplace --force

Test and play the environment manually with keyboard controls (press space to toggle human override):

puffer eval puffer_ctf --train.device cpu

If rendering fails on your machine, you can force software rendering by setting LIBGL_ALWAYS_SOFTWARE=1 when running eval:

LIBGL_ALWAYS_SOFTWARE=1 puffer eval puffer_ctf --train.device cpu

Training and evaluation

puffer train puffer_ctf --train.device cpu --wandb

The reference throughput on a laptop is around 80K SPS (steps per second).

Evaluation:

puffer eval puffer_ctf --train.device cpu --load-model-path latest
puffer eval puffer_ctf --train.device cpu --wandb --load-id <wandb_run_id>

Project

The project is open-ended: extend the environment and/or the learning algorithm and evaluate the impact. It can be based on this repo or on a fork of the original pyquaticus.

Ideas

  • Use a different algorithm: try MAPPO, IPPO, or other MARL algorithms.
  • Modify the observation space: add or remove features and study the effect on learned behavior.
  • Modify the reward function: shape rewards to encourage different emergent behaviors.
  • Add role specialization: assign each agent a fixed role (attacker / defender) and train specialized policies.
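As a concrete starting point for the reward-shaping idea, potential-based shaping rewards progress toward the opposing flag without changing the optimal policy. Everything here is a hypothetical sketch, not the puffer_ctf API; `shaped_reward` and its arguments are illustrative names:

```python
import math

def shaped_reward(base_reward, prev_pos, pos, flag_pos, gamma=0.99, scale=0.1):
    """Potential-based shaping: F = gamma * phi(s') - phi(s), with
    phi(s) = -scale * distance(agent, opposing flag).

    Because the shaping term is a potential difference, it preserves
    the optimal policy (Ng, Harada & Russell, 1999)."""
    phi_prev = -scale * math.dist(prev_pos, flag_pos)   # potential before the step
    phi_now = -scale * math.dist(pos, flag_pos)         # potential after the step
    return base_reward + gamma * phi_now - phi_prev
```

Moving toward the flag yields a positive shaping bonus, moving away a negative one, so agents get a dense learning signal even before their first capture.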

Working in this repo

Tips

  • If you encounter errors on your laptop or want to run larger experiments, see the instructions for remote access to the school's computing cluster.
  • If your machine has a GPU, you can speed up training by using it (e.g., --train.device cuda instead of --train.device cpu).
  • Start with small changes and test them out in a quick training run (e.g., puffer train puffer_ctf --train.device cpu --train.total-timesteps 25_000_000) to verify they work as expected.
  • Use WandB to track experiments and compare results across different configurations.
  • You can debug functions in Python (e.g., reward or observation computation) by adding breakpoint() where needed (e.g., step() in pufferlib/ocean/ctf/ctf.py). This will drop you into an interactive pdb session. Note that this can be tricky when running with multiple workers, so you may want to test with a single environment (see comment in config/ocean/ctf.ini).
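When debugging with breakpoint(), an unconditional breakpoint inside step() fires on every call, which is unusable in a long rollout. A common pattern is to guard it with the condition you care about. The function and threshold below are illustrative, not the actual ctf.py code:

```python
def compute_reward(agent_state):
    """Stand-in for a reward computation you might want to inspect."""
    reward = agent_state.get("flag_captures", 0) * 1.0
    # Drop into pdb only when something suspicious happens, so the
    # rollout doesn't stop on every single step.
    if reward > 100:  # implausible value chosen for illustration
        breakpoint()
    return reward
```

You can also disable all breakpoints at once by running with PYTHONBREAKPOINT=0, which is handy when switching back to full-speed training.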
