Note: set `LIBGL_ALWAYS_SOFTWARE=1` when running eval (forces software rendering).
This is a fork of PufferLib for the class APM_5DA01_TP on multiagent systems.
This repository includes puffer_ctf, a custom C version of a Capture-the-Flag environment for autonomous marine vehicles inspired by pyquaticus. Two teams (blue and red) compete to grab the opposing team's flag and return it to their base, while tagging opponents who venture into their territory.
Installation:

```sh
uv venv
uv pip install -e .
source .venv/bin/activate
python setup.py build_ext --inplace --force
```

Test and play the environment manually with keyboard controls (press space to toggle human override):

```sh
puffer eval puffer_ctf --train.device cpu
```

Training:

```sh
puffer train puffer_ctf --train.device cpu --wandb
```

The reference throughput on a laptop is around ~80K SPS (steps per second).
Evaluation:

```sh
puffer eval puffer_ctf --train.device cpu --load-model-path latest
puffer eval puffer_ctf --train.device cpu --wandb --load-id <wandb_run_id>
```

The project is open-ended: extend the environment and/or the learning algorithm and evaluate the impact. It can be based on this repo or on pyquaticus, a fork of the original pyquaticus.
- Use a different algorithm: try MAPPO, IPPO, or other MARL algorithms.
- Modify the observation space: add or remove features and study the effect on learned behavior.
- Modify the reward function: shape rewards to encourage different emergent behaviors.
- Add role specialization: assign each agent a fixed role (attacker / defender) and train specialized policies.
- Algorithm: you can tweak PufferLib's core training algorithm in `pufferlib/pufferl.py` (see the TODOs).
- Observation space / environment changes: modify `pufferlib/ocean/ctf/ctf.h` (in C!), for instance `compute_obs()` to change the local observation computation.
- Rewards: edit `compute_rewards` in `pufferlib/ocean/ctf/ctf.h`.
- Role specialization: augment the observation space with a one-hot encoding of the role in `pufferlib/ocean/ctf/ctf.h`.
- If you encounter errors on your laptop or want to run larger experiments, check this for remote access to the school's computing cluster.
- If your machine has a GPU, you can speed up training by using it (e.g., `--train.device cuda` instead of `--train.device cpu`).
- Start with small changes and test them out in a quick training run (e.g., `puffer train puffer_ctf --train.device cpu --train.total-timesteps 25_000_000`) to verify they work as expected.
- Use WandB to track experiments and compare results across different configurations.
- You can debug functions in Python (e.g., reward or observation computation) by adding `breakpoint()` where needed (e.g., in `step()` in `pufferlib/ocean/ctf/ctf.py`). This will drop you into an interactive pdb session. Note that this can be tricky when running with multiple workers, so you may want to test with a single environment (see the comment in `config/ocean/ctf.ini`).