feat: multi-gpu inference, trajectory analyzer #2
base: master
Conversation
BjarniHaukur left a comment:
Looks good.
I had to make some ad-hoc changes to get the infer script running (and to run `vllm serve` myself). Probably an artifact I can fix on my end, possibly something with the pyproject.
I also verified whether the performance degradation I had observed with increased parallelism was still present. Excitingly, it does not look like it is: the 8-concurrent and 32-concurrent runs got the same score.
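As a minimal illustration of the knob being compared (client-side concurrency, not the repo's actual infer script), `xargs -P` runs a fixed pool of workers over a larger job list:

```shell
# Illustrative only: 32 "requests", at most 8 in flight at once,
# analogous to the 8-concurrent vs 32-concurrent comparison above.
N=$(seq 1 32 | xargs -P 8 -I{} echo "request {}" | wc -l)
N=$((N))   # normalize wc output (may be space-padded)
echo "completed $N requests"
```

All 32 jobs complete regardless of the pool size; only throughput should differ, which is why equal scores at both settings suggest the degradation is gone.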
🏆 Current Leaderboard
Performance on SWE-bench Lite subset, ranked by code similarity
| # | Ver | Model | Code Sim | Test Sim | Tokens | Tools |
|---|---|---|---|---|---|---|
| 11 | v5.0.1 | qwen3-32b | 0.276 | 0.000 | 4,409 / 16,384 | 23.2 / 100 |
| 12 | v5.0.1 | qwen3-32b | 0.273 | 0.005 | 5,514 / 16,384 | 32.1 / 100 |
| 13 | v3.2.0 | qwen-2.5-72b-instruct | 0.272 | 0.000 | 5,873 / 16,384 | 35.1 / 100 |
| 14 | v3.2.0 | qwen3-32b | 0.255 | 0.000 | 5,281 / 16,384 | 28.3 / 100 |
| 15 | v3.2.0 | llama-4-maverick | 0.255 | 0.000 | 4,647 / 16,384 | 10.4 / 100 |
```shell
#SBATCH --array=0

set -euo pipefail
```
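For context, a hypothetical sketch of how `--array` and the strict-mode line pair up in a batch script (the real script is `benchmarks/swe_bench/run_harness_eval.sh`; the array range and GPU mapping below are illustrative, not the repo's):

```shell
#!/bin/bash
#SBATCH --array=0-7                    # illustrative: one array task per shard
set -euo pipefail                      # fail fast on errors and unset variables
TASK_ID="${SLURM_ARRAY_TASK_ID:-0}"    # default lets the script run outside SLURM
GPU=$((TASK_ID % 8))                   # map array task to a GPU index
echo "array task ${TASK_ID} -> GPU ${GPU}"
```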
```shell
# === NSC Cluster Setup ===
# Load GCC build environment for Triton JIT compilation
module load buildenv-gcccuda/12.1.1-gcc12.3.0
# Add Python 3.11 headers (required for Triton to compile cuda_utils)
# These headers were extracted from Python source since python3.11-devel is not installed
export CPATH="$HOME/.local/include/python3.11:${CPATH:-}"
```
I had to add something like this to get it working. Obviously a hack.
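One detail of the hack worth noting: the `${CPATH:-}` default is what keeps the export safe under the script's `set -u` (a bare `$CPATH` would abort when the variable starts out unset). A small sketch, using the same assumed header path:

```shell
# Under set -u, expanding an unset variable is an error; ${CPATH:-} expands
# to empty instead, so prepending works whether or not CPATH was already set.
set -u
unset CPATH || true
export CPATH="$HOME/.local/include/python3.11:${CPATH:-}"
echo "$CPATH"
```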
Most of the PR is the trajectory analyzer, but it also includes the dependency changes that made it possible to run inference outside of Apptainer, on the full 8-GPU node, with no issues from vLLM coming up.
@BjarniHaukur you should be able to just `uv sync` and run it. See `benchmarks/swe_bench/run_harness_eval.sh` for the batch script.
I created the pyproject from scratch since I was facing some version conflicts, but I haven't added the training dependencies back to it. Let me know if adding them to the new setup is not enough; if so, we can debug. After that we can make a run to see if the parallelism also works for training.
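The steps above as a command sketch (assumes a SLURM cluster; `sbatch` submission of the referenced script is my reading of the setup, not spelled out in the PR):

```shell
uv sync                                           # install deps from the new pyproject
sbatch benchmarks/swe_bench/run_harness_eval.sh   # submit the eval batch script
```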