25 commits
75bf601
Add vLLM DSv4 FP8 MI355X benchmark (vllm#40889 AITER MLA decode)
Oseltamivir Apr 26, 2026
fcfd3c0
Merge branch 'main' into dsv4-fp8-mi355x-vllm
Oseltamivir Apr 26, 2026
3793a9c
Use v0.19.1 base image and rebuild vLLM from PR branch
Oseltamivir Apr 26, 2026
4c1b22d
Add perf-changelog entry for dsv4-fp8-mi355x-vllm
Oseltamivir Apr 26, 2026
f0c6907
Switch to nightly image (v0.19.1 missing mori/libtorch_hip)
Oseltamivir Apr 26, 2026
b7d8728
Switch to ATOM MI355X image (ROCm 7.2.2) for GPU detection
Oseltamivir Apr 27, 2026
feeced6
Install setuptools-scm before vLLM build
Oseltamivir Apr 27, 2026
6e36713
Drop --no-deps and disable ATOM plugin for vLLM
Oseltamivir Apr 27, 2026
5336b24
Drop --force-reinstall to preserve ROCm torch
Oseltamivir Apr 27, 2026
a3b218b
Pin ROCm packages via constraint file during vLLM dep install
Oseltamivir Apr 27, 2026
cffca9f
Remove stale triton-kernels editable install before dep resolution
Oseltamivir Apr 27, 2026
bb2a840
Merge branch 'main' into dsv4-fp8-mi355x-vllm
Oseltamivir Apr 27, 2026
47a692f
Clean all stale /triton-test editable refs, not just triton-kernels
Oseltamivir Apr 27, 2026
5f61052
Filter out xgrammar from dep install to avoid stale /triton-test path
Oseltamivir Apr 27, 2026
42981cf
triton problem
Oseltamivir Apr 27, 2026
f03e084
next
Oseltamivir Apr 27, 2026
2f0323f
next
Oseltamivir Apr 27, 2026
ee3c7e9
next
Oseltamivir Apr 27, 2026
30005c5
amdsmi
Oseltamivir Apr 27, 2026
ee3db15
tilelang
Oseltamivir Apr 27, 2026
9577af4
lower conc
Oseltamivir Apr 27, 2026
3fb7c4d
Merge branch 'main' into dsv4-fp8-mi355x-vllm
Oseltamivir Apr 27, 2026
baca692
lower mem?
Oseltamivir Apr 27, 2026
aef9e6e
off chunked
Oseltamivir Apr 28, 2026
daf0edb
Merge branch 'main' into dsv4-fp8-mi355x-vllm
Oseltamivir Apr 28, 2026
24 changes: 24 additions & 0 deletions .github/configs/amd-master.yaml
@@ -1490,6 +1490,30 @@ dsv4-fp8-mi355x-sglang:
    search-space:
    - { tp: 8, conc-start: 4, conc-end: 64 }

# vLLM with AITER MLA decode for DSv4 on MI355X (vllm-project/vllm#40889,
# stacked on #40871). Uses the ATOM MI355X image (ROCm 7.2.2, aiter with
# MLA decode, MI355X GPU detection); vLLM is rebuilt from the PR branch
# at runtime by benchmarks/single_node/dsv4_fp8_mi355x_vllm.sh at a
# pinned SHA. Once both PRs merge into a release, switch to a vLLM ROCm
# MI355X image and remove the build step.
dsv4-fp8-mi355x-vllm:
  image: rocm/atom:rocm7.2.2_ubuntu24.04_py3.12_pytorch_release_2.10.0_atom0.1.2.post
  model: deepseek-ai/DeepSeek-V4-Pro
  model-prefix: dsv4
  runner: mi355x
  precision: fp8
  framework: vllm
  multinode: false
  seq-len-configs:
  - isl: 1024
    osl: 1024
    search-space:
    - { tp: 8, conc-start: 1, conc-end: 32 }
  - isl: 8192
    osl: 1024
    search-space:
    - { tp: 8, conc-start: 1, conc-end: 32 }
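
To make the size of this sweep concrete, the following is a minimal sketch of how the `seq-len-configs` entry above could expand into individual benchmark points. It is not the repository's actual expansion logic: the `ENTRY` dict, `conc_levels`, and `expand` are hypothetical names, and stepping concurrency by doubling from `conc-start` to `conc-end` is an assumption, since the diff does not show how the range is traversed.

```python
from typing import Iterator

# Hypothetical plain-dict mirror of the dsv4-fp8-mi355x-vllm entry,
# i.e. roughly what a YAML loader would return for the fields used here.
ENTRY = {
    "seq-len-configs": [
        {"isl": 1024, "osl": 1024,
         "search-space": [{"tp": 8, "conc-start": 1, "conc-end": 32}]},
        {"isl": 8192, "osl": 1024,
         "search-space": [{"tp": 8, "conc-start": 1, "conc-end": 32}]},
    ],
}


def conc_levels(start: int, end: int) -> list[int]:
    """Concurrency levels from start to end, inclusive.

    ASSUMPTION: levels double at each step (1, 2, 4, ...). The real
    harness may step differently; the diff does not say.
    """
    if start < 1:
        raise ValueError("conc-start must be at least 1")
    levels = []
    conc = start
    while conc <= end:
        levels.append(conc)
        conc *= 2
    return levels


def expand(entry: dict) -> Iterator[dict]:
    """Yield one dict per (isl, osl, tp, conc) benchmark point."""
    for seq in entry["seq-len-configs"]:
        for space in seq["search-space"]:
            for conc in conc_levels(space["conc-start"], space["conc-end"]):
                yield {
                    "isl": seq["isl"],
                    "osl": seq["osl"],
                    "tp": space["tp"],
                    "conc": conc,
                }


points = list(expand(ENTRY))
print(len(points))
```

Under the doubling assumption, each sequence-length configuration contributes six concurrency levels, so the two configurations together give twelve runs at `tp: 8`.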

# Day-0 single-sequence marker for DeepSeek-V4 on ATOM (ROCm/ATOM#650).
# PR1 of the ATOM DSv4 series — single-sequence only (kv_cache[:1,...]
# hardcode), --enforce-eager required, ATOM_USE_TRITON_MOE=1 required on