dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries by yhyang201 · Pull Request #1180 · SemiAnalysisAI/InferenceX

yhyang201 · 2026-04-26T16:54:25Z

Summary

Restore the TP4 ep=1 (conc 16-128) and TP4 ep=4 dp-attn (conc 64-512) search-space entries for dsv4-fp4-b300-sglang-mtp that were removed in commit 287ef26
Applies to both 1k1k and 8k1k seq-len-configs

Test plan

Verify matrix generation includes the restored entries

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-actions · 2026-04-26T16:54:34Z

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipes is similar to the official vLLM recipes and/or the SGLang cookbook

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, generally, PR authors should request a review & get a PR approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

claude

LGTM, mechanical YAML config restoration.

Extended reasoning...

Overview

This PR adds 4 lines to .github/configs/nvidia-master.yaml, restoring two search-space entries (TP4 ep=1 conc 16-128, and TP4 ep=4 dp-attn conc 64-512) under dsv4-fp4-b300-sglang-mtp for both the 1k1k and 8k1k seq-len-configs. The entries follow the exact same schema as the existing TP8 entry and other entries throughout the file.

Security risks

None. This is a benchmark/CI search-space configuration file with no security-sensitive content (no auth, secrets, permissions, or external execution paths affected).

Level of scrutiny

Low. This is a pure config restoration — the PR description explicitly states these entries were previously removed in commit 287ef26 and are being put back. No logic changes, no new code paths.

Other factors

Single-file YAML diff with 4 added lines, no removals.
Bug hunting system found no issues.
Entries follow established schema patterns visible elsewhere in the file.

yhyang201 · 2026-04-26T16:59:28Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

github-actions · 2026-04-26T16:59:37Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24962086988
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: f2dda86
Approval: not required (trusted collaborator).

yhyang201 · 2026-04-26T17:11:11Z

perf-changelog.yaml hasn’t been updated. Remember to update perf-changelog.yaml before merging.

yhyang201 · 2026-04-26T17:34:42Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

github-actions · 2026-04-26T17:34:52Z

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24962803567
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: 4190a94
Approval: not required (trusted collaborator).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Resolve perf-changelog.yaml conflict: keep main's PR #1155 entry, append our PR #1180 entry after it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

cquil11 · 2026-04-26T20:15:48Z

Hi @yhyang201 please hold off on triggering sweeps til we get CI unblocked

cquil11 · 2026-04-26T21:04:09Z

Ok back @yhyang201 https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24962803567

Qiaolin-Yu · 2026-04-27T05:20:16Z

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

github-actions · 2026-04-27T05:20:24Z

@Qiaolin-Yu Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24977935266
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: 74d5b69
Approval: not required (trusted collaborator).

cquil11 · 2026-04-27T14:12:20Z

-    export SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE=1
-    export SGLANG_OPT_FIX_HASH_MEGA_MOE=1
+    # DP-attn path: flashinfer_mxfp4 + DP-attn (covers conc 16-256).
+    export SGLANG_OPT_SWA_EVICT_DROP_PAGE_MARGIN=1


This is fine for now since we are in early stages of v4 support, but eventually we'd like these to just be default settings based on scenario in engine

cquil11 · 2026-04-27T14:14:41Z

+        --disable-flashinfer-autotune
        --deepep-config "$DEEPEP_CONFIG"
+        --cuda-graph-max-bs 256
+        --schedule-conservativeness 2


Can you explain why --schedule-conservativeness 2?

Still can be default, you can remove it

…restore-configs

dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries

f2dda86

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

github-project-automation Bot added this to InferenceMAX Board Apr 26, 2026

yhyang201 requested a review from a team April 26, 2026 16:54

yhyang201 requested review from Qiaolin-Yu, jgangani and kedarpotdar-nv as code owners April 26, 2026 16:54

claude Bot reviewed Apr 26, 2026

View reviewed changes

Update nvidia-master.yaml

4190a94

yhyang201 and others added 2 commits April 27, 2026 01:39

dsv4-b300-sglang-mtp: add perf-changelog entry for PR #1180

dbde9fa

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Merge origin/main into dsv4-b300-sglang-mtp-restore-configs

0bbd08d

Resolve perf-changelog.yaml conflict: keep main's PR #1155 entry, append our PR #1180 entry after it. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Qiaolin-Yu self-assigned this Apr 26, 2026

Merge branch 'main' into dsv4-b300-sglang-mtp-restore-configs

1fadfce

Qiaolin-Yu changed the title ~~dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries~~ [wip] dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries Apr 27, 2026

hnyls2002 added 3 commits April 26, 2026 22:16

add dp-attn band; flashinfer + (1,1,2) chain

b8a625d

drop tp4 ep1 tp-only band

01d7a9f

restore tp4 ep1 band; conc 4-32

74d5b69

remove useless points

f66a2df

Qiaolin-Yu changed the title ~~[wip] dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries~~ dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries Apr 27, 2026

Qiaolin-Yu approved these changes Apr 27, 2026

View reviewed changes

cquil11 reviewed Apr 27, 2026

View reviewed changes

Qiaolin-Yu added 2 commits April 27, 2026 12:58

fix

e45c425

Merge remote-tracking branch 'origin/main' into dsv4-b300-sglang-mtp-…

d053ff6

…restore-configs

cquil11 approved these changes Apr 27, 2026

View reviewed changes

cquil11 merged commit 6cca39e into main Apr 27, 2026
21 checks passed

cquil11 deleted the dsv4-b300-sglang-mtp-restore-configs branch April 27, 2026 20:02

github-project-automation Bot moved this to Done in InferenceMAX Board Apr 27, 2026

Conversation

yhyang201 commented Apr 26, 2026

Summary

Test plan

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

claude Bot left a comment

Choose a reason for hiding this comment

Overview

Security risks

Level of scrutiny

Other factors

Uh oh!

yhyang201 commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

yhyang201 commented Apr 26, 2026

Uh oh!

yhyang201 commented Apr 26, 2026

Uh oh!

github-actions Bot commented Apr 26, 2026

Uh oh!

cquil11 commented Apr 26, 2026

Uh oh!

cquil11 commented Apr 26, 2026

Uh oh!

Qiaolin-Yu commented Apr 27, 2026

Uh oh!

github-actions Bot commented Apr 27, 2026

Uh oh!

cquil11 Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Qiaolin-Yu Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

cquil11 Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

hnyls2002 Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Qiaolin-Yu Apr 27, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants