dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries#1180

Merged
cquil11 merged 11 commits into main from
dsv4-b300-sglang-mtp-restore-configs
Apr 27, 2026

Conversation

@yhyang201
Collaborator

Summary

  • Restore the TP4 ep=1 (conc 16-128) and TP4 ep=4 dp-attn (conc 64-512) search-space entries for dsv4-fp4-b300-sglang-mtp that were removed in commit 287ef26
  • Applies to both 1k1k and 8k1k seq-len-configs

Test plan

  • Verify matrix generation includes the restored entries
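One way to approach this check, as a minimal sketch: the snippet below models the two restored entries as plain dictionaries, expands each into per-concurrency matrix rows, and reports any entry whose endpoints are missing. The field names (`tp`, `ep`, `dp_attn`, `conc_start`, `conc_end`), the helper functions, and the doubling step between concurrency values are all assumptions made for illustration; the actual schema in nvidia-master.yaml and the repository's matrix generator may differ.

```python
# Hypothetical model of the restored search-space entries.
# Key names are assumed for illustration, not copied from nvidia-master.yaml.
RESTORED = [
    {"tp": 4, "ep": 1, "dp_attn": False, "conc_start": 16, "conc_end": 128},
    {"tp": 4, "ep": 4, "dp_attn": True, "conc_start": 64, "conc_end": 512},
]
SEQ_LEN_CONFIGS = ["1k1k", "8k1k"]


def expand(entry, seq_len):
    """Yield one matrix row per concurrency, doubling from start to end.

    The doubling step is an assumption; the real generator may step differently.
    """
    conc = entry["conc_start"]
    while conc <= entry["conc_end"]:
        yield (seq_len, entry["tp"], entry["ep"], entry["dp_attn"], conc)
        conc *= 2


def build_matrix(entries, seq_lens):
    """Expand every entry for every sequence-length config into a flat list of rows."""
    return [row for s in seq_lens for e in entries for row in expand(e, s)]


def missing_entries(matrix, entries, seq_lens):
    """Return (seq_len, entry) pairs whose start or end concurrency is absent."""
    rows = set(matrix)
    missing = []
    for s in seq_lens:
        for e in entries:
            endpoints = [
                (s, e["tp"], e["ep"], e["dp_attn"], c)
                for c in (e["conc_start"], e["conc_end"])
            ]
            if not all(r in rows for r in endpoints):
                missing.append((s, e))
    return missing


matrix = build_matrix(RESTORED, SEQ_LEN_CONFIGS)
print(len(matrix), missing_entries(matrix, RESTORED, SEQ_LEN_CONFIGS))
```

In practice the `matrix` list would come from the repository's own generator rather than from `build_matrix`; only `missing_entries` would then be needed to confirm both entries appear under both seq-len-configs.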

🤖 Generated with Claude Code

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@github-actions
Contributor

Thanks for the contribution! For vLLM & SGLang, please ensure that your recipe is similar to the official vLLM recipes and/or the SGLang cookbook.

If it is not, please create a PR first before we can merge your PR into the master branch. Let's ensure that the documentation is first class such that the entire ML community can benefit from your hard work! Thank you

PR authors are responsible for ensuring that after merging, all GitHub Action jobs fully pass. A lot of the time, failures are just flakes and simply re-running the failed jobs will fix it. If re-running failed jobs is attempted, PR authors are responsible for ensuring it passes. See GitHub's docs on re-running failed jobs: https://docs.github.com/en/actions/how-tos/manage-workflow-runs/re-run-workflows-and-jobs#re-running-failed-jobs-in-a-workflow

As a rule of thumb, PR authors should request a review and get approval from the respective companies' CODEOWNERS before requesting a review from core maintainers.

If additional help is needed, PR authors can reach out to core maintainers over Slack.

Contributor

@claude claude Bot left a comment

LGTM, mechanical YAML config restoration.

Overview

This PR adds 4 lines to .github/configs/nvidia-master.yaml, restoring two search-space entries (TP4 ep=1 conc 16-128, and TP4 ep=4 dp-attn conc 64-512) under dsv4-fp4-b300-sglang-mtp for both the 1k1k and 8k1k seq-len-configs. The entries follow the exact same schema as the existing TP8 entry and other entries throughout the file.

Security risks

None. This is a benchmark/CI search-space configuration file with no security-sensitive content (no auth, secrets, permissions, or external execution paths affected).

Level of scrutiny

Low. This is a pure config restoration — the PR description explicitly states these entries were previously removed in commit 287ef26 and are being put back. No logic changes, no new code paths.

Other factors

  • Single-file YAML diff with 4 added lines, no removals.
  • Bug hunting system found no issues.
  • Entries follow established schema patterns visible elsewhere in the file.

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24962086988
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: f2dda86
Approval: not required (trusted collaborator).

@yhyang201
Collaborator Author

perf-changelog.yaml hasn’t been updated. Remember to update perf-changelog.yaml before merging.

@yhyang201
Collaborator Author

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

@github-actions
Contributor

@yhyang201 Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24962803567
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: 4190a94
Approval: not required (trusted collaborator).

yhyang201 and others added 2 commits April 27, 2026 01:39
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Resolve perf-changelog.yaml conflict: keep main's PR #1155 entry,
append our PR #1180 entry after it.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cquil11
Collaborator

cquil11 commented Apr 26, 2026

Hi @yhyang201 please hold off on triggering sweeps til we get CI unblocked

@cquil11
Collaborator

cquil11 commented Apr 26, 2026

@Qiaolin-Yu Qiaolin-Yu self-assigned this Apr 26, 2026
@Qiaolin-Yu Qiaolin-Yu changed the title dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries [wip] dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries Apr 27, 2026
@Qiaolin-Yu
Collaborator

/sweep test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp

@github-actions
Contributor

@Qiaolin-Yu Kicking off a sweep.

Run: https://github.com/SemiAnalysisAI/InferenceX/actions/runs/24977935266
Command: test-config --config-files .github/configs/nvidia-master.yaml --config-keys dsv4-fp4-b300-sglang-mtp
Pinned ref: 74d5b69
Approval: not required (trusted collaborator).

@Qiaolin-Yu Qiaolin-Yu changed the title [wip] dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries dsv4-b300-sglang-mtp: restore TP4 and DP-attn search-space entries Apr 27, 2026
export SGLANG_OPT_USE_DEEPGEMM_MEGA_MOE=1
export SGLANG_OPT_FIX_HASH_MEGA_MOE=1
# DP-attn path: flashinfer_mxfp4 + DP-attn (covers conc 16-256).
export SGLANG_OPT_SWA_EVICT_DROP_PAGE_MARGIN=1
Collaborator

This is fine for now since we are in early stages of v4 support, but eventually we'd like these to just be default settings based on scenario in engine

Collaborator

sure

--disable-flashinfer-autotune
--deepep-config "$DEEPEP_CONFIG"
--cuda-graph-max-bs 256
--schedule-conservativeness 2
Collaborator

Can you explain why --schedule-conservativeness 2?

Collaborator

Still can be default, you can remove it

Collaborator

removed

@cquil11 cquil11 merged commit 6cca39e into main Apr 27, 2026
21 checks passed
@cquil11 cquil11 deleted the dsv4-b300-sglang-mtp-restore-configs branch April 27, 2026 20:02
4 participants