Make MoE dispatch/MLP expert-axis batch sharding configurable (fix Mixtral EP throughput) #4179

Open
gulsumgudukbay wants to merge 5 commits into AI-Hypercomputer:main from ROCm:fix-moe-expert-parallel-sharding

Test both flag states for MoE dispatch sharding; drop deepseek3-671b

edb3286
Google CLA / cla/google succeeded Jun 18, 2026 in 6s

✅ All contributors are covered under a CLA with Google

See https://cla.developers.google.com/ for more info about Google's Contributor License Agreement (CLA).

Details

The following contributors were found for this pull request:

edb3286 Author: @gulsumgudukbay <gu****ay@gmail.com>

(Only the first commit for a unique contributor is listed.)