Make MoE dispatch/MLP expert-axis batch sharding configurable (fix Mixtral EP throughput) #4179

Open
gulsumgudukbay wants to merge 5 commits into AI-Hypercomputer:main from ROCm:fix-moe-expert-parallel-sharding