[module] clean mtp code #11

Merged
Jintao-Huang merged 5 commits into modelscope:main from Jintao-Huang:clear_mtp_code
Apr 2, 2026

@Jintao-Huang
Collaborator

No description provided.

@gemini-code-assist (bot) left a comment

Code Review

This pull request updates the package version to 1.1.0.dev0 and changes the GPT model and patching logic, adding support for multimodal context parallelism and refactoring the Multi-Token Prediction (MTP) loss calculation. The review flags two critical issues in the MTP logic: removing the null check on labels could cause crashes, and the token counts have no zero-division guards, which could lead to runtime errors. It also recommends splitting position_ids alongside input_ids during multimodal context parallel processing to avoid shape mismatches.
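
The three concerns raised above can be illustrated with a minimal sketch. It is not the code from this pull request: the helper names `mtp_masked_mean` and `split_for_cp` are hypothetical, plain Python lists stand in for tensors, and the contiguous chunking is a simplifying assumption rather than the partitioning scheme the project actually uses.

```python
from typing import List, Optional, Sequence, Tuple


def mtp_masked_mean(
    per_token_loss: Sequence[float],
    loss_mask: Sequence[int],
    labels: Optional[Sequence[int]],
) -> Optional[float]:
    """Hypothetical helper: average a per-token loss over unmasked tokens."""
    # Guard 1: without labels there is nothing to compute a loss against.
    if labels is None:
        return None
    num_tokens = sum(loss_mask)
    total = sum(loss * mask for loss, mask in zip(per_token_loss, loss_mask))
    # Guard 2: clamp the denominator so a fully masked batch gives 0.0
    # instead of raising ZeroDivisionError.
    return total / max(num_tokens, 1)


def split_for_cp(
    input_ids: Sequence[int],
    position_ids: Sequence[int],
    cp_size: int,
    cp_rank: int,
) -> Tuple[List[int], List[int]]:
    """Hypothetical helper: take the same slice of both sequences for one rank."""
    if len(input_ids) != len(position_ids):
        raise ValueError("input_ids and position_ids must have equal length")
    if len(input_ids) % cp_size != 0:
        raise ValueError("sequence length must be divisible by cp_size")
    chunk = len(input_ids) // cp_size
    start = cp_rank * chunk
    # Slicing both with the same bounds keeps their shapes aligned.
    return list(input_ids[start:start + chunk]), list(position_ids[start:start + chunk])
```

With these assumptions, a call without labels returns `None` rather than failing, a fully masked batch averages to zero, and each rank receives `input_ids` and `position_ids` slices of matching length.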

@Jintao-Huang
Collaborator Author

/gemini review

@Jintao-Huang changed the title from "[module] clear mtp code" to "[module] clean mtp code" on Apr 2, 2026

@gemini-code-assist (bot) left a comment

Code Review

This pull request updates the MCore-Bridge version to 1.1.0.dev0, improves documentation in README files, and introduces several functional enhancements. Key changes include adding support for keyword arguments in rotary positional embedding functions, implementing input splitting for multimodal models with context parallelism, and refactoring the Multi-Token Prediction (MTP) loss calculation logic. Regarding the review feedback, while the reviewer suggested a safer check for packed_seq_params, the current implementation using getattr with a default value is functionally equivalent and safe for the existing structure.
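
The equivalence claimed for `getattr` with a default can be sketched as follows. The reviewer's exact suggestion is not shown in this conversation, so the explicit `hasattr` variant below is only an assumption about what a "safer check" might look like; both function names and the `SimpleNamespace` containers are hypothetical.

```python
from types import SimpleNamespace
from typing import Any


def packed_params_via_getattr(container: Any) -> Any:
    # One call: returns None when the attribute is missing.
    return getattr(container, "packed_seq_params", None)


def packed_params_via_hasattr(container: Any) -> Any:
    # Assumed shape of the more explicit check: test first, then read.
    if hasattr(container, "packed_seq_params"):
        return container.packed_seq_params
    return None


with_params = SimpleNamespace(packed_seq_params="thd")
without_params = SimpleNamespace()
```

For ordinary attributes the two functions return the same value whether or not `packed_seq_params` is present, which is the sense in which the existing form can be considered functionally equivalent.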

@Jintao-Huang merged commit 2dc426f into modelscope:main on Apr 2, 2026
1 check passed