
chore(beep boop 🤖): Bump uv.lock (main, mcore-dev) (2026-05-20)#3903

Open
svcnvidia-nemo-ci wants to merge 1 commit into main from bump-ci-container-2026-05-20-main-dev

Conversation

@svcnvidia-nemo-ci
Contributor

🚀 PR to bump uv.lock in main.

🤖 This PR will be merged automatically once CI passes.

Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@svcnvidia-nemo-ci
Contributor Author

/ok to test 8e84cec

@copy-pr-bot

copy-pr-bot Bot commented May 20, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@yaoyu-33 added the area:build, ci, and needs-review labels on May 20, 2026
@yaoyu-33
Contributor

MCore bump auto-fix status for dev:

Classification: MCore broke Bridge
Evidence: The current main/mcore-dev bump #3903 is still failing on 2026-05-20. gh run view 26156612801 --log-failed returned an empty log, so I used the Actions job-log API.
Nemotron Omni: failed jobs L0_Launch_models_nemotron_omni / 76952734284 and L0_Launch_recipes_nemotron_omni / 76952734293 fail at src/megatron/bridge/models/nemotron_omni/nemotron_omni_provider.py:257 with TypeError: LLaVAModel.__init__() got an unexpected keyword argument 'dynamic_resolution'. In the checked-out bump tree, MCore dev commit bd5c98f7796259362a2e8c99b5c7b0d92b4e9d84 lacks the LLaVA constructor parameters present on MCore main commit 38986a98aae6a0cc4c8ae7b435db3288a890b0cb: dynamic_resolution, RADIO controls, sound model/projection fields, and temporal video fields.
WAN: L0_Launch_models_wan / 76952734267 fails because MCore dev now calls DiTSelfAttention.get_query_key_value_tensors(..., head_wise_gate=...).
DeepSeek FSDP: gb200_L0_Launch_recipes_deepseek_fsdp / 76952734283 reaches validation but fails convergence with correlation=0.510130, final_loss_current=7.510771, final_loss_golden=8.127562. The failed metrics are correlation and final_loss.
Fix PR: not opened
Guards: none added, none removed. I audited existing guard patterns and confirmed the relevant current guards use local inspect.signature(...) checks with TODO removal comments, but I did not add another guard because the Nemotron Omni LLaVA gap is broader than one kwarg and the DeepSeek convergence change is not safe to patch blindly.
Validation: Read Linear MB-401 first, re-read #3903 metadata/diff/checks/run state, inspected MCore compare cf081d5df0b34b665d214ff936f27489d7396876...bd5c98f7796259362a2e8c99b5c7b0d92b4e9d84 (15 commits ahead), checked for existing open mcore-dev-autofix / #3903 / dynamic_resolution / head_wise_gate / DiTSelfAttention / deepseek_fsdp fix PRs and found none, then checked out #3903 at 8e84cecca293d9dc428ccf6e80c85288fe04be16 locally. No code was changed, so no local or CW interactive tests were run. Handoff time: 2026-05-20 07:06 PDT.
Next action: Maintainer decision needed. A narrow Bridge PR can update WAN DiTSelfAttention.get_query_key_value_tensors to accept the new head_wise_gate kwarg, but that will not green #3903 by itself. MCore dev owners should align the LLaVA API/behavior with main for Nemotron Omni or confirm a Bridge-side compatibility wrapper strategy. DeepSeek/Megatron-FSDP owners should validate the GB200 convergence drift before any golden-value update.
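
For readers unfamiliar with the guard pattern mentioned under Guards, here is a minimal sketch of a local inspect.signature(...) check. It is illustrative only: OldModel, NewModel, supported_kwargs, and build are hypothetical stand-ins, not the actual Bridge or MCore code, and the real LLaVAModel constructor takes many more arguments.

```python
import inspect


def supported_kwargs(func, candidate_kwargs):
    """Return only the keyword arguments that `func` accepts."""
    params = inspect.signature(func).parameters
    # A **kwargs parameter means the callee accepts any keyword.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(candidate_kwargs)
    return {k: v for k, v in candidate_kwargs.items() if k in params}


class OldModel:
    """Hypothetical stand-in for a constructor that lacks the new kwarg."""

    def __init__(self, hidden_size):
        self.hidden_size = hidden_size
        self.dynamic_resolution = None


class NewModel:
    """Hypothetical stand-in for a constructor that has the new kwarg."""

    def __init__(self, hidden_size, dynamic_resolution=False):
        self.hidden_size = hidden_size
        self.dynamic_resolution = dynamic_resolution


def build(model_cls, **kwargs):
    # TODO: remove this guard once both branches share one constructor signature.
    return model_cls(**supported_kwargs(model_cls, kwargs))


old = build(OldModel, hidden_size=8, dynamic_resolution=True)
new = build(NewModel, hidden_size=8, dynamic_resolution=True)
```

A guard like this silently drops the unsupported argument, so the older constructor runs without the feature rather than raising TypeError. That is acceptable for a single optional kwarg but can mask a real behavioral gap when several parameters are missing, which is consistent with the decision above not to add another guard for Nemotron Omni.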


Labels

area:build (Dependencies, packaging, images, and environment setup)
ci (CI, automation, test queue, or workflow infrastructure work)
full-test-suite
needs-review (PR is ready for code review and waiting on a reviewer)

3 participants