
feat: update conv1d_update op for Qwen3-Next/Qwen3.5.#1291

Open
maojunx99 wants to merge 2 commits into jd-opensource:main from maojunx99:main




@maojunx99 maojunx99 commented Apr 16, 2026

Summary

  1. Change the shape of conv_cache in kv_cache to reduce subsequent transposes and make the computation more efficient.
  2. Adapt the new version of the conv1d_update operator for NPU.
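
To illustrate why the cache layout matters, here is a minimal CPU sketch of a single-step causal depthwise conv1d update. It is not the NPU kernel from this PR: the function name `conv1d_update_step`, the time-major `[state_len, dim]` cache layout, and the `[kernel_size, dim]` weight layout are all assumptions chosen for illustration, since the summary does not state the old or new shapes. With a time-major cache, each decode step reads and writes whole rows of `dim` contiguous values, so no transpose is needed before or after the update.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical single-step causal depthwise conv1d update (illustration only,
// not the xLLM/NPU kernel).
//   cache:  [state_len, dim], row-major, oldest step first,
//           where state_len = kernel_size - 1 (assumed layout)
//   x:      [dim], features of the newly decoded token
//   weight: [kernel_size, dim], row-major; tap 0 applies to the oldest step
// Returns y: [dim] and shifts x into the cache in place.
std::vector<float> conv1d_update_step(std::vector<float>& cache,
                                      const std::vector<float>& x,
                                      const std::vector<float>& weight,
                                      std::size_t dim,
                                      std::size_t kernel_size) {
  assert(kernel_size >= 1);
  const std::size_t state_len = kernel_size - 1;
  assert(cache.size() == state_len * dim);
  assert(x.size() == dim);
  assert(weight.size() == kernel_size * dim);

  // Each channel is convolved independently (depthwise).
  std::vector<float> y(dim, 0.0f);
  for (std::size_t c = 0; c < dim; ++c) {
    // The last tap multiplies the new token.
    float acc = weight[state_len * dim + c] * x[c];
    // Earlier taps multiply the cached history.
    for (std::size_t t = 0; t < state_len; ++t) {
      acc += weight[t * dim + c] * cache[t * dim + c];
    }
    y[c] = acc;
  }

  // Slide the window: drop the oldest row, append x as the newest row.
  if (state_len > 0) {
    for (std::size_t t = 0; t + 1 < state_len; ++t) {
      for (std::size_t c = 0; c < dim; ++c) {
        cache[t * dim + c] = cache[(t + 1) * dim + c];
      }
    }
    for (std::size_t c = 0; c < dim; ++c) {
      cache[(state_len - 1) * dim + c] = x[c];
    }
  }
  return y;
}
```

In this sketch the row shift and the append both touch contiguous memory. A channel-major `[dim, state_len]` cache would instead need a strided write per channel or a transpose around the update, which is the kind of overhead item 1 aims to avoid.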

These updates depend on a change in the operator library: https://gitcode.com/xLLM-AI/torch_npu_ops/pull/13


@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request introduces the causal_conv1d_update_v2 kernel for NPU and updates the Qwen3GatedDeltaNetBase implementation. Key feedback includes correcting an erroneous transpose in the prefill path that impacts tensor narrowing, fixing a naming convention violation for local variables, and removing a redundant batch variable. Additionally, a reshape operation in the decode path needs to be corrected to ensure the kernel receives the expected tensor dimensions.

Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp
Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp Outdated
Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp Outdated
Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp Outdated
@maojunx99 maojunx99 force-pushed the main branch 2 times, most recently from 53ed5d5 to a6ee25d Compare April 16, 2026 06:31
Comment thread xllm/core/kernels/ops_api.cpp Outdated
Comment thread xllm/core/kernels/param.h Outdated
Comment thread xllm/core/distributed_runtime/llm_engine.cpp
@yingxudeng yingxudeng changed the title from "update conv1d_updae op for Qwen3 Next/Qwen 3.5" to "feat: update conv1d_update op for Qwen3-Next/Qwen3.5." Apr 16, 2026
Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp
Comment thread xllm/core/layers/npu_torch/qwen3_gated_delta_net_base.cpp Outdated
Comment thread xllm/core/kernels/ops_api.cpp Outdated
@maojunx99 maojunx99 force-pushed the main branch 2 times, most recently from 3e67b12 to 5afd0de Compare April 17, 2026 07:00
zhang-minchao previously approved these changes Apr 17, 2026
yingxudeng previously approved these changes Apr 17, 2026
DongheJin previously approved these changes Apr 17, 2026
yingxudeng previously approved these changes Apr 18, 2026
zhang-minchao previously approved these changes Apr 18, 2026
@maojunx99 maojunx99 dismissed stale reviews from zhang-minchao and yingxudeng via 623a1e9 April 19, 2026 13:48
yingxudeng previously approved these changes Apr 19, 2026

4 participants