
riscv: optimize im2col gemm convolution#6740

Open
nihui wants to merge 4 commits into
Tencent:masterfrom
nihui:opt-riscv-packed-1


@nihui (Member) commented May 23, 2026

No description provided.

@github-actions github-actions Bot added the riscv label May 23, 2026
@tencent-adm
Member

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@codecov-commenter commented May 23, 2026

Codecov Report

❌ Patch coverage is 88.45619% with 166 lines in your changes missing coverage. Please review.
✅ Project coverage is 95.43%. Comparing base (cda90e0) to head (0ed6426).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/layer/riscv/convolution_im2col_gemm.h | 87.90% | 162 Missing ⚠️ |
| src/layer/riscv/convolution_riscv.cpp | 96.36% | 2 Missing ⚠️ |
| src/layer/riscv/convolution_riscv_zfh.cpp | 95.45% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6740      +/-   ##
==========================================
- Coverage   95.72%   95.43%   -0.29%     
==========================================
  Files         965      944      -21     
  Lines      404025   406989    +2964     
==========================================
+ Hits       386736   388425    +1689     
- Misses      17289    18564    +1275     


@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0ed6426939

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

{
conv1x1s1_sgemm_packn_rvv(bottom_blob_bordered, top_blob, weight_data_tm, bias_data, opt);
if ((opt.use_winograd63_convolution && num_input >= packn * 2 && num_output >= packn * 2 && num_input <= packn * 16 && num_output <= packn * 16) || (!opt.use_winograd43_convolution && !opt.use_winograd23_convolution))
conv3x3s1_winograd63_rvv(bottom_blob_bordered, top_blob, weight_winograd63_data, bias_data, nT, opt);

P1: Propagate winograd execution failures to caller

The winograd kernels now return an error code (int) and can fail (for example, returning -100 on workspace allocation failure in convolution_3x3_winograd.h), but this call site ignores that return value and always continues to activation before returning success. In low-memory or large-tile cases this can silently report success with invalid output instead of propagating the failure.
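A minimal sketch of the propagation pattern this comment asks for, not the actual ncnn code. The names `fake_winograd_kernel`, `fake_activation`, and `forward_sketch` are hypothetical stand-ins; the only detail taken from the review is that the kernel returns an `int` and may return -100 on workspace allocation failure.

```cpp
// Hypothetical stand-in for a winograd kernel that returns an error code.
// -100 mirrors the allocation-failure code mentioned in the review.
static int fake_winograd_kernel(bool workspace_ok)
{
    if (!workspace_ok)
        return -100; // simulated workspace allocation failure
    return 0;
}

// Hypothetical stand-in for the activation step that follows the kernel.
static void fake_activation(int& activation_calls)
{
    activation_calls++;
}

// Sketch of a call site that checks the kernel result before continuing.
static int forward_sketch(bool workspace_ok, int& activation_calls)
{
    int ret = fake_winograd_kernel(workspace_ok);
    if (ret != 0)
        return ret; // propagate the failure; skip activation on invalid output

    fake_activation(activation_calls);
    return 0;
}
```

Under these assumptions, `forward_sketch(false, n)` returns -100 and leaves `n` unchanged, while `forward_sketch(true, n)` returns 0 after running the activation once.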


{
conv1x1s1_sgemm_packn_fp16sa_rvv(bottom_blob_bordered, top_blob, weight_data_tm, bias_data_fp16, opt);
if ((opt.use_winograd63_convolution && num_input >= packn * 2 && num_output >= packn * 2 && num_input <= packn * 16 && num_output <= packn * 16) || (!opt.use_winograd43_convolution && !opt.use_winograd23_convolution))
conv3x3s1_winograd63_fp16sa_rvv(bottom_blob_bordered, top_blob, weight_winograd63_data, bias_data_fp16, nT, opt);

P1: Check fp16 winograd return code before returning success

The fp16 winograd path has the same regression: conv3x3s1_winograd*_fp16sa_rvv now returns int and can return allocation errors, but this result is ignored here and the function returns 0 after activation. That masks runtime failures (e.g., workspace allocation failure) and can produce incorrect outputs while signaling success.
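One possible way to keep an ignored return code from slipping through again is to let the compiler flag it. The sketch below assumes a C++17 toolchain for the standard `[[nodiscard]]` attribute (on older GCC or Clang setups, `__attribute__((warn_unused_result))` plays a similar role). The function names are hypothetical and are not the PR's fp16 kernels; whether the project's supported compilers allow this is not something the review states.

```cpp
// Hypothetical stand-in for an fp16 winograd kernel that returns an error code.
// [[nodiscard]] asks the compiler to warn when a caller drops the result.
[[nodiscard]] static int fake_winograd_fp16_kernel(bool workspace_ok)
{
    return workspace_ok ? 0 : -100; // -100: simulated allocation failure
}

// Sketch of a call site that consumes the result before reporting success.
static int forward_fp16_sketch(bool workspace_ok, bool& activation_ran)
{
    // A bare `fake_winograd_fp16_kernel(workspace_ok);` statement here would
    // typically trigger a discarded-result warning because of [[nodiscard]].
    const int ret = fake_winograd_fp16_kernel(workspace_ok);
    if (ret != 0)
        return ret; // propagate instead of returning 0 after activation

    activation_ran = true; // placeholder for the real activation step
    return 0;
}
```

The attribute only produces a diagnostic; the explicit `ret != 0` check is still what propagates the failure at run time.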


