
[Backend][Relax] Add NPU BYOC backend example#19425

Open
Aristide021 wants to merge 3 commits into apache:main from Aristide021:contrib-npu-generic-v2

Conversation

@Aristide021

Supersedes #18247. Per maintainer guidance, resubmitting as a fresh PR due to CI workflow changes affecting old PRs.

Summary

This PR adds an example NPU BYOC backend for Relax, including end-to-end integration points:

  • pattern registration (python/tvm/relax/backend/contrib/example_npu/patterns.py)
  • backend registration (python/tvm/relax/backend/contrib/example_npu/__init__.py)
  • codegen entrypoint (src/relax/backend/contrib/example_npu/codegen.cc)
  • runtime module (src/runtime/contrib/example_npu/example_npu_runtime.cc)
  • CMake integration (cmake/modules/contrib/ExampleNPU.cmake, CMakeLists.txt, cmake/modules/LibInfo.cmake, src/support/libinfo.cc)
  • tutorial/docs (docs/how_to/tutorials/byoc_npu_example.py, README under contrib path)
  • tests (tests/python/contrib/test_example_npu.py)
  • CI build config enablement (tests/scripts/task_config_build_cpu.sh)

Review feedback addressed from #18247

  • Test location under tests/python/contrib/
  • README includes explicit enable instructions for USE_EXAMPLE_NPU_CODEGEN and USE_EXAMPLE_NPU_RUNTIME
  • README quick-start uses inline MatmulReLU (no import from test module)
  • Added CMake source wiring and feature flags for runtime/codegen
  • Added docs tutorial under docs/how_to/tutorials/ (not only README)
  • Reorganized motivation/context section near top of README
  • Extended pattern coverage to include example_npu.softmax with tests
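The softmax coverage mentioned above is CPU-emulated in this example backend. As a rough, hypothetical sketch (not the PR's actual kernel) of what a numerically stable per-row softmax emulation computes:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Numerically stable softmax over one row: subtracting the row max
// before exponentiating keeps large logits from overflowing.
std::vector<float> Softmax(const std::vector<float>& x) {
  float mx = *std::max_element(x.begin(), x.end());
  std::vector<float> out(x.size());
  float sum = 0.0f;
  for (size_t i = 0; i < x.size(); ++i) {
    out[i] = std::exp(x[i] - mx);
    sum += out[i];
  }
  for (float& v : out) v /= sum;  // normalize so the row sums to 1
  return out;
}
```

A pattern test for such an op would typically compare the partitioned backend's output against this reference on random inputs.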

Validation

Local checks run:

  • pre-commit on touched files (pass)
  • PYTHONPATH=python python -m pytest -q tests/python/contrib/test_example_npu.py (pass)

Notes

This backend is an example/tutorial implementation (CPU-emulated) intended to document modern NPU-oriented BYOC integration patterns and provide a reference path for future hardware-specific backends.

Adds a vendor-neutral example NPU backend demonstrating the BYOC
(Bring Your Own Codegen) pattern for custom accelerator integration
in TVM's Relax framework.

Components added:
- python/tvm/relax/backend/contrib/example_npu/: pattern registry with
  op support for matmul, conv1d/2d, depthwise conv2d, pooling, batch
  norm, softmax, activations, elementwise ops, quantization, and a
  fused conv2d+relu pattern
- src/relax/backend/contrib/example_npu/codegen.cc: JSON serializer
  registered as relax.ext.example_npu
- src/runtime/contrib/example_npu/example_npu_runtime.cc: JSON runtime
  demonstrating NPU architectural concepts (memory hierarchy, tiling,
  execution engines, quantization) via CPU emulation
- cmake/modules/contrib/ExampleNPU.cmake: build integration via
  USE_EXAMPLE_NPU_CODEGEN and USE_EXAMPLE_NPU_RUNTIME flags
- docs/how_to/tutorials/byoc_npu_example.py: tutorial walking through
  the full BYOC flow from pattern registration to runtime execution
- tests/python/contrib/test_example_npu.py: test suite covering pattern
  registration, graph partitioning, codegen, and end-to-end execution

CI is enabled via tests/scripts/task_config_build_cpu.sh.

Addresses reviewer feedback from apache#18247: cmake integration, self-contained README with build instructions, tutorial in docs/how_to, and Context section reorganization.

@gemini-code-assist bot left a comment


Code Review

This pull request introduces an example NPU backend for TVM's Relax framework, providing a JSON-based codegen, a C++ runtime that demonstrates architectural concepts such as memory hierarchy and tiling, and Relax pattern registrations for offloading operations. The review identified a potential division-by-zero bug in the quantization parameter calculation, suggested replacing placeholder logic in the depthwise convolution check with an actual attribute verification, and recommended moving global function lookups in the compiler outside of loops.

Comment threads:

  • src/runtime/contrib/example_npu/example_npu_runtime.cc
  • python/tvm/relax/backend/contrib/example_npu/patterns.py
  • src/relax/backend/contrib/example_npu/codegen.cc

Fix three issues identified in automated code review of apache#19425:

- Fix division-by-zero in CalculateQuantizationParams when all tensor
  values are identical (zero range); clamp scale floor to 1e-7f, guard
  against empty input, and use std::round for zero_point accuracy
- Implement actual groups attribute check in _check_depthwise instead
  of relying solely on placeholder constraints; demonstrates how to
  access op attributes from PatternCheckContext
- Move GetGlobalRequired lookup outside the compiler loop in codegen.cc
  so the registry hash-map is queried once rather than per-function
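A self-contained sketch of the kind of guarded calculation the first fix describes (the function name follows the commit message, but the body is an assumed illustration of asymmetric uint8 calibration, not the PR's actual code):

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

struct QuantParams {
  float scale;
  int32_t zero_point;
};

// Asymmetric uint8 quantization parameters. Guards against an empty
// tensor and a zero range (all values identical), clamps the scale
// floor to 1e-7f so the zero-point division cannot divide by zero,
// and uses std::round instead of truncation for the zero point.
QuantParams CalculateQuantizationParams(const std::vector<float>& data) {
  if (data.empty()) return {1.0f, 0};  // guard: nothing to calibrate
  auto [mn_it, mx_it] = std::minmax_element(data.begin(), data.end());
  float mn = std::min(*mn_it, 0.0f);  // range must include zero
  float mx = std::max(*mx_it, 0.0f);
  float scale = std::max((mx - mn) / 255.0f, 1e-7f);  // clamped floor
  int32_t zp = static_cast<int32_t>(std::round(-mn / scale));
  zp = std::min(std::max(zp, 0), 255);  // keep zero point in uint8 range
  return {scale, zp};
}
```

Without the clamp, a constant input tensor yields `scale == 0` and the zero-point division faults; with it, calibration degrades gracefully to a tiny but valid scale.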
@Aristide021 marked this pull request as ready for review April 20, 2026 15:49
Fully qualify make_object as tvm::ffi::make_object to fix GCC build
failure on CI. Clang accepted the unqualified form as a C++20 extension
but GCC requires explicit namespace resolution.