# Beta Support for XDNA2 Platform #179
Base branch: `devel`. Changes from all commits.
**`.github/workflows/_runner-xdna2.yml`** (new file, +53 lines):

```yaml
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

---
name: _runner-xdna2

"on":
  workflow_call:
    inputs:
      pytest-marker:
        required: true
        type: string
      docker-image:
        required: true
        type: string

jobs:
  test-runner-xdna2:
    runs-on: xdna2-npu
    # NOTE: We cannot use the `container:` directive here because
    # GitHub Actions does not support `--device` flags required for
    # NPU access (/dev/accel/accel0). Instead we use explicit
    # `docker run` commands.
    steps:
      - name: Fix workspace permissions
        shell: bash
        run: |
          docker run --rm \
            -v "${{ github.workspace }}":/workspace \
            ${{ inputs.docker-image }} \
            chown -R $(id -u):$(id -g) /workspace || true

      - name: Checkout Repo
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Run Tests in Docker
        shell: bash
        run: |
          docker run --rm \
            --device /dev/accel/accel0 \
            --ulimit memlock=-1 \
            -v /opt/xilinx:/opt/xilinx \
            -v "${{ github.workspace }}":/app/Deeploy \
            -w /app/Deeploy \
            ${{ inputs.docker-image }} \
            bash -c "
              pip install -e . &&
              cd DeeployTest &&
              pytest test_platforms.py -v -m 'xdna2 and ${{ inputs.pytest-marker }}'
            "
```
**New workflow `CI • XDNA2`** (new file, +31 lines):

```yaml
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

---
name: CI • XDNA2

"on":
  push:
    branches:
      - "**"
    tags:
      - "v*.*.*"
  pull_request:
  workflow_dispatch:
    inputs:
      docker_image:
        description: "XDNA2 Docker image (must be pre-built on the runner)"
        required: false
        default: "deeploy-xdna:local"
```
**Member:** Why use the local file and not

**Author:** Because right now the CI runs locally on one of my machines and I don't have collaborators. Once this platform becomes more mainstream, I think it will be a good idea to publish a Docker image, but until then it's overkill IMO, especially considering that the XDNA platform has few dependencies.
(workflow continued)

```yaml
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  xdna2-kernels:
    uses: ./.github/workflows/_runner-xdna2.yml
    with:
      pytest-marker: "kernels"
      docker-image: ${{ inputs.docker_image || 'deeploy-xdna:local' }}
```
**`.gitignore`** (around line 57):

```gitignore
# Container Artifacts
.pyusbip/
.cache/

# Claude context file
CLAUDE.md
```
**Member:** Are we sure we want this in the .gitignore? It could be useful if well-maintained.

**Author:** For now it's reasonable IMO. In a later PR someone could develop an
(`.gitignore` continued)

```gitignore
Container/xrt-debs/
```
**New XDNA2 Dockerfile** (new file, +56 lines):

```dockerfile
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

FROM ubuntu:24.04

ARG DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV PIP_BREAK_SYSTEM_PACKAGES=1
ENV LLVM_INSTALL_DIR="nope"

WORKDIR /app/build

RUN apt-get update && apt-get install -y \
    software-properties-common \
    && add-apt-repository -y ppa:amd-team/xrt \
    && apt-get update && apt-get install -y \
    cmake \
    ninja-build \
    g++ \
    git \
    git-lfs \
    python3 \
    python3-pip \
    python-is-python3 \
    uuid-dev \
    wget \
    curl \
    ccache \
    libxrt2 \
    libxrt-npu2 \
    libxrt-dev \
    libxrt-utils \
    libxrt-utils-npu \
    && rm -rf /var/lib/apt/lists/*

ENV XILINX_XRT=/opt/xilinx/xrt
ENV PATH=${XILINX_XRT}/bin:${PATH}
ENV LD_LIBRARY_PATH=${XILINX_XRT}/lib

# Remove unused files and clean up to reduce image size
WORKDIR /app
RUN rm -rf /app/build

COPY pyproject.toml requirements-xdna.txt ./
RUN pip install toml-to-requirements && \
    toml-to-req --toml-file pyproject.toml && \
    pip install -r requirements.txt && \
    pip install -r requirements-xdna.txt && \
    rm -f requirements.txt pyproject.toml requirements-xdna.txt

ENV MLIR_AIE_PYTHON=/usr/bin/python3

WORKDIR /app/Deeploy
```
**Member:** Nit, but can we align this to the other Dockerfile? Start with and at the end do. You don't have to remove individual files during the build steps, which could lead to forgetting some and polluting the work directory.

**Author:** Done in 7e8736e. But also, the XDNA Dockerfile does not build anything locally for now, so there is nothing to clean up. It may be useful in the future, though.
**Member:** In general, I like separating MLIR features from the others, but some of these generic abstractions seem very specific to XDNA.

**Author:** Makes sense. I think it would make sense to rename these things to MLIR-AIE instead of just MLIR, to prevent people from thinking this is generic MLIR.

**Author:** One more general issue is that an MLIR compiler for an accelerator will always be quite tailored to the platform (unless one flow becomes the standard, but I doubt that will happen in the next 5 years). Hence I think very few objects and abstractions will end up being labeled as "generic MLIR".
**New Python module** (new file, +201 lines):

```python
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0
"""Base classes for MLIR-emitting node templates and code transformations.

This module provides:

* :class:`MLIRNodeTemplate` — a :class:`NodeTemplate` subclass whose
  ``emit()`` method populates an ``mlir.ir.Module`` instead of rendering C.
* :class:`MLIRExecutionBlock` — MLIR-specific execution state replacing the
  C-oriented :class:`ExecutionBlock` (code-snippet deque) with MLIR builder
  state (tile references, ObjectFifo handles, tiling parameters).
* :class:`MLIRCodeTransformationPass` — base class for MLIR code
  transformation passes that operate on an :class:`MLIRExecutionBlock`.
* :class:`MLIRCodeTransformation` — two-phase pass container
  (``devicePasses`` + ``runtimeSequencePasses``) that the deployer
  orchestrates inside ``@aie_d.device`` and ``@aiex_d.runtime_sequence``
  regions respectively.

All classes are intentionally dialect-agnostic so that future MLIR-based
backends (NVGPU, Linalg, …) can reuse them.
"""

from __future__ import annotations

from abc import abstractmethod
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple

from Deeploy.DeeployTypes import NodeTemplate

if TYPE_CHECKING:
    from Deeploy.DeeployTypes import NetworkContext, OperatorRepresentation

# ======================================================================
# MLIRExecutionBlock
# ======================================================================


class MLIRExecutionBlock:
    """MLIR-specific execution state for a single operator.

    Replaces the C-oriented :class:`ExecutionBlock` (which holds a deque of
    :class:`CodeSnippet` objects) with fields that carry MLIR builder state
    through the code-transformation pipeline.

    Passes populate fields progressively:

    1. The deployer sets ``computeTile``, ``shimTile``,
       ``operatorRepresentation``, and ``patternMemoryConstraint``.
    2. A device-phase pass (e.g. ``MLIRObjectFifoPass``) fills
       ``fifoMap``, ``fifoTypes``, ``tileSize``, ``numTiles``,
       ``kernelFuncName``, and ``kernelObjFile``.
    3. The deployer sets ``runtimeSequenceArgs`` before the runtime-
       sequence phase.
    4. A runtime-sequence pass (e.g. ``MLIRRuntimeSequencePass``) reads
       all of the above to emit DMA configuration.
    """

    def __init__(self, computeTile: Any = None, shimTile: Any = None) -> None:
        # MLIR tile references (set by deployer)
        self.computeTile: Any = computeTile
        self.shimTile: Any = shimTile
```
**Member** (on lines 60 to 62): Isn't this specific to XDNA?

**Author:** Indeed.
|
|
||
| # Operator metadata (set by deployer from parser) | ||
| self.operatorRepresentation: OperatorRepresentation = {} | ||
|
|
||
| # Tiling constraint from midend solver (may be None) | ||
| self.patternMemoryConstraint: Any = None | ||
|
|
||
| # Populated by device-phase passes (e.g. MLIRObjectFifoPass) | ||
| self.fifoMap: Dict[str, str] = {} # tensor name → FIFO name | ||
| self.fifoTypes: Dict[str, Any] = {} # tensor name → MemRefType | ||
| self.tileSize: int = 0 | ||
| self.numTiles: int = 0 | ||
| self.numElements: int = 0 | ||
| self.kernelFuncName: Optional[str] = None | ||
| self.kernelObjFile: Optional[str] = None | ||
|
|
||
| # The MLIRNodeTemplate for this node (set by deployer, called by | ||
| # MLIRComputeCorePass to emit the kernel call inside the core block) | ||
| self.template: Optional[Any] = None | ||
|
|
||
| # Set by deployer before runtime-sequence phase | ||
| self.runtimeSequenceArgs: List[Any] = [] | ||
|
|
||
| # Input / output tensor name lists (set by deployer from parser) | ||
| self.inputNames: List[str] = [] | ||
| self.outputNames: List[str] = [] | ||
|
|
||
|
|
||
| # ====================================================================== | ||
| # MLIRCodeTransformationPass / MLIRCodeTransformation | ||
| # ====================================================================== | ||
|
|
||
|
|
||
| class MLIRCodeTransformationPass: | ||
| """Base class for passes that transform an :class:`MLIRExecutionBlock`. | ||
|
|
||
| Subclasses override :meth:`apply` to read / mutate the block's fields | ||
| and optionally emit MLIR operations into the current insertion point. | ||
| """ | ||
|
|
||
| def apply(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock, | ||
| name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]: | ||
| return ctxt, mlirBlock | ||
|
|
||
|
|
||
| class MLIRCodeTransformation: | ||
| """Two-phase pass container for MLIR code transformations. | ||
|
|
||
| *devicePasses* run inside an ``@aie_d.device(...)`` region (ObjectFifo | ||
| creation, external-kernel declarations, …). | ||
|
|
||
| *runtimeSequencePasses* run inside an ``@aiex_d.runtime_sequence`` | ||
| block (DMA configuration, token await, …). | ||
|
|
||
| The deployer calls :meth:`applyDevicePasses` and | ||
| :meth:`applyRuntimeSequencePasses` at the appropriate points. | ||
| """ | ||
|
|
||
| def __init__(self, | ||
| devicePasses: Optional[List[MLIRCodeTransformationPass]] = None, | ||
| runtimeSequencePasses: Optional[List[MLIRCodeTransformationPass]] = None) -> None: | ||
| self.devicePasses: List[MLIRCodeTransformationPass] = devicePasses or [] | ||
| self.runtimeSequencePasses: List[MLIRCodeTransformationPass] = runtimeSequencePasses or [] | ||
|
|
||
| def applyDevicePasses(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock, | ||
| name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]: | ||
| for _pass in self.devicePasses: | ||
| ctxt, mlirBlock = _pass.apply(ctxt, mlirBlock, name) | ||
| return ctxt, mlirBlock | ||
|
|
||
| def applyRuntimeSequencePasses(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock, | ||
| name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]: | ||
| for _pass in self.runtimeSequencePasses: | ||
| ctxt, mlirBlock = _pass.apply(ctxt, mlirBlock, name) | ||
| return ctxt, mlirBlock | ||
|
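The two-phase container above can be exercised end to end. The sketch below uses simplified stand-ins (a plain dict in place of `NetworkContext`, and a hypothetical `ObjectFifoPass` that is not in the PR) to show how a device-phase pass mutates the execution block:

```python
from typing import Dict, Tuple

# Simplified stand-ins mirroring the PR's API; the real classes take a
# Deeploy NetworkContext instead of a plain dict.
class MLIRExecutionBlock:
    def __init__(self) -> None:
        self.fifoMap: Dict[str, str] = {}
        self.tileSize: int = 0

class MLIRCodeTransformationPass:
    def apply(self, ctxt: dict, mlirBlock: "MLIRExecutionBlock",
              name: str) -> Tuple[dict, "MLIRExecutionBlock"]:
        return ctxt, mlirBlock

class ObjectFifoPass(MLIRCodeTransformationPass):
    # Hypothetical device-phase pass: records tiling info and a FIFO mapping.
    def apply(self, ctxt, mlirBlock, name):
        mlirBlock.tileSize = 64
        mlirBlock.fifoMap[f"{name}_in0"] = "of_in0"
        return ctxt, mlirBlock

class MLIRCodeTransformation:
    def __init__(self, devicePasses=None, runtimeSequencePasses=None):
        self.devicePasses = devicePasses or []
        self.runtimeSequencePasses = runtimeSequencePasses or []

    def applyDevicePasses(self, ctxt, mlirBlock, name):
        # Each pass threads (ctxt, block) through, exactly as in the PR.
        for p in self.devicePasses:
            ctxt, mlirBlock = p.apply(ctxt, mlirBlock, name)
        return ctxt, mlirBlock

transform = MLIRCodeTransformation(devicePasses=[ObjectFifoPass()])
ctxt, block = transform.applyDevicePasses({}, MLIRExecutionBlock(), "gemm0")
print(block.tileSize)  # 64
print(block.fifoMap)   # {'gemm0_in0': 'of_in0'}
```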
**Member** (on lines 108 to 137): The notion of device and runtime in this context seems specific to XDNA. Would it make sense to abstract it into a struct that encodes a list of passes for different domains? Something like this:

```python
def registerPasses(self, domain: str, passes: List[MLIRCodeTransformationPass])

def transform(self,
              ctxt: NetworkContext,
              domain: str,
              name: str,
              verbose: CodeGenVerbosity = _NoVerbosity):
```

```python
XDNA2Transformer.registerPasses('device', devicePasses)
XDNA2Transformer.registerPasses('runtime', runtimeSequencePasses)
```

Also note that the "apply" function is called

**Author:** I like the idea of the pass registration; it's cleaner, especially if we register passes after object creation. The next PRs will touch a lot the
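The registration idea from this thread could look roughly as follows. `DomainPassRegistry`, `MarkerPass`, and the domain strings are illustrative names for the reviewer's sketch, not code from the PR:

```python
from collections import defaultdict
from typing import Dict, List

class _Pass:
    """Minimal pass interface: thread (ctxt, block) through apply()."""
    def apply(self, ctxt, block, name):
        return ctxt, block

class DomainPassRegistry:
    """Maps a domain string to its pass list instead of hard-coding phases."""
    def __init__(self) -> None:
        self._passes: Dict[str, List[_Pass]] = defaultdict(list)

    def registerPasses(self, domain: str, passes: List[_Pass]) -> None:
        self._passes[domain].extend(passes)

    def transform(self, ctxt, block, domain: str, name: str):
        # Run only the passes registered for the requested domain.
        for p in self._passes.get(domain, []):
            ctxt, block = p.apply(ctxt, block, name)
        return ctxt, block

class MarkerPass(_Pass):
    # Toy pass: records which domain handled which node.
    def __init__(self, tag: str) -> None:
        self.tag = tag
    def apply(self, ctxt, block, name):
        block.append(f"{self.tag}:{name}")
        return ctxt, block

registry = DomainPassRegistry()
registry.registerPasses("device", [MarkerPass("dev")])
registry.registerPasses("runtime", [MarkerPass("rt")])
_, trace = registry.transform({}, [], "device", "conv0")
print(trace)  # ['dev:conv0']
```

Registering passes after object creation keeps the container platform-neutral: an XDNA2 deployer registers `device` and `runtime` domains, while another backend could register entirely different ones.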
(module continued)

```python
# ======================================================================
# MLIRNodeTemplate
# ======================================================================


class MLIRNodeTemplate(NodeTemplate):
    """NodeTemplate subclass that emits MLIR instead of C code.

    Subclasses must override :meth:`emit` to add dialect operations to an
    ``mlir.ir.Module`` (or region / insertion point provided via *kwargs*).

    ``generate()`` is overridden as a convenience that constructs a
    standalone module, calls :meth:`emit`, and returns the MLIR text.
    The base-class ``alignToContext`` / ``hoistTransientBuffers`` hooks are
    retained and work unchanged.
    """

    def __init__(self):
        # Empty Mako template — no C code is generated.
        super().__init__("")

    # ------------------------------------------------------------------
    # Subclass API
    # ------------------------------------------------------------------

    @abstractmethod
    def emit(self, operatorRepresentation: OperatorRepresentation, **kwargs) -> None:
        """Populate an MLIR module with the operations for this node.

        The caller (typically the deployer) sets up an ``mlir.ir.Module``
        with the appropriate device wrapper and passes dialect-specific
        context through *kwargs* (e.g. insertion point, tile references,
        ObjectFifo handles).

        Parameters
        ----------
        operatorRepresentation : OperatorRepresentation
            The parser's node representation (buffer names, sizes, types …).
        **kwargs
            Dialect-specific context provided by the deployer.
        """
        ...

    # ------------------------------------------------------------------
    # NodeTemplate overrides
    # ------------------------------------------------------------------

    def generate(self, operatorRepresentation = {}, **kwargs) -> str:
        """Generate an MLIR string for this node.

        This default implementation is a thin wrapper: it delegates to
        :meth:`emit`. Deployers that need to build a single module from
        multiple nodes should call :meth:`emit` directly with the shared
        module context and then stringify the complete module themselves.

        Returns
        -------
        str
            MLIR text (printable module or fragment).
        """
        self.emit(operatorRepresentation, **kwargs)
        return ""
```
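To illustrate the emit/generate split without a real `mlir.ir.Module`, here is a self-contained sketch in which the "module" is just a list of strings. `AddTemplate`, the `module` kwarg, and the emitted text are hypothetical, used only to show the control flow:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List

class MLIRNodeTemplateSketch(ABC):
    """Stand-in for the PR's MLIRNodeTemplate (minus Deeploy base classes)."""

    @abstractmethod
    def emit(self, operatorRepresentation: Dict[str, Any], **kwargs) -> None:
        ...

    def generate(self, operatorRepresentation: Dict[str, Any] = {}, **kwargs) -> str:
        # Thin wrapper, mirroring the PR: delegate to emit(); no C code comes back.
        self.emit(operatorRepresentation, **kwargs)
        return ""

class AddTemplate(MLIRNodeTemplateSketch):
    # Hypothetical template: appends an op description into the shared "module".
    def emit(self, operatorRepresentation, **kwargs) -> None:
        module: List[str] = kwargs["module"]
        module.append(f"func.call @add_kernel for {operatorRepresentation['nodeName']}")

module: List[str] = []
result = AddTemplate().generate({"nodeName": "add0"}, module=module)
print(result == "")  # True: generate() returns no C code
print(module)        # ['func.call @add_kernel for add0']
```

This also shows why the deployer should call `emit()` directly when assembling one module from many nodes: every template appends into the same shared context rather than producing a standalone string.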
|
Victor-Jung marked this conversation as resolved.
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
Repository: pulp-platform/Deeploy
Length of output: 20728
🏁 Script executed:
Repository: pulp-platform/Deeploy
Length of output: 1914
**Validate and quote workflow inputs before using them in docker and bash commands.**

- `inputs.docker-image` (lines 31, 48) is unquoted and directly interpolated as a docker image argument, allowing injection of docker flags (e.g., `-v /etc:/etc`).
- `inputs.pytest-marker` (line 52) is passed unvalidated to pytest inside bash.

On a self-hosted runner with `--device /dev/accel/accel0` and volume mounts to `/opt/xilinx`, this enables command injection with privileged access. Add input validation before use and quote all interpolations.
Suggested hardening:

```diff
 jobs:
   test-runner-xdna2:
     runs-on: xdna2-npu
     steps:
+      - name: Validate workflow inputs
+        shell: bash
+        env:
+          DOCKER_IMAGE: ${{ inputs.docker-image }}
+          PYTEST_MARKER_EXPR: ${{ inputs.pytest-marker }}
+        run: |
+          set -euo pipefail
+          [[ "$DOCKER_IMAGE" =~ ^[a-zA-Z0-9._/:@-]+$ ]] || { echo "Invalid docker-image"; exit 1; }
+          [[ "$PYTEST_MARKER_EXPR" =~ ^[a-zA-Z0-9_[:space:]()!-]+$ ]] || { echo "Invalid pytest-marker"; exit 1; }
+
       - name: Fix workspace permissions
         shell: bash
         run: |
           docker run --rm \
             -v "${{ github.workspace }}":/workspace \
-            ${{ inputs.docker-image }} \
+            "${{ inputs.docker-image }}" \
             chown -R $(id -u):$(id -g) /workspace || true

       - name: Run Tests in Docker
         shell: bash
+        env:
+          PYTEST_MARKER_EXPR: ${{ inputs.pytest-marker }}
         run: |
           docker run --rm \
             --device /dev/accel/accel0 \
             --ulimit memlock=-1 \
             -v /opt/xilinx:/opt/xilinx \
             -v "${{ github.workspace }}":/app/Deeploy \
             -w /app/Deeploy \
-            ${{ inputs.docker-image }} \
+            "${{ inputs.docker-image }}" \
             bash -c "
               pip install -e . &&
               cd DeeployTest &&
-              pytest test_platforms.py -v -m 'xdna2 and ${{ inputs.pytest-marker }}'
+              pytest test_platforms.py -v -m \"xdna2 and ${PYTEST_MARKER_EXPR}\"
             "
```