27 commits
05b9404
Update gitignore
Victor-Jung Mar 12, 2026
5615ed4
XDNA2 Platform Beta Support
Victor-Jung Mar 12, 2026
d039415
Add XDNA container
Victor-Jung Mar 17, 2026
e66864a
First attempt at generating MLIR code with Deeploy
Victor-Jung Mar 18, 2026
d854846
Generate tiled code but too much logic is in the Template
Victor-Jung Mar 18, 2026
9f7db26
Move data movement in passes. Template represent for loop and aquire/…
Victor-Jung Mar 18, 2026
14f3ced
Template is agnostic of tiling and data movement that are handled by …
Victor-Jung Mar 18, 2026
b850b23
Add CI on self hosted runner
Victor-Jung Mar 19, 2026
d79e36f
Remove unecessary install
Victor-Jung Mar 19, 2026
7c995bc
Add cleanup step before checkout to fix permission
Victor-Jung Mar 19, 2026
fc2b364
aie import is optional to not enforce mlir-aie and llvm-aie package i…
Victor-Jung Mar 24, 2026
1865530
Decouple xdna requirements from dev requirements
Victor-Jung Mar 24, 2026
de6f961
Format
Victor-Jung Mar 24, 2026
01d458b
Format
Victor-Jung Mar 24, 2026
4427f5a
Add general todos for future refactoring
Victor-Jung Mar 26, 2026
a82fd52
Format
Victor-Jung Mar 26, 2026
381b1e3
Free output tasks in RT sequence
Victor-Jung May 7, 2026
f8c1eaa
Add -v flag for XDNA platform only.
Victor-Jung May 7, 2026
a20ec7f
Pin llvm-aie version
Victor-Jung May 7, 2026
1add41c
Add remainder loop to Add kernel
Victor-Jung May 7, 2026
7e8736e
Use workdir to easily discard build-time files
Victor-Jung May 7, 2026
acd5680
Align CI runner interface
Victor-Jung May 8, 2026
888fe40
Remove redundent XDNA mapping
Victor-Jung May 8, 2026
6454cc8
Rename XDNA2AddChecker into XDNA2FloatAddChecker
Victor-Jung May 8, 2026
630648a
Update license of new files to 2026
Victor-Jung May 8, 2026
4fbd7e2
Hoist mlir-aie and llvm-aie detection in cmake folder to prevent dupl…
Victor-Jung May 8, 2026
cde0c5d
Improve XDNA README clarity
Victor-Jung May 8, 2026
53 changes: 53 additions & 0 deletions .github/workflows/_runner-xdna2.yml
@@ -0,0 +1,53 @@
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

---
name: _runner-xdna2

"on":
  workflow_call:
    inputs:
      pytest-marker:
        required: true
        type: string
      docker-image:
        required: true
        type: string

jobs:
  test-runner-xdna2:
    runs-on: xdna2-npu
    # NOTE: We cannot use the `container:` directive here because
    # GitHub Actions does not support `--device` flags required for
    # NPU access (/dev/accel/accel0). Instead we use explicit
    # `docker run` commands.
    steps:
      - name: Fix workspace permissions
        shell: bash
        run: |
          docker run --rm \
            -v "${{ github.workspace }}":/workspace \
            ${{ inputs.docker-image }} \
            chown -R $(id -u):$(id -g) /workspace || true
Comment on lines +31 to +32
Contributor:
⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify all call sites and current constraints for reusable workflow inputs
set -euo pipefail

rg -n -C2 'uses:\s*\./\.github/workflows/_runner-xdna2\.yml|pytest-marker:|docker-image:' .github/workflows
rg -n -C2 'inputs:\s*$|pytest-marker:|docker-image:' .github/workflows/_runner-xdna2.yml

Repository: pulp-platform/Deeploy

Length of output: 20728


🏁 Script executed:

cat -n .github/workflows/_runner-xdna2.yml

Repository: pulp-platform/Deeploy

Length of output: 1914


Validate and quote workflow inputs before using in docker and bash commands.

inputs.docker-image (lines 31, 48) is unquoted and directly interpolated as a docker image argument, allowing injection of docker flags (e.g., -v /etc:/etc). inputs.pytest-marker (line 52) is passed unvalidated to pytest inside bash. On a self-hosted runner with --device /dev/accel/accel0 and volume mounts to /opt/xilinx, this enables command injection with privileged access.

Add input validation before use and quote all interpolations:

Suggested hardening
 jobs:
   test-runner-xdna2:
     runs-on: xdna2-npu
     steps:
+      - name: Validate workflow inputs
+        shell: bash
+        env:
+          DOCKER_IMAGE: ${{ inputs.docker-image }}
+          PYTEST_MARKER_EXPR: ${{ inputs.pytest-marker }}
+        run: |
+          set -euo pipefail
+          [[ "$DOCKER_IMAGE" =~ ^[a-zA-Z0-9._/:@-]+$ ]] || { echo "Invalid docker-image"; exit 1; }
+          [[ "$PYTEST_MARKER_EXPR" =~ ^[a-zA-Z0-9_[:space:]()!-]+$ ]] || { echo "Invalid pytest-marker"; exit 1; }
+
       - name: Fix workspace permissions
         shell: bash
         run: |
           docker run --rm \
             -v "${{ github.workspace }}":/workspace \
-            ${{ inputs.docker-image }} \
+            "${{ inputs.docker-image }}" \
             chown -R $(id -u):$(id -g) /workspace || true
       
       - name: Run Tests in Docker
         shell: bash
+        env:
+          PYTEST_MARKER_EXPR: ${{ inputs.pytest-marker }}
         run: |
           docker run --rm \
             --device /dev/accel/accel0 \
             --ulimit memlock=-1 \
             -v /opt/xilinx:/opt/xilinx \
             -v "${{ github.workspace }}":/app/Deeploy \
             -w /app/Deeploy \
-            ${{ inputs.docker-image }} \
+            "${{ inputs.docker-image }}" \
             bash -c "
               pip install -e . &&
               cd DeeployTest &&
-              pytest test_platforms.py -v -m 'xdna2 and ${{ inputs.pytest-marker }}'
+              pytest test_platforms.py -v -m \"xdna2 and ${PYTEST_MARKER_EXPR}\"
             "
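The whitelist idea from the suggestion above can also be sketched outside the workflow. The Python below is an illustrative stand-alone check; the regexes, function name, and the leading-dash guard are hypothetical, not part of the PR:

```python
import re

# Illustrative whitelists mirroring the suggested bash checks:
# an image reference (name[:tag]) and a pytest marker expression.
IMAGE_RE = re.compile(r"^[a-zA-Z0-9._/:@-]+$")
MARKER_RE = re.compile(r"^[a-zA-Z0-9_()!\s-]+$")


def validate_inputs(docker_image: str, pytest_marker: str) -> None:
    """Fail fast when either input falls outside the whitelist."""
    # A leading '-' would be parsed as a docker flag, so reject it explicitly.
    if docker_image.startswith("-") or not IMAGE_RE.fullmatch(docker_image):
        raise ValueError(f"invalid docker-image: {docker_image!r}")
    if not MARKER_RE.fullmatch(pytest_marker):
        raise ValueError(f"invalid pytest-marker: {pytest_marker!r}")


validate_inputs("deeploy-xdna:local", "kernels")    # accepted
try:
    validate_inputs("img -v /etc:/etc", "kernels")  # space smuggles a flag
except ValueError as err:
    print("rejected:", err)
```

The key point is the same as in the suggested diff: validate before interpolation, because quoting alone does not stop a crafted image string from becoming extra `docker run` arguments.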

      - name: Checkout Repo
        uses: actions/checkout@v4
        with:
          submodules: recursive

      - name: Run Tests in Docker
        shell: bash
        run: |
          docker run --rm \
            --device /dev/accel/accel0 \
            --ulimit memlock=-1 \
            -v /opt/xilinx:/opt/xilinx \
            -v "${{ github.workspace }}":/app/Deeploy \
            -w /app/Deeploy \
            ${{ inputs.docker-image }} \
            bash -c "
              pip install -e . &&
              cd DeeployTest &&
              pytest test_platforms.py -v -m 'xdna2 and ${{ inputs.pytest-marker }}'
            "
31 changes: 31 additions & 0 deletions .github/workflows/ci-platform-xdna2.yml
@@ -0,0 +1,31 @@
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

---
name: CI • XDNA2

"on":
  push:
    branches:
      - "**"
    tags:
      - "v*.*.*"
  pull_request:
  workflow_dispatch:
    inputs:
      docker_image:
        description: "XDNA2 Docker image (must be pre-built on the runner)"
        required: false
        default: "deeploy-xdna:local"
Member:
Why use the local file and not ghcr.io/pulp-platform/deeploy-xdnba:devel?

Member Author:
Because right now the CI runs locally on one of my machines and I don't have collaborators. Once this platform becomes more mainstream, I think it will be a good idea to publish a Docker image, but until then it's overkill IMO, especially considering that the XDNA platform has few dependencies.


concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  xdna2-kernels:
    uses: ./.github/workflows/_runner-xdna2.yml
    with:
      pytest-marker: "kernels"
      docker-image: ${{ inputs.docker_image || 'deeploy-xdna:local' }}
4 changes: 4 additions & 0 deletions .gitignore
@@ -57,3 +57,7 @@ CHANGELOG_GEN.md
# Container Artifacts
.pyusbip/
.cache/

# Claude context file
CLAUDE.md
Member:
Are we sure we want this in the .gitignore? It could be useful if well-maintained.

Member Author:
For now it's reasonable IMO. In a later PR someone could develop an AGENT.md (to make it agnostic of Claude).

Container/xrt-debs/
17 changes: 17 additions & 0 deletions CMakeLists.txt
@@ -46,6 +46,8 @@ elseif(platform STREQUAL SoftHier)
  message(STATUS "Building for platform 'SoftHier'")
elseif(platform STREQUAL Chimera)
  message(STATUS "Building for platform 'Chimera'")
elseif(platform STREQUAL XDNA2)
  message(STATUS "Building for platform 'XDNA2'")
else()
  message(FATAL_ERROR "Invalid platform '${platform}' specified!")
endif()
@@ -299,5 +301,20 @@

endif()

if(platform STREQUAL XDNA2)

  project(${TESTNAME} LANGUAGES CXX)

  message(STATUS "============================= XDNA2 Configuration ============================")
  message(STATUS "[cMake ] GENERATED_SOURCE = " ${GENERATED_SOURCE})
  message(STATUS "[cMake ] TESTNAME = " ${TESTNAME})
  message(STATUS "==============================================================================")
  message(STATUS "")

  add_subdirectory(TargetLibraries/XDNA2)
  add_subdirectory(DeeployTest/Platforms/XDNA2)

endif()


print_simulation_config()
56 changes: 56 additions & 0 deletions Container/Dockerfile.deeploy-xdna
@@ -0,0 +1,56 @@
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0

FROM ubuntu:24.04

ARG DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC
ENV LANG=C.UTF-8
ENV LC_ALL=C.UTF-8
ENV PIP_BREAK_SYSTEM_PACKAGES=1
ENV LLVM_INSTALL_DIR="nope"

WORKDIR /app/build

RUN apt-get update && apt-get install -y \
    software-properties-common \
    && add-apt-repository -y ppa:amd-team/xrt \
    && apt-get update && apt-get install -y \
    cmake \
    ninja-build \
    g++ \
    git \
    git-lfs \
    python3 \
    python3-pip \
    python-is-python3 \
    uuid-dev \
    wget \
    curl \
    ccache \
    libxrt2 \
    libxrt-npu2 \
    libxrt-dev \
    libxrt-utils \
    libxrt-utils-npu \
    && rm -rf /var/lib/apt/lists/*

ENV XILINX_XRT=/opt/xilinx/xrt
ENV PATH=${XILINX_XRT}/bin:${PATH}
ENV LD_LIBRARY_PATH=${XILINX_XRT}/lib

# Remove unused files and clean up to reduce image size
WORKDIR /app
RUN rm -rf /app/build

COPY pyproject.toml requirements-xdna.txt ./
RUN pip install toml-to-requirements && \
    toml-to-req --toml-file pyproject.toml && \
    pip install -r requirements.txt && \
    pip install -r requirements-xdna.txt && \
    rm -f requirements.txt pyproject.toml requirements-xdna.txt

ENV MLIR_AIE_PYTHON=/usr/bin/python3

WORKDIR /app/Deeploy
Member:
Nit, but can we align this to the other Dockerfile?

Start with

WORKDIR /app/build

and the end do

# Remove unused files and clean up to reduce image size
WORKDIR /app
RUN rm -rf /app/build

You don't have to remove individual files during the build steps, which could lead to forgetting some and polluting the work directory.

Member Author:
Done in 7e8736e

But also, the XDNA Dockerfile does not build anything locally for now, so there is nothing to clean up. It may be useful in the future, though.

201 changes: 201 additions & 0 deletions Deeploy/MLIRDataTypes.py
Member:
In general, I like separating MLIR features from the others, but some of these generic abstractions seem very specific to XDNA.

Member Author:
Makes sense. I think it would make sense to rename these things to MLIR-AIE instead of just MLIR, to prevent people from thinking this is generic MLIR.

Member Author:
One more general issue is that an MLIR compiler for an accelerator will always be quite tailored to the platform (unless one flow becomes the standard, but I doubt that will happen in the next 5 years). Hence I think very few objects and abstractions will end up being labeled as 'generic MLIR'.

@@ -0,0 +1,201 @@
# SPDX-FileCopyrightText: 2026 ETH Zurich and University of Bologna
#
# SPDX-License-Identifier: Apache-2.0
"""Base classes for MLIR-emitting node templates and code transformations.

This module provides:

* :class:`MLIRNodeTemplate` — a :class:`NodeTemplate` subclass whose
  ``emit()`` method populates an ``mlir.ir.Module`` instead of rendering C.
* :class:`MLIRExecutionBlock` — MLIR-specific execution state replacing the
  C-oriented :class:`ExecutionBlock` (code-snippet deque) with MLIR builder
  state (tile references, ObjectFifo handles, tiling parameters).
* :class:`MLIRCodeTransformationPass` — base class for MLIR code
  transformation passes that operate on an :class:`MLIRExecutionBlock`.
* :class:`MLIRCodeTransformation` — two-phase pass container
  (``devicePasses`` + ``runtimeSequencePasses``) that the deployer
  orchestrates inside ``@aie_d.device`` and ``@aiex_d.runtime_sequence``
  regions respectively.

All classes are intentionally dialect-agnostic so that future MLIR-based
backends (NVGPU, Linalg, …) can reuse them.
"""

from __future__ import annotations

from abc import abstractmethod
from typing import TYPE_CHECKING, Any, Dict, List, Optional, Tuple

from Deeploy.DeeployTypes import NodeTemplate

if TYPE_CHECKING:
    from Deeploy.DeeployTypes import NetworkContext, OperatorRepresentation

# ======================================================================
# MLIRExecutionBlock
# ======================================================================


class MLIRExecutionBlock:
    """MLIR-specific execution state for a single operator.

    Replaces the C-oriented :class:`ExecutionBlock` (which holds a deque of
    :class:`CodeSnippet` objects) with fields that carry MLIR builder state
    through the code-transformation pipeline.

    Passes populate fields progressively:

    1. The deployer sets ``computeTile``, ``shimTile``,
       ``operatorRepresentation``, and ``patternMemoryConstraint``.
    2. A device-phase pass (e.g. ``MLIRObjectFifoPass``) fills
       ``fifoMap``, ``fifoTypes``, ``tileSize``, ``numTiles``,
       ``kernelFuncName``, and ``kernelObjFile``.
    3. The deployer sets ``runtimeSequenceArgs`` before the runtime-
       sequence phase.
    4. A runtime-sequence pass (e.g. ``MLIRRuntimeSequencePass``) reads
       all of the above to emit DMA configuration.
    """

    def __init__(self, computeTile: Any = None, shimTile: Any = None) -> None:
        # MLIR tile references (set by deployer)
        self.computeTile: Any = computeTile
        self.shimTile: Any = shimTile
Comment on lines +60 to +62
Member:
Isn't this specific to XDNA?

Member Author:
Indeed


        # Operator metadata (set by deployer from parser)
        self.operatorRepresentation: OperatorRepresentation = {}

        # Tiling constraint from midend solver (may be None)
        self.patternMemoryConstraint: Any = None

        # Populated by device-phase passes (e.g. MLIRObjectFifoPass)
        self.fifoMap: Dict[str, str] = {}  # tensor name → FIFO name
        self.fifoTypes: Dict[str, Any] = {}  # tensor name → MemRefType
        self.tileSize: int = 0
        self.numTiles: int = 0
        self.numElements: int = 0
        self.kernelFuncName: Optional[str] = None
        self.kernelObjFile: Optional[str] = None

        # The MLIRNodeTemplate for this node (set by deployer, called by
        # MLIRComputeCorePass to emit the kernel call inside the core block)
        self.template: Optional[Any] = None

        # Set by deployer before runtime-sequence phase
        self.runtimeSequenceArgs: List[Any] = []

        # Input / output tensor name lists (set by deployer from parser)
        self.inputNames: List[str] = []
        self.outputNames: List[str] = []


# ======================================================================
# MLIRCodeTransformationPass / MLIRCodeTransformation
# ======================================================================


class MLIRCodeTransformationPass:
    """Base class for passes that transform an :class:`MLIRExecutionBlock`.

    Subclasses override :meth:`apply` to read / mutate the block's fields
    and optionally emit MLIR operations into the current insertion point.
    """

    def apply(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock,
              name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]:
        return ctxt, mlirBlock


class MLIRCodeTransformation:
    """Two-phase pass container for MLIR code transformations.

    *devicePasses* run inside an ``@aie_d.device(...)`` region (ObjectFifo
    creation, external-kernel declarations, …).

    *runtimeSequencePasses* run inside an ``@aiex_d.runtime_sequence``
    block (DMA configuration, token await, …).

    The deployer calls :meth:`applyDevicePasses` and
    :meth:`applyRuntimeSequencePasses` at the appropriate points.
    """

    def __init__(self,
                 devicePasses: Optional[List[MLIRCodeTransformationPass]] = None,
                 runtimeSequencePasses: Optional[List[MLIRCodeTransformationPass]] = None) -> None:
        self.devicePasses: List[MLIRCodeTransformationPass] = devicePasses or []
        self.runtimeSequencePasses: List[MLIRCodeTransformationPass] = runtimeSequencePasses or []

    def applyDevicePasses(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock,
                          name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]:
        for _pass in self.devicePasses:
            ctxt, mlirBlock = _pass.apply(ctxt, mlirBlock, name)
        return ctxt, mlirBlock

    def applyRuntimeSequencePasses(self, ctxt: NetworkContext, mlirBlock: MLIRExecutionBlock,
                                   name: str) -> Tuple[NetworkContext, MLIRExecutionBlock]:
        for _pass in self.runtimeSequencePasses:
            ctxt, mlirBlock = _pass.apply(ctxt, mlirBlock, name)
        return ctxt, mlirBlock
Comment on lines +108 to +137
Member:
The notion of device and runtime in this context seems specific to XDNA. Would it make sense to abstract it into a struct that encodes a list of passes for different domains?

And something like this:

def registerPasses(self, domain: str, passes: List[MLIRCodeTransformationPass])

def transform(self,
              ctxt: NetworkContext,
              domain: str,
              name: str,
              verbose: CodeGenVerbosity = _NoVerbosity):

XDNA2Transformer.registerPasses('device', devicePasses)
XDNA2Transformer.registerPasses('runtime', runtimeSequencePasses)

Also note that the "apply" function is called transform in the equivalent CodeTransformationPass class.

Member Author:
I like the idea of pass registration; it's cleaner, especially if we can register passes after object creation. The next PRs will touch MLIRDataTypes.py a lot, and I plan to polish that when I land the spatial mapping. Is that fine?
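The two-phase flow discussed in this thread can be exercised on its own. The sketch below uses minimal stand-ins for the classes in this file so it runs without Deeploy or mlir installed; the pass names (`FifoPass`, `DmaPass`) and their bodies are hypothetical:

```python
from typing import Any, Dict, List


class MLIRExecutionBlock:
    """Stand-in with just the fields the two phases touch."""

    def __init__(self) -> None:
        self.fifoMap: Dict[str, str] = {}
        self.runtimeSequenceArgs: List[Any] = []


class FifoPass:
    def apply(self, ctxt, block, name):
        block.fifoMap[name] = f"fifo_{name}"  # device phase: create FIFOs
        return ctxt, block


class DmaPass:
    def apply(self, ctxt, block, name):
        block.runtimeSequenceArgs.append(name)  # runtime phase: configure DMA
        return ctxt, block


class MLIRCodeTransformation:
    def __init__(self, devicePasses = None, runtimeSequencePasses = None):
        self.devicePasses = devicePasses or []
        self.runtimeSequencePasses = runtimeSequencePasses or []

    def applyDevicePasses(self, ctxt, block, name):
        for p in self.devicePasses:
            ctxt, block = p.apply(ctxt, block, name)
        return ctxt, block

    def applyRuntimeSequencePasses(self, ctxt, block, name):
        for p in self.runtimeSequencePasses:
            ctxt, block = p.apply(ctxt, block, name)
        return ctxt, block


transform = MLIRCodeTransformation([FifoPass()], [DmaPass()])
ctxt, block = transform.applyDevicePasses(None, MLIRExecutionBlock(), "Add_0")
ctxt, block = transform.applyRuntimeSequencePasses(ctxt, block, "Add_0")
print(block.fifoMap, block.runtimeSequenceArgs)
```

In the real deployer the device phase runs inside the `@aie_d.device` region and the runtime phase inside `@aiex_d.runtime_sequence`, but the threading of `(ctxt, block)` through each pass is the same.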



# ======================================================================
# MLIRNodeTemplate
# ======================================================================


class MLIRNodeTemplate(NodeTemplate):
    """NodeTemplate subclass that emits MLIR instead of C code.

    Subclasses must override :meth:`emit` to add dialect operations to an
    ``mlir.ir.Module`` (or region / insertion point provided via *kwargs*).

    ``generate()`` is overridden as a convenience that constructs a
    standalone module, calls :meth:`emit`, and returns the MLIR text.
    The base-class ``alignToContext`` / ``hoistTransientBuffers`` hooks are
    retained and work unchanged.
    """

    def __init__(self):
        # Empty Mako template — no C code is generated.
        super().__init__("")

    # ------------------------------------------------------------------
    # Subclass API
    # ------------------------------------------------------------------

    @abstractmethod
    def emit(self, operatorRepresentation: OperatorRepresentation, **kwargs) -> None:
        """Populate an MLIR module with the operations for this node.

        The caller (typically the deployer) sets up an ``mlir.ir.Module``
        with the appropriate device wrapper and passes dialect-specific
        context through *kwargs* (e.g. insertion point, tile references,
        ObjectFifo handles).

        Parameters
        ----------
        operatorRepresentation : OperatorRepresentation
            The parser's node representation (buffer names, sizes, types …).
        **kwargs
            Dialect-specific context provided by the deployer.
        """
        ...

    # ------------------------------------------------------------------
    # NodeTemplate overrides
    # ------------------------------------------------------------------

    def generate(self, operatorRepresentation = {}, **kwargs) -> str:
        """Generate an MLIR string for this node.

        This default implementation is a thin wrapper: it delegates to
        :meth:`emit`. Deployers that need to build a single module from
        multiple nodes should call :meth:`emit` directly with the shared
        module context and then stringify the complete module themselves.

        Returns
        -------
        str
            MLIR text (printable module or fragment).
        """
        self.emit(operatorRepresentation, **kwargs)
        return ""
Victor-Jung marked this conversation as resolved.
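As a usage sketch, a concrete template only needs to override `emit`. The stand-in base class, the `AddKernelTemplate` name, and the `emitted` kwarg below are hypothetical; a real subclass would insert `mlir.ir` dialect ops at the caller's insertion point instead of recording a marker:

```python
from abc import ABC, abstractmethod


class MLIRNodeTemplateSketch(ABC):
    """Stand-in for MLIRNodeTemplate, without the Deeploy/mlir dependencies."""

    @abstractmethod
    def emit(self, operatorRepresentation, **kwargs) -> None:
        ...

    def generate(self, operatorRepresentation = {}, **kwargs) -> str:
        # Mirrors the override above: delegate to emit(), return no C code.
        self.emit(operatorRepresentation, **kwargs)
        return ""


class AddKernelTemplate(MLIRNodeTemplateSketch):

    def emit(self, operatorRepresentation, emitted = None, **kwargs) -> None:
        # A real template would add dialect ops here; we record which
        # node was emitted so the flow is observable.
        if emitted is not None:
            emitted.append(operatorRepresentation.get("nodeName", "?"))


calls: list = []
out = AddKernelTemplate().generate({"nodeName": "Add_0"}, emitted = calls)
print(out == "", calls)  # True ['Add_0']
```

This also illustrates why `generate()` returns an empty string: the MLIR lives in the module the deployer owns, not in per-node C snippets.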