[for 26.04_linux-nvidia]: Backport the arm-smmu-v3 kdump adoption series #460
When transitioning to a kdump kernel, the primary kernel might have crashed while endpoint devices were actively bus-mastering DMA. Currently, the SMMU driver aggressively resets the hardware during probe by clearing CR0_SMMUEN and setting the Global Bypass Attribute (GBPA) to ABORT.

In a kdump scenario, this aggressive reset is highly destructive:
a) If GBPA is set to ABORT, in-flight DMA will be aborted, generating fatal PCIe AER or SErrors that may panic the kdump kernel.
b) If GBPA is set to BYPASS, in-flight DMA targeting some IOVAs will bypass the SMMU and corrupt the physical memory at those 1:1 mapped IOVAs.

To safely absorb in-flight DMAs, a kdump kernel will have to leave SMMUEN=1 intact and avoid modifying STRTAB_BASE, allowing HW to continue translating in-flight DMAs reusing the crashed kernel's page tables until the endpoint device drivers probe and quiesce their respective hardware.

However, the ARM SMMUv3 architecture specification states that updating the SMMU_STRTAB_BASE register while SMMUEN == 1 is UNPREDICTABLE or ignored. This leaves a kdump kernel no choice but to adopt the stream table from the crashed kernel.

Introduce ARM_SMMU_OPT_KDUMP_ADOPT and adopt functions memremapping all the stream tables extracted from STRTAB_BASE and STRTAB_BASE_CFG.

Note that the adoption of the crashed kernel's stream table follows certain strict rules, since the old stream table might be compromised. Thus, apply some basic validations against the values read from the registers. If tests fail, it means the stream table cannot be trusted, so toss it entirely.

To avoid OOM due to a potentially corrupted stream table, the memremap for l2 tables is done on the kdump kernel's demand.

The new option will be set in a following change.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
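
To illustrate the "basic validations" mentioned in the commit message above, here is a minimal sketch of the kind of sanity check a kdump kernel could apply to the raw STRTAB_BASE and STRTAB_BASE_CFG values before trusting them. The field positions follow the definitions in the Linux arm-smmu-v3 driver; the helper name `kdump_strtab_regs_look_sane()` and the specific rules (non-zero base, LOG2SIZE within the supported StreamID width, SPLIT in {6, 8, 10}) are assumptions for illustration, not necessarily the checks the series performs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register field layout as defined in the Linux arm-smmu-v3 driver */
#define STRTAB_BASE_ADDR_MASK		0x000fffffffffffc0ULL	/* bits [51:6] */
#define STRTAB_BASE_CFG_FMT(cfg)	(((cfg) >> 16) & 0x3u)	/* bits [17:16] */
#define STRTAB_BASE_CFG_SPLIT(cfg)	(((cfg) >> 6) & 0x1fu)	/* bits [10:6] */
#define STRTAB_BASE_CFG_LOG2SIZE(cfg)	((cfg) & 0x3fu)		/* bits [5:0] */
#define STRTAB_BASE_CFG_FMT_LINEAR	0u
#define STRTAB_BASE_CFG_FMT_2LVL	1u

/*
 * Hypothetical helper: returns true only if the register values describe a
 * stream table layout that is plausible for this SMMU. sid_bits stands for
 * the StreamID width the hardware reports (SMMU_IDR1.SIDSIZE).
 */
bool kdump_strtab_regs_look_sane(uint64_t strtab_base, uint32_t strtab_base_cfg,
				 unsigned int sid_bits)
{
	uint64_t addr = strtab_base & STRTAB_BASE_ADDR_MASK;
	unsigned int fmt = STRTAB_BASE_CFG_FMT(strtab_base_cfg);
	unsigned int split = STRTAB_BASE_CFG_SPLIT(strtab_base_cfg);
	unsigned int log2size = STRTAB_BASE_CFG_LOG2SIZE(strtab_base_cfg);

	/* A zero base address cannot point at a live table (assumed rule) */
	if (!addr)
		return false;

	/* The table must not claim more StreamID bits than the HW supports */
	if (log2size > sid_bits)
		return false;

	if (fmt == STRTAB_BASE_CFG_FMT_LINEAR)
		return true;

	/* Reject reserved format encodings */
	if (fmt != STRTAB_BASE_CFG_FMT_2LVL)
		return false;

	/* Two-level tables: only the architected SPLIT values are accepted */
	return split == 6 || split == 8 || split == 10;
}
```

If any check fails, the caller would discard the old table entirely rather than try to salvage part of it, which matches the "toss it entirely" policy described above.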
Though the kdump kernel adopts the crashed kernel's stream table, the iommu core will still try to attach each probed device to a default domain, which overwrites the adopted STE and breaks in-flight DMA from that device.

Implement an is_attach_deferred() callback to prevent this. For each device that has STE.V=1 and STE.Cfg!=Abort in the adopted table, defer the default domain attachment, until the device driver explicitly requests it.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
[jamien: Resolve context conflict around arm_smmu_remove_master() due to the different surrounding arm-smmu-v3 code in this tree.]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
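
The deferral rule in the commit message above (STE.V=1 and STE.Cfg!=Abort) can be sketched as a small predicate over the first 64-bit word of an adopted STE. The bit positions match the Linux arm-smmu-v3 driver (V is bit 0, Config is bits [3:1], Abort is 0b000). The function name `kdump_ste_defers_attach()` is hypothetical, and the sketch takes a CPU-endian value, whereas the driver stores STEs as little-endian words.

```c
#include <stdbool.h>
#include <stdint.h>

/* STE dword 0 fields as defined in the Linux arm-smmu-v3 driver */
#define STRTAB_STE_0_V			(1ULL << 0)
#define STRTAB_STE_0_CFG(dw0)		(((dw0) >> 1) & 0x7ULL)	/* bits [3:1] */
#define STRTAB_STE_0_CFG_ABORT		0ULL
#define STRTAB_STE_0_CFG_BYPASS		4ULL
#define STRTAB_STE_0_CFG_S1_TRANS	5ULL

/*
 * Hypothetical predicate: should the default-domain attach be deferred for
 * the device owning this adopted STE? ste_dword0 is already CPU-endian.
 */
bool kdump_ste_defers_attach(bool kdump_adopt, uint64_t ste_dword0)
{
	/* Outside of stream table adoption, attach immediately as usual */
	if (!kdump_adopt)
		return false;

	/* An invalid STE carries no live translation worth preserving */
	if (!(ste_dword0 & STRTAB_STE_0_V))
		return false;

	/* An aborting STE already blocks DMA, so overwriting it is harmless */
	return STRTAB_STE_0_CFG(ste_dword0) != STRTAB_STE_0_CFG_ABORT;
}
```

For example, a valid stage-1 STE (`V | S1_TRANS << 1`, i.e. `0xb`) defers the attach, while a valid abort STE (`0x1`) does not.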
In kdump cases, the crashed kernel's CDs and page tables can be corrupted, which could trigger event spamming. Also, we cannot serve page requests.

Skip the IRQ setup for EVTQ/PRIQ in arm_smmu_setup_irqs(). Skip their IRQ handler registration in unique-IRQ and combined-IRQ cases.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
In kdump cases, the crashed kernel's CDs and page tables can be corrupted, which could trigger event spamming. Also, we cannot serve page requests.

Skip the EVTQ/PRIQ setup entirely rather than enabling then disabling them. Also add some inline comments explaining that.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Suggested-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
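
The two EVTQ/PRIQ commit messages above amount to one policy: in a kdump kernel, only the command queue is brought up. A minimal model of that policy, expressed as the set of CR0 queue-enable bits the reset path would end up requesting, might look like the following. The CR0 bit positions match the Linux driver; `kdump_queue_enables()` and its parameters are hypothetical, and `has_pri` stands in for the driver's PRI feature check.

```c
#include <stdbool.h>
#include <stdint.h>

/* SMMU_CR0 queue enable bits as defined in the Linux arm-smmu-v3 driver */
#define CR0_PRIQEN	(1u << 1)
#define CR0_EVTQEN	(1u << 2)
#define CR0_CMDQEN	(1u << 3)

/*
 * Hypothetical model: which queues get set up and enabled during reset.
 * In a kdump kernel the event and PRI queues are never set up, instead of
 * being enabled first and disabled afterwards.
 */
uint32_t kdump_queue_enables(bool is_kdump, bool has_pri)
{
	uint32_t enables = CR0_CMDQEN;	/* the command queue is always needed */

	if (is_kdump)
		return enables;		/* skip EVTQ/PRIQ setup entirely */

	enables |= CR0_EVTQEN;
	if (has_pri)
		enables |= CR0_PRIQEN;

	return enables;
}
```

The same condition would also gate the IRQ handler registration for the two skipped queues, so that no handler exists for a queue that was never initialized.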
When ARM_SMMU_OPT_KDUMP_ADOPT is detected, do not disable SMMUEN and skip the CR1/CR2/STRTAB_BASE update sequence in arm_smmu_device_reset(). Those register writes are all CONSTRAINED UNPREDICTABLE while CR0_SMMUEN==1, so leaving them intact lets in-flight DMAs continue to be translated by the adopted stream table.

Initialize 'enables' to 0 so it can carry CR0_SMMUEN in kdump case. Then, preserve that when enabling the command queue.

Clear latched gerror bits if necessary.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
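
The role of the 'enables' variable described above can be shown with a small model of the value written to CR0 when the command queue is enabled. In the normal path the SMMU has just been disabled, so the write carries only CMDQEN; in the adoption path the write must also carry SMMUEN, otherwise it would switch translation off mid-flight. The function name `cr0_for_cmdq_enable()` is hypothetical; the bit positions match the Linux driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* SMMU_CR0 bits as defined in the Linux arm-smmu-v3 driver */
#define CR0_SMMUEN	(1u << 0)
#define CR0_CMDQEN	(1u << 3)

/*
 * Hypothetical model of the CR0 value written when enabling the command
 * queue during device reset.
 */
uint32_t cr0_for_cmdq_enable(bool kdump_adopt)
{
	uint32_t enables = 0;

	/* Adoption path: SMMUEN is already set in HW and must stay set */
	if (kdump_adopt)
		enables = CR0_SMMUEN;

	/* OR in CMDQEN instead of assigning, so SMMUEN is preserved */
	enables |= CR0_CMDQEN;

	return enables;
}
```

With adoption the write is `0x9` (SMMUEN | CMDQEN); without it the write is `0x8` (CMDQEN only), as in the existing reset sequence.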
RMR bypass STEs are installed during SMMUv3 probe for StreamIDs listed by IORT RMR nodes. A normal boot switches the driver to a fresh stream table whose initial STEs abort, so those RMR SIDs need bypass entries before it becomes live. This preserves firmware/guest-owned traffic, including vSMMU guest MSI cases built around RMR-described SIDs.

ARM_SMMU_OPT_KDUMP_ADOPT is the opposite case: the driver keeps SMMUEN set and adopts the crashed kernel's stream table, so RMR SIDs already have the only translation state known to be safe for active in-flight DMA. Replacing an adopted STE with bypass can turn translated DMA into physical DMA, then point it at the wrong memory.

arm_smmu_make_bypass_ste() also rewrites the STE in place after clearing it first. While the table is live, a concurrent hardware STE fetch can observe V=0 or mixed old/new state.

Leaving the adopted STE unmodified keeps the kdump kernel using the crashed kernel's translation. That gives the endpoint driver a chance to probe and quiesce the device. If the old STE was already abort or invalid, installing bypass would create new DMA permission; leaving it alone is a safer failure mode. Later domain setup still gets the RMR direct mappings through the reserved-region path.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Assisted-by: Codex:gpt-5.5
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
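
The RMR rule above reduces to "install bypass on a fresh table, touch nothing on an adopted one". A minimal sketch of that decision for a single RMR StreamID, modelling only the first STE word, could look like this. The function name `rmr_ste_dword0_after_probe()` is hypothetical, the value is CPU-endian, and a real bypass STE also programs fields in later words that are not modelled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* STE dword 0 fields as defined in the Linux arm-smmu-v3 driver */
#define STRTAB_STE_0_V			(1ULL << 0)
#define STRTAB_STE_0_CFG_SHIFT		1
#define STRTAB_STE_0_CFG_BYPASS		4ULL

/*
 * Hypothetical model: the first STE word for an RMR StreamID once probe has
 * finished, given the word currently in the table.
 */
uint64_t rmr_ste_dword0_after_probe(bool kdump_adopt, uint64_t current_dword0)
{
	/*
	 * Adopted table: keep whatever the crashed kernel installed, whether
	 * it translates, aborts, or is invalid.
	 */
	if (kdump_adopt)
		return current_dword0;

	/* Fresh table: replace the initial abort STE with a bypass STE */
	return STRTAB_STE_0_V |
	       (STRTAB_STE_0_CFG_BYPASS << STRTAB_STE_0_CFG_SHIFT);
}
```

Note that the adoption branch also leaves an abort or invalid STE as it is, which is the "safer failure mode" the commit message argues for.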
arm_smmu_device_hw_probe() runs before arm_smmu_init_structures(), so it's natural to decide whether the kdump kernel must adopt the crashed kernel's stream table.

Given that memremap is used to adopt the old stream table, set this option only on a coherent SMMU. And make sure SMMU isn't in Service Failure Mode.

Fixes: b63b343 ("iommu/arm-smmu-v3: Abort all transactions if SMMU is enabled in kdump kernel")
Cc: stable@vger.kernel.org # v6.12+
Reviewed-by: Kevin Tian <kevin.tian@intel.com>
Reviewed-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Nicolin Chen <nicolinc@nvidia.com>
(backported from https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t)
[jamien: Resolve context conflict around arm_smmu_device_hw_probe() due to the different probe layout in this tree.]
Signed-off-by: Jamie Nguyen <jamien@nvidia.com>
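
The probe-time decision described above can be summarized as a conjunction of conditions. The sketch below takes the inputs as plain values so it can be read in isolation: `is_kdump` stands for the result of is_kdump_kernel(), `coherent` for the SMMU's coherency feature, and `cr0`/`gerror` for raw register reads. The function name `kdump_should_adopt_strtab()` is hypothetical, and the SMMUEN check is an assumption drawn from the series' premise that the SMMU was left enabled by the crashed kernel; the exact checks in the patch may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register bits as defined in the Linux arm-smmu-v3 driver */
#define CR0_SMMUEN	(1u << 0)
#define GERROR_SFM_ERR	(1u << 8)	/* Service Failure Mode */

/*
 * Hypothetical model of the probe-time decision to set
 * ARM_SMMU_OPT_KDUMP_ADOPT.
 */
bool kdump_should_adopt_strtab(bool is_kdump, bool coherent,
			       uint32_t cr0, uint32_t gerror)
{
	/* Only a crash kernel inherits live DMA from a previous kernel */
	if (!is_kdump)
		return false;

	/* Assumption: nothing to adopt unless translation is still enabled */
	if (!(cr0 & CR0_SMMUEN))
		return false;

	/* memremap of the old table relies on a coherent SMMU */
	if (!coherent)
		return false;

	/* In Service Failure Mode the SMMU state cannot be trusted */
	if (gerror & GERROR_SFM_ERR)
		return false;

	return true;
}
```

When the predicate is false, the driver would fall back to its existing kdump behavior rather than adopting the old table.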
PR Validation Report
Patchscan ✅ No Missing Fixes
All cherry-picked commits checked: no missing upstream fixes found.
PR Lint ❌ Errors found
Checking 7 commits...
Cherry-pick digest:
E: e8cf1157e3fe ("iommu/arm-smmu-v3: Add arm_smmu_kdump_ad"): diff MISMATCH with lore patch (add [Author: reason] annotation if intentional)
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bc8165414c64 │ iommu/arm-smmu-v3: detect arm_smmu_opt_kdump_adopt in probe() │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 6b1ffcdc7fbc │ iommu/arm-smmu-v3: skip rmr bypass for kdump adoption │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 95ff03aefb39 │ iommu/arm-smmu-v3: retain cr0_smmuen during kdump device reset │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ bc26e06262ae │ iommu/arm-smmu-v3: skip evtq/priq setup in kdump kernel │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 253de3b4b42f │ iommu/arm-smmu-v3: do not enable evtq/priq interrupts in kdump k │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ ee8f2681aa9b │ iommu/arm-smmu-v3: implement is_attach_deferred() for kdump │ match │ found │ ok, backporter: jamien │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ e8cf1157e3fe │ iommu/arm-smmu-v3: add arm_smmu_kdump_adopt_strtab() for kdump │ MISMATCH │ found │ ok, backporter: jamien │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘
Lint: all checks passed.
clsotog
left a comment
Acked-by: Carol L Soto <csoto@nvidia.com>
BaseOS Kernel Review
Summary
No functional defects found across the arm-smmu-v3 kdump series; all findings are documentation issues. The most notable are two Medium items in arm-smmu-v3.c: a comment about CR0_SMMUEN=1 updates being CONSTRAINED UNPREDICTABLE that omits the ARM_SMMU_OPT_KDUMP_ADOPT gating, and a commit message that fails to mention adoption is gated on is_kdump_kernel().
Findings: Critical: 0, High: 0, Medium: 2, Low: 10
Latest watcher review: open review
Kernel deb build: successful (download debs, 4 files)
Head:
This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.
What This Fixes
This backports the arm-smmu-v3 kdump adoption series to avoid disrupting
in-flight DMA when the crash kernel boots.
Without this series, the crash kernel can reset/reprogram the SMMU while DMA
from the panicked kernel is still active. On the affected system this can show
up in BMC Redfish CPER logs as PCIe poisoned TLP and completion-timeout events
during kdump capture.
The fix lets the crash kernel detect that it is booting under kdump, adopt the
previous stream table, retain CR0_SMMUEN where needed, and defer attachment
for devices whose live DMA mappings must not be disturbed during crash dump
collection.
Backported Patches
Backported from:
https://lore.kernel.org/linux-iommu/cover.1779265413.git.nicolinc@nvidia.com/#t
This PR includes:
iommu/arm-smmu-v3: Add arm_smmu_kdump_adopt_strtab() for kdump
iommu/arm-smmu-v3: Implement is_attach_deferred() for kdump
iommu/arm-smmu-v3: Do not enable EVTQ/PRIQ interrupts in kdump kernel
iommu/arm-smmu-v3: Skip EVTQ/PRIQ setup in kdump kernel
iommu/arm-smmu-v3: Retain CR0_SMMUEN during kdump device reset
iommu/arm-smmu-v3: Skip RMR bypass for kdump adoption
iommu/arm-smmu-v3: Detect ARM_SMMU_OPT_KDUMP_ADOPT in probe()

The backport required only context-level adjustments for the target tree.
Testing
Repro flow:
Test script used: nvbug5963602_repro.sh
The issue is timing-sensitive and difficult to reproduce reliably without
widening the crash-kernel SMMU window. For validation, I used a test-only delay
patch in the SMMU kdump path. The delay patch is not part of this PR.
Baseline confirmation:
6.17.13-b5963602-delay
Poisoned_tlp_received=true
Completion_timeout_status=true
Backport testing with the same repro approach plus the test-only delay:
for_26.04_linux-nvidia + delay:
Launchpad Bug
https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-7.0/+bug/2156531