[24.04-linux-nvidia-6.17] Backport: Mitigate TLBI errata on various Arm CPUs #455
nvmochs wants to merge 5 commits into
Add cputype definitions for C1-Ultra. These will be used for errata detection in subsequent patches.

These values can be found in the C1-Ultra TRM:

https://developer.arm.com/documentation/108014/0100/

... in section A.5.1 ("MIDR_EL1, Main ID Register").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit 60349e64a6c65f9f0aa118af711b3c7e137f07ff linux-next)
[mochs: Minor context adjustment due to absent definitions]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

Add cputype definitions for C1-Premium. These will be used for errata detection in subsequent patches.

These values can be found in the C1-Premium TRM:

https://developer.arm.com/documentation/109416/0100/

... in section A.5.1 ("MIDR_EL1, Main ID Register").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit d28413bfc5a255957241f1df5d7fd0c2cd74fe18 linux-next)
[mochs: Minor context adjustment due to absent definitions]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

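For readers unfamiliar with what a cputype definition encodes, here is a minimal user-space sketch of how a CPU model value can be composed from the MIDR_EL1 implementer and part-number fields. The field positions follow the architectural MIDR_EL1 layout; the helper names and `EXAMPLE_PART` are assumptions made for illustration and are not the values added by these commits (the real part numbers are in the TRMs referenced above).

```c
#include <assert.h>
#include <stdint.h>

/* MIDR_EL1 field positions (architectural layout). */
#define MIDR_PARTNUM_SHIFT      4   /* bits [15:4]  */
#define MIDR_ARCHITECTURE_SHIFT 16  /* bits [19:16] */
#define MIDR_IMPLEMENTER_SHIFT  24  /* bits [31:24] */

#define IMPLEMENTER_ARM 0x41u       /* ASCII 'A' */

/* HYPOTHETICAL part number, for illustration only. */
#define EXAMPLE_PART 0xABCu

/*
 * Compose a model value: implementer and part number, with the
 * architecture field set to 0xF. Variant and revision are left as zero.
 */
static inline uint32_t midr_cpu_model(uint32_t implementer, uint32_t partnum)
{
	assert(implementer <= 0xFFu);
	assert(partnum <= 0xFFFu);
	return (implementer << MIDR_IMPLEMENTER_SHIFT) |
	       (0xFu << MIDR_ARCHITECTURE_SHIFT) |
	       (partnum << MIDR_PARTNUM_SHIFT);
}

/* Extract the 12-bit part number from a raw MIDR value. */
static inline uint32_t midr_partnum(uint32_t midr)
{
	return (midr >> MIDR_PARTNUM_SHIFT) & 0xFFFu;
}

/* Extract the 8-bit implementer code from a raw MIDR value. */
static inline uint32_t midr_implementer(uint32_t midr)
{
	return (midr >> MIDR_IMPLEMENTER_SHIFT) & 0xFFu;
}
```

Errata detection can then compare a running CPU's MIDR against such model values, which is why the definitions have to land before the patch that uses them.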
A number of CPUs developed by Arm suffer from errata whereby a broadcast TLBI;DSB sequence may complete before the global observation of writes which are translated by an affected TLB entry.

These errata ONLY affect the completion of memory accesses which have been translated by an invalidated TLB entry, and these errata DO NOT affect the actual invalidation of TLB entries. TLB entries are removed correctly.

This issue has been assigned CVE ID CVE-2025-10263.

To mitigate this issue, Arm recommends that software follows any affected TLBI;DSB sequence with an additional TLBI;DSB, which will ensure that all memory write effects affected by the first TLBI have been globally observed. The additional TLBI can use any operation that is broadcast to affected CPUs, and the additional DSB can use any option that is sufficient to complete the additional TLBI.

The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate the issue. Enable this workaround for affected CPUs, and update the silicon errata documentation accordingly.

Note that due to the manner in which Arm develops IP and tracks errata, some CPUs share a common erratum number.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit cfd391e74134db664feb499d43af286380b10ba8 linux-next)
[mochs: Minor context adjustment due to absent definitions]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

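The shape of the mitigation described above can be modelled in user space. This is only a sketch: it records the order of operations instead of executing real TLBI or DSB instructions, and the names `op_log`, `flush_tlb_model` and `count_ops_for_flush` are hypothetical, not kernel APIs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Operations recorded by the model; no real instructions are issued. */
enum barrier_op { OP_TLBI, OP_DSB };

#define OP_LOG_MAX 8

struct op_log {
	enum barrier_op ops[OP_LOG_MAX];
	size_t count;
};

/* Append one operation to the log. */
static inline void log_op(struct op_log *log, enum barrier_op op)
{
	assert(log->count < OP_LOG_MAX);
	log->ops[log->count++] = op;
}

/*
 * Model of a broadcast invalidation. The first TLBI;DSB pair is always
 * issued; when repeat_tlbi is set (standing in for the workaround being
 * active), a second TLBI;DSB pair follows it.
 */
static inline void flush_tlb_model(struct op_log *log, bool repeat_tlbi)
{
	log_op(log, OP_TLBI);
	log_op(log, OP_DSB);
	if (repeat_tlbi) {
		log_op(log, OP_TLBI);
		log_op(log, OP_DSB);
	}
}

/* Convenience wrapper: how many operations one modelled flush records. */
static inline size_t count_ops_for_flush(bool repeat_tlbi)
{
	struct op_log log = { .count = 0 };

	flush_tlb_model(&log, repeat_tlbi);
	return log.count;
}
```

In this model an unaffected CPU records two operations per flush and an affected CPU records four, which reflects the cost of the workaround: one extra broadcast invalidation and barrier per affected sequence.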
NVIDIA Olympus cores are affected by the TLBI completion issue tracked as CVE-2025-10263. The existing ARM64_ERRATUM_4118414 handling already uses ARM64_WORKAROUND_REPEAT_TLBI to issue an additional broadcast TLBI;DSB sequence and ensure affected memory write effects are globally observed.

Add MIDR_NVIDIA_OLYMPUS to the repeat-TLBI match list so the same mitigation is enabled on affected Olympus systems. Also document the NVIDIA Olympus erratum in the arm64 silicon errata table and list it in the Kconfig help text.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit ec7216f92e4ebd485b1c6dc6aa3f6064b71a5768 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

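As an illustration of what adding a CPU to a match list means, the sketch below checks a raw MIDR value against a small table of affected models, ignoring the variant and revision fields. Everything here is an assumption for illustration: the table contents use made-up part numbers (only the implementer codes 0x41 for Arm and 0x4E for NVIDIA are real), the function names are hypothetical, and the model makes no claim about which real revisions are affected.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Keep implementer [31:24], architecture [19:16] and part number [15:4]. */
#define MIDR_MODEL_MASK 0xFF0FFFF0u

/* HYPOTHETICAL model values with made-up part numbers. */
#define EXAMPLE_MODEL_ARM    0x410FABC0u /* implementer 0x41 (Arm)    */
#define EXAMPLE_MODEL_NVIDIA 0x4E0FDEF0u /* implementer 0x4E (NVIDIA) */

/* Stand-in for a repeat-TLBI match list. */
static const uint32_t repeat_tlbi_models[] = {
	EXAMPLE_MODEL_ARM,
	EXAMPLE_MODEL_NVIDIA,
};

/* Return true if midr, with variant and revision masked off, is listed. */
static inline bool midr_in_list(uint32_t midr, const uint32_t *models, size_t n)
{
	assert(models != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if ((midr & MIDR_MODEL_MASK) == models[i])
			return true;
	}
	return false;
}

/* Would this CPU get the repeat-TLBI workaround in the model? */
static inline bool needs_repeat_tlbi(uint32_t midr)
{
	return midr_in_list(midr, repeat_tlbi_models,
			    sizeof(repeat_tlbi_models) / sizeof(repeat_tlbi_models[0]));
}
```

Under this model, enabling the mitigation for another core is a one-line addition to the table, which matches the small size of the Olympus patch.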
Enable ARM64_ERRATUM_4118414 to mitigate CVE-2025-10263 on NVIDIA platforms.

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
Test results... 6.17 Grace: 6.17 Vera:
PR Validation Report
Patchscan ✅ No Missing Fixes
All cherry-picked commits checked; no missing upstream fixes found.
PR Lint ❌ Errors found
Checking 5 commits...
Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local │ Referenced upstream / Patch subject │ Patch-ID │ Subject │ SoB chain │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 467d8487bee8 │ [SAUCE] nvidia: [config] enable arm64_erratum_4118414 │ N/A │ N/A │ mochs │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 67e54a3d6bf7 │ [SAUCE] arm64: errata: mitigate tlbi errata on nvidia olympus cp │ N/A │ N/A │ sdonthin, will, mochs │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ cfa9e97ccea2 │ [SAUCE] arm64: errata: mitigate tlbi errata on various arm cpus │ N/A │ N/A │ rutland, will, mochs │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ e6a989865842 │ [SAUCE] arm64: cputype: add c1-premium definitions │ N/A │ N/A │ rutland, will, mochs │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ cf69067289f3 │ [SAUCE] arm64: cputype: add c1-ultra definitions │ N/A │ N/A │ rutland, will, mochs │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘
Lint results:
E: 467d8487bee8 ("NVIDIA: [Config] Enable ARM64_ERRATUM_4118414"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: 67e54a3d6bf7 ("arm64: errata: Mitigate TLBI errata on NVIDIA Olym"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: cfa9e97ccea2 ("arm64: errata: Mitigate TLBI errata on various Arm"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: e6a989865842 ("arm64: cputype: Add C1-Premium definitions"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: cf69067289f3 ("arm64: cputype: Add C1-Ultra definitions"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
@nirmoy Are these failing because I picked from linux-next?
BaseOS Kernel Review
Summary
No functional bugs found; all findings are documentation/clarity issues. Notably, the TLBI erratum Kconfig help text imprecisely implies broken TLB invalidation rather than a completion/ordering issue, dropped specific erratum numbers and hardware scope reduce diagnostic traceability, and the CVE-2025-10263 reference is absent from the config annotations.
Findings: Critical: 0, High: 0, Medium: 5, Low: 2
Latest watcher review: open review
Kernel deb build: successful (download debs, 4 files)
Head:
This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.
Commit message attributions verified: all four upstream SHAs correct, backported-from/cherry-picked-from distinction matches the actual diffs, SoB ordering correct. Code looks good — MIDR list, Kconfig help, and silicon-errata table entries are consistent.
Acked-by: Jamie Nguyen <jamien@nvidia.com>
Merged, closing PR.
These patches address CVE-2025-10263, an Arm TLBI completion erratum where affected CPUs may complete a broadcast TLBI sequence before all memory accesses translated by the invalidated entry are globally observed. The mitigation enables the existing arm64 repeat-TLBI workaround so affected TLBI sequences are followed by an additional broadcast TLBI/DSB sequence.
For NVIDIA platforms, the series adds the required CPU ID coverage and enables CONFIG_ARM64_ERRATUM_4118414 in the NVIDIA kernel annotations so the mitigation is built for both Grace and Vera platforms. The platform config marks this erratum as required for Grace and Vera enablement.
Verification was performed on both Grace and Vera by booting the patched kernel and confirming:
LKML:
[PATCH 0/3] arm64: errata: Mitigate TLBI errata on various Arm CPUs - https://lore.kernel.org/all/20260609101203.1512409-1-mark.rutland@arm.com/
[PATCH v1] arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU - https://lore.kernel.org/all/20260609234044.3945938-1-sdonthineni@nvidia.com/
linux-next:
60349e64a6c6 arm64: cputype: Add C1-Ultra definitions
d28413bfc5a2 arm64: cputype: Add C1-Premium definitions
cfd391e74134 arm64: errata: Mitigate TLBI errata on various Arm CPUs
ec7216f92e4e arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU
LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-bos/+bug/2156557