[26.04-linux-nvidia-bos] Backport: Mitigate TLBI errata on various Arm CPUs #457

Closed
nvmochs wants to merge 5 commits into NVIDIA:26.04_linux-nvidia-bos from nvmochs:jun2026_tlbi_errata_70bos

nvmochs commented Jun 12, 2026

Collaborator

These patches address CVE-2025-10263, an Arm TLBI completion erratum where affected CPUs may complete a broadcast TLBI sequence before all memory accesses translated by the invalidated entry are globally observed. The mitigation enables the existing arm64 repeat-TLBI workaround so affected TLBI sequences are followed by an additional broadcast TLBI/DSB sequence.

For NVIDIA platforms, the series adds the required CPU ID coverage and enables CONFIG_ARM64_ERRATUM_4118414 in the NVIDIA kernel annotations so the mitigation is built for both Grace and Vera platforms. The platform config marks this erratum as required for Grace and Vera enablement.

Verification was performed on both Grace and Vera by booting the patched kernel and confirming:

  • CONFIG_ARM64_ERRATUM_4118414=y
  • CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y
  • kernel log reports the active workaround: CPU features: detected: Broken broadcast TLBI completion
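
The three checks above could be scripted along the following lines. This is a minimal sketch, not the verify_arm64_erratum_4118414.sh script used for the test results in this PR. The function names check_config, check_log, and verify_erratum are hypothetical, and the kernel log is read from standard input so the caller decides where it comes from.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not the script used in this PR.

# check_config FILE OPTION: succeed if "OPTION=y" appears in the kernel config FILE.
check_config() {
    if grep -q "^$2=y\$" "$1"; then
        echo "PASS: $2=y"
    else
        echo "FAIL: $2 is not set to y in $1"
        return 1
    fi
}

# check_log MESSAGE: read a kernel log on stdin and look for the literal MESSAGE.
check_log() {
    if grep -qF -- "$1"; then
        echo "PASS: runtime erratum print found: $1"
    else
        echo "FAIL: runtime erratum print not found: $1"
        return 1
    fi
}

# verify_erratum CONFIG_FILE: run both config checks and the log check.
# The kernel log is expected on stdin.
verify_erratum() {
    rc=0
    check_config "$1" CONFIG_ARM64_ERRATUM_4118414 || rc=1
    check_config "$1" CONFIG_ARM64_WORKAROUND_REPEAT_TLBI || rc=1
    check_log "CPU features: detected: Broken broadcast TLBI completion" || rc=1
    if [ "$rc" -eq 0 ]; then
        echo "RESULT: PASS"
    else
        echo "RESULT: FAIL"
    fi
    return "$rc"
}
```

Assuming the config file is installed under /boot, a run might look like `sudo dmesg | verify_erratum "/boot/config-$(uname -r)"`.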

LKML:
[PATCH 0/3] arm64: errata: Mitigate TLBI errata on various Arm CPUs - https://lore.kernel.org/all/20260609101203.1512409-1-mark.rutland@arm.com/
[PATCH v1] arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU - https://lore.kernel.org/all/20260609234044.3945938-1-sdonthineni@nvidia.com/

linux-next:
60349e64a6c6 arm64: cputype: Add C1-Ultra definitions
d28413bfc5a2 arm64: cputype: Add C1-Premium definitions
cfd391e74134 arm64: errata: Mitigate TLBI errata on various Arm CPUs
ec7216f92e4e arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU


LP: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-bos/+bug/2156557

mrutland-arm and others added 5 commits June 11, 2026 18:15
arm64: cputype: Add C1-Ultra definitions

Add cputype definitions for C1-Ultra. These will be used for errata
detection in subsequent patches.

These values can be found in the C1-Ultra TRM:

  https://developer.arm.com/documentation/108014/0100/

... in section A.5.1 ("MIDR_EL1, Main ID Register").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit 60349e64a6c65f9f0aa118af711b3c7e137f07ff linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

arm64: cputype: Add C1-Premium definitions

Add cputype definitions for C1-Premium. These will be used for errata
detection in subsequent patches.

These values can be found in the C1-Premium TRM:

  https://developer.arm.com/documentation/109416/0100/

... in section A.5.1 ("MIDR_EL1, Main ID Register").

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit d28413bfc5a255957241f1df5d7fd0c2cd74fe18 linux-next)
[mochs: Minor context adjustment due to absent definitions]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>
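
Both cputype commits above take their values from MIDR_EL1, which packs the implementer, variant, architecture, part number, and revision into one 32-bit value. As a rough illustration of how those fields split apart, here is a small sketch. The function name decode_midr is hypothetical, and the sample value 0x410fd4f0 is illustrative only and is not taken from this PR.

```shell
#!/bin/sh
# Sketch only: decode_midr is a hypothetical helper, not part of the kernel tree.

# decode_midr VALUE: print the MIDR_EL1 fields of VALUE (hex or decimal).
# Field layout: implementer [31:24], variant [23:20], architecture [19:16],
# part number [15:4], revision [3:0].
decode_midr() {
    midr=$(( $1 ))
    printf 'implementer=0x%02x variant=0x%x architecture=0x%x partnum=0x%03x revision=0x%x\n' \
        $(( (midr >> 24) & 0xff )) \
        $(( (midr >> 20) & 0xf )) \
        $(( (midr >> 16) & 0xf )) \
        $(( (midr >> 4) & 0xfff )) \
        $(( midr & 0xf ))
}
```

For example, `decode_midr 0x410fd4f0` prints `implementer=0x41 variant=0x0 architecture=0xf partnum=0xd4f revision=0x0`. On a running arm64 system the value can usually be read from /sys/devices/system/cpu/cpu0/regs/identification/midr_el1 and passed to the function.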

arm64: errata: Mitigate TLBI errata on various Arm CPUs

A number of CPUs developed by Arm suffer from errata whereby a broadcast
TLBI;DSB sequence may complete before the global observation of writes
which are translated by an affected TLB entry.

These errata ONLY affect the completion of memory accesses which have
been translated by an invalidated TLB entry, and these errata DO NOT
affect the actual invalidation of TLB entries. TLB entries are removed
correctly.

This issue has been assigned CVE ID CVE-2025-10263.

To mitigate this issue, Arm recommends that software follows any
affected TLBI;DSB sequence with an additional TLBI;DSB, which will
ensure that all memory write effects affected by the first TLBI have
been globally observed. The additional TLBI can use any operation that
is broadcast to affected CPUs, and the additional DSB can use any option
that is sufficient to complete the additional TLBI.

The ARM64_WORKAROUND_REPEAT_TLBI workaround is sufficient to mitigate
the issue. Enable this workaround for affected CPUs, and update the
silicon errata documentation accordingly.

Note that due to the manner in which Arm develops IP and tracks errata,
some CPUs share a common erratum number.

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
(backported from commit cfd391e74134db664feb499d43af286380b10ba8 linux-next)
[mochs: Minor context adjustment due to absent definitions]
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU

NVIDIA Olympus cores are affected by the TLBI completion issue tracked as
CVE-2025-10263. The existing ARM64_ERRATUM_4118414 handling already uses
ARM64_WORKAROUND_REPEAT_TLBI to issue an additional broadcast TLBI;DSB
sequence and ensure affected memory write effects are globally observed.

Add MIDR_NVIDIA_OLYMPUS to the repeat-TLBI match list so the same
mitigation is enabled on affected Olympus systems. Also document the
NVIDIA Olympus erratum in the arm64 silicon errata table and list it in
the Kconfig help text.

Signed-off-by: Shanker Donthineni <sdonthineni@nvidia.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Will Deacon <will@kernel.org>
(cherry picked from commit ec7216f92e4ebd485b1c6dc6aa3f6064b71a5768 linux-next)
Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

NVIDIA: [Config] Enable ARM64_ERRATUM_4118414

Enable ARM64_ERRATUM_4118414 to mitigate CVE-2025-10263 on NVIDIA platforms.

Signed-off-by: Matthew R. Ochs <mochs@nvidia.com>

nvmochs commented Jun 12, 2026

Collaborator Author

Test results...

7.0-bos Grace:

nvidia@gb200-nvl4-47:~/mochs$ sudo ./verify_arm64_erratum_4118414.sh 
INFO: checking config: /boot/config-7.0.0+
PASS: CONFIG_ARM64_ERRATUM_4118414=y
PASS: CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y
PASS: runtime erratum print found: CPU features: detected: Broken broadcast TLBI completion
RESULT: PASS

7.0-bos Vera:

nvidia@mgx-vera-c2-054:/home/nvidia/mochs$ sudo ./verify_arm64_erratum_4118414.sh 
INFO: checking config: /boot/config-7.0.0+
PASS: CONFIG_ARM64_ERRATUM_4118414=y
PASS: CONFIG_ARM64_WORKAROUND_REPEAT_TLBI=y
PASS: runtime erratum print found: CPU features: detected: Broken broadcast TLBI completion
RESULT: PASS

nirmoy added the help wanted label Jun 12, 2026
@github-actions

Contributor

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ❌ Errors found

Details
Checking 5 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ f0f05940c880 │ [SAUCE] nvidia: [config] enable arm64_erratum_4118414            │ N/A        │ N/A     │ mochs                     │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 2195a4924ca1 │ [SAUCE] arm64: errata: mitigate tlbi errata on nvidia olympus cp │ N/A        │ N/A     │ sdonthin, will, mochs     │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ fb301f92960f │ [SAUCE] arm64: errata: mitigate tlbi errata on various arm cpus  │ N/A        │ N/A     │ rutland, will, mochs      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ 909e16e7cbc0 │ [SAUCE] arm64: cputype: add c1-premium definitions               │ N/A        │ N/A     │ rutland, will, mochs      │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ ffe140363b73 │ [SAUCE] arm64: cputype: add c1-ultra definitions                 │ N/A        │ N/A     │ rutland, will, mochs      │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint results:
E: f0f05940c880 ("NVIDIA: [Config] Enable ARM64_ERRATUM_4118414"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: 2195a4924ca1 ("arm64: errata: Mitigate TLBI errata on NVIDIA Olym"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: fb301f92960f ("arm64: errata: Mitigate TLBI errata on various Arm"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: 909e16e7cbc0 ("arm64: cputype: Add C1-Premium definitions"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
E: ffe140363b73 ("arm64: cputype: Add C1-Ultra definitions"): not SAUCE/UBUNTU/Revert but has no upstream reference trailer (cherry picked from commit ... or backported from ...)
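
The lint errors above refer to an upstream reference trailer of the form "(cherry picked from commit ...)" or "(backported from commit ...)". A minimal sketch of a check for that trailer shape follows. It is not the rule the PR lint applies, which this report does not show; the function name has_upstream_ref and the pattern, including the optional text after the SHA, are assumptions modelled on the trailers in this series.

```shell
#!/bin/sh
# Sketch only: has_upstream_ref is a hypothetical helper, not the PR lint rule.

# has_upstream_ref: read a commit message on stdin and succeed if any line is an
# upstream reference trailer with a full 40-character SHA, optionally followed
# by extra text before the closing parenthesis.
has_upstream_ref() {
    grep -Eq '^\((cherry picked|backported) from commit [0-9a-f]{40}( [^)]*)?\)$'
}
```

One way to apply it to a local commit is `git log -1 --format=%B <sha> | has_upstream_ref && echo "trailer present"`.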

jamieNguyenNVIDIA left a comment

Collaborator

Commit message attributions verified: all four upstream SHAs correct, backported-from/cherry-picked-from distinction matches the actual diffs, SoB ordering correct. Code looks good — MIDR list, Kconfig help, and silicon-errata table entries are consistent.

Acked-by: Jamie Nguyen <jamien@nvidia.com>

nirmoy commented Jun 12, 2026

Collaborator

BaseOS Kernel Review

Summary

No significant issues found across this series. All findings are Low-severity documentation and naming nitpicks (CVE/erratum references not mirrored in Kconfig and silicon-errata docs, generic vs. specific descriptions); the erratum definitions and TLBI workaround code are correct.

Findings: Critical: 0, High: 0, Medium: 0, Low: 9

Latest watcher review: open review

Kernel deb build: successful (download debs, 4 files)

Head: f0f05940c880

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

nirmoy commented Jun 12, 2026

Collaborator

Acked-by: Nirmoy Das <nirmoyd@nvidia.com>

nirmoy added has_2_acks and removed help wanted and has_1_ack labels Jun 12, 2026

clsotog commented Jun 12, 2026

Collaborator

Acked-by: Carol L Soto <csoto@nvidia.com>

nvmochs commented Jun 12, 2026

Collaborator Author

Merged, closing PR.

816f4169f43b (nresolute/nvidia-bos-next) NVIDIA: [Config] Enable ARM64_ERRATUM_4118414
067cac8351ad arm64: errata: Mitigate TLBI errata on NVIDIA Olympus CPU
2aeb7e9836cd arm64: errata: Mitigate TLBI errata on various Arm CPUs
2fbb193ab9c0 arm64: cputype: Add C1-Premium definitions
6b3cf1101cd9 arm64: cputype: Add C1-Ultra definitions

nvmochs closed this Jun 12, 2026
6 participants