Skip to content

[24.04_linux-nvidia-6.17-next] PCI: mirror PI7C9X3G606GPC Port 4 BAR0/BAR1#442

Closed
nirmoy wants to merge 1 commit into
NVIDIA:24.04_linux-nvidia-6.17-nextfrom
nirmoy:codex/pericom-msix-bar-war-6.17
Closed

[24.04_linux-nvidia-6.17-next] PCI: mirror PI7C9X3G606GPC Port 4 BAR0/BAR1#442
nirmoy wants to merge 1 commit into
NVIDIA:24.04_linux-nvidia-6.17-nextfrom
nirmoy:codex/pericom-msix-bar-war-6.17

Conversation

@nirmoy

@nirmoy nirmoy commented May 27, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Add a PCI final/early-resume quirk for the Diodes/Pericom PI7C9X3G606GPC switch.
  • Mirror the immediate upstream port BAR0 value into downstream Port 4 BAR0 after Linux PCI resource assignment and during early resume.
  • If upstream BAR0 is configured as a 64-bit memory BAR, mirror BAR1 at offset 14h as well.
  • For the 64-bit BAR path, temporarily disable memory decoding while writing BAR0/BAR1 when !pdev->mmio_always_on, matching PCI core's non-atomic 64-bit BAR update sequence.
  • Scope the WAR to the Diodes-confirmed OS-visible Tile0/P4 mapping: upstream bus + 1, device 04, function 0.
  • Verify the immediate upstream bridge is the same PI7C9X3G606GPC switch before copying the upstream BAR value.
  • Port 4 BAR0/BAR1 may read back as zero through normal PCI config space even after a successful write, so the quirk rewrites the mirror whenever it runs.

Validation

  • Current PR head after rebase: b6da240e14f0481d874da5d76b01e9c511c24ad3.
  • Rebased onto upstream/24.04_linux-nvidia-6.17-next at 5ecb1af86110543dde26a444ababd5999414ce64.
  • Local patch checks passed at current head:
    • git diff --check upstream/24.04_linux-nvidia-6.17-next...HEAD
    • scripts/checkpatch.pl --strict --no-signoff --git HEAD~1..HEAD reports 0 errors and 0 warnings.
    • make O=/tmp/nv-kernels-pr442-quirks-build defconfig
    • make O=/tmp/nv-kernels-pr442-quirks-build -j$(nproc) drivers/pci/quirks.o
  • Built a final 6.17 test kernel on nvidia@10.85.119.39 and booted DUT 10.137.158.190 with BMC 10.137.159.216 available for recovery.
  • Booted test kernel:
Linux localhost-left 6.17.13-nvidia-64k-pericom-final aarch64
  • Quirk/topology evidence from dmesg:
pci 0002:a1:00.0: BAR 0 [mem 0x10300000-0x1037ffff]
pci 0002:a1:00.0: BAR 0 [mem 0x10c00000-0x10c7ffff]: assigned
pci 0002:a2:04.0: wrote upstream BAR 0 0x10c00000 to Port 4 BAR 0 for PI7C9X3G606GPC BAR 0 mirror workaround
  • Ran a 30-minute NVMe-backed rootfs fio stress test:
pericom_final_randrw: err= 0
READ: bw=736MiB/s, io=1294GiB, run=1800093msec
WRITE: bw=317MiB/s, io=557GiB, run=1800093msec
fio_status=0
  • Rootfs stayed read-write on the NVMe/BTRFS device.
  • Post-fio dmesg scan returned 0 matches for BTRFS error, hard NVMe I/O errors, NVMe power-state failures, NVMe timeout/abort, and buffer I/O errors.
  • Validation logs on the DUT: /home/nvidia/nirmoy/pericom-final-validation-20251125T181702Z.
  • The available DUT uses the default 32-bit upstream BAR0 path, so it does not runtime-exercise the defensive 64-bit BAR0/BAR1 path.
  • Did not use the BMC/I2C BAR readback helper for this validation. The Quark platform owner said that helper uses special CPED/CDEP access that is not supported as routine validation on this platform and can put the PCIe switch into a bad state.

References

Launchpad: https://bugs.launchpad.net/ubuntu/+source/linux-nvidia-6.17/+bug/2154457

NVBug: https://nvbugspro.nvidia.com/bug/6205517
NVBug: https://nvbugspro.nvidia.com/bug/6134331

Test artifacts: http://baseos-internal-tools.nvidia.com:8003/

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 44e1553 to 31881cf Compare May 27, 2026 16:08
@nirmoy nirmoy added the help wanted Extra attention is needed label May 27, 2026
@nirmoy

nirmoy commented May 27, 2026

Copy link
Copy Markdown
Collaborator Author

BaseOS Kernel Review

Summary

Only a minor issue found: a misleading comment in drivers/pci/quirks.c where "Tile0/P4" reads like a symbolic device name despite the code using dynamic PCI slot/function numbers. No functional or runtime concerns identified.

Findings: Critical: 0, High: 0, Medium: 0, Low: 1

Latest watcher review: open review

Kernel deb build: successful (download debs, 4 files)

Head: b6da240e14f0

This comment is maintained by nv-pr-bot. It is updated when the GitHub watcher publishes a newer review.

@github-actions

github-actions Bot commented May 27, 2026

Copy link
Copy Markdown
Contributor

PR Validation Report

Patchscan ✅ No Missing Fixes

All cherry-picked commits checked — no missing upstream fixes found.

PR Lint ✅ All checks passed

Details
Checking 1 commits...

Cherry-pick digest:
┌──────────────┬──────────────────────────────────────────────────────────────────┬────────────┬─────────┬───────────────────────────┐
│ Local        │ Referenced upstream / Patch subject                              │ Patch-ID   │ Subject │ SoB chain                 │
├──────────────┼──────────────────────────────────────────────────────────────────┼────────────┼─────────┼───────────────────────────┤
│ b6da240e14f0 │ [SAUCE] pci: quirks: mirror pi7c9x3g606gpc port 4 bar0           │ N/A        │ N/A     │ nirmoyd                   │
└──────────────┴──────────────────────────────────────────────────────────────────┴────────────┴─────────┴───────────────────────────┘

Lint: all checks passed.

@nvidia-bfigg

Copy link
Copy Markdown
Collaborator

Does this PR need to be applied to the 6.18 reference kernel as well?

@nvidia-bfigg

Copy link
Copy Markdown
Collaborator

Do you have tests (scripts) which can verify this patch set is applied and working?

@nirmoy

nirmoy commented May 27, 2026

Copy link
Copy Markdown
Collaborator Author

Hi @nvidia-bfigg

Does this PR need to be applied to the 6.18 reference kernel as well?

Yes, we need that for 6.18 too. I will create a PR for that as well

Do you have tests (scripts) which can verify this patch set is applied and working?
The kernel WAR will be always applied at the boot time because config read of the effected port always return 0 so we should see the in the dmesg wrote upstream BAR 0 %#x to Port 4 BAR 0 for PI7C

@nvidia-bfigg

Copy link
Copy Markdown
Collaborator

Hi @nvidia-bfigg

Does this PR need to be applied to the 6.18 reference kernel as well?

Yes, we need that for 6.18 too. I will create a PR for that as well

Do you have tests (scripts) which can verify this patch set is applied and working?
The kernel WAR will be always applied at the boot time because config read of the effected port always return 0 so we should see the in the dmesg wrote upstream BAR 0 %#x to Port 4 BAR 0 for PI7C

So if the patch is not applied to the kernel that message will not be in the dmesg. We should have a test that verifies that message is in the dmesg or the test should fail that the patch has not been applied, correct?

@nirmoy

nirmoy commented May 28, 2026

Copy link
Copy Markdown
Collaborator Author

Hi @nvidia-bfigg

Does this PR need to be applied to the 6.18 reference kernel as well?

Yes, we need that for 6.18 too. I will create a PR for that as well

Do you have tests (scripts) which can verify this patch set is applied and working?
The kernel WAR will be always applied at the boot time because config read of the effected port always return 0 so we should see the in the dmesg wrote upstream BAR 0 %#x to Port 4 BAR 0 for PI7C

So if the patch is not applied to the kernel that message will not be in the dmesg. We should have a test that verifies that message is in the dmesg or the test should fail that the patch has not been applied, correct?

Yes, ACK we should have a test may be a greenlit one to check the dmesg to verify the patch.

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 31881cf to 5edb468 Compare May 28, 2026 13:22
@nirmoy nirmoy marked this pull request as ready for review May 28, 2026 14:17
@nirmoy nirmoy marked this pull request as draft May 28, 2026 15:00
@nirmoy nirmoy removed help wanted Extra attention is needed pending_review_comment labels May 28, 2026
@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 5edb468 to e562a4a Compare May 28, 2026 19:33
@nirmoy nirmoy marked this pull request as ready for review May 28, 2026 21:01
@nirmoy nirmoy added help wanted Extra attention is needed pending_review_comment labels May 28, 2026
@clsotog

clsotog commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Codex gave me this suggestion but not sure in this environment:
drivers/pci/quirks.c:6318 (

DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_PERICOM,
PCI_DEVICE_ID_PERICOM_PI7C9X3G606GPC,
pci_fixup_pericom_pi7c9x3g606gpc_bar0_mirror);
) registers this as a normal RESUME fixup. For a bridge coming back from D3cold, PCI core restores config space, runs
pci_fixup_resume_early, and may resume the subordinate bus in pci_pm_resume_noirq() before normal resume fixups run: see pci-driver.c:963-L970 (
if (!(skip_bus_pm && pm_suspend_no_platform()))
pci_pm_default_resume_early(pci_dev);
pci_fixup_device(pci_fixup_resume_early, pci_dev);
pcie_pme_root_status_cleanup(pci_dev);
if (!skip_bus_pm && prev_state == PCI_D3cold)
pci_pm_bridge_power_up_actions(pci_dev);
) and
pci-driver.c:581-L586 (
/*
* When powering on a bridge from D3cold, the whole hierarchy may be
* powered on into D0uninitialized state, resume them to give them a
* chance to suspend again
*/
pci_resume_bus(pci_dev->subordinate);
). If this BAR mirror is required for devices below Port 4 after the switch loses state, children can resume while Port 4 BAR0 is
still stale/zero. I’d move this to DECLARE_PCI_FIXUP_RESUME_EARLY or add an early resume fixup so the mirror is rewritten immediately after pci_restore_state() and before subordinate devices resume.

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from e562a4a to 377a0dc Compare May 29, 2026 13:23
@nvmochs

nvmochs commented May 29, 2026

Copy link
Copy Markdown
Collaborator

@nirmoy Can you confirm that this is limited to 32-bit bars? Is that because 64-bit bars are not possible in this configuration?

@nirmoy

nirmoy commented May 29, 2026

Copy link
Copy Markdown
Collaborator Author

Codex gave me this suggestion but not sure in this environment: drivers/pci/quirks.c:6318 (

DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_PERICOM,
PCI_DEVICE_ID_PERICOM_PI7C9X3G606GPC,
pci_fixup_pericom_pi7c9x3g606gpc_bar0_mirror);

) registers this as a normal RESUME fixup. For a bridge coming back from D3cold, PCI core restores config space, runs
pci_fixup_resume_early, and may resume the subordinate bus in pci_pm_resume_noirq() before normal resume fixups run: see pci-driver.c:963-L970 (

if (!(skip_bus_pm && pm_suspend_no_platform()))
pci_pm_default_resume_early(pci_dev);
pci_fixup_device(pci_fixup_resume_early, pci_dev);
pcie_pme_root_status_cleanup(pci_dev);
if (!skip_bus_pm && prev_state == PCI_D3cold)
pci_pm_bridge_power_up_actions(pci_dev);

) and
pci-driver.c:581-L586 (

/*
* When powering on a bridge from D3cold, the whole hierarchy may be
* powered on into D0uninitialized state, resume them to give them a
* chance to suspend again
*/
pci_resume_bus(pci_dev->subordinate);

). If this BAR mirror is required for devices below Port 4 after the switch loses state, children can resume while Port 4 BAR0 is
still stale/zero. I’d move this to DECLARE_PCI_FIXUP_RESUME_EARLY or add an early resume fixup so the mirror is rewritten immediately after pci_restore_state() and before subordinate devices resume.

It make sense to have this. I will update it

@nirmoy

nirmoy commented May 29, 2026

Copy link
Copy Markdown
Collaborator Author

@nirmoy Can you confirm that this is limited to 32-bit bars? Is that because 64-bit bars are not possible in this configuration?

The erratum describes the WAR as copying only BAR0 at offset 0x10. That is sufficient for a 32-bit BAR, but would not be complete for a 64-bit BAR because BAR1 contains the upper address bits. So I am defensively skipping the WAR for 64-bit BARs.

The device can be configured with a 64-bit BAR0/BAR1 pair, but on our platform it is configured as a 32-bit BAR. I confirmed that from the BMC readback of the upstream BAR0 value, where the BAR type bits indicate a 32-bit memory BAR.

@nvmochs

nvmochs commented May 29, 2026

Copy link
Copy Markdown
Collaborator

@nirmoy Can you confirm that this is limited to 32-bit bars? Is that because 64-bit bars are not possible in this configuration?

The erratum describes the WAR as copying only BAR0 at offset 0x10. That is sufficient for a 32-bit BAR, but would not be complete for a 64-bit BAR because BAR1 contains the upper address bits. So I am defensively skipping the WAR for 64-bit BARs.

The device can be configured with a 64-bit BAR0/BAR1 pair, but on our platform it is configured as a 32-bit BAR. I confirmed that from the BMC readback of the upstream BAR0 value, where the BAR type bits indicate a 32-bit memory BAR.

Thanks for clarifying. If we end up sending this upstream we may need to relax the checking a bit.

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch 2 times, most recently from c2c4176 to 1670e40 Compare May 29, 2026 19:27
@nvmochs

nvmochs commented May 29, 2026

Copy link
Copy Markdown
Collaborator

I looked at the latest version and agree with change to resume early.

I see that the 64-bit bar check was removed. We can do that, but then I think the code needs to properly handle 32 and 64-bit bars being fixed up - I should have been clearer when I said "relax the checking" in my prior comment.

Here's what Codex has to say:

  • drivers/pci/quirks.c:6291: the updated patch removed the 64-bit BAR guard, but the code still mirrors only PCI_BASE_ADDRESS_0. If upstream BAR0 is ever a 64-bit memory BAR, BAR1 contains the high dword, so this is not a complete mirror. The current unassigned check can also misclassify a valid 64-bit BAR whose low address dword is zero. If Diodes guarantees this switch exposes that BAR only as a 32-bit memory BAR, document that and/or restore the guard; otherwise mirror BAR1 too for the 64-bit case.

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 46a5d4e to 6c7b78e Compare June 11, 2026 15:03
@clsotog

clsotog commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Codex finding:

  • drivers/pci/quirks.c:6322: the new 64-bit BAR mirror path writes BAR0 low dword and then BAR1 high dword while memory decoding may still be enabled. PCI core explicitly disables PCI_COMMAND_MEMORY around 64-bit BAR updates because the update is not atomic; see drivers/pci/setup-res.c:92-101. This quirk should
    do the same when bar0_64 && !pdev->mmio_always_on, then restore the command register after writing both dwords. Otherwise Port 4 can briefly decode a bogus address during final fixup or resume.

Also needs rebase.

@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 6c7b78e to 48bb4b5 Compare June 11, 2026 15:25
@nirmoy nirmoy marked this pull request as draft June 11, 2026 15:27
@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 48bb4b5 to 686413e Compare June 11, 2026 15:27
@nvmochs

nvmochs commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

Codex finding:

  • drivers/pci/quirks.c:6322: the new 64-bit BAR mirror path writes BAR0 low dword and then BAR1 high dword while memory decoding may still be enabled. PCI core explicitly disables PCI_COMMAND_MEMORY around 64-bit BAR updates because the update is not atomic; see drivers/pci/setup-res.c:92-101. This quirk should
    do the same when bar0_64 && !pdev->mmio_always_on, then restore the command register after writing both dwords. Otherwise Port 4 can briefly decode a bogus address during final fixup or resume.

I had this same finding when re-reviewing but dug a bit deeper with Codex. The concern really only applies to the non-boot path...but since the same code is used for both boot and resume, it should be include the update-BAR protection.

RESUME_EARLY runs after config restore, so the saved command register may already have memory decoding enabled. Since this patch now claims to support 64-bit BAR0, it should mirror BAR0/BAR1 the same way PCI core updates 64-bit BARs: temporarily clear PCI_COMMAND_MEMORY, write both dwords, then restore the command register.

@nirmoy

nirmoy commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator Author

@sforshee @clsotog Moved it to draft as I haven't finish testing yet fully. Codex kept updating the PR meanwhile. Sorry for the confusion

@nirmoy

nirmoy commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator Author

@clsotog @nvmochs Ack on the atomicity concern. Will update that

Some Pericom/Diodes PI7C9X3G606GPC switches require downstream
Port 4 BAR0 to mirror BAR0 of the immediate upstream port. Firmware
may apply this during boot, but Linux PCI resource assignment can move
the upstream BAR0 and leave Port 4 without the required mirror.

Diodes confirmed that Tile0/P4 is OS-visible as device 04, function 0
on the bus below the upstream port. Add a final and early resume quirk
for that downstream function. The quirk verifies that the immediate
upstream bridge is the same switch, then writes Port 4 BAR0 from the
upstream BAR0 after resource assignment and during early resume.

If BAR0 is configured as a 64-bit memory BAR, mirror BAR1 as the upper
dword as well. Port 4 BAR0 may read back as zero even after a
successful write, so the write must be validated by platform-specific
means.

Signed-off-by: Nirmoy Das <nirmoyd@nvidia.com>
@nirmoy nirmoy force-pushed the codex/pericom-msix-bar-war-6.17 branch from 686413e to b6da240 Compare June 11, 2026 15:49
@nirmoy nirmoy marked this pull request as ready for review June 12, 2026 09:36
@nirmoy nirmoy requested a review from sforshee June 12, 2026 10:32
@nirmoy nirmoy removed the has_1_ack label Jun 12, 2026
@nirmoy

nirmoy commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author

Reset validation update:
One review finding was that PCI reset/restore paths could miss the BAR mirror and write 0 back to Port 4 BAR0/BAR1. I tested this with a debug kernel using Port 4 sysfs reset. PCI core saved BAR0/BAR1 as 0/0 because config reads return 0, but restore skipped the BAR writes because live == saved. So that finding does not apply to this reset path.

I also tested reset_subordinate. It resets the bus below Port 4, so it did not touch Port 4 BARs or rerun the quirk.

Caveat: config space cannot confirm the switch's internal BAR value after reset. That would need the BMC/I2C readback helper, followed by AUX cycle recovery.

@nirmoy

nirmoy commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator Author

@clsotog @nvmochs @sforshee PR is ready for review again.

@nvmochs

nvmochs commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Thanks Nirmoy - no further issues from me.

Acked-by: Matthew R. Ochs <mochs@nvivida.com>

@clsotog

clsotog commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Acked-by: Carol L Soto <csoto@nvidia.com>

@nirmoy nirmoy added has_2_acks and removed help wanted Extra attention is needed has_1_ack labels Jun 12, 2026
@clsotog

clsotog commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Acked-by: Carol L Soto <csoto@nvidia.com>

@nvmochs

nvmochs commented Jun 12, 2026

Copy link
Copy Markdown
Collaborator

Merged, closing PR.

e9296f447517 NVIDIA: SAUCE: PCI: quirks: mirror PI7C9X3G606GPC Port 4 BAR0

@nvmochs nvmochs closed this Jun 12, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants