Add CNV smoke test CI lane on TNF cluster (OCP 5.0) #80512

Open
kasturinarra wants to merge 8 commits into openshift:main from kasturinarra:add-tnf-cnv-smoke-ci-lane

Conversation

kasturinarra (Contributor) commented Jun 15, 2026

Summary

  • Add a weekly periodic job (e2e-metal-ovn-two-node-fencing-cnv-smoke) that provisions a Two Node OpenShift with Fencing (TNF) cluster, deploys LVMS + CNV 5.0, and runs CNV smoke tests
  • New step registry ref baremetalds-cnv-smoke — baremetal-compatible CNV smoke test using the team's validated pytest command (smoke and not rwx_default_storage, LVMS config, Block volume mode)
  • New workflow baremetalds-two-node-fencing-cnv-smoke — combines TNF provisioning, LVMS deployment (via storage-conf-csi-optional-topolvm), CNV deployment (via interop-tests-deploy-cnv), and CNV smoke tests
  • Implements continuous CI validation for CNV-72458

Test plan

  • make update passes locally
  • make registry-metadata validates new step registry components
  • Rehearsal job runs on the PR
  • First weekly periodic run succeeds on OCP 5.0 nightly

🤖 Generated with Claude Code

Summary by CodeRabbit

This PR updates OpenShift CI infrastructure to add continuous, weekly CNV validation for OCP 5.0 on Two-Node OpenShift with Fencing (TNF) clusters running on the equinix-edge-enablement profile.

Concretely, it adds a new weekly Prow test lane in ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml:

  • e2e-metal-ovn-two-node-fencing-cnv-smoke (scheduled weekly, configured with CNV_VERSION: "5.0")
  • Uses the new baremetalds-two-node-fencing-cnv-smoke workflow and is inserted right after the existing e2e-metal-ovn-two-node-fencing entry.

To implement the lane, the PR introduces two new step-registry components:

  • baremetalds-cnv-smoke step: runs CNV smoke tests using poetry run pytest with the marker filter smoke and not rwx_default_storage, defaulting to LVMS-backed local storage (CNV_STORAGE_CLASS=lvms-vg1) and Block volume mode (CNV_VOLUME_MODE=Block). The step also sets pytest/global config defaults for storage class and volume mode and includes a timing guard: if the run finishes in 600 seconds or less, the step sleeps for 7200 seconds and then exits 1, leaving time to debug.
  • baremetalds-two-node-fencing-cnv-smoke workflow: provisions the TNF cluster (redfish BMC + extra disks), deploys LVMS via storage-conf-csi-optional-topolvm, deploys CNV via interop-tests-deploy-cnv, then runs the baremetalds-cnv-smoke step as the test, followed by the ofcir-post cleanup chain. It carries CNV_VERSION: "5.0" through the workflow.
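As a rough illustration of how the storage defaults described above feed the pytest filter and overrides, the sketch below assembles the relevant arguments in one place. The helper name build_pytest_args is hypothetical and is not part of the PR; the flags themselves are the ones quoted in the review of the commands script.

```shell
# Defaults named in the step description; the environment may override them.
CNV_STORAGE_CLASS="${CNV_STORAGE_CLASS:-lvms-vg1}"
CNV_VOLUME_MODE="${CNV_VOLUME_MODE:-Block}"

# build_pytest_args is a hypothetical helper (an assumption, not PR code).
# It prints one argument per line so the result can be inspected or
# read back into an array.
build_pytest_args() {
  local args=(
    -m 'smoke and not rwx_default_storage'
    --tc "default_storage_class:${CNV_STORAGE_CLASS}"
    --tc "default_volume_mode:${CNV_VOLUME_MODE}"
    --storage-class-matrix="${CNV_STORAGE_CLASS}"
  )
  printf '%s\n' "${args[@]}"
}

build_pytest_args
```

Printing one argument per line keeps the marker expression, which contains spaces, intact as a single argument when the output is read back line by line.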

Governance/metadata updates:

  • Adds/updates OWNERS and workflow metadata so openshift-edge-approvers can approve the new step/workflow.

Context and validation:

  • Local validation confirmed make update and make registry-metadata pass.
  • A rehearsal was requested via /pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke to run the new periodic lane prior to merge.
  • The CNV smoke step uses the CNV CI source image tagged 4.22 for the pytest harness (with CNV_VERSION set to 5.0 for the workflow), because the 5.0 test image was not available at PR creation time.
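The rehearsal job name used above appears to be derived from the config file name and the test name. The sketch below reproduces that pattern; the helper rehearsal_job_name is hypothetical, and the naming rule is inferred only from the names quoted in this PR, not taken from Prow documentation.

```shell
# rehearsal_job_name is a hypothetical helper (an assumption, not PR code).
# It maps "<org>-<repo>-<branch>__<variant>.yaml" plus a test name to
# "periodic-ci-<org>-<repo>-<branch>-<variant>-<test>".
rehearsal_job_name() {
  local config_file="$1" test_name="$2"
  local base="${config_file##*/}"   # drop any directory part
  base="${base%.yaml}"              # drop the extension
  local name="${base/__/-}"         # join the variant with a single dash
  printf 'periodic-ci-%s-%s\n' "${name}" "${test_name}"
}

rehearsal_job_name \
  ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml \
  e2e-metal-ovn-two-node-fencing-cnv-smoke
```

For the file and test name in this PR, the helper prints the same job name that is passed to /pj-rehearse.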

kasturinarra and others added 2 commits June 15, 2026 15:06
Add a weekly periodic job that provisions a Two Node OpenShift with
Fencing (TNF) cluster, deploys LVMS for local storage, deploys CNV 5.0,
and runs CNV smoke tests. This implements continuous validation for the
work described in CNV-72458.

New step registry components:
- baremetalds-cnv-smoke: baremetal-compatible CNV smoke test ref adapted
  from interop-tests-cnv-tests-smoke (no fwknop, configurable storage
  class defaulting to lvms-vg1)
- baremetalds-two-node-fencing-cnv-smoke: workflow combining TNF cluster
  provisioning, LVMS deployment (via storage-conf-csi-optional-topolvm),
  CNV deployment (via interop-tests-deploy-cnv), and CNV smoke tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

- Use LVMS-specific config (global_config_lvms.py) and Block volume mode
- Exclude RWX tests (-m 'smoke and not rwx_default_storage')
- Add --jira, --html report, and --data-collector flags

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot (Contributor) commented Jun 15, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

A new baremetalds-cnv-smoke CI step reference is introduced with a bash commands script that runs pytest with CNV smoke test markers and enforces a timing guard. The step is orchestrated in a new baremetalds-two-node-fencing-cnv-smoke workflow that chains CNV deployment and post-cleanup. A weekly periodic job, e2e-metal-ovn-two-node-fencing-cnv-smoke, is registered in the nightly 5.0 release configuration to run this workflow.

Changes

CNV Smoke Test and Workflow Infrastructure

CNV Smoke Test Step Reference
  • File(s): ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-ref.yaml, ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-commands.sh, ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-ref.metadata.json, ci-operator/step-registry/baremetalds/cnv/smoke/OWNERS, ci-operator/step-registry/baremetalds/cnv/OWNERS
  • Summary: Step reference defines baremetalds-cnv-smoke: ref YAML configures image source (cnv-ci-src 5.0), env defaults (CNV_STORAGE_CLASS=lvms-vg1, CNV_VOLUME_MODE=Block), resource requests (100m/200Mi), and 2h timeout. Commands script runs poetry run pytest with smoke marker filtering, handles exit code with || /bin/true, and implements timing guard: if test completes in ≤600 seconds, sleeps 7200s then exits 1. Metadata and both OWNERS files declare openshift-edge-approvers.

Two-Node Fencing Workflow Orchestration
  • File(s): ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/baremetalds-two-node-fencing-cnv-smoke-workflow.yaml, ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/baremetalds-two-node-fencing-cnv-smoke-workflow.metadata.json, ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/OWNERS
  • Summary: Workflow baremetalds-two-node-fencing-cnv-smoke orchestrates pre-chains (including interop-tests-deploy-cnv for CNV deployment), runs baremetalds-cnv-smoke as the test step, and runs ofcir-post cleanup in post-steps. Configures two-master/zero-worker environment with redfish BMC driver and VM extra disks. Sets CNV_VERSION: "5.0". Metadata and OWNERS declare openshift-edge-approvers.

Nightly Job Registration
  • File(s): ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml
  • Summary: Weekly nightly job e2e-metal-ovn-two-node-fencing-cnv-smoke is registered in the 5.0 release config to run the baremetalds-two-node-fencing-cnv-smoke workflow on the equinix-edge-enablement cluster profile with CNV_VERSION: "5.0".
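The timing guard summarized above can be sketched in isolation as follows. This is a minimal sketch, assuming the 600-second threshold from the summary; the helper name finished_too_quickly is hypothetical, and the real step sleeps for 7200 seconds and exits 1 where this sketch only prints a message.

```shell
# finished_too_quickly is a hypothetical helper (an assumption, not PR code).
# It succeeds (status 0) when the elapsed time is at or below the threshold,
# which defaults to 600 seconds.
finished_too_quickly() {
  local start="$1" finish="$2" threshold="${3:-600}"
  (( finish - start <= threshold ))
}

START_TIME=$(date "+%s")
# ... the test command would run here ...
FINISH_TIME=$(date "+%s")

if finished_too_quickly "${START_TIME}" "${FINISH_TIME}"; then
  echo "The tests finished too quickly; the real step would sleep and then exit 1"
else
  echo "Finished in: $((FINISH_TIME - START_TIME)) sec"
fi
```

Keeping the comparison in a small function makes the boundary explicit: a run of exactly 600 seconds still counts as too quick, matching the -le 600 test in the script.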

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested labels

lgtm, approved, rehearsals-ack

🚥 Pre-merge checks | ✅ 15
✅ Passed checks (15 passed)
  • Description Check (✅ Passed): Check skipped - CodeRabbit’s high-level summary is enabled.
  • Title check (✅ Passed): The title accurately and specifically describes the main change: adding a CNV smoke test CI lane on TNF clusters for OCP 5.0, which directly reflects the core purpose of the PR changes.
  • Docstring Coverage (✅ Passed): No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
  • Linked Issues check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Out of Scope Changes check (✅ Passed): Check skipped because no linked issues were found for this pull request.
  • Stable And Deterministic Test Names (✅ Passed): PR does not add or modify Ginkgo test definitions. It only adds CI configuration and a shell script that invokes pytest with static test filters ('smoke and not rwx_default_storage'), containing no...
  • Test Structure And Quality (✅ Passed): PR adds CI/CD infrastructure and bash script invoking pytest, not Ginkgo test code. Custom check requires reviewing Ginkgo test structure; check is not applicable.
  • Microshift Test Compatibility (✅ Passed): PR adds CI workflow configuration and bash scripts, not new Ginkgo e2e tests. Check applies only to new e2e tests; no Ginkgo tests were added.
  • Single Node Openshift (Sno) Test Compatibility (✅ Passed): This PR adds CI infrastructure (workflows, shell scripts, OWNERS files) but contains no Ginkgo e2e tests. The actual tests invoked are Python pytest tests from the CNV project, not Go-based Ginkgo...
  • Topology-Aware Scheduling Compatibility (✅ Passed): PR adds only CI test infrastructure (periodic job, step registry, workflows, scripts) without any deployment manifests, operator code, or controllers. No scheduling constraints are introduced.
  • Ote Binary Stdout Contract (✅ Passed): PR adds only CI configuration files (YAML), step registry definitions, bash script, and metadata files. No Go code is introduced, so OTE Binary Stdout Contract check (applicable only to Go binaries...
  • Ipv6 And Disconnected Network Test Compatibility (✅ Passed): This PR adds CI configuration and a Python pytest-based test wrapper, not Ginkgo e2e tests. The check for IPv6/disconnected compatibility applies only to Ginkgo e2e tests (Go-based), which are not...
  • No-Weak-Crypto (✅ Passed): No weak cryptography (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or insecure secret comparisons found across all modified files in the PR.
  • Container-Privileges (✅ Passed): No privileged container settings found: no privileged: true, hostPID/Network/IPC, SYS_ADMIN, allowPrivilegeEscalation, or root user configurations detected in any YAML, JSON, or script files added...
  • No-Sensitive-Data-In-Logs (✅ Passed): No sensitive data (passwords, tokens, API keys, PII, session IDs, internal hostnames) is logged in the bash script, environment variables, or test configurations added in this PR.

Remove --jira, --data-collector, --html flags that need credentials
or setup not available in CI. Use standard global_config.py with
--tc overrides for LVMS storage class and volume mode.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci Bot (Contributor) commented Jun 15, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: kasturinarra
Once this PR has been reviewed and has the lgtm label, please assign dtantsur, wking for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai Bot (Contributor) left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-commands.sh`:
- Around line 16-44: The poetry run pytest command has `|| /bin/true` appended
which suppresses all test failures, allowing the script to continue even when
tests fail. Remove the `|| /bin/true` from the poetry run pytest command to
preserve the actual pytest exit status. Capture the exit code of the pytest
command in a variable, then after the time-based check completes, check that
captured exit code and exit with it if pytest failed, ensuring that test
failures are not silently converted to success when the test duration exceeds
600 seconds.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ace0d6d4-8771-43e1-a887-bb27c63e75fa

📥 Commits

Reviewing files that changed from the base of the PR and between fea73e3 and a537946.

⛔ Files ignored due to path filters (1)
  • ci-operator/jobs/openshift/release/openshift-release-main-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (8)
  • ci-operator/config/openshift/release/openshift-release-main__nightly-5.0.yaml
  • ci-operator/step-registry/baremetalds/cnv/smoke/OWNERS
  • ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-commands.sh
  • ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-ref.metadata.json
  • ci-operator/step-registry/baremetalds/cnv/smoke/baremetalds-cnv-smoke-ref.yaml
  • ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/OWNERS
  • ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/baremetalds-two-node-fencing-cnv-smoke-workflow.metadata.json
  • ci-operator/step-registry/baremetalds/two-node/fencing/cnv-smoke/baremetalds-two-node-fencing-cnv-smoke-workflow.yaml

Comment on lines +16 to +44
poetry run pytest tests \
  -s \
  -o log_cli=true \
  -o cache_dir=/tmp/cache-pytest \
  -m 'smoke and not rwx_default_storage' \
  --tc-file=tests/global_config_lvms.py \
  --tc "default_storage_class:${CNV_STORAGE_CLASS}" \
  --tc "default_volume_mode:${CNV_VOLUME_MODE}" \
  --storage-class-matrix="${CNV_STORAGE_CLASS}" \
  --latest-rhel \
  --tb=native \
  --data-collector \
  --jira \
  --junit-xml="${ARTIFACT_DIR}/xunit_results.xml" \
  --pytest-log-file="${ARTIFACT_DIR}/pytest-tests.log" \
  --html="${ARTIFACT_DIR}/report.html" \
  --self-contained-html || /bin/true

FINISH_TIME=$(date "+%s")
DIFF_TIME=$((FINISH_TIME - START_TIME))

if [[ ${DIFF_TIME} -le 600 ]]; then
    echo ""
    echo "The tests finished too quickly (took only: ${DIFF_TIME} sec), pausing here to give time to debug"
    sleep 7200
    exit 1
else
    echo "Finished in: ${DIFF_TIME} sec"
fi

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Preserve pytest failure status instead of forcing success

Line 32 currently swallows all pytest failures (|| /bin/true), so a failing run that lasts more than 600s is reported as success. This can silently pass broken weekly validation.

Suggested fix
 START_TIME=$(date "+%s")
 
-poetry run pytest tests \
+pytest_rc=0
+poetry run pytest tests \
   -s \
   -o log_cli=true \
   -o cache_dir=/tmp/cache-pytest \
   -m 'smoke and not rwx_default_storage' \
   --tc-file=tests/global_config_lvms.py \
   --tc "default_storage_class:${CNV_STORAGE_CLASS}" \
   --tc "default_volume_mode:${CNV_VOLUME_MODE}" \
   --storage-class-matrix="${CNV_STORAGE_CLASS}" \
   --latest-rhel \
   --tb=native \
   --data-collector \
   --jira \
   --junit-xml="${ARTIFACT_DIR}/xunit_results.xml" \
   --pytest-log-file="${ARTIFACT_DIR}/pytest-tests.log" \
-  --html="${ARTIFACT_DIR}/report.html" \
-  --self-contained-html || /bin/true
+  --html="${ARTIFACT_DIR}/report.html" \
+  --self-contained-html || pytest_rc=$?
 
 FINISH_TIME=$(date "+%s")
 DIFF_TIME=$((FINISH_TIME - START_TIME))
@@
 else
     echo "Finished in: ${DIFF_TIME} sec"
 fi
+
+if [[ ${pytest_rc} -ne 0 ]]; then
+    exit "${pytest_rc}"
+fi
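
The difference between the two forms in the suggested fix can be seen in isolation. The sketch below uses a hypothetical stand-in function, failing_tests, in place of the pytest run, and the shell builtin true where the script spells out /bin/true; both are assumptions made to keep the example self-contained.

```shell
# failing_tests is a hypothetical stand-in for a test run that fails
# with exit status 3.
failing_tests() { return 3; }

# Form used in the original script: the failure is replaced by the
# status of true, so the caller only ever sees 0.
failing_tests || true
masked_rc=$?

# Form used in the suggested fix: the failure status is recorded and
# can be acted on after the timing check.
pytest_rc=0
failing_tests || pytest_rc=$?

echo "masked: ${masked_rc}, captured: ${pytest_rc}"
```

Both forms let the script continue past a failing command, which the timing check needs; only the second keeps the information required to fail the step afterwards.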

openshift-ci Bot requested review from dgoodwin and hoxhaeris June 15, 2026 09:43
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kasturinarra (Contributor, Author) commented

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke

openshift-merge-bot (Contributor) commented

@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

kasturinarra and others added 2 commits June 15, 2026 16:05
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kasturinarra (Contributor, Author) commented

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke

openshift-merge-bot (Contributor) commented

@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

- Set LVM_OPERATOR_SUB_SOURCE to lvm-catalogsource (internal Konflux)
- Set LVM_OPERATOR_SUB_CHANNEL to stable-4.22
- Use CNV 4.22 test image (5.0 not available yet)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

kasturinarra (Contributor, Author) commented

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke

openshift-merge-bot (Contributor) commented

@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-merge-bot (Contributor) commented

[REHEARSALNOTIFIER]
@kasturinarra: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

  • Test name: periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke
  • Repo: N/A
  • Type: periodic
  • Reason: Periodic changed

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

kasturinarra (Contributor, Author) commented

/pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke

openshift-merge-bot (Contributor) commented

@kasturinarra: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

openshift-ci Bot (Contributor) commented Jun 15, 2026

@kasturinarra: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

  • Test name: ci/rehearse/periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke
  • Commit: bc0f140
  • Details: link
  • Required: unknown
  • Rerun command: /pj-rehearse periodic-ci-openshift-release-main-nightly-5.0-e2e-metal-ovn-two-node-fencing-cnv-smoke

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
