OCPCLOUD-2664: Update operatorstatus to write correct sub-Conditions #552

Open
mdbooth wants to merge 2 commits into openshift:main from openshift-cloud-team:clusteroperator-conditions

@mdbooth mdbooth commented May 11, 2026

Summary by CodeRabbit

  • Refactor

    • Streamlined operator status condition handling: Progressing is always tracked; Available is only set when explicitly determined.
    • Internal condition representation simplified; degraded condition removed.
  • Bug Fixes

    • Prevents asserting Available=True before first explicit availability reconciliation.
  • Tests

    • Updated and added tests to reflect new Available/Progressing semantics and condition lookup behavior.

@openshift-merge-bot
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests are triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (Indicates that this PR references a valid Jira ticket of any type.) May 11, 2026
@openshift-ci-robot

openshift-ci-robot commented May 11, 2026

@mdbooth: This pull request references OCPCLOUD-2664 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented May 11, 2026

Walkthrough

Replaced reconcile condition representations with a lightweight partialCondition; ReconcileResult now always carries Progressing and optionally Available. WriteClusterOperatorStatus always writes Progressing, writes Available only when explicitly set (otherwise preserves existing), and removed Degraded emission. Tests updated to new semantics.
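
To make the walkthrough concrete, here is a minimal sketch of the result shape it describes. The type and field names partialCondition, ReconcileResult, progressing and available come from the summary; the fields inside partialCondition, the helper names success, inProgress and describe, and the reason strings are assumptions for illustration and are not taken from the PR's code.

```go
package main

import "fmt"

// partialCondition is a hypothetical stand-in for the PR's type of the same
// name: only the parts a controller decides, with no type or timestamps.
type partialCondition struct {
	Status  bool
	Reason  string
	Message string
}

// ReconcileResult always carries progressing. available stays nil unless the
// controller made an explicit availability decision.
type ReconcileResult struct {
	progressing partialCondition
	available   *partialCondition
}

// success (hypothetical helper) reports Progressing=False and sets
// Available=True explicitly. The reason strings are assumed.
func success(msg string) ReconcileResult {
	return ReconcileResult{
		progressing: partialCondition{Status: false, Reason: "AsExpected", Message: msg},
		available:   &partialCondition{Status: true, Reason: "AsExpected", Message: msg},
	}
}

// inProgress (hypothetical helper) reports only Progressing=True and leaves
// available unset, so a writer would keep whatever Available already exists.
func inProgress(msg string) ReconcileResult {
	return ReconcileResult{
		progressing: partialCondition{Status: true, Reason: "Progressing", Message: msg},
	}
}

// describe renders which conditions a result would cause to be written.
func describe(r ReconcileResult) string {
	out := fmt.Sprintf("Progressing=%t", r.progressing.Status)
	if r.available != nil {
		return out + fmt.Sprintf(" Available=%t", r.available.Status)
	}
	return out + " Available=<preserved>"
}

func main() {
	fmt.Println(describe(success("done")))
	fmt.Println(describe(inProgress("rolling out")))
}
```

Making available a pointer is what lets a result distinguish "no opinion" from an explicit False, which is the property the bug fix in the summary relies on.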

Changes

Condition state refactor & status write semantics

| Layer | File(s) | Summary |
| --- | --- | --- |
| Model / Types | pkg/operatorstatus/controller_status.go | Introduce partialCondition, add ConditionAvailableSuffix/ConditionProgressingSuffix, replace ReconcileResult fields: progressing partialCondition, available *partialCondition (remove degraded). |
| Result construction | pkg/operatorstatus/controller_status.go | Refactor controller helpers to use newReconcileResult; Success/NonRetryableError explicitly set available, other paths set only progressing. |
| Status write / merging | pkg/operatorstatus/controller_status.go | Rewrite WriteClusterOperatorStatus to always write Progressing, write Available only when ReconcileResult.available is set (otherwise preserve existing Available if present); remove Degraded emission; add findClusterOperatorCondition and preserve LastTransitionTime when unchanged. |
| Unit tests: operatorstatus | pkg/operatorstatus/controller_status_test.go | Update controller result generator tests to assert partialCondition values and optional available; refactor TestWriteClusterOperatorStatus to seed existing conditions, validate progressing/available behavior, add TestFindClusterOperatorCondition, and switch to testr.New(t) logger. |
| Unit tests: controllers (installer & revision) | pkg/controllers/installer/helpers_test.go, pkg/controllers/installer/installer_controller_test.go, pkg/controllers/revision/revision_controller_test.go | Replace test expectations that referenced Degraded with Available semantics (*Available=False where failures were previously *Degraded=True); update constants and assertions across installer and revision controller tests to match new condition naming/behavior. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 25.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Assertion messages are missing. Tests lack explanatory text in Expect().To/ToNot() calls, making failures hard to diagnose. | Add failure messages as second argument: g.Expect(err).ToNot(HaveOccurred(), "failed to write ClusterOperator status"). Apply to all Expect assertions in modified test files. |

✅ Passed checks (10 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: updating operatorstatus to write correct sub-Conditions, which directly aligns with the refactoring of condition handling from degraded/progressing/available representations to the new partialCondition structure. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All Ginkgo test names in the PR are stable and deterministic. No dynamic content detected: no generated pod/node names, timestamps, UUIDs, IPs, or dynamic string building in test titles. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests added. PR modifies existing tests and adds one standard Go unit test (TestFindClusterOperatorCondition). Check not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | PR modifies unit/integration tests only. SNO check applies to e2e tests in test/e2e. Not applicable. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR only modifies status reporting logic and tests. No scheduling constraints, manifests, or pod configurations are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | PR modifies only library code and test files. No process-level entry points (main/init/TestMain/BeforeSuite) are changed. No stdout writes detected. All changes comply with OTE Binary Stdout Contract. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests added. Only unit test and condition assertion refactoring. No IPv4 assumptions or external connectivity requirements. |


mdbooth added 2 commits May 11, 2026 14:22
Updates conditions written by operatorstatus to be in line with their
definitions in the clusterv1 package.
@mdbooth mdbooth force-pushed the clusteroperator-conditions branch from 15e99cf to a927a34 Compare May 11, 2026 13:23
@coderabbitai coderabbitai Bot left a comment

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
pkg/operatorstatus/controller_status.go (1)

268-287: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't skip the status patch when only obsolete managed conditions need removal.

If an upgraded ClusterOperator still has a legacy <Controller>Degraded condition and the desired Progressing/Available values are otherwise unchanged, mergeConditions returns false and Line 285 exits before patching. That leaves the stale condition behind indefinitely, so existing objects never fully converge to the new condition set.

Please treat “managed condition present in co.Status.Conditions but absent from conditions” as an update, and add a regression test that seeds a legacy degraded condition.

Possible fix
 import (
 	"context"
 	"errors"
 	"fmt"
+	"strings"
 	"time"
@@
 	updated := mergeConditions(conditions, co.Status.Conditions)
+	if !updated {
+		desiredTypes := map[configv1.ClusterStatusConditionType]struct{}{}
+		for _, cond := range conditions {
+			desiredTypes[*cond.Type] = struct{}{}
+		}
+
+		for i := range co.Status.Conditions {
+			condType := co.Status.Conditions[i].Type
+			if strings.HasPrefix(string(condType), string(r.ControllerResultGenerator)) {
+				if _, ok := desiredTypes[condType]; !ok {
+					updated = true
+					break
+				}
+			}
+		}
+	}
 	if !updated {
 		return nil
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operatorstatus/controller_status.go` around lines 268 - 287, The current
logic builds a desired conditions slice and calls mergeConditions, but
mergeConditions returns false when desired conditions equal existing ones even
if existing contains obsolete managed conditions (e.g., legacy
<Controller>Degraded) that should be removed; update mergeConditions (or wrap
it) so it treats a managed condition present in co.Status.Conditions but missing
from the desired conditions slice as a change (i.e., mark updated=true and
remove that condition), using the same identifying keys as
r.condition/ConditionProgressingSuffix/ConditionAvailableSuffix and
findClusterOperatorCondition to detect managed condition types; also add a
regression test that seeds an existing ClusterOperator.Status.Conditions with a
legacy degraded condition and asserts the controller patches it away when
desired Progressing/Available are unchanged.
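
The check requested above can also be looked at in isolation from the Kubernetes types. The following is a simplified sketch, assuming condition types are plain strings and that a controller's managed conditions share a common name prefix; hasObsoleteManaged and the "Installer" prefix are hypothetical names used only for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// hasObsoleteManaged reports whether existing contains a condition type that
// starts with the controller prefix but is absent from desired. Such a
// condition is stale and should force a status patch.
func hasObsoleteManaged(prefix string, desired, existing []string) bool {
	want := make(map[string]struct{}, len(desired))
	for _, t := range desired {
		want[t] = struct{}{}
	}
	for _, t := range existing {
		if !strings.HasPrefix(t, prefix) {
			continue // not managed by this controller
		}
		if _, ok := want[t]; !ok {
			return true
		}
	}
	return false
}

func main() {
	desired := []string{"InstallerProgressing", "InstallerAvailable"}
	existing := []string{"InstallerProgressing", "InstallerAvailable", "InstallerDegraded", "Upgradeable"}
	fmt.Println(hasObsoleteManaged("Installer", desired, existing))
}
```

One caveat with prefix matching, in this sketch and possibly in the suggested diff: if one controller's name is a prefix of another's, the check could flag the other controller's conditions as obsolete, so an exact list of managed suffixes may be the safer key.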

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 598d4b85-2dec-4cbd-ba56-1ba518e4242f

📥 Commits

Reviewing files that changed from the base of the PR and between 15e99cf and a927a34.

⛔ Files ignored due to path filters (2)
  • vendor/github.com/go-logr/logr/testr/testr.go is excluded by !**/vendor/**, !vendor/**
  • vendor/modules.txt is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (5)
  • pkg/controllers/installer/helpers_test.go
  • pkg/controllers/installer/installer_controller_test.go
  • pkg/controllers/revision/revision_controller_test.go
  • pkg/operatorstatus/controller_status.go
  • pkg/operatorstatus/controller_status_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/operatorstatus/controller_status_test.go

Member

@damdo damdo left a comment

Thanks, mostly LGTM, a couple of nits, a couple of Qs.

Comment thread pkg/operatorstatus/controller_status_test.go
Comment thread pkg/operatorstatus/controller_status_test.go
Comment thread pkg/operatorstatus/controller_status.go
Comment on lines +91 to +97
// a ReconcileResult must always have an explicit progressing condition
progressing partialCondition

// the available condition is optional, and will be maintained from the
// current state if not set explicitly
available *partialCondition

Member

So are we planning not to set degraded ever at all?

Contributor Author

Yes, in the aggregator. I'm thinking: any Progressing condition (error or not) which persists for more than duration X, where X is the colour of your bike shed.

Member

@damdo damdo left a comment

/approve
/lgtm

@openshift-ci openshift-ci Bot added the lgtm label (Indicates that a PR is ready to be merged.) May 12, 2026
@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aws-capi-disconnected-techpreview
/test e2e-aws-capi-techpreview
/test e2e-aws-ovn
/test e2e-aws-ovn-serial-1of2
/test e2e-aws-ovn-serial-2of2
/test e2e-aws-ovn-techpreview
/test e2e-aws-ovn-techpreview-upgrade
/test e2e-azure-capi-techpreview
/test e2e-azure-ovn-techpreview
/test e2e-azure-ovn-techpreview-upgrade
/test e2e-gcp-capi-techpreview
/test e2e-gcp-ovn-techpreview
/test e2e-metal3-capi-techpreview
/test e2e-openstack-capi-techpreview
/test e2e-openstack-ovn-techpreview
/test e2e-vsphere-capi-techpreview
/test regression-clusterinfra-aws-ipi-techpreview-capi

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damdo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) May 12, 2026
@mdbooth
Contributor Author

mdbooth commented May 12, 2026

/override e2e-openstack-ovn-techpreview

Permafailing

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@mdbooth: /override requires failed status contexts, check run or a prowjob name to operate on.
The following unknown contexts/checkruns were given:

  • e2e-openstack-ovn-techpreview

Only the following failed contexts/checkruns were expected:

  • CodeRabbit
  • ci/prow/build
  • ci/prow/e2e-aws-capi-disconnected-techpreview
  • ci/prow/e2e-aws-capi-techpreview
  • ci/prow/e2e-aws-ovn
  • ci/prow/e2e-aws-ovn-serial-1of2
  • ci/prow/e2e-aws-ovn-serial-2of2
  • ci/prow/e2e-aws-ovn-techpreview
  • ci/prow/e2e-aws-ovn-techpreview-upgrade
  • ci/prow/e2e-azure-capi-techpreview
  • ci/prow/e2e-azure-ovn-techpreview
  • ci/prow/e2e-azure-ovn-techpreview-upgrade
  • ci/prow/e2e-gcp-capi-techpreview
  • ci/prow/e2e-gcp-ovn-techpreview
  • ci/prow/e2e-metal3-capi-techpreview
  • ci/prow/e2e-openstack-capi-techpreview
  • ci/prow/e2e-openstack-ovn-techpreview
  • ci/prow/e2e-vsphere-capi-techpreview
  • ci/prow/images
  • ci/prow/lint
  • ci/prow/okd-scos-images
  • ci/prow/regression-clusterinfra-aws-ipi-techpreview-capi
  • ci/prow/unit
  • ci/prow/vendor
  • ci/prow/verify-deps
  • pull-ci-openshift-cluster-capi-operator-main-build
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-capi-disconnected-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-ovn
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-ovn-serial-1of2
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-ovn-serial-2of2
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-ovn-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-aws-ovn-techpreview-upgrade
  • pull-ci-openshift-cluster-capi-operator-main-e2e-azure-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-azure-ovn-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-azure-ovn-techpreview-upgrade
  • pull-ci-openshift-cluster-capi-operator-main-e2e-gcp-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-gcp-ovn-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-metal3-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-openstack-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-openstack-ovn-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-e2e-vsphere-capi-techpreview
  • pull-ci-openshift-cluster-capi-operator-main-images
  • pull-ci-openshift-cluster-capi-operator-main-lint
  • pull-ci-openshift-cluster-capi-operator-main-okd-scos-images
  • pull-ci-openshift-cluster-capi-operator-main-regression-clusterinfra-aws-ipi-techpreview-capi
  • pull-ci-openshift-cluster-capi-operator-main-unit
  • pull-ci-openshift-cluster-capi-operator-main-vendor
  • pull-ci-openshift-cluster-capi-operator-main-verify-deps
  • tide

If you are trying to override a checkrun that has a space in it, you must put a double quote on the context.

Details

In response to this:

/override e2e-openstack-ovn-techpreview

Permafailing

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@mdbooth
Contributor Author

mdbooth commented May 12, 2026

/override pull-ci-openshift-cluster-capi-operator-main-e2e-openstack-ovn-techpreview

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@mdbooth: Overrode contexts on behalf of mdbooth: ci/prow/e2e-openstack-ovn-techpreview


@damdo
Member

damdo commented May 12, 2026

/override ci/prow/e2e-gcp-ovn-techpreview

Unrelated issue.

@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@damdo: Overrode contexts on behalf of damdo: ci/prow/e2e-gcp-ovn-techpreview


@openshift-ci
Contributor

openshift-ci Bot commented May 12, 2026

@mdbooth: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/regression-clusterinfra-aws-ipi-techpreview-capi | a927a34 | link | false | /test regression-clusterinfra-aws-ipi-techpreview-capi |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • lgtm: Indicates that a PR is ready to be merged.

3 participants