feat(controller): add OCP version auto-detection for RAG configuration #73

Merged
openshift-merge-bot[bot] merged 1 commit into openstack-lightspeed:main from omkarjoshi0304:OSPRH-23918
Feb 9, 2026

@omkarjoshi0304 omkarjoshi0304 commented Jan 22, 2026

Description

This PR implements the automatic detection of the OpenShift (OCP) cluster version to dynamically select the correct RAG database directory and index name.

Key Changes

  • Auto-Detection: The operator now queries the config.openshift.io/v1/ClusterVersion API.
    • Matches detected version to supported DB paths (e.g., 4.16 -> /ocp/4.16).
    • Fallback Logic: If the detected version is newer or unsupported (e.g., 4.20), it defaults to latest and sets a Warning condition in the Status.
  • Status Fields Added:
    • detectedOCPVersion: The actual version found on the cluster.
    • activeOCPVersion: The version string used for paths (e.g., latest or 4.18).
    • ocpVersionFallback: Boolean flag indicating if fallback logic was used.
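The detection and fallback rules listed above can be sketched roughly as follows. This is only an illustration, not the operator's actual code: `SupportedOCPVersions`, `IsSupportedOCPVersion`, and `OCPVersionLatest` are names that appear in the review snippets, while `resolveDetected` and `ragDBPath` are hypothetical helpers. The supported list is taken from the operator log, and the `/ocp/<version>` layout from the `4.16 -> /ocp/4.16` example.

```go
package main

import "fmt"

// OCPVersionLatest is the fallback directory name used for unsupported versions.
const OCPVersionLatest = "latest"

// SupportedOCPVersions mirrors the "supportedVersions" list in the operator log.
var SupportedOCPVersions = []string{"4.16", "4.18", OCPVersionLatest}

// IsSupportedOCPVersion reports whether version has its own entry in the RAG DB.
func IsSupportedOCPVersion(version string) bool {
	for _, supported := range SupportedOCPVersions {
		if supported == version {
			return true
		}
	}
	return false
}

// resolveDetected is a hypothetical helper: it maps a detected major.minor
// version to the version used for RAG paths and reports whether the
// 'latest' fallback was taken.
func resolveDetected(detected string) (active string, fallback bool) {
	if IsSupportedOCPVersion(detected) {
		return detected, false
	}
	return OCPVersionLatest, true
}

// ragDBPath is hypothetical; it assumes the "/ocp/<version>" directory layout.
func ragDBPath(active string) string {
	return "/ocp/" + active
}

func main() {
	for _, v := range []string{"4.16", "4.20"} {
		active, fallback := resolveDetected(v)
		fmt.Printf("detected=%s active=%s fallback=%t path=%s\n",
			v, active, fallback, ragDBPath(active))
	}
}
```

With these assumptions, a 4.20 cluster resolves to `latest` with the fallback flag set, which matches the CR status shown further down.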

Testing / Verification

I have verified this functionality on a local CRC environment running OCP 4.20 (which triggers the fallback logic as expected).

### 1. Verification of Logic (Unit Tests)

Ran go test ./internal/controller/... covering detection, path generation, and fallback logic.

=== RUN   TestResolveOCPVersion/Unsupported_version_-_fallback
--- PASS: TestResolveOCPVersion/Unsupported_version_-_fallback (0.00s)
PASS

### 2. Verification from live cluster (operator log)

2026-01-22T14:13:55Z	INFO	Controllers.OpenStackLightspeed	OpenStackLightspeed Reconciling	{"controller": "openstacklightspeed", "controllerGroup": "lightspeed.openstack.org", "controllerKind": "OpenStackLightspeed", "OpenStackLightspeed": {"name":"test-lightspeed-2","namespace":"openshift-lightspeed"}, "namespace": "openshift-lightspeed", "name": "test-lightspeed-2", "reconcileID": "c693ac3f-c64b-4537-b02d-ebdc66bcd9fc"}
2026-01-22T14:13:55Z	INFO	Controllers.OpenStackLightspeed	Detected OCP cluster version	{"controller": "openstacklightspeed", "controllerGroup": "lightspeed.openstack.org", "controllerKind": "OpenStackLightspeed", "OpenStackLightspeed": {"name":"test-lightspeed-2","namespace":"openshift-lightspeed"}, "namespace": "openshift-lightspeed", "name": "test-lightspeed-2", "reconcileID": "c693ac3f-c64b-4537-b02d-ebdc66bcd9fc", "version": "4.20"}
2026-01-22T14:13:55Z	INFO	Controllers.OpenStackLightspeed	Using 'latest' OCP documentation as fallback	{"controller": "openstacklightspeed", "controllerGroup": "lightspeed.openstack.org", "controllerKind": "OpenStackLightspeed", "OpenStackLightspeed": {"name":"test-lightspeed-2","namespace":"openshift-lightspeed"}, "namespace": "openshift-lightspeed", "name": "test-lightspeed-2", "reconcileID": "c693ac3f-c64b-4537-b02d-ebdc66bcd9fc", "detectedVersion": "4.20", "supportedVersions": ["4.16", "4.18", "latest"]}

### 3. Verification from live cluster (CR Status Output)

status:
  activeOCPVersion: latest
  conditions:
  - lastTransitionTime: "2026-01-22T10:46:39Z"
    message: Waiting for the OpenShift Lightspeed operator to deploy.
    reason: Requested
    severity: Info
    status: "False"
    type: Ready
  - lastTransitionTime: "2026-01-22T13:45:28Z"
    message: 'Cluster version 4.20 is not explicitly supported. Using ''latest'' OCP
      documentation. Supported versions: [4.16 4.18 latest]'
    reason: Ready
    status: "True"
    type: OCPVersionResolved
  - lastTransitionTime: "2026-01-22T10:46:39Z"
    message: Waiting for the OpenShift Lightspeed operator to deploy.
    reason: Requested
    severity: Info
    status: "False"
    type: OpenShiftLightspeedOperatorReady
  - lastTransitionTime: "2026-01-22T10:46:39Z"
    message: OpenStack Lightspeed not started
    reason: Init
    status: Unknown
    type: OpenStackLightspeedReady
  detectedOCPVersion: "4.20"
  observedGeneration: 1
  ocpVersionFallback: true

@openshift-ci openshift-ci bot requested review from lpiwowar and umago January 22, 2026 15:11

@lpiwowar lpiwowar left a comment

Just very quickly went through the code:). I'll revisit. Good start 💪 but definitely needs some more work. But once polished it is going to be a good addition to the repo:).

@Akrog Akrog left a comment

Great stuff!!
Thank you for working on this much needed feature.
I have added some comments for your consideration.

// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="Status",type="string",JSONPath=".status.conditions[0].status",description="Status"
// +kubebuilder:printcolumn:name="Message",type="string",JSONPath=".status.conditions[0].message",description="Message"
// +kubebuilder:printcolumn:name="OCP Version",type="string",JSONPath=".status.activeOCPVersion"

Should this be named OCP RAG instead?

@omkarjoshi0304 (author) replied:

> Should this be named OCP RAG instead?

renamed

LABEL operators.operatorframework.io.bundle.package.v1=openstack-lightspeed-operator
LABEL operators.operatorframework.io.bundle.channels.v1=alpha
LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.38.0
LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.42.0

Is this a necessary change to include within this PR?

@omkarjoshi0304 (author) replied:

> Is this a necessary change to include within this PR?

reverted


// ActiveOCPVersion contains the OCP version being used for RAG configuration
// Will be one of: "4.16", "4.18", "latest", or empty if OCP RAG is disabled
ActiveOCPVersion string `json:"activeOCPVersion,omitempty"`

What value will it have if the user sets the override to 4.20 and uses a custom RAG image that includes that directory?
Because according to the comment it cannot be that value.
I think the name is not great, because ActiveOCPVersion may not fully indicate this is referring to the RAG being used.

@omkarjoshi0304 (author) replied:

Renamed to ActiveOCPRAGVersion to clearly indicate this is the RAG version. Also updated the comment to: "Contains the OCP version being used for RAG configuration. This is the override value if specified, otherwise the detected cluster version (or 'latest' for unsupported versions). Empty if OCP RAG is disabled."

// +kubebuilder:validation:Optional
// OCPVersionOverride allows forcing a specific OCP version instead of auto-detection.
// Format should be like "4.15", "4.16", etc.
OCPVersionOverride string `json:"ocpVersionOverride,omitempty"`

Should we add the word RAG in there to make it clear what it is about?
Something like OCPRAGVersionOverride?

@omkarjoshi0304 (author) replied:

renamed

- clusterversions
verbs:
- get
- list

We should also watch the object so the deployment can be changed if OCP is updated from 4.16 to 4.18.
If it's too much work maybe we can do it in a follow up PR.

@omkarjoshi0304 (author) replied:

Added a watch on the object. The changes are visible in the RBAC, and I have modified the controller to set up the watch.

RAGImage string `json:"ragImage"`

// +kubebuilder:validation:Optional
// +kubebuilder:default=true

We need to default it to false, since our RAG Image doesn't include it yet due to the CPU restrictions of the job.


// Step 1: Detect cluster version
detectedVersion, err := DetectOCPVersion(ctx, helper)
instance.Status.DetectedOCPVersion = detectedVersion

Why do we assign the version before we do the error check on L283?

cond := condition.FalseCondition(
apiv1beta1.OCPVersionCondition,
condition.ErrorReason,
condition.SeverityWarning,

I think the severity should be error, because the agent cannot work as expected without the OCP RAG, right?

And we don't configure OLS without it.

// Use override if provided
if overrideVersion != "" {
// Check if override is a supported version
if IsSupportedOCPVersion(overrideVersion) {

In my opinion we shouldn't check the override values, as they could be using a custom image that actually has that value.
It's only on the default image that we'll be limited to those 3 versions.
If they give us a version that is not available on the RAG image the OLS service should fail, and that's what we should report, right?
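A rough sketch of what this suggestion could look like: the override is returned untouched, and only an auto-detected version is checked against the list shipped in the default image. `resolveOCPRAGVersion` and its signature are assumptions made for illustration (the three return values follow the `return overrideVersion, false, nil` line visible in a later snippet); the supported list comes from the operator log.

```go
package main

import (
	"errors"
	"fmt"
)

const ocpVersionLatest = "latest"

// supported lists the versions available in the default RAG image, per the PR log.
var supported = map[string]bool{"4.16": true, "4.18": true, ocpVersionLatest: true}

// resolveOCPRAGVersion is a hypothetical variant of the PR's resolver.
// It returns the version to use, whether the 'latest' fallback was taken,
// and an error when there is nothing to resolve from.
func resolveOCPRAGVersion(override, detected string) (version string, fallback bool, err error) {
	// Trust the override: a custom RAG image may ship directories that the
	// default image does not, so no validation happens here.
	if override != "" {
		return override, false, nil
	}
	if detected == "" {
		return "", false, errors.New("no OCP version detected and no override set")
	}
	if supported[detected] {
		return detected, false, nil
	}
	// Unsupported detected version: fall back to the 'latest' docs.
	return ocpVersionLatest, true, nil
}

func main() {
	// An override outside the default list is passed through unchanged.
	v, fb, err := resolveOCPRAGVersion("4.20", "4.20")
	fmt.Println(v, fb, err)

	// Without an override, the same detected version falls back.
	v, fb, err = resolveOCPRAGVersion("", "4.20")
	fmt.Println(v, fb, err)
}
```

With this shape, a wrong override surfaces as a failure of the OLS service itself rather than being silently replaced by the operator.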

ActiveOCPVersion string `json:"activeOCPVersion,omitempty"`

// OCPVersionFallback indicates if using 'latest' as fallback for unsupported version
OCPVersionFallback bool `json:"ocpVersionFallback,omitempty"`

Is this really something helpful? I mean, if we see the latest value on the version being used, then we know it's doing the fallback.

openshift-ci bot commented Jan 22, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: omkarjoshi0304
Once this PR has been reviewed and has the lgtm label, please ask for approval from akrog. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@softwarefactory-project-zuul

Merge Failed.

This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset.
Warning:
Error merging github.com/openstack-lightspeed/operator for 73,da35be9ff08d5d0e3b51c5cc9ece1020a9fbb623

@omkarjoshi0304 omkarjoshi0304 force-pushed the OSPRH-23918 branch 6 times, most recently from 9e8f1f8 to 0ec4544 Compare February 5, 2026 17:08

@Akrog Akrog left a comment

The code looks good in terms of functionality, though there are a couple of things I would prefer to see changed before merging.


// ActiveOCPVersion contains the OCP version being used for RAG configuration
// Will be one of: "4.16", "4.18", "latest", or empty if OCP RAG is disabled
ActiveOCPRAGVersion string `json:"activeOCPVersion,omitempty"`

I think this is an easy change and we should be consistent.

type: string
enableOCPRAG:
default: false
description: EnableOCPRAG enables automatic OCP documentation based

Don't write the name of the field in the description, please.

type: string
ocpVersionOverride:
description: |-
OCPVersionOverride allows forcing a specific OCP version instead of auto-detection.

ditto

properties:
activeOCPVersion:
description: |-
ActiveOCPVersion contains the OCP version being used for RAG configuration

ditto

// "latest" -> "ocp-product-docs-latest"
func GetOCPIndexName(version string) string {
// For 'latest', keep it as-is
if version == OCPVersionLatest {

This if section seems unnecessary to me.
You can just do the code in L115-L116 and you'll get the same results, it's just that the replacing of the "." with "_" doesn't do anything.
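A minimal sketch of the simplified function, assuming the index name is a fixed prefix plus the version with dots replaced by underscores. The prefix is inferred from the `"latest" -> "ocp-product-docs-latest"` comment in the snippet, and the underscore form (for example `4_16`) is an assumption based on the replacement mentioned above.

```go
package main

import (
	"fmt"
	"strings"
)

// ocpIndexPrefix is inferred from the "latest" -> "ocp-product-docs-latest" comment.
const ocpIndexPrefix = "ocp-product-docs-"

// GetOCPIndexName builds the index name for a version. No special case is
// needed for "latest": it contains no dots, so ReplaceAll leaves it unchanged.
func GetOCPIndexName(version string) string {
	return ocpIndexPrefix + strings.ReplaceAll(version, ".", "_")
}

func main() {
	for _, v := range []string{"4.16", "4.18", "latest"} {
		fmt.Println(v, "->", GetOCPIndexName(v))
	}
}
```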


// IsSupportedOCPVersion checks if the version is explicitly supported in RAG DB
func IsSupportedOCPVersion(version string) bool {
for _, supported := range SupportedOCPVersions {

return overrideVersion, false, nil
}

// Use detected version

This comment is incorrect

@omkarjoshi0304 omkarjoshi0304 force-pushed the OSPRH-23918 branch 7 times, most recently from 432d848 to 4081e9d Compare February 6, 2026 11:40

@lpiwowar lpiwowar left a comment

There is at least one thing that I believe should be fixed in the Kuttl tests 🙈 .

reason: Ready
message: OpenStack Lightspeed created
---
apiVersion: lightspeed.openstack.org/v1beta1

question (blocking): This does not feel right. Why is there a second OpenStackLightspeed instance in this file? There should be only one. The changes should be made to the OpenStackLightspeed instance above.

userDataCollection:
feedbackDisabled: true
transcriptsDisabled: true
---

suggestion (non-blocking): If you want to assert the OpenStackLightspeed update, then let's move it to a separate file 06-assert-openstacklightspeed-update.yaml.

Comment on lines 8 to 9
catalogSourceName: redhat-operators
catalogSourceNamespace: openshift-marketplace

issue (non-blocking): These changes do no harm. Just something to keep in mind: the PR should focus on one thing. If you want to make changes in adjacent code, ideally they should go in a separate commit.

@omkarjoshi0304 omkarjoshi0304 force-pushed the OSPRH-23918 branch 6 times, most recently from 13eaed6 to 3b1af57 Compare February 6, 2026 15:50
Add automatic OpenShift Container Platform (OCP) version detection to enable
version-specific documentation in the RAG (Retrieval Augmented Generation)
system. The operator now detects the cluster's OCP version and configures
the appropriate documentation sources.

Features:
- Auto-detect OCP cluster version from ClusterVersion resource
- Support for manual version override via ocpVersionOverride field
- Fallback to 'latest' docs for unsupported OCP versions
- Enable/disable OCP RAG via enableOCPRAG field
- Status field activeOCPRAGVersion shows currently active version
- Watch ClusterVersion changes to trigger reconciliation on upgrades

Supported OCP versions: 4.16, 4.18

Tests:
- Updated KUTTL tests to validate OCP version override functionality
- Split assertion files for better test organization
- Fixed test field names and condition expectations

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>

lpiwowar commented Feb 6, 2026

LGTM!:) 👍 Once the CI passes, I'm ok with merging.

lpiwowar commented Feb 9, 2026

/lgtm

Thanks!

lpiwowar commented Feb 9, 2026

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Feb 9, 2026

@Akrog Akrog left a comment

/lgtm

@Akrog Akrog added the approved label Feb 9, 2026
@openshift-merge-bot openshift-merge-bot bot merged commit d1328e1 into openstack-lightspeed:main Feb 9, 2026
7 checks passed