feat(controller): add OCP version auto-detection for RAG configuration#73
Conversation
lpiwowar
left a comment
There was a problem hiding this comment.
Just very quickly went through the code:). I'll revisit. Good start 💪 but definitely needs some more work. But once polished it is going to be a good addition to the repo:).
|
Akrog
left a comment
There was a problem hiding this comment.
Great stuff!!
Thank you for working on this much needed feature.
I have added some comments for your consideration.
| // +kubebuilder:subresource:status | ||
| // +kubebuilder:printcolumn:name="Status",type="string",JSONPath=".status.conditions[0].status",description="Status" | ||
| // +kubebuilder:printcolumn:name="Message",type="string",JSONPath=".status.conditions[0].message",description="Message" | ||
| // +kubebuilder:printcolumn:name="OCP Version",type="string",JSONPath=".status.activeOCPVersion" |
There was a problem hiding this comment.
Should this be named OCP RAG instead?
There was a problem hiding this comment.
Should this be named
OCP RAGinstead?
renamed
bundle.Dockerfile
Outdated
| LABEL operators.operatorframework.io.bundle.package.v1=openstack-lightspeed-operator | ||
| LABEL operators.operatorframework.io.bundle.channels.v1=alpha | ||
| LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.38.0 | ||
| LABEL operators.operatorframework.io.metrics.builder=operator-sdk-v1.42.0 |
There was a problem hiding this comment.
Is this a necessary change to include within this PR?
There was a problem hiding this comment.
Is this a necessary change to include within this PR?
reverted
|
|
||
| // ActiveOCPVersion contains the OCP version being used for RAG configuration | ||
| // Will be one of: "4.16", "4.18", "latest", or empty if OCP RAG is disabled | ||
| ActiveOCPVersion string `json:"activeOCPVersion,omitempty"` |
There was a problem hiding this comment.
What value will it have if the user sets the override to 4.20 and uses a custom RAG image that includes that directory?
Because according to the comment it cannot be that value.
I think the name is not great, because ActiveOCPVersion may not fully indicate this is referring to the RAG being used.
There was a problem hiding this comment.
Renamed to ActiveOCPRAGVersion to clearly indicate this is the RAG version. Also updated the comment to: 'Contains the OCP version being used for RAG configuration. This is the override value if specified, otherwise the detected cluster version (or 'latest' for unsupported versions). Empty if OCP RAG is disabled
| // +kubebuilder:validation:Optional | ||
| // OCPVersionOverride allows forcing a specific OCP version instead of auto-detection. | ||
| // Format should be like "4.15", "4.16", etc. | ||
| OCPVersionOverride string `json:"ocpVersionOverride,omitempty"` |
There was a problem hiding this comment.
Should we add the word RAG in there to make it clear what it is about?
Something like OCPRAGVersionOverride?
| - clusterversions | ||
| verbs: | ||
| - get | ||
| - list |
There was a problem hiding this comment.
We should also watch the object so the deployment can be changed if OCP is updated from 4.16 to 4.18.
If it's too much work maybe we can do it in a follow up PR.
There was a problem hiding this comment.
added watched object , we can see the changes in the RBAC and i have modified controller to set up watch
| RAGImage string `json:"ragImage"` | ||
|
|
||
| // +kubebuilder:validation:Optional | ||
| // +kubebuilder:default=true |
There was a problem hiding this comment.
We need to default it to false, since our RAG Image doesn't include it yet due to the CPU restrictions of the job.
|
|
||
| // Step 1: Detect cluster version | ||
| detectedVersion, err := DetectOCPVersion(ctx, helper) | ||
| instance.Status.DetectedOCPVersion = detectedVersion |
There was a problem hiding this comment.
Why do we assign the version before we do the error check on L283?
| cond := condition.FalseCondition( | ||
| apiv1beta1.OCPVersionCondition, | ||
| condition.ErrorReason, | ||
| condition.SeverityWarning, |
There was a problem hiding this comment.
I think the severity should be error, because the agent cannot work as expected without the OCP RAG, right?
And we don't configure OLS without it.
internal/controller/ocp_version.go
Outdated
| // Use override if provided | ||
| if overrideVersion != "" { | ||
| // Check if override is a supported version | ||
| if IsSupportedOCPVersion(overrideVersion) { |
There was a problem hiding this comment.
In my opinion we shouldn't check the override values, as they could be using a custom image that actually has that value.
It's only on the default image that we'll be limited to those 3 versions.
If they give us a version that is not available on the RAG image the OLS service should fail, and that's what we should report, right?
| ActiveOCPVersion string `json:"activeOCPVersion,omitempty"` | ||
|
|
||
| // OCPVersionFallback indicates if using 'latest' as fallback for unsupported version | ||
| OCPVersionFallback bool `json:"ocpVersionFallback,omitempty"` |
There was a problem hiding this comment.
Is this really something helpful? I mean, if we see the latest value on the version being used, then we know it's doing the fallback.
4a793b4 to
da35be9
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: omkarjoshi0304 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Merge Failed. This change or one of its cross-repo dependencies was unable to be automatically merged with the current state of its repository. Please rebase the change and upload a new patchset. |
da35be9 to
cbc4e59
Compare
927c36e to
ba2298b
Compare
9e8f1f8 to
0ec4544
Compare
Akrog
left a comment
There was a problem hiding this comment.
The code looks good in terms of functionality, though there are a couple of things I would prefer to see it changed before merging.
|
|
||
| // ActiveOCPVersion contains the OCP version being used for RAG configuration | ||
| // Will be one of: "4.16", "4.18", "latest", or empty if OCP RAG is disabled | ||
| ActiveOCPRAGVersion string `json:"activeOCPVersion,omitempty"` |
There was a problem hiding this comment.
I think this in an easy change and we should be consistent
| type: string | ||
| enableOCPRAG: | ||
| default: false | ||
| description: EnableOCPRAG enables automatic OCP documentation based |
There was a problem hiding this comment.
Don't write the name of the field in the description, please.
| type: string | ||
| ocpVersionOverride: | ||
| description: |- | ||
| OCPVersionOverride allows forcing a specific OCP version instead of auto-detection. |
| properties: | ||
| activeOCPVersion: | ||
| description: |- | ||
| ActiveOCPVersion contains the OCP version being used for RAG configuration |
internal/controller/ocp_version.go
Outdated
| // "latest" -> "ocp-product-docs-latest" | ||
| func GetOCPIndexName(version string) string { | ||
| // For 'latest', keep it as-is | ||
| if version == OCPVersionLatest { |
There was a problem hiding this comment.
This if section seems unnecessary to me.
You can just do the code in L115-L116 and you'll get the same results, it's just that the replacing of the "." with "_" doesn't do anything.
internal/controller/ocp_version.go
Outdated
|
|
||
| // IsSupportedOCPVersion checks if the version is explicitly supported in RAG DB | ||
| func IsSupportedOCPVersion(version string) bool { | ||
| for _, supported := range SupportedOCPVersions { |
There was a problem hiding this comment.
nit: I believe we could use slices.Contains for this:
https://pkg.go.dev/slices#Contains
https://golangtutorial.dev/tips/golang-slice-contains-method/
internal/controller/ocp_version.go
Outdated
| return overrideVersion, false, nil | ||
| } | ||
|
|
||
| // Use detected version |
432d848 to
4081e9d
Compare
lpiwowar
left a comment
There was a problem hiding this comment.
There is at least one thing that I believe should be fixed in the Kuttl tests 🙈 .
| reason: Ready | ||
| message: OpenStack Lightspeed created | ||
| --- | ||
| apiVersion: lightspeed.openstack.org/v1beta1 |
There was a problem hiding this comment.
question (blocking): This does not feel right. Why is there a second OpenStackLightspeed instance in this file? There should be only one. The changes should be made to the OpenStackLighspeed instance above.
| userDataCollection: | ||
| feedbackDisabled: true | ||
| transcriptsDisabled: true | ||
| --- |
There was a problem hiding this comment.
suggestion (non-blocking): If you want to assert the OpenStackLightspeed update, then let's move it to a separate file 06-assert-openstacklightspeed-update.yaml.
| catalogSourceName: redhat-operators | ||
| catalogSourceNamespace: openshift-marketplace |
There was a problem hiding this comment.
issue (non-blocking): These changes do not harm. Just something to keep in mind that the PR should focus on one thing. And if you want to make some changes in adjecent code then ideally it should be a separate commit.
13eaed6 to
3b1af57
Compare
Add automatic OpenShift Container Platform (OCP) version detection to enable version-specific documentation in the RAG (Retrieval Augmented Generation) system. The operator now detects the cluster's OCP version and configures the appropriate documentation sources. Features: - Auto-detect OCP cluster version from ClusterVersion resource - Support for manual version override via ocpVersionOverride field - Fallback to 'latest' docs for unsupported OCP versions - Enable/disable OCP RAG via enableOCPRAG field - Status field activeOCPRAGVersion shows currently active version - Watch ClusterVersion changes to trigger reconciliation on upgrades Supported OCP versions: 4.16, 4.18 Tests: - Updated KUTTL tests to validate OCP version override functionality - Split assertion files for better test organization - Fixed test field names and condition expectations Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
3b1af57 to
ac2a209
Compare
|
LGTM!:) 👍 Once the CI passes, I'm ok with merging. |
|
/lgtm Thanks! |
|
/lgtm |
d1328e1
into
openstack-lightspeed:main
Description
This PR implements the automatic detection of the OpenShift (OCP) cluster version to dynamically select the correct RAG database directory and index name.
Key Changes
config.openshift.io/v1/ClusterVersionAPI.4.16->/ocp/4.16).4.20), it defaults tolatestand sets a Warning condition in the Status.detectedOCPVersion: The actual version found on the cluster.activeOCPVersion: The version string used for paths (e.g.,latestor4.18).ocpVersionFallback: Boolean flag indicating if fallback logic was used.Testing / Verification
I have verified this functionality on a local CRC environment running OCP 4.20 (which triggers the fallback logic as expected).
1. Verification of Logic (Unit Tests)
Ran
go test ./internal/controller/...covering detection, path generation, and fallback logic.status:
activeOCPVersion: latest
conditions:
message: Waiting for the OpenShift Lightspeed operator to deploy.
reason: Requested
severity: Info
status: "False"
type: Ready
message: 'Cluster version 4.20 is not explicitly supported. Using ''latest'' OCP
documentation. Supported versions: [4.16 4.18 latest]'
reason: Ready
status: "True"
type: OCPVersionResolved
message: Waiting for the OpenShift Lightspeed operator to deploy.
reason: Requested
severity: Info
status: "False"
type: OpenShiftLightspeedOperatorReady
message: OpenStack Lightspeed not started
reason: Init
status: Unknown
type: OpenStackLightspeedReady
detectedOCPVersion: "4.20"
observedGeneration: 1
ocpVersionFallback: true