SPLAT-2668: hypershift/aws/ccm: enable optional managed security group conformance job#77567

Open
mtulio wants to merge 1 commit into openshift:main from mtulio:splat-2587-hypershift-conformance-nlb

@mtulio
Contributor

@mtulio mtulio commented Apr 9, 2026

Summary

Add monthly periodic jobs to validate the AWS CCM managed security group feature
(AWSServiceLBNetworkSecurityGroup) on HyperShift for 4.23 and 5.0.

Changes

  • Periodics (release-4.23 and release-5.0): Two monthly jobs each:
    • e2e-aws-ovn-conformance-ccm — conformance with managed SG enabled
    • e2e-aws-ovn-conformance-ccm-techpreview — same with TechPreviewNoUpgrade
    • Both use hypershift-aws-conformance workflow with TEST_SKIPS adjusted
      to include NLB security group tests
  • Slack alerts: Failure/error notifications to #forum-ocp-splat-alerts-aws

Context

Dedicated jobs to validate CCM managed SG on HyperShift without disrupting
existing monitoring. Once the feature is stable, these will be removed and the
AWSServiceLBNetworkSecurityGroup skip reverted from baseline conformance.
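
As a rough sketch, the two periodic entries described above might look like the following in the ci-operator config. The job names, `@monthly` cron, workflow, cluster profile, `PUBLIC_ONLY`, and `GUEST_FEATURE_SET` values are taken from this PR's summary and review walkthrough; the `TEST_SKIPS` value is a placeholder, since the actual skip expression is not quoted in this thread.

```yaml
# Hypothetical excerpt for illustration; not copied from the PR diff.
tests:
- as: e2e-aws-ovn-conformance-ccm
  cron: '@monthly'
  steps:
    cluster_profile: hypershift-aws
    env:
      PUBLIC_ONLY: "true"
      # Placeholder: the real skip expression is not shown in this thread.
      TEST_SKIPS: "<skip expression that no longer excludes the NLB security group tests>"
    workflow: hypershift-aws-conformance
- as: e2e-aws-ovn-conformance-ccm-techpreview
  cron: '@monthly'
  steps:
    cluster_profile: hypershift-aws
    env:
      # Only the -techpreview variant sets the feature set.
      GUEST_FEATURE_SET: TechPreviewNoUpgrade
      PUBLIC_ONLY: "true"
      TEST_SKIPS: "<same placeholder as above>"
    workflow: hypershift-aws-conformance
```

Keeping these as separate entries, rather than changing `TEST_SKIPS` in the baseline conformance jobs, is what lets the feature be exercised without affecting existing monitoring.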

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 9, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 9, 2026

@mtulio: This pull request references SPLAT-2587 which is a valid jira issue.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot requested review from devguyio and enxebre April 9, 2026 02:19
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 9, 2026

@mtulio: This pull request references SPLAT-2587 which is a valid jira issue.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

The new presubmit job for the hypershift project overrides TEST_SKIPS from the conformance workflow. A dedicated job was created to avoid disrupting existing monitoring while letting us validate the PR individually. Once the tests are validated on HyperShift and the feature is delivered and stable, we will remove this job along with the TEST_SKIPS entry for AWSServiceLBNetworkSecurityGroup.

@mtulio
Contributor Author

mtulio commented Apr 9, 2026

This is expected to fail as the feature is not merged on hypershift:

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-conformance-aws-ccm-nlb-sg

@openshift-ci-robot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mtulio
Contributor Author

mtulio commented Apr 9, 2026

/assign enxebre

@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 9, 2026

@mtulio: This pull request references SPLAT-2587 which is a valid jira issue.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

The new presubmit job for the hypershift project overrides TEST_SKIPS from the conformance workflow. A dedicated job was created to avoid disrupting existing monitoring while letting us validate the PR individually. Once the tests are validated on HyperShift and the feature is delivered and stable, we will remove this job along with the TEST_SKIPS entry for AWSServiceLBNetworkSecurityGroup.

https://redhat.atlassian.net/browse/SPLAT-2668

@mtulio mtulio changed the title from "SPLAT-2587: hypershift/aws/ccm: enable managed security group tests to conformance" to "SPLAT-2668: hypershift/aws/ccm: enable managed security group tests to conformance" Apr 9, 2026
@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 9, 2026

@mtulio: This pull request references SPLAT-2668 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

The new presubmit job for the hypershift project overrides TEST_SKIPS from the conformance workflow. A dedicated job was created to avoid disrupting existing monitoring while letting us validate the PR individually. Once the tests are validated on HyperShift and the feature is delivered and stable, we will remove this job along with the TEST_SKIPS entry for AWSServiceLBNetworkSecurityGroup.

https://redhat.atlassian.net/browse/SPLAT-2668

@openshift-ci-robot
Contributor

openshift-ci-robot commented Apr 9, 2026

@mtulio: This pull request references SPLAT-2668 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

The new presubmit job for the hypershift project overrides TEST_SKIPS from the conformance workflow. A dedicated job was created to avoid disrupting existing monitoring while letting us validate the PR individually. Once the tests are validated on HyperShift and the feature is delivered and stable, we will remove this job along with the TEST_SKIPS entry for AWSServiceLBNetworkSecurityGroup.

https://redhat.atlassian.net/browse/SPLAT-2587
https://redhat.atlassian.net/browse/SPLAT-2668

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from 2110c4c to ddf4d3a Compare April 9, 2026 12:59
@mtulio
Contributor Author

mtulio commented Apr 9, 2026

job is passing: rehearse-77567-pull-ci-openshift-hypershift-main-e2e-conformance-aws-ccm-nlb-sg #2042066510754615296

I just fixed the "ci metadata" with EOL fixes (nit):

/pj-rehearse ack

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot openshift-merge-bot Bot added rehearsals-ack Signifies that rehearsal jobs have been acknowledged and removed rehearsals-ack Signifies that rehearsal jobs have been acknowledged labels Apr 9, 2026
@mtulio
Contributor Author

mtulio commented Apr 9, 2026

/assign @muraee

@mtulio mtulio changed the title from "SPLAT-2668: hypershift/aws/ccm: enable managed security group tests to conformance" to "SPLAT-2668: hypershift/aws/ccm: enable optional managed security group conformance job" Apr 9, 2026
@mtulio
Contributor Author

mtulio commented Apr 9, 2026

/assign @enxebre

@mtulio
Contributor Author

mtulio commented Apr 9, 2026

@enxebre @muraee WDYT about creating a periodic for this job, with alerts to Slack (my team's channel), so I can monitor it closely? Are you OK with adding one more job for that purpose? I will remove it as part of completing the task https://redhat.atlassian.net/browse/SPLAT-2668

@vr4manta
Contributor

vr4manta commented Apr 9, 2026

/lgtm

@openshift-ci openshift-ci Bot added lgtm Indicates that a PR is ready to be merged. and removed lgtm Indicates that a PR is ready to be merged. labels Apr 9, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Apr 10, 2026

New changes are detected. LGTM label has been removed.

@mtulio
Contributor Author

mtulio commented Apr 10, 2026

Adding the periodic with alerts to the SPLAT channel to closely monitor the progress of the NLB+SG feature.

@mtulio
Contributor Author

mtulio commented Apr 10, 2026

/pj-rehearse periodic-ci-openshift-release-main-hypershift-4.22-hypershift-conformance-aws-ccm-nlb-sg

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mtulio
Contributor Author

mtulio commented May 4, 2026

A subset PR has been filed to isolate the changes: #78757

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from f1a1529 to 957491c Compare May 6, 2026 06:44
@coderabbitai
Contributor

coderabbitai Bot commented May 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

Add AWS CCM conformance job entries (one optional main job, two monthly periodics) that use the hypershift-aws-conformance workflow with PUBLIC_ONLY: "true" and CCM/NLB-related TEST_SKIPS; update prowgen Slack reporter to notify #forum-ocp-splat-alerts-aws for these jobs.

Changes

AWS CCM Conformance tests

| Layer / File(s) | Summary |
| --- | --- |
| Main optional test entry (ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml) | Insert e2e-conformance-aws-ccm (always_run: false, optional: true) with cluster_profile: hypershift-aws, workflow: hypershift-aws-conformance, env: PUBLIC_ONLY="true", and TEST_SKIPS that exclude NLB target-node-labels and NLB internal (hairpinning) reachability scenarios. |
| Monthly periodic tests (release-4.23) (ci-operator/config/openshift/hypershift/openshift-hypershift-release-4.23__periodics.yaml) | Add e2e-aws-ovn-conformance-ccm and e2e-aws-ovn-conformance-ccm-techpreview (both cron: '@monthly') using cluster_profile: hypershift-aws, workflow: hypershift-aws-conformance, env: PUBLIC_ONLY="true", and the same TEST_SKIPS. The -techpreview variant also sets GUEST_FEATURE_SET: TechPreviewNoUpgrade. |

Prowgen Slack reporter

| Layer / File(s) | Summary |
| --- | --- |
| Slack reporter config (ci-operator/config/openshift/hypershift/.config.prowgen) | Add slack_reporter targeting #forum-ocp-splat-alerts-aws for job_states_to_report: [failure, error], provide a report_template with job name/state and links, include job_names for e2e-conformance-aws-ccm, e2e-aws-ovn-conformance-ccm, e2e-aws-ovn-conformance-ccm-techpreview, and set excluded_variants for specified jobs. |
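
A minimal sketch of how such a slack_reporter stanza might look in .config.prowgen, assuming the field names listed above. The report_template text here is illustrative rather than the PR's actual template, and excluded_variants is omitted because its values are not quoted in this thread.

```yaml
# Hypothetical excerpt for illustration; not copied from the PR diff.
slack_reporter:
- channel: '#forum-ocp-splat-alerts-aws'
  job_states_to_report:
  - failure
  - error
  # Assumed template: reports the job name, final state, and a link to the logs.
  report_template: ':red_jenkins_circle: Job *{{.Spec.Job}}* ended with *{{.Status.State}}*. <{{.Status.URL}}|View logs>'
  job_names:
  - e2e-aws-ovn-conformance-ccm
  - e2e-aws-ovn-conformance-ccm-techpreview
```

Limiting job_states_to_report to failure and error keeps the channel quiet while the jobs pass, which suits a channel used for alerting rather than status tracking.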

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • openshift/release#79127: Modifies Prowgen Slack reporter configs to route AWS-related job alerts to #forum-ocp-splat-alerts-aws and updates AWS job lists.

Suggested labels

lgtm, rehearsals-ack

🚥 Pre-merge checks | ✅ 12
✅ Passed checks (12 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: enabling an optional managed security group conformance job for hypershift/aws/ccm. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | PR only modifies CI/CD config files (YAML), not Ginkgo test code. Custom check applies to test names in test code, which are absent here. CI job names are static and descriptive. |
| Test Structure And Quality | ✅ Passed | This PR contains only CI configuration YAML files, not Ginkgo test code. The check targets test code quality which is not applicable here. |
| Microshift Test Compatibility | ✅ Passed | PR only modifies CI configuration (YAML and Prowgen) to configure existing test jobs. No new Ginkgo test code is added, so the MicroShift compatibility check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR does not add new Ginkgo e2e tests. It only modifies CI configuration files to add test job entries that run existing conformance workflows. The check applies only to new test code. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | This PR modifies only CI/CD test configuration files, not deployment manifests, operator code, or controllers. No scheduling constraints or topology assumptions are introduced. |
| Ote Binary Stdout Contract | ✅ Passed | This PR modifies only CI configuration files (YAML), not test binary source code. The OTE Binary Stdout Contract check targets Go source code, making it not applicable here. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds no new Ginkgo test code; only modifies CI/CD configuration to schedule existing workflows. The check for IPv6/disconnected compatibility is not applicable to configuration files. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@openshift-ci
Contributor

openshift-ci Bot commented May 6, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: mtulio, vr4manta
Once this PR has been reviewed and has the lgtm label, please ask for approval from enxebre. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Contributor

openshift-ci-robot commented May 6, 2026

@mtulio: This pull request references SPLAT-2668 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Create a dedicated conformance job enabling managed security group feature of CCM for NLB, so that we can validate the feature on hypershift at PR
openshift/hypershift#7460

The new presubmit job for the hypershift project overrides TEST_SKIPS from the conformance workflow. A dedicated job was created to avoid disrupting existing monitoring while letting us validate the PR individually. Once the tests are validated on HyperShift and the feature is delivered and stable, we will remove this job along with the TEST_SKIPS entry for AWSServiceLBNetworkSecurityGroup.

https://redhat.atlassian.net/browse/SPLAT-2587
https://redhat.atlassian.net/browse/SPLAT-2668

Summary by CodeRabbit

Release Notes

  • Tests
    • Added conformance test for AWS cloud controller manager supporting Network Load Balancer configurations with security group validation
    • Added weekly periodic test for AWS OVN infrastructure validation to extend regular testing coverage

@mtulio
Contributor Author

mtulio commented May 6, 2026

/pj-rehearse periodic-ci-openshift-hpershift-main-hypershift-4.23-e2e-aws-ovn-conformance-ccm

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot
Contributor

@mtulio: job(s): periodic-ci-openshift-hpershift-main-hypershift-4.23-e2e-aws-ovn-conformance-ccm either don't exist or were not found to be affected, and cannot be rehearsed

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from 957491c to 1be2a84 Compare May 6, 2026 11:52
@mtulio
Contributor Author

mtulio commented May 6, 2026

/pj-rehearse periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from 1be2a84 to f5086a5 Compare May 7, 2026 21:23
@mtulio
Contributor Author

mtulio commented May 7, 2026

/pj-rehearse periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
ci-operator/config/openshift/hypershift/openshift-hypershift-release-4.23__periodics.yaml (1)

161-170: ⚡ Quick win

Stagger the weekly schedules to avoid synchronized capacity spikes.

Line 161 and Line 170 both use @weekly, so both jobs trigger together. Using offset explicit cron times reduces quota contention and flake risk on hypershift-aws.

Suggested change
 - as: e2e-aws-ovn-conformance-ccm
-  cron: '@weekly'
+  cron: 13 2 * * 1
   steps:
     cluster_profile: hypershift-aws
@@
 - as: e2e-aws-ovn-conformance-ccm-techpreview
-  cron: '@weekly'
+  cron: 43 2 * * 1
   steps:
     cluster_profile: hypershift-aws
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/config/openshift/hypershift/openshift-hypershift-release-4.23__periodics.yaml`
around lines 161 - 170, Two jobs in this snippet share the same schedule
(`@weekly`), causing them to run simultaneously; update one of the cron fields to
an explicit, staggered weekly time to avoid capacity spikes. Locate the job with
workflow: hypershift-aws-conformance and the job labeled as:
e2e-aws-ovn-conformance-ccm-techpreview and replace one of their cron: '@weekly'
entries with an explicit cron expression (e.g., a different day/time) so the
runs are offset; ensure the new cron is still weekly and document the chosen
offset in a short comment.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 58dc910b-bdc7-46b7-b22a-3eb7d3eabf90

📥 Commits

Reviewing files that changed from the base of the PR and between 1be2a84 and f5086a5.

⛔ Files ignored due to path filters (2)
  • ci-operator/jobs/openshift/hypershift/openshift-hypershift-main-presubmits.yaml is excluded by !ci-operator/jobs/**
  • ci-operator/jobs/openshift/hypershift/openshift-hypershift-release-4.23-periodics.yaml is excluded by !ci-operator/jobs/**
📒 Files selected for processing (2)
  • ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml
  • ci-operator/config/openshift/hypershift/openshift-hypershift-release-4.23__periodics.yaml
🚧 Files skipped from review as they are similar to previous changes (1)
  • ci-operator/config/openshift/hypershift/openshift-hypershift-main.yaml

@openshift-ci
Contributor

openshift-ci Bot commented May 7, 2026

@mtulio: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/periodic-ci-openshift-release-main-hypershift-4.22-hypershift-conformance-aws-ccm-nlb-sg | f1a1529 | link | unknown | /pj-rehearse periodic-ci-openshift-release-main-hypershift-4.22-hypershift-conformance-aws-ccm-nlb-sg |
| ci/rehearse/periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview | f5086a5 | link | unknown | /pj-rehearse periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@mtulio
Contributor Author

mtulio commented May 11, 2026

Analysing the failed test [cloud-provider-aws-e2e-openshift] loadbalancer NLB [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup] should have correct security group rules for service ports [Suite:openshift/conformance/parallel] from job r4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview #2052500713367408640: the test fails when describing the load balancers:

I0507 23:01:09.418187 121774 helper.go:32] describing load balancers with DNS aa8e7bd5378944f25b018513b8d58a4d-5de60da620c43a5d.elb.us-east-1.amazonaws.com
I0507 23:01:09.429376 121774 loadbalancer.go:560] Unexpected error: failed to find load balancer with DNS name aa8e7bd5378944f25b018513b8d58a4d-5de60da620c43a5d.elb.us-east-1.amazonaws.com: 
    <*errors.errorString | 0xc00007b2b0>: 
    failed to describe load balancers: operation error Elastic Load Balancing v2: DescribeLoadBalancers, https response error StatusCode: 0, RequestID: , request send failed, Post "https://elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com/": dial tcp: lookup elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com on 172.30.0.10:53: no such host
    {
        s: "failed to describe load balancers: operation error Elastic Load Balancing v2: DescribeLoadBalancers, https response error StatusCode: 0, RequestID: , request send failed, Post \"https://elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com/\": dial tcp: lookup elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com on 172.30.0.10:53: no such host",
    }
  [FAILED] in [It] - github.com/openshift/cluster-cloud-controller-manager-operator/openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go:560 @ 05/07/26 23:01:09.429
I0507 23:01:09.429648 121774 framework.go:378] Found DeleteNamespaceOnFailure=false and current test failed, skipping namespace deletion!

fail [github.com/openshift/cluster-cloud-controller-manager-operator/openshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.go:560]: failed to find load balancer with DNS name aa8e7bd5378944f25b018513b8d58a4d-5de60da620c43a5d.elb.us-east-1.amazonaws.com: failed to describe load balancers: operation error Elastic Load Balancing v2: DescribeLoadBalancers, https response error StatusCode: 0, RequestID: , request send failed, Post "https://elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com/": dial tcp: lookup elasticloadbalancing.5aa684a8-6f26-470d-8e09-e0c0a971e7e4.amazonaws.com on 172.30.0.10:53: no such host

However, CCM creates the load balancer correctly, as the CCM logs show:

I0507 23:01:03.405746       1 event.go:389] "Event occurred" object="cloud-provider-aws-1055/nlb-sg-rules-test" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuringLoadBalancer" message="Ensuring load balancer"
I0507 23:01:04.366382       1 aws.go:2280] Creating NLB security group "k8s-cloudpro-nlbsgrul-fab1e3edaf" for service "cloud-provider-aws-1055/nlb-sg-rules-test"
I0507 23:01:04.756415       1 aws.go:2288] Created NLB security group "sg-0e89bb072cb2a2514" for service "cloud-provider-aws-1055/nlb-sg-rules-test"
I0507 23:01:04.890304       1 aws_loadbalancer.go:248] Creating load balancer for cloud-provider-aws-1055/nlb-sg-rules-test with name: aa8e7bd5378944f25b018513b8d58a4d
I0507 23:01:05.702056       1 aws_loadbalancer.go:832] Creating load balancer target group for cloud-provider-aws-1055/nlb-sg-rules-test with name: k8s-cloudpro-nlbsgrul-50674cfefd (IP address type: ipv4)
I0507 23:01:06.162245       1 aws_loadbalancer.go:804] Creating load balancer listener for cloud-provider-aws-1055/nlb-sg-rules-test
I0507 23:01:06.261005       1 aws_loadbalancer.go:832] Creating load balancer target group for cloud-provider-aws-1055/nlb-sg-rules-test with name: k8s-cloudpro-nlbsgrul-0ffc093d38 (IP address type: ipv4)
I0507 23:01:06.701404       1 aws_loadbalancer.go:804] Creating load balancer listener for cloud-provider-aws-1055/nlb-sg-rules-test
I0507 23:01:08.449479       1 event.go:389] "Event occurred" object="cloud-provider-aws-1055/nlb-sg-rules-test" fieldPath="" kind="Service" apiVersion="v1" type="Normal" reason="EnsuredLoadBalancer" message="Ensured load balancer"

The problem is that the test is not retrying enough times and/or not retrying correctly. We fixed a similar issue in the upstream tests [1], but the fix needs to be ported downstream, because OTE/downstream does not use the upstream library [2]. All tests without this fix are failing: [OCPFeatureGate:AWSServiceLBNetworkSecurityGroup]

[1] https://redhat.atlassian.net/browse/OCPBUGS-83399
[2] https://github.com/openshift/cluster-cloud-controller-manager-operator/tree/main/openshift-tests/ccm-aws-tests

The bug https://redhat.atlassian.net/browse/OCPBUGS-85414 has been filed to track this issue.

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from 8fb5448 to 690c8f5 Compare May 12, 2026 02:44
@mtulio
Contributor Author

mtulio commented May 12, 2026

As described in the comment #77567 (comment), the new job is expected to fail because it is blocked by OCPBUGS-85414. We need this job to validate the fix in the PR that is currently WIP at openshift/cluster-cloud-controller-manager-operator#462

/pj-rehearse ack

@openshift-merge-bot
Contributor

@mtulio: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot openshift-merge-bot Bot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label May 12, 2026
@mtulio
Contributor Author

mtulio commented May 12, 2026

/test config check-gh-automation

@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from 690c8f5 to b6636f4 Compare May 12, 2026 14:14
@openshift-merge-bot openshift-merge-bot Bot removed the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label May 12, 2026

'
job_names:
- e2e-conformance-aws-ccm
Contributor Author

remove, not used anymore

Suggested change
- e2e-conformance-aws-ccm

Create a dedicated conformance job enabling managed security group
feature of CCM for NLB, so that we can validate the feature on
hypershift at PR
openshift/hypershift#7460
@mtulio mtulio force-pushed the splat-2587-hypershift-conformance-nlb branch from b6636f4 to 43d8f31 Compare May 12, 2026 14:23
Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

5 participants