Skip to content

NO-JIRA: Replace blind sleep with TCP readiness checks in helm test setup#16689

Closed
stefanonardo wants to merge 1 commit into
openshift:mainfrom
stefanonardo:chartmuseum-fix
Closed

NO-JIRA: Replace blind sleep with TCP readiness checks in helm test setup#16689
stefanonardo wants to merge 1 commit into
openshift:mainfrom
stefanonardo:chartmuseum-fix

Conversation

@stefanonardo

@stefanonardo stefanonardo commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Summary

The ci/prow/backend job fails intermittently because setupTestWithTls() uses a hardcoded time.Sleep(5s) before connecting to ChartMuseum/Zot. In CI, under resource contention, the servers sometimes aren't ready within 5s. Replace all four time.Sleep calls with waitForTCP — a retry loop that polls the port with 1s intervals and a 30s timeout.

Test plan

  • ci/prow/backend passes
  • Local: 10/10 runs pass

Summary by CodeRabbit

  • Tests
    • Improved test setup reliability by waiting for local services to become reachable before continuing.
    • Replaced fixed delays with connection checks, which should make setup fail faster when a service does not start correctly.

@openshift-ci-robot

Copy link
Copy Markdown
Contributor

@stefanonardo: This pull request explicitly references no jira issue.

Details

In response to this:

Summary

ChartMuseum v0.14.0 (Go 1.17) can't complete TLS handshakes with OpenSSL 3.5.x clients, breaking ci/prow/backend for all PRs since ~June 24. Upgrading to v0.16.5 (Go 1.25.8) fixes it. Verified locally.

Test plan

  • ci/prow/backend passes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jul 1, 2026
@openshift-ci openshift-ci Bot requested review from baijum and sowmya-sl July 1, 2026 06:36
@openshift-ci openshift-ci Bot added the component/backend Related to backend label Jul 1, 2026
@stefanonardo stefanonardo changed the title UNO-JIRA: upgrade ChartMuseum from v0.14.0 to v0.16.5 for TLS compatibility NO-JIRA: upgrade ChartMuseum from v0.14.0 to v0.16.5 for TLS compatibility Jul 1, 2026
@coderabbitai

coderabbitai Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

Walkthrough

Test setup for helm actions replaces fixed 5-second sleeps with a new waitForTCP helper that polls TCP connectivity against ChartMuseum and Zot service ports across TLS, no-TLS, basic-auth, and OCI basic-auth modes, returning mode-specific timeout errors on failure.

Changes

TCP Readiness Polling

Layer / File(s) Summary
waitForTCP helper
pkg/helm/actions/setup_test.go
Adds net import and a waitForTCP(addr, timeout) helper that repeatedly dials a TCP address, closes successful connections, and returns a timeout error if the deadline is exceeded.
Apply readiness checks across setup modes
pkg/helm/actions/setup_test.go
Replaces fixed sleeps with waitForTCP calls for ChartMuseum and Zot ports in TLS, no-TLS, basic-auth, and OCI basic-auth setups, each returning mode-specific timeout errors.

Estimated code review effort: 2 (Simple) | ~10 minutes

Sequence Diagram(s)

sequenceDiagram
  participant TestSetup
  participant waitForTCP
  participant ChartMuseum
  participant Zot
  TestSetup->>waitForTCP: check localhost:port for ChartMuseum
  waitForTCP->>ChartMuseum: DialTimeout("tcp", addr, 1s)
  ChartMuseum-->>waitForTCP: connection success/failure
  TestSetup->>waitForTCP: check localhost:port for Zot
  waitForTCP->>Zot: DialTimeout("tcp", addr, 1s)
  Zot-->>waitForTCP: connection success/failure
  waitForTCP-->>TestSetup: ready or timeout error
Loading

Important

Pre-merge checks failed

Please resolve all errors before merging. Addressing warnings is optional.

❌ Failed checks (1 error, 3 warnings)

Check name Status Explanation Resolution
Ote Binary Stdout Contract ❌ Error TestMain/setup code still writes to stdout via fmt.Println and exec.Cmd.Stdout=os.Stdout, which violates the OTE JSON-on-stdout contract. Redirect child process output and warnings to stderr or GinkgoWriter, and avoid stdout writes in TestMain/startup cleanup.
Description check ⚠️ Warning It includes a summary and test plan, but omits most required template sections such as root cause, solution details, test cases, and reviewers. Add the missing template sections: Analysis/Root cause, Solution description, Test setup, Test cases, Browser conformance, Additional info, and reviewers/assignees.
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Ipv6 And Disconnected Network Test Compatibility ⚠️ Warning The test setup still binds Zot to 127.0.0.1 and downloads ChartMuseum/Helm/Zot from get.helm.sh and GitHub, so it assumes IPv4 loopback and public internet access. Use IPv6-aware loopback/bind-all configs, net.JoinHostPort for host:port formatting, and mirrored/internal artifacts or a disconnected skip for offline CI.
✅ Passed checks (11 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed Touched test file contains no Ginkgo specs or dynamic test titles; the change only adds setup helpers and TCP polling.
Test Structure And Quality ✅ Passed No Ginkgo tests here; setup helpers are single-purpose, use cleanup defer, and all readiness waits have 30s timeouts.
Microshift Test Compatibility ✅ Passed pkg/helm/actions/setup_test.go only adds setup/polling helpers; no new Ginkgo specs, MicroShift guards, or unsupported OpenShift APIs/namespaces found.
Single Node Openshift (Sno) Test Compatibility ✅ Passed The change only adds startup TCP polling in setup helpers; no new Ginkgo specs or single-node assumptions appear in pkg/helm/actions.
Topology-Aware Scheduling Compatibility ✅ Passed Only pkg/helm/actions/setup_test.go changed; it adds test-only TCP readiness polling and no manifests, controllers, or scheduling constraints.
No-Weak-Crypto ✅ Passed Changed code only adds TCP readiness polling and no MD5/SHA1/DES/RC4/3DES/custom crypto or secret comparisons appear in the diff.
Container-Privileges ✅ Passed PR only updates helm test setup; no container/K8s manifest changes or privileged fields were found in the repo scan.
No-Sensitive-Data-In-Logs ✅ Passed No sensitive values in new log/error messages; only localhost ports and generic readiness failures are logged.
Title check ✅ Passed The title accurately reflects the main change: replacing fixed sleeps with TCP readiness polling in helm test setup.
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands.

@sowmya-sl

Copy link
Copy Markdown
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jul 1, 2026
@openshift-ci

openshift-ci Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sowmya-sl, stefanonardo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 1, 2026
@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

1 similar comment
@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

@stefanonardo

Copy link
Copy Markdown
Contributor Author

/hold testing consistency in CI

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 1, 2026
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jul 1, 2026
@openshift-ci

openshift-ci Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

New changes are detected. LGTM label has been removed.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/helm/actions/setup_test.go`:
- Around line 200-211: In waitForTCP, replace the bare net.DialTimeout call with
a net.Dialer using a timeout and DialContext so the connection attempt is
cancellable via context and satisfies the noctx lint. Update the helper to
accept or derive a context, pass it into the dialer, and keep the timeout
behavior equivalent. Also handle the conn.Close result explicitly in the
successful path instead of discarding it, even if that just means checking and
surfacing any close error before returning.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e62121c0-8231-48f0-a8c0-dab88182a1b0

📥 Commits

Reviewing files that changed from the base of the PR and between edd4633 and 6f7aee9.

📒 Files selected for processing (2)
  • pkg/helm/actions/setup_test.go
  • pkg/helm/actions/testdata/downloadChartmuseum.sh
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/helm/actions/testdata/downloadChartmuseum.sh

Comment thread pkg/helm/actions/setup_test.go
@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

2 similar comments
@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@stefanonardo stefanonardo changed the title NO-JIRA: upgrade ChartMuseum from v0.14.0 to v0.16.5 for TLS compatibility Replace blind sleep with TCP readiness checks in helm test setup Jul 1, 2026
@stefanonardo stefanonardo changed the title Replace blind sleep with TCP readiness checks in helm test setup NO-JIRA: Replace blind sleep with TCP readiness checks in helm test setup Jul 1, 2026
@stefanonardo

Copy link
Copy Markdown
Contributor Author

/test backend

@stefanonardo

Copy link
Copy Markdown
Contributor Author

/unhold

@openshift-ci openshift-ci Bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 1, 2026
@openshift-ci

openshift-ci Bot commented Jul 1, 2026

Copy link
Copy Markdown
Contributor

@stefanonardo: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-playwright c81b61e link false /test e2e-playwright

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

conn.Close()
return nil
}
time.Sleep(time.Second)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker.
Do we try each second? Or maybe 2-5 seconds each time?

@stefanonardo

Copy link
Copy Markdown
Contributor Author

duplicate #16687

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. component/backend Related to backend jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants