@@ -196,7 +196,8 @@ configure_claude() {
 "Write(//tmp/**)",
 "Bash(bash plugins/microshift-ci/scripts/*)",
 "Bash(python3 plugins/microshift-ci/scripts/*)",
-"Skill(microshift-ci:*)"
+"Skill(microshift-ci:*)",
+"mcp__openshift-ci__*"
 ]
 }
 }
@@ -223,6 +224,31 @@ EOF
     else
         echo "WARNING: Jira API token or username not available. Jira MCP will not be available."
     fi
+
+    # Configure the OpenShift CI MCP — read-only Sippy/Release-Controller/
+    # Search.CI access used by the analysis skills for job history and
+    # known-regression context. No credentials involved. Failures are
+    # non-fatal: skills record the absence in their analysis gaps.
+    echo "Configuring OpenShift CI MCP..."
+    local -r ocimcp_version="v0.5.0"
+    local -r ocimcp_sha256="a9221011c8aded3108a89a9ee8fa19bcd86daed0582d997ff51913445d5eb53e"
+    local -r ocimcp_bin="${WORKDIR}/bin/openshift-ci-mcp"
+    mkdir -p "${WORKDIR}/bin"
+    if curl -sL --retry 3 -o "${ocimcp_bin}" \
+             "https://github.com/openshift-eng/openshift-ci-mcp/releases/download/${ocimcp_version}/openshift-ci-mcp-linux-amd64" &&
+         echo "${ocimcp_sha256}  ${ocimcp_bin}" | sha256sum --check --quiet; then
Comment on lines +237 to +239

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add explicit network timeouts to the MCP binary download.

The new curl call retries, but without --connect-timeout/--max-time it can hang long enough to burn step budget during network stalls.

Suggested fix
-    if curl -sL --retry 3 -o "${ocimcp_bin}" \
+    if curl -sL --retry 3 --retry-delay 2 --connect-timeout 10 --max-time 120 -o "${ocimcp_bin}" \
             "https://github.com/openshift-eng/openshift-ci-mcp/releases/download/${ocimcp_version}/openshift-ci-mcp-linux-amd64" &&
         echo "${ocimcp_sha256}  ${ocimcp_bin}" | sha256sum --check --quiet; then
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/openshift/edge-tooling/microshift-ci/doctor/openshift-edge-tooling-microshift-ci-doctor-commands.sh`
around lines 237 - 239, The curl command downloading the openshift-ci-mcp binary
(in the condition starting with "if curl -sL --retry 3") lacks explicit network
timeout settings. This can cause the download to hang indefinitely during
network stalls, exhausting the step budget. Add `--connect-timeout` and
`--max-time` flags to the curl invocation to enforce reasonable timeouts (for
example, 10 seconds for connection and 120 seconds per attempt, as in the
suggested fix above), so the command fails fast when network conditions are
poor rather than consuming excessive budget waiting for a response.
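
As a minimal sketch of the bounded download plus checksum check described above, the logic could be factored into two small helpers. The names `verify_sha256` and `fetch_verified` are hypothetical and do not appear in the PR; the `-S` and `-f` curl flags are additions beyond the suggested fix (they surface errors and fail on HTTP error responses), and the timeout values are the ones from the suggestion, not measured requirements.

```shell
#!/usr/bin/env bash

# Hypothetical helper: verify_sha256 FILE EXPECTED_HEX
# Returns 0 only when FILE hashes to EXPECTED_HEX.
verify_sha256() {
    local file="$1" expected="$2"
    # The checksum line format is "<hash>  <path>" (two spaces for text mode).
    echo "${expected}  ${file}" | sha256sum --check --quiet
}

# Hypothetical helper: fetch_verified URL DEST EXPECTED_HEX
# Downloads URL to DEST with bounded connect and per-attempt transfer times,
# then verifies the checksum. DEST is removed on any failure.
fetch_verified() {
    local url="$1" dest="$2" expected="$3"
    if curl -sSfL --retry 3 --retry-delay 2 \
            --connect-timeout 10 --max-time 120 \
            -o "${dest}" "${url}" &&
        verify_sha256 "${dest}" "${expected}"; then
        return 0
    fi
    rm -f "${dest}"
    return 1
}
```

With helpers like these, the condition in the script would reduce to something like `if fetch_verified "${url}" "${ocimcp_bin}" "${ocimcp_sha256}"; then`, where `url` is an assumed local variable, and the failure branch would no longer need its own `rm -f`.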

+        chmod +x "${ocimcp_bin}"
+        claude mcp add --scope user --transport stdio openshift-ci -- "${ocimcp_bin}" --tools core,jobs,tests,prs,search
+        echo "Waiting for OpenShift CI MCP to become available..."
+        if wait_for_mcp_status "openshift-ci" "Connected"; then
+            echo "OpenShift CI MCP is available."
+        else
+            echo "WARNING: OpenShift CI MCP did not connect. Job history will not be available."
+        fi
+    else
+        echo "WARNING: Failed to download or verify openshift-ci-mcp ${ocimcp_version}. Job history will not be available."
+        rm -f "${ocimcp_bin}"
+    fi
 }
 
 #
@@ -243,10 +269,16 @@ load_secrets
 configure_claude
 
 # Use the edge-tooling source pre-installed in the image
-SRC_DIR="${EDGE_TOOLING_DIR}"
+#SRC_DIR="${EDGE_TOOLING_DIR}"
+#PLUGIN_DIR="${SRC_DIR}/plugins/microshift-ci"
+
+SRC_DIR=/tmp/edge-tooling
 PLUGIN_DIR="${SRC_DIR}/plugins/microshift-ci"
+git clone https://github.com/pmtk/edge-tooling.git -b ci-doctor-rca "${SRC_DIR}"
Comment on lines +272 to +277

⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Hardcoded personal fork and feature branch must be reverted before merge.

The workspace setup clones from pmtk/edge-tooling.git branch ci-doctor-rca instead of using the pre-installed EDGE_TOOLING_DIR. If merged, production CI would depend on a personal fork and feature branch that may disappear or diverge.

Restore the original configuration or update to use the official repository:

Suggested fix (restore original)
-#SRC_DIR="${EDGE_TOOLING_DIR}"
-#PLUGIN_DIR="${SRC_DIR}/plugins/microshift-ci"
-#cd "${SRC_DIR}"
-
-SRC_DIR=/tmp/edge-tooling
+SRC_DIR="${EDGE_TOOLING_DIR}"
 PLUGIN_DIR="${SRC_DIR}/plugins/microshift-ci"
-git clone https://github.com/pmtk/edge-tooling.git -b ci-doctor-rca "${SRC_DIR}"
+cd "${SRC_DIR}"
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@ci-operator/step-registry/openshift/edge-tooling/microshift-ci/doctor/openshift-edge-tooling-microshift-ci-doctor-commands.sh`
around lines 272 - 278, The workspace setup in the script is cloning from a
personal fork (pmtk/edge-tooling.git) on a feature branch (ci-doctor-rca)
instead of using the pre-installed EDGE_TOOLING_DIR variable. Revert to the
original configuration by uncommenting the three lines that set SRC_DIR to use
EDGE_TOOLING_DIR and PLUGIN_DIR relative to it, then remove the hardcoded
SRC_DIR assignment to /tmp/edge-tooling and the entire git clone command that
references the personal fork and feature branch.
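
One way to keep fork testing possible without hardcoding it in the merged script is an opt-in override, sketched below. This is an illustration only: `resolve_src_dir`, `EDGE_TOOLING_REPO_OVERRIDE`, and `EDGE_TOOLING_REF_OVERRIDE` are hypothetical names that do not exist in the PR, and the default ref `main` is an assumption about the upstream repository.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the directory the step should use as SRC_DIR.
# Without an override it prints the image's pre-installed EDGE_TOOLING_DIR;
# with EDGE_TOOLING_REPO_OVERRIDE set it clones that repository instead.
resolve_src_dir() {
    local repo="${EDGE_TOOLING_REPO_OVERRIDE:-}"
    local ref="${EDGE_TOOLING_REF_OVERRIDE:-main}"   # assumed default ref
    local clone_dir="/tmp/edge-tooling"

    if [ -z "${repo}" ]; then
        # Default path: the source pre-installed in the image.
        echo "${EDGE_TOOLING_DIR}"
        return 0
    fi

    # Diagnostics go to stderr so stdout carries only the directory.
    echo "WARNING: using override ${repo} at ${ref}" >&2
    git clone --depth 1 -b "${ref}" "${repo}" "${clone_dir}" >&2 || return 1
    echo "${clone_dir}"
}
```

The script body would then read `SRC_DIR="$(resolve_src_dir)"` followed by the original `PLUGIN_DIR` assignment and `cd "${SRC_DIR}"`, so the merged default stays on `EDGE_TOOLING_DIR` and a fork is only used when a rehearsal explicitly sets the override.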


cd "${SRC_DIR}"


# Configure the GitHub token for MicroShift repo operations
{ set +x; export GITHUB_TOKEN="${GITHUB_TOKEN_USHIFT}"; set -x; }

@@ -260,17 +292,20 @@ echo "Running automatic closing of duplicate rebase PRs..."
     --filter 'NO-ISSUE: rebase-release'
 echo "Automatic closing of duplicate rebase PRs completed"
 
-# Run analysis on all releases and open rebase PRs (45m and 100 turns).
+# Run analysis on all releases and open rebase PRs (60m and 150 turns).
+# The deeper per-job root cause analysis (sosreport extraction, source
+# correlation, causal chains) needs more wall clock than the old
+# surface-level scan did.
 echo "Running Claude to analyze MicroShift CI jobs and pull requests..."
 CLAUDE_RC=0
-timeout 2700 claude \
+timeout 3600 claude \
     --model "${CLAUDE_MODEL}" \
-    --max-turns 100 \
+    --max-turns 150 \
     --output-format stream-json \
     --plugin-dir "${PLUGIN_DIR}" \
     -p "/microshift-ci:doctor ${RELEASE_VERSIONS}" \
     --verbose 2>&1 | tee "${CLAUDE_DOCTOR_LOG}" || CLAUDE_RC=$?
-check_claude_rc "${CLAUDE_RC}" "doctor" 45
+check_claude_rc "${CLAUDE_RC}" "doctor" 60
 
 # Run bug creation for failed jobs (15m and 50 turns).
 echo "Running Claude to create bugs for failed jobs..."
@@ -42,7 +42,7 @@ ref:
     requests:
       cpu: 2000m
       memory: 4Gi
-  timeout: 1h30m0s
+  timeout: 2h15m0s
   grace_period: 10m0s
   documentation: |-
     Analyzes MicroShift periodic jobs and pull requests using Claude AI.
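
The per-phase limits in the commands script and the step-level `timeout` above have to stay consistent with each other. The following is a rough sketch of how that could be sanity-checked; `duration_to_seconds` and `fits_budget` are hypothetical helpers, not part of the PR, and the parser assumes durations in the `2h15m0s` style shown in the ref (no fractional units, no leading zeros). Only the two phases visible in this diff (60m doctor, 15m bug creation) are known here; any other phases in the script would need to be added to the sum.

```shell
#!/usr/bin/env bash

# Hypothetical helper: convert a duration such as "2h15m0s" to seconds.
duration_to_seconds() {
    local rest="$1" h=0 m=0 s=0
    case "${rest}" in *h*) h="${rest%%h*}"; rest="${rest#*h}" ;; esac
    case "${rest}" in *m*) m="${rest%%m*}"; rest="${rest#*m}" ;; esac
    case "${rest}" in *s*) s="${rest%%s*}" ;; esac
    echo $(( h * 3600 + m * 60 + s ))
}

# Hypothetical helper: fits_budget STEP_TIMEOUT PHASE_SECONDS...
# Returns 0 when the phase timeouts sum to no more than the step timeout.
fits_budget() {
    local budget total=0 phase
    budget="$(duration_to_seconds "$1")"
    shift
    for phase in "$@"; do
        total=$(( total + phase ))
    done
    [ "${total}" -le "${budget}" ]
}
```

For example, `fits_budget 2h15m0s 3600 900` checks the doctor phase (`timeout 3600`) plus the 15-minute bug-creation phase against the new step timeout of 8100 seconds.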