OCPEDGE-2753: Add catalog index image info to LVMS CI doctor report by kasturinarra · Pull Request #186 · openshift-eng/edge-tooling

kasturinarra · 2026-06-11T11:29:52Z

Summary

Prow-job skill now extracts the LVMS catalog index image from lvms-catalogsource/build-log.txt and resolves digest/build date/source commit via skopeo inspect
create-report.py renders this as a styled info box in each release section, showing image reference, SHA digest, build date, and a linked source commit
Gated on component == "lvm-operator" — MicroShift reports are unaffected

Test plan

Run /lvms-ci:doctor and verify the index image info appears in the HTML report
Verify MicroShift reports are unchanged

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Release reports now display LVMS Index Image metadata (image reference, digest, build timestamp, shortened commit link) and show extraction errors when present.
- Automatic extraction of catalog image metadata integrated into release workflows so per-release index-image data is collected and included in generated reports.

The prow-job skill now extracts the LVMS catalog index image from lvms-catalogsource build logs and resolves the digest via skopeo. create-report.py renders this as an info box per release section, showing image, digest, build date, and source commit. Gated on component=lvm-operator so MicroShift is unaffected. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci · 2026-06-11T11:30:01Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: kasturinarra

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~OWNERS~~ [kasturinarra]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai · 2026-06-11T11:30:07Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 854ae1cb-3279-47c2-844c-8e1c86485ed8

📥 Commits

Reviewing files that changed from the base of the PR and between 14fd30e and afe6bf8.

📒 Files selected for processing (1)

plugins/shared/scripts/extract-index-image.sh

🚧 Files skipped from review as they are similar to previous changes (1)

plugins/shared/scripts/extract-index-image.sh

Walkthrough

This PR adds LVMS index image metadata extraction and reporting. A new extract-index-image.sh utility scans job artifacts for LVM_INDEX_IMAGE build log entries, resolves image digests and metadata using skopeo, and writes JSON results. doctor.sh orchestrates this extraction during finalize and refresh workflows. create-report.py gains helper functions to load and render the extracted metadata as an styled info box in HTML reports.

Changes

LVMS Index Image Extraction and Reporting

Layer / File(s)	Summary
Index image extraction utility script `plugins/shared/scripts/extract-index-image.sh`, `plugins/lvms-ci/scripts/extract-index-image.sh`	New `extract-index-image.sh` scans job artifacts for `LVM_INDEX_IMAGE` references in build logs, uses `skopeo inspect` to resolve digests and build metadata, and writes results to `index-image/release-<release>.json`. Symlink in lvms-ci directory for access.
Orchestration in doctor.sh workflows `plugins/shared/scripts/doctor.sh`	`cmd_finalize` and `cmd_refresh` now invoke `extract-index-image.sh` after PR aggregation and before HTML report generation, with warning logging on extraction failure.
HTML report helpers for index image rendering `plugins/shared/scripts/create-report.py`	New `.index-image-info` CSS styling and helper functions `extract_index_image()` and `_render_index_image()` load metadata and render it as conditional HTML with catalog image, digest, build timestamp, GitHub commit link, and optional error text.
Report generation wiring for index image data `plugins/shared/scripts/create-report.py`	Extended `render_release_section()`, `generate_html()`, and `main()` signatures to thread index metadata through the report pipeline. `main()` builds an `index_data` lookup per release version, `generate_html()` creates `_idx` and passes it to `render_release_section()`, which injects the rendered index-image HTML before the breakdown and issues tables.

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 9 | ❌ 2

❌ Failed checks (2 warnings)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 69.23% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.
Ai-Attribution	⚠️ Warning	PR commit message includes “Co-Authored-By: Claude Opus 4.6 …” but contains no “Generated-by:” or “Assisted-by:” trailers.	Add Red Hat “Generated-by:” or “Assisted-by:” trailers for the AI assistance and remove/avoid using “Co-Authored-By” for the AI tool.

✅ Passed checks (9 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically summarizes the main change: adding catalog index image information to the LVMS CI doctor report, referencing the associated ticket OCPEDGE-2753.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
No-Weak-Crypto	✅ Passed	Searched plugins/shared and plugins/lvms-ci scripts for MD5/SHA1/DES/RC4/3DES/Blowfish/ECB and for compare_digest/token/secret== patterns; no matches found in PR-related files.
Container-Privileges	✅ Passed	Repo-wide scan found 0 occurrences of privileged:true, hostPID/hostNetwork/hostIPC:true, allowPrivilegeEscalation:true, SYS_ADMIN, runAsUser:0, or runAsNonRoot:false.
No-Sensitive-Data-In-Logs	✅ Passed	Logged messages in PR code only include constant warnings, image/digest metadata, and file/exception paths; no passwords/tokens/API keys/PII are printed.
No-Hardcoded-Secrets	✅ Passed	Scanned PR-related scripts (create-report.py, doctor.sh, extract-index-image.sh) for PEM private keys, embedded-credential URLs, Authorization/Bearer literals, secret/token/password variable assign...
No-Injection-Vectors	✅ Passed	Scanned PR-relevant files (create-report.py, doctor.sh, extract-index-image.sh and symlink target) for eval/exec/pickle.loads/yaml.load/os.system/dangerouslySetInnerHTML/subprocess shell=True; none...

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 705-729: extract_index_image currently returns info from the first
file found in sorted(glob_mod.glob(pattern)), which can pick stale/arbitrary
metadata; change the selection to be deterministic by sorting candidate files by
modification time (most recent first) before extracting metadata: get the file
list from glob_mod.glob(pattern), sort it using os.path.getmtime (or
pathlib.Path.stat().st_mtime) in descending order, then iterate that sorted
list, parse each file as done in the function (refer to variables pattern,
filepath and the info extraction block) and return the first info with an
"image".

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 50f8bd69-56ce-4b5f-a3ff-4de58f4d3dd9

📥 Commits

Reviewing files that changed from the base of the PR and between 8131026 and 38214ce.

📒 Files selected for processing (2)

plugins/lvms-ci/skills/prow-job/SKILL.md
plugins/shared/scripts/create-report.py

The extract_index_image function returns None when no ## Index Image section exists in the report files, so the component gate is unnecessary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/shared/scripts/create-report.py (1)

698-730: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Per CONTRIBUTING.md: Parsing logic should be robust and testable, not brittle manual string splitting.

The current implementation uses fragile string operations (line.split("**Image:**", 1)[1].strip()) that:

Break silently with format variations (extra whitespace, markdown changes)
Extract invalid data without validation (e.g., malformed SHA, non-ISO dates, invalid image refs)
Lack test coverage to prevent regressions

As per coding guidelines: "Parsing/validation logic (e.g., scanning build logs and extracting 'Index Image' fields) should include positive and negative test cases rather than relying on brittle/manual parsing."

🛡️ Recommended improvements

Use regex patterns with validation:

+import re
+
 def extract_index_image(workdir, version):
     """Extract index image info from per-job report files.
 
     Scans per-job report files for an '## Index Image' section containing
     Image, Digest, Built, and Source Commit fields.
     """
+    # Patterns with validation
+    IMAGE_PATTERN = re.compile(r'^\s*-\s*\*\*Image:\*\*\s+(.+\S)\s*$')
+    DIGEST_PATTERN = re.compile(r'^\s*-\s*\*\*Digest:\*\*\s+(sha256:[a-f0-9]{64})\s*$')
+    BUILT_PATTERN = re.compile(r'^\s*-\s*\*\*Built:\*\*\s+(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s*$')
+    COMMIT_PATTERN = re.compile(r'^\s*-\s*\*\*Source Commit:\*\*\s+([a-f0-9]{7,40})\s*$')
+
     pattern = os.path.join(workdir, "jobs", f"release-{version}-job-*.txt")
     for filepath in sorted(glob_mod.glob(pattern)):
         try:
             with open(filepath, "r") as f:
                 content = f.read()
         except IOError:
             continue
 
         if "## Index Image" not in content:
             continue
 
         info = {}
         for line in content.split("\n"):
-            line = line.strip()
-            if line.startswith("- **Image:**"):
-                info["image"] = line.split("**Image:**", 1)[1].strip()
-            elif line.startswith("- **Digest:**"):
-                info["digest"] = line.split("**Digest:**", 1)[1].strip()
-            elif line.startswith("- **Built:**"):
-                info["built"] = line.split("**Built:**", 1)[1].strip()
-            elif line.startswith("- **Source Commit:**"):
-                info["commit"] = line.split("**Source Commit:**", 1)[1].strip()
+            if m := IMAGE_PATTERN.match(line):
+                info["image"] = m.group(1)
+            elif m := DIGEST_PATTERN.match(line):
+                info["digest"] = m.group(1)
+            elif m := BUILT_PATTERN.match(line):
+                info["built"] = m.group(1)
+            elif m := COMMIT_PATTERN.match(line):
+                info["commit"] = m.group(1)
 
         if info.get("image"):
             return info
 
     return None

Add unit tests covering:
- Valid "## Index Image" section extraction
- Missing sections (returns None)
- Malformed field lines (skipped)
- Invalid digest/date/commit formats (rejected)
- Multiple job files (picks first with valid data)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/create-report.py` around lines 698 - 730, The parsing
in extract_index_image is brittle; replace the manual string splits with robust
regex parsing and validation: when scanning each file matched by pattern in
extract_index_image, locate the "## Index Image" section and run compiled
regexes to capture named groups for Image, Digest, Built, and Source Commit;
validate the digest with a sha256 hex pattern, validate Built by parsing as ISO
date/time (datetime.fromisoformat or strict format), validate the image ref with
a permissive image reference regex, and validate the commit as a 7+ hex SHA
before accepting the record; skip files or individual fields that fail
validation and continue to the next job file, returning the first fully valid
info dict, and add unit tests covering valid extraction, missing section,
malformed lines, invalid digest/date/commit, and multiple job files picking the
first valid one.

Source: Coding guidelines

🧹 Nitpick comments (1)

plugins/shared/scripts/create-report.py (1)
744-750: ⚖️ Poor tradeoff

Consider making the repository URL derivable from image metadata.

The GitHub commit link hardcodes openshift/lvm-operator (line 749), which works for LVMS but assumes the catalog image is always built from that specific repository. If the image metadata includes a source repository label (e.g., org.opencontainers.image.source or vcs-url), consider deriving the repo URL from that instead of hardcoding it.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/create-report.py` around lines 744 - 750, The HTML
link for the commit currently hardcodes "openshift/lvm-operator" when building
the GitHub URL; update the block that uses index_info, commit and the
lines.append call to first look for repository source metadata (e.g.,
index_info.get("org.opencontainers.image.source") or index_info.get("vcs-url")),
normalize the value to a GitHub "owner/repo" path (strip protocol and .git
suffix), and use that derived path when formatting the commit link, falling back
to "openshift/lvm-operator" only if no source label is present; ensure you still
escape values with _e(commit/short) and validate the derived URL before
inserting into lines.append.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 698-730: The parsing in extract_index_image is brittle; replace
the manual string splits with robust regex parsing and validation: when scanning
each file matched by pattern in extract_index_image, locate the "## Index Image"
section and run compiled regexes to capture named groups for Image, Digest,
Built, and Source Commit; validate the digest with a sha256 hex pattern,
validate Built by parsing as ISO date/time (datetime.fromisoformat or strict
format), validate the image ref with a permissive image reference regex, and
validate the commit as a 7+ hex SHA before accepting the record; skip files or
individual fields that fail validation and continue to the next job file,
returning the first fully valid info dict, and add unit tests covering valid
extraction, missing section, malformed lines, invalid digest/date/commit, and
multiple job files picking the first valid one.

---

Nitpick comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 744-750: The HTML link for the commit currently hardcodes
"openshift/lvm-operator" when building the GitHub URL; update the block that
uses index_info, commit and the lines.append call to first look for repository
source metadata (e.g., index_info.get("org.opencontainers.image.source") or
index_info.get("vcs-url")), normalize the value to a GitHub "owner/repo" path
(strip protocol and .git suffix), and use that derived path when formatting the
commit link, falling back to "openshift/lvm-operator" only if no source label is
present; ensure you still escape values with _e(commit/short) and validate the
derived URL before inserting into lines.append.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0f73be2a-dfa8-4d32-b28a-43003759c3b6

📥 Commits

Reviewing files that changed from the base of the PR and between 38214ce and a7dd0be.

📒 Files selected for processing (1)

plugins/shared/scripts/create-report.py

kasturinarra · 2026-06-11T18:00:57Z

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/shared/scripts/create-report.py (1)> 698-730: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Per CONTRIBUTING.md: Parsing logic should be robust and testable, not brittle manual string splitting.
The current implementation uses fragile string operations (line.split("**Image:**", 1)[1].strip()) that:

Break silently with format variations (extra whitespace, markdown changes)
Extract invalid data without validation (e.g., malformed SHA, non-ISO dates, invalid image refs)
Lack test coverage to prevent regressions

As per coding guidelines: "Parsing/validation logic (e.g., scanning build logs and extracting 'Index Image' fields) should include positive and negative test cases rather than relying on brittle/manual parsing."

🛡️ Recommended improvements

Use regex patterns with validation:

+import re
+
 def extract_index_image(workdir, version):
     """Extract index image info from per-job report files.
 
     Scans per-job report files for an '## Index Image' section containing
     Image, Digest, Built, and Source Commit fields.
     """
+    # Patterns with validation
+    IMAGE_PATTERN = re.compile(r'^\s*-\s*\*\*Image:\*\*\s+(.+\S)\s*$')
+    DIGEST_PATTERN = re.compile(r'^\s*-\s*\*\*Digest:\*\*\s+(sha256:[a-f0-9]{64})\s*$')
+    BUILT_PATTERN = re.compile(r'^\s*-\s*\*\*Built:\*\*\s+(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s*$')
+    COMMIT_PATTERN = re.compile(r'^\s*-\s*\*\*Source Commit:\*\*\s+([a-f0-9]{7,40})\s*$')
+
     pattern = os.path.join(workdir, "jobs", f"release-{version}-job-*.txt")
     for filepath in sorted(glob_mod.glob(pattern)):
         try:
             with open(filepath, "r") as f:
                 content = f.read()
         except IOError:
             continue
 
         if "## Index Image" not in content:
             continue
 
         info = {}
         for line in content.split("\n"):
-            line = line.strip()
-            if line.startswith("- **Image:**"):
-                info["image"] = line.split("**Image:**", 1)[1].strip()
-            elif line.startswith("- **Digest:**"):
-                info["digest"] = line.split("**Digest:**", 1)[1].strip()
-            elif line.startswith("- **Built:**"):
-                info["built"] = line.split("**Built:**", 1)[1].strip()
-            elif line.startswith("- **Source Commit:**"):
-                info["commit"] = line.split("**Source Commit:**", 1)[1].strip()
+            if m := IMAGE_PATTERN.match(line):
+                info["image"] = m.group(1)
+            elif m := DIGEST_PATTERN.match(line):
+                info["digest"] = m.group(1)
+            elif m := BUILT_PATTERN.match(line):
+                info["built"] = m.group(1)
+            elif m := COMMIT_PATTERN.match(line):
+                info["commit"] = m.group(1)
 
         if info.get("image"):
             return info
 
     return None

Add unit tests covering:
- Valid "## Index Image" section extraction
- Missing sections (returns None)
- Malformed field lines (skipped)
- Invalid digest/date/commit formats (rejected)
- Multiple job files (picks first with valid data)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/create-report.py` around lines 698 - 730, The parsing
in extract_index_image is brittle; replace the manual string splits with robust
regex parsing and validation: when scanning each file matched by pattern in
extract_index_image, locate the "## Index Image" section and run compiled
regexes to capture named groups for Image, Digest, Built, and Source Commit;
validate the digest with a sha256 hex pattern, validate Built by parsing as ISO
date/time (datetime.fromisoformat or strict format), validate the image ref with
a permissive image reference regex, and validate the commit as a 7+ hex SHA
before accepting the record; skip files or individual fields that fail
validation and continue to the next job file, returning the first fully valid
info dict, and add unit tests covering valid extraction, missing section,
malformed lines, invalid digest/date/commit, and multiple job files picking the
first valid one.

Source: Coding guidelines

🧹 Nitpick comments (1)

plugins/shared/scripts/create-report.py (1)> 744-750: ⚖️ Poor tradeoff
Consider making the repository URL derivable from image metadata.
The GitHub commit link hardcodes openshift/lvm-operator (line 749), which works for LVMS but assumes the catalog image is always built from that specific repository. If the image metadata includes a source repository label (e.g., org.opencontainers.image.source or vcs-url), consider deriving the repo URL from that instead of hardcoding it.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/create-report.py` around lines 744 - 750, The HTML
link for the commit currently hardcodes "openshift/lvm-operator" when building
the GitHub URL; update the block that uses index_info, commit and the
lines.append call to first look for repository source metadata (e.g.,
index_info.get("org.opencontainers.image.source") or index_info.get("vcs-url")),
normalize the value to a GitHub "owner/repo" path (strip protocol and .git
suffix), and use that derived path when formatting the commit link, falling back
to "openshift/lvm-operator" only if no source label is present; ensure you still
escape values with _e(commit/short) and validate the derived URL before
inserting into lines.append.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 698-730: The parsing in extract_index_image is brittle; replace
the manual string splits with robust regex parsing and validation: when scanning
each file matched by pattern in extract_index_image, locate the "## Index Image"
section and run compiled regexes to capture named groups for Image, Digest,
Built, and Source Commit; validate the digest with a sha256 hex pattern,
validate Built by parsing as ISO date/time (datetime.fromisoformat or strict
format), validate the image ref with a permissive image reference regex, and
validate the commit as a 7+ hex SHA before accepting the record; skip files or
individual fields that fail validation and continue to the next job file,
returning the first fully valid info dict, and add unit tests covering valid
extraction, missing section, malformed lines, invalid digest/date/commit, and
multiple job files picking the first valid one.

---

Nitpick comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 744-750: The HTML link for the commit currently hardcodes
"openshift/lvm-operator" when building the GitHub URL; update the block that
uses index_info, commit and the lines.append call to first look for repository
source metadata (e.g., index_info.get("org.opencontainers.image.source") or
index_info.get("vcs-url")), normalize the value to a GitHub "owner/repo" path
(strip protocol and .git suffix), and use that derived path when formatting the
commit link, falling back to "openshift/lvm-operator" only if no source label is
present; ensure you still escape values with _e(commit/short) and validate the
derived URL before inserting into lines.append.

ℹ️ Review info

This function is LVMS-specific (as noted in the docstring) and only processes lvm-operator catalog index images. The commit URL will always point to openshift/lvm-operator. No need to add complexity for a case that won't arise

ggiguash · 2026-06-12T08:38:56Z

+- **Source Commit:** <commit hash>
+```
+
+If `skopeo inspect` fails (e.g., image no longer exists), still report the image reference from the build log.


This needs to be scripted. Please, make sure to store all related files in a dedicated workdir sub-directory.

I have addressed your comment, When there are no lvms-catalogsource logs, it silently does nothing — no directory created, no files written, create-report.py gets None back and skips the info box.

Move the LVMS catalog index image extraction from LLM instructions in SKILL.md to a deterministic shell script (extract-index-image.sh). The script scans downloaded artifacts for lvms-catalogsource build logs, extracts the image reference, runs skopeo inspect, and writes JSON to a dedicated ${WORKDIR}/index-image/ subdirectory. doctor.sh calls it in finalize/refresh before HTML generation, and create-report.py reads the JSON files directly instead of parsing markdown from agent reports. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai

Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

plugins/shared/scripts/extract-index-image.sh (1)
1-163: ⚠️ Potential issue | 🟠 Major

Add test coverage for extract-index-image.sh parsing/validation

Per CONTRIBUTING.md: “any parsing/validation logic … must include tests with positive and negative cases.” There are no tests for plugins/shared/scripts/extract-index-image.sh, including extract_image_ref(), process_release(), and inspect_and_write().

Required test coverage:

extract_image_ref(): valid extraction, missing LVM_INDEX_IMAGE, malformed/missing build-log.txt content, invalid/empty image reference

process_release(): release with no jobs file, jobs with zero jobs, no lvms-catalogsource logs, invalid/empty artifacts_dir values, successful path

inspect_and_write(): skopeo missing, skopeo inspect failure, successful inspection producing expected JSON (digest/built/commit) and omission of empty fields
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/extract-index-image.sh` around lines 1 - 163, Add a
test suite that exercises extract_image_ref, process_release, and
inspect_and_write with positive and negative cases: for extract_image_ref add
tests for a log containing "LVM_INDEX_IMAGE is set to <img>", missing line,
malformed/empty content, and whitespace-only image; for process_release simulate
(via temporary WORKDIR and crafted jobs JSON) a missing jobs file, zero-length
jobs array, artifacts entries with missing/invalid artifacts_dir, no
lvms-catalogsource logs, and a successful end-to-end run that writes
release-<version>.json; for inspect_and_write test behavior when skopeo is
absent (e.g. remove from PATH or stub command), when skopeo inspect returns
non-zero, and when it returns valid JSON—assert output JSON contains
digest/built/commit only when present and omits empty fields. Use temporary
directories/files and cleanup, invoke the functions (sourcing the script) and
assert stderr messages and output JSON contents; mock/sketch skopeo by creating
a wrapper executable in PATH that emits controlled output to cover success and
failure paths.
Source: Coding guidelines

🧹 Nitpick comments (1)

plugins/shared/scripts/extract-index-image.sh (1)

130-145: ⚡ Quick win

Validate WORKDIR for path traversal.

Per coding guidelines: "Path traversal: canonicalize paths, reject ../"

The --workdir argument is accepted without validation. While WORKDIR itself is controlled by the caller (doctor.sh), defense-in-depth suggests validating it here to prevent misuse if called directly.

Recommendation:

Canonicalize WORKDIR using realpath
Optionally validate it's an absolute path

🛡️ Proposed validation

     if [[ -z "${WORKDIR}" ]]; then
         echo "Error: --workdir is required" >&2
         echo "Usage: $(basename "$0") --workdir DIR <release1,release2,...>" >&2
         return 1
     fi
+    
+    # Canonicalize workdir
+    WORKDIR=$(realpath -m "${WORKDIR}" 2>/dev/null || echo "")
+    if [[ -z "${WORKDIR}" || "${WORKDIR}" != /* ]]; then
+        echo "Error: invalid --workdir path" >&2
+        return 1
+    fi

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@plugins/shared/scripts/extract-index-image.sh` around lines 130 - 145,
Validate and canonicalize the WORKDIR in main(): after parsing --workdir and
before using it, use realpath to canonicalize WORKDIR (e.g.,
WORKDIR_REAL=$(realpath -m "${WORKDIR}" 2>/dev/null)), check realpath succeeded
and that WORKDIR_REAL is an absolute path and a directory (test -d), and
reject/exit with an error if it contains path traversal or is not resolvable;
replace subsequent uses of WORKDIR with the canonical WORKDIR_REAL and ensure
main() returns a non-zero code on validation failure.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@plugins/shared/scripts/create-report.py`:
- Around line 698-706: The function extract_index_image accepts an unsanitized
version and constructs a path directly (used by CLI code that passes user
input), enabling path traversal; fix by validating/sanitizing version (allow
only a tight pattern such as semantic-version or a whitelist of [0-9A-Za-z._-],
reject any '../' or path-separator characters), canonicalize the final path with
os.path.normpath/os.path.abspath and verify it stays inside the workdir (compare
commonprefix or check that abspath startswith workdir abspath) before calling
load_json(path); additionally add unit tests for extract_index_image covering
positive case (valid version and existing JSON), negative cases (missing file
returns None or expected behavior, malformed JSON raises/parses appropriately),
and explicit path-traversal attempts (e.g., version containing ../) to assert
that such inputs are rejected.

In `@plugins/shared/scripts/extract-index-image.sh`:
- Around line 79-124: process_release currently uses the untrusted release
parameter and artifacts_dir values directly in filesystem paths; validate inputs
to prevent path traversal by (1) ensuring release matches an allowed pattern
(e.g., regex like ^(main|[0-9]+\.[0-9]+)$) and rejecting any release containing
.. or slashes before constructing jobs_file or output paths, (2) canonicalizing
each artifacts_dir with realpath or readlink -f and verifying the resolved path
is a descendant of "${WORKDIR}/artifacts" (reject and skip if not), and (3)
applying the same release validation before creating
"${WORKDIR}/index-image/release-${release}.json"; on validation failure echo an
error and return. Use the process_release function, the jobs_file/artifacts_dirs
handling, and where inspect_and_write is called to locate and apply these
checks.
- Around line 27-32: The extract_image_ref function currently greps/seds the log
without validating format; replace the brittle pipeline with a robust
regex-based extraction that matches a valid container image reference (e.g.,
optional registry, repository/path, and either :tag or `@sha256`:digest) and
validate the captured value before returning (reject or return empty on
no-match); update extract_image_ref to accept the log path parameter (build_log)
and perform pattern matching using a single grep -Po or awk/Perl-compatible
regex to validate the image ref, and add unit tests covering: valid extraction,
no match (missing line), malformed lines, and invalid image formats to ensure
regressions are caught.

---

Outside diff comments:
In `@plugins/shared/scripts/extract-index-image.sh`:
- Around line 1-163: Add a test suite that exercises extract_image_ref,
process_release, and inspect_and_write with positive and negative cases: for
extract_image_ref add tests for a log containing "LVM_INDEX_IMAGE is set to
<img>", missing line, malformed/empty content, and whitespace-only image; for
process_release simulate (via temporary WORKDIR and crafted jobs JSON) a missing
jobs file, zero-length jobs array, artifacts entries with missing/invalid
artifacts_dir, no lvms-catalogsource logs, and a successful end-to-end run that
writes release-<version>.json; for inspect_and_write test behavior when skopeo
is absent (e.g. remove from PATH or stub command), when skopeo inspect returns
non-zero, and when it returns valid JSON—assert output JSON contains
digest/built/commit only when present and omits empty fields. Use temporary
directories/files and cleanup, invoke the functions (sourcing the script) and
assert stderr messages and output JSON contents; mock/sketch skopeo by creating
a wrapper executable in PATH that emits controlled output to cover success and
failure paths.

---

Nitpick comments:
In `@plugins/shared/scripts/extract-index-image.sh`:
- Around line 130-145: Validate and canonicalize the WORKDIR in main(): after
parsing --workdir and before using it, use realpath to canonicalize WORKDIR
(e.g., WORKDIR_REAL=$(realpath -m "${WORKDIR}" 2>/dev/null)), check realpath
succeeded and that WORKDIR_REAL is an absolute path and a directory (test -d),
and reject/exit with an error if it contains path traversal or is not
resolvable; replace subsequent uses of WORKDIR with the canonical WORKDIR_REAL
and ensure main() returns a non-zero code on validation failure.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: feba05f3-9443-4b03-99f7-889e1cd8a855

📥 Commits

Reviewing files that changed from the base of the PR and between a7dd0be and 14fd30e.

📒 Files selected for processing (4)

plugins/lvms-ci/scripts/extract-index-image.sh
plugins/shared/scripts/create-report.py
plugins/shared/scripts/doctor.sh
plugins/shared/scripts/extract-index-image.sh

The actual build log uses 'LVM_INDEX_IMAGE is set to: <image>' (with a colon) which the sed pattern did not handle, causing the colon to be included in the image reference. Update the pattern to strip both 'set to ' and 'set to: ' variants. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2026

coderabbitai Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 11, 2026

coderabbitai Bot requested changes Jun 11, 2026

View reviewed changes

Comment thread plugins/shared/scripts/create-report.py Outdated

kasturinarra changed the title ~~Add catalog index image info to LVMS CI doctor report~~ OCPEDGE-2753: Add catalog index image info to LVMS CI doctor report Jun 11, 2026

Remove component name check for index image extraction

a7dd0be

The extract_index_image function returns None when no ## Index Image section exists in the report files, so the component gate is unnecessary. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

coderabbitai Bot removed the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 11, 2026

coderabbitai Bot reviewed Jun 11, 2026

View reviewed changes

coderabbitai Bot approved these changes Jun 11, 2026

View reviewed changes

openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 11, 2026

ggiguash reviewed Jun 12, 2026

View reviewed changes

coderabbitai Bot removed the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 12, 2026

coderabbitai Bot requested changes Jun 12, 2026

View reviewed changes

Comment thread plugins/shared/scripts/create-report.py

Comment thread plugins/shared/scripts/extract-index-image.sh

Comment thread plugins/shared/scripts/extract-index-image.sh

coderabbitai Bot approved these changes Jun 12, 2026

View reviewed changes

openshift-ci Bot added the ready-for-human-review Indicates a PR has been reviewed by automated tools and is ready for human review label Jun 12, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

OCPEDGE-2753: Add catalog index image info to LVMS CI doctor report#186

OCPEDGE-2753: Add catalog index image info to LVMS CI doctor report#186
kasturinarra wants to merge 4 commits into
openshift-eng:mainfrom
kasturinarra:add-catalog-image-to-report

kasturinarra commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 •

edited by openshift-ci Bot

Loading

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Uh oh!

kasturinarra commented Jun 11, 2026

Uh oh!

ggiguash Jun 12, 2026

Uh oh!

kasturinarra Jun 12, 2026

Uh oh!

coderabbitai Bot left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

kasturinarra commented Jun 11, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Test plan

Summary by CodeRabbit

Uh oh!

openshift-ci Bot commented Jun 11, 2026

Uh oh!

coderabbitai Bot commented Jun 11, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

❌ Failed checks (2 warnings)

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

kasturinarra commented Jun 11, 2026

Uh oh!

ggiguash Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

kasturinarra Jun 12, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

kasturinarra commented Jun 11, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 11, 2026 •

edited by openshift-ci Bot

Loading