
fix: refresh OSS health snapshots monthly #531

Merged
tym83 merged 3 commits into main from fix/oss-health-monthly-refresh
May 8, 2026

Conversation

@tym83
Contributor

@tym83 tym83 commented May 7, 2026

Summary

  • refresh OSS Health data snapshots for DevStats, OSS Insight, and OpenSSF
  • keep the telemetry snapshot on the last corrected Grafana-matching data instead of overwriting it from /api/overview while that API differs from Grafana
  • fix the monthly OSS Health workflow by running Python scripts via python3 instead of relying on executable bits
  • make the telemetry-only workflow manual and PR-based instead of pushing directly to protected main
  • render telemetry tabs with stable labels: Month, Quarter, Year
  • fix OpenSSF last-updated parsing by using the English status page and normalizing HTML-stripped whitespace

Root Cause

  • The May 1 scheduled update-oss-health run failed with Permission denied when make tried to execute ./hack/update_oss_health.py.
  • The telemetry workflow attempted to push directly to main and was rejected by repository rules requiring PRs and DCO.
  • Pulling telemetry from /api/overview produced lower values than the Grafana-backed snapshot, so the monthly workflow must not overwrite telemetry until that source is fixed.
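
The Permission denied failure above comes down to the executable bit: launching a script by its path requires it, while handing the script to the interpreter does not. The following is a minimal POSIX-only sketch of that difference. It uses a throwaway temp file rather than the real hack/update_oss_health.py, and the function name `run_both_ways` and the script contents are made up for illustration.

```python
import os
import stat
import subprocess
import sys
import tempfile


def run_both_ways():
    """Run a non-executable script directly and via the interpreter.

    Returns a tuple (direct_result, interpreter_stdout).
    """
    with tempfile.TemporaryDirectory() as tmp:
        script = os.path.join(tmp, "demo_script.py")  # hypothetical stand-in
        with open(script, "w", encoding="utf-8") as fh:
            fh.write("#!/usr/bin/env python3\nprint('ok')\n")

        # Read/write for the owner only: no executable bit, as in the failed run.
        os.chmod(script, stat.S_IRUSR | stat.S_IWUSR)

        # Direct execution (what `./hack/...` does) needs the executable bit.
        try:
            subprocess.run([script], check=True)
            direct = "executed"
        except PermissionError:
            direct = "permission denied"

        # Passing the path to the interpreter only needs read access.
        via_interpreter = subprocess.run(
            [sys.executable, script],
            check=True,
            capture_output=True,
            text=True,
        ).stdout.strip()

        return direct, via_interpreter


if __name__ == "__main__":
    print(run_both_ways())
```

This is why invoking `python3 hack/update_oss_health.py` from make sidesteps the problem regardless of how file modes are checked out on the runner.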

Verification

  • python3 -m py_compile hack/update_oss_health.py hack/fetch_telemetry.py
  • HUGO_ENV=preview hugo --gc --minify --buildFuture -b https://deploy-preview-531--cozystack.netlify.app --destination /tmp/cozystack-site-public-telemetry-fix --cacheDir /tmp/hugo-cache
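
The `python3 -m py_compile` check above only verifies that the scripts parse and compile; it does not run them. A rough in-process equivalent, assuming a hypothetical helper named `check_syntax`, could look like this:

```python
import py_compile


def check_syntax(paths):
    """Byte-compile each file and collect compile errors by path.

    Returns a dict mapping failing paths to their error messages;
    an empty dict means every file compiled.
    """
    failures = {}
    for path in paths:
        try:
            # doraise=True turns compile errors into PyCompileError
            # instead of printing them to stderr.
            py_compile.compile(path, doraise=True)
        except py_compile.PyCompileError as exc:
            failures[path] = exc.msg
    return failures


if __name__ == "__main__":
    import sys

    problems = check_syntax(sys.argv[1:])
    for path, message in problems.items():
        print(f"{path}: {message}")
    sys.exit(1 if problems else 0)
```

Like the command-line form, this catches syntax errors but not runtime failures such as a changed upstream page format.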

Summary by CodeRabbit

  • Data Updates

    • Refreshed OSS Health metrics including DevStats, OpenSSF Best Practices, and OSS Insight data with current statistics.
  • UI Improvements

    • Updated telemetry period selector to display consistent period names ("Month", "Quarter", "Year").
  • Bug Fixes

    • Enhanced OpenSSF data extraction reliability.

Signed-off-by: tym83 <6355522@gmail.com>
@tym83 tym83 requested review from kvaps and lllamnyp as code owners May 7, 2026 05:48
@netlify

netlify Bot commented May 7, 2026

Deploy Preview for cozystack ready!

Name Link
🔨 Latest commit 92207e3
🔍 Latest deploy log https://app.netlify.com/projects/cozystack/deploys/69fe1ec64c86f400088fa84b
😎 Deploy Preview https://deploy-preview-531--cozystack.netlify.app

@coderabbitai
Contributor

coderabbitai Bot commented May 7, 2026

Walkthrough

This pull request updates the GitHub Actions workflows for OSS health and telemetry snapshots: the monthly OSS health workflow refreshes data and opens a PR only when changes are detected, and the telemetry workflow becomes manual and PR-based. The build configuration invokes Python scripts via explicit python3. The data collection scripts fix OpenSSF URL localization and HTML parsing. The JSON snapshot files for DevStats, OpenSSF, and OSS Insight are refreshed for the current periods. The UI template switches to fixed period labels.

Changes

OSS Health & Telemetry Refresh Pipeline

| Layer / File(s) | Summary |
| --- | --- |
| Fetch Telemetry Workflow: .github/workflows/fetch-telemetry.yml | Converts to manual-only workflow via workflow_dispatch; adds pull-requests: write permission and concurrency settings; reworks commit/push to use update-telemetry branch with timestamped signed-off messages and force-push; adds PR management step that checks for existing PR and creates one if absent. |
| OSS Health Workflow: .github/workflows/update-oss-health.yaml | Adds explicit contents and pull-requests permissions and concurrency configuration; implements change detection via git diff --cached --quiet to emit changed=true/false to $GITHUB_OUTPUT; commits only when changes detected to update-oss-health branch; gates PR creation step on changed == 'true'. |
| Build Configuration: Makefile | update-services target passes --pkgdir extra flag to ./hack/update_apps.sh; update-oss-health target invokes python3 hack/update_oss_health.py instead of direct script execution. |
| Data Collection Scripts: hack/fetch_telemetry.py, hack/update_oss_health.py | fetch_telemetry.py docstring updated to describe manual workflow and local backfill usage; update_oss_health.py changes OpenSSF URL from localized (pt-BR) to non-localized path; parse_openssf_last_updated() strips HTML tags and normalizes whitespace before regex extraction. |
| DevStats Snapshots: data/oss-health/devstats.json, static/oss-health-data/devstats.json | Month, quarter, and year periods refreshed: issue counts, date ranges, summary cards, language distributions, top contributors, and top PR authors updated; updated_at timestamp advanced to 2026-05-07T04:49:37Z. |
| OpenSSF Snapshots: data/oss-health/openssf.json, static/oss-health-data/openssf.json | Badge and check timestamps updated: badge_last_updated_at populated, last_checked_at and document updated_at advanced to 2026-05-07T04:49:37Z. |
| OSS Insight Snapshots: data/oss-health/ossinsight.json, static/oss-health-data/ossinsight.json | Month, quarter, year periods refreshed with updated issue counts, date ranges (moved to May window), summary cards (commits, PRs merged), top contributors, and top PR authors; updated_at timestamp advanced. |
| Summary Snapshots: data/oss-health/summary.json, static/oss-health-data/summary.json | Document-level updated_at timestamp updated to 2026-05-07T04:50:59Z and 2026-05-07T04:49:37Z respectively. |
| UI Period Labels: layouts/_default/oss-health-app.html | Telemetry period switcher changed from using dynamic payload.periods[name].label to fixed display strings: "Month", "Quarter", "Year". |
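
The change-detection step described for the OSS Health workflow can be sketched outside of YAML. The snippet below is an illustration only, not the workflow's actual step: the function names are hypothetical, and it assumes the refreshed files have already been staged with `git add`. It relies on `git diff --cached --quiet` exiting with 0 when nothing is staged and 1 when there are staged changes, and on the `name=value` line format of the file that `$GITHUB_OUTPUT` points to.

```python
import os
import subprocess


def staged_changes_present(repo_dir="."):
    """Return True if the git index differs from HEAD in repo_dir."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--quiet"],
        cwd=repo_dir,
    )
    # 0 = no staged changes, 1 = staged changes; anything else is an error.
    if result.returncode not in (0, 1):
        raise RuntimeError(f"git diff failed with exit code {result.returncode}")
    return result.returncode == 1


def write_changed_output(changed, output_path):
    """Append changed=true/false in the format step outputs are read from."""
    with open(output_path, "a", encoding="utf-8") as fh:
        fh.write(f"changed={'true' if changed else 'false'}\n")


if __name__ == "__main__":
    # GITHUB_OUTPUT is set by the Actions runner; outside CI this is a no-op.
    output_path = os.environ.get("GITHUB_OUTPUT")
    if output_path:
        write_changed_output(staged_changes_present(), output_path)
```

A later step can then be gated on the emitted value, which is how the walkthrough describes the PR creation step being skipped when the snapshots did not change.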

🎯 3 (Moderate) | ⏱️ ~25 minutes

🐰 Pipelines hop along, data refreshed bright,
Snapshots bundled in branches so tight,
Change detection's our friend, no wasteful PR spray,
Fixed labels hop cleanly, the UI's on display! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'fix: refresh OSS health snapshots monthly' accurately describes the main objective: setting up monthly OSS health snapshot refreshes with corrected workflows and permissions. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Contributor

@gemini-code-assist gemini-code-assist Bot left a comment


Code Review

This pull request updates the project's OSS health and telemetry data, refreshing metrics for commits, contributors, and issues across multiple JSON data files. Key changes include updating the Makefile to automate telemetry fetching, switching the OpenSSF status URL to English, and improving the parsing of the OpenSSF last updated date by stripping HTML tags. Feedback was provided to enhance the robustness of the HTML parsing logic in hack/update_oss_health.py by unescaping entities and normalizing whitespace to ensure consistent regex matching.

Comment thread hack/update_oss_health.py Outdated
Comment on lines +360 to +361
plain_text = re.sub(r"<[^>]+>", " ", page_text)
match = re.search(r"last updated on\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)", plain_text, re.IGNORECASE)
Contributor


medium

The HTML stripping and regex matching for the OpenSSF last updated date could be more robust. Normalizing whitespace after stripping tags and unescaping HTML entities (like &nbsp;) ensures the regex matches correctly even if the source formatting varies or contains non-breaking spaces.

Suggested change
- plain_text = re.sub(r"<[^>]+>", " ", page_text)
- match = re.search(r"last updated on\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)", plain_text, re.IGNORECASE)
+ plain_text = " ".join(unescape(re.sub(r"<[^>]*>", " ", page_text)).split())
+ match = re.search(r"last updated on\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)", plain_text, re.IGNORECASE)
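
To see why the suggested normalization matters, here is a self-contained sketch of the same idea. It is not the code in hack/update_oss_health.py: the function name `extract_last_updated` and the sample HTML are made up for illustration, and `unescape` is assumed to be `html.unescape` from the standard library.

```python
import re
from html import unescape

# Same timestamp pattern as in the review comment above.
LAST_UPDATED_RE = re.compile(
    r"last updated on\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2} UTC)",
    re.IGNORECASE,
)


def extract_last_updated(page_text):
    """Return the 'YYYY-MM-DD HH:MM:SS UTC' string, or None if absent."""
    # 1. Replace tags with spaces so adjacent text nodes do not fuse.
    without_tags = re.sub(r"<[^>]*>", " ", page_text)
    # 2. Decode entities such as &nbsp; into real characters.
    decoded = unescape(without_tags)
    # 3. Collapse every whitespace run (newlines, tabs, non-breaking
    #    spaces) into a single ASCII space.
    plain_text = " ".join(decoded.split())
    match = LAST_UPDATED_RE.search(plain_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Invented sample: a tag, a newline, and an entity split the timestamp.
    sample = "<p>Last updated on <span>2026-05-01\n  12:34:56&nbsp;UTC</span></p>"
    print(extract_last_updated(sample))
```

Without step 3, the literal single spaces inside the pattern would not match the newline and non-breaking space in the sample, so the search would return nothing.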

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hack/update_oss_health.py`:
- Around line 360-361: The regex for matching "last updated on ..." can fail
when HTML tag removal leaves extra/newline whitespace inside the timestamp;
after stripping tags into plain_text (variable plain_text produced by re.sub),
normalize whitespace (e.g., collapse all runs of whitespace to a single space
using re.sub(r"\s+", " ", plain_text)) before calling re.search so the timestamp
pattern in match reliably finds "YYYY-MM-DD HH:MM:SS UTC"; update the code
around plain_text and match to normalize whitespace prior to the re.search call.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: bc0b7d45-a905-4a29-9f55-5962e1bde124

📥 Commits

Reviewing files that changed from the base of the PR and between 3e58234 and e36d8b1.

📒 Files selected for processing (14)
  • .github/workflows/fetch-telemetry.yml
  • .github/workflows/update-oss-health.yaml
  • Makefile
  • data/oss-health/devstats.json
  • data/oss-health/openssf.json
  • data/oss-health/ossinsight.json
  • data/oss-health/summary.json
  • hack/fetch_telemetry.py
  • hack/update_oss_health.py
  • static/oss-health-data/devstats.json
  • static/oss-health-data/openssf.json
  • static/oss-health-data/ossinsight.json
  • static/oss-health-data/summary.json
  • static/oss-health-data/telemetry.json

Comment thread hack/update_oss_health.py Outdated
tym83 added 2 commits May 7, 2026 11:24
Signed-off-by: tym83 <6355522@gmail.com>
Signed-off-by: tym83 <6355522@gmail.com>
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
layouts/_default/oss-health-app.html (1)

164-176: ⚡ Quick win

renderTimeseries still reads payload.periods[name].label without a guard

renderTelemetry was hardened by switching to static labels, but renderTimeseries (unchanged) still calls payload.periods[name].label while iterating the full order array with no existence check. If a DevStats / OSS Insight snapshot ever omits one of the three periods, this throws a TypeError and leaves the view blank. Aligning it with the pattern introduced here would close that gap.

♻️ Suggested defensive fix for renderTimeseries
  const renderTimeseries = (payload) => {
    const order = ["month", "quarter", "year"];
+   const labels = { month: "Month", quarter: "Quarter", year: "Year" };
+   const available = order.filter((name) => payload.periods[name]);
    let active = "month";

    const renderActive = () => {
      const period = payload.periods[active];
      const controls = `
        <div class="oss-health-switcher" role="tablist" aria-label="Report period">
-         ${order.map((name) => `
+         ${available.map((name) => `
            <button class="oss-health-switcher__button${name === active ? " is-active" : ""}" data-period="${name}" role="tab" aria-selected="${name === active ? "true" : "false"}">
-             ${escapeHtml(payload.periods[name].label)}
+             ${labels[name]}
            </button>
          `).join("")}
        </div>
      `;
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@layouts/_default/oss-health-app.html` around lines 164 - 176,
renderTimeseries currently assumes payload.periods[name] exists when iterating
order (["month","quarter","year"]) and calling payload.periods[name].label,
which can throw if a period is missing; update renderTimeseries to first filter
order to only include names present in payload.periods (e.g. const available =
order.filter(n => payload.periods && payload.periods[n])) and use available for
rendering and mapping, default active to available[0] when the initial "month"
is absent, and when rendering labels use a safe fallback
(payload.periods[name].label || payload.periods[name].fallbackLabel || name) so
renderActive, the controls button generation and aria/selected logic all operate
only over existing payload.periods entries.
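
The guard logic described above is independent of the template language. As a language-neutral illustration (the real code is JavaScript in layouts/_default/oss-health-app.html), here is a small Python sketch of the selection rules: keep only periods that exist, fall back to the first available one when "month" is missing, and use fixed labels. All function names are hypothetical.

```python
PERIOD_ORDER = ("month", "quarter", "year")
STATIC_LABELS = {"month": "Month", "quarter": "Quarter", "year": "Year"}


def available_periods(payload):
    """Return the period names present in the payload, in display order."""
    periods = payload.get("periods") or {}
    return [name for name in PERIOD_ORDER if periods.get(name) is not None]


def pick_active(available, preferred="month"):
    """Prefer the default tab, else the first available one, else None."""
    if preferred in available:
        return preferred
    return available[0] if available else None


def tab_labels(payload):
    """Pair each available period with its fixed display label."""
    return [(name, STATIC_LABELS[name]) for name in available_periods(payload)]


if __name__ == "__main__":
    # Invented snapshot that omits the "month" period.
    snapshot = {"periods": {"quarter": {"label": "Q"}, "year": {"label": "Y"}}}
    print(tab_labels(snapshot), pick_active(available_periods(snapshot)))
```

With a snapshot that lacks a period, the tab list simply shrinks instead of the renderer dereferencing a missing entry.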

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: df77ddcf-14e8-4b98-8323-132fba9c7161

📥 Commits

Reviewing files that changed from the base of the PR and between 55c4f41 and 92207e3.

📒 Files selected for processing (4)
  • .github/workflows/fetch-telemetry.yml
  • Makefile
  • hack/fetch_telemetry.py
  • layouts/_default/oss-health-app.html
✅ Files skipped from review due to trivial changes (1)
  • hack/fetch_telemetry.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • .github/workflows/fetch-telemetry.yml

@tym83 tym83 merged commit 0ad94fc into main May 8, 2026
6 checks passed
@tym83 tym83 deleted the fix/oss-health-monthly-refresh branch May 8, 2026 18:07