OpenCast is a data pipeline that fetches monthly win-rate snapshots from the Lichess Opening Explorer API, builds per-opening time series, forecasts future win rates, and computes an engine-human delta score - the gap between Stockfish expectation and observed human results at 2000-rated blitz. Unlike a simple leaderboard, OpenCast highlights where humans systematically diverge from engine expectation and whether those gaps are widening or narrowing.
The dashboard is published via GitHub Pages on each pipeline run.
See findings/findings.md and findings/findings.json - auto-generated by the pipeline.
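The engine-human delta described above can be illustrated with a minimal sketch. It assumes a logistic centipawn-to-win-probability mapping using the coefficient Lichess publishes for its accuracy metric, a score that counts draws as half a point, and a sign convention of observed minus expected. None of these choices is confirmed by this README, and the function names are hypothetical rather than taken from `src/engine_delta.py`.

```python
import math

# Assumption: logistic mapping with the coefficient Lichess publishes for its
# win% estimate. OpenCast's actual mapping may differ.
LICHESS_K = 0.00368208


def expected_score(centipawns: float) -> float:
    """Map a White-relative centipawn evaluation to an expected score in [0, 1]."""
    return 1.0 / (1.0 + math.exp(-LICHESS_K * centipawns))


def observed_score(white: int, draws: int, black: int) -> float:
    """White's observed score, counting a draw as half a point (assumption)."""
    total = white + draws + black
    if total == 0:
        raise ValueError("no games in sample")
    return (white + 0.5 * draws) / total


def engine_human_delta(centipawns: float, white: int, draws: int, black: int) -> float:
    """Observed human score minus engine expectation (sign convention assumed)."""
    return observed_score(white, draws, black) - expected_score(centipawns)
```

Under this convention a positive delta means humans score better from the position than the engine evaluation suggests, and a negative delta means they score worse.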
- Opening detail pages now include an Analysis section, trend-driver table polish, and an upgraded Engine vs Human card layout.
- Trend-line rendering is restored across opening charts, with confidence-based opacity.
- Structural-break vertical markers were removed from per-opening charts to reduce clutter.
- Track 3 foundation is now in place:
  - Curated opening lines live in `data/opening_lines.json`.
  - `run_visualizer()` copies lines to `data/output/dashboard/assets/opening_lines.json`.
  - Opening pages can render an interactive board with start/back/next controls when a curated line is available.
  - Board coordinate labels render outside the board frame (not over piece squares).
- Track 5 hardening is in place for the v1.0.0 push:
  - Schema contracts live in `SCHEMAS.md`.
  - The release guide is in `READING_OPENCAST.md`.
  - CI now enforces a configurable runtime budget from `config.json`.
- Catalog & Selection - `scripts/build_catalog.py` maintains the full ECO catalog (`data/openings_catalog.csv`). `src/select_openings.py` and `scripts/compute_selection_flags.py` classify openings into Tier 1/2/3 from coverage and activity thresholds.
- Fetch (Rust) - `fetcher` queries `explorer.lichess.ovh` month-by-month and stores one consolidated JSON per ECO at `data/raw/{group}/{ECO}.json` (e.g. `data/raw/A/A00.json`) with `months` and `_meta.skipped_months`.
- Bootstrap Expansion - `scripts/temp_bootstrap_openings.py` activates selected ECO batches, fetches missing months ECO-by-ECO, applies early-stop/coverage pruning, and persists fetch completion tracking (`bootstrap_fetch_complete`, `bootstrap_fetched_until`, `bootstrap_fetch_status`) in the catalog.
- Freshness Guard (main.py) - `main.py` detects missing complete months from `config.json::fetch_start` through the latest complete month, and can auto-fetch when `AUTO_FETCH_MISSING_DATA=true`. In non-interactive runs (CI), fetch is disabled unless explicitly enabled.
- Ingest - `src/ingest.py` normalizes consolidated raw files into `data/processed/openings_ts.csv`.
- Analyze - `src/timeseries.py` fits ARIMA (Tier 1) and Holt-Winters (Tier 2), then writes forecasts to `data/output/forecasts.csv`. `src/engine_delta.py` computes engine-human deltas in `data/output/engine_delta.csv` and skips malformed SAN move tokens instead of aborting the whole stage.
- Report & Visualize - `src/report.py` writes `findings/findings.md` and `findings/findings.json` (Gemini-assisted with template fallback). `src/visualizer.py` generates the static dashboard site in `data/output/dashboard/`, including tier tags on opening detail pages.
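The Freshness Guard step above amounts to comparing the months a raw file already covers against every complete month since `fetch_start`. The sketch below shows one way that check could look. The function names are hypothetical, and treating months listed in `_meta.skipped_months` as already handled is an assumption, not something this README confirms about `main.py`.

```python
from datetime import date


def month_range(start: str, end: str) -> list[str]:
    """Inclusive list of YYYY-MM labels from start to end."""
    start_year, start_month = map(int, start.split("-"))
    end_year, end_month = map(int, end.split("-"))
    months = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        months.append(f"{year:04d}-{month:02d}")
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def latest_complete_month(today: date) -> str:
    """The calendar month immediately before the month containing `today`."""
    if today.month == 1:
        return f"{today.year - 1:04d}-12"
    return f"{today.year:04d}-{today.month - 1:02d}"


def missing_months(fetch_start: str, present, skipped, today: date) -> list[str]:
    """Complete months that are neither stored nor recorded as skipped."""
    expected = month_range(fetch_start, latest_complete_month(today))
    # Assumption: a month recorded as skipped does not need to be refetched.
    known = set(present) | set(skipped)
    return [month for month in expected if month not in known]
```

For example, with `fetch_start` of `2023-11`, stored months `2023-11` and `2024-01`, a skipped `2023-12`, and a run date in March 2024, the only month reported as missing is `2024-02`.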
git clone https://github.com/coeusyk/opencast.git
cd opencast
# Install Cargo/Rust toolchain if not already installed
command -v cargo >/dev/null 2>&1 || sudo apt install -y cargo rustc
# Create local environment file from template (if needed)
cp -n .env.example .env
# Lichess API token (free at https://lichess.org/account/oauth/token)
export LICHESS_TOKEN=<your_token>
# Gemini API key (optional, for Gemini-generated findings)
export GEMINI_API_KEY=<your_gemini_api_key>
# Groq API key (optional, for fast LLM inference)
export GROQ_API_KEY=<your_groq_api_key>
# Build the Rust fetcher
cd fetcher && cargo build --release && cd ..
# Create and activate Python virtual environment with uv
uv venv .venv
source .venv/bin/activate
# Install Python dependencies with uv
uv pip install -r requirements.txt
# Run the full pipeline
python main.py
# Optional: force non-interactive mode to skip auto-fetch prompts
AUTO_FETCH_MISSING_DATA=false python main.py
# Optional: run remaining bootstrap openings after an initial batch
python scripts/temp_bootstrap_openings.py --apply --eco-offset 240
Stockfish 16 must be installed separately:
sudo apt install stockfish
Gemini API key (optional): `GEMINI_API_KEY` in `.env` powers AI-generated findings. `report.py` falls back to templated text if the key is absent.
Groq API key (optional): `GROQ_API_KEY` in `.env` enables fast LLM inference as an alternative backend.
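The fallback behaviour described above can be pictured as a small backend selector. This is only a sketch: the function name is hypothetical, the precedence of Gemini over Groq is an assumption, and it reads an already-populated environment mapping rather than parsing `.env` itself.

```python
import os


def choose_findings_backend(env=None) -> str:
    """Pick a text-generation backend from whichever API keys are set.

    Returns "gemini", "groq", or "template". The ordering is an assumption
    made for illustration, not the documented behaviour of report.py.
    """
    env = os.environ if env is None else env
    if env.get("GEMINI_API_KEY", "").strip():
        return "gemini"
    if env.get("GROQ_API_KEY", "").strip():
        return "groq"
    # No usable key: fall back to templated text.
    return "template"
```

Passing the mapping explicitly keeps the selector easy to test without touching the real process environment.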
| Metric | Value |
|---|---|
| Catalog size | 498 ECO codes |
| Tracking scope | ECO A-E, tiered by activity/coverage |
| Date range | 2023-01 → present |
| Raw JSON files | one consolidated file per ECO in data/raw/{A-E}/{ECO}.json |
| Processed rows | one row per ECO-month in data/processed/openings_ts.csv |
| Forecast horizon | 3 months ahead per opening, with 95% CI |
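The raw-file layout in the table groups each ECO code under its leading letter. A minimal helper for deriving that path might look like the following. The helper name and the strict `A00`-style validation are illustrative assumptions.

```python
import re
from pathlib import Path

# ECO codes are a letter A-E followed by two digits, e.g. "B20".
ECO_PATTERN = re.compile(r"[A-E][0-9]{2}")


def raw_path(eco: str, root: Path = Path("data/raw")) -> Path:
    """Return the consolidated raw JSON path for one ECO code."""
    if not ECO_PATTERN.fullmatch(eco):
        raise ValueError(f"not a valid ECO code: {eco!r}")
    # The group directory is the ECO code's leading letter.
    return root / eco[0] / f"{eco}.json"
```

With the default root, `raw_path("A00")` resolves to `data/raw/A/A00.json`, matching the example given for the fetch stage.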
See ARCHITECTURE.md for full module specifications, data schemas, and mathematical derivations.
See SCHEMAS.md for artifact-level contracts and READING_OPENCAST.md for the release-facing interpretation guide.
---
config:
layout: dagre
---
flowchart TB
subgraph CATALOG["1. Catalog and Selection"]
direction TB
CATALOGCSV["openings_catalog.csv"]
FLAGS["compute_selection_flags.py"]
BATCH["temp_bootstrap_openings.py\n(offset/limit batches)"]
end
subgraph FETCH["2. Fetch"]
direction TB
LICHESS["Lichess Explorer API"]
RUST["fetcher v0.2.x"]
EARLY["early-stop on below-min ratio"]
RAW["data/raw/{group}/{ECO}.json\nmonths + skipped metadata"]
end
subgraph INGEST["3. Ingest"]
direction LR
CLEAN["ingest.py"]
HIST["processed openings_ts.csv"]
end
subgraph ANALYSE["4. Analyse"]
direction TB
TS["timeseries.py\nARIMA/Holt-Winters"]
DELTA["engine_delta.py"]
end
subgraph OUTPUTS["5. Output Artifacts"]
direction TB
FOUT["output/forecasts.csv"]
DOUT["output/engine_delta.csv"]
SEL["data/selection_flags.csv"]
end
subgraph PUBLISH["6. Publish"]
direction TB
REPORT["report.py -> findings.md/json"]
DASH["visualizer.py -> dashboard site"]
PAGE["GitHub Pages"]
end
RUNNER["main.py"] --> CATALOG & FETCH & INGEST & ANALYSE & OUTPUTS & PUBLISH
CATALOGCSV --> FLAGS --> BATCH
BATCH --> RUST
LICHESS --> RUST --> EARLY --> RAW
BATCH --> SEL
RAW --> CLEAN
HIST --> TS & DELTA
TS --> FOUT
DELTA --> DOUT
FOUT --> REPORT & DASH
DOUT --> REPORT & DASH
SEL --> DASH
DASH --> PAGE
REPORT --> FINDINGS["findings/findings.md + findings.json"]
LICHESS:::data
CATALOGCSV:::data
FLAGS:::ingest
BATCH:::ingest
RUST:::ingest
EARLY:::ingest
RAW:::data
CLEAN:::ingest
HIST:::data
TS:::analyse
DELTA:::analyse
FOUT:::output
DOUT:::output
SEL:::output
DASH:::output
REPORT:::output
FINDINGS:::output
PAGE:::data
RUNNER:::runner
classDef runner fill:#0f2742,stroke:#4a90d9,stroke-width:2px,color:#d9ecff
classDef ingest fill:#13281d,stroke:#4caf82,stroke-width:1.5px,color:#daf5e4
classDef analyse fill:#241633,stroke:#9b72cf,stroke-width:1.5px,color:#f0e4ff
classDef output fill:#332011,stroke:#e07b39,stroke-width:1.5px,color:#ffe9d6
classDef data fill:#1b1f24,stroke:#6e7681,stroke-width:1px,color:#e6edf3
- Rust ≥ 1.75 (stable) - for the Lichess fetcher
- Python ≥ 3.11 - for the analytics pipeline
- Stockfish 16 - `sudo apt install stockfish` (or set `STOCKFISH_PATH`)
- Lichess OAuth token - free at https://lichess.org/account/oauth/token
- Gemini API key (optional) - set `GEMINI_API_KEY` in `.env` (for AI-generated findings)
- Groq API key (optional) - set `GROQ_API_KEY` in `.env` (for fast LLM inference)
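The Stockfish requirement above allows either a system install or an explicit `STOCKFISH_PATH`. One plausible lookup order is sketched below. The function name is hypothetical, and preferring the environment variable over a `PATH` search is an assumption about how the pipeline resolves the binary.

```python
import os
import shutil


def resolve_stockfish(env=None):
    """Return a path to the Stockfish binary, or None if none is found."""
    env = os.environ if env is None else env
    override = env.get("STOCKFISH_PATH")
    if override:
        # Assumption: an explicit override wins over whatever is on PATH.
        return override
    # shutil.which returns None when no "stockfish" executable is on PATH.
    return shutil.which("stockfish")
```

A caller can treat a `None` result as a signal to skip the engine-delta stage or to fail early with an install hint.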
fetcher/ ← Rust binary (Lichess Explorer → JSON)
src/
ingest.py ← consolidated raw JSON → openings_ts.csv
select_openings.py ← per-ECO tier classification → openings_catalog.csv
timeseries.py ← ARIMA (Tier 1) + Holt-Winters (Tier 2) forecasting
engine_delta.py ← Stockfish centipawn → win probability delta
report.py ← findings/findings.md + findings/findings.json
visualizer.py ← multi-page static site generator
assets/
shared.css ← design tokens + component styles
nav.js ← active-link highlight
scripts/
build_catalog.py ← build/refresh full ECO catalog
compute_selection_flags.py ← tier flags + pruning
clean_raw_json.py ← normalize/reformat consolidated raw JSON files
temp_bootstrap_openings.py ← batch bootstrap fetch with offset/limit and tracking
migrate_raw.py ← legacy raw format migration helper
data/
raw/ ← grouped ECO JSON files at raw/{A-E}/{ECO}.json (gitignored)
processed/ ← openings_ts.csv
openings_catalog.csv ← ECO tier flags (is_tracked_core, model_tier, …)
opening_lines.json ← curated canonical lines per ECO for interactive board playback
selection_flags.csv ← per-ECO coverage/tier diagnostics
output/
move_stats.csv ← per-move monthly stats (generated locally/CI, not versioned)
forecasts.csv ← ARIMA / HW forecasts with confidence intervals
engine_delta.csv ← centipawn vs human win rate delta
long_tail_stats.csv ← Tier-3 coverage and descriptive opening stats
dashboard/ ← multi-page static site (GitHub Pages root)
index.html ← overview + 3 panels
openings.html ← sortable table of all ECOs
families.html ← ECO family (A–E) summary
opening.html ← single per-opening template with tier badge (use ?eco=B20)
assets/ ← shared.css, nav.js, openings_data.json, opening_lines.json
findings/
findings.md ← narrative findings report
findings.json ← structured findings payload
narratives.json ← per-opening generated narrative cache
openings.json ← seed opening definitions (legacy bootstrap input)
main.py ← pipeline orchestrator
- `.github/workflows/update.yml` fetches missing months incrementally, then runs a full recomputation by clearing generated artifacts and executing `AUTO_FETCH_MISSING_DATA=false python main.py`.
- Processing commits include refreshed `data/processed/openings_ts.csv`, `data/output/*.csv`, dashboard pages, and `findings/` artifacts.