Weekly site-rules curation agent (self-improvement loop, part 4) #10
Open
myleshorton wants to merge 2 commits into
Deploying wickproject with
| Latest commit: | 14edda0 |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://d02ad879.wickproject.pages.dev |
| Branch Preview URL: | https://wick-curate-agent.wickproject.pages.dev |
f1135e1 to 9e2107a
Adds the judgment layer of the self-improvement loop on top of the deterministic probe harness.

bench/curate-inputs.sh gathers the weekly inputs (stats with per-cause failure breakdown, latest probe traces, currently published rules) into one JSON. agent-skill/wick-curate/SKILL.md reasons about what the fixed matrix couldn't crack: sites failing EVERY cell (e.g. apkpure), high user-offline noise, and seed/measured conflicts. It then proposes and tests genuinely new methods (different residential country, wait_for_selector, URL rewrites, CEF+residential), re-probes, and republishes.

Guardrail: never publish a rule that isn't backed by a passing probe.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
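The guardrail in this commit message can be illustrated with a minimal sketch. This is not the code in the PR: the function name `publishable_rules` and the record fields `host`, `method`, and `ok` are hypothetical, chosen only to show the idea that a candidate rule is kept only when a probe of the same host and method passed.

```python
def publishable_rules(candidate_rules, probe_results):
    """Keep only candidate rules backed by a passing probe.

    Both arguments are lists of dicts. The field names (host, method, ok)
    are assumptions for illustration, not the PR's actual schema.
    """
    # Every (host, method) pair that has at least one passing probe.
    passing = {(r["host"], r["method"]) for r in probe_results if r["ok"]}
    # A rule survives only if its exact (host, method) pair passed.
    return [
        rule for rule in candidate_rules
        if (rule["host"], rule["method"]) in passing
    ]


if __name__ == "__main__":
    probes = [
        {"host": "a.example", "method": "cef", "ok": True},
        {"host": "b.example", "method": "cef", "ok": False},
    ]
    rules = [
        {"host": "a.example", "method": "cef"},
        {"host": "b.example", "method": "cef"},     # probed, but failed
        {"host": "c.example", "method": "cronet"},  # never probed
    ]
    print(publishable_rules(rules, probes))
```

A rule with no probe at all is treated the same as a rule with a failing probe, which matches the "backed by a passing probe" wording.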
9e2107a to 33f454c
Same belt-and-suspenders as probe.sh for the host aggregation.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Weekly site-rules curation agent: the "invent new methods" layer

The judgment layer of the self-improvement loop, on top of the deterministic probe harness (#9). The harness measures a fixed matrix (`cronet | cronet+residential | cef`) and corrects rules from it; this agent looks at what the matrix couldn't crack and reasons about methods it doesn't try.

Stacked on #9 (base branch `self-improving-site-rules`): review/merge #9 first.

What it adds

- `bench/curate-inputs.sh`: read-only, creds-free. Gathers the weekly inputs into one JSON:
  - `failing`: site-side failing hosts with a per-cause breakdown (reset/refused/timeout/403…), so the agent reasons about why.
  - `hard`: hosts where the latest probe sweep failed every cell (verified live: `apkpure.com`).
  - `published_rules`: what clients are currently served.
- `agent-skill/wick-curate/SKILL.md`: reasons about anomalies and proposes + tests new methods in priority order: different residential country (geo-blocks), `wait_for_selector` (un-hydrated SPAs), URL rewrites (à la `old.reddit`), CEF+residential (DataDome-class), else file an issue.

Guardrails

- `offline` signal: high user-offline fraction means it's the users' network, not the site; excluded from "hard sites."

`probe.sh`, `publish-rules.sh`, `/residential-proxy`); methodology caveats in `bench/PROBE.md`.

🤖 Generated with Claude Code
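The `hard` selection, the `offline` exclusion, and the priority order of new methods described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of `bench/curate-inputs.sh` or `SKILL.md`: the record fields (`host`, `cell`, `ok`), the `OFFLINE_THRESHOLD` value, the method labels, and the `probe` callback are all hypothetical.

```python
from collections import defaultdict

# The fixed matrix named in the description.
CELLS = ("cronet", "cronet+residential", "cef")

# Hypothetical cutoff: above this user-offline fraction, failures are
# attributed to the users' network rather than to the site.
OFFLINE_THRESHOLD = 0.5

# Priority order from the description; the string labels are assumptions.
METHOD_PRIORITY = (
    "residential_country",  # geo-blocks
    "wait_for_selector",    # un-hydrated SPAs
    "url_rewrite",          # old.reddit-style rewrites
    "cef+residential",      # DataDome-class protection
)


def hard_hosts(probe_results, offline_fraction, threshold=OFFLINE_THRESHOLD):
    """Hosts that failed every matrix cell and are not offline noise."""
    seen = defaultdict(set)    # host -> cells that were probed
    passed = defaultdict(set)  # host -> cells that passed
    for r in probe_results:
        seen[r["host"]].add(r["cell"])
        if r["ok"]:
            passed[r["host"]].add(r["cell"])

    hard = []
    for host, cells in seen.items():
        if not set(CELLS) <= cells:
            continue  # sweep incomplete: cannot claim "every cell failed"
        if passed[host]:
            continue  # at least one cell works, so the matrix cracked it
        if offline_fraction.get(host, 0.0) >= threshold:
            continue  # likely the users' network, not the site
        hard.append(host)
    return sorted(hard)


def first_passing_method(host, probe):
    """Try new methods in priority order; fall back to filing an issue.

    `probe(host, method)` is a caller-supplied function returning True
    when a probe of that host with that method passes.
    """
    for method in METHOD_PRIORITY:
        if probe(host, method):
            return method
    return "file_issue"
```

Requiring a complete sweep before calling a host hard is a design choice of this sketch; it avoids flagging hosts that simply were not probed in every cell.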