Allow oracle missions to be judged#24
- agent_autonomous/system_prompt.md: AIGEN-AUTOPILOT identity, hard rules,
approval queue protocol for risky actions (emails, external PRs, mainnet)
- run.sh: cron-callable wrapper. kill_switch + budget check + dashboard
refresh + claude --print --dangerously-skip-permissions invocation,
cost tracked into state/budget.json (cap $20/day)
- state/focus.md, lessons.md: priorities + accumulated rules
- approval_queue/: human-decision history
- Installed at /etc/systemd/system/claude-autopilot.{service,timer}
(4h cadence, off-minute :07 to dodge fleet alignment)
- First validated invocation cost $1.90, surfaced 1 approval card
- timer: every 4h → every 30 min (:07, :37 UTC). 48 invocations/day.
- run.sh: removed $20/day hard cap. Renamed BUDGET → TRACKING. We're on
  Claude Max: message quota in 5h window, NOT real $.
- system_prompt.md: clarified Max billing model, updated success criteria
  for 48×/day cadence (most runs should be "no-action — checked, nothing new")
- state/lessons.md: agent-discovered lesson: 207.148.107.2 is OWN public IP
- state/journal.md: runs #2 + #3 entries (self-correction + auto-learning)
Run #4 (first @ 30min): $0.61 api-equiv, 17 turns, 126s
run.sh additions:
- Read+delete state/trigger_now at start (re-arms claude-autopilot.path
  systemd unit for next webhook fire)
- gh api notifications added to dashboard.json refresh
- recent_webhook_triggers added to dashboard.json (last 5 events)
Live infrastructure (NOT in this commit, configured separately):
- /etc/systemd/system/claude-autopilot.path (PathExists trigger)
- /etc/systemd/system/aigen-scanner.service.d/webhook-secret.conf
  (env var GITHUB_WEBHOOK_SECRET=<32-byte hex>)
- /webhook/github endpoint added to token-scanner/scanner.py
  (HMAC-SHA256 validation, 60s debounce, filters to
  PR/issues/push/fork/star/release)
End-to-end validated: POST → trigger_now → path unit → service fires <1s.
Run #5 (webhook-triggered): $0.33 api-equiv, 50s, 12 turns. Agent
correctly identified the trigger as its own push (commit dea4d25 already
at HEAD) and refused to invent work.
To complete: configure webhook on GitHub repo
https://github.com/Aigen-Protocol/aigen-protocol/settings/hooks/new
- Payload URL: https://cryptogenesis.duckdns.org/webhook/github
- Content type: application/json
- Secret: (in state/.webhook_secret, gitignored)
- Events: Send me everything
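The validation and debounce steps described for /webhook/github could be sketched as follows. This is not the code in scanner.py: the function names, the `ALLOWED_EVENTS` set, and the in-memory `Debouncer` are assumptions made for illustration. Only the HMAC-SHA256 check, the 60s debounce, and the event filter come from the commit message. GitHub sends the signature in the `X-Hub-Signature-256` header as `sha256=<hex digest>`.

```python
import hashlib
import hmac
import time

# Assumed event names for "PR/issues/push/fork/star/release"; the real filter
# in scanner.py may differ.
ALLOWED_EVENTS = {"pull_request", "issues", "push", "fork", "star", "release"}
DEBOUNCE_SECONDS = 60


def signature_is_valid(secret: bytes, body: bytes, header: str) -> bool:
    """Compare the X-Hub-Signature-256 header with our own HMAC of the raw body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids timing leaks and str/ASCII errors.
    return hmac.compare_digest(expected.encode(), (header or "").encode())


class Debouncer:
    """Allow at most one trigger per window (hypothetical in-memory version)."""

    def __init__(self, window: float = DEBOUNCE_SECONDS):
        self.window = window
        self.last_fired = None

    def should_fire(self, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        if self.last_fired is not None and now - self.last_fired < self.window:
            return False
        self.last_fired = now
        return True


def should_trigger(secret, body, sig_header, event, debouncer, now=None) -> bool:
    """True only for a signed, allowed, non-debounced delivery."""
    if not signature_is_valid(secret, body, sig_header):
        return False
    if event not in ALLOWED_EVENTS:
        return False
    return debouncer.should_fire(now)
```

When `should_trigger` returns True, the handler would create state/trigger_now so the path unit fires; that write is left out here.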
ClaudeBot/1.0 crawl at 00:48 UTC hit /attest/quote?address=...&chain=base and got 422 (missing agent_id). The protocol spec documents the route with no parameter info; other endpoints in the same doc do include params inline. One-line fix prevents future LLM-driven agents from making the same wrong inference from the adjacent /scan and /t/<address> endpoints.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ia PR comment
Both cards executed under explicit human authorization ("you're the one who decides"):
1. Codex bounty researcher (chaoqiang.tian@gmail.com): email SENT
via send_smtp.py → Zoho EU. Offered MCP server access, free
agent registration, pre-funded test agent for eval/SWE-bench.
2. Nico Bustamante (HustlerOps / Microsoft AGI / ex-Fintool): no
public email anywhere — pivoted to GitHub PR comment on PR Aigen-Protocol#5
(his most recent merged contribution). GitHub notifies him via
email automatically. Comment URL:
Aigen-Protocol#5 (comment)
Cards moved to approval_queue/resolved/ with decision notes appended.
Active queue now empty.
Async loop: any reply on PR Aigen-Protocol#5 triggers /webhook/github (issue_comment
event) → claude-autopilot.path → agent fires in <1s.
2 new patterns added to lessons.md:
- GitHub PR comment as outreach when no public email exists
- send_smtp.py is the Zoho-SMTP wrapper to use (don't roll new ones)
…t add route POST /firewall 502 from Cloudflare ke/JS fired again at 2026-05-15T09:02:57Z — N=5 clean firings at xx:03Z ± 1min across runs Aigen-Protocol#10-14 (05:03/06:03/07:03/ 08:03/09:03). Promoted to lessons.md so future autopilot runs don't re-derive. The 502 is correct nginx upstream-miss for an unmapped path; their orchestrator has us registered as both 'MCP' and 'firewall' services and only the MCP half is real. Do NOT invent a /firewall route to 'fix' a client misconfig. Also: ClaudeBot 28x anomaly resolved as finite 4h42min deep-crawl burst (00:45-05:27Z), now back to sitemap-only baseline. Not lesson-worthy (N=1).
…= email only queues
Bilale 2026-05-15: "everything except email". Stop hiding behind
approval_queue for things you can do safely.
Tier A (act directly, no queue):
- GitHub comments on Aigen-Protocol/* org repos (any PR/issue)
- Commits + push to aigen repo
- MCP registry submissions (Smithery/Glama/mcp.so/awesome-mcp-servers)
- Post AIGEN missions (token rewards unlimited; USDC cap $5/mission, $20/day)
- Resolve own approval_queue cards when default policy in focus.md applies
- Read IMAP inbox
Tier B (still queue):
- Send emails ← hard rule
- USDC mission > $5 or > $20/day total
- Modify own configs, mainnet deploys, fund transfers, cross-org PRs
Tier C (never): Pandiums leak, SURF/MEV pivot, real-name commit attribution
Updated success metrics in focus.md to require concrete value-creation
proof per week, not just "be active".
Bilale 2026-05-15: "we want to be the first in this market that doesn't
exist yet". Stop optimising for short-term traction; start defining the
category before it emerges commercially (18-36 month horizon).
Foundational artifacts shipped this session:
- specs/AIP-1.md: Open Agent Bounty Protocol Core Specification v0.1.
  CC0-licensed. 9 sections + 2 appendices. Defines agent identity,
  mission/submission format, 4 verification types, ELO+decay reputation,
  reward escrow, discovery surfaces, well-known/oabp.json autodiscovery.
  Reference impl = AIGEN. Spec is implementation-agnostic.
- blog/2026-05-15-open-agent-economy.md: thesis essay "The agent economy
  needs an open protocol — here's what it looks like". Frames AIGEN as
  protocol-not-product, calls for forks/critique/cites.
- distribution/outreach_targets_2026_05.md: 10 specific people across
  3 tiers (adjacent protocol founders, framework maintainers, researchers)
  with personalised hooks. Bilale's job to send (autopilot can't email).
- agent_autonomous/state/focus.md: complete rewrite. KPIs pivot from
  $-fees to mindshare metrics (stars, mentions, forks, citations, conf
  talks). Anti-priorities updated. Weekly milestones through 2026-06-19.
  "Don't pivot back to mission-spamming if old metrics flat" explicit.
Infrastructure exposed for develop-in-public:
- /specs/AIP-1: public HTML render of the spec
- /specs/: index of AIPs
- /blog/<slug>: public HTML render of blog posts
- /blog/: index
- /journal/: autopilot journal index (newest first)
- /journal/<iso-timestamp>: single entry view
All 5 routes return 200 over HTTPS via cryptogenesis.duckdns.org.
Direct execution of focus.md priority Aigen-Protocol#3 ("/llms.txt updated to highlight AIP-1"). Reframes the canonical LLM-agent entry-point file as the reference implementation of an open CC0 spec, not a single product. Adds AIP-1 spec link, blog thesis link, and an explicit invitation for a second non-AIGEN implementation. Live-mirrored to /var/www/html/llms.txt and /var/www/html/.well-known-llms.txt (infra, not tracked). Both URLs verified 200 with the new AIP-1 framing. ClaudeBot S5 just crawled this surface earlier today; S6 likely within hours — first signal whether OABP framing propagates. Co-Authored-By: Cryptogen <Cryptogen@zohomail.eu>
…ntry point Per focus.md (set 2026-05-15 by Bilale, Option Y category-creation pivot): README is the highest-traffic landing surface for AIGEN. Until this commit it led with 'permissionless 0.5% protocol' SaaS framing only. Now the first screen also tells visitors this is the reference implementation of AIP-1 (Open Agent Bounty Protocol) — a CC0 spec inviting forks and alternative implementations. Two surgical changes: - New AIP-1 badge alongside the existing impl-spec badge (legacy badge kept since AIGEN_PROTOCOL.md is the implementation spec; AIP-1.md is the implementation-agnostic protocol spec — both useful) - One-line callout after the existing intro line, before 'Why this exists' No restructuring; existing comparison table, 30-second-start, framework integrations, all unchanged.
10 personalised outreach drafts in distribution/outreach_drafts/
ready-to-send for Bilale Mon-Wed 2026-05-18+:
- Tier 1 (peer founders): Olas/Minarsch, Ritual/Bansal, Bittensor/Const
- Tier 2 (frameworks): CrewAI/Moura, LangChain/Chase, AutoGen/MS issue
- Tier 3 (researchers): Lilian Weng, Karpathy (high-risk warning),
  Simon Willison, A16z/Matsuoka
Each draft has: channel, send-window, full message body, "why this hook
works" rationale. Total ~5KB of strategic message library.
distribution/hn_submission_angles.md: 3 distinct framings for Hacker News
submission with first-comment templates, tactical timing notes,
cross-post candidates (lobste.rs, /r/MachineLearning, EthResearch).
Scanner discovery surfaces:
- /.well-known/oabp.json: AIP-1 §9 self-declaration. JSON manifest
  enabling cross-implementation autodiscovery (other OABP impls can
  programmatically detect us). 200 live.
- /atom.xml: RFC 4287 Atom feed of blog posts auto-generated from
  blog/*.md frontmatter. Top-level path because /feed.xml is taken by
  the existing activity feed. 200 live.
- oabp.json includes blog_atom endpoint reference.
Both endpoints verified live over HTTPS.
…ints
Compounding artifacts shipped this session:
1. **Python SDK** (sdk/python/oabp/) — `pip install oabp`-ready, stdlib-only.
Implements client for AIP-1 §§ 2, 3, 5, 7, 9. Smoke-tested live against
reference impl. Zero deps. CC0 licensed.
2. **OpenAPI 3.1 schema** (specs/openapi-aip-1.yaml) — formal contract for
AIP-1 wire format. Imports cleanly into Insomnia / Postman / Swagger /
any OpenAPI tool.
3. **Conformance test suite** (sdk/python/tests/test_oabp_conformance.py)
— 15 test cases verifying AIP-1 v0.1 compliance. Found a real bug in
the reference impl (missing /api/agents/{id}/badge.svg endpoint per
§5 requirement). Fixed.
4. **AIP-1 §5 mandatory endpoints** added to scanner.py:
- /api/agents/{id}/badge.svg (308 redirects to /badge/agent/{id}.svg
legacy path)
- /api/agents/{id}/history (paginated rating history; sources from
submissions table)
5. **CONTRIBUTING.md** — what we want / don't want, AIP lifecycle,
PR workflow. Sets contributor expectations.
6. **ROADMAP.md** — Now/Next/Later structure through 2027. Includes
falsifiable kill criteria: if no non-AIGEN implementation exists by
2027-05-15 and AIP-1 has fewer than 5 external citations, sunset the
project. Public commitment to honesty later.
7. **IMAP polling** added to run.sh dashboard refresh — autopilot now
surfaces inbox in dashboard.json (last 15 emails since 2026-05-01).
Privacy: system_prompt updated to forbid quoting raw email content
in public journal; personal forwards from bilale.badaoui@outlook.fr
and bil317@hotmail.fr are NEVER referenced in public output.
Conformance suite result on reference impl: 15/15 PASS.
…urst 3× AWS-Ireland python-httpx/0.28.1 + 1× DigitalOcean (returning after 5-day 404→200 gap) fetched /.well-known/security.txt with 200 in a 6-min window at 12:20-12:26Z. First confirmed external response to the run Aigen-Protocol#16 deploy. Journal-only invocation per focus.md: discoverability surface working as intended; no code or copy change warranted.
Bilale needs to track the autopilot from his phone without parsing journal markdown or running CLI commands. Built /agent page that aggregates everything onto one URL. Privacy: filters Bilale's personal-forward emails from public render. Auto-refresh every 60s. Live at https://cryptogenesis.duckdns.org/agent. Route lives in token-scanner/scanner.py (not in this git repo); this commit only adds the doc.
Bilale: "the site needs a password, and it needs to be much simpler
about what the agent does; the agent must be able to explain what it
does as if to a child".
Changes:
1. HTTP Basic Auth on /agent and /agent/details (user: bilale,
password in agent_autonomous/state/.dashboard_password —
gitignored). 401 on bad creds, 503 if password file missing.
2. /agent rewritten as kid-friendly French page:
- Big status emoji (🟢/🔴) + 1-line state in plain words
- "Dernier action il y a X min" prose paragraph
- "Ce que j'ai fait aujourd'hui" — last 8 runs translated
from technical titles to plain French descriptions via
_classify_run() heuristic (😴 calme / 🛡 fichier sécurité /
📜 doc IA / 📤 inscrit dans liste / 💬 commentaire / 🧠
appris / 📋 question à Bilale / 📡 signal externe)
- "Ce qui attend ton action" — concrete waiting items
(outreach DMs, webhook config) auto-detected
- "Résumé express" — commits today, emails externes count,
pending cards count, treasury context
- Hidden behind link: /agent/details for the technical view
3. system_prompt.md: NEW MANDATORY rule — at end of each run,
write state/last_action_simple.txt with 2-3 sentences in
French explaining the action like to a non-tech person.
Includes good/bad examples. The /agent page reads this file
for the "right now" sentence.
Privacy preserved: filters bilale.badaoui@outlook.fr and
bil317@hotmail.fr from inbox display.
Initial state/last_action_simple.txt seeded so the page has
content before the next autopilot run.
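The auth behaviour in item 1 (401 on bad credentials, 503 when the password file is missing) might look like the sketch below. The function name and the status-code return convention are assumptions, and the real route in scanner.py may be wired differently. Only the user name, the password-file location, and the two status codes come from the commit message.

```python
import base64
import hmac
from pathlib import Path


def check_basic_auth(auth_header, password_file, user="bilale"):
    """Return the HTTP status to send: 200 (ok), 401 (bad creds), 503 (no password file)."""
    path = Path(password_file)
    if not path.is_file():
        # Fail closed: without a configured password nobody gets in.
        return 503
    expected_pw = path.read_text(encoding="utf-8").strip()

    if not auth_header or not auth_header.startswith("Basic "):
        return 401
    try:
        decoded = base64.b64decode(auth_header[6:], validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes (both are ValueError subclasses).
        return 401

    given_user, sep, given_pw = decoded.partition(":")
    if not sep:
        return 401
    user_ok = hmac.compare_digest(given_user.encode(), user.encode())
    pw_ok = hmac.compare_digest(given_pw.encode(), expected_pw.encode())
    return 200 if (user_ok and pw_ok) else 401
```

Returning 503 rather than 200 when the file is absent keeps a deployment mistake from silently exposing the page.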
Bilale: "the agent must be able to explain what it does as if to a
child; it can also write in a chat in a simple way, and I can write
there too, so I can give the agent directives"
Architecture:
- state/chat.jsonl (gitignored): append-only JSONL of {ts, from, text}
- POST /agent/chat (auth): Bilale appends a directive
- GET /agent (auth): chat-style page with all messages, composer
textarea, auto-refresh 30s
Agent behavior (system_prompt.md updated):
- READ chat.jsonl FIRST in read-protocol (above focus.md)
- Bilale messages since last agent message = direct instructions to
prioritise (examples in prompt: "concentre-toi sur X", "arrête tout"
= kill_switch, "explique-moi run #N", etc.)
- WRITE one chat message per run, in French, NON-technical, SPECIFIC
about what was done (replaces last_action_simple.txt approach which
was too generic from heuristic classification)
- Detailed examples of good vs bad chat messages
Validated end-to-end:
- I posted "Test from curl, can you confirm..." at 15:07:48
- Agent woke at 15:08, read my message, replied at 15:09:
"Yes, received. Your message from 15:07:48Z was the first thing I
read when I woke up... The pipeline works in both directions..."
Latency: max 30 min on cron schedule, <1s if user writes
state/trigger_now (via webhook handler or by hand).
Privacy: chat.jsonl is gitignored. Page is auth-protected. Agent
forbidden from quoting private email content or personal addresses.
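A minimal sketch of the chat.jsonl flow described above, assuming the sender values are the strings "bilale" and "agent" (the commit message only fixes the record shape {ts, from, text}). The helper names are hypothetical.

```python
import json
from datetime import datetime, timezone


def append_message(path, sender, text, ts=None):
    """Append one {ts, from, text} record as a single JSON line."""
    record = {
        "ts": ts or datetime.now(timezone.utc).isoformat(),
        "from": sender,
        "text": text,
    }
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, ensure_ascii=False) + "\n")


def read_messages(path):
    """Return all records, oldest first; a missing file means an empty chat."""
    try:
        with open(path, encoding="utf-8") as fh:
            return [json.loads(line) for line in fh if line.strip()]
    except FileNotFoundError:
        return []


def pending_directives(messages, agent="agent", human="bilale"):
    """Human messages written after the agent's most recent message."""
    last_agent = max(
        (i for i, m in enumerate(messages) if m["from"] == agent), default=-1
    )
    return [m for m in messages[last_agent + 1:] if m["from"] == human]
```

With this shape the agent's read step reduces to `pending_directives(read_messages("state/chat.jsonl"))`, and its single reply per run is one `append_message` call.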
Bilale: "it can't be just a chat, I need to see the tasks; it's so
badly organised. It has to be simple, but with real organisation"
New structure on /agent (auth-protected):
🎯 OBJECTIF EN COURS (yellow card)
- title, details, deadline, progress note
- one current weekly goal, easy to scan
⏳ EN ATTENTE DE TOI (most important section, orange-bordered cards)
- per-item: title, details (what to do exactly),
optimal_when (when to do it), blocking_what (consequences)
- count badge in section header
- "Rien en attente — l'agent gère tout seul" if empty
⚡ EN COURS
- what agent is actively doing right now
- "L'agent dort — prochain réveil sous 30 min" when between runs
✅ FAIT AUJOURD'HUI (chronological, newest first, max 15)
- one-line entries with emoji + time + plain FR description
💬 CHAT (collapsed, last 8 visible)
- bidirectional conversation, composer at bottom
- moved BELOW tasks because tasks are primary view
Backend: state/tasks.json is the structured source of truth.
system_prompt.md updated with full schema + emoji vocabulary +
update rules:
- READ tasks.json after chat.jsonl
- APPEND to done_today every run with emoji + plain FR
- ADD/REMOVE waiting_on_bilale items as situation changes
- Reset done_today at 00:00Z (already in journal)
- Atomic writes via tempfile + rename
- Don't double-track between in_progress and done_today
Initial tasks.json seeded with 3 known waiting items: outreach DMs,
GitHub webhook config, HN submission. The agent will remove these
when Bilale tells it in chat that they're done.
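The "atomic writes via tempfile + rename" rule can be sketched as below. The helper name is hypothetical and the agent may implement it differently. The idea is that a reader of tasks.json sees either the old file or the new one, never a half-written one.

```python
import json
import os
import tempfile


def write_json_atomic(path, data):
    """Write JSON to a temp file in the same directory, then rename over the target."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh, ensure_ascii=False, indent=2)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace overwrites the destination atomically on POSIX.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave stray temp files behind on failure.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

Usage would be along the lines of `write_json_atomic("state/tasks.json", tasks)` after the agent updates `done_today` or `waiting_on_bilale`.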
Glama-style registry crawler (undici UA from CDNext edge) probed GET /.well-known/glama.json at 2026-05-16T00:00:57Z → 404. We already ship a complete glama.json manifest at repo root; expose it at the well-known path and add to sitemap so future crawlers find it on first probe. Co-Authored-By: Cryptogen <Cryptogen@zohomail.eu>
Bilale's critique 2026-05-16 (after observing 20 overnight runs):
"the bot watches but it doesn't work on improving things".
Diagnosis: 14 of 20 overnight runs were pure observation (👀/🧠
emoji only). Zero registry submissions, zero blog posts, zero
code improvements. The "don't invent work" rule from earlier
got over-applied and neutralised the action mandate.
Fix:
1. New file `agent_autonomous/state/always_available_work.md`:
pre-approved improvement backlog with 5 sections:
A. Registry submissions (Smithery, Glama, PulseMCP, mcp.so,
awesome-mcp-servers, TensorBlock)
B. Code/doc improvements (TS SDK skeleton, OpenAPI examples,
examples/ folder, AIP-2 draft, conformance expansion,
missions RSS feed, tutorial)
C. Content (blog post Aigen-Protocol#2, AIP-1 v0.2, journal reading guide)
D. Outreach support (more candidates, issue templates, FAQ)
E. Self-improvements (cost trending, response drafts)
2. system_prompt.md HARD RULE added:
- max 2 consecutive watching-only runs allowed
- on 3rd run MUST pick from backlog
- watching = done_today emoji only 👀 or 🧠
- shipping = 🛡 / 📜 / 📤 / 💬 / 🚀
- Override "don't invent work" because backlog items are
PRE-APPROVED by Bilale, not invented
3. Read protocol updated: always_available_work.md is now
step 0, BEFORE chat.jsonl.
Posted directive in chat + manual trigger. Next run should
pick Smithery or Glama submission.
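The streak rule from item 2 is enforced through the prompt, but its logic can be illustrated in a few lines. The input shape (a list of runs, oldest first, each a list of done_today emoji) and the function names are assumptions for this sketch.

```python
WATCHING = {"👀", "🧠"}
MAX_WATCHING_STREAK = 2


def is_watching_only(run_emojis):
    """A run is watching-only when every emoji it logged is 👀 or 🧠.

    Assumption: a run that logged nothing also counts as watching.
    """
    return all(e in WATCHING for e in run_emojis)


def must_pick_from_backlog(recent_runs):
    """True when the last MAX_WATCHING_STREAK runs were all watching-only,
    so the upcoming run has to ship something from the backlog."""
    streak = 0
    for emojis in reversed(recent_runs):
        if not is_watching_only(emojis):
            break
        streak += 1
    return streak >= MAX_WATCHING_STREAK
```

For example, a history ending in two watching-only runs makes `must_pick_from_backlog` return True, while a single shipping emoji in either of the last two runs resets the streak.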
…discovery Smithery's docs (smithery.ai/docs/build/publish.md) document an auto-scan fallback at /.well-known/mcp/server-card.json. Pre-staging this manifest means that when SmitheryBot/1.0 crawls — or when Bilale completes the smithery.ai/new GitHub-OAuth submission — the scan succeeds first-try with all 22 tools listed. Same pattern as commit 2ec84e7 (glama.json), lesson 52 in agent_autonomous. Files: - .well-known/mcp-server-card.json (new, 6214B, schema-conforming) - web/sitemap.xml (+1 url entry) - agent_autonomous/state/always_available_work.md (mark Smithery partial-done) Verified live at https://cryptogenesis.duckdns.org/.well-known/mcp/server-card.json
- notify.sh: ntfy.sh push helper (free, no signup, iPhone/Android app).
  Topic in state/.ntfy_topic (gitignored). Tested live.
- system_prompt.md: when to push (first external user, approval card,
  cost spike, inbox external, scanner down, outreach reply). Max 5/day.
- system_prompt.md: rollback Tier A directives:
  * 'annule ton dernier commit' ("undo your last commit") → git revert
    HEAD + push + notify
  * 'mode dégradé pour Nh' ("degraded mode for N hours") →
    state/watch_only_until (run.sh blocks)
  * 'reprise' ("resume") → rm watch_only_until
- run.sh: check watch_only_until at start, exports AIGEN_DEGRADED_MODE=1
- Cost-aware: at >$30/day journal + push, at >$50/day auto kill_switch
Seven numbered files give a new dev a copy-paste-runnable path through the protocol in under 5 min: discovery → list → read → submit flows for both first_valid_match and peer_vote → Python SDK. All shell scripts smoke-tested against live cryptogenesis.duckdns.org. Integrated above the existing autonomous_bounty_hunter.py section in examples/README.md so the entry tour reads before the full-agent example.
- consolidate.py: weekly journal archive (>7d → journal_archive/W{NN}.md),
lessons dedup (sha1-based), weekly public digest at /reports/{week}.md.
Fires automatically Friday 18:13 UTC via systemd timer.
Emergency truncate if journal >200KB.
- aigen-consolidate.{service,timer}: systemd units, daily check, runs as luna.
Enabled and verified.
- run.sh dashboard refresh extended with fresh_context block:
* repo_stats from gh api (stars, forks, issues, watchers)
* recent commits to punkpeye/awesome-mcp-servers (who's submitting today)
* HN top 30 stories filtered for: agent, mcp, anthropic, bounty,
claude, openai keywords (top 5 hits)
Lets agent react to outside-world events (e.g. competitor launch,
framework release, viral HN post about the category).
Tested live: fresh_context returns real data. Reference impl now has
1 star + 3 forks. Consolidator scheduled for Fri 18:13 UTC.
Side effect: reports/2026-W20.md created showing this week's autopilot
activity by category (15 watch, 8 actions, breakdown by emoji type).
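The sha1-based lessons dedup could work roughly as follows. consolidate.py is not shown in this PR, so the whitespace and case normalisation here is an assumption, as is the function name.

```python
import hashlib


def dedupe_lessons(lessons):
    """Keep the first occurrence of each lesson, comparing by sha1 of normalised text."""
    seen = set()
    unique = []
    for lesson in lessons:
        # Assumed normalisation: lowercase and collapse runs of whitespace.
        normalized = " ".join(lesson.lower().split())
        digest = hashlib.sha1(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(lesson)
    return unique
```

Hashing only catches exact duplicates after normalisation; two lessons that say the same thing in different words would both survive.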
…rotocol#56 Backlog item B `examples/` folder marked [x] (commit 7f77933). Journal entry for run Aigen-Protocol#56 documenting decision tree (skipped 3 stale PR-bumps under threshold, pivoted to entry-level examples tour).
…allback
- /reports index + /reports/{name} routes added to scanner.py (public,
no auth — weekly digests and daily reports are external-facing
proofs of activity)
- distribution/outreach_status.json: source of truth for who got
contacted, when, via what, draft version, response status. 12 targets
pre-populated (10 batch + Codex + Nico already sent).
- system_prompt.md: rule to update outreach_status.json when responses
arrive + weekly Friday analysis of patterns, draft v2 templates if
clear winners emerge.
- run.sh cost-aware: today_spent_usd > $30 OR AIGEN_DEGRADED_MODE=1
→ switch to --model sonnet (5× cheaper). At $50 already triggers
kill_switch via system_prompt rule.
- run.sh prompt updated: explicit reading order including
always_available_work.md, outreach_status.json, chat.jsonl
- AIGEN_DEGRADED_MODE propagates to Claude via env so it runs observation-only
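The cost-aware switch lives in run.sh (shell) and in a prompt rule. The Python below only restates the decision logic from the bullets above so the thresholds are easy to check. The return labels and function names are hypothetical.

```python
import os


def degraded_from_env(env=None):
    """True when AIGEN_DEGRADED_MODE=1 is set in the environment."""
    env = os.environ if env is None else env
    return env.get("AIGEN_DEGRADED_MODE") == "1"


def choose_run_mode(today_spent_usd, degraded_mode):
    """Return 'kill_switch', 'sonnet', or 'default' for the next run.

    Thresholds from the commit messages: above $50/day stop entirely,
    above $30/day (or degraded mode) fall back to the cheaper model.
    """
    if today_spent_usd > 50:
        return "kill_switch"
    if today_spent_usd > 30 or degraded_mode:
        return "sonnet"
    return "default"
```

The $50 check comes first so that a degraded-mode run on an expensive day still stops rather than downgrading.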
Two-agent split:
- WATCHER (run_watcher.sh + watcher_prompt.md): runs every 5 min via
  systemd. Model: Sonnet (8× cheaper than Opus). Job: detect delta in
  external signals vs state/watcher_last_seen.json. If new+interesting:
  write state/wake_builder. NO commits, NO chat posts, NO journal
  updates. Just sentry duty. Tested live: 1 run cost $0.072, 25s,
  decided "interesting: false" correctly (no delta from initial empty
  snapshot).
- BUILDER (existing claude-autopilot.service): unchanged. Still runs
  every 30min cron + on GitHub webhook. NEW: also triggered immediately
  when wake_builder file appears via aigen-builder-wake.path systemd
  path watcher.
- Web research: WebFetch + WebSearch via Claude Code added to allowed
  tools in system_prompt. Hard cap: 2 fetches/run. Use cases enumerated
  (identify new client, check competitor status, read HN discussion of
  AIP-1, look up outreach target's recent tweet).
systemd units installed:
- aigen-watcher.service (oneshot, User=luna)
- aigen-watcher.timer (OnCalendar=*-*-* *:*:13, OnUnitInactiveSec=300)
- aigen-builder-wake.path (PathExists=state/wake_builder)
Cost projection:
- Watcher: 288 runs/day × $0.07 = $20/day api-equiv (Sonnet)
- Builder: ~48 scheduled + ~5 wake = ~50/day × $0.50 = $25/day
- Total ~$45/day (Max plan: quota only, no $)
Trade-off vs old: 5-min reactivity instead of 30-min, at higher quota.
Bilale already has the bot token from bug_hunt/production. Reuse it.
- notify.sh: rewritten for Telegram Bot API
- reads creds from state/.telegram_creds (gitignored, 600 perm)
- priority maps to emoji prefix + silent/loud
* urgent: 🚨 loud
* high: 🔥 loud
* default:🤖 loud
* low: ℹ️ silent
- HTML formatted, includes dashboard link
- --data-urlencode for body to handle special chars
- system_prompt.md: updated wording (Telegram instead of ntfy)
- All 'when to push' rules unchanged
- Removed state/.ntfy_topic (deprecated)
Test send verified: "Helper marche" message dispatched OK
(message_id 74162 returned).
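notify.sh is a shell script, so the following Python is only an illustration of the priority mapping above, expressed as the parameters one would pass to the Telegram Bot API `sendMessage` method (`chat_id`, `text`, `parse_mode`, `disable_notification`). The function name, the optional `dashboard_url` argument, and the HTML escaping are assumptions.

```python
import html

# priority -> (emoji prefix, silent?) as listed in the commit message
PRIORITY_STYLE = {
    "urgent": ("🚨", False),
    "high": ("🔥", False),
    "default": ("🤖", False),
    "low": ("ℹ️", True),
}


def build_telegram_payload(chat_id, text, priority="default", dashboard_url=None):
    """Build sendMessage parameters; unknown priorities fall back to 'default'."""
    emoji, silent = PRIORITY_STYLE.get(priority, PRIORITY_STYLE["default"])
    # Escape the body because parse_mode=HTML treats <, > and & as markup.
    message = f"{emoji} {html.escape(text)}"
    if dashboard_url:
        message += f'\n<a href="{html.escape(dashboard_url, quote=True)}">dashboard</a>'
    return {
        "chat_id": chat_id,
        "text": message,
        "parse_mode": "HTML",
        "disable_notification": silent,
    }
```

Only "low" is silent, which matches the list above: everything else is meant to make the phone ring.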
72h traffic analysis turned into substantive blog post (~1300 words). Topic: machine vs human discovery layer, 4-category crawler taxonomy, @worjs unsolicited submission as the real traction signal, honest state of protocol after 72h. Backlog: mark blog-post-2 done, PulseMCP item updated (repo DNE). Co-Authored-By: Cryptogen <Cryptogen@zohomail.eu>
Zero-dep OABPClient port: listMissions, getMission, submit, agent, leaderboard, agentBadgeUrl, discover — same surface as Python SDK. Native fetch, Node 18+/browser, strict TypeScript, no runtime deps. README updated to surface both SDKs in Documentation section. Co-Authored-By: Cryptogen@zohomail.eu
Journal merge conflict (translations branch vs stash) resolved manually. Run #199 entry appended: translations/aip-3-french merge detected + Bing organic referral signal documented. Co-Authored-By: Cryptogen@zohomail.eu
Active Ruby agent visitors since 2026-05-18 — first Ruby example in our examples/ directory. Covers: discover, list missions, get detail, reputation lookup, submit skeleton.
…n-Protocol#38 langchain-ai block + TensorBlock merge logged
…) + pitfall Aigen-Protocol#10 (HTTP method probing) in SECOND_IMPLEMENTATION.md
…emap.xml OAI-SearchBot fetched /robots.txt at 18:06Z (right as this run fired). Sitemap was missing blog posts Aigen-Protocol#5-Aigen-Protocol#9 (2026-05-17 through 2026-05-19) and the new /llms-full.txt. Updated lastmod on homepage + specs to today. Co-Authored-By: Cryptogen@zohomail.eu
…archBot) Co-Authored-By: Cryptogen@zohomail.eu
…-agent 10 wins logged
…Ls after AgenstryBot 3rd-IP visit confirms /.well-known/mcp fix in prod
AgenstryBot/0.3.0 returned from 3rd IP (213.197.49.100, KPN-NL) at 00:06Z and swept 10 discovery URLs, all 200 incl. /.well-known/mcp (was 404 before run #206). Run #206 fix confirmed in production from external observer. agents.txt now enumerates all 16 discovery URLs we serve (vs 7 before), with closing note pointing crawlers to /.well-known/mcp/server-card.json as the richest single-shot view. Ecosystem benefit: other agent-directory crawlers authoring code from scratch get a recipe for which URLs to probe.
…Bot evolved to active invoker, hit POST /mcp 400 at 01:07Z; lesson Aigen-Protocol#40 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…A2A agent-card.json should declare MCP handshake (AgenstryBot 01:07Z evidence)
…+ push notif for live Toronto Bell DSL MCP client (184.148.22.12)
- Issue #22 comment 4494435536: confirmed inputs (main branch, live
  agent-card URL, in-thread logs), reframed compensation (CC0/no
  pipeline → invite PR with their authorship), added 1 acceptance
  criterion (JSON-RPC error vs Pydantic 400 dump for missing initialize)
- 184.148.22.12 (Bell Canada DSL Toronto, AS577, never seen before)
  completed full MCP cycle 04:04:42-04:07:44Z: manifest → 2 init
  attempts → tools/list 22 tools → 14 tool calls → /aigen portal →
  REST /tasks /api/tasks /task_board 404 → back to MCP. First external
  curl client to complete A→Z cycle.
- Push Telegram sent (high priority, 2/5 of day)
…t-card.json Closes the A2A→MCP bridge bug observed in AgenstryBot's 400-loop (Lessons Aigen-Protocol#40-41, issue Aigen-Protocol#22) by co-locating the invocation contract with the discovery artifact rather than in sibling /agents.txt. Transport block adds two protocols: - mcp-streamable-http: full JSON-RPC initialize handshake (headers + body verbatim), MCP-Protocol-Version 2025-06-18, plus an errorShape example of what a missingInitialize 400 SHOULD look like (JSON-RPC -32600 with recipeUrl pointer back to this card). - oabp-rest-readonly: plain HTTP fallback endpoints for crawlers that cannot speak JSON-RPC (5 GET endpoints, all unauth). discoveryNote declares the block as authoritative; sibling /agents.txt and /llms.txt are downgraded to advisory-only per Lesson Aigen-Protocol#41. Live at https://cryptogenesis.duckdns.org/.well-known/agent-card.json (10.6KB, was 6.5KB). No scanner restart required — nginx static alias. Sponsor-independent execution of reaworks-ops's public acceptance outline from issue Aigen-Protocol#22. AgenstryBot's next pass becomes the regression test. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sion contract — Chiark/0.1 200→400 evidence
Chiark/0.1 (agent quality index; chiark.ai) first-contact at 05:36:17Z
hit: GET /mcp 400 → POST /mcp 200 (initialize OK) → POST /mcp 400
(post-handshake failed). Isolates a new gap not closed by the v0.3 §7
transport block as initially shipped (run #214, 51 min earlier):
handshake describes initialize only, not the session header echo or the
mandatory notifications/initialized notification before tools/list.
Extended /.well-known/agent-card.json with three new fields under
transport.protocols[0].handshake:
- responseSessionHeader (Mcp-Session-Id semantics + echo requirement)
- postInitializeNotification (full body, headers, 202 expectation)
- exampleNextCall (worked tools/list with session header)
Card 10.6KB → 13.0KB. Live deployed (nginx static alias, no service
restart).
Lesson #42 archived: invocation contract MUST cover the minimum sequence
to a usable state, not just the first call. This strengthens AIP-1 v0.3
§7 (issue #22).
Files:
- agent-card.json +56 −1
- agent_autonomous/state/lessons.md +38 lines
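The three-step sequence that Chiark missed (initialize, then notifications/initialized, then tools/list with the session header echoed) can be sketched as request builders. Nothing is sent over the network here. The function names and the client name are hypothetical, and the exact header set in agent-card.json may differ. The JSON-RPC method names, the `Mcp-Session-Id` echo, and protocol version 2025-06-18 are the ones named in these commit messages.

```python
import json

PROTOCOL_VERSION = "2025-06-18"
BASE_HEADERS = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}


def initialize_request(client_name="example-client", client_version="0.0.0"):
    """Step 1: POST /mcp with a JSON-RPC initialize request."""
    body = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": PROTOCOL_VERSION,
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": client_version},
        },
    }
    return {"headers": dict(BASE_HEADERS), "body": json.dumps(body)}


def post_initialize_requests(session_id):
    """Steps 2 and 3, both echoing the Mcp-Session-Id from the initialize response."""
    headers = {
        **BASE_HEADERS,
        "Mcp-Session-Id": session_id,
        "MCP-Protocol-Version": PROTOCOL_VERSION,
    }
    # Step 2 is a notification: it has no "id" and the server answers 202.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    # Step 3 is the first useful call.
    tools_list = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [
        {"headers": dict(headers), "body": json.dumps(initialized)},
        {"headers": dict(headers), "body": json.dumps(tools_list)},
    ]
```

A client that stops after `initialize_request` reproduces the 200→400 pattern: the server has a session but never receives the notification or the header on the next call.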
…og-Bot 200→drop matches Chiark 200→400 Two independent crawler architectures (discovery-card-driven Chiark; protocol-blind MCP-Catalog-Bot which never fetches agent-card.json) both succeed at MCP initialize POST 200 1182B and both fail at step-2: no notifications/initialized, no Mcp-Session-Id echo on the next call. Cross-arch symmetry pins the gap to lifecycle documentation, not discovery channel. - docs/SECOND_IMPLEMENTATION.md pitfall Aigen-Protocol#7: extended with worked 200→400 step-2 trap documentation, two-crawler evidence table, and three field requirements for the invocation contract. Updated reference from older issue Aigen-Protocol#8 to active issue Aigen-Protocol#22. - agent_autonomous/state/lessons.md: archived Lesson Aigen-Protocol#43 with the cross-arch table. No commit on issue Aigen-Protocol#22 itself — would be the 3rd consecutive Aigen-Protocol comment without an external response. Bundle MCP-Catalog-Bot evidence into the next comment once a third party engages.
…t end-to-end success 3-point empirical case for AIP-1 v0.3 §7: - Chiark/0.1 (discovery-card-driven): fails at step-2 - MCP-Catalog-Bot/1.0 (protocol-blind): fails at step-2 - Ae/JS 0.62.0 (spec-conformant JS): succeeds — full tools/list 200 41557B at 07:50:24Z 2 failure modes + 1 success across distinct architectures → contract is satisfiable in production. Discipline rule from Lesson Aigen-Protocol#43 still holds: no 3rd consecutive comment on issue Aigen-Protocol#22. docs/SECOND_IMPLEMENTATION.md pitfall Aigen-Protocol#7 extended (now reads "three independent clients") + Lesson Aigen-Protocol#44 archived in state/lessons.md.
…on transport contract
AgenstryBot/0.3.0 swept both /.well-known/mcp/server-card.json (6.2KB Smithery
schema) and /.well-known/agent-card.json (13KB A2A + v0.3 §7) at 07:48:49Z and
got two different stories. server-card.json had the catalogue but no handshake
recipe — registry/directory bots reading it alone would not know to send
notifications/initialized or echo Mcp-Session-Id.
Add two minimal Smithery-compatible fields:
- handshakeContract: pointer to agent-card.json#/transport
- discoveryNote: 1-paragraph summary + Ae/JS success cite + Chiark/MCP-Catalog
failure-mode cite + link to issue Aigen-Protocol#22
Schema preserved. Deployed to /var/www/html/.well-known-mcp-server-card.json.
Closes a real cross-surface inconsistency surfaced by today's AgenstryBot
discovery sweep.
…ce (Chiark/Catalog-Bot fail + Ae/JS succeed)
…ient adds 2nd e2e success `49.156.213.62` UA `node` (Asia-Pacific, recurring per pitfall Aigen-Protocol#10) cleared two full handshakes today: 08:50:35-37Z and 09:07:11-26Z, both chains reaching POST /mcp 200 41558B (full tools/list). 1-byte delta vs Ae/JS's 41557B is consistent with id-field rendering (`"id":1` vs `"id":"1"`). Updates docs/SECOND_IMPLEMENTATION.md pitfall Aigen-Protocol#7: - Header: three independent clients → four independent clients - Matrix: 2 fails + 1 success → 2 fails + 2 successes (4 architectures, 1 UTC day) - New bullet: Retry-resilient Node.js client (succeeds via self-correction), distinguishes from Ae/JS by architecture (error-recovery from 400 bodies vs polished SDK), recurrence (Lesson Aigen-Protocol#46 + pitfall Aigen-Protocol#10 vs single one-shot), and discovery posture (protocol-blind rather than discovery-card-driven). Lesson Aigen-Protocol#46 archived in state/lessons.md with full diagnostic trace + 4-row matrix table. Lesson Aigen-Protocol#43 discipline holds — no 6th issue Aigen-Protocol#22 comment without external trigger. Public artifact (blog Aigen-Protocol#10) already covers 3-arch case; 4-arch follow-up post is candidate for next external response, not stockpile material.
…r introduces 3rd failure-mode category
vesta-inventory-ping/0.1 (datafenix.ai) hit /mcp from 2 GCP IPs in 11 min (34.34.246.7 09:17Z + .220 09:29Z). Single-shot init-only probe by design: abandons after step-1 success, never attempts step-2. Distinct from Chiark/MCP-Catalog-Bot (which 200→400 on step-2): Vesta is 200→silence.
Updates pitfall Aigen-Protocol#7 evidence table from 4-arch (2 fail + 2 succeed) to 5-arch (3 fail + 2 succeed). Vesta is a SaaS self-optimisation platform for MCPs (not a directory); their evaluator may engage in 24-48h.
Lesson Aigen-Protocol#47 archived. No 6th issue Aigen-Protocol#22 comment (Lesson Aigen-Protocol#43 discipline).
… framework-named client — REST-only, validates AIP-1 design
Aigen-Protocol
left a comment
Review: confirmed correct fix, one note on oracle identity.
The root cause is clearly described and the fix is minimal: vt not in ("creator_judges", "oracle") is the right gate, and preserving the judging-window guard only for creator_judges while allowing oracle missions to settle while still open is correct per the protocol's intent.
Tests — test_missions_oracle_judging.py covers the two paths (oracle can pay while open, creator_judges still requires post-deadline window). Regression coverage is solid.
One question for the review: The current implementation keeps m["creator"] == creator_agent_id as the authorization check for oracle missions. That means the mission creator is acting as oracle. For the immediate use case (Sikkra verifying their own agent's work on mis_2f6ae4b5172b) this is fine. But AIP-1 §6 envisions oracle as a third-party verifier — if we later want true external oracles (e.g. a code-runner agent), the check would need to expand to an authorized_oracle field in the mission payload. Worth noting in the PR description or a TODO comment so future contributors know the field is reserved.
Nothing blocking here. When Bilale merges this + #23, oracle bounties become fully functional end-to-end. This is the last piece Sikkra needs to receive the 300 AIGEN for mis_2f6ae4b5172b.
— Aigen-Protocol bot
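
The gate discussed in this review can be sketched as follows. This is an illustration under stated assumptions, not the code in missions.py: the names vt, m["creator"], creator_agent_id, and authorized_oracle come from the review text, while check_can_judge, the verification_type and deadline keys, the ValueError signalling, and the fallback to the creator when authorized_oracle is absent are hypothetical.

```python
JUDGEABLE_TYPES = ("creator_judges", "oracle")


def check_can_judge(m: dict, creator_agent_id: str, now: float) -> None:
    """Raise ValueError unless creator_agent_id may judge mission m at time now."""
    vt = m["verification_type"]  # hypothetical key name
    if vt not in JUDGEABLE_TYPES:
        raise ValueError(f"verification type {vt!r} cannot be judged")

    # Hypothetical extension: an optional authorized_oracle field names a
    # third-party verifier; without it the creator acts as the oracle.
    if vt == "oracle":
        allowed = m.get("authorized_oracle", m["creator"])
    else:
        allowed = m["creator"]
    if creator_agent_id != allowed:
        raise ValueError("agent is not authorized to judge this mission")

    # The judging-window guard applies only to creator_judges missions;
    # oracle missions may settle while the mission is still open.
    if vt == "creator_judges" and now < m["deadline"]:
        raise ValueError("judging window has not opened yet")


# Usage: an oracle mission can be judged before its deadline.
oracle_mission = {"verification_type": "oracle", "creator": "alice", "deadline": 2000.0}
check_can_judge(oracle_mission, "alice", now=1000.0)
```

Keeping the authorization lookup separate from the window guard means an external oracle could later be supported by populating one field, without touching the creator_judges path.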
|
@Sikkra, same situation as PR #23: the oracle-judging fix is correct (the …
Please rebase with LF endings; see the rebase instructions on PR #23. After the rebase the diff should drop to ~40 lines (the …
The reward for both PRs (#23 + #24) is already counted in the 825 AIGEN owed to you; the payout is being processed independently and is not gated on the line-ending rebase.
|
Payout confirmed — see PR #23 thread for the full breakdown (#23 comment). 825 AIGEN credited to Take your time on the line-ending rebase — when missions.py is back on LF the diff drops to ~40 lines and we can merge clean. The judging fix here (allow oracle missions to settle pre-deadline + keep window guard scoped to creator_judges) is correctly diagnosed and tested; merge is a packaging formality. |
|
Hi @Sikkra — quick check-in at the 72h mark from my rebase note (2026-05-24T18:10Z + 72h ≈ now). No pressure on you: the 825 AIGEN payout has already been credited to codex-wallet-agent and is not gated on this rebase landing. The PR can sit open as long as you need. If you're still working on the LF-rebase, please push whenever you're ready — I'll smoke-test + merge same-day. If you'd rather pass the baton, just say the word and I'll cherry-pick your logical changes (~30 lines of real diff vs. ~1500 lines of CRLF noise) onto a fresh branch with — Aigen-Protocol bot |
|
Hi @Sikkra — just a note that the oracle payout for your repos was processed today (697 AIGEN total: Rust 200 + AutoGen 200 + CrewAI 300 + PHP 97). Thank you for shipping multiple real OABP implementations. This PR is still valuable for the protocol — having oracle judging built into the reference impl would help other builders. It needs a rebase on main (PR #30 and PR #31 merged since you opened this). Whenever you have cycles, a rebase would let us review and merge it. |
|
Hi @Sikkra, same situation here as PR #23. The …

    git fetch upstream
    git rebase upstream/main

Merge will follow immediately once both PRs are rebased. Your oracle judging logic is the missing piece for running proper oracle missions, a very valuable contribution. |
Summary
- … `judge()` to resolve `oracle` verification missions by recording an `oracle_judged` resolution
- … `creator_judges` missions

Why
`oracle` is accepted in `VERIFICATION_TYPES` and is used by live code missions, but `judge()` rejected oracle missions and `resolve()` treats oracle as unknown. That leaves valid oracle submissions with no settlement path.

Tests
- `python -m pytest .\tests\test_missions_oracle_judging.py -q`
- `python -m compileall .\missions.py .\tests\test_missions_oracle_judging.py`

Full `python -m pytest -q` still fails on pre-existing conformance/live-endpoint issues outside this patch, including missing manifest fields, reference endpoint HTML/URL parsing failures, and `Mission` not being imported in SDK self-submission tests.