Skip to content

feat: autonomous owner harness — self-prompting controllers over flow#71

Merged
anshulsao merged 15 commits into main from owner-harness
Jun 11, 2026

@anshulsao anshulsao commented Jun 9, 2026

What

Adds owners — durable, named, repo-scoped self-prompting controllers that take ongoing responsibility for an outcome (re-wake, re-evaluate, act until done), vs. flow's one-shot flow do --auto and manual playbooks.

An owner is not a single Claude session. It is a charter (operating manual) + a journal + a clock. Each tick is a fresh headless run that reads its charter + journal, reviews what it owns, orchestrates (never executes inline — it routes work through tasks/playbook-runs that self-close with the flow done KB sweep), parks human decisions as question-tasks, self-paces its next wake, and journals.

Design doc: docs/superpowers/specs/2026-06-08-autonomous-owner-harness-design.md

How ticks fire (no daemon, portable)

  • flow stays pure CLI — no daemon, no OS-specific scheduler code. It only provides flow owner tick-due (scan due owners → dispatch detached ticks).
  • On macOS, a launchd agent set up per-host by the flow skill (not compiled in) polls it every ~60s. On Linux, the same skill recipe sets up a systemd timer or cron job instead.

Commands

flow add owner "<name>" --work-dir <p> [--every <dur>] [--project] [--slug]
flow owner list | show | start | pause     # list/show also verb-first:
flow list owners                  # alias of: flow owner list
flow show owner <slug>            # alias of: flow owner show <slug>
flow owner tick <slug>            # interactive (you drive); --auto = headless now
flow owner next <slug> --in <dur> | --at <when>   # self-pace the next wake
flow owner retire <slug> [--delete]
flow owner tick-due               # scheduler entry (+ internal flow __owner-tick)

Model highlights

  • Orchestrate, never inline — ticks are sessionless (no flow done sweep), so they route work through tasks (one-time) / playbooks (recurring) that self-close and capture learnings.
  • Tags, not new tables — the only new table is owners; owner↔task linkage and the question marker reuse the existing tag system (owner:<slug>, question).
  • Self-paced scheduling — each tick sets its own next wake (flow owner next); --every is just a fallback heartbeat floor.
  • Cross-tick journal: owners/<slug>/updates/, read at tick start, written before exit (the owner's memory across blank-session ticks).
  • Live tick indicator + overlap guard (skip owners with a running tick) + dead-pid reconciliation.

Tests

All TDD (RED → GREEN). Full suite green, no regressions.

Status

Draft — opening for review before merge. Dogfooding live: an af-stability owner is already watching Agent Factory prod (found a stuck scheduled job, autonomously raised a GitHub issue + TDD'd PR, stopped before merge).

🤖 Generated with Claude Code

@anshulsao

Addressed an independent review pass:

  • Blocking race fixed (cea8462): the scheduler and the detached tick both wrote the owner row via full-row UpdateOwner, so a fast-finishing tick had its last_tick_status clobbered (shown 'running' with a dead pid until reconcile), and a tick that self-paced mid-run had its next_wake_at overwritten. Replaced with targeted column updates that commute across the two writers, plus a guarded reconcileOwnerTick (WHERE tick_pid IS NOT NULL) mirroring the --auto path. Two TDD regressions added (RED→GREEN).
  • CLI surface (85a95e3): added verb-first flow list owners / flow show owner <slug> as aliases of the grouped forms (list/show only; lifecycle verbs stay grouped).

Full suite green, vet clean.

@anshulsao

Closed out the independent review's test-gap section (44338e3): DueOwners garbage-timestamp skip, tick-due skip for an invalid --every, second-pass no-redispatch, owner next --at date expressions, and a sessionless-env assertion (no CLAUDE_CODE_SESSION_ID leak into a tick). No production code changes; coverage only.

All review findings now addressed (blocking race, bonus self-paced-next-wake clobber, all nits, all test gaps). Full suite green, vet clean. Marking ready for review.

@anshulsao

Known follow-up (not in this PR — it's a behavior change, not a review fix): #75 — owner ticks don't capture a session id / transcript, so "what happened in the last tick" isn't inspectable the way playbook runs are. Proposed fix is to mint + record a fresh per-tick session id (last_tick_session_id) instead of discarding it.

@anshulsao

Second independent review (Fable 5) pass addressed in bfcc0f7:

Two HIGH gaps fixed:

  • retire→start zombie: ownerStart now clears archived_at (ActivateOwner), so start un-retires instead of leaving an active-but-archived owner that's hidden and never ticks.
  • --tag didn't intersect: flow list tasks --tag a --tag b is now repeatable + ANDs, so the skill's ledger query --tag owner:<slug> --tag question returns only that owner's questions.
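
To illustrate the AND semantics of repeated --tag flags, here is a minimal Go sketch. The `Task` type and the `hasAllTags` / `filterTasks` helpers are hypothetical names for illustration only; the PR names `TaskFilter.Tags` but does not show how flow filters, and the real filter may well run in SQL rather than in memory.

```go
package main

import "fmt"

// Task is a hypothetical, trimmed-down task record (not flow's real type).
type Task struct {
	Slug string
	Tags []string
}

// hasAllTags reports whether t carries every tag in want (AND, not OR).
func hasAllTags(t Task, want []string) bool {
	have := make(map[string]bool, len(t.Tags))
	for _, tag := range t.Tags {
		have[tag] = true
	}
	for _, w := range want {
		if !have[w] {
			return false
		}
	}
	return true
}

// filterTasks keeps only the tasks that match all requested tags.
func filterTasks(tasks []Task, want []string) []Task {
	var out []Task
	for _, t := range tasks {
		if hasAllTags(t, want) {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	tasks := []Task{
		{Slug: "q1", Tags: []string{"owner:af-stability", "question"}},
		{Slug: "q2", Tags: []string{"owner:other", "question"}},
		{Slug: "t1", Tags: []string{"owner:af-stability"}},
	}
	for _, t := range filterTasks(tasks, []string{"owner:af-stability", "question"}) {
		fmt.Println(t.Slug) // only q1 carries both tags
	}
}
```

With OR semantics, the same query would also have returned q2 and t1, which is the "every owner's questions" symptom described above.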

Mediums: absolute charter/journal paths in the tick prompt (headless tick no longer creates a stray owners/ in the repo); a tick_started staleness cap so pid-reuse after reboot can't stall an owner forever; lifecycle commands (next/start/pause) converted to targeted column writes; the status-guard skip clears its stale pid.

Minors: flow help now lists owner commands; owner show drops done tasks from 'in flight'.

8 new TDD regressions, full suite green, vet clean. Verified no regressions to existing surfaces.

Note: branch is behind origin/main (bg backend #74 etc. merged since) — will rebase before merge.

anshulsao and others added 14 commits June 11, 2026 11:31
…r flow

Adds `owner`: a durable, repo-scoped self-prompting controller that takes
ongoing responsibility for an outcome. Not a single session — it is a
charter + a clock that wakes on an interval, runs a fresh headless tick,
acts, and reschedules itself.

Data layer (internal/flowdb):
- owners table (additive migration) + Owner CRUD: Create/Get/List/Update
- DueOwners with timezone-correct, parse-based time comparison
- live-tick bookkeeping columns (tick_pid / tick_started)

CLI (internal/app):
- flow add owner / owner list (with next-tick) / show / start / pause
- the tick: `flow owner tick-due` (scheduler scan) -> detached
  `flow __owner-tick` (fresh sessionless headless run via SkipPermissionsRun)
- live tick indicator: `tick: running (pid ...)` in owner show, with
  read-time pid reconciliation (dead -> last tick)
- flow add task --tag (repeatable); owner show splits out playbook runs

Model: owners ORCHESTRATE, never execute inline. A tick is sessionless and
gets no `flow done` KB/project sweep, so work is routed through
playbook-runs (recurring) and tasks (one-time) that self-close with the
sweep. Human questions are tasks tagged `question` + `owner:<slug>`.
Owner<->task linkage and the question marker reuse the existing tag system
(the only new table is `owners`).

Docs: design spec at docs/superpowers/specs/2026-06-08-autonomous-owner-harness-design.md.
Skill section 4.17 teaches the ownership model to harness sessions.

All TDD (RED -> GREEN); full suite green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The tick is a fresh sessionless run, so all continuity must live on disk.
Two fixes:

- Review step now uses `flow owner show <slug>` (lists in-flight tasks,
  playbook runs, AND questions with status) instead of `flow list tasks
  --tag` — which defaults to kind=regular and HID the owner's own
  playbook-runs, so a tick could re-spawn a run already in progress.

- Tick journal: each tick now READS its recent notes under
  owners/<slug>/updates/ at the start (to recover what it dispatched / is
  waiting on) and WRITES a short note before exiting (what it observed,
  what it dispatched with slugs, what the next tick should check). Same
  updates/ convention as tasks and playbooks — the owner's durable memory
  across blank-session ticks.

Skill §4.17 + §4.5 updated to teach the journal + owner-show review.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… wake

Adds an on-demand way to wake an owner now, instead of only the scheduler:

  flow owner tick <slug>          interactive: spawns a tab the user drives
                                  (may use AskUserQuestion, refine the charter live)
  flow owner tick <slug> --auto   headless on-demand (same as a scheduler tick)

Mirrors `flow do` vs `flow do --auto`. Interactive ticks mint a throwaway
session and reuse the spawner; they get a distinct prompt that permits the
human to navigate and folds lessons back into the charter. The on-demand
tick is extra — it does not disturb next_wake_at.

Use case: run the FIRST tick interactively to shepherd the owner and tune
the charter before it runs unattended (skill §4.17 nudges this when an
owner has never ticked). Scheduled ticks stay headless.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Replaces the fixed recurring interval with dynamic, per-run scheduling:

- `flow owner next <slug> --in <dur> | --at <when>` sets next_wake_at.
  Each tick calls it before exiting to choose when it next needs to run
  (e.g. +15m while watching a deploy, +1d when idle).
- `--every` is demoted to a fallback heartbeat floor: optional, defaults
  to 24h, and only re-wakes the owner if a tick crashes or never sets its
  next wake. It is no longer a rigid cron cadence.
- Both tick prompts gain a self-pace step; the agent sets the cadence
  itself (no user confirmation, even on an interactive tick).
- Intake asks only for a loose fallback interval, not a fixed schedule.

Skill §4.17 + command reference updated.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
flow stays pure CLI with no OS-specific scheduler code. Firing ticks is
delegated to the host, set up by the skill — not compiled into the binary.

- Overlap guard (portable Go): `flow owner tick-due` now skips any owner
  whose tick is still running (live tick_pid), so a slow tick never gets a
  second one stacked on it. A dead pid falls through and re-dispatches.

- Scheduler is a SKILL recipe (§4.17), not a flow command: check → install
  if missing → verify → respawn, per host. macOS launchd LaunchAgent polling
  `flow owner tick-due` every 60s; Linux systemd-user-timer / cron variant.
  Idempotent, opt-in, easy to unload (stops all owners). No launchd code in
  flow — the OS glue lives in the skill, adapting per-OS at runtime.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An interactive (`flow owner tick`) launch spawned the tab but never
recorded anything, so `flow owner show` reported `last tick: (never)` even
after a full interactive tick ran, journaled, and dispatched work. Now the
interactive path stamps last_tick_at + last_tick_status='interactive' after
a successful spawn. (Completion isn't tracked — the user drives the tab —
but the session self-paces via `flow owner next` and journals as before.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…chd plist

Hit in practice: launchd runs with a minimal PATH, so a dispatched tick
failed with `exec: "claude": executable file not found in $PATH`. The
ensure-scheduler recipe now requires embedding the user's full interactive
$PATH via EnvironmentVariables.PATH in the plist (claude/gh/git live in
~/.local/bin, homebrew, etc.).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds a proper lifecycle end for owners (replacing hand-deleting DB rows):

- `flow owner retire <slug>` — graceful: status=retired + archived. Stops
  ticking (DueOwners requires active), drops off the default list; charter,
  journal, tick logs, and owned tasks are preserved.
- `flow owner retire <slug> --delete` — hard: removes the owner row AND the
  owners/<slug>/ directory. Owned tasks (tag owner:<slug>) are independent
  and left intact either way.

flowdb: RetireOwner + DeleteOwner. Skill §4.17 lifecycle + reference updated
(confirm via AskUserQuestion before retiring/deleting). All TDD.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The owner skill section had grown verbose with repeated rationale
(orchestrate-never-inline stated 3×, self-pacing in 3 places) and a
restated tick procedure that duplicates the tick prompt. Condensed by ~39%
while keeping every command, the tag contract, the journal, the
scheduler recipe (incl. the launchd PATH gotcha), lifecycle, and all four
anti-patterns. No behavior change — pure density.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Owners are a peer object of task/project/playbook, so they should be
reachable via the same verb-first surface, not only the grouped
`flow owner <verb>` form. Add `flow list owners` and `flow show owner
<slug>` as aliases that delegate to the existing ownerList/ownerShow
renderers (zero duplication — same function call).

Scope is deliberately just list + show: those are the read/inspect
verbs owners share with the other top-level objects, and the surface
mismatch users actually hit. Lifecycle verbs (start/pause/next/retire/
tick) stay grouped under `flow owner` since they have no task analogue
and read naturally as `flow owner pause` etc. `flow add owner` is
unchanged.

Tests assert the alias output is byte-identical to the grouped form so
the two surfaces can't drift. Skill command reference notes both forms.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The scheduler (parent, `flow owner tick-due`) and the detached tick
(child, `flow __owner-tick`) both wrote the owner row via the full-row
`flowdb.UpdateOwner`. Writing a stale in-memory struct clobbered the
other writer's columns:

- A tick that finished before the scheduler recorded its pid had its
  `last_tick_status` ('ok'/'error') overwritten with the pre-tick value,
  and showed "running" with a dead pid until the next reconcile.
- A tick that self-paced mid-run (`flow owner next`) had its new
  `next_wake_at` clobbered back to the value loaded when the tick
  started — so self-pacing silently didn't stick.

Replace the full-row writes with targeted column updates that each touch
only the columns that writer owns, so the writes commute:

  dispatch/start → tick_pid, tick_started (+ next_wake_at for scheduler)
  finish         → last_tick_at, last_tick_status, clear tick_pid/started
  interactive    → last_tick_* only (leaves next_wake_at for self-pace)

`reconcileOwnerTick` is now guarded (`WHERE tick_pid IS NOT NULL`) so it
can't overwrite a genuine result with 'dead' if a finish lands first —
mirroring the `--auto` path's `reconcileAutoRun` guard.

The one remaining overlap (a stale dead tick_pid from a fast-finishing
child) is self-healing via the dead-pid reconcile + DueOwners overlap
guard; no durable result is lost.
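
The race and its fix can be replayed with a small in-memory model in Go. Everything here is a simplification for illustration: `ownerRow` stands in for the database row, the two functions replay one specific interleaving (child finishes before the scheduler records the pid), and the scheduler's own next_wake_at advance is left out for brevity.

```go
package main

import "fmt"

// ownerRow is a hypothetical in-memory stand-in for one owners-table row.
type ownerRow struct {
	TickPid        int
	LastTickStatus string
	NextWakeAt     string
}

// raceFullRow replays the bug: the scheduler writes back its whole stale
// snapshot after the child has already recorded its result.
func raceFullRow() ownerRow {
	db := ownerRow{LastTickStatus: "ok", NextWakeAt: "t0"} // state before the tick
	snapshot := db                                         // scheduler loads the row

	// Fast child: self-paces and finishes first.
	db.NextWakeAt = "t0+15m"
	db.LastTickStatus = "error"
	db.TickPid = 0

	// Scheduler records the pid via a full-row write of its snapshot.
	snapshot.TickPid = 4242
	db = snapshot
	return db
}

// raceTargeted replays the same interleaving with column-scoped writes.
func raceTargeted() ownerRow {
	db := ownerRow{LastTickStatus: "ok", NextWakeAt: "t0"}

	db.NextWakeAt = "t0+15m"    // child: self-pace touches next_wake_at only
	db.LastTickStatus = "error" // child: finish touches last_tick_* ...
	db.TickPid = 0              // ... and clears tick_pid

	db.TickPid = 4242 // scheduler: dispatch touches tick_pid only
	return db
}

func main() {
	fmt.Printf("full-row: %+v\n", raceFullRow())
	fmt.Printf("targeted: %+v\n", raceTargeted())
}
```

In the full-row replay the child's "error" result and its self-paced wake are both lost. In the targeted replay they survive; the only leftover is the stale pid 4242, which is the self-healing overlap mentioned above.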

TDD: TestOwnerTickDueFastFinishingChildKeepsResult and
TestCmdOwnerTickPreservesSelfPacedNextWake both RED before, GREEN after.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…endly slug error

Knocks out the non-blocking findings from the independent review:

- `flow owner next` now rejects a wake time in the past (stale --at or a
  negative --in) instead of leaving the owner perpetually due and ticking
  every scheduler pass.
- Tick paths guard on owner status: the detached `flow __owner-tick`
  skips a non-active owner cleanly (covers the dispatch→exec window and
  manual --auto on a paused owner), and the hand-triggered `flow owner
  tick` refuses a paused/retired owner with resume guidance instead of
  silently spawning a session.
- `flow add owner --slug <existing>` now fails with a friendly "slug
  already exists" message rather than a raw UNIQUE-constraint error.
- Skill + Owner doc-comment wording: drop the stale "fixed interval"
  phrasing for `--every` in favor of "fallback heartbeat floor" (it
  self-paces via `flow owner next`), matching the §4.17 body.

(The reconcile-guard nit was already handled in the prior race-fix
commit.) TDD: four regressions added (RED→GREEN), full suite green,
vet clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Adds the test cases the independent review flagged as missing — all
exercise existing defensive/behavioral paths that were previously
unasserted:

- DueOwners skips an unparseable next_wake_at (doesn't crash the pass).
- tick-due skips an owner with an invalid --every (CreateOwner/direct
  writes don't validate the interval like `flow add owner` does).
- A second tick-due pass does not re-dispatch an owner whose next_wake
  was just advanced (proves the schedule-advance, with the pid forced
  dead so the overlap guard isn't what's masking it).
- `flow owner next --at` accepts a date expression (tomorrow), not just
  RFC3339.
- The detached tick's child env is sessionless — CLAUDE_CODE_SESSION_ID
  is stripped, so each tick is a fresh run (the central owner invariant).

No production code changes. Full suite green, vet clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…, tick paths + robustness

Fixes the functionality gaps from the second independent review:

- **retire→start zombie (HIGH):** `ownerStart` now clears `archived_at`
  (new `flowdb.ActivateOwner`), so `flow owner start` un-retires a retired
  owner instead of leaving it active-but-archived (hidden from the list,
  never dispatched by DueOwners). Skill updated: retire is reversible via
  start.
- **`--tag` didn't intersect (HIGH):** `flow list tasks --tag a --tag b`
  is now repeatable and ANDs the tags (new `TaskFilter.Tags`), so the
  skill-documented ledger query `--tag owner:<slug> --tag question`
  returns only that owner's questions instead of every owner's.
- **Tick prompt relative paths (MED):** the prompt now points at the
  ABSOLUTE `$FLOW_ROOT/owners/<slug>/{charter.md,updates/}` (via
  ownerDirFor) and warns it's not under work_dir — a headless tick
  (cwd=work_dir) no longer fails to find its charter or creates a stray
  owners/ dir in the user's repo.
- **PID-reuse staleness cap (MED):** a tick_pid that looks alive long
  after tick_started (reboot pid-reuse, incl. EPERM=alive) no longer
  stalls an owner forever — `ownerTickStale` (1h cap) lets the overlap
  guard re-dispatch and reconcile heal it.
- **Lifecycle full-row writes (LOW-MED):** `ownerNext`/`Start`/`Pause`
  now use targeted column writes (SetOwnerNextWake/ActivateOwner/
  PauseOwner) so they can't clobber concurrent tick bookkeeping (extends
  the cea8462 fix to the human-facing commands).
- **Status-guard skip leaked a stale pid:** the non-active skip in
  `cmdOwnerTick` now clears tick_pid/tick_started so reconcile doesn't
  later mislabel a clean skip as 'dead'.
- **Minors:** `flow help` now lists the owner commands; `owner show`
  drops done tasks from the "in flight" section.

TDD: 8 new regressions (RED→GREEN) across flowdb + app; one existing test
updated (a "running tick" needs a recent tick_started now that the
staleness cap exists). Full suite green, vet clean. No regressions to
existing surfaces.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…l pattern

Adds the documented path for making an owner event-driven (reactive on
demand) while staying a cheap poller by default — the "own the whole
latency curve" capability.

- `flow owner tick --auto` now carries the same overlap guard as the
  scheduler: it refuses to stack a headless tick on top of one already
  running (dead/stale pid falls through). This is the prerequisite for
  event triggers — a watcher's `owner tick --auto` can now race a
  scheduled tick without double-firing. (Closes the deferred Fable nit.)
- Skill §4.17: new "Event-driven owners" subsection. A tick can't hold a
  Monitor (it's headless and exits), so it dispatches a BOUNDED watcher
  task (`flow do --auto` + Monitor tool) that, on the event, writes a
  focus note to the journal and fires a focused `flow owner tick --auto`,
  then exits — back to spaced polling. Framed as bounded-only (a watcher
  is a living session; never make it permanent).

TDD: TestOwnerTickManualAutoRefusesWhileTickRunning (RED→GREEN). Full
suite green, vet clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@anshulsao anshulsao merged commit bb81c1a into main Jun 11, 2026
3 checks passed