Phase 2.5: migration 020 - bucket_roots shared root-pointer table (FM-1) #48
Open
ehsan6sha wants to merge 12 commits into
…c deposit credit, cron leader-lease
Federated masters (Phase 1.5, Stage A) groundwork. All flag-gated, default OFF - single-master behavior is byte-identical when dark:
- migration 018: partial UNIQUE index (user_id, reference_id) WHERE tx_type=hourly_deduction (CONCURRENTLY; pre-checks duplicates; .down.sql)
- deductionJob (BILLING_IDEMPOTENCY=true): deterministic reference_id hour:YYYY-MM-DDTHH (UTC) and the history INSERT becomes the dedup gate (ON CONFLICT DO NOTHING) BEFORE the balance update - N masters deduct exactly once per (user, hour)
- blockScanner (BILLING_IDEMPOTENCY=true): deposit insert + creditUserTx + claimed_at in ONE transaction - a crash can no longer strand a recorded-but-uncredited tx; creditService gains creditUserTx (caller-owned transaction; creditUser delegates, zero behavior change)
- leaderLease (CRON_LEADER_LEASE=true): Postgres session advisory lock on a dedicated client gates every cron tick; holder crash frees the lock so a standby master takes over on its next tick; SIGTERM releases explicitly
- tests: hour-bucket determinism/UTC/collision, fee formula unchanged, flag-off no-op gate (DB-free; multi-master paths covered by Phase 1.5 e2e)
Part of #46
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… replication sweep
Phase 1.5 Stage A deliverables (#46):
- docker/master: compose stack (postgres-pinning, pinning OpenAPI from main_postgres.go, pinning-webui with cron family) - host networking + 127.0.0.1 binds like prod, healthchecks + restart policies + label-scoped watchtower; optional fula-gateway profile (auto-enabled when image exists)
- update-scripts/join-as-master.sh v0: capstone installer first cut - detect/adopt-or-halt (adopts the Phase-1 kubo+cluster writer; halts on a foreign postgres-pinning), ordered migrations with halt-on-error + marker, idempotent re-runs, params persisted to .env (phase-common pattern)
- update-scripts/pinset-snapshot.sh: signed (ed25519) authoritative pinset dumps + --verify/--restore/--install-cron (early FM-3 restore path)
- update-scripts/replication-sweep.sh: below-REPL_MIN detection + recover + alert log + --strict for drills (closes the S4 sweep gap)
- test seams: processUserDeduction exported; cron intervals env-overridable (SCANNER_INTERVAL_MS/DEDUCTION_INTERVAL_MS, defaults unchanged)
- tests: fm2-billing-integration (live-Postgres; skips cleanly without DB) - concurrent same-hour deduction races deduct once; replayed deposit credits once and leaves no recorded-but-uncredited state
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ver, snapshots, sweep
- D1 stack health
- D2 migration-018 presence
- D3 two webui masters -> one leader + one standby + exactly one hourly_deduction per (user, hour)
- D4 kill -9 leader -> standby acquires lease, STILL one row (idempotency under failover)
- D5 live-Postgres vitest integration
- D6 snapshot take/verify/tamper-reject/unpin+restore
- D7 sweep clean -> forced under-replication detected (--strict) + alerted -> reconverged clean
Part of #46
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…_PORT default)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… auto-generate + persist
Found by the live installer run (webui FATALs without them; container crash-looped silently). join-as-master.sh now generates both once (openssl rand) and persists to .env; compose fails fast with a clear message if absent; drill webuis receive them too.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Found by the live Phase 1.5 e2e run:
- migration 019: user_wallets.wallet_address DROP NOT NULL - post-PII linkWallet stores hash-only (wallet_address=NULL) so EVERY fresh install rejected wallet links; 012 relaxed user_email but missed this column (guarded .down.sql). Real fresh-deploy bug, not test-only.
- compose: mount fula-gateway-state at /var/lib/fula-gateway - the gateway durable state paths are hardcoded there; without the volume the S2 pin queue silently degrades to fire-and-forget and the bucket registry resets on restart.
- drills: D5 runner no longer swallows vitest exit (pipefail + explicit 2-passed check); D6 takes the FIRST cid from the snapshot stream (was capturing a multiline list) and polls 60s for the restore.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
join-as-master.sh now installs the pinset-snapshot (6h) and replication-sweep (30min) cron entries and takes the first snapshot immediately - the restore path exists from minute one, per the safeguards invariant (S4/S6 must be scheduled, not manual).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…pass count
The tests passed (2/2 on live Postgres) but the colored output broke the literal match - strip escapes, then assert.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Arbiter table for multi-master bucket-root CAS (fula-api#32; consumed by the fula-gateway when FULA_BUCKET_ROOT_CAS is on). Purely additive - nothing touches it until the gateway flag is enabled. Guarded .down.sql.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rofile
A vetted third party provides STORAGE + verified byte ingress with NO cluster write authority and NO database/billing: follower kubo + ipfs-cluster in FOLLOWERMODE trusting ONLY the master peer ids (never itself, so it replicates but cannot mutate the pinset - mass-unpin blast radius stays first-party until FM-3) + fula-ingest (verified blake3 CID, quota-gated on a master storage API). Dockerized compose (healthchecks + restart + watchtower); idempotent, adopt-or-halt, .env-persisted, auto-resolves master identity from the pools/masters API. This is the FM-1+FM-4 Stage-B gate's operator tooling.
Part of #48 / fula-api#32
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…te test
Two gateways (separate registries) vs one Postgres/kubo/cluster. CAS OFF: stale gateway B clobbers A => split bucket (lost update reproduced). CAS ON: B opens at the shared root first => both objects on both gateways (no lost update). No concurrency-timing flake - exploits the stale-registry hazard directly. Needs fula-gateway:p25 (built from phase-2.5-multimaster).
Part of #48 / fula-api#32
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Phase 2.5: final verification complete (all green on the test stack)
Live two-gateway drill (
Combined Phase 2.5 evidence:
Plus the storage-only Stage-B operator profile (
Phase 2.5 is feature-complete and verified. Ready for your review.
🤖 Generated with Claude Code
Phase 2.5 - migration 020: bucket_roots shared root-pointer table (FM-1)
The arbiter table for multi-master bucket-root compare-and-swap. With more than one federated master serving S3 writes, each bucket's root CID needs a store every master can CAS against; the gateways' in-process locks can't see each other.
The fula-gateway (FULA_BUCKET_ROOT_CAS, functionland/fula-api#41) performs a single-statement upsert-CAS.
Purely additive - nothing reads or writes this table until the gateway flag is enabled; flag-off masters are unaffected. Guarded .down.sql (refuses to drop while the flag is in use).
Verified green: the gateway's PgRootStore integration test runs against this exact table on the live stack DB - claim → stale-conflict → retry-wins → version increment, plus an 8-way concurrent "exactly one winner" race (functionland/fula-api#41).
Part of the Phase 2.5 multi-master-safety milestone.
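For readers unfamiliar with the pattern, here is a minimal sketch of what a single-statement upsert-CAS on a root-pointer table can look like. It is not the gateway's statement or the migration's schema: the columns (bucket, root_cid, version), the helper names cas_root and read_root, and the convention that a missing row is claimed at version 1 are all assumptions for illustration. It runs on Python's built-in sqlite3, whose ON CONFLICT ... DO UPDATE ... WHERE syntax follows the one PostgreSQL established.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE bucket_roots (
    bucket   TEXT PRIMARY KEY,
    root_cid TEXT NOT NULL,
    version  INTEGER NOT NULL
);
"""

# One statement: claim the bucket if no row exists, otherwise swap the root
# only when the stored version still equals the caller's expected version.
UPSERT_CAS = """
INSERT INTO bucket_roots (bucket, root_cid, version)
VALUES (?, ?, 1)
ON CONFLICT (bucket) DO UPDATE
   SET root_cid = excluded.root_cid,
       version  = bucket_roots.version + 1
 WHERE bucket_roots.version = ?
"""


def cas_root(conn: sqlite3.Connection, bucket: str,
             expected_version: int, new_cid: str) -> bool:
    """Return True if this caller won the swap, False on a stale version.

    Sketch limitation: a missing row is claimed whatever expected_version is.
    """
    with conn:
        cur = conn.execute(UPSERT_CAS, (bucket, new_cid, expected_version))
        return cur.rowcount == 1


def read_root(conn: sqlite3.Connection, bucket: str):
    """Return (root_cid, version) for the bucket, or None if unclaimed."""
    return conn.execute(
        "SELECT root_cid, version FROM bucket_roots WHERE bucket = ?", (bucket,)
    ).fetchone()


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    print(cas_root(db, "photos", 0, "cid-a"))  # gateway A claims the bucket
    print(cas_root(db, "photos", 0, "cid-b"))  # gateway B holds a stale view
    print(read_root(db, "photos"))
```

A caller that loses re-reads the current root and version, rebuilds its change on top of that root, and retries. This mirrors the claim, stale-conflict, retry-wins sequence named in the verification note above.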
🤖 Generated with Claude Code