Skip to content

fix(kiloclaw): compute orphan volume grace period in SQL#3431

Open
St0rmz1 wants to merge 1 commit into
mainfrom
fix/kiloclaw-orphan-volume-destroy-grace-sql
Open

fix(kiloclaw): compute orphan volume grace period in SQL#3431
St0rmz1 wants to merge 1 commit into
mainfrom
fix/kiloclaw-orphan-volume-destroy-grace-sql

Conversation

@St0rmz1
Copy link
Copy Markdown
Contributor

@St0rmz1 St0rmz1 commented May 22, 2026

Summary

The admin orphan volume cleanup tool rejected volumes that were well past
the 7 day grace period, reporting that the instance was "destroyed too
recently". A destroyed instance over 60 days old was still refused.

The destroy path measured the grace period in JavaScript: it read the
stored destroyed_at timestamp and passed it to the Date constructor.
That value arrives as Postgres native timestamp text rather than ISO
8601, and the Date constructor handles that text inconsistently across
the Vercel and Cloudflare runtimes the two destroy gates run in, so the
elapsed time came out wrong and the grace gate misfired.

Both destroy gates now compute the grace decision entirely in Postgres:

  • The web destroyOrphanVolume procedure and the kiloclaw worker
    orphan-volume-destroy route each select a grace_period_elapsed
    boolean, derived from max(destroyed_at) across the (user, sandbox)
    versus the grace window.
  • No database timestamp is parsed in JavaScript anymore.
  • false or a null result (no destroyed row, already ruled out by the
    preceding gate) both fail closed.

Grace is still measured from the latest destruction of the sandbox, so a
reprovisioned sandbox whose shared volume was re-destroyed recently is
still held. The change is limited to the orphan volume destroy path; no
other procedure or route is affected.

Verification

  • Destroyed a volume for an instance destroyed well over 7 days ago
    from the admin tool and confirmed it now succeeds instead of being
    rejected as too recent.
  • Confirmed an instance destroyed within the last few days is still
    rejected at the grace gate.

Visual Changes

N/A. This PR changes server query logic and test coverage only. There
are no UI changes.

Before After

Reviewer Notes

Focus areas:

  • The grace_period_elapsed SQL expression: `extract(epoch from (now()
    • max(destroyed_at))) * 1000` compared against the grace constant in
      milliseconds. Confirm the boundary matches the previous JavaScript
      comparison.
  • Both gates read the precomputed boolean and treat anything other than
    true as not elapsed, so a null result fails closed.
  • Tests: the web router test runs against Postgres and covers the grace
    boundary, the reprovisioned sandbox case, and a long destroyed
    instance stored as Postgres native timestamp text. The worker test
    mocks the database, so it consumes the precomputed flag directly; the
    SQL itself is exercised end to end by the web router test.

@kilo-code-bot
Copy link
Copy Markdown
Contributor

kilo-code-bot Bot commented May 22, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Executive Summary

The PR correctly moves the orphan-volume grace period check from runtime JS Date parsing into a single SQL expression evaluated by Postgres, fixing a cross-runtime timestamp format bug while keeping all safety gates fail-closed.

Files Reviewed (4 files)
  • apps/web/src/routers/admin-kiloclaw-instances-router.ts
  • apps/web/src/routers/admin-kiloclaw-instances-router.test.ts
  • services/kiloclaw/src/routes/platform.ts
  • services/kiloclaw/src/routes/platform-orphan-volume.test.ts

Notes:

  • The SQL arithmetic is correct: extract(epoch from ...) * 1000 > ORPHAN_VOLUME_GRACE_PERIOD_MS (604,800,000 ms = 7 days).
  • Fail-closed behavior is preserved: when the correlated subquery returns NULL (no destroyed rows), NULL * 1000 > constant evaluates to NULL in Postgres, which Drizzle returns as null in JS, and null !== true correctly blocks the operation.
  • The removed test case ('refuses (409) when a newer destruction of the same sandbox is within grace') was not lost — the semantic is now validated end-to-end against real Postgres in the web router regression test ('clears the grace gate for a long-destroyed instance stored as Postgres timestamp text'). The worker unit test simply verifies it honors the precomputed flag.
  • The remaining latest_sandbox_destroyed_at references in the router file are in the separate scan/classification query (lines ~3851–3909, 4043) and are unrelated to this change.

Fix these issues in Kilo Cloud


Reviewed by claude-4.6-sonnet-20260217 · 427,449 tokens

Review guidance: REVIEW.md from base branch main

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant