Merged
11 changes: 11 additions & 0 deletions .changeset/watchdog-worker-and-verbose-logs.md
@@ -0,0 +1,11 @@
---
"playground-cli": patch
---

Harden the deploy memory watchdog, add diagnostic logging for freezes / runaway RSS, and fix the phone-signer approval counter when a PoP upgrade is required.

- **Watchdog now runs in a `worker_threads` Worker**, not a `setInterval` on the main thread. Under heavy microtask load (polkadot-api block subscriptions, bulletin-deploy retry loops) the main thread's macrotask queue can be starved for long enough that RSS climbs to 10+ GB between samples, at which point macOS jetsam delivers SIGKILL and the user sees a mystery `zsh: killed` with no guidance. The worker has its own event loop that can't be starved by the main thread, so the 4 GB cap now actually fires with a clear abort message. The sampling interval is also tightened from 5 s to 1 s now that it's off the hot path.
- **New `DOT_DEPLOY_VERBOSE=1` env var** writes every bulletin-deploy log line (chunk progress, broadcast / included / finalized transitions, nonce traces, RPC reconnects) to stderr with a `[+<seconds>s]` timestamp. Previously the interceptor swallowed everything that wasn't a phase banner or `[N/M]` chunk line to keep the TUI clean; that made "deploy froze at chunk 2/6" reports diagnostically opaque. Pair with `DOT_MEMORY_TRACE=1` to correlate log events with RSS growth.
- **Asset Hub client is now destroyed immediately after preflight** instead of lingering until deploy cleanup. Nothing in the deploy flow (build, bulletin-deploy's storage + DotNS, our playground publish) uses it between preflight and the publish step — and holding an idle polkadot-api client with a live best-block subscription for the full deploy window was measurable background pressure. Playground publish calls `getConnection()` which auto-re-establishes a fresh client at that point.
- **Phone-signer approval count now matches reality.** For a PoP-gated name registered with a signer below the required tier, bulletin-deploy submits an extra `setUserPopStatus` tx before `register()`, so `dot deploy --signer phone --playground` actually requests 5 signatures, not 4. The summary card used to advertise "4 approvals" and the phone prompt later said "approve step 5 of 4". Fixed by predicting `needsPopUpgrade` during the availability check (via `getUserPopStatus` + mirrored `simulateUserStatus` logic) and threading that prediction into `resolveSignerSetup`, so the approvals list (and the derived summary, and the signing-proxy labels) is variable-length. Also added a belt-and-braces clamp in `createSigningCounter` that grows `total` when `step > total`, so even if the prediction is wrong for any reason the TUI never shows "step 5 of 4" again.
- **Re-deploy path now shows a minimal phone tap count.** When the availability check reports the domain is already owned by the signer, bulletin-deploy skips `register()` entirely and only fires `setContenthash`. The summary card and counter now reflect that (1 DotNS tap instead of 3).
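The watchdog bullet above can be illustrated with a minimal sketch. This is not the PR's implementation, which is not part of the visible diff. The names `exceedsCap`, `startWatchdog`, and `WORKER_SOURCE`, the stderr message text, and the choice to abort with `process.kill(process.pid, "SIGTERM")` are all assumptions made for illustration. Only the 4 GB cap, the 1 s interval, and the use of a `worker_threads` Worker come from the changeset.

```typescript
import { Worker } from "node:worker_threads";

const CAP_BYTES = 4 * 1024 * 1024 * 1024; // 4 GB cap named in the changeset
const SAMPLE_MS = 1000; // 1 s sampling interval named in the changeset

/** Pure threshold check, kept separate so it can be tested without a worker. */
function exceedsCap(rssBytes: number, capBytes: number = CAP_BYTES): boolean {
    return rssBytes > capBytes;
}

// Worker body as plain JavaScript, evaluated with `eval: true`.
// Assumption: the worker aborts by signalling the whole process. How the real
// watchdog aborts is not shown in the diff.
const WORKER_SOURCE = `
const { workerData } = require("node:worker_threads");
const fs = require("node:fs");
setInterval(() => {
    const rss = process.memoryUsage().rss; // RSS is process-wide, not per-thread
    if (rss > workerData.capBytes) {
        fs.writeSync(2, "memory watchdog: RSS " + rss + " bytes exceeded cap, aborting\\n");
        process.kill(process.pid, "SIGTERM");
    }
}, workerData.sampleMs);
`;

/** Starts the sampling loop on its own thread and event loop. */
function startWatchdog(capBytes: number = CAP_BYTES, sampleMs: number = SAMPLE_MS): Worker {
    const worker = new Worker(WORKER_SOURCE, {
        eval: true,
        workerData: { capBytes, sampleMs },
    });
    worker.unref(); // do not keep the process alive just for the watchdog
    return worker;
}
```

The point of the design is that the `setInterval` inside the worker is scheduled by the worker's own event loop, so a main thread that never yields to its macrotask queue cannot delay the samples.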
16 changes: 15 additions & 1 deletion src/commands/deploy/DeployScreen.tsx
@@ -21,6 +21,7 @@ import {
type DeployEvent,
type DeployOutcome,
type DeployPhase,
type DeployPlan,
type SignerMode,
type DeployApproval,
type SigningEvent,
@@ -72,6 +73,10 @@ export function DeployScreen({
const [domain, setDomain] = useState<string | null>(initialDomain);
const [publishToPlayground, setPublishToPlayground] = useState<boolean | null>(initialPublish);
const [domainError, setDomainError] = useState<string | null>(null);
// Captured from the availability check; feeds `resolveSignerSetup` so
// the summary card shows the correct phone-approval count (register +
// PoP upgrade = 4 DotNS taps, vs register alone = 3, vs update = 1).
const [plan, setPlan] = useState<DeployPlan | null>(null);
const [stage, setStage] = useState<Stage>(() =>
pickInitialStage(initialMode, initialBuildDir, initialDomain, initialPublish),
);
@@ -168,6 +173,7 @@ export function DeployScreen({
ownerSs58Address={userSigner?.address}
onAvailable={(result) => {
setDomain(result.fullDomain);
setPlan(result.plan);
advance(mode, buildDir, result.fullDomain);
}}
onUnavailable={(reason) => {
@@ -196,6 +202,7 @@ export function DeployScreen({
<ConfirmStage
inputs={resolved}
userSigner={userSigner}
plan={plan}
onProceed={() => setStage({ kind: "running" })}
onCancel={() => {
onDone(null);
@@ -208,6 +215,7 @@ export function DeployScreen({
projectDir={projectDir}
inputs={resolved}
userSigner={userSigner}
plan={plan}
onFinish={(outcome, chunkTimings) => {
setStage({ kind: "done", outcome });
// Surface completion on the terminal tab so users can glance over.
@@ -331,11 +339,13 @@ function ValidateDomainStage({
function ConfirmStage({
inputs,
userSigner,
plan,
onProceed,
onCancel,
}: {
inputs: Resolved;
userSigner: ResolvedSigner | null;
plan: DeployPlan | null;
onProceed: () => void;
onCancel: () => void;
}) {
@@ -345,14 +355,15 @@ function ConfirmStage({
mode: inputs.mode,
userSigner,
publishToPlayground: inputs.publishToPlayground,
plan: plan ?? undefined,
});
} catch (err) {
return {
approvals: [] as DeployApproval[],
error: err instanceof Error ? err.message : String(err),
};
}
-}, [inputs, userSigner]);
+}, [inputs, userSigner, plan]);

const view = buildSummaryView({
mode: inputs.mode,
@@ -442,12 +453,14 @@ function RunningStage({
projectDir,
inputs,
userSigner,
plan,
onFinish,
onError,
}: {
projectDir: string;
inputs: Resolved;
userSigner: ResolvedSigner | null;
plan: DeployPlan | null;
onFinish: (outcome: DeployOutcome, chunkTimings: number[]) => void;
onError: (message: string) => void;
}) {
@@ -509,6 +522,7 @@ function RunningStage({
mode: inputs.mode,
publishToPlayground: inputs.publishToPlayground,
userSigner,
plan: plan ?? undefined,
onEvent: (event) => handleEvent(event),
});
if (!cancelled) onFinish(outcome, chunkTimingsRef.current);
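The `plan` threaded through `ConfirmStage` and `RunningStage` determines how many phone approvals are shown. A minimal sketch of that arithmetic follows, using only the counts stated in the changeset and the `DeployScreen` comment (update = 1 DotNS tap, register = 3, register + PoP upgrade = 4). The helper names `dotnsTapCount` and `totalPhoneApprovals` are hypothetical, and the single extra approval for playground publish is inferred from the "5 sigs, not 4" figure rather than shown in the diff.

```typescript
// Shape taken from the test expectations in availability.test.ts.
type DeployPlan = { action: "register" | "update"; needsPopUpgrade: boolean };

/** DotNS taps implied by the plan: 1 for update, 3 for register, 4 with a PoP upgrade. */
function dotnsTapCount(plan: DeployPlan): number {
    if (plan.action === "update") return 1; // only setContenthash fires
    return plan.needsPopUpgrade ? 4 : 3; // extra setUserPopStatus tx before register()
}

/** Assumption: playground publish adds exactly one more approval. */
function totalPhoneApprovals(plan: DeployPlan, publishToPlayground: boolean): number {
    return dotnsTapCount(plan) + (publishToPlayground ? 1 : 0);
}
```

With `{ action: "register", needsPopUpgrade: true }` and playground publish enabled, this yields the 5 approvals that the old summary card under-reported as 4.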
12 changes: 12 additions & 0 deletions src/commands/deploy/index.ts
@@ -93,6 +93,16 @@ export const deployCommand = new Command("deploy")
return;
}

// Release the Asset Hub client we opened for preflight mapping +
// allowance checks. Nothing else in the deploy path (build, chunk
// upload, bulletin-deploy's own DotNS preflight + registration)
// touches `getConnection()` — and holding an idle polkadot-api client
// with a live best-block subscription for the entire deploy window
// was a measurable contributor to background memory pressure. The
// playground publish step calls `getConnection()` which auto-creates
// a fresh client at that point.
destroyConnection();

try {
const nonInteractive = isFullySpecified(opts);
if (nonInteractive) {
@@ -223,6 +233,7 @@ async function runHeadless(ctx: {
mode,
userSigner: ctx.userSigner,
publishToPlayground,
plan: availability.plan,
});
const view = buildSummaryView({
mode,
@@ -240,6 +251,7 @@ async function runHeadless(ctx: {
mode,
publishToPlayground,
userSigner: ctx.userSigner,
plan: availability.plan,
env: ctx.env,
onEvent: (event) => logHeadlessEvent(event),
});
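The early `destroyConnection()` call in this file is only safe if `getConnection()` lazily recreates the client, as the code comment states. The sketch below shows one way such a pair could behave. The function names come from the diff, but their bodies, the `Client` interface, and `createClient` are hypothetical stand-ins, not the project's code.

```typescript
// Hypothetical stand-in for a polkadot-api client handle.
interface Client {
    id: number;
    destroy(): void;
}

let current: Client | null = null;
let createdCount = 0;

// Hypothetical factory. A real one would open the RPC connection.
function createClient(): Client {
    createdCount += 1;
    return { id: createdCount, destroy() {} };
}

/** Returns the cached client, creating one on first use or after a destroy. */
function getConnection(): Client {
    if (current === null) current = createClient();
    return current;
}

/** Releases the client so nothing idles between preflight and publish. */
function destroyConnection(): void {
    if (current !== null) {
        current.destroy();
        current = null;
    }
}
```

Under this shape, calling `destroyConnection()` after preflight costs one reconnect at publish time and frees the subscription for the whole build and upload window.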
89 changes: 85 additions & 4 deletions src/utils/deploy/availability.test.ts
@@ -3,9 +3,13 @@ import { describe, it, expect, vi } from "vitest";
// Mock bulletin-deploy's DotNS class. Ownership check is now driven by the
// caller's H160 (derived from SS58 via `@polkadot-apps/address::ss58ToH160`),
// so the mock needs to reflect the full `{ owned, owner }` shape the caller
-// sees when they DO pass a user address.
+// sees when they DO pass a user address. `getUserPopStatus` + `isTestnet`
+// feed the `needsPopUpgrade` prediction that's threaded into the summary
+// card's phone-approval count.
const classifyName = vi.fn();
const checkOwnership = vi.fn();
const getUserPopStatus = vi.fn(async () => 2); // default: user already has Full PoP → no upgrade fires
const isTestnet = vi.fn(async () => true);
const connect = vi.fn(async () => {});
const disconnect = vi.fn();

@@ -14,6 +18,8 @@ vi.mock("bulletin-deploy", () => ({
connect,
classifyName,
checkOwnership,
getUserPopStatus,
isTestnet,
disconnect,
})),
}));
@@ -27,6 +33,10 @@ import { checkDomainAvailability, formatAvailability } from "./availability.js";
beforeEach(() => {
classifyName.mockReset();
checkOwnership.mockReset();
getUserPopStatus.mockReset();
getUserPopStatus.mockResolvedValue(2); // default: Full PoP → no upgrade
isTestnet.mockReset();
isTestnet.mockResolvedValue(true);
connect.mockClear();
disconnect.mockClear();
});
@@ -39,10 +49,15 @@ describe("checkDomainAvailability", () => {
classifyName.mockResolvedValue({ requiredStatus: 0, message: "" });

const result = await checkDomainAvailability("my-app");
// No ownerSs58Address passed → we can't check user's current PoP, so
// we default the plan to the common path (register + no PoP upgrade).
// The signing counter's clamp-up behavior fixes the summary at
// runtime if we under-estimated.
expect(result).toEqual({
status: "available",
label: "my-app",
fullDomain: "my-app.dot",
plan: { action: "register", needsPopUpgrade: false },
});
});
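The comment in this test leans on the signing counter's clamp-up behavior to recover from an under-estimated plan. A minimal sketch of that rule follows. Only the name `createSigningCounter` and the rule "grow `total` when `step > total`" come from the changeset. The `observe` method and its return shape are hypothetical.

```typescript
interface SigningProgress {
    step: number;
    total: number;
}

/** Tracks "step N of M" and never lets N exceed M. */
function createSigningCounter(predictedTotal: number) {
    let total = predictedTotal;
    return {
        // Hypothetical method: called each time a signature request arrives.
        observe(step: number): SigningProgress {
            if (step > total) total = step; // clamp up: prediction was too low
            return { step, total };
        },
    };
}
```

If the plan predicted 4 approvals and a fifth request arrives, `observe(5)` reports step 5 of 5 rather than step 5 of 4, and the raised total is kept for later calls.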

@@ -142,6 +157,65 @@ describe("checkDomainAvailability", () => {
}
});

it("predicts needsPopUpgrade=true when the user's current PoP is below what the label demands", async () => {
// Regression: the TUI used to hard-code "3 DotNS taps" for phone mode
// and print "step 5 of 4" on the phone prompt once bulletin-deploy
// fired its `setUserPopStatus` tx. Fix: plumb the predicted PoP
// transition from availability → `resolveSignerSetup` so the summary
// card and runtime counter agree on the real count.
classifyName.mockResolvedValue({ requiredStatus: 2, message: "PoP Full" }); // name wants Full
checkOwnership.mockResolvedValue({ owned: false, owner: null });
getUserPopStatus.mockResolvedValue(1); // user only has Lite
isTestnet.mockResolvedValue(true);

const result = await checkDomainAvailability("short", { ownerSs58Address: ALICE_SS58 });
expect(result.status).toBe("available");
if (result.status === "available") {
expect(result.plan).toEqual({ action: "register", needsPopUpgrade: true });
}
});

it("predicts needsPopUpgrade=false when the user already has ≥ the required PoP", async () => {
classifyName.mockResolvedValue({ requiredStatus: 2, message: "PoP Full" });
checkOwnership.mockResolvedValue({ owned: false, owner: null });
getUserPopStatus.mockResolvedValue(2); // already Full
isTestnet.mockResolvedValue(true);

const result = await checkDomainAvailability("short", { ownerSs58Address: ALICE_SS58 });
if (result.status === "available") {
expect(result.plan.needsPopUpgrade).toBe(false);
}
});

it("re-deploy: plan is { action: 'update', needsPopUpgrade: false } — only setContenthash fires", async () => {
classifyName.mockResolvedValue({ requiredStatus: 0, message: "" });
checkOwnership.mockImplementation(async (_label: string, checkAddress: string) => ({
owned: true,
owner: checkAddress,
}));

const result = await checkDomainAvailability("my-existing-site", {
ownerSs58Address: ALICE_SS58,
});
if (result.status === "available") {
expect(result.plan).toEqual({ action: "update", needsPopUpgrade: false });
}
});

it("falls back to a safe default when getUserPopStatus throws", async () => {
// RPC flake on the PoP query shouldn't block the whole availability
// check — under-counting is recoverable via the counter's clamp.
classifyName.mockResolvedValue({ requiredStatus: 2, message: "PoP Full" });
checkOwnership.mockResolvedValue({ owned: false, owner: null });
getUserPopStatus.mockRejectedValue(new Error("RPC hiccup"));

const result = await checkDomainAvailability("short", { ownerSs58Address: ALICE_SS58 });
expect(result.status).toBe("available");
if (result.status === "available") {
expect(result.plan).toEqual({ action: "register", needsPopUpgrade: false });
}
});

it("returns 'unknown' and disconnects when the RPC call throws", async () => {
classifyName.mockRejectedValue(new Error("RPC down"));

@@ -159,9 +233,15 @@ describe("checkDomainAvailability", () => {

describe("formatAvailability", () => {
it("renders a friendly sentence for each result kind", () => {
-expect(formatAvailability({ status: "available", label: "x", fullDomain: "x.dot" })).toBe(
-"x.dot is available",
-);
+const freshRegisterPlan = { action: "register" as const, needsPopUpgrade: false };
+expect(
+formatAvailability({
+status: "available",
+label: "x",
+fullDomain: "x.dot",
+plan: freshRegisterPlan,
+}),
+).toBe("x.dot is available");
expect(
formatAvailability({
status: "reserved",
@@ -176,6 +256,7 @@ describe("formatAvailability", () => {
label: "x",
fullDomain: "x.dot",
note: "Requires Proof of Personhood (Lite). Will be set up automatically.",
plan: freshRegisterPlan,
}),
).toMatch(/Proof of Personhood \(Lite\)/);
expect(
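The plan expectations exercised by the tests in this file can be summarized as one small decision function. This is a sketch inferred from the test cases only. The real logic mirrors bulletin-deploy's `simulateUserStatus` and also consults `isTestnet`, neither of which is visible in the diff. The name `predictPlan` and its input shape are hypothetical. Status values follow the test comments (1 = Lite, 2 = Full).

```typescript
type DeployPlan = { action: "register" | "update"; needsPopUpgrade: boolean };

interface PlanInput {
    ownedBySigner: boolean; // availability check says the signer already owns the name
    requiredStatus: number; // PoP tier the label demands, e.g. 0, 1 (Lite), 2 (Full)
    userStatus: number | null; // null when no address was given or the PoP query failed
}

function predictPlan(input: PlanInput): DeployPlan {
    // Re-deploy: register() is skipped, so no PoP upgrade can fire.
    if (input.ownedBySigner) return { action: "update", needsPopUpgrade: false };
    // Unknown PoP status: fall back to the common path and let the counter clamp up.
    if (input.userStatus === null) return { action: "register", needsPopUpgrade: false };
    return { action: "register", needsPopUpgrade: input.userStatus < input.requiredStatus };
}
```

Defaulting to `needsPopUpgrade: false` on unknown status matches the tests' stated rationale that under-counting is recoverable at runtime.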