Container status API: compute live status instead of static DB column#352
Merged
Conversation
runleveldev
commented
Jun 16, 2026
Contributor
There was a problem hiding this comment.
Pull request overview
This PR removes the persisted Containers.status column and replaces it with a live, computed container status derived from Proxmox state, job state, and config drift, while keeping status present on existing container API payloads. It also updates the UI and docs to reflect the expanded status set.
Changes:
- Add a shared status resolver (
utils/container-status.js) and wire it into container list/show/create responses. - Drop
Containers.statusfrom the Sequelize model + DB schema; remove status writes from job scripts/import paths. - Update templates generation filters and frontend types/rendering to support the new live status values.
Reviewed changes
Copilot reviewed 10 out of 13 changed files in this pull request and generated 6 comments.
Show a summary per file
| File | Description |
|---|---|
| create-a-container/utils/container-status.js | New live status resolver (Proxmox snapshot + job lookup + drift detection). |
| create-a-container/routers/templates.js | Switch template container selection from status='running' to a provisioned/proxy predicate. |
| create-a-container/routers/api/v1/nodes.js | Stop persisting status during node/container import. |
| create-a-container/routers/api/v1/containers.js | Embed computed status in list/show; return creating on create response; stop persisting status updates. |
| create-a-container/README.md | Remove status from ER diagram and document the live status field and enum values. |
| create-a-container/openapi.v1.yaml | Add ContainerStatus enum schema and reference it from Container.status. |
| create-a-container/models/container.js | Remove status attribute from the Container model definition. |
| create-a-container/migrations/20260615000000-remove-container-status.js | Migration to drop the status column (with rollback restoring a default). |
| create-a-container/client/src/pages/containers/ContainersListPage.tsx | Render typed live status via StatusBadge and map new states to badge variants/labels. |
| create-a-container/client/src/lib/types.ts | Introduce ContainerStatus union and type container/status-bearing payloads accordingly. |
| create-a-container/bin/reconfigure-container.js | Stop updating container status; keep MAC/IP updates and adjust logs. |
| create-a-container/bin/json-to-sql.js | Stop setting/updating container status during JSON import/update. |
| create-a-container/bin/create-container.js | Replace the old “pending” guard with a VMID presence guard; stop persisting creating/running/failed status. |
ccc31d4 to
793ba81
Compare
91a3ddf to
e790cfb
Compare
runleveldev
added a commit
that referenced
this pull request
Jun 19, 2026
… + DummyApi
Run the Manager standalone on localhost (SQLite, `make dev`) and create
containers without Proxmox, keeping the real provisioning code path.
- NodeApi abstraction: Node.api() dispatches on a new `nodeType` column.
`proxmox` (default) returns ProxmoxApi; `dummy` returns a new DummyApi that
implements the same surface and simulates Proxmox calls. Creation still runs
POST /containers -> Job -> job-runner -> bin/create-container.js -> node.api().
- DummyApi.clusterResources('lxc') derives the running set from the DB so the
live container-status resolver (#352) reports dummy containers correctly;
dev-bootstrap gives the dummy node placeholder creds so nodeHasCreds() passes.
- Provider-agnostic node selection that only excludes unprovisionable nodes
(dummy, or proxmox with apiUrl/tokenId/secret); DELETE routes teardown through
node.api(). Node.api() validates apiUrl alongside credentials.
- Postgres-safe migration adding `nodeType` (existing rows default to proxmox).
- Fix the erroneous UNIQUE on Containers.nodeId via patch-package (v6 backport of
sequelize/sequelize#17583); replaces the SQLite-specific migration.
- bin/dev-bootstrap.js seeds a localhost site + dummy node + external domain,
no-op if any Site exists, so it never clobbers the docker-compose bootstrap.
- Make idempotent seeders (groups, push-notification) so re-running them in
`make dev` doesn't violate unique constraints.
- Gate Sequelize SQL query logging behind LOG_LEVEL=trace; off otherwise.
- Replace the unmaintained sqlite3 with @vscode/sqlite3 (dev-only, builds from
source via dialectModule) so there is no prebuilt-binary glibc mismatch; prod
(Postgres) never installs or loads it.
- `make dev` lives in create-a-container/Makefile (root delegates): self-contained
install (--no-package-lock, so it never rewrites the lockfile), migrate, seed,
then run server (nodemon) + job-runner + client watch together.
- Docs: "Run the Manager Locally (make dev)" section; README + release-pipeline
pointers; LOG_LEVEL documented in example.env.
Resolves #350. Container status was a static column updated only by the create/reconfigure job scripts, so it drifted from reality whenever Proxmox changed out of band or a job died mid-run. This replaces it with status computed on demand. New utils/container-status.js resolves status from three sources: - Proxmox: does the LXC exist, and is it running/stopped? - Jobs: active create/reconfigure job, or did the last create job fail? - Config: does the live LXC config match what the API server expects? Resolved statuses: running, offline, creating, restarting, failed, missing, out-of-sync (plus unknown when Proxmox is unreachable). New endpoint: GET /api/v1/sites/:siteId/containers/:id/status -> { data: { status } } Non-breaking: the same live status is still embedded on the list/show/create payloads, so existing consumers keep working. The static Containers.status column is dropped (migration + model). All former writers were updated to stop persisting status; templates.js now selects provisioned containers by VMID + IPv4 instead of status='running'. The list page queries the per-container status endpoint (seeded by the list value for instant paint, polling while creating/restarting). OpenAPI and README updated.
Live status is already embedded in the existing list/show/create container payloads, so the dedicated GET /containers/:id/status endpoint is redundant. Per-row polling + hydration on the index page is not needed at this time, so revert the ContainerStatusBadge to a simple presentational badge that renders the live status from the list response. Removes the endpoint, the client query/key/result type, the OpenAPI path, and the README endpoint section (the status-value reference is retained as a field description). The ContainerStatus enum and the live resolver remain.
- out-of-sync is a warning, not a danger, badge variant - templates.js routes containers by ipv4Address only (a VMID isn't required to route)
- container-status.js: match --container-id as a whole argument when looking up
create/reconfigure jobs, so container 12 no longer matches container 123's
jobs (which could misclassify status). LIKE narrows at the DB level; a regex
confirms the id is terminated by a space or end-of-string.
- resource-requests.js: the applyResourceToExistingContainers query filtered on
the now-removed Containers.status column ("Unknown column 'status'" after the
migration). Use containerId != null (a VMID is required to reconfigure anyway).
- README: note the create endpoint also returns the live status field.
The calling convention for create/reconfigure job commands is stable and deterministic, so match the command string exactly in the query rather than fetching LIKE candidates and regex-filtering in JS: - create: exact match on `node bin/create-container.js --container-id=<id>` (create commands never carry trailing args). - reconfigure: exact match OR `<base> %` (exact base followed by a space), which covers the optional ` --<resource>=<value>` flags while still preventing id 12 from matching id 123.
Resolving the container list status was O(N) DB queries because every container ran its own create/reconfigure job lookup (up to 2 queries each). Add prefetchLatestJobs() which fetches the latest create + reconfigure job for the whole set of containers in a single query (bounded WHERE clauses, independent of N), buckets them per container, and feeds the result into computeContainerStatus via an optional `jobs` param. computeContainerStatuses now issues one job query total instead of ~2N. The single-container show path is unchanged (still queries per container when no prefetched jobs are passed).
…ng, out-of-sync)
Resolving the container list was still slow because each container did per-item
work: a command-string job lookup for 'restarting', and a Proxmox lxcConfig call
for 'out-of-sync'. Both are inherently O(N) (Proxmox has no bulk config endpoint),
which dominated index latency.
Drop both statuses:
- 'restarting': not used by known API clients (launchpad only checks == running),
and reconfigure jobs have no FK so detecting it required command-string queries.
- 'out-of-sync': required one lxcConfig call per in-Proxmox container; no bulk
Proxmox config API exists to batch it.
Status now derives only from:
- Proxmox run-state, via one clusterResources('lxc') snapshot per node, and
- the create job, resolved from the eager-loaded creationJob association (or the
creationJobId FK) — no command-string queries.
Result: the index makes one Proxmox call per node and zero per-container Proxmox
or DB job queries (verified at 30 containers). Remaining statuses: running,
offline, creating, failed, missing, unknown. PUT still enqueues a reconfigure job
and returns its jobId; only the 'restarting' status label is removed.
Updates the resolver, containers router (eager-load creationJob on the index),
client types/badges, OpenAPI enum, and README.
e790cfb to
db202a0
Compare
cmyers-mieweb
approved these changes
Jun 22, 2026
cmyers-mieweb
left a comment
Collaborator
There was a problem hiding this comment.
Approved, if deletion occurs asynchronously in the future we'll have to re-review to add the state.
runleveldev
added a commit
that referenced
this pull request
Jun 22, 2026
… + DummyApi
Run the Manager standalone on localhost (SQLite, `make dev`) and create
containers without Proxmox, keeping the real provisioning code path.
- NodeApi abstraction: Node.api() dispatches on a new `nodeType` column.
`proxmox` (default) returns ProxmoxApi; `dummy` returns a new DummyApi that
implements the same surface and simulates Proxmox calls. Creation still runs
POST /containers -> Job -> job-runner -> bin/create-container.js -> node.api().
- DummyApi.clusterResources('lxc') derives the running set from the DB so the
live container-status resolver (#352) reports dummy containers correctly;
dev-bootstrap gives the dummy node placeholder creds so nodeHasCreds() passes.
- Provider-agnostic node selection that only excludes unprovisionable nodes
(dummy, or proxmox with apiUrl/tokenId/secret); DELETE routes teardown through
node.api(). Node.api() validates apiUrl alongside credentials.
- Postgres-safe migration adding `nodeType` (existing rows default to proxmox).
- Fix the erroneous UNIQUE on Containers.nodeId via patch-package (v6 backport of
sequelize/sequelize#17583); replaces the SQLite-specific migration.
- bin/dev-bootstrap.js seeds a localhost site + dummy node + external domain,
no-op if any Site exists, so it never clobbers the docker-compose bootstrap.
- Make idempotent seeders (groups, push-notification) so re-running them in
`make dev` doesn't violate unique constraints.
- Gate Sequelize SQL query logging behind LOG_LEVEL=trace; off otherwise.
- Replace the unmaintained sqlite3 with @vscode/sqlite3 (dev-only, builds from
source via dialectModule) so there is no prebuilt-binary glibc mismatch; prod
(Postgres) never installs or loads it.
- `make dev` lives in create-a-container/Makefile (root delegates): self-contained
install (--no-package-lock, so it never rewrites the lockfile), migrate, seed,
then run server (nodemon) + job-runner + client watch together.
- Docs: "Run the Manager Locally (make dev)" section; README + release-pipeline
pointers; LOG_LEVEL documented in example.env.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Resolves #350.
Summary
Container status was a static
Containers.statuscolumn updated only by the create/reconfigure job scripts, so it drifted from reality whenever Proxmox changed out of band or a job died mid-run. This replaces it with status computed live on demand, embedded directly on the existing container API payloads. The static column is dropped.Live status resolver —
create-a-container/utils/container-status.jsStatus is derived from two cheap sources of truth:
clusterResources('lxc')snapshot per node.creationJobIdforeign key, so this is resolved from the eager-loadedcreationJobassociation (zero extra queries) or a single PK lookup.Resolved statuses:
runningofflinecreatingfailedmissingunknownWhere status is returned
The live
statusis embedded on the existing list, show, and create container payloads — no new endpoint.Performance (why the scope was trimmed)
Resolving the list must stay cheap. Two things were inherently O(N):
restarting) — reconfigure jobs have no FK, so detecting them needed per-container command-string queries.out-of-sync) — Proxmox has no bulk config endpoint, so this needed onelxcConfigcall per in-Proxmox container.Both were dropped. The index now issues one Proxmox call per node and zero per-container Proxmox or DB job queries (verified at 30 containers). The create job comes free from the eager-loaded association.
Compatibility
statusremains present on the same endpoints, so existing consumers keep working. Checkedmieweb/launchpad(an external API client): it only checksstatus == "running"and never usesrestarting/out-of-sync, so dropping them is safe there.maincould emit (maincould returnrestarting), so it's not a strict superset anymore — but no known client depends on the removed values.Removed the static column
20260615000000-remove-container-status.jsdropsContainers.status; removed from the model.create-container.js,reconfigure-container.js,nodes.jsimport,json-to-sql.js). Failure is now captured by the job'sfailurestatus.status !== 'pending'guard became a Proxmox-VMID-presence guard.routers/templates.js(nginx/dnsmasq config) previously filteredwhere: { status: 'running' }; it now selects routable containers byipv4Address != null.routers/api/v1/resource-requests.js(applyResourceToExistingContainers) previously filteredwhere: { ..., status: 'running' }— that would have thrown "Unknown column 'status'" after the migration. It now filters oncontainerId != null(a VMID is required to reconfigure anyway).Frontend
ContainerStatusunion; the list page renders the live status from the list response via a small presentationalStatusBadge(friendly labels + status → badge-variant mapping). No per-row polling.Docs
ContainerStatusenum, referenced from theContainerschema.status) and a container-status field description.Testing
No test framework exists in the repo, so logic was validated with standalone Node assertion scripts (not committed):
node --check; OpenAPI YAML parses; clienttype-checkandbuildpass.Notes / follow-ups
out-of-sync) can be revisited later if Proxmox gains a bulk config endpoint or if drift is computed lazily on the detail view / cached.prove/test harness if the team wants committed tests.