Feat/cluster ops suite by OneNoted · Pull Request #4 · OneNoted/pvt

OneNoted · 2026-04-20T15:54:34Z

Expands pvt into a usable operational surface for Proxmox + Talos clusters and replaces the Zig TUI with a Rust Ratatui TUI, because I like Ratatui, sorry!

TUI rewrite — vitui reimplemented in Rust; preserves the four operational views, the pvt.yaml
contract, and the pvt tui entrypoint.
Centralized health — shared health snapshot for configured clusters; status now reads from the same
path, and a new doctor command covers local setup checks.
Drift + machine config diffs — plan-only drift detection with known remediation output and normalized
Talos machine config diffs so operators can review desired vs. live state before acting.
Safe backup & upgrade workflows — Proxmox backup views/pruning scoped to configured VMIDs, plan-first
node lifecycle commands, and upgrade preflight/postflight reports that fail on unhealthy results and honor
the configured health timeout.
Cleanup — removed stale status helpers left by the health migration, reused shared config loading in
upgrade, made backup VMID filtering directly testable.

Design notes

Remediation is plan-only — applying qm/talosctl changes is environment-specific and gated on
operator approval.
Lifecycle commands use structured argv (not reparsed display strings) so config values can't leak as
extra flags.
Talos config path and kubeconfig are kept distinct; talos.config_path is never reused as kubectl --kubeconfig.
No secret-bearing args on the Proxmox API auth command line.
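The structured-argv note can be illustrated with a minimal sketch. This is not pvt's actual code: `PlannedCommand` and `drainPlan` are hypothetical names, and the only assumption about kubectl is its documented `drain` flags plus the conventional `--` end-of-flags marker. The point is that the display string is derived from the argv and never parsed back into a command.

```go
package main

import (
	"fmt"
	"strings"
)

// PlannedCommand is a hypothetical plan entry. Argv is the only executable
// form; the display string is derived from it and never parsed back.
type PlannedCommand struct {
	Argv []string
}

// Display renders the plan for operator review.
func (p PlannedCommand) Display() string {
	return strings.Join(p.Argv, " ")
}

// drainPlan builds a kubectl drain plan. The node name comes from config and
// stays a single argv element placed after "--", so it cannot add flags.
func drainPlan(kubeconfig, nodeName string) PlannedCommand {
	return PlannedCommand{Argv: []string{
		"kubectl", "--kubeconfig", kubeconfig,
		"drain", "--ignore-daemonsets", "--delete-emptydir-data",
		"--", nodeName,
	}}
}

func main() {
	// A malformed config value stays one argument instead of becoming a flag.
	plan := drainPlan("/home/op/.kube/config", "node-1 --force")
	fmt.Println(plan.Display())
	fmt.Printf("%d argv elements, last = %q\n", len(plan.Argv), plan.Argv[len(plan.Argv)-1])
}
```

If the display string were split on whitespace and executed instead, the same config value would turn into a node name plus a `--force` flag, which is the failure mode the note describes. The kubeconfig path is passed explicitly here, separately from any Talos config path.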

…tui app
The existing TUI path was unstable enough to be effectively unusable, so this change rewrites vitui in Rust, keeps the four operational views, preserves the pvt.yaml contract, and rewires `pvt tui` to launch the new binary.
Constraint: Preserve the existing pvt config contract and `pvt tui` entrypoint while removing the Zig TUI as the primary path
Rejected: Full Go-to-Rust CLI rewrite now | too broad for the immediate stability problem
Rejected: Native HTTP/SDK rewrites for every integration | higher risk than parity-first subprocess/API compatibility
Confidence: medium
Scope-risk: broad
Reversibility: messy
Directive: Keep Talos config and kubeconfig handling distinct; do not reuse `talos.config_path` as `kubectl --kubeconfig`
Directive: Do not reintroduce secret-bearing command-line args for Proxmox API auth
Tested: cargo test; cargo build --release; cargo run -- --help; go test ./...
Not-tested: Live Proxmox/Talos/Kubernetes connectivity against a real cluster
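One way to honor the "no secret-bearing command-line args" directive is to keep the token out of any argv and attach it as a request header. The sketch below is an assumption about how that could look, not how pvt does it: `newVersionRequest`, the `PVT_PROXMOX_TOKEN_SECRET` variable, the host name, and the token ID are all hypothetical. The `PVEAPIToken=` header format and the `/api2/json/version` path follow the Proxmox VE API documentation.

```go
package main

import (
	"fmt"
	"net/http"
	"os"
)

// newVersionRequest builds a request for the Proxmox version endpoint.
// The token travels in the Authorization header, not in any argv.
func newVersionRequest(baseURL, tokenID, secret string) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/api2/json/version", nil)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "PVEAPIToken="+tokenID+"="+secret)
	return req, nil
}

func main() {
	// PVT_PROXMOX_TOKEN_SECRET is a hypothetical variable name for this sketch.
	secret := os.Getenv("PVT_PROXMOX_TOKEN_SECRET")
	req, err := newVersionRequest("https://pve.example.internal:8006", "ops@pve!pvt", secret)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	// Print the target only; the header value is deliberately not logged.
	fmt.Println(req.Method, req.URL.String())
}
```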

Create a shared health snapshot for configured clusters and add a doctor command for local setup checks. Status now reads from the same snapshot so later commands can build on one observation path.
Constraint: Live Proxmox and Talos calls must degrade into reportable warnings where possible.
Rejected: Keep status on a separate Talos-only path | that would duplicate health collection and weaken later drift checks.
Confidence: medium
Scope-risk: moderate
Tested: go test ./...; go build ./...; go vet ./...
Not-tested: Live Proxmox, Talos, and Kubernetes infrastructure
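The "degrade into reportable warnings" constraint can be sketched as a collector that never aborts on a failed probe. The types and names below (`Check`, `Snapshot`, `Collect`, `errUnreachable`) are hypothetical and only illustrate the shape of a single observation path that both `status` and `doctor` style commands could read from.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnreachable stands in for any live-call failure in this sketch.
var errUnreachable = errors.New("unreachable")

// Check is a hypothetical named probe against one backend.
type Check struct {
	Name string
	Run  func() error
}

// Snapshot is a hypothetical shared result that several commands can read.
type Snapshot struct {
	Healthy  []string
	Warnings []string
}

// Collect runs every check. A failing probe becomes a warning entry and the
// remaining checks still run, so one dead backend does not hide the others.
func Collect(checks []Check) Snapshot {
	var snap Snapshot
	for _, c := range checks {
		if err := c.Run(); err != nil {
			snap.Warnings = append(snap.Warnings, fmt.Sprintf("%s: %v", c.Name, err))
			continue
		}
		snap.Healthy = append(snap.Healthy, c.Name)
	}
	return snap
}

func main() {
	snap := Collect([]Check{
		{Name: "proxmox", Run: func() error { return nil }},
		{Name: "talos", Run: func() error { return errUnreachable }},
	})
	fmt.Println("healthy:", snap.Healthy)
	fmt.Println("warnings:", snap.Warnings)
}
```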

Add drift detection, known remediation plan output, and normalized Talos machine config diffing so operators can compare desired and live-adjacent state before making changes.
Constraint: Remediation must remain plan-only because applying qm/talosctl changes is environment-specific.
Rejected: Auto-apply drift fixes | too destructive without a cluster-specific approval model.
Confidence: medium
Scope-risk: moderate
Tested: go test ./...; go build ./...; go vet ./...
Not-tested: Live drift against production Proxmox or Talos clusters
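Plan-only drift detection can be pictured as a pure comparison that returns findings plus a suggested command and never executes anything. The sketch below assumes a hypothetical `VMSpec` with two fields; the real pvt config schema may differ. The suggested remediation uses the documented `qm set <vmid> --cores/--memory` form and is kept as an argv slice for display.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// VMSpec is a hypothetical subset of VM settings used for comparison.
type VMSpec struct {
	VMID     int
	Cores    int
	MemoryMB int
}

// Drift records one mismatch and a suggested remediation that is only shown.
type Drift struct {
	Field       string
	Want, Got   int
	Remediation []string
}

// DetectDrift compares desired and live settings. It has no side effects:
// nothing here runs qm, it only describes what an operator could run.
func DetectDrift(desired, live VMSpec) []Drift {
	var out []Drift
	id := strconv.Itoa(desired.VMID)
	if desired.Cores != live.Cores {
		out = append(out, Drift{
			Field: "cores", Want: desired.Cores, Got: live.Cores,
			Remediation: []string{"qm", "set", id, "--cores", strconv.Itoa(desired.Cores)},
		})
	}
	if desired.MemoryMB != live.MemoryMB {
		out = append(out, Drift{
			Field: "memory", Want: desired.MemoryMB, Got: live.MemoryMB,
			Remediation: []string{"qm", "set", id, "--memory", strconv.Itoa(desired.MemoryMB)},
		})
	}
	return out
}

func main() {
	desired := VMSpec{VMID: 100, Cores: 4, MemoryMB: 8192}
	live := VMSpec{VMID: 100, Cores: 2, MemoryMB: 8192}
	for _, d := range DetectDrift(desired, live) {
		fmt.Printf("%s: want %d, got %d; plan: %s\n",
			d.Field, d.Want, d.Got, strings.Join(d.Remediation, " "))
	}
}
```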

Round out the Go operational surface with Proxmox backup retention commands, plan-first node lifecycle commands, and upgrade preflight/postflight reports. Backup views and pruning are scoped to configured VMIDs, postflight reports fail on unhealthy results, and the upgrade safety gate honors the configured health timeout.
Constraint: Node and backup operations can be destructive in real clusters.
Constraint: gRPC v1.76.0 has a critical advisory and must be updated.
Rejected: Enable lifecycle mutation by default | operators need reviewable plans before host or backup changes.
Rejected: Reparse executable lifecycle commands from display strings | structured argv avoids config values becoming extra flags.
Confidence: medium
Scope-risk: broad
Tested: gofmt clean; go test -count=1 ./...; go build ./...; go vet ./...; cargo fmt --check; cargo test; cargo check; cargo clippy --all-targets -- -D warnings; validator approvals from architect/security/code-reviewer
Not-tested: Live backup deletion, kubectl drain, talosctl reboot, or live upgrade reports
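"Postflight reports fail on unhealthy results" can be read as: the report is data, and the process exit status is derived from it. A minimal sketch under that reading follows; `NodeResult`, `Report`, and `ExitCode` are hypothetical names, and the real report format in pvt is not shown in this PR text.

```go
package main

import (
	"fmt"
	"os"
)

// NodeResult is a hypothetical per-node outcome in an upgrade report.
type NodeResult struct {
	Node    string
	Healthy bool
	Detail  string
}

// Report is a hypothetical preflight or postflight report.
type Report struct {
	Phase   string
	Results []NodeResult
}

// Unhealthy returns the results that should block or fail the workflow.
func (r Report) Unhealthy() []NodeResult {
	var bad []NodeResult
	for _, res := range r.Results {
		if !res.Healthy {
			bad = append(bad, res)
		}
	}
	return bad
}

// ExitCode is non-zero when any node is unhealthy, so scripts and CI
// cannot mistake a printed-but-failing report for success.
func (r Report) ExitCode() int {
	if len(r.Unhealthy()) > 0 {
		return 1
	}
	return 0
}

func main() {
	report := Report{Phase: "postflight", Results: []NodeResult{
		{Node: "cp-1", Healthy: true},
		{Node: "worker-2", Healthy: false, Detail: "kubelet not ready"},
	}}
	for _, res := range report.Unhealthy() {
		fmt.Printf("%s: %s unhealthy: %s\n", report.Phase, res.Node, res.Detail)
	}
	os.Exit(report.ExitCode())
}
```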

Tighten the recent operational feature paths without changing behavior. Removed stale status helpers left behind by the health snapshot migration, reused shared config loading in the upgrade command, and made backup VMID filtering directly testable.
Constraint: Preserve behavior locked by existing Go and Rust quality gates.
Rejected: Broader command refactors | outside the cleanup scope and unnecessary for the identified smells.
Confidence: high
Scope-risk: narrow
Tested: gofmt clean; go test -count=1 ./...; go build ./...; go vet ./...; cargo fmt --check; cargo test; cargo check; cargo clippy --all-targets -- -D warnings
Not-tested: Live Proxmox, Talos, and Kubernetes workflows
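Making VMID filtering "directly testable" usually means pulling it out of the command handler into a pure function over plain data. The sketch below shows that shape with hypothetical names (`Backup`, `FilterByVMID`); the volume IDs are illustrative values in the usual Proxmox vzdump naming style, not data from a real cluster.

```go
package main

import "fmt"

// Backup is a hypothetical record for one Proxmox backup volume.
type Backup struct {
	VMID  int
	VolID string
}

// FilterByVMID keeps only backups whose VMID is in the configured set.
// It takes plain values and returns a new slice, so a unit test can call
// it without any Proxmox connection.
func FilterByVMID(backups []Backup, configured []int) []Backup {
	allowed := make(map[int]struct{}, len(configured))
	for _, id := range configured {
		allowed[id] = struct{}{}
	}
	var out []Backup
	for _, b := range backups {
		if _, ok := allowed[b.VMID]; ok {
			out = append(out, b)
		}
	}
	return out
}

func main() {
	all := []Backup{
		{VMID: 100, VolID: "local:backup/vzdump-qemu-100-2026_04_19-02_00_00.vma.zst"},
		{VMID: 250, VolID: "local:backup/vzdump-qemu-250-2026_04_19-02_10_00.vma.zst"},
		{VMID: 101, VolID: "local:backup/vzdump-qemu-101-2026_04_19-02_20_00.vma.zst"},
	}
	// VMID 250 is not in the configured set, so it is never listed or pruned.
	for _, b := range FilterByVMID(all, []int{100, 101}) {
		fmt.Println(b.VMID, b.VolID)
	}
}
```

Scoping both the view and the prune path through one such function keeps unrelated VMs on the same Proxmox host out of reach of retention commands.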

OneNoted added 6 commits April 20, 2026 17:42

fix: prefer config.yaml for tui discovery

a4f939a

OneNoted merged commit 8a8863b into main Apr 20, 2026
1 check passed

OneNoted merged 6 commits into main from feat/cluster-ops-suite
