[FEA][one-click] Validate N+1 compute cluster registration with one-click deployment #17

@sbaum1994

Is your feature request related to a problem? Please describe.

The one-click quickstart supports a control plane plus GPU cluster layout, and the self-managed stack separates the control-plane lifecycle from the per-cluster NVCA operator lifecycle. We need explicit validation that adding another compute cluster after the first install works end to end.

This matters because a self-managed NVCF control plane should support more than one GPU cluster without clobbering registration state or reinstalling the control plane.

Describe the solution you'd like

Validate and document the N+1 cluster flow for one-click deployment.

The validation should cover:

  • Install control plane and compute cluster A.
  • Add compute cluster B using a separate kube context and cluster name.
  • Confirm both clusters remain registered and healthy.
  • Confirm functions can deploy to the expected cluster target.
  • Confirm uninstalling cluster B does not remove cluster A or the control plane.
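
The checks above could be encoded as a small pass/fail routine. The following is only a sketch: the `Snapshot` type, the `validate_n_plus_one` function, and the default cluster names are hypothetical and do not reflect any existing NVCF tooling or the output format of `nvcf-cli`. It assumes that something else collects the state after each step and hands it in as plain data. The function-placement check (functions deploying to the expected cluster target) is not covered.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical view of the deployment captured after one step."""
    control_plane_healthy: bool
    # Maps cluster name -> True when that cluster is registered and healthy.
    clusters: dict = field(default_factory=dict)


def validate_n_plus_one(after_a, after_b, after_uninstall_b,
                        a="cluster-a", b="cluster-b"):
    """Return failure messages; an empty list means every check passed."""
    failures = []

    # Control plane and compute cluster A are installed.
    if not after_a.control_plane_healthy:
        failures.append("control plane unhealthy after first install")
    if not after_a.clusters.get(a, False):
        failures.append(f"{a} not registered and healthy after first install")

    # Adding B must leave both clusters registered and healthy.
    for name in (a, b):
        if not after_b.clusters.get(name, False):
            failures.append(f"{name} not registered and healthy after adding {b}")

    # Uninstalling B must not remove A or the control plane.
    if b in after_uninstall_b.clusters:
        failures.append(f"{b} still registered after uninstall")
    if not after_uninstall_b.clusters.get(a, False):
        failures.append(f"{a} lost after uninstalling {b}")
    if not after_uninstall_b.control_plane_healthy:
        failures.append(f"control plane unhealthy after uninstalling {b}")

    return failures


# Example with hand-written snapshots (assumed data, not real output).
failures = validate_n_plus_one(
    Snapshot(True, {"cluster-a": True}),
    Snapshot(True, {"cluster-a": True, "cluster-b": True}),
    Snapshot(True, {"cluster-a": True}),
)
```

Returning a list of messages rather than raising on the first problem lets one run report every broken step at once, which is usually more useful when documenting a verified workflow.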

Describe alternatives you've considered

Operators can use the manual Helmfile and NVCA registration flow, but one-click is the preferred fresh-install path and should support the common multi-cluster expansion workflow.

Additional context

Suggested acceptance criteria:

  • The one-click flow succeeds for the first compute cluster.
  • The one-click flow or a follow-up command succeeds for the second compute cluster.
  • Cluster-specific registration values remain isolated.
  • nvcf-cli self-hosted status reports both clusters as registered and healthy.
  • Docs include the verified N+1 workflow and cleanup commands.
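
One way to make the "registration values remain isolated" criterion testable is to compare the per-cluster values directly. This is a rough sketch under assumptions: the key names (`cluster_name`, `kube_context`) and the `find_shared_values` helper are hypothetical, chosen only because the flow calls for a separate kube context and cluster name per cluster. The real registration values may be shaped differently.

```python
def find_shared_values(values_a, values_b, unique_keys):
    """Return problems for keys that should differ between two clusters.

    values_a / values_b: dicts of registration values for each cluster
    (hypothetical shape). unique_keys: keys that must be present in both
    and must not hold the same value.
    """
    problems = []
    for key in unique_keys:
        if key not in values_a or key not in values_b:
            problems.append(f"{key}: missing from one cluster")
        elif values_a[key] == values_b[key]:
            problems.append(f"{key}: both clusters use {values_a[key]!r}")
    return problems


# Assumed example values, for illustration only.
cluster_a = {"cluster_name": "gpu-a", "kube_context": "ctx-a"}
cluster_b = {"cluster_name": "gpu-b", "kube_context": "ctx-a"}

problems = find_shared_values(
    cluster_a, cluster_b, unique_keys=["cluster_name", "kube_context"]
)
```

Here the two clusters share a kube context, so `problems` contains one entry for `kube_context`; an empty list would indicate the values are isolated for the keys checked.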

Labels: enhancement
