Skip to content

feat: allow engine-controller canister to update cloud engines directly#10431

Open
NikolaMilosa wants to merge 6 commits into
masterfrom
nim-update-subnet
Open

feat: allow engine-controller canister to update cloud engines directly#10431
NikolaMilosa wants to merge 6 commits into
masterfrom
nim-update-subnet

Conversation

@NikolaMilosa

Copy link
Copy Markdown
Contributor

The users should be able to control their own settings for the engines they use. For now we are scoping that to only subnet_admins and replica_version_id but in the future we might want to allow more things.

@NikolaMilosa NikolaMilosa changed the title feat: allow engine-controller canister to call registry to update cloud engine subnet records feat: allow engine-controller canister to update cloud engines directly Jun 10, 2026
@github-actions github-actions Bot added the feat label Jun 10, 2026
/// Builds a registry that already has a `CloudEngine` subnet, suitable
/// for exercising the engine-controller permission checks on
/// `do_update_subnet`.
fn make_registry_with_cloud_engine_subnet() -> (Registry, SubnetId) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the test helpers, we already have such a method: https://github.com/dfinity/ic/blob/master/rs/registry/canister/tests/common/test_helpers.rs#L216
You might want to use that.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those are not accessible here because they are for integration tests 😟

Comment on lines +87 to +88
/// Creates a registry with a single subnet of the given `subnet_type`.
fn make_registry_with_subnet(subnet_type: SubnetType) -> (Registry, SubnetId) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, you could probably use the test_helper method:
https://github.com/dfinity/ic/blob/master/rs/registry/canister/tests/common/test_helpers.rs#L216 it creates a registry with a system and a cloud engine subnet

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But those are integration test level things? I will add the same thing in the shared crate but I cannot use those in unit tests here

Comment thread rs/engine_controller/canister/canister.rs Outdated

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR expands the registry canister’s authorization model so the engine controller canister can directly perform limited subnet-management operations for CloudEngine subnets, enabling user-facing controls to be implemented via the engine controller without governance proposals.

Changes:

  • Allow the engine controller canister to call update_subnet and deploy_guestos_to_all_subnet_nodes on the registry canister.
  • Enforce that engine-controller-initiated calls only target CloudEngine subnets (governance remains unrestricted).
  • Expose proxy payload types and add engine controller API surface + tests to forward these operations.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
rs/registry/canister/unreleased_changelog.md Documents the newly allowed callers and CloudEngine-only restriction.
rs/registry/canister/src/mutations/do_update_subnet.rs Adds caller-aware logic to restrict engine-controller updates to CloudEngine subnets; updates tests accordingly.
rs/registry/canister/src/mutations/do_deploy_guestos_to_all_subnet_nodes.rs Adds caller-aware CloudEngine-only restriction for engine-controller deploys; adds tests.
rs/registry/canister/canister/canister.rs Updates registry canister endpoints to authorize engine controller and pass through the caller principal to mutation handlers.
rs/engine_controller/src/lib.rs Re-exports registry payload types for clients of the engine controller.
rs/engine_controller/engine_controller.did Extends the engine controller public interface with update_subnet and deploy_guestos_to_all_subnet_nodes plus required type definitions.
rs/engine_controller/canister/tests.rs Adds unit tests for the engine controller’s update_subnet payload validation and admin normalization helpers.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread rs/registry/canister/src/mutations/do_update_subnet.rs Outdated
Comment thread rs/registry/canister/src/mutations/do_update_subnet.rs
pietrodimarco-dfinity and others added 2 commits June 11, 2026 16:52
## What

Stacked on top of #10431. That PR lets the engine-controller canister
proxy
`update_subnet`, but its field allow-list
(`ensure_only_subnet_admins_set`)
currently permits **only** `subnet_admins`. This PR widens that
allow-list to
also permit `is_halted`, so the engine controller can halt / unhalt the
cloud-engine subnets it manages.

## Changes

- `rs/engine_controller/canister/canister.rs`
- Renamed `ensure_only_subnet_admins_set` →
`ensure_only_allowed_fields_set`
and moved `is_halted` into the allowed group; every other field still
must
    be left at its default (`None` / `false`) or the call is rejected.
- Updated the validator error message and the `update_subnet` doc
comments.
- `rs/engine_controller/canister/tests.rs`
- Renamed the validator tests, added coverage that `is_halted` (both
`true`
and `false`) and `subnet_admins + is_halted` are accepted, and switched
the
    "rejects other fields" test to assert on a still-disallowed field.

## Scope notes

- The registry side is unchanged: `do_update_subnet` already accepts
`is_halted`
  and only gates the engine controller to `CloudEngine` subnets, so no
  registry-level changes are needed.
- `halt_at_cup_height` is intentionally **not** added — only the direct
`is_halted` flag. Adding a new allowed field stays a conscious,
code-level
  decision, matching the design intent of the base PR.
- No `.did` change: `is_halted` is already part of the mirrored
`UpdateSubnetPayload`, and the candid interface-equality test still
passes.

## Testing

`cargo test -p ic-engine-controller --bin engine-controller-canister` —
all 10
unit tests pass (including the candid interface-match test). `cargo fmt
--check`
is clean.

> Draft because it stacks on the not-yet-merged #10431; base is set to
that
> PR's branch (`nim-update-subnet`).
@NikolaMilosa NikolaMilosa marked this pull request as ready for review June 11, 2026 15:41
@NikolaMilosa NikolaMilosa requested review from a team as code owners June 11, 2026 15:41

@github-actions github-actions Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request changes code owned by the Governance team. Therefore, make sure that
you have considered the following (for Governance-owned code):

  1. Update unreleased_changelog.md (if there are behavior changes, even if they are
    non-breaking).

  2. Are there BREAKING changes?

  3. Is a data migration needed?

  4. Security review?

How to Satisfy This Automatic Review

  1. Go to the bottom of the pull request page.

  2. Look for where it says this bot is requesting changes.

  3. Click the three dots to the right.

  4. Select "Dismiss review".

  5. In the text entry box, respond to each of the numbered items in the previous
    section, declare one of the following:

  • Done.

  • $REASON_WHY_NO_NEED. E.g. for unreleased_changelog.md, "No
    canister behavior changes.", or for item 2, "Existing APIs
    behave as before.".

Brief Guide to "Externally Visible" Changes

"Externally visible behavior change" is very often due to some NEW canister API.

Changes to EXISTING APIs are more likely to be "breaking".

If these changes are breaking, make sure that clients know how to migrate, how to
maintain their continuity of operations.

If your changes are behind a feature flag, then, do NOT add entrie(s) to
unreleased_changelog.md in this PR! But rather, add entrie(s) later, in the PR
that enables these changes in production.

Reference(s)

For a more comprehensive checklist, see here.

GOVERNANCE_CHECKLIST_REMINDER_DEDUP

@zeropath-ai

zeropath-ai Bot commented Jun 11, 2026

Copy link
Copy Markdown

No security or compliance issues detected. Reviewed everything up to 9d47bef.

Security Overview
Detected Code Changes

| Change Type | Relevant files

... (code changes summary truncated to fit VCS comment limits.)

@schneiderstefan schneiderstefan left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Implementation-wise LGTM, but some high-level concerns.

impl Registry {
pub fn do_update_subnet(&mut self, payload: UpdateSubnetPayload) {
println!("{}do_update_subnet: {:?}", LOG_PREFIX, payload);
pub fn do_update_subnet(&mut self, caller: PrincipalId, payload: UpdateSubnetPayload) {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there rate limiting in place, ideally in the registry canister itself? It could be dangerous if the end user is allowed to make changes and can do as many as they want.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For transparency: at the moment we aren't even allowing this call, we need this for something completely different that isn't user facing. So users for now cannot do anything with this specifically and won't be able for some time still until we figure out everything. And we won't be constantly changing any of these values. I expect that per engine it could be 2-3 times per month. Basically we want to be able to halt a subnet if an engine payment is overdue. This will eventually go into this canister (the whole accounting logic), but in the interest of time we have to take it bit by bit

@@ -15,14 +16,32 @@ use serde::Serialize;
impl Registry {
pub fn do_deploy_guestos_to_all_subnet_nodes(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that any pair of blessed versions might be incompatible at any time. So it's easy to shoot yourself in the foot if we give too many options to the end users.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely! Again this is not exposed to the end users. This is still heavily controlled by us. I highly doubt that any user will ever care about that so this will most likely go away.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants