@jmcarp
Contributor

@jmcarp jmcarp commented Jan 21, 2026

At the moment, omicron-dev starts crdb on port 0, which tells the database to use any available port. This is fine for most use cases, but less fine for running a second nexus alongside an existing run-all: https://github.com/oxidecomputer/omicron/blob/main/docs/how-to-run-simulated.adoc#using-both-omicron-dev-run-all-and-running-nexus-manually. This process involves copying the randomly selected ports from the output of run-all and writing them to a custom nexus config. To make the process simpler, this patch allows the user to configure the crdb listen port, defaulting to the current choice of 0.
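
For readers unfamiliar with the port 0 convention, here is a minimal sketch using only the Rust standard library. It is not how omicron-dev launches crdb; `bind_localhost` is a hypothetical helper that just illustrates why the chosen port has to be read back after startup when 0 is requested.

```rust
use std::net::{Ipv4Addr, SocketAddr, TcpListener};

/// Hypothetical helper (not part of omicron): binds a TCP listener on
/// localhost. A `port` of 0 asks the OS to pick any free port; any other
/// value requests that specific port and fails if it is already in use.
fn bind_localhost(port: u16) -> std::io::Result<(TcpListener, SocketAddr)> {
    let listener = TcpListener::bind((Ipv4Addr::LOCALHOST, port))?;
    // The port that was actually assigned is only known after binding.
    let addr = listener.local_addr()?;
    Ok((listener, addr))
}
```

Calling `bind_localhost(0)` returns an address whose port is only known at runtime, which is why the second-nexus workflow currently requires copying ports out of the run-all output.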

cc @lgfa29. We've been talking about this to simplify automated testing of nexus upgrades for the terraform provider. Note that we'd want to be able to configure ports for a few additional services as well to make that happen—just starting with a small change at first.

@jmcarp jmcarp force-pushed the jmcarp/dev-configure-crdb-port branch from 2cda6fd to 1bdd25f Compare January 21, 2026 18:31
@jmcarp jmcarp marked this pull request as draft January 21, 2026 19:14
@jmcarp jmcarp force-pushed the jmcarp/dev-configure-crdb-port branch from 1bdd25f to c2fa9c3 Compare January 21, 2026 20:41
@jmcarp jmcarp force-pushed the jmcarp/dev-configure-crdb-port branch from c2fa9c3 to 03a67c7 Compare January 21, 2026 20:59
@jmcarp jmcarp marked this pull request as ready for review January 21, 2026 21:02
@davepacheco
Collaborator

I think this makes sense. If I remember right, the fully manual instructions wind up using a fixed set of ports for a lot of things, which is helpful because you don't have to edit config files as part of that process, but then they use random-available ports for other things. I've long meant to clean that up. Would it make sense for omicron-dev to have a flag that says "use fixed ports for everything" vs. "use any-available ports for everything"? The first mode is more useful for a lot of development because, for example, you don't have to change configs of things pointed at it (like omdb). The second mode is more useful for automation because it works reliably when it's already running, etc.

@jmcarp
Contributor Author

jmcarp commented Jan 22, 2026

For context, my first thought was to add something like a --persistent flag to run-all, which would configure the local environment to use fixed ports for all services, and persist crdb (and maybe clickhouse) state across runs. That would give us a simple interface for e.g. testing omicron updates with the terraform provider. That turned out to be more complicated than I was hoping, so I thought I would try configuring ports instead.

> Would it make sense for omicron-dev to have a flag that says "use fixed ports for everything" vs. "use any-available ports for everything"?

That would work for my use case. For some reason, it feels a little strange to me to tell the xtask to use some known, fixed port for each service rather than asking for specific ports for the services that we care about, but it would also be more concise. I think my weak preference would be to expose ports for the three services that I actually care about for my use case, but I'm happy to try either alternative.

> ...the fully manual instructions...

@lgfa29 and I also thought about just running the parts that we need for our automated testing, but it looks like that's known broken at the moment. Making that work could also be useful, but this seemed like the path of least resistance for now.

@davepacheco
Collaborator

That's interesting. For what it's worth, the underlying mechanism for running cockroach does support attaching it to an existing database rather than starting a new one so it's not a stretch to imagine omicron-dev run-all letting you point it at a directory in which to store the database files and resume a previous one. But I think it's likely that there's other state that wouldn't get linked up properly. For example, if the simulated sled agent or omicron-dev itself generates random uuids for various internal services, the re-run would regenerate those and then they wouldn't match what's in the database and everything would be confused.


> For some reason, it feels a little strange to me to tell the xtask to use some known, fixed port for each service rather than asking for specific ports for the services that we care about, but it would also be more concise.

Yeah, I hear what you mean. In the limit you could imagine having all of these modes:

  • fully concurrent-safe: uses any available port for everything
  • fully configured (and predictable): you give it all the specific ports
  • fully configurable: start with either of the above but let you override any port you want

This isn't as complicated as it sounds. One simple way to do this is to accept a configuration file that specifies all the ports and ship with two config files: one that uses 0 for all the ports (the concurrency-safe option) and the other that uses fixed ports for everything. My proposal above was essentially to do the same thing with CLI options, where one of the options is analogous to picking one or the other config file. With either of these, you can still change individual ports.
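
A rough sketch of the "preset plus per-port override" idea, under stated assumptions: the type names, field names, and fixed port numbers below are all hypothetical placeholders and are not taken from omicron. The overrides could come from CLI options or from a config file; only the merging step is shown.

```rust
/// Hypothetical set of listen ports for a dev environment.
/// Field names are illustrative, not omicron's actual service list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct DevPorts {
    cockroachdb: u16,
    clickhouse: u16,
    nexus_external: u16,
}

impl DevPorts {
    /// Concurrency-safe preset: 0 means "any available port" for every service.
    const ANY_AVAILABLE: DevPorts =
        DevPorts { cockroachdb: 0, clickhouse: 0, nexus_external: 0 };

    /// Predictable preset. These numbers are placeholders, not real defaults.
    const FIXED: DevPorts =
        DevPorts { cockroachdb: 10001, clickhouse: 10002, nexus_external: 10003 };
}

/// Per-port overrides; `None` means "keep whatever the preset says".
#[derive(Debug, Default, Clone, Copy)]
struct PortOverrides {
    cockroachdb: Option<u16>,
    clickhouse: Option<u16>,
    nexus_external: Option<u16>,
}

/// Starts from a preset and applies any individual overrides on top of it.
fn resolve_ports(preset: DevPorts, overrides: PortOverrides) -> DevPorts {
    DevPorts {
        cockroachdb: overrides.cockroachdb.unwrap_or(preset.cockroachdb),
        clickhouse: overrides.clickhouse.unwrap_or(preset.clickhouse),
        nexus_external: overrides.nexus_external.unwrap_or(preset.nexus_external),
    }
}
```

With this shape, "fully concurrent-safe" is `resolve_ports(DevPorts::ANY_AVAILABLE, PortOverrides::default())`, "fully configured" starts from `DevPorts::FIXED`, and "fully configurable" is either preset with some overrides set to `Some(port)`.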

> I think my weak preference would be to expose ports for the three services that I actually care about for my use case, but I'm happy to try either alternative.

This is fine, too. It doesn't get in the way of building the other stuff and we don't have to do all that other stuff right now.

@jmcarp jmcarp force-pushed the jmcarp/dev-configure-crdb-port branch from 73d557a to 970301a Compare January 23, 2026 15:46
@jmcarp jmcarp force-pushed the jmcarp/dev-configure-crdb-port branch from 970301a to cf8c300 Compare January 23, 2026 16:22
@jmcarp
Contributor Author

jmcarp commented Jan 23, 2026

All right, I added the other ports used in the "second nexus" workflow. It should be easy enough to group these into a config file that we can point to with a --fixed-ports flag later on if useful.
