@thijshberg thijshberg commented Feb 9, 2026

This adds the jupyter capability to soarca, as specified in cacao v2 (section 5.8).

Implementation

The implementation relies on a separate container set up to run jupyter, which is restarted on each invocation of jupyter. The container contains a tiny python http server to forward requests to jupyter; python was chosen as this was already present in the container. The jupyter capability of soarca communicates with this container over http. The restarting of the container is handled by docker compose.
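As a rough illustration of the "answer one request, then exit" pattern described above, here is a minimal sketch using only the Python standard library. It is not the code in this PR: `make_handler`, `serve_once`, `post`, and the JSON response shape are hypothetical names and choices made for the example, and the notebook runner is passed in as a plain function so the sketch stays independent of jupyter.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_handler(run_notebook):
    """Build a handler class that passes each POST body to run_notebook."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers.get("Content-Length", 0))
            body = self.rfile.read(length)
            try:
                result, status = run_notebook(body), 200
            except Exception as exc:  # report failures to the caller
                result, status = {"error": str(exc)}, 500
            payload = json.dumps(result).encode("utf-8")
            self.send_response(status)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, fmt, *args):
            pass  # keep the log quiet in this sketch

    return Handler


def serve_once(run_notebook, host="127.0.0.1", port=0):
    """Start a server that answers exactly one request and then stops."""
    server = HTTPServer((host, port), make_handler(run_notebook))

    def work():
        server.handle_request()  # blocks until a single request is handled
        server.server_close()

    thread = threading.Thread(target=work)
    thread.start()
    return server.server_address[1], thread


def post(port, body, host="127.0.0.1"):
    """Tiny client used to exercise the sketch."""
    conn = http.client.HTTPConnection(host, port, timeout=5)
    conn.request("POST", "/", body=body)
    response = conn.getresponse()
    data = json.loads(response.read())
    conn.close()
    return response.status, data
```

In the container the process itself would exit after the single request, leaving the restart to docker compose; the thread in this sketch only stands in for that so the example can be run locally.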

Jupyter is called using nbconvert, which can execute a notebook in place when the conversion target is itself a notebook. Jupyter was not really made for running notebooks from a CLI or standalone as we do here, so this path is scarcely documented and clunky.
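For reference, a sketch of what such an nbconvert call can look like when driven from Python. The helper names `nbconvert_command` and `execute_in_place` and the optional timeout parameter are assumptions for illustration; the exact flags used in this PR may differ.

```python
import subprocess


def nbconvert_command(notebook_path, timeout_s=None):
    """Build the CLI call that executes a notebook and overwrites it in place."""
    cmd = ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace"]
    if timeout_s is not None:
        # Per-cell execution timeout in seconds (nbconvert's ExecutePreprocessor).
        cmd.append(f"--ExecutePreprocessor.timeout={timeout_s}")
    cmd.append(notebook_path)
    return cmd


def execute_in_place(notebook_path, timeout_s=None):
    """Run nbconvert; the caller inspects returncode and stderr."""
    return subprocess.run(
        nbconvert_command(notebook_path, timeout_s),
        capture_output=True,
        text=True,
        check=False,
    )
```

Converting "to notebook" with `--inplace` is what makes the executed outputs land back in the same `.ipynb` file, from which they can be read afterwards.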

Alternative implementations

A number of alternative implementation options were considered/explored.

Local execution

Executing jupyter directly from soarca is easier, but it means running arbitrary notebook code on the same machine as soarca, which is a security risk.

Calling/restarting containers from soarca

Creating/restarting/running containers from soarca, using e.g. containerd, should be possible but is quite hard in practice. Different configurations produced many errors, there are multiple under-documented and obscure hurdles (e.g. how to interact with the stdin/stdout of a container without emulating a shell), and the resulting setup would be quite complex.

Deviations from Cacao

Setting variables

The current setup supports setting output variables from jupyter notebooks: a cell sets a variable by printing "soarca::var=value" in its output. Cacao does not support variables.
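To make the convention concrete, here is a hedged sketch of how such lines could be collected from an executed notebook. It assumes the nbformat 4 JSON layout, looks only at stream outputs, expects one "soarca::var=value" per line, and lets later values override earlier ones; the actual parsing in soarca may differ, and `extract_variables` is a hypothetical name.

```python
PREFIX = "soarca::"


def extract_variables(notebook):
    """Collect soarca::name=value lines from the stream outputs of all cells."""
    variables = {}
    for cell in notebook.get("cells", []):
        for output in cell.get("outputs", []):
            if output.get("output_type") != "stream":
                continue
            text = output.get("text", "")
            if isinstance(text, list):  # nbformat allows a list of strings
                text = "".join(text)
            for line in text.splitlines():
                line = line.strip()
                if not line.startswith(PREFIX):
                    continue
                # Split on the first "=" only, so values may contain "=".
                name, sep, value = line[len(PREFIX):].partition("=")
                if sep and name:
                    variables[name] = value
    return variables
```

A notebook cell would then set a variable with something like `print("soarca::result=42")`.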

Setting targets

The current setup allows referencing net-device targets in the step, which soarca then uses to determine the url for the jupyter service. Cacao does not define how jupyter is to run, so this is a liberty we take given how we implemented it.
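As an illustration only, the lookup from a target to a service url could work roughly as follows. The target shape used here (an "address" mapping from address type to a list of values, plus an optional "port") is an assumption modelled loosely on cacao agent-target objects, and `jupyter_url`, its parameters, and the preference order are hypothetical rather than taken from this PR. Plain http is used because https support is still listed as future work.

```python
def jupyter_url(target, default_port, path="/"):
    """Derive the service url from a target dict (assumed shape, see above)."""
    addresses = target.get("address", {})
    # Prefer an explicit url, then a domain name, then an IPv4 address.
    for kind in ("url", "dname", "ipv4"):
        values = addresses.get(kind) or []
        if not values:
            continue
        host = values[0]
        if kind == "url":
            return host.rstrip("/") + path
        port = target.get("port") or default_port
        return f"http://{host}:{port}{path}"
    raise ValueError("target has no usable address")
```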

Future work on this feature

  • Support https communication with jupyter
  • Speed-up: firing up jupyter (which is implemented using the notebook conversion function) takes a second or two on every invocation. We might be able to start a jupyter server upon container start-up, and then upload and run the notebook on that server, which front-loads the jupyter startup time.

@thijshberg thijshberg linked an issue Feb 9, 2026 that may be closed by this pull request
This adds a small http server inside the jupyter container that answers
requests to run a single notebook by running the notebook, and then
promptly exits. The docker compose handles the fact that the container
is restarted.

The http server is in python, as that is already present in the
container due to the fact that it is set up for jupyter.
This adds both unit tests for the functions, and an integration test
using the actual jupyter container.
@thijshberg thijshberg force-pushed the feature/367-jupyter-actions branch from cff5e72 to cb25383 Compare February 9, 2026 14:41
@thijshberg thijshberg marked this pull request as ready for review February 9, 2026 15:05

Development

Successfully merging this pull request may close these issues.

Add jupyter notebook actions