To use these hooks, you need to have pre-commit installed. You can then edit the .pre-commit-config.yaml file in your project root to include the hooks defined in this directory. Currently, we have one group of hooks that are designed to run in the following sequence:
- nbstripout-preserve-timestamp: Strips output from Jupyter notebooks while preserving the original file timestamp, so that it interoperates smoothly with jupytext --sync (which relies on the file timestamp).
- jupytext-enforce-pairing: Ensures every Jupyter notebook has a corresponding Python (.py) file that is tracked by git. Automatically generates missing paired files.
- jupytext-smart-sync: Synchronises content between Jupyter notebooks and their paired Python files when either is modified, keeping the two formats consistent. It relies on the underlying file timestamps and jupytext --sync to determine the correct direction of sync. Pairs are identified by file name and extension, and the hook runs on only one of the two files (not both).
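The timestamp handling described above can be illustrated with a minimal Python sketch. This is not the hooks' actual implementation; the function names strip_outputs_preserving_mtime and newer_of_pair are hypothetical, and the sketch assumes a notebook stored as standard nbformat 4 JSON. It shows why the original modification time has to be restored after stripping: otherwise the freshly rewritten notebook would always look newer than its paired .py file.

```python
import json
import os
from pathlib import Path


def strip_outputs_preserving_mtime(notebook_path: Path) -> None:
    """Clear code-cell outputs in place, then restore the original timestamps.

    Hypothetical helper for illustration only.
    """
    before = notebook_path.stat()  # capture access/modification times first
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []
            cell["execution_count"] = None
    notebook_path.write_text(json.dumps(notebook, indent=1) + "\n", encoding="utf-8")
    # Writing the file bumped its mtime; put the original timestamps back.
    os.utime(notebook_path, ns=(before.st_atime_ns, before.st_mtime_ns))


def newer_of_pair(notebook_path: Path, script_path: Path) -> Path:
    """Return whichever paired file was modified last (ties favour the notebook).

    Hypothetical helper: a timestamp-based sync would treat this file as the source.
    """
    if script_path.stat().st_mtime_ns > notebook_path.stat().st_mtime_ns:
        return script_path
    return notebook_path
```

Under these assumptions, stripping a notebook leaves its modification time untouched, so a paired .py file that was edited more recently is still recognised as the newer of the two.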
To use all three of these hooks, you can add the following to your .pre-commit-config.yaml file:
repos:
  - repo: https://github.com/nestauk/pre-commit-hooks
    rev: v1.2.0
    hooks:
      - id: nbstripout-preserve-timestamp
      - id: jupytext-enforce-pairing
      - id: jupytext-smart-sync
There are two suggested ways to use these hooks:
- All three hooks together for a full workflow, or
- jupytext-enforce-pairing on its own if you only need pairing
This project uses uv for virtual environment management. If you are new to uv, you can find the quickstart guide here.
We also utilise direnv via the .envrc file to automatically:
- Import your environment variables from .env
- Unset UV_INDEX so this public project does not inherit private PyPI index settings
- Activate your virtual environment (only if you comment out the relevant lines in .envrc)
After installing direnv and uv on your system (we recommend doing this via brew on macOS), you must run the following commands in your terminal to set up the project:
direnv allow
uv sync
Technical and working style guidelines
Project based on Nesta's data science project template (Read the docs here).