Commits (70)
0b6dd4e
Add 'compilers' field to metadata
thomashoneyman Nov 8, 2023
e15e4a8
Add utilities for building with many compilers
thomashoneyman Nov 11, 2023
d8e7e41
Remove PackageSource and require all packages to solve/compile
thomashoneyman Nov 11, 2023
8e069b6
Determine all compilers for package in publish pipeline
thomashoneyman Nov 11, 2023
5348ee2
Initial cut at discovering compiler in legacy import
thomashoneyman Nov 12, 2023
630c0bf
Always look up metadata / manifests in each publishing step
thomashoneyman Nov 12, 2023
77d6e68
Testing the pipeline...
thomashoneyman Nov 13, 2023
8749bea
Better reporting of failures
thomashoneyman Nov 13, 2023
be93d18
Update union of package set / spago / bower deps, consider ranges in …
thomashoneyman Nov 14, 2023
5a15433
Include spago.yaml files in legacy import
thomashoneyman Nov 15, 2023
559275c
Retain compilation in cache
thomashoneyman Nov 15, 2023
09d515a
Consider compilers when solving
thomashoneyman Nov 16, 2023
98ef892
Rely on solver per-compiler instead of looking at metadata for compat…
thomashoneyman Nov 16, 2023
ae621da
Adjust unused dependency pruning to replace used transitive deps
thomashoneyman Nov 17, 2023
5c54103
Remove unused functions
thomashoneyman Nov 17, 2023
441b960
wip
thomashoneyman Nov 17, 2023
3495edb
Use cache when finding first suitable compiler
thomashoneyman Nov 19, 2023
7ceab4c
WIP: Include missing direct imports
thomashoneyman Nov 19, 2023
3b85cd5
No longer try to insert missing dependencies
thomashoneyman Nov 19, 2023
3fa90b5
Address internal comments
thomashoneyman Nov 20, 2023
628fdf0
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Nov 20, 2023
0d3cef9
Re-enable comment
thomashoneyman Nov 20, 2023
4e8cb87
Remove unnecessary
thomashoneyman Nov 20, 2023
d7c4180
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 1, 2023
81c85a4
Fix 'removed packages' stats
thomashoneyman Dec 1, 2023
10bccee
Feedback
thomashoneyman Dec 1, 2023
26c5aa0
Always print publish stats
thomashoneyman Dec 1, 2023
b11917e
tweaks
thomashoneyman Dec 4, 2023
3ddde82
Better publish stats formatting and write removals
thomashoneyman Dec 4, 2023
ec388d1
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 4, 2023
5b17cb3
Update flake
thomashoneyman Dec 5, 2023
f924b31
Integrate inserting missing dependencies
thomashoneyman Dec 7, 2023
3cdb9b9
Tweaks for efficiency
thomashoneyman Dec 7, 2023
d0181e5
(hopefully) final run of the importer
thomashoneyman Dec 8, 2023
6f9f0cd
Update spec to note transitive dependencies requirement.
thomashoneyman Dec 8, 2023
2721c6a
attempt to discover publish compiler with both legacy and current ind…
thomashoneyman Dec 8, 2023
f8d0f80
Tweaks
thomashoneyman Dec 10, 2023
e2d6e87
Patch some legacy manifests
thomashoneyman Dec 10, 2023
b8a21a8
Range tweaks for bolson/deku/rito
thomashoneyman Dec 11, 2023
3d7ab49
Update to fix darwin support for spago builds
thomashoneyman Dec 18, 2023
6bc8d09
Clean up publish stats
thomashoneyman Dec 18, 2023
9acbc94
Enforce an explicit 0.13 date cutoff / core org cutoff
thomashoneyman Dec 19, 2023
d2c3b9a
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Jan 5, 2024
bea2013
Move location check above manifest parse
thomashoneyman Jan 17, 2024
c942722
Merge branch 'master'
thomashoneyman Jul 29, 2024
637a757
format
thomashoneyman Jul 29, 2024
ab184f2
Fix octokit codec merge error
thomashoneyman Jul 29, 2024
9cc56e7
Revert "Fix octokit codec merge error"
thomashoneyman Jul 29, 2024
c05fcb9
Set compiler explicitly to 0.15.5
thomashoneyman Jul 29, 2024
637488d
Tweaks
thomashoneyman Jul 29, 2024
662dd00
Set all purs test compilers to 0.15.4 range
thomashoneyman Jul 29, 2024
8156aa2
Update retry logic to fix integration test
thomashoneyman Jul 30, 2024
ed7913c
Complete run of legacy importer
thomashoneyman Aug 26, 2024
ec8e3ff
Format
thomashoneyman Aug 26, 2024
d7d5e49
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Aug 29, 2024
7d74da3
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Oct 25, 2024
a3f086b
Update SPEC.md
f-f Jun 7, 2025
de7c6e3
Add 'purescript' to the list of reserved packages
f-f Jun 7, 2025
8c8d728
Move to NonEmpty in dhall types
f-f Jun 7, 2025
7b57771
compilers is a NonEmptyArray
f-f Jun 8, 2025
e3d484b
Merge branch 'master' into trh/compilers-in-metadata
f-f Jun 8, 2025
343b90b
Fix tests
f-f Jun 8, 2025
67c7cb5
Fix tests
f-f Jun 8, 2025
88dcffe
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 2, 2025
dec066b
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 6, 2025
e863d6b
fix e2e tests
thomashoneyman Dec 6, 2025
f27cd33
Merge branch 'master' into trh/compilers-in-metadata
thomashoneyman Dec 17, 2025
beb8d93
update flake to latest
thomashoneyman Dec 17, 2025
a55bef6
add an agents file
thomashoneyman Jan 3, 2026
5ab364a
add archive seeder script
thomashoneyman Jan 4, 2026
2 changes: 2 additions & 0 deletions .gitignore
@@ -15,6 +15,8 @@ result*
*.sqlite3
*.sqlite3-wal
*.sqlite3-shm

TODO.md
.spec-results

# Keep it secret, keep it safe.
129 changes: 129 additions & 0 deletions AGENTS.md
@@ -0,0 +1,129 @@
# AGENTS.md

This repository implements the package registry for PureScript. See @SPEC.md for the registry specification and @CONTRIBUTING.md for detailed contributor documentation.

## Development Environment

This project uses Nix with direnv. You should already be in the Nix shell automatically when entering the directory. If not, run:

```sh
nix develop
```

### Build and Test

The registry is implemented in PureScript. Use spago to build it and to run the unit tests; these are cheap and fast, so run them whenever you are working on the registry packages.

```sh
spago build # Build all PureScript code
spago test # Run unit tests
```

Integration tests require two terminals (or running the test environment in detached mode). You only need to run them when working on the server (app).

```sh
# Terminal 1: Start test environment (wiremock mocks + registry server on port 9000)
nix run .#test-env

# Terminal 2: Run E2E tests once server is ready
spago run -p registry-app-e2e
```

Options: `nix run .#test-env -- --tui` for interactive TUI, `-- --detached` for background mode.

#### Smoke Test (Linux only)

The smoke test verifies that the server deploys and comes up properly. Only run it if you are making changes that could break the deployment of the server.

```sh
nix build .#checks.x86_64-linux.smoke -L
```

#### Continuous Integration via Nix Checks

There is a full suite of checks implemented with Nix which verify that packages build, formatting is correct, registry types are Dhall-conformant, and more. This is the primary check run in CI.

```sh
nix flake check -L
```

## Formatting

```sh
# Format PureScript
purs-tidy format-in-place app app-e2e foreign lib scripts
purs-tidy check app app-e2e foreign lib scripts

# Format Nix files
nixfmt *.nix nix/**/*.nix
```

## Project Structure

- `app/` — Registry server implementation.
- `app-e2e/` — E2E tests for the server API.
- `lib/` — **Public library** for consumers (Spago, Pursuit, etc.). Only types and functions useful to external tools belong here. Avoid implementation-specific code.
- `foreign/` — FFI bindings to JavaScript libraries.
- `scripts/` — Runnable modules for registry tasks (LegacyImporter, PackageTransferrer, PackageSetUpdater, etc.). Run via `nix run .#legacy-importer`, etc.
- `test-utils/` — Shared test utilities.
- `db/` — SQLite schemas and migrations (use `dbmate up` to initialize).
- `types/` — Dhall type specifications.
- `nix/` — Nix build and deployment configuration.

## Scripts & Daily Workflows

The `scripts/` directory contains modules run as daily jobs by the purescript/registry repository:

- `LegacyImporter` — imports package versions from legacy Bower registry
- `PackageTransferrer` — handles package transfers
- `PackageSetUpdater` — automatic daily package set updates

Run scripts via Nix: `nix run .#<kebab-case-name>` (e.g., `nix run .#legacy-importer`). All scripts support `--help` for usage information.

## Scratch Directory & Caching

The `scratch/` directory (gitignored) is used by scripts for:
- `.cache/` — Cached API responses, downloaded packages, etc.
- `logs/` — Log files
- `registry/`, `registry-index/` — Local clones for testing, also modified and optionally committed to by scripts

Caching is critical for the legacy importer due to the expense of downloading packages. The `Registry.App.Effect.Cache` module handles caching.

## PureScript Conventions

### Custom Prelude

Always use `Registry.App.Prelude` in `app/` and `app-e2e/` directories:

```purescript
import Registry.App.Prelude
```

### Effects via Run

Use the `run` library for extensible effects. Do NOT perform HTTP calls, console logs, or other effects directly in `Aff`. Check for existing effects in `app/src/App/Effect/` or consider adding one.
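As a sketch of what defining a new effect with the `run` library typically looks like (the `Notify` effect and module name here are hypothetical, not effects that exist in `app/src/App/Effect/`):

```purescript
module Example.Effect.Notify where

import Prelude

import Run (Run)
import Run as Run
import Type.Proxy (Proxy(..))
import Type.Row (type (+))

-- A hypothetical effect, described as a functor of operations.
data Notify a = Notify String a

derive instance Functor Notify

-- Row alias so effects compose, e.g. Run (NOTIFY + r)
type NOTIFY r = (notify :: Notify | r)

_notify :: Proxy "notify"
_notify = Proxy

-- Smart constructor: lift the operation into Run rather than
-- performing any effect directly in Aff.
notify :: forall r. String -> Run (NOTIFY + r) Unit
notify message = Run.lift _notify (Notify message unit)
```

An interpreter (e.g. via `Run.interpret`) then eliminates the effect into `Aff` or another base, which keeps effectful code testable.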

### Import Style

Import types unqualified, values qualified. Use shortened module names:

```purescript
import Registry.App.Prelude

import Data.Array as Array
import Data.String as String
import Node.FS.Aff as FS.Aff
import Parsing (Parser)
import Parsing as Parsing
import Parsing.Combinators as Parsing.Combinators
import Registry.Operation (AuthenticatedData)
import Registry.SSH as SSH
```

## Deployment

Continuous deployment via GitHub Actions on master. Manual deploy:

```sh
colmena apply
```
5 changes: 3 additions & 2 deletions SPEC.md
@@ -207,7 +207,7 @@ Note:

- Globs you provide at the `includeFiles` and `excludeFiles` keys must contain only `*`, `**`, `/`, `.`, `..`, and characters for Linux file paths. It is not possible to negate a glob (ie. the `!` character), and globs cannot represent a path out of the package source directory.
- When packaging your project source, the registry will first "include" your `src` directory and always-included files such as your `purs.json` file. Then it will include files which match globs indicated by the `includeFiles` key ([always-ignored files](#always-ignored-files) cannot be included). Finally, it will apply the excluding globs indicated by the `excludeFiles` key to the included files ([always-included files](#always-included-files) cannot be excluded).
- Dependencies you provide at the `dependencies` key must exist in the registry, and the dependency ranges must be solvable (ie. it must be possible to produce a single version of each dependency that satisfies the provided version bounds, including any transitive dependencies).
- Dependencies you provide at the `dependencies` key must exist in the registry, the dependency ranges must be solvable (ie. it must be possible to produce a single version of each dependency that satisfies the provided version bounds, including any transitive dependencies), and you may not rely on transitive dependencies (ie. any modules you import in your code must come from packages listed directly in your dependencies).

For example:

@@ -234,11 +234,12 @@

All packages in the registry have an associated metadata file, which is located in the `metadata` directory of the `registry` repository under the package name. For example, the metadata for the `aff` package is located at: https://github.com/purescript/registry/blob/main/metadata/aff.json. Metadata files are the source of truth for all published and unpublished versions of a particular package: what their contents are and where the package is located. Metadata files are produced by the registry, not by package authors, though they take some information from package manifests.

Each published version of a package records three fields:
Each published version of a package records the following fields:

- `hash`: a [`Sha256`](#Sha256) of the compressed archive fetched by the registry for the given version
- `bytes`: the size of the tarball in bytes
- `publishedTime`: the time the package was published as an `ISO8601` string
- `compilers`: compiler versions this package is known to work with. This field can be in one of two states: a single version indicates that the package worked with a specific compiler on upload but has not yet been tested with all compilers, whereas a non-empty array of versions indicates the package has been tested with all compilers the registry supports.
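For example, a published version that has been tested against all supported compilers records an array of versions. This excerpt follows the shape of the registry's metadata files; the concrete values are illustrative:

```json
{
  "6.0.1": {
    "bytes": 31129,
    "compilers": ["0.15.9", "0.15.10"],
    "hash": "sha256-EbbFV0J5xV0WammfgCv6HRFSK7Zd803kkofE8aEoam0=",
    "publishedTime": "2022-08-18T20:04:00.000Z",
    "ref": "v6.0.1"
  }
}
```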
Member:
Wouldn't it be tidier to only allow a non-empty array instead of several possible types? After all, the state with multiple compilers listed is going to be a superset of the first state.

Member Author:
The issue with the non-empty array is that it isn't clear whether an array of a single element represents one of:

  • a package that has been published with the given compiler, but which hasn't been tested against the full set of compilers
  • a package that has been tested against the full set of compilers and only works with one

Member:
When are we going to end up in a situation where we don't test the package against the whole set of compilers? My reading of the PR is that we always do?

In any case, we'll always have packages that are not "tested against the full set of compilers": when a new compiler version comes out, then all packages will need a retest, and if a package doesn't have the new compiler in the array then we don't know if it's not compatible or if it hasn't been tested yet.

Maybe we need another piece of state somewhere else?

Member Author:
> When are we going to end up in a situation where we don't test the package against the whole set of compilers? My reading of the PR is that we always do?

Yes, as implemented here we just go ahead and test everything as soon as we've published. However, I split out the state because in our initial discussions we worried about how long it takes for the compiler builds to run (it takes publishing from N seconds to N minutes in some cases — large libraries or ones that leverage a lot of type machinery). We'd originally talked about the compiler matrix being a cron job that runs later in the day. I just made it part of the publishing pipeline directly because it was simpler to implement.

If we decide that it's OK for publishing to take a long time then we can eliminate this state and just test the compilers immediately. In that case we'd just have a non-empty array.

Member Author:
> In any case, we'll always have packages that are not "tested against the full set of compilers": when a new compiler version comes out, then all packages will need a retest, and if a package doesn't have the new compiler in the array then we don't know if it's not compatible or if it hasn't been tested yet.

Yea, that's a good point. You don't know if the metadata you're reading just hasn't been reached yet by an ongoing mass compiler build to check a new compiler.

> Maybe we need another piece of state somewhere else?

Off the top of my head I don't know a good place to put some state about possible compiler support; the metadata files are not helpful if a new compiler comes out and we're redoing the build since they're only aware of the one package.

Member Author:
> If we decide that it's OK for publishing to take a long time then we can eliminate this state and just test the compilers immediately. In that case we'd just have a non-empty array.

I'm cool with this if you are.

> We'll always have packages that are not "tested against the full set of compilers" [...] maybe we need another piece of state somewhere else?

We could either a) say that the supported list of compilers for a package can potentially be missing the current compiler if the matrix is currently running and not bother with state or b) put a JSON file or something in the metadata directory that indicates whether the compiler matrix is running. Then consumers can look at that.

Personally the matrix runs infrequently enough (just new compiler releases!) that I would rather opt for (a).

Member:
I pondered this for a few days and I think it's complicated?

Since we're going towards a model where we'd only run one registry job at a time and queue the rest (to prevent concurrent pushes to the repo), I'm afraid that running the whole matrix at once would make publishing very slow.
Something that we could do to counteract this could be to split the "publish" and the "matrix runs": on publishing we'd just add the package metadata with one compiler, and at the end of the publishing job we'd queue a series of "compiler matrix" jobs, each testing one compiler. These jobs would be of low priority, so new publishes would get in front of the queue, and things can stay snappy.

> Personally the matrix runs infrequently enough (just new compiler releases!) that I would rather opt for (a).

The approach detailed above implies that we're in a world where we do (a), i.e. the list of compilers is always potentially out of date, and that's fine.

Member:

Additional note about the above: since the above would be introducing an "asynchronous matrix builder", we need to consider the dependency tree in our rebuilding: if a package A is published with compiler X, and then a package B depending on it is immediately published after it (a very common usecase since folks seem to publish their packages in batches), then we'd need to either make sure that matrix-build jobs for B are always run after matrix-build jobs for A, or retry them somehow.


Each unpublished version of a package records three fields:

3 changes: 1 addition & 2 deletions app/fixtures/github-packages/effect-4.0.0/bower.json
@@ -16,7 +16,6 @@
"package.json"
],
"dependencies": {
-    "purescript-prelude": "^6.0.0",
-    "purescript-type-equality": "^4.0.0"
+    "purescript-prelude": "^6.0.0"
}
}
12 changes: 12 additions & 0 deletions app/fixtures/github-packages/transitive-1.0.0/bower.json
@@ -0,0 +1,12 @@
{
"name": "purescript-transitive",
"homepage": "https://github.com/purescript/purescript-transitive",
"license": "BSD-3-Clause",
"repository": {
"type": "git",
"url": "https://github.com/purescript/purescript-transitive.git"
},
"dependencies": {
"purescript-effect": "^4.0.0"
}
}
@@ -0,0 +1,6 @@
module Transitive where

import Prelude

uno :: Int
uno = one
8 changes: 6 additions & 2 deletions app/fixtures/registry/metadata/prelude.json
@@ -5,8 +5,12 @@
},
"published": {
"6.0.1": {
-    "bytes": 31142,
-    "hash": "sha256-o8p6SLYmVPqzXZhQFd2hGAWEwBoXl1swxLG/scpJ0V0=",
+    "bytes": 31129,
+    "compilers": [
+      "0.15.9",
+      "0.15.10"
+    ],
+    "hash": "sha256-EbbFV0J5xV0WammfgCv6HRFSK7Zd803kkofE8aEoam0=",
"publishedTime": "2022-08-18T20:04:00.000Z",
"ref": "v6.0.1"
}
8 changes: 6 additions & 2 deletions app/fixtures/registry/metadata/type-equality.json
@@ -5,8 +5,12 @@
},
"published": {
"4.0.1": {
-    "bytes": 2184,
-    "hash": "sha256-Hs9D6Y71zFi/b+qu5NSbuadUQXe5iv5iWx0226vOHUw=",
+    "bytes": 2179,
+    "compilers": [
+      "0.15.9",
+      "0.15.10"
+    ],
+    "hash": "sha256-3lDTQdbTM6/0oxav/0V8nW9fWn3lsSM3b2XxwreDxqs=",
"publishedTime": "2022-04-27T18:00:18.000Z",
"ref": "v4.0.1"
}