Merged
24 changes: 18 additions & 6 deletions .env.example
@@ -5,18 +5,30 @@
 # Host port exposed by compose.yaml for the local PostGIS service.
 OI_DB_PORT=5434

-# Optional external data workspace. Useful when `/` is space-constrained.
+# Parent directory for hash-addressed PBF workspaces.
+# Default:
+# OI_DATA_PARENT=/Volumes/goose-drive/openinterstate
+#
+# With the default layout, a source PBF with SHA-256 <sha> uses:
+# $OI_DATA_PARENT/workspaces/pbf-sha256/<sha>
+# Raw source downloads are cached under:
+# $OI_DATA_PARENT/source-cache
+# Cargo build cache is shared under:
+# $OI_DATA_PARENT/cache/cargo
+
+# Optional explicit workspace override. Leave this unset unless you need to pin
+# the build to a specific directory instead of using the PBF SHA-derived path.
 # Example:
-# OI_DATA_ROOT=/Volumes/goose-drive/openinterstate-data
+# OI_DATA_ROOT=/Volumes/goose-drive/openinterstate/workspaces/pbf-sha256/<sha>

 # Optional release output root. Defaults to $OI_DATA_ROOT/releases.
 # Example:
-# OI_RELEASE_DIR=/Volumes/goose-drive/openinterstate-releases
+# OI_RELEASE_DIR=/Volumes/goose-drive/openinterstate/releases

-# Optional cache roots. By default these now live under $OI_DATA_ROOT/cache so
-# Docker + Cargo state can stay on an external volume too.
+# Optional cache roots. Cargo cache now defaults under $OI_DATA_PARENT/cache so
+# Rust build artifacts are reused across PBF workspaces.
 # Example:
-# OI_CARGO_TARGET_DIR=/Volumes/goose-drive/openinterstate-data/cache/cargo/target
+# OI_CARGO_TARGET_DIR=/Volumes/goose-drive/openinterstate/cache/cargo/target

 # Default source file used by `./bin/openinterstate build`.
 OI_DEFAULT_US_PBF_URL=https://download.geofabrik.de/north-america/us-latest.osm.pbf
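
The precedence these variables describe can be sketched as a small resolver. This is an illustration under assumptions, not the project's code: `resolve_layout` is a hypothetical helper, and the `target` subdirectory used as the Cargo target default follows the example value in the file rather than a documented default.

```python
from pathlib import PurePosixPath

DEFAULT_PARENT = "/Volumes/goose-drive/openinterstate"


def resolve_layout(env, pbf_sha256):
    """Return the directories implied by the .env.example comments.

    `env` is a mapping such as os.environ; `pbf_sha256` is the hex digest of
    the source PBF. Hypothetical helper for illustration only.
    """
    parent = PurePosixPath(env.get("OI_DATA_PARENT") or DEFAULT_PARENT)
    # An explicit OI_DATA_ROOT pins the workspace and bypasses the SHA path.
    data_root = PurePosixPath(
        env.get("OI_DATA_ROOT") or parent / "workspaces" / "pbf-sha256" / pbf_sha256
    )
    cargo_cache = parent / "cache" / "cargo"
    return {
        "data_root": str(data_root),
        "source_cache": str(parent / "source-cache"),
        "cargo_cache": str(cargo_cache),
        # Assumed default: the file only shows this path as an example value.
        "cargo_target": env.get("OI_CARGO_TARGET_DIR") or str(cargo_cache / "target"),
        # Releases default under the workspace unless OI_RELEASE_DIR is set.
        "release_dir": env.get("OI_RELEASE_DIR") or str(data_root / "releases"),
    }
```

Calling `resolve_layout(os.environ, sha)` with nothing set yields the SHA-derived workspace. Setting `OI_DATA_ROOT` moves the workspace and its default release directory, while the shared caches stay under the parent.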
54 changes: 54 additions & 0 deletions AGENTS.md
@@ -0,0 +1,54 @@
# OpenInterstate Agent Notes

## Default Workspace

- Default the managed parent root to `/Volumes/goose-drive/openinterstate`.
- In normal operation, do not choose an explicit `OI_DATA_ROOT` up front. Instead, resolve the source PBF first, hash it with SHA-256, and use the workspace path `/Volumes/goose-drive/openinterstate/workspaces/pbf-sha256/<sha256>`.
- Treat `/Volumes/goose-drive/openinterstate/source-cache` as the shared raw-source download cache.
- Treat `/Volumes/goose-drive/openinterstate/cache/cargo` as the shared Rust build cache across all PBF workspaces.
- Let release artifacts default under the selected PBF workspace, which means `<workspace>/releases`, unless the user explicitly overrides the release root.
- Only use `--data-dir` or `OI_DATA_ROOT` when the user explicitly asks to pin an exact workspace path and bypass the SHA-derived default.
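
A minimal sketch of the SHA-derived workspace lookup described above, assuming the source PBF is already on disk. The function names are hypothetical and are not taken from the OpenInterstate codebase.

```python
import hashlib
from pathlib import Path


def pbf_sha256(pbf_path, chunk_size=1024 * 1024):
    """Stream the file through SHA-256 so a large PBF is never fully in memory."""
    digest = hashlib.sha256()
    with open(pbf_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def workspace_for(pbf_path, data_parent="/Volumes/goose-drive/openinterstate"):
    """Map a source PBF to <data_parent>/workspaces/pbf-sha256/<sha256>."""
    return Path(data_parent) / "workspaces" / "pbf-sha256" / pbf_sha256(pbf_path)
```

Because the path depends only on file contents, two downloads of the same PBF resolve to the same workspace, and a changed PBF gets a fresh one.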

## Import And Derive Workflow

- Treat canonical PBF import and derive as separate stages when planning work.
- Prefer derive-only iteration against the existing canonical database whenever the current import is still usable.
- Before proposing or running any fresh PBF import, first compute or read the PBF SHA and check the matching workspace under `workspaces/pbf-sha256/<sha256>`.
- Reuse an existing downloaded source PBF, filtered canonical PBF, and canonical osm2pgsql import whenever their inputs and mappings are still valid.
- Avoid re-importing PBF data unless it is clearly necessary because the source changed, the import mapping changed, the canonical database is missing or invalid, or the user explicitly requests a re-import.
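
The re-import criteria above can be expressed as a checklist that returns the reasons a fresh import is justified. This is a sketch only: `ImportState` and its fields are hypothetical names for facts an agent would gather from the workspace, not part of the project.

```python
from dataclasses import dataclass


@dataclass
class ImportState:
    """Facts gathered before touching the canonical import (hypothetical)."""
    source_sha_matches: bool      # current PBF SHA-256 equals the imported one
    mapping_unchanged: bool       # import mapping matches the one used for the import
    canonical_db_valid: bool      # canonical database exists and is usable
    user_requested_reimport: bool = False


def reimport_reasons(state):
    """Return why a fresh import is needed; an empty list means derive-only."""
    reasons = []
    if not state.source_sha_matches:
        reasons.append("source PBF changed")
    if not state.mapping_unchanged:
        reasons.append("import mapping changed")
    if not state.canonical_db_valid:
        reasons.append("canonical database is missing or invalid")
    if state.user_requested_reimport:
        reasons.append("user explicitly requested a re-import")
    return reasons
```

An empty result means the existing canonical import should be reused. A non-empty result doubles as the explanation to give before re-importing.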

## Operational Bias

- When discussing or running local build commands, default to `--data-parent /Volumes/goose-drive/openinterstate` rather than a repo-local `.data/` directory or a hard-coded workspace path.
- If derive-stage work is requested, start from the assumption that the existing canonical import should be preserved and reused.
- If a re-import is required, explain why it is necessary before doing it.

## Pike SQLite Follow-On

- After exporting an OpenInterstate release locally, also build Pike's SQLite pack from that exact release archive unless the user explicitly says not to.
- Use Pike's supported release-driven pipeline entrypoint from `/Users/tjohnell/projects/pike/server`: `./pike-pipeline.sh build --release-file /abs/path/openinterstate-release-<release-id>.tar.gz --reachability-snapshot /Volumes/goose-drive/pike-osrm/reachability/pike.osrm-reachability.snapshot.pgdump`.
- Let Pike keep its own default output locations unless the user asks otherwise. The current default host pack output is `/Users/tjohnell/projects/pike/server/.data/packs/pike.sqlite` and the staged build file is `/Users/tjohnell/projects/pike/server/.data/packs/pike.sqlite.new`.
- After the Pike build finishes, validate the pack with `sqlite3` by checking `PRAGMA integrity_check;` and confirming the `meta` table reports the matching `openinterstate_release_id`.
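
The validation step can be sketched with Python's built-in `sqlite3` module. The notes above do not show the shape of the `meta` table, so this assumes a key/value table with columns `key` and `value`; `validate_pack` is a hypothetical helper.

```python
import sqlite3
from pathlib import Path


def validate_pack(pack_path, expected_release_id):
    """Check integrity and the recorded release id of a Pike SQLite pack.

    Assumes `meta` has (key, value) columns; adjust the query if the real
    schema differs. Raises ValueError on any failed check.
    """
    # Open read-only so a wrong path fails instead of creating an empty file.
    uri = Path(pack_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        integrity = [row[0] for row in conn.execute("PRAGMA integrity_check;")]
        if integrity != ["ok"]:
            raise ValueError(f"integrity_check failed: {integrity[:5]}")
        row = conn.execute(
            "SELECT value FROM meta WHERE key = ?",
            ("openinterstate_release_id",),
        ).fetchone()
        found = row[0] if row else None
        if found != expected_release_id:
            raise ValueError(f"meta reports release id {found!r}")
    finally:
        conn.close()
```

`PRAGMA integrity_check` returns a single row containing `ok` when the database is sound, so anything else is treated as a failure.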

## Named Comparison: Pike Interstate Exit Coverage Diff

- If the user asks to rerun the comparison, refer to it as `Pike Interstate Exit Coverage Diff`.
- Purpose: compare the latest OpenInterstate-derived Pike pack against the latest published Pike release pack, limited to Interstate corridor and exit coverage.
- Inputs:
  - OpenInterstate-derived Pike pack: `/Users/tjohnell/projects/pike/server/.data/packs/pike.sqlite`
  - Latest published Pike release pack on NFS: newest `/Volumes/goose-plex-media/pike/releases/*/pike.sqlite`
- Before comparing, stage the latest published Pike release pack off NFS into `/Users/tjohnell/projects/pike/server/.data/compare/`. If a same-size, same-mtime local staged copy already exists, reuse it instead of copying again.
- Compare by `highway + canonical_direction`, starting from the OpenInterstate-derived pack's Interstate routes.
- Union exits across duplicate corridor rows for the same `highway + canonical_direction` key before counting or diffing.
- For route-level exit comparison, use distinct exit `ref` values when present. If a route has no usable `ref` values, fall back to a stable label such as `name` or `exit_id`.
- Separate findings into at least three buckets:
  - likely real gaps where the published Pike release is a near-superset of the OpenInterstate-derived route
  - likely real gaps where the OpenInterstate-derived route is a near-superset of the published Pike release
  - likely key pollution or route conflation where one side has far more exits and low overlap
- Always report:
  - route-level counts for both packs
  - shared exit count
  - exits only in OpenInterstate-derived pack
  - exits only in published Pike release
  - a short list of representative exit refs from each side for the biggest differences
- Write a durable comparison CSV into `/Users/tjohnell/projects/pike/server/.data/compare/` named like `openinterstate-<release-id>-vs-pike-<release-stamp>-route-exit-compare.csv`.
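
The route-level diff described above can be sketched as set operations keyed by `(highway, canonical_direction)`. Everything here is illustrative: the function names, the numeric thresholds, and the extra "route missing" and "mixed" buckets are assumptions, and the rows are expected to be read from each pack beforehand.

```python
from collections import defaultdict


def union_exits(rows):
    """Union exit labels across duplicate corridor rows.

    `rows` yields (highway, canonical_direction, label) tuples, where label is
    the exit `ref` or a fallback such as `name` or `exit_id`.
    """
    routes = defaultdict(set)
    for highway, direction, label in rows:
        routes[(highway, direction)].add(label)
    return routes


def classify(oi, pike, near_superset=0.9, low_overlap=0.5, size_ratio=2.0):
    """Bucket one route; all three thresholds are illustrative assumptions."""
    if oi == pike:
        return "match"
    smaller, larger = sorted((len(oi), len(pike)))
    if smaller == 0:
        return "route missing on one side"
    coverage = len(oi & pike) / smaller  # share of the smaller side that is shared
    if coverage < low_overlap and larger >= size_ratio * smaller:
        return "key pollution or route conflation"
    if coverage >= near_superset and len(pike) > len(oi):
        return "gap in OpenInterstate-derived route"
    if coverage >= near_superset and len(oi) > len(pike):
        return "gap in published Pike route"
    return "mixed differences"


def compare_route(key, oi_routes, pike_routes, sample=5):
    """Build one report row with counts, samples, and a bucket."""
    oi = oi_routes.get(key, set())
    pike = pike_routes.get(key, set())
    return {
        "highway": key[0],
        "canonical_direction": key[1],
        "oi_count": len(oi),
        "pike_count": len(pike),
        "shared": len(oi & pike),
        "only_oi_count": len(oi - pike),
        "only_pike_count": len(pike - oi),
        "only_oi_sample": sorted(oi - pike)[:sample],
        "only_pike_sample": sorted(pike - oi)[:sample],
        "bucket": classify(oi, pike),
    }
```

Rows built this way can be written with `csv.DictWriter` to produce the durable comparison file. Iterating over the keys of the OpenInterstate-derived routes matches the "starting from the OpenInterstate-derived pack" rule.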
29 changes: 14 additions & 15 deletions README.md
@@ -15,36 +15,35 @@ The repo is organized around one job:
 If Docker is installed, this works from a fresh clone:

 ```bash
-./bin/openinterstate build
+./bin/openinterstate --data-parent /Volumes/goose-drive/openinterstate build
 ```

 That command downloads `us-latest.osm.pbf`, starts PostGIS, imports canonical
-OSM, derives product tables, and writes a release into `.data/releases/` by
-default.
+OSM, derives product tables, and writes a release under a workspace chosen from
+the source PBF SHA-256:

-If your main disk is tight, move the managed data workspace onto another volume:
-
-```bash
-./bin/openinterstate --data-dir /Volumes/goose-drive/openinterstate-data build
+```text
+/Volumes/goose-drive/openinterstate/workspaces/pbf-sha256/<sha256>
 ```

-With that command, working data and release artifacts both land under
-`/Volumes/goose-drive/openinterstate-data/`.
-
-Runner caches now follow the managed data root too, so Cargo registry/git
-state and the Rust target directory stay on the external volume instead of
-quietly growing inside Docker-managed local storage.
+Raw source downloads are shared under
+`/Volumes/goose-drive/openinterstate/source-cache/`, and Cargo cache is shared
+under `/Volumes/goose-drive/openinterstate/cache/cargo/` so Rust builds are
+reused across PBF workspaces.

 If you want release artifacts in a separate folder, set an explicit release
 root:

 ```bash
 ./bin/openinterstate \
-  --data-dir /Volumes/goose-drive/openinterstate-data \
-  --release-dir /Volumes/goose-drive/openinterstate-releases \
+  --data-parent /Volumes/goose-drive/openinterstate \
+  --release-dir /Volumes/goose-drive/openinterstate/releases \
   build
 ```

+If you need to pin an exact workspace path and bypass the SHA-derived layout,
+use `--data-dir` as an explicit override.
+
 When the source PBF, import mapping, derive inputs, and release exporter are
 unchanged, repeated builds now skip the already-current stages instead of
 re-downloading or rebuilding them.
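
One common way to implement this kind of stage skipping is to fingerprint each stage's inputs and compare against a stamp from the last successful run. The diff does not show how OpenInterstate actually does it, so the sketch below is a generic illustration with hypothetical names and stage inputs.

```python
import hashlib
import json


def stage_fingerprint(inputs):
    """Hash a stage's inputs, for example a PBF SHA-256 or a mapping file hash."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def stages_to_run(stages, stamps):
    """Return stage names whose inputs changed, plus every later stage.

    `stages` is an ordered list of (name, inputs) pairs; `stamps` maps a stage
    name to the fingerprint recorded by its last successful run.
    """
    pending = []
    dirty = False
    for name, inputs in stages:
        # Once one stage is stale, everything downstream must rerun as well.
        dirty = dirty or stamps.get(name) != stage_fingerprint(inputs)
        if dirty:
            pending.append(name)
    return pending
```

With current stamps for every stage the result is empty and nothing is rebuilt. Changing only the derive inputs reruns derive and export while leaving the download and import alone.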