Modular Pipeline Orchestration for STAC Collections
STAC Manager is a Python library for building, orchestrating, and executing modular STAC data pipelines. It enables you to ingest STAC items from APIs or files, transform and enrich Item metadata from external input data (JSON/CSV), validate Item compliance, extend Items with extension properties, and output to various formats—all through declarative YAML configuration or a programmatic Python API.
Built on the Pipes and Filters architecture, STAC Manager provides specialized modules that compose into powerful workflows while maintaining simplicity and testability.
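As a rough illustration of the Pipes and Filters idea, the sketch below chains generator functions so that each item flows through every stage in turn. It is not STAC Manager's implementation: the function names (`ingest`, `transform`, `validate`, `run_pipeline`) and the use of plain dicts as items are assumptions made for this example.

```python
from collections.abc import Callable, Iterable, Iterator

Item = dict  # stand-in for a STAC Item represented as a plain dict
Filter = Callable[[Iterable[Item]], Iterator[Item]]


def ingest(raw_items: Iterable[Item]) -> Iterator[Item]:
    """Source filter: yield items one at a time."""
    for item in raw_items:
        yield item


def transform(items: Iterable[Item]) -> Iterator[Item]:
    """Add a property to each item without mutating the input."""
    for item in items:
        properties = {**item.get("properties", {}), "processed": True}
        yield {**item, "properties": properties}


def validate(items: Iterable[Item]) -> Iterator[Item]:
    """Pass through only items that carry a non-empty id."""
    for item in items:
        if item.get("id"):
            yield item


def run_pipeline(source: Iterable[Item], filters: list[Filter]) -> Iterator[Item]:
    """Chain filters lazily; nothing runs until the result is consumed."""
    stream: Iterable[Item] = source
    for apply_filter in filters:
        stream = apply_filter(stream)
    return iter(stream)


raw = [{"id": "a", "properties": {}}, {"id": "", "properties": {}}, {"id": "b"}]
result = list(run_pipeline(raw, [ingest, transform, validate]))
```

Because every stage is a generator, only one item is in flight at a time, and each filter can be unit-tested on its own with a small list of dicts.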
Important: STAC Manager is a work in progress, with new functionality added frequently.
- 🔌 Modular Architecture: 7 pipeline modules (Ingest, Seed, Transform, Update, Validate, Extension, Output)
- 📝 Declarative Configuration: Define workflows in YAML with full validation
- 🔄 Streaming Pipeline: Process millions of items with constant memory usage
- 🎯 Wildcard Patterns: Bulk update assets using `assets.*` syntax with template variables
- ✅ STAC Compliance: Built-in validation using stac-validator
- 🎯 Matrix Strategies: Run parallel pipelines for multiple collections
- 💾 Checkpoint Resume: Recover from failures without re-processing completed items
- 🐍 Python 3.12+: Modern type hints and structural pattern matching
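To make the wildcard feature in the list above more concrete, here is a minimal sketch of how an `assets.*` style pattern with template variables could be expanded. STAC Manager's actual pattern syntax and variable names are not shown in this README, so `update_assets`, the variables `{item_id}`, `{collection}` and `{asset_key}`, and the `s3://bucket` path are all hypothetical.

```python
import copy
from fnmatch import fnmatchcase


def update_assets(item: dict, pattern: str, field: str, template: str) -> dict:
    """Return a copy of `item` with `field` set on every asset matching `pattern`.

    `pattern` has the form "assets.<glob>", for example "assets.*" or "assets.B0*".
    `template` may use {item_id}, {collection} and {asset_key}
    (hypothetical variable names chosen for this sketch).
    """
    prefix, _, key_glob = pattern.partition(".")
    if prefix != "assets" or not key_glob:
        raise ValueError(f"unsupported pattern: {pattern!r}")

    updated = copy.deepcopy(item)  # leave the caller's item untouched
    for key, asset in updated.get("assets", {}).items():
        if fnmatchcase(key, key_glob):
            asset[field] = template.format(
                item_id=updated.get("id", ""),
                collection=updated.get("collection", ""),
                asset_key=key,
            )
    return updated


item = {
    "id": "S2A_001",
    "collection": "sentinel-2-l2a",
    "assets": {
        "B04": {"href": "old/B04.tif"},
        "B08": {"href": "old/B08.tif"},
        "thumbnail": {"href": "old/thumb.png"},
    },
}
patched = update_assets(
    item, "assets.B0*", "href", "s3://bucket/{collection}/{item_id}/{asset_key}.tif"
)
```

Here the two band assets get new `href` values while `thumbnail` is left alone, since its key does not match `B0*`.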
```yaml
# workflow.yaml
name: ingest-and-validate
steps:
  - id: fetch
    module: IngestModule
    config:
      mode: api
      source: https://planetarycomputer.microsoft.com/api/stac/v1
      collection_id: sentinel-2-l2a
      max_items: 100
  - id: validate
    module: ValidateModule
    depends_on: [fetch]
    config:
      strict: true
  - id: output
    module: OutputModule
    depends_on: [validate]
    config:
      base_dir: ./outputs
      format: json
```

Run it:

```bash
stac-manager run-workflow workflow.yaml
```

Requirements: Python 3.12+
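The `depends_on` entries in the workflow example imply an execution order. The sketch below shows one way such an order could be derived, using a topological sort from Python's standard library. It is an assumption about how a runner might work, not STAC Manager's actual scheduler, and `execution_order` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter

# Same step layout as the workflow example, written as plain dicts (and
# deliberately out of order) so the sketch needs no YAML parser.
steps = [
    {"id": "output", "module": "OutputModule", "depends_on": ["validate"]},
    {"id": "fetch", "module": "IngestModule"},
    {"id": "validate", "module": "ValidateModule", "depends_on": ["fetch"]},
]


def execution_order(steps: list[dict]) -> list[str]:
    """Return step ids ordered so that every step runs after its dependencies."""
    known = {step["id"] for step in steps}
    graph: dict[str, set[str]] = {}
    for step in steps:
        deps = step.get("depends_on", [])
        missing = [dep for dep in deps if dep not in known]
        if missing:
            raise ValueError(
                f"step {step['id']!r} depends on unknown step(s): {missing}"
            )
        graph[step["id"]] = set(deps)
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc


order = execution_order(steps)
```

For the three steps above, the only valid order is `fetch`, then `validate`, then `output`. Rejecting unknown ids and cycles up front means a misconfigured workflow fails before any items are fetched.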
With pip:

```bash
pip install stac-manager
```

With Poetry:

```bash
poetry add stac-manager
```

From source:

```bash
git clone https://github.com/DecentralizedGeo/stac-manager.git
cd stac-manager
poetry install
```

Verify installation:

```bash
stac-manager --version
```

- 📖 Installation Guide - Detailed setup instructions
- 🚀 Quickstart - Run your first workflow in 5 minutes
- 📚 Architecture Guide - Pipes and Filters model
- 🔧 Module Reference - Complete module documentation
MIT License - see LICENSE for details.
Built with:
- PySTAC - STAC data models
- pystac-client - STAC API client
- stac-validator - STAC validation
- stac-geoparquet - Parquet conversion