
DecentralizedGeo/stac-manager


STAC Manager

Modular Pipeline Orchestration for STAC Collections



What is STAC Manager?

STAC Manager is a Python library for building, orchestrating, and executing modular STAC data pipelines. It lets you ingest STAC Items from APIs or files, transform and enrich Item metadata from external input data (JSON/CSV), validate Items for compliance, add extension properties, and write the results in various formats, all through declarative YAML configuration or a programmatic Python API.

Built on the Pipes and Filters architecture, STAC Manager provides specialized modules that compose into powerful workflows while maintaining simplicity and testability.
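To illustrate the Pipes and Filters idea in general terms, here is a minimal sketch built from plain Python generators. It does not use STAC Manager's own classes: the stage functions (ingest, enrich, validate) and the compose helper are hypothetical names chosen for illustration only.

```python
from typing import Callable, Iterable, Iterator

Item = dict
Filter = Callable[[Iterable[Item]], Iterator[Item]]


def ingest(records: Iterable[Item]) -> Iterator[Item]:
    """Source stage: yield items one at a time instead of building a list."""
    for record in records:
        yield record


def enrich(items: Iterable[Item]) -> Iterator[Item]:
    """Filter stage: add a property to each item as it passes through."""
    for item in items:
        properties = {**item.get("properties", {}), "processed": True}
        yield {**item, "properties": properties}


def validate(items: Iterable[Item]) -> Iterator[Item]:
    """Filter stage: drop items that lack an 'id' field."""
    for item in items:
        if "id" in item:
            yield item


def compose(*stages: Filter) -> Filter:
    """Chain filters so each one consumes the previous one's output."""
    def pipeline(items: Iterable[Item]) -> Iterator[Item]:
        stream: Iterable[Item] = items
        for stage in stages:
            stream = stage(stream)
        yield from stream
    return pipeline


# Hypothetical input: the second record has no 'id' and is filtered out.
pipeline = compose(ingest, enrich, validate)
raw = [{"id": "a", "properties": {}}, {"properties": {}}, {"id": "b"}]
result = list(pipeline(raw))
```

Because every stage is a generator, only one item is in flight at a time, which is the usual way a pipeline of this shape keeps memory use flat regardless of input size. Each stage can also be tested on its own with a small list of dictionaries.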

Important

STAC Manager is a work in progress with new functionality added frequently.


Key Features

  • 🔌 Modular Architecture: 7 pipeline modules (Ingest, Seed, Transform, Update, Validate, Extension, Output)
  • 📝 Declarative Configuration: Define workflows in YAML with full validation
  • 🔄 Streaming Pipeline: Process millions of items with constant memory usage
  • 🎯 Wildcard Patterns: Bulk update assets using assets.* syntax with template variables
  • STAC Compliance: Built-in validation using stac-validator
  • 🎯 Matrix Strategies: Run parallel pipelines for multiple collections
  • 💾 Checkpoint Resume: Recover from failures without re-processing completed items
  • 🐍 Python 3.12+: Modern type hints and structural pattern matching
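The wildcard feature listed above can be pictured with a small sketch. The semantics here are assumptions, not STAC Manager's documented behavior: the function name apply_wildcard_update, the pattern form assets.*.<field>, and the template variables {item_id} and {asset_key} are all hypothetical.

```python
from copy import deepcopy


def apply_wildcard_update(item: dict, path: str, template: str) -> dict:
    """Set one field on every asset of an item, e.g. path='assets.*.href'.

    Assumed semantics: '*' matches every key under 'assets', and the
    template may reference {item_id} and {asset_key}.
    """
    parts = path.split(".", 2)
    if len(parts) != 3 or parts[0] != "assets" or parts[1] != "*":
        raise ValueError(f"unsupported pattern: {path!r}")
    field = parts[2]

    updated = deepcopy(item)  # leave the caller's item untouched
    for asset_key, asset in updated.get("assets", {}).items():
        asset[field] = template.format(item_id=item["id"], asset_key=asset_key)
    return updated


# Hypothetical item and bucket name, for illustration only.
item = {"id": "scene-1", "assets": {"B04": {"href": "old"}, "B08": {"href": "old"}}}
updated = apply_wildcard_update(
    item, "assets.*.href", "s3://example-bucket/{item_id}/{asset_key}.tif"
)
```

The point of the pattern is that one rule rewrites the same field across all assets, so the configuration does not need to enumerate asset keys per item.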

Quick Example

# workflow.yaml
name: ingest-and-validate
steps:
  - id: fetch
    module: IngestModule
    config:
      mode: api
      source: https://planetarycomputer.microsoft.com/api/stac/v1
      collection_id: sentinel-2-l2a
      max_items: 100

  - id: validate
    module: ValidateModule
    depends_on: [fetch]
    config:
      strict: true

  - id: output
    module: OutputModule
    depends_on: [validate]
    config:
      base_dir: ./outputs
      format: json

Run it:

stac-manager run-workflow workflow.yaml
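The depends_on fields in the workflow above describe a dependency graph between steps. As a rough sketch of how such a graph can be turned into an execution order, the following uses the standard library's graphlib on a plain dictionary that mirrors the YAML. This is not STAC Manager's scheduler; execution_order is a hypothetical helper, and the steps are listed out of order on purpose.

```python
from graphlib import TopologicalSorter

# Mirrors the structure of workflow.yaml, with module configs omitted.
workflow = {
    "name": "ingest-and-validate",
    "steps": [
        {"id": "output", "module": "OutputModule", "depends_on": ["validate"]},
        {"id": "fetch", "module": "IngestModule"},
        {"id": "validate", "module": "ValidateModule", "depends_on": ["fetch"]},
    ],
}


def execution_order(workflow: dict) -> list[str]:
    """Return step ids ordered so every step follows its dependencies."""
    graph = {
        step["id"]: set(step.get("depends_on", []))
        for step in workflow["steps"]
    }
    # static_order() raises graphlib.CycleError if the steps form a cycle.
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)  # ['fetch', 'validate', 'output']
```

A topological sort also catches circular dependencies early, before any step has run.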

Installation

Requirements: Python 3.12+

Via pip (recommended)

pip install stac-manager

Via Poetry

poetry add stac-manager

From Source

git clone https://github.com/DecentralizedGeo/stac-manager.git
cd stac-manager
poetry install

Verify installation:

stac-manager --version

Next Steps


License

MIT License - see LICENSE for details.


Acknowledgments

Built with:
