Skip to content

feat: integrates carbon estimation into API and resolves Issue #14#27

Open
grunffindor wants to merge 22 commits into
Climate-Vision:mainfrom
grunffindor:fix-issue-14
Open

feat: integrates carbon estimation into API and resolves Issue #14#27
grunffindor wants to merge 22 commits into
Climate-Vision:mainfrom
grunffindor:fix-issue-14

Conversation

@grunffindor
Copy link
Copy Markdown

This PR resolves Issue #14 by integrating the carbon stock estimation module into the main API pipeline with graceful degradation.

Key Changes:

  • Pipeline: Updated pipeline.py to execute carbon math during deforestation analysis.
  • Schemas & API: Added the enable_carbon feature flag to request schemas and forwarded it through the inference endpoints.
  • New Endpoint: Created GET /api/reports/{run_id} to return structured impact reports (hectares_lost, carbon_tonnes, and confidence intervals).
  • Testing: Added unit tests in test_carbon.py to validate the carbon math against expected IPCC factors.

Resolves #14

Goldokpa and others added 22 commits March 28, 2026 21:22
…iddleware-audit

Merging Olufemi's API middleware and auth modules
…tics-statistics

Merging Francis's analytics statistics and reporting modules
Defines responsibilities, deliverables, and collaboration guidelines for the Carbon Analytics & Validation role.

Co-Authored-By: Francis Umo <francis.umo@climatevision.org>
Defines responsibilities, deliverables, and collaboration guidelines for the API Development & Integration role.

Co-Authored-By: Olufemi Taiwo <olufemi.taiwo@climatevision.org>
…mate-Vision#7)

* feat(data): add GEE tile downloader with analysis-aware band selection

- Downloads real Sentinel-2 composites via Google Earth Engine
- Reads required bands from config.yaml per analysis_type
- Includes SCL band for downstream cloud masking
- Synthetic fallback with explicit is_synthetic flag when GEE unavailable
- Fix .gitignore so src/climatevision/data/ is no longer ignored

* feat(data): add analysis-specific Sentinel-2 band mapping utilities

- get_bands_for_analysis() reads correct bands from config.yaml
- get_band_indices() maps band names to canonical 13-band stack positions
- is_analysis_enabled() and list_enabled_analysis_types() for config validation
- Includes SCL band helpers for downstream cloud masking

* feat(data): integrate SCL cloud masking and export new pipeline modules

- apply_scl_cloud_mask() masks cloudy pixels using Sentinel-2 SCL band
- Default clear labels: vegetation, bare soils, water, snow
- Update __init__.py to expose gee_downloader and band_mapping utilities

* refactor(data): address PR review feedback

- Remove duplicated config logic in gee_downloader.py; import from band_mapping
- Cache config.yaml load in band_mapping.py via lru_cache
- Read synthetic tile size from config.yaml instead of hardcoding 256
- Remove unused json import in gee_downloader.py
- Add shape validation in apply_scl_cloud_mask

---------

Co-authored-by: Adeolu Mary Oshadare <adeolu@placeholder.com>
…ing (Climate-Vision#8)

* feat(inference): make pipeline analysis-aware with dynamic model loading

- _load_model() now accepts analysis_type and reads in_channels/num_classes from config.yaml
- Per-analysis-type model cache prevents cross-contamination between deforestation/ice/flood models
- _find_best_checkpoint() prefers config.yaml weight path per analysis type
- run_inference() accepts analysis_type, pads/crops to correct n_channels, and returns dynamic class counts
- run_inference_from_file() and run_inference_from_gee() propagate analysis_type parameter

* feat(api): wire analysis_type into prediction endpoints

- Pass body.analysis_type to run_inference_from_gee() in /api/predict
- Pass analysis_type to run_inference_from_file() in /api/predict/upload
- Enables the API to load the correct model and return correct class counts per analysis type

---------

Co-authored-by: Olufemi Taiwo <Olufemitaiwo23@gmail.com>
… flag, add config health validation

- Add cv_dev development key bypass for local testing
- Require X-API-Key on all mutation endpoints (POST predict, orgs, alerts, subscriptions)
- Surface is_synthetic at root of inference response for frontend demo banners
- Expand /api/health to validate config alignment (bands vs in_channels, classes vs num_classes)
- Add FastAPI test client fixture
- Create CI workflow for Python (flake8, pytest) and frontend (npm build)
- Bootstrap tests/ directory structure
- Parametrize UNet init for all 3 analysis types (4ch/2cl, 4ch/3cl, 3ch/3cl)
- Validate forward pass output shapes
- Add Siamese change detection forward shape test
- Link to 6 active good-first-issue and help-wanted issues
- Add claim workflow for new contributors
- Include time estimates and skill-building map
- ../components/map/ -> ../components/Map/
- Fixes vite build failure on Linux (case-sensitive filesystem)
- Fixes pip install failure for gdal and rasterio on Ubuntu runners
- Adds libgdal-dev, gdal-bin, libgl1-mesa-glx
- gdal Python package requires exact system GDAL version matching
- rasterio covers all GDAL functionality we actually use
- Simplify CI system deps to libgl1 only (for opencv runtime)
- Fixes ModuleNotFoundError: No module named 'climatevision'
- pip install -e . registers src/ as an importable package
- ForestDataset with DataLoader support
- Training/validation augmentation pipelines
- Synthetic tile generation for demo/fallback mode
- Add DONE/PENDING task list for April 2026 sprint
- Include actual .github/workflows/ci.yml code in role doc
- Update local CI check commands to match current workflow
Copy link
Copy Markdown
Collaborator

@femi23 femi23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for tackling #14 @grunffindor — wiring the new analytics.carbon.CarbonEstimator into the inference path is genuinely useful work and the feature flag (enable_carbon) is the right shape. There are a few real issues to address before this can land though, and one of them is a security regression.

Blockers

1. 🚨 Auth on POST /api/organizations is commented out.

@app.post("/api/organizations", response_model=OrganizationWithKeyResponse)
def create_org(
    body: CreateOrganizationRequest,
    # org: dict[str, Any] = Depends(require_api_key),   <-- this PR
) -> dict[str, Any]:

This can't merge. The require_api_key dependency is what prevents anonymous callers from creating organizations and minting new API keys. With it disabled, anyone on the open internet can POST /api/organizations and walk away with a fresh API key for a free organization, which then bypasses every other key-gated endpoint we have. Please drop this hunk entirely — auth on create_org is intentional. If you ran into a test setup issue with the dependency (e.g. local tests failing because there's no bootstrap key), let me know and we can sort it out separately.

2. Stray indentation / whitespace edits unrelated to the feature. A few hunks have accidental trailing whitespace and an extra blank indented line inside predict_upload's exception branch (the + \n line). Please clean these up — they'll make every future blame on those lines point at this PR.

3. The branch is behind main and needs a rebase. The diff context for create_org doesn't match current main (which already does not have the require_api_key dep on that function — main and this branch have diverged). Please git fetch origin && git rebase origin/main and re-push. After the rebase, blocker #1 may resolve itself naturally (the line you're commenting out won't be there).

Should-fix

4. /api/reports/{run_id} overlaps the existing /api/reports (PR #42) namespace. That one is admin-scoped data-quality KPIs; yours is per-run impact. They can coexist because of the path parameter, but the naming will confuse callers reading the OpenAPI spec. Consider:

  • /api/runs/{run_id}/impact-report (RESTful, scoped under runs)
  • or /api/impact-reports/{run_id} (clearly distinct from admin reports)

5. The new endpoint should probably be auth-gated too. create_org notwithstanding, /api/reports/{run_id} exposes hectares/carbon estimates for a run. If runs are tied to organizations, we should at least require an API key. Add org: dict = Depends(require_api_key) and (eventually) verify the run belongs to the caller's org.

6. enable_carbon defaults to False everywhere — please document the trade-off. Add a one-liner in the OpenAPI description= field explaining that carbon estimation adds ~N ms of post-processing per inference. Helps API consumers decide.

7. Defensive double-check on the conversion. In pipeline.py:

mask_np = (predictions == 1).squeeze(0).cpu().numpy()

This assumes predictions is shape (1, H, W). If batched inference ever returns (N, H, W) you'll silently produce an (H, W) mask from only the first sample. Either assert the batch dim or iterate. @franchaise — can you confirm CarbonEstimator.estimate_from_mask is fine taking a single 2-D mask only? If batching is in the roadmap we should plumb through.

Nits

  • The carbon-estimation block has #Begin fix issue #14 / #End fix issue #14 sentinel comments. Drop these — git history is the record. Same for # Forwarding feature flag (Issue #14) inline tags; the PR description is the place to link the issue.
  • The new ImpactReportResponse schema is declared but the endpoint returns a dict, not the model. Annotate the endpoint with -> ImpactReportResponse and response_model=ImpactReportResponse on the decorator so FastAPI validates and the OpenAPI matches.
  • enable_carbon:boolenable_carbon: bool (PEP 8 spacing).

Let's get the auth hunk reverted and the rebase done; the carbon wiring itself is solid and I want this to land.

cc @franchaise for the carbon-math side.

@Goldokpa
Copy link
Copy Markdown
Member

📢 Heads-up: repo history was rewritten today (2026-05-18)

We force-pushed a cleaned history across all branches to remove an internal directory from past commits. Your code and this PR are unaffected — only the commit SHAs underneath have shifted. GitHub will re-render the diff against the new base automatically.

If you have a local clone, please bring it back in sync before pushing anything else:

# Option A (simplest): fresh start
git clone https://github.com/Climate-Vision/ClimateVision.git

# Option B: rebase the existing PR branch in your fork
git fetch origin
git checkout <your-branch>
git rebase origin/main          # likely no conflicts
git push --force-with-lease

Do not git pull on an existing clone — it will produce a messy non-fast-forward state. Either re-clone, or rebase explicitly as above.

Apologies for the interruption — really appreciate your patience here. If anything looks off after rebasing, leave a comment and I'll help unblock right away. Thanks for contributing 🙏

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Good First Issue] Wire carbon analytics into API response and add impact report endpoint

5 participants