
[ENHANCEMENT] Added provider, resource, and version parameters to sources #535

Open
Eric Godwin (ericgodwin) wants to merge 2 commits into main from ericg/530-update-sources-version

Conversation


@ericgodwin Eric Godwin (ericgodwin) commented May 21, 2026

Background

The intent of this change is to update our source item field to include the information necessary for data provenance:

- provider: The name of the entity that produced the data: meta, esri, microsoft, osm, etc.
- resource: The subject or type of data given by the provider: division-names, buildings, planet, etc.
- version: The sortable identifier such as a date or number: 2026-02-13, 5.3, A5692

Together with the version_id, these values let a user uniquely identify the raw input data that was used to construct Overture data. The current approach of providing only a dataset lacks version information and is constructed inconsistently. All three new fields are nullable and optional to start, because this is the first step: it lets the pipeline begin populating them.
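
As a rough illustration of how the three fields could combine into a snapshot identifier, here is a minimal Python sketch. The `provenance_key` helper and the sample items are hypothetical and are not part of the schema or of this PR; only the field names `provider`, `resource`, and `version` come from the description above.

```python
from typing import Optional

# Field names proposed in this PR; everything else here is illustrative.
PROVENANCE_FIELDS = ("provider", "resource", "version")


def provenance_key(source_item: dict) -> Optional[tuple]:
    """Return (provider, resource, version) if all three are populated, else None."""
    values = tuple(source_item.get(name) for name in PROVENANCE_FIELDS)
    # The fields are optional and nullable for now, so any of them may be
    # missing or None; in that case the snapshot cannot be identified.
    if any(value is None for value in values):
        return None
    return values


# Hypothetical source items built from the sample values in the field list.
populated = {
    "dataset": "OpenStreetMap",
    "provider": "osm",
    "resource": "planet",
    "version": "2026-02-13",
}
legacy = {"dataset": "OpenStreetMap"}

print(provenance_key(populated))
print(provenance_key(legacy))
```

An item that carries only `dataset` yields no key, which is the traceability gap this change is meant to close once pipelines populate the new fields.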

Major change release plan

While this change in itself is not a breaking change, it is part of a larger plan with major impact. The rough timeline for these changes is:

  • Update the schema to add provider + resource + version details as optional fields (this PR - June / July)
  • Non-schema work for pipelines to populate the new fields
  • Update the schema to A) make provider + resource + version required fields and B) mark dataset as deprecated and make it an optional field. (BREAKING - September)
  • Update the schema and code to remove the dataset field. (BREAKING - March 2027 or later)
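
The effect of each phase on a single source item can be sketched in Python as follows. This is only an illustration of the plan above, not the actual schema logic: the `Phase` enum and `check_source_item` function are hypothetical names, and the sketch assumes that `dataset` is a required field today.

```python
from enum import Enum


class Phase(Enum):
    OPTIONAL = 1         # this PR: new fields are optional
    REQUIRED = 2         # September: new fields required, dataset deprecated
    DATASET_REMOVED = 3  # March 2027 or later: dataset removed


NEW_FIELDS = ("provider", "resource", "version")


def check_source_item(source_item: dict, phase: Phase) -> list:
    """List the problems a source item would have under the given phase."""
    problems = []
    # Assumption: dataset is required until the new fields take over.
    required = {"dataset"} if phase is Phase.OPTIONAL else set(NEW_FIELDS)
    for name in sorted(required):
        if source_item.get(name) is None:
            problems.append(f"missing {name}")
    # Once dataset is removed, items that still carry it become invalid.
    if phase is Phase.DATASET_REMOVED and "dataset" in source_item:
        problems.append("dataset is no longer allowed")
    return problems


legacy = {"dataset": "OpenStreetMap"}
migrated = {
    "dataset": "OpenStreetMap",
    "provider": "osm",
    "resource": "planet",
    "version": "2026-02-13",
}
```

Under these assumptions, a legacy item passes only in the first phase, while an item that carries both the old and the new fields passes in the first two phases and fails once `dataset` is removed.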

The messaging around this change is that the current method of providing provenance is not sufficient to ensure traceability. Besides documenting the deprecation of dataset, we will want to explain how provider, resource, and version work together to identify a data snapshot.

Closes #530

Testing

Two new test cases have been added: a counterexample that checks that each of the new fields has a length of at least 1, and an example that shows what properly populated fields look like.
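
The rule these test cases exercise can be approximated in plain Python. This is a sketch of the intent only, not the YAML example files or the schema validator used by the test suite; `provenance_errors` and the sample items are hypothetical.

```python
PROVENANCE_FIELDS = ("provider", "resource", "version")


def provenance_errors(source_item: dict) -> list:
    """Collect problems with the optional provenance fields of one source item."""
    errors = []
    for name in PROVENANCE_FIELDS:
        value = source_item.get(name)
        if value is None:
            # Absent or null is allowed while the fields are optional and nullable.
            continue
        if not isinstance(value, str):
            errors.append(f"{name} must be a string")
        elif len(value) < 1:
            errors.append(f"{name} must have length >= 1")
    return errors


# Hypothetical items: one properly populated, one with an empty provider.
good = {"dataset": "OpenStreetMap", "provider": "osm",
        "resource": "planet", "version": "2026-02-13"}
bad = {"dataset": "OpenStreetMap", "provider": "",
       "resource": "planet", "version": "2026-02-13"}
```

Here `good` plays the role of the new example and `bad` the role of the new counterexample: an empty string is rejected, while a missing or null field is still accepted at this stage.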

The tests were then run with the following results:

  ## Test Results
  
  | Result | Count |
  |--------|-------|
  | ✅ Passed | 1993 |
  | ❌ Failed | 0 |
  | ⚠️  Errors | 0 |

  **Duration:** 5.21s

Documentation website

Docs preview for this PR.

…yItem

Signed-off-by: ericgodwin <eric@overturemaps.org>
@ericgodwin Eric Godwin (ericgodwin) added the change type - minor 🤏 Minor schema change. See https://lf-overturemaps.atlassian.net/wiki/x/GgDa label May 21, 2026
@github-actions

🗺️ Schema reference docs preview is live!

🌍 Preview https://staging.overturemaps.org/schema/pr/535/schema/index.html
🕐 Updated May 21, 2026 17:34 UTC
📝 Commit 0893ae6
🔧 env SCHEMA_PREVIEW true

Note

♻️ This preview updates automatically with each push to this PR.

@ericgodwin Eric Godwin (ericgodwin) changed the title Added provider, resource, and version parameters to the sourcePropert… [Enhancement] Added provider, resource, and version parameters to sources May 21, 2026
@ericgodwin Eric Godwin (ericgodwin) changed the title [Enhancement] Added provider, resource, and version parameters to sources [ENHANCEMENT] Added provider, resource, and version parameters to sources May 21, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds additional provenance fields to sources schema items so that a source can be identified with finer granularity than the current dataset string (as groundwork for future deprecation of dataset). All changed files in the PR were reviewed.

Changes:

  • Extend sourcePropertyItem with optional provider, resource, and version string fields (with minLength constraints).
  • Update schema documentation text around sourcePropertyItem and sources to reflect the intended future direction.
  • Add one new example and one new counterexample illustrating populated vs. invalid empty values.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
|------|-------------|
| schema/defs.yaml | Adds provider/resource/version fields to the common sourcePropertyItem definition and updates related descriptions. |
| examples/buildings/sources-with-version.yaml | New example demonstrating populated provider/resource/version fields in sources. |
| counterexamples/buildings/bad-sources-empty-provider.yaml | New counterexample validating that empty strings for the new fields are rejected by the schema constraints. |

Comment thread schema/defs.yaml
Comment thread schema/defs.yaml
Comment thread schema/defs.yaml
Agent-Logs-Url: https://github.com/OvertureMaps/schema/sessions/2997187e-b460-4134-a7d3-bd9fab7b2c22

Co-authored-by: ericgodwin <1336911+ericgodwin@users.noreply.github.com>

Development

Successfully merging this pull request may close these issues.

Update sources details to provide the version

3 participants