FreaxMATE/timedb

timedb

TL;DR

timedb is an opinionated schema and API built on top of PostgreSQL, designed to handle overlapping time series revisions and auditable human-in-the-loop updates.

Most time series systems assume a single immutable value per timestamp. timedb is built for domains where data is revised, forecasted, reviewed, and corrected over time.

timedb lets you:

  • ⏱️ Retain "time-of-knowledge" history through a three-dimensional time series data model;
  • ✍️ Make versioned ad-hoc updates to the time series data with annotations and tags; and
  • 🔀 Represent both timestamp and time-interval time series simultaneously.

Why timedb?

Most time series systems assume:

  • one value per timestamp;
  • immutable historical data; and
  • no distinction between when something was true vs when it was known.

These assumptions become a major drawback in situations such as:

  • forecasting, where multiple forecast revisions predict the same timestamp;
  • backtesting, where "time-of-knowledge" history is required by algorithms;
  • data communication, where an auditable history of updates is required;
  • human review and correction, where values are manually adjusted, annotated, or validated over time; and
  • late-arriving data and backfills, where new information must be incorporated without rewriting history.

In practice, teams work around these limitations by overwriting data, duplicating tables and columns, or encoding semantics in column names — making systems fragile, opaque, and hard to reason about.

timedb addresses this by making revisions, provenance, and temporal semantics explicit in the data model, rather than treating them as edge cases.

Installation

pip install timedb

Basic usage

TBD

Tables

runs_table

| Field | Type | Purpose |
| --- | --- | --- |
| run_id (primary key) | attribute | Unique identifier for the run (generated by the API) |
| workflow_id | attribute | Identifier for the workflow that produced this run |
| run_start_time | time dimension | When the workflow started |
| run_finish_time (optional) | time dimension | When the workflow finished |
| run_params (optional) | attribute | Parameters/configuration used for this run (JSON object) |
| inserted_at | time dimension | When the row was inserted (default now()) |

values_table

| Field | Type | Purpose |
| --- | --- | --- |
| value_id (primary key) | attribute | Unique identifier for each version of a value |
| run_id (foreign key) | attribute | References the run that produced this value (runs_table.run_id) |
| valid_time | time dimension | Timestamp the value is valid for |
| valid_time_end (optional) | time dimension | Optional interval end time; NULL means point-in-time at valid_time |
| value (optional) | measure | The numeric value (nullable; NULL can be a valid stored value) |
| annotation (optional) | attribute | Optional human annotation (whitespace-only disallowed) |
| tags (optional) | attribute | Optional semantic labels / quality flags (empty arrays disallowed; use NULL) |
| changed_by (optional) | attribute | User or service responsible for the change |
| change_time | time dimension | When this version row was created (default now()) |
| is_current | attribute | Whether this row is the active version for its key (default true) |
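
The versioning behaviour implied by value_id, change_time, and is_current can be illustrated with a small in-memory sketch. This is not the timedb API (basic usage is still TBD); `ValueRow` and `apply_update` are hypothetical names, and the sketch assumes that the version key is (run_id, valid_time, valid_time_end). An update never overwrites a row: it appends a new version and flags the previous one as no longer current.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class ValueRow:
    """In-memory stand-in for one values_table row (illustrative only)."""
    value_id: int
    run_id: str
    valid_time: datetime
    value: Optional[float]
    change_time: datetime
    valid_time_end: Optional[datetime] = None  # None means point-in-time
    annotation: Optional[str] = None
    tags: Optional[Tuple[str, ...]] = None
    changed_by: Optional[str] = None
    is_current: bool = True


def _key(row: ValueRow):
    # Assumption: this tuple identifies "the same value" across versions.
    return (row.run_id, row.valid_time, row.valid_time_end)


def apply_update(rows: List[ValueRow], new_row: ValueRow) -> List[ValueRow]:
    """Append new_row and mark the previous current version as superseded."""
    # Mirror the constraints described in the table above.
    if new_row.annotation is not None and not new_row.annotation.strip():
        raise ValueError("annotation must not be whitespace-only")
    if new_row.tags is not None and len(new_row.tags) == 0:
        raise ValueError("tags must not be empty; use None instead")
    superseded = [
        replace(r, is_current=False)
        if r.is_current and _key(r) == _key(new_row)
        else r
        for r in rows
    ]
    return superseded + [new_row]


t = datetime(2025, 1, 1, 12, tzinfo=timezone.utc)
rows = [ValueRow(1, "run-a", t, 100.0,
                 change_time=datetime(2025, 1, 1, 6, tzinfo=timezone.utc))]
rows = apply_update(rows, ValueRow(
    2, "run-a", t, 95.0,
    change_time=datetime(2025, 1, 1, 8, tzinfo=timezone.utc),
    annotation="sensor drift corrected", tags=("manual",), changed_by="alice",
))
```

After the update, both versions are still present; only the second one has is_current set, so the original value remains available for audits.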

metadata_table

| Field | Type | Purpose |
| --- | --- | --- |
| metadata_id (primary key) | attribute | Surrogate primary key for metadata rows |
| run_id (foreign key) | attribute | References run context (runs_table.run_id) |
| valid_time | time dimension | Time context for the metadata (joins onto values via (run_id, valid_time)) |
| metadata_key | attribute | Name of the metadata field (e.g. contractId, deliveryStart) |
| value_number (optional) | attribute | Numeric metadata value (exactly one typed value must be set per row) |
| value_string (optional) | attribute | String metadata value (mutually exclusive with other typed values) |
| value_bool (optional) | attribute | Boolean metadata value (mutually exclusive with other typed values) |
| value_time (optional) | attribute | Timestamp metadata value (mutually exclusive with other typed values) |
| value_json (optional) | attribute | JSON metadata value (mutually exclusive with other typed values) |
| inserted_at | time dimension | When the metadata row was inserted (default now()) |
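
The "exactly one typed value per row" rule can be sketched on the client side as follows. This is an illustration only: `typed_metadata_columns` and `check_exactly_one` are hypothetical helpers, not part of timedb, and how the database itself enforces the rule is not described here.

```python
import json
from datetime import datetime
from typing import Any, Dict

TYPED_COLUMNS = ("value_number", "value_string", "value_bool",
                 "value_time", "value_json")


def typed_metadata_columns(value: Any) -> Dict[str, Any]:
    """Map a Python value onto the five typed columns, setting exactly one."""
    columns: Dict[str, Any] = {name: None for name in TYPED_COLUMNS}
    # bool is checked before int/float because bool is a subclass of int.
    if isinstance(value, bool):
        columns["value_bool"] = value
    elif isinstance(value, (int, float)):
        columns["value_number"] = value
    elif isinstance(value, str):
        columns["value_string"] = value
    elif isinstance(value, datetime):
        columns["value_time"] = value
    elif isinstance(value, (dict, list)):
        columns["value_json"] = json.dumps(value)
    else:
        raise TypeError(f"unsupported metadata type: {type(value).__name__}")
    return columns


def check_exactly_one(columns: Dict[str, Any]) -> str:
    """Return the name of the single populated typed column, or raise."""
    populated = [name for name in TYPED_COLUMNS if columns.get(name) is not None]
    if len(populated) != 1:
        raise ValueError(f"expected exactly one typed value, got {populated}")
    return populated[0]


row = {"metadata_key": "contractId", **typed_metadata_columns("C-1042")}
```

Here `check_exactly_one(row)` returns `"value_string"`; a row with two populated typed columns, or none, raises `ValueError`.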


Three-dimensional time series data model

Every time series value is described using three independent timelines:

| Time dimension | Description |
| --- | --- |
| knowledge_time | The time when the value was known |
| valid_time | The time the value represents a fact for |
| change_time | The time when the value was changed |
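
To make the three timelines concrete, here is a minimal "as-of" lookup over in-memory records. It is a sketch under assumptions, not timedb's query interface: `Point` and `as_of` are hypothetical names, and knowledge_time is treated as a plain field on each record. For every valid_time, the lookup returns the value from the latest forecast known by a given cutoff, optionally ignoring edits made after a second cutoff.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, NamedTuple, Optional


class Point(NamedTuple):
    knowledge_time: datetime  # when the value was known
    valid_time: datetime      # the time the value represents a fact for
    change_time: datetime     # when this version was written
    value: Optional[float]


def as_of(points: Iterable[Point], known_by: datetime,
          change_cutoff: Optional[datetime] = None) -> Dict[datetime, Optional[float]]:
    """Latest value per valid_time, restricted to what was known and
    changed no later than the given cutoffs."""
    best: Dict[datetime, Point] = {}
    for p in points:
        if p.knowledge_time > known_by:
            continue
        if change_cutoff is not None and p.change_time > change_cutoff:
            continue
        current = best.get(p.valid_time)
        if current is None or (p.knowledge_time, p.change_time) > (
                current.knowledge_time, current.change_time):
            best[p.valid_time] = p
    return {vt: best[vt].value for vt in sorted(best)}


def utc(hour: int, minute: int = 0) -> datetime:
    return datetime(2025, 1, 1, hour, minute, tzinfo=timezone.utc)


points = [
    Point(utc(6), utc(12), utc(6), 100.0),   # first forecast for 12:00
    Point(utc(9), utc(12), utc(9), 110.0),   # revised forecast
    Point(utc(9), utc(12), utc(10), 105.0),  # manual correction of the revision
]

early = as_of(points, known_by=utc(7))
latest = as_of(points, known_by=utc(9, 30))
before_edit = as_of(points, known_by=utc(9, 30), change_cutoff=utc(9, 30))
```

With this data, `early` holds 100.0 for 12:00, `latest` holds the corrected 105.0, and `before_edit` holds the uncorrected 110.0. A backtest would use the `known_by` cutoff to avoid seeing forecasts that did not yet exist, while an audit would use `change_cutoff` to reproduce what was stored before a manual edit.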

Additional attributes

The schema columns provide additional attributes for each value:

| Column name | Description |
| --- | --- |
| value | The numeric value being stored (may be NULL) |
| tags | Semantic labels and quality flags applied to the value |
| annotation | Optional human annotation explaining the value or change |
| run_id | Reference to the workflow run that produced the value |
| run_params | Parameters and configuration associated with the producing run |
| is_current | Indicates whether this row is the active version for its key |
| changed_by | User or service responsible for the change |

Roadmap

  • Decouple the knowledge time from the run_time
  • Python SDK that allows time series data manipulations, reads and writes
  • RESTful API layer that serves data to users
  • Handle different time zones in the API layer while always storing in UTC in the database
  • Support for PostgreSQL time intervals via range types (tsrange/tstzrange)
  • Built-in data retention, TTL, and archiving
  • Support for subscribing to database updates through the API
  • Unit handling (e.g. MW, kW)
