feat: add 'date.to_start_of_interval' function for truncating values to interval boundaries #5741

Open
sunyeongchoi wants to merge 6 commits into PRQL:main from AhnLab-OSSG:feature/datestartofinterval

Conversation

@sunyeongchoi
Contributor

Closes #5734

Adds a to_start_of_interval function that gives PRQL a native way to truncate values to interval boundaries.

from events | select (event_time | date.to_start_of_interval 15 minute)
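To make "truncating to interval boundaries" concrete, here is a minimal Python sketch of the semantics, assuming fixed-width buckets aligned to the Unix epoch. The helper name `floor_to_interval` and the choice of origin are assumptions for illustration only; the PR itself compiles to each engine's own function, and engines may align buckets differently.

```python
from datetime import datetime, timedelta

# Assumed bucket origin; real engines may use a different alignment.
EPOCH = datetime(1970, 1, 1)


def floor_to_interval(ts: datetime, interval: timedelta) -> datetime:
    """Return the start of the fixed-width bucket that contains ts."""
    # timedelta // timedelta yields the integer count of whole intervals,
    # rounded toward negative infinity.
    whole_intervals = (ts - EPOCH) // interval
    return EPOCH + whole_intervals * interval


print(floor_to_interval(datetime(2026, 3, 26, 10, 37, 12), timedelta(minutes=15)))
# 2026-03-26 10:30:00
```

A timestamp that already sits on a boundary is returned unchanged, which is the behaviour one would expect from a truncation function.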

@max-sixty
Member

max-sixty commented Mar 26, 2026

how about using some form of round or floor here?

either literally or at least using the term as part of the name?

@sunyeongchoi
Contributor Author

sunyeongchoi commented Mar 26, 2026

@max-sixty
Thanks for the feedback.
I want to make sure I understood correctly. Are you proposing we use a term like 'floor' in the name (e.g., date.floor_interval) to make the floor logic more explicit?

@max-sixty
Member

> I want to make sure I understood correctly. Are you proposing we use a term like 'floor' in the name (e.g., date.floor_interval) to make the floor logic more explicit?

yes, something like this!

@prql-bot maybe you could do a survey of what is supported by other languages, around "rounding" to regular intervals

@prql-bot
Collaborator

Here's a survey of how other systems name their datetime truncation/rounding-to-interval functions:

SQL Databases

| Database | Standard Trunc | Arbitrary Interval (e.g. 15 min) |
| --- | --- | --- |
| PostgreSQL | date_trunc('hour', ts) | date_bin('15 min', ts, origin) |
| ClickHouse | date_trunc('hour', ts) | toStartOfInterval(ts, INTERVAL 15 MINUTE) |
| DuckDB | date_trunc('hour', ts) | time_bucket(INTERVAL '15 min', ts) / date_bin(...) |
| BigQuery | TIMESTAMP_TRUNC(ts, HOUR) | (manual arithmetic) |
| Snowflake | DATE_TRUNC('hour', ts) | TIME_SLICE(ts, 15, 'MINUTE') |
| Presto/Trino | date_trunc('hour', ts) | date_bin(INTERVAL '15' MINUTE, ts, origin) |
| SQL Server | DATETRUNC(hour, ts) | (manual DATEADD/DATEDIFF) |
| TimescaleDB | date_trunc(...) | time_bucket('15 minutes', ts) |

Programming Languages / Data Tools

| System | Function | Notes |
| --- | --- | --- |
| pandas | dt.floor('15min') | Also has dt.round(), dt.ceil() |
| Polars | dt.truncate('15m') | Also has dt.round() |
| R (lubridate) | floor_date(ts, '15 minutes') | Also round_date(), ceiling_date() |
| Julia | Dates.floor(ts, Minute(15)) | Also round(), ceil() |
| Spark | date_trunc('hour', ts) | window(ts, '15 minutes') for arbitrary intervals |

Key takeaways

  1. All systems use floor semantics — truncation/floor is the universal default, not round-to-nearest.
  2. date_trunc is the dominant SQL name but only handles standard calendar units, not arbitrary intervals.
  3. For arbitrary intervals, naming is fragmented: date_bin (PG/Trino), time_bucket (TimescaleDB/DuckDB), toStartOfInterval (ClickHouse), TIME_SLICE (Snowflake).
  4. Programming languages uniformly use floor — and offer round/ceil as companions.
  5. toStartOfInterval is ClickHouse-specific — no other major system uses this naming convention.

Possible PRQL names

| Candidate | Pros | Cons |
| --- | --- | --- |
| date.floor | Matches pandas/R/Julia; short; natural round/ceil family | Less familiar to SQL-only users |
| date.truncate | Matches Polars; close to SQL date_trunc | Slightly verbose |
| date.bin | Matches PostgreSQL/Trino date_bin | Jargon-y; less intuitive |
| date.bucket | Matches TimescaleDB time_bucket | Narrow adoption |
| date.to_start_of_interval | Very descriptive | Verbose; ClickHouse-specific |

date.floor seems strongest — it's short, widely used in the data ecosystem, and naturally extends to date.round / date.ceil later if needed.
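As a rough illustration of how a floor / ceil / round family differs, the following Python sketch applies all three to the same timestamps. The function names are hypothetical and only `date.floor` is part of this PR; the tie-breaking rule in `round_dt` (ties go up) is also an assumption, since libraries differ on it.

```python
from datetime import datetime, timedelta

# Assumed bucket origin for all three helpers.
EPOCH = datetime(1970, 1, 1)


def floor_dt(ts: datetime, interval: timedelta) -> datetime:
    """Largest boundary that is <= ts."""
    return EPOCH + ((ts - EPOCH) // interval) * interval


def ceil_dt(ts: datetime, interval: timedelta) -> datetime:
    """Smallest boundary that is >= ts."""
    floored = floor_dt(ts, interval)
    return floored if floored == ts else floored + interval


def round_dt(ts: datetime, interval: timedelta) -> datetime:
    """Nearest boundary; exact midpoints go up (an assumption)."""
    floored = floor_dt(ts, interval)
    return floored if (ts - floored) * 2 < interval else floored + interval


step = timedelta(minutes=15)
for ts in (datetime(2026, 3, 26, 10, 37), datetime(2026, 3, 26, 10, 38)):
    print(ts.time(), floor_dt(ts, step).time(), ceil_dt(ts, step).time(), round_dt(ts, step).time())
# 10:37:00 10:30:00 10:45:00 10:30:00
# 10:38:00 10:30:00 10:45:00 10:45:00
```

Only the rounded value changes between the two rows, which is why the name should say which of the three behaviours a function implements.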

@sunyeongchoi
Contributor Author

@max-sixty

What do you think of date.floor, as suggested by prql-bot? If you're okay with it, I'll go ahead and rename it.

from events | select (event_time | date.floor 15 minute)

@max-sixty
Member

sorry I missed this! I think the .floor is good, if you happen to see this!

@sunyeongchoi sunyeongchoi force-pushed the feature/datestartofinterval branch from 8330dd8 to add2138 Compare May 20, 2026 04:11
- Updated integration snapshots for debug lineage queries to correct input and target IDs.
- Adjusted SQL test cases to ensure proper formatting and expected output.
- Modified documentation snapshots to reflect the correct PRQL version and expressions.
- Improved comments in snapshot tests for clarity regarding pipe usage in PRQL syntax.
Collaborator

@prql-bot prql-bot left a comment

The function was renamed to date.floor during review (following the naming survey), but the PR title still says date.to_start_of_interval. Worth updating it before merge — PRs here land with the PR title as the squash commit message, so it would otherwise ship as a stale name in the history.

Implementation looks good — tests cover the four supported dialects and the error path for the others, and the per-dialect SQL output matches what each engine expects.
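For readers unfamiliar with the "supported dialects plus error path" pattern, here is a hedged Python sketch of the idea. It is not the PR's implementation (which lives in prqlc, in Rust), and this thread does not say which four dialects are supported; the four below are taken from the survey table purely for illustration, and the names `SQL_TEMPLATES` and `render_floor` are hypothetical.

```python
# Hypothetical per-dialect templates, based on the survey table above.
SQL_TEMPLATES = {
    "clickhouse": "toStartOfInterval({col}, INTERVAL {n} {unit})",
    # date_bin needs an explicit origin; the epoch here is an assumption.
    "postgres": "date_bin('{n} {unit}', {col}, TIMESTAMP '1970-01-01')",
    "duckdb": "time_bucket(INTERVAL '{n} {unit}', {col})",
    "snowflake": "TIME_SLICE({col}, {n}, '{unit}')",
}

# Dialects whose conventional spelling of the unit is upper case.
UPPERCASE_UNIT = {"clickhouse", "snowflake"}


def render_floor(dialect: str, column: str, n: int, unit: str) -> str:
    """Render a floor-to-interval expression, or fail for unknown dialects."""
    template = SQL_TEMPLATES.get(dialect)
    if template is None:
        # Error path: no silent fallback for dialects without a mapping.
        raise ValueError(f"date.floor is not supported for dialect {dialect!r}")
    if dialect in UPPERCASE_UNIT:
        unit = unit.upper()
    return template.format(col=column, n=n, unit=unit)


print(render_floor("clickhouse", "event_time", 15, "minute"))
# toStartOfInterval(event_time, INTERVAL 15 MINUTE)
```

Raising for unmapped dialects, rather than emitting generic arithmetic, keeps the generated SQL predictable and makes the unsupported case easy to test.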

@prql-bot
Collaborator

CI is failing on cargo fmt --check — the date_floor_unsupported_dialects and date_floor_operator blocks in prqlc/prqlc/tests/integration/sql.rs have a few off-by-four-spaces indent issues and a wrapping that rustfmt wants on one line. Running cargo fmt --all locally and committing the result will fix it.

Happy to push the fix if useful — say the word.

@sunyeongchoi
Contributor Author

@max-sixty Thanks! As you suggested, I changed the name to date.floor.

Development

Successfully merging this pull request may close these issues.

[Feature Request] Support for arbitrary time-binning in ClickHouse (e.g., toStartOfInterval)

3 participants