Skip to content

Comments

feat: introduce query plan outlines and canonicalization#2901

Open
barakmich wants to merge 14 commits intomainfrom
barakmich/outlines
Open

feat: introduce query plan outlines and canonicalization#2901
barakmich wants to merge 14 commits intomainfrom
barakmich/outlines

Conversation

@barakmich
Copy link
Contributor

Description

Testing

References

@barakmich barakmich requested a review from a team as a code owner February 13, 2026 23:50
@github-actions github-actions bot added the area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools) label Feb 13, 2026
@codecov
Copy link

codecov bot commented Feb 13, 2026

Codecov Report

❌ Patch coverage is 82.76644% with 152 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.62%. Comparing base (1d65cbb) to head (ae0dd6b).

Files with missing lines Patch % Lines
pkg/query/path.go 38.95% 48 Missing and 10 partials ⚠️
pkg/query/outline.go 86.40% 43 Missing and 14 partials ⚠️
pkg/query/build_tree.go 85.72% 22 Missing ⚠️
pkg/query/canonicalize.go 92.17% 9 Missing and 4 partials ⚠️
pkg/schema/v2/schema.go 95.84% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2901      +/-   ##
==========================================
+ Coverage   74.54%   74.62%   +0.08%     
==========================================
  Files         484      485       +1     
  Lines       59228    59972     +744     
==========================================
+ Hits        44145    44747     +602     
- Misses      11990    12104     +114     
- Partials     3093     3121      +28     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

tstirrat15
tstirrat15 previously approved these changes Feb 17, 2026
Copy link
Contributor

@tstirrat15 tstirrat15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments, otherwise LGTM


// extractCaveats recursively extracts all CaveatIteratorType nodes from the tree.
// Returns the tree with caveats removed and a list of all extracted caveats.
func extractCaveats(outline Outline) (Outline, []*core.ContextualizedCaveat) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More for my own edification than something that needs to change: what's the tradeoff between using iterators for this kind of work and using slices? Like I had the thought that extractCaveats could be written as an iterator and then you could use slices.SortedFunc above, but I don't know whether one approach is better than the other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the cool thing about Outlines is all in the function signature.

Iterators, for doing the Check/IR/IS, make sense to be purely abstracted. They're a set, and we just want to fetch/evaluate some subset of that set.

But when we're manipulating them directly, like this, the signature func(Iterator) (Iterator, []*core.ContextualizedCaveat) here requires a lot of type switching, especially when replacing subiterators. .ReplaceSubiterators() does some of that, but hopefully that can go away!

So you totally can write extractCaveats this way -- the first version of canonicalization did exactly that -- but it's far more convenient to work with outlines.

(But this is also a good note to clean up the Iterator interface a little in a followup!)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! When I said "iterator" I meant golang iter seqs or whatever they're called. Like lazy evaluation and yielding instead of operating on slices.


// Compare compares two BaseRelations, returning -1 if b < other, 0 if b == other, 1 if b > other.
// Comparison is done lexicographically by: definition name, relation name, type, subrelation, caveat, expiration.
func (b *BaseRelation) Compare(other *BaseRelation) int {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same q here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one I feel more confident that we won't be changing, unless there's something drastic to change in the schema. But! We are testing it at a higher level because otherwise the fuzz for the canonicalization will fail

}

// argsCompare returns -1 if a < b, 0 if a == b, 1 if a > b
func argsCompare(a, b *IteratorArgs) int {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question here - will we capture it if we add new args?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep; fuzzing on this sort order should find args that wouldn't stably sort

Copy link
Contributor

@miparnisari miparnisari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please add descriptions to your PRs? Motivation, etc?

// 2. Bottom-Up Canonicalization: Apply five transformation steps sequentially
//
// The function is idempotent: applying it multiple times produces the same result.
func CanonicalizeOutline(outline Outline) (Outline, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nitpick: why not func (outline *Outline) CanonicalizeOutline() (Outline, error)? makes discovering the func easier

and why return Outline and not a new CanonicalizedOutline?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where CanonicalizedOutline is an alias for Outline?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no, a new struct altogether

Type IteratorType
Args *IteratorArgs
Subiterators []Outline
CanonicalKey CanonicalKey // Populated only after canonicalization
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same as my other comment - code smell here? why not make a new type CanonicalizedOutline?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I did think about that. The thing is, we need to keep track of the CanonicalKey for a given node in the outline/iterator tree. But at the same time, we want to modify the Outline, restructuring it as we see fit, so that it's not canonical but has a CanonicalKey.

And so that's exactly how it's written. CanonicalOutline is a good idea, but it's not obvious to me how to do that without some indirection; how to keep a CanonicalKey attached to a node in a tree, without keeping it on the node.

It's true that one could generate UUIDs, and then this field is a UUID, and the CanonicalOutline contains the outline, and a map from UUID to CanonicalKey. (Or anything else we want to track). I don't mind this if you prefer!

Subiterators: decompSubs,
}, nil

default:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm starting to think that we should bring this linter https://golangci-lint.run/docs/linters/configuration/#exhaustive to prevent errors of the form "we added a new type of iterator but we forgot to update a few switches"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm open to the idea, but only in explicit annotation mode. And then annotate places like this. Or, better, if it supports annotating specific type alias enums which, if they are switched over, should be exhaustive; and this would certainly be one of them (then it will enforce new code easier)

Comment on lines 449 to 450
aStr := fmt.Sprintf("%v", a.Metadata)
bStr := fmt.Sprintf("%v", b.Metadata)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when you format a map like this, do you have a guarantee that the keys will be printed in the same order every time? i don't think so.

how about we make Metadata a proper struct that has a comparison method of its own? then you can put all this logic there.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the spec of the language doesn't guarantee it, but the implementation of the language does, similar to javascript and python.

@barakmich barakmich force-pushed the barakmich/outlines branch 2 times, most recently from 203ac7a to 16c1024 Compare February 19, 2026 22:22
Copy link
Contributor

@tstirrat15 tstirrat15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Curious about whether something like rapid would be useful for writing these tests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dig

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tooling Affects the dev or user toolchain (e.g. tests, ci, build tools)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants