SOLR-12074: Add optional terms index for PointFields #3922

HoustonPutman · 2025-12-04T21:35:38Z

https://issues.apache.org/jira/browse/SOLR-12074

This closes the one gap where TrieFields still outperform PointFields, namely term or termInSet matching.

Currently this uses the enhancedIndex option, but we should find a better name.

Other side issues, like expanding pointField functionality when this is enabled, will be done afterwards.

HoustonPutman · 2025-12-04T22:10:45Z

Ok, benchmarking with and without docvalues, enhancedIndex and Point vs Trie:

Benchmark                        Mode  Cnt     Score      Error  Units
NumericSearch.intDvEnhancedSet  thrpt    5  4495.128 ±  867.927  ops/s
NumericSearch.intDvSet          thrpt    5  2858.541 ±  594.706  ops/s
NumericSearch.intEnhancedSet    thrpt    5  4638.795 ±  568.743  ops/s
NumericSearch.intSet            thrpt    5  1191.282 ±  107.378  ops/s
NumericSearch.intTrieDvSet      thrpt    5  4845.332 ±  252.985  ops/s
NumericSearch.intTrieSet        thrpt    5  4495.456 ± 1568.532  ops/s

The same benchmark on main (No change to the queries):

Benchmark                    Mode  Cnt     Score     Error  Units
NumericSearch.intDvSet      thrpt    5  3298.151 ± 482.841  ops/s
NumericSearch.intSet        thrpt    5  1295.497 ±  44.262  ops/s
NumericSearch.intTrieDvSet  thrpt    5  4727.185 ± 357.855  ops/s
NumericSearch.intTrieSet    thrpt    5  4748.940 ± 667.751  ops/s

Looks like we are good to move forward with this and remove TrieFields.
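To make the comparison concrete, here is a small sketch that turns the scores from the first table into throughput ratios. The class and method names are made up for illustration, and the error column is ignored, so treat the ratios as rough indications only (several of the runs have wide error bars).

```java
import java.util.Locale;

/** Throughput ratios from the first benchmark table (ops/s). Error bars are ignored. */
final class SetQuerySpeedup {
  // Scores copied verbatim from the first table above.
  static final double INT_SET = 1191.282;
  static final double INT_ENHANCED_SET = 4638.795;
  static final double INT_DV_SET = 2858.541;
  static final double INT_DV_ENHANCED_SET = 4495.128;
  static final double INT_TRIE_SET = 4495.456;

  /** How many times faster the candidate is than the baseline. */
  static double ratio(double candidate, double baseline) {
    return candidate / baseline;
  }

  public static void main(String[] args) {
    System.out.printf(Locale.ROOT, "enhanced vs plain point (no docValues): %.2fx%n",
        ratio(INT_ENHANCED_SET, INT_SET));
    System.out.printf(Locale.ROOT, "enhanced vs plain point (docValues):    %.2fx%n",
        ratio(INT_DV_ENHANCED_SET, INT_DV_SET));
    System.out.printf(Locale.ROOT, "enhanced point vs trie (no docValues):  %.2fx%n",
        ratio(INT_ENHANCED_SET, INT_TRIE_SET));
  }
}
```

On these numbers the terms index gives roughly a 3.9x gain for set queries without docValues and roughly 1.6x with docValues, which puts the enhanced point fields in the same range as the Trie fields.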

gerlowskija

A few preliminary comments - nothing major, mostly just trying to understand some of the motivations underlying particular changes.

Still need to review the tests, but the 'main' code looks great. This'll be an awesome improvement to get in, especially if it opens the door to us moving the needle on Trie fields!

gerlowskija · 2025-12-05T21:17:32Z

solr/benchmark/src/java/org/apache/solr/bench/search/NumericSearch.java


-    public QueryRequest intSetQuery(boolean dvs) {
-      return setQuery("numbers_i" + (dvs ? "_dv" : ""));
+    public QueryRequest intTrieSetQuery(boolean dvs, boolean enhancedIndex) {


[Q] Why provide the "enhancedIndex" flag here and then not use it?

gerlowskija · 2025-12-05T21:19:21Z

solr/benchmark/src/resources/configs/cloud-minimal/conf/schema.xml

 limitations under the License.
 -->
-<schema name="minimal" version="1.7">
+<schema name="minimal" version="1.6">


[Q] Why is this going backwards from 1.7?

HoustonPutman · 2025-12-08

Ahhh yeah, that's because the benchmark tests with and without docValues, and docValues=true is the default in schema version 1.7. Rather than going through and setting docValues=false explicitly, I just downgraded the version. I'll clean that up before merging for sure. Same with the enhancedIndex flag above.

gerlowskija · 2025-12-06T03:26:40Z

solr/core/src/java/org/apache/solr/schema/PointField.java

+      readableToIndexed(externalVal, br);
+      return new TermQuery(new Term(field.getName(), br));
+    } else {
+      return getPointRangeQuery(parser, field, externalVal, externalVal, true, true);


[Q] Prior to this PR, most of the implementing subclasses handled the no-terms-index case by calling Lucene's IntPoint.newExactQuery(...). Currently, that factory also creates a range-query under the hood, so it's (afaict) functionally equivalent to the getPointRangeQuery call here.

If the code's largely equivalent then, is there a reason for the switch? The Lucene IntPoint.newExactQuery call gives us a bit less control I guess, but it has the benefit of being code that we don't have to maintain ourselves.

Are there optimizations in getPointRangeQuery that the Lucene equivalent doesn't have? Or are there other reasons that make the switch worthwhile?

HoustonPutman · 2025-12-08

No, they are functionally the same. The only reason I didn't do this was so that I didn't have to write another getSpecializedExactQuery that calls IntPoint.newExactQuery, LongPoint.newExactQuery, etc. But that's an easy thing to add back in.
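As a rough illustration of the dispatch discussed in this thread, the toy model below answers an exact-match lookup from a value-to-documents map when a terms index exists, and otherwise falls back to a range match whose lower and upper bounds are the same value. This is not Lucene or Solr code: the class, its fields, and the linear scan standing in for a BKD range query are all assumptions made for the sketch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy in-memory model of exact matching on a numeric field. Not Lucene/Solr code. */
final class ExactMatchSketch {
  private final int[] valueByDoc;                     // docId -> field value
  private final Map<Integer, List<Integer>> postings; // value -> docIds, or null if no terms index

  ExactMatchSketch(int[] valueByDoc, boolean withTermsIndex) {
    this.valueByDoc = valueByDoc.clone();
    if (withTermsIndex) {
      postings = new HashMap<>();
      for (int doc = 0; doc < this.valueByDoc.length; doc++) {
        postings.computeIfAbsent(this.valueByDoc[doc], v -> new ArrayList<>()).add(doc);
      }
    } else {
      postings = null;
    }
  }

  /** Inclusive range match; a linear scan stands in for a point range query. */
  List<Integer> range(int lower, int upper) {
    List<Integer> hits = new ArrayList<>();
    for (int doc = 0; doc < valueByDoc.length; doc++) {
      if (valueByDoc[doc] >= lower && valueByDoc[doc] <= upper) {
        hits.add(doc);
      }
    }
    return hits;
  }

  /** Exact match: term lookup when available, otherwise the degenerate range [value, value]. */
  List<Integer> exact(int value) {
    if (postings != null) {
      return postings.getOrDefault(value, Collections.<Integer>emptyList());
    }
    return range(value, value);
  }
}
```

Both paths return the same documents; the difference the benchmark measures is how quickly a real terms dictionary finds them compared with a real points structure.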

dsmiley

This is awesome; thanks!

I think "enhancedIndex" is a terrible name; sorry.

I think the below would be the most useful to explain/document/use:

indexed=true|false should toggle both "lookup" and "range" indexes accordingly. It should be an error to explicitly set this and also set one of the settings below.
lookupIndexed=true|false should toggle "lookup" index (i.e. as implemented by Terms)
rangeIndexed=true|false a "range" index (i.e. as implemented by BKD)
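The rules in this proposal could be resolved along the following lines. This is only a sketch of the suggested semantics, not code from the PR: the class name, the `resolve` method, and the `fallback` parameter (used for any flag the schema leaves unset, since the proposal does not state defaults) are hypothetical.

```java
/** Sketch of the proposed indexed / lookupIndexed / rangeIndexed resolution. Hypothetical names. */
final class NumericIndexOptions {
  final boolean lookupIndexed; // "lookup" index, i.e. terms
  final boolean rangeIndexed;  // "range" index, i.e. BKD points

  private NumericIndexOptions(boolean lookupIndexed, boolean rangeIndexed) {
    this.lookupIndexed = lookupIndexed;
    this.rangeIndexed = rangeIndexed;
  }

  /**
   * Each Boolean argument is null when the attribute is absent from the schema.
   * The fallback value is an assumption of this sketch; the proposal does not specify defaults.
   */
  static NumericIndexOptions resolve(
      Boolean indexed, Boolean lookupIndexed, Boolean rangeIndexed, boolean fallback) {
    if (indexed != null && (lookupIndexed != null || rangeIndexed != null)) {
      // Explicitly setting indexed together with a fine-grained flag is an error.
      throw new IllegalArgumentException(
          "indexed cannot be combined with lookupIndexed or rangeIndexed");
    }
    if (indexed != null) {
      // indexed toggles both indexes together.
      return new NumericIndexOptions(indexed, indexed);
    }
    return new NumericIndexOptions(
        lookupIndexed != null ? lookupIndexed : fallback,
        rangeIndexed != null ? rangeIndexed : fallback);
  }
}
```

For example, `resolve(null, true, false, true)` would yield a field with only the lookup index, while `resolve(true, true, null, true)` would be rejected.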

HoustonPutman · 2025-12-20T00:52:36Z

> I think "enhancedIndex" is a terrible name; sorry.

Agreed, it was only a placeholder.

I've created a new PR (#3972) that changes this to a new fieldType, which can have the terms index enabled by default (something we can't really do currently, since existing fields don't have the index). I think that's the direction I want to go: land the new fieldType, then deprecate and remove all other numeric fields.

SOLR-12074: Add optional terms index for PointFields

0ab6359

github-actions bot added tests cat:schema labels Dec 4, 2025

Tidy

6a72fad

HoustonPutman added 3 commits December 4, 2025 14:11

Add tries to benchmark

e188171

Add testing in TestPointFields

703d497

Tidy

77798d1

gerlowskija reviewed Dec 6, 2025


dsmiley reviewed Dec 9, 2025


HoustonPutman mentioned this pull request Dec 20, 2025

SOLR-12074: Add new NumericField that combines terms and points #3972

Draft
