19 changes: 19 additions & 0 deletions docs/indexing/fts-index.mdx
@@ -83,3 +83,22 @@ Enable phrase queries by setting:
|:----------|:---------------|:--------|
| `with_position` | `True` | Track token positions for phrase matching |
| `remove_stop_words` | `False` | Preserve stop words for exact phrase matching |

## Indexing nested string fields

You can build an FTS index on a string field inside a struct by passing its full dotted path, such as `nested.text`. Use the same path in `fts_columns` when you query the index; `list_indices()` also reports the indexed column by that full path.

```python
# Schema: pa.struct([pa.field("text", pa.string())]) stored under the `nested` column.
table.create_fts_index("nested.text")

results = (
    table.search("puppy", query_type="fts", fts_columns="nested.text")
    .limit(5)
    .to_list()
)
```

<Note>
Use the canonical Lance path: dot-separate each struct field from root to leaf (for example, `metadata.author.name`). The same convention applies to scalar and vector indexes.
</Note>
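
To check which paths a schema exposes under this convention, here is a minimal sketch. It models the schema as a plain nested dict rather than a PyArrow schema. `leaf_paths` is a hypothetical helper written for illustration, not part of LanceDB or Lance.

```python
def leaf_paths(schema: dict, prefix: str = "") -> list:
    """Return the dotted root-to-leaf path of every non-struct field.

    Assumption: `schema` maps field names to either a type name (leaf)
    or another dict (struct). This is an illustrative model only.
    """
    paths = []
    for name, field_type in schema.items():
        path = f"{prefix}.{name}" if prefix else name
        if isinstance(field_type, dict):
            # Struct field: descend and keep extending the dotted path.
            paths.extend(leaf_paths(field_type, path))
        else:
            paths.append(path)
    return paths


# Hypothetical schema with one top-level field and two nested structs.
schema = {
    "id": "int64",
    "nested": {"text": "string"},
    "metadata": {"author": {"name": "string"}},
}
paths = leaf_paths(schema)
print(paths)
```

Each returned string is the form you would pass as an index column under the convention in the note above, for example `nested.text`.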
16 changes: 16 additions & 0 deletions docs/indexing/scalar-index.mdx
@@ -89,6 +89,22 @@ Scalar indexes can also speed up scans containing a vector search or full text s
</CodeBlock>
</CodeGroup>

## Indexing nested fields

Scalar indexes can target a scalar field inside a struct by passing its full dotted path. The path is preserved end to end: it's the value you pass to `create_scalar_index`, it's what `list_indices()` reports under `columns`, and it's the column reference you use in filter predicates.

```python
# Schema: pa.struct([pa.field("user_id", pa.int32())]) stored under the `metadata` column.
table.create_scalar_index("metadata.user_id", name="metadata_user_id_idx")

# The same dotted path works in WHERE clauses.
table.search().where("metadata.user_id = 42").limit(1).to_list()
```

<Note>
Nested paths follow Lance field-path semantics: dot-separate each struct field from root to leaf (for example, `metadata.author.name`). The same convention applies to FTS and vector indexes.
</Note>
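
Conceptually, a predicate such as `metadata.user_id = 42` compares the leaf value reached by walking the path one struct field at a time. The sketch below shows that lookup on plain Python dicts. `get_by_path` and the sample rows are hypothetical, and the code does not reflect how LanceDB evaluates filters or uses the index internally.

```python
def get_by_path(row: dict, path: str):
    """Follow a dotted path through nested dicts and return the leaf value."""
    value = row
    for part in path.split("."):
        # Each path segment selects one struct field, from root to leaf.
        value = value[part]
    return value


# Hypothetical rows shaped like the schema in the example above.
rows = [
    {"id": 1, "metadata": {"user_id": 42}},
    {"id": 2, "metadata": {"user_id": 7}},
]

# Plain-Python stand-in for the filter "metadata.user_id = 42".
matches = [row["id"] for row in rows if get_by_path(row, "metadata.user_id") == 42]
print(matches)
```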

## Index UUID Columns

LanceDB supports scalar indexes on UUID columns (stored as `FixedSizeBinary(16)`), enabling efficient lookups and filtering on UUID-based primary keys.
25 changes: 25 additions & 0 deletions docs/indexing/vector-index.mdx
@@ -137,6 +137,31 @@ Create an `IVF_PQ` index with `cosine` similarity. Specify `vector_column_name`
</CodeBlock>
</CodeGroup>

#### Indexing nested vector fields

If your vector column lives inside a struct, pass its full dotted path as `vector_column_name`. The same path is used at query time and is what `list_indices()` reports under `columns`:

```python
# Schema: pa.struct([pa.field("embedding", pa.list_(pa.float32(), 2))])
# stored under the `image` column.
table.create_index(
    vector_column_name="image.embedding",
    num_partitions=1,
    num_sub_vectors=1,
    name="image_embedding_idx",
)

results = (
    table.search([0.0, 1.0], vector_column_name="image.embedding")
    .limit(1)
    .to_list()
)
```

<Note>
Nested paths follow Lance field-path semantics: dot-separate each struct field from root to leaf (for example, `image.thumbnail.embedding`). The same convention applies to FTS and scalar indexes.
</Note>

### Async API and Config Objects

With asynchronous Python connections, create vector indexes with `await table.create_index("vector", config=...)`. The `config` object carries the same index choices you configure in the synchronous API, such as distance metric, partition count, and quantization settings: