feat: tokenize endpoints + property-level TextAnalyzer + StopwordPresets (Weaviate 1.37.0+) #329

Open
mpartipilo wants to merge 4 commits into main from feat/tokenize-endpoint

mpartipilo (Collaborator) commented Apr 21, 2026

Summary

Ports two related Weaviate 1.37.0 features from the Python client into one stacked PR:

  1. python-client PR #2012: /v1/tokenize endpoints

    • client.Tokenize.Text(text, tokenization, analyzerConfig?, stopwordPresets?, ct) (POST /v1/tokenize)
    • collection.Tokenize.Property(propertyName, text, ct) (POST /v1/schema/{class}/properties/{prop}/tokenize)
    • Public surface mirrors the TS client's tokenize namespace design.
    • AsciiFold is a nullable record (AsciiFoldConfig? AsciiFold) — null = disabled, non-null = enabled with optional Ignore list. The invalid "ignore without fold" state is unrepresentable, so no runtime validator is needed.
    • Version-gated at 1.37.0 via [RequiresWeaviateVersion(1, 37, 0)] + EnsureVersion<T>().
  2. python-client PR #2006 — property-level TextAnalyzer + collection-level StopwordPresets

    • Property.TextAnalyzer: pin ASCII folding and stopword preset per property at index time. Reuses the TextAnalyzerConfig record from (1) so tokenize-at-query and index-at-insert stay aligned. Propagates through nested properties.
    • InvertedIndexConfig.StopwordPresets: named preset → word-list map on the collection inverted-index config. Properties reference presets via TextAnalyzer.StopwordPreset.
    • InvertedIndexConfigUpdate.StopwordPresets: mirrors the set accessor on the update wrapper so c.InvertedIndexConfig.StopwordPresets = ... works inside collection.Config.Update(...).
    • Preflight in CollectionsClient.Create detects either feature in the incoming schema and throws WeaviateVersionMismatchException when the server is older than 1.37.0, before any REST call.
    • Originally planned as a follow-up PR but folded into this one since both features share the same TextAnalyzerConfig shape and version gate. TokenizeAnalyzerConfig was renamed to TextAnalyzerConfig to match the server type name.
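
The preflight described above can be pictured roughly as follows. This is a minimal, self-contained sketch under stated assumptions, not the client's actual code: `PropertySketch`, `SchemaSketch`, `PreflightSketch`, and `VersionMismatchSketchException` are hypothetical stand-ins (the real client throws `WeaviateVersionMismatchException`), and the detection logic is only a guess at how "either feature in the incoming schema" might be checked, including the walk through nested properties.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-ins for the client's schema models; names and shapes are assumptions.
public sealed record PropertySketch(
    string Name,
    bool HasTextAnalyzer = false,
    IReadOnlyList<PropertySketch>? NestedProperties = null);

public sealed record SchemaSketch(
    IReadOnlyList<PropertySketch> Properties,
    IReadOnlyDictionary<string, string[]>? StopwordPresets = null);

// Stand-in for WeaviateVersionMismatchException.
public sealed class VersionMismatchSketchException : Exception
{
    public VersionMismatchSketchException(Version required, Version actual)
        : base($"Requires Weaviate {required}+, but the server reports {actual}.") { }
}

public static class PreflightSketch
{
    private static readonly Version MinVersion = new Version(1, 37, 0);

    // True when the property, or any property nested under it, pins a text analyzer.
    private static bool UsesTextAnalyzer(PropertySketch p) =>
        p.HasTextAnalyzer || (p.NestedProperties?.Any(UsesTextAnalyzer) ?? false);

    // True when the schema uses either of the two 1.37.0 features.
    public static bool RequiresV137(SchemaSketch schema) =>
        (schema.StopwordPresets?.Count ?? 0) > 0 || schema.Properties.Any(UsesTextAnalyzer);

    // Throws before any request would be sent if the server is too old for the schema.
    public static void EnsureSupported(SchemaSketch schema, Version serverVersion)
    {
        if (RequiresV137(schema) && serverVersion < MinVersion)
            throw new VersionMismatchSketchException(MinVersion, serverVersion);
    }
}
```

In this sketch a schema that uses neither feature passes the check against any server version, so collections that do not opt in are unaffected by the gate.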

Docs: TOKENIZE_API_USAGE.md — end-to-end guide covering both scopes, including schema-time analyzer + preset examples.

Out of scope

  • Java parity + Java gse_ch fix — separate tracked work.
  • gRPC tokenize (no proto as of 1.37.0).

Test plan

  • dotnet build src/Weaviate.Client/ → 0 errors
  • dotnet build src/Weaviate.Client.Tests/ → 0 warnings, 0 errors
  • dotnet test --filter FullyQualifiedName~TestTokenize → 16/16 passed against Weaviate 1.37.1
  • dotnet test --filter FullyQualifiedName~TestCollectionTextAnalyzer against Weaviate 1.37.1
  • CI green

🤖 Generated with Claude Code

Port of python-client PR #2012, aligned with the TS client's `tokenize`
namespace design. Adds:

- `client.Tokenize.Text(text, tokenization, analyzerConfig?, stopwordPresets?)`
  → POST /v1/tokenize
- `collection.Tokenize.Property(propertyName, text)`
  → POST /v1/schema/{class}/properties/{prop}/tokenize

Version-gated at 1.37.0 via `[RequiresWeaviateVersion]`. `AsciiFold` is
modeled as a nullable record (null = disabled, non-null = enabled with
optional `Ignore` list) so the invalid "ignore without fold" state is
unrepresentable without a validator.
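
As a rough illustration of that modeling choice, the sketch below uses hypothetical `AsciiFoldConfigSketch` and `TextAnalyzerConfigSketch` records. The shapes are assumptions and may differ from the client's real records; in particular, treating the `Ignore` list as strings is a guess. Because the ignore list lives inside the fold record, there is no way to construct "ignore without fold".

```csharp
using System;
using System.Collections.Generic;

// Illustrative shapes only; the client's real records may differ.
public sealed record AsciiFoldConfigSketch(IReadOnlyList<string>? Ignore = null);

public sealed record TextAnalyzerConfigSketch(
    AsciiFoldConfigSketch? AsciiFold = null,
    string? StopwordPreset = null);

public static class AsciiFoldSketch
{
    // Describes the folding state. There are only three reachable cases,
    // so no runtime validation of the combination is needed.
    public static string Describe(TextAnalyzerConfigSketch config)
    {
        if (config.AsciiFold is null)
            return "folding disabled";

        var ignore = config.AsciiFold.Ignore;
        return ignore is null || ignore.Count == 0
            ? "folding enabled"
            : $"folding enabled, ignoring: {string.Join(", ", ignore)}";
    }
}
```

With a flat pair of fields (a boolean flag plus a separate ignore list), a caller could set the list while leaving the flag off, and something would have to reject that at runtime. Nesting the list inside the nullable record removes the case entirely.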

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

orca-security-eu Bot left a comment

Orca Security Scan Summary

| Status | Check | Issues by priority |
| --- | --- | --- |
| Passed | Infrastructure as Code | high 0, medium 0, low 0, info 0 |
| Passed | SAST | high 0, medium 0, low 0, info 0 |
| Passed | Secrets | high 0, medium 0, low 0, info 0 |
| Passed | Vulnerabilities | high 0, medium 0, low 0, info 0 |

github-actions Bot commented Apr 21, 2026

Summary - Weaviate C# Client Coverage

Summary
Generated on: 04/21/2026 - 22:15:16
Coverage date: 04/21/2026 - 22:06:37 - 04/21/2026 - 22:14:27
Parser: MultiReport (103x Cobertura)
Assemblies: 3
Classes: 362
Files: 232
Line coverage: 46.9% (10648 of 22674)
Covered lines: 10648
Uncovered lines: 12026
Coverable lines: 22674
Total lines: 58003
Branch coverage: 35.4% (2327 of 6568)
Covered branches: 2327
Total branches: 6568

Coverage

Weaviate.Client - 48.8%
Name Line Branch
Weaviate.Client 48.8% 37.7%
Weaviate.Client.AggregateClient 19.5% 11.3%
Weaviate.Client.AggregateClientHybridExtensions 0%
Weaviate.Client.AliasClient 100% 62.5%
Weaviate.Client.ApiKeyTokenService 100% 50%
Weaviate.Client.Auth 38%
Weaviate.Client.AuthenticatedHttpHandler 71.7% 60%
Weaviate.Client.BackupClient 82% 43.4%
Weaviate.Client.BaseCollectionClient 0% 0%
Weaviate.Client.Batch.BatchContext 58% 50%
Weaviate.Client.Batch.BatchManager 39.2% 25.7%
Weaviate.Client.Batch.BatchOptions 25% 50%
Weaviate.Client.Batch.BatchResult 100%
Weaviate.Client.Batch.TaskHandle 83.3% 50%
Weaviate.Client.Cache.SchemaCache 69.4% 60%
Weaviate.Client.ClientConfiguration 100% 100%
Weaviate.Client.ClientConfigurationExtensions 35.2% 16.6%
Weaviate.Client.ClusterClient 86.2% 64.2%
Weaviate.Client.CollectionClient 98.7% 87.5%
Weaviate.Client.CollectionClientExtensions 100% 100%
Weaviate.Client.CollectionConfigClient 95.5% 58.3%
Weaviate.Client.CollectionsClient 44.8% 63.6%
Weaviate.Client.CollectionTokenizeClient 100% 50%
Weaviate.Client.Configure 49% 47.3%
Weaviate.Client.Connect 25.5% 0%
Weaviate.Client.DataClient 91.5% 78.7%
Weaviate.Client.DefaultTokenServiceFactory 28.5% 25%
Weaviate.Client.DependencyInjection.WeaviateClientFactory 0% 0%
Weaviate.Client.DependencyInjection.WeaviateInitializationService 0%
Weaviate.Client.DependencyInjection.WeaviateOptions 50.9% 87.5%
Weaviate.Client.DependencyInjection.WeaviateServiceCollectionExtensions 0% 0%
Weaviate.Client.Factory 100%
Weaviate.Client.Generate 100%
Weaviate.Client.GenerateClient 13.3% 32.6%
Weaviate.Client.GenerateClientHybridExtensions 0%
Weaviate.Client.GenerativeConfigFactory 5.2% 100%
Weaviate.Client.GenerativeProviderFactory 1%
Weaviate.Client.GroupsClient 100%
Weaviate.Client.GroupsOidcClient 47.8%
Weaviate.Client.Grpc.BatchStreamContext 100%
Weaviate.Client.Grpc.BatchStreamWrapper 75.3% 58.9%
Weaviate.Client.Grpc.LoggingInterceptor 0% 0%
Weaviate.Client.Grpc.RetryInterceptor 41.6% 37.5%
Weaviate.Client.Grpc.WeaviateGrpcClient 66.7% 49.8%
Weaviate.Client.Grpc.WeaviateGrpcServerException 0%
Weaviate.Client.Internal.AutoArray`1 61.9% 50%
Weaviate.Client.Internal.AutoArrayBuilder 100% 100%
Weaviate.Client.Internal.BatchStreamAcks 100%
Weaviate.Client.Internal.BatchStreamBackoff 0%
Weaviate.Client.Internal.BatchStreamError 0%
Weaviate.Client.Internal.BatchStreamOutOfMemory 0%
Weaviate.Client.Internal.BatchStreamResults 100%
Weaviate.Client.Internal.BatchStreamSuccess 100%
Weaviate.Client.Internal.ExceptionHelper 72.1% 61.5%
Weaviate.Client.Internal.HttpLoggingHandler 0% 0%
Weaviate.Client.Internal.KeySortedList`2 50%
Weaviate.Client.Internal.MultiKeySortedList`2 0% 0%
Weaviate.Client.Internal.ObjectHelper 48% 34.8%
Weaviate.Client.Internal.RetryHandler 49% 50%
Weaviate.Client.Internal.TimeoutHelper 72.9% 44.4%
Weaviate.Client.Internal.VersionGuard 92.3% 88.8%
Weaviate.Client.Models.Aggregate 100%
Weaviate.Client.Models.AggregateGroupByResult 26.1% 8.4%
Weaviate.Client.Models.AggregateResult 49.2% 12.8%
Weaviate.Client.Models.Alias 100%
Weaviate.Client.Models.AliasesResource 100%
Weaviate.Client.Models.AndNestedFilter 50%
Weaviate.Client.Models.AsciiFoldConfig 100%
Weaviate.Client.Models.Backup 63.6%
Weaviate.Client.Models.BackupBackend 20%
Weaviate.Client.Models.BackupClientConfig 100%
Weaviate.Client.Models.BackupCreateOperation 100%
Weaviate.Client.Models.BackupCreateRequest 100%
Weaviate.Client.Models.BackupOperationBase 61.6% 65.3%
Weaviate.Client.Models.BackupRestoreOperation 100%
Weaviate.Client.Models.BackupRestoreRequest 100%
Weaviate.Client.Models.BackupsResource 100%
Weaviate.Client.Models.BackupStatusExtensions 78.5% 52.5%
Weaviate.Client.Models.BatchInsertRequest 52.1%
Weaviate.Client.Models.BatchInsertResponse 80%
Weaviate.Client.Models.BatchInsertResponseEntry 100%
Weaviate.Client.Models.BatchReferenceReturn 29% 0%
Weaviate.Client.Models.BM25Config 61.1% 66.6%
Weaviate.Client.Models.Bm25ConfigUpdate 60% 50%
Weaviate.Client.Models.BM25Operator 66.6%
Weaviate.Client.Models.ClusterNode 30.7%
Weaviate.Client.Models.ClusterNodeVerbose 23% 0%
Weaviate.Client.Models.CollectionConfig 69.5% 57.1%
Weaviate.Client.Models.CollectionConfigCommon 67.9% 59.3%
Weaviate.Client.Models.CollectionConfigExport 0% 0%
Weaviate.Client.Models.CollectionsResource 100%
Weaviate.Client.Models.CollectionUpdate 42.2% 50%
Weaviate.Client.Models.CurrentUserInfo 80%
Weaviate.Client.Models.DatabaseUser 55.5%
Weaviate.Client.Models.DataReference 100% 50%
Weaviate.Client.Models.DataResource 100%
Weaviate.Client.Models.DataTypeExtensions 0% 0%
Weaviate.Client.Models.DeleteManyObjectResult 100%
Weaviate.Client.Models.DeleteManyResult 100%
Weaviate.Client.Models.DynamicDto 0%
Weaviate.Client.Models.EmptyBackend 33.3%
Weaviate.Client.Models.EmptyStringEnumConverter`1 86.9% 66.6%
Weaviate.Client.Models.FilesystemBackend 100%
Weaviate.Client.Models.Filter 60.8% 40%
Weaviate.Client.Models.Filter`1 50% 50%
Weaviate.Client.Models.FlatDto 100%
Weaviate.Client.Models.FlexibleConverter`1 11.3% 3.4%
Weaviate.Client.Models.FlexibleStringConverter 38.4% 25%
Weaviate.Client.Models.Generative.Providers 0.6%
Weaviate.Client.Models.GenerativeConfig 7.5%
Weaviate.Client.Models.GenerativeConfigSerialization 46.8% 55.4%
Weaviate.Client.Models.GenerativeDebug 0%
Weaviate.Client.Models.GenerativeGroupByObject 100%
Weaviate.Client.Models.GenerativeGroupByResult 25%
Weaviate.Client.Models.GenerativePrompt 100%
Weaviate.Client.Models.GenerativeProvider 83.3%
Weaviate.Client.Models.GenerativeReply 100%
Weaviate.Client.Models.GenerativeResult 20% 0%
Weaviate.Client.Models.GenerativeWeaviateGroup 100%
Weaviate.Client.Models.GenerativeWeaviateObject 100%
Weaviate.Client.Models.GenerativeWeaviateResult 100%
Weaviate.Client.Models.GeoCoordinate 100%
Weaviate.Client.Models.GeoCoordinateConstraint 0%
Weaviate.Client.Models.GroupByObject 100%
Weaviate.Client.Models.GroupByRequest 100%
Weaviate.Client.Models.GroupByResult 16.6%
Weaviate.Client.Models.GroupByResult`2 100%
Weaviate.Client.Models.GroupedTask 100%
Weaviate.Client.Models.GroupRoleAssignment 0%
Weaviate.Client.Models.GroupsResource 100%
Weaviate.Client.Models.HFreshDto 0%
Weaviate.Client.Models.HnswDto 100%
Weaviate.Client.Models.HybridNearTextBuilder 0%
Weaviate.Client.Models.HybridNearVectorBuilder 0%
Weaviate.Client.Models.HybridVectorInput 78.5% 85.7%
Weaviate.Client.Models.HybridVectorInputBuilder 0%
Weaviate.Client.Models.InvertedIndexConfig 44.5% 29.1%
Weaviate.Client.Models.InvertedIndexConfigUpdate 71.4% 50%
Weaviate.Client.Models.JsonConverterEmptyCollectionAsNull 73.6% 50%
Weaviate.Client.Models.Metadata 100%
Weaviate.Client.Models.MetadataQuery 81.8%
Weaviate.Client.Models.MetaInfo 92.8% 80%
Weaviate.Client.Models.Metrics 84.8% 60%
Weaviate.Client.Models.ModelsToDtoExtensions 100% 90%
Weaviate.Client.Models.ModuleConfigList 0% 0%
Weaviate.Client.Models.Move 100%
Weaviate.Client.Models.MultiTenancyConfig 100%
Weaviate.Client.Models.MultiTenancyConfigUpdate 60%
Weaviate.Client.Models.MultiVectorDto 100%
Weaviate.Client.Models.MultiVectorEncodingDto 100%
Weaviate.Client.Models.MuveraDto 100% 100%
Weaviate.Client.Models.NamedVector 100% 100%
Weaviate.Client.Models.NearTextBuilder 36.2%
Weaviate.Client.Models.NearTextInput 50%
Weaviate.Client.Models.NearVectorBuilder 0%
Weaviate.Client.Models.NearVectorInput 45.4%
Weaviate.Client.Models.NestedFilter 100%
Weaviate.Client.Models.NodesResource 50%
Weaviate.Client.Models.NodeStatusExtensions 0% 0%
Weaviate.Client.Models.NotNestedFilter 100%
Weaviate.Client.Models.ObjectReference 80%
Weaviate.Client.Models.ObjectStorageBackend 0%
Weaviate.Client.Models.ObjectTTLConfig 97.5%
Weaviate.Client.Models.ObjectTTLConfigUpdate 89.6% 26.9%
Weaviate.Client.Models.OrNestedFilter 100%
Weaviate.Client.Models.PermissionResourceExtensions 78.9% 33.3%
Weaviate.Client.Models.Permissions 67.8% 60%
Weaviate.Client.Models.PermissionScope 100%
Weaviate.Client.Models.PhoneNumber 77.7%
Weaviate.Client.Models.Property 86.3% 62.5%
Weaviate.Client.Models.Property`1 100%
Weaviate.Client.Models.PropertyFilter 74.4% 50%
Weaviate.Client.Models.PropertyHelper 67.8% 53.4%
Weaviate.Client.Models.PropertyIndexTypeExtensions 50% 25%
Weaviate.Client.Models.PropertyUpdate 25%
Weaviate.Client.Models.QueryProfile 100%
Weaviate.Client.Models.QueryReference 100%
Weaviate.Client.Models.Reference 100%
Weaviate.Client.Models.ReferenceFilter 100%
Weaviate.Client.Models.ReferenceUpdate 0%
Weaviate.Client.Models.ReplicateRequest 100%
Weaviate.Client.Models.ReplicateResource 100%
Weaviate.Client.Models.ReplicationAsyncConfig 0%
Weaviate.Client.Models.ReplicationClientConfig 100%
Weaviate.Client.Models.ReplicationConfig 100%
Weaviate.Client.Models.ReplicationConfigUpdate 44.4%
Weaviate.Client.Models.ReplicationOperation 65% 50%
Weaviate.Client.Models.ReplicationOperationError 0%
Weaviate.Client.Models.ReplicationOperationStatus 37.5% 0%
Weaviate.Client.Models.ReplicationOperationTracker 64% 54.5%
Weaviate.Client.Models.Rerank 100%
Weaviate.Client.Models.Reranker 15.3%
Weaviate.Client.Models.RerankerConfigSerialization 56.8% 55%
Weaviate.Client.Models.RoleInfo 100%
Weaviate.Client.Models.RolesResource 100%
Weaviate.Client.Models.SearchProfile 100%
Weaviate.Client.Models.ShardInfo 100%
Weaviate.Client.Models.ShardingConfig 100%
Weaviate.Client.Models.ShardProfile 100%
Weaviate.Client.Models.ShardStatusExtensions 100% 50%
Weaviate.Client.Models.SimpleTargetVectors 100%
Weaviate.Client.Models.SinglePrompt 100%
Weaviate.Client.Models.Sort 100% 50%
Weaviate.Client.Models.SortExtensions 100%
Weaviate.Client.Models.StopwordConfig 61.9% 62.5%
Weaviate.Client.Models.StopwordsConfigUpdate 57.1% 50%
Weaviate.Client.Models.TargetVectors 27.6% 0%
Weaviate.Client.Models.Tenant 38.8% 13.3%
Weaviate.Client.Models.TenantsResource 100%
Weaviate.Client.Models.TextAnalyzerConfig 100%
Weaviate.Client.Models.TimeFilter 75% 25%
Weaviate.Client.Models.TokenizeMapping 100% 68%
Weaviate.Client.Models.TokenizeResult 100%
Weaviate.Client.Models.Typed.AggregateGroupByResult`1 0% 0%
Weaviate.Client.Models.Typed.AggregatePropertyMapper 0% 0%
Weaviate.Client.Models.Typed.AggregateResult`1 0%
Weaviate.Client.Models.Typed.BooleanMetricsAttribute 0%
Weaviate.Client.Models.Typed.DateMetricsAttribute 0%
Weaviate.Client.Models.Typed.GenerativeGroupByObject`1 0%
Weaviate.Client.Models.Typed.GenerativeGroupByResult`1 0%
Weaviate.Client.Models.Typed.GenerativeWeaviateGroup`1 0%
Weaviate.Client.Models.Typed.GenerativeWeaviateObject`1 0%
Weaviate.Client.Models.Typed.GenerativeWeaviateResult`1 0%
Weaviate.Client.Models.Typed.GroupByObject`1 0%
Weaviate.Client.Models.Typed.GroupByResult`1 0%
Weaviate.Client.Models.Typed.IntegerMetricsAttribute 0%
Weaviate.Client.Models.Typed.MetricsExtractor 0% 0%
Weaviate.Client.Models.Typed.NumberMetricsAttribute 0%
Weaviate.Client.Models.Typed.TextMetricsAttribute 0%
Weaviate.Client.Models.Typed.TypedResultConverter 12% 7.6%
Weaviate.Client.Models.Typed.WeaviateGroup`2 0%
Weaviate.Client.Models.Typed.WeaviateObject`1 47.3% 37.5%
Weaviate.Client.Models.TypedBase`1 70.5%
Weaviate.Client.Models.TypedGuid 66.6%
Weaviate.Client.Models.TypedValue`1 80%
Weaviate.Client.Models.User 0%
Weaviate.Client.Models.UserMetadata 0%
Weaviate.Client.Models.UserRoleAssignment 100%
Weaviate.Client.Models.UsersResource 100%
Weaviate.Client.Models.Vector 32.8% 18.7%
Weaviate.Client.Models.VectorBuilder 0% 0%
Weaviate.Client.Models.VectorConfig 82.7% 50%
Weaviate.Client.Models.VectorConfigList 59.3% 60%
Weaviate.Client.Models.VectorConfigUpdate 50%
Weaviate.Client.Models.VectorIndex 84.3%
Weaviate.Client.Models.VectorIndexConfig 100%
Weaviate.Client.Models.VectorIndexConfigUpdate 68.4% 33.3%
Weaviate.Client.Models.VectorIndexConfigUpdateDynamic 0% 0%
Weaviate.Client.Models.VectorIndexConfigUpdateFlat 40%
Weaviate.Client.Models.VectorIndexConfigUpdateHNSW 52.9%
Weaviate.Client.Models.VectorIndexMappingExtensions 65.1% 50%
Weaviate.Client.Models.VectorIndexSerialization 44.7% 36.1%
Weaviate.Client.Models.VectorInputBuilderFactories 50% 50%
Weaviate.Client.Models.Vectorizer 9% 0%
Weaviate.Client.Models.VectorizerAttribute 100%
Weaviate.Client.Models.VectorizerConfig 64.1% 56.2%
Weaviate.Client.Models.VectorizerRegistry 68.9% 71.4%
Weaviate.Client.Models.Vectorizers.VectorizerConfigFactory 58% 62.5%
Weaviate.Client.Models.VectorMulti`1 38.8% 23%
Weaviate.Client.Models.VectorQuery 45% 50%
Weaviate.Client.Models.Vectors 31.2% 100%
Weaviate.Client.Models.VectorSearchInput 51.7% 16.6%
Weaviate.Client.Models.VectorSingle`1 18.1% 0%
Weaviate.Client.Models.WeaviateGroup`1 80%
Weaviate.Client.Models.WeaviateObject 88.8%
Weaviate.Client.Models.WeaviateObjectExtensions 56.5% 50%
Weaviate.Client.Models.WeaviateResult 100%
Weaviate.Client.Models.WeaviateResult`1 100%
Weaviate.Client.Models.WeightedField 0%
Weaviate.Client.Models.WeightedFields 0% 0%
Weaviate.Client.Models.WeightedTargetVectors 100% 100%
Weaviate.Client.NearMediaBuilder 0% 0%
Weaviate.Client.NearMediaInput 0%
Weaviate.Client.NodesClient 87.5% 50%
Weaviate.Client.OAuthConfig 71.4%
Weaviate.Client.OAuthTokenService 39.3% 18.7%
Weaviate.Client.QueryClient 47.6% 40%
Weaviate.Client.QueryClientHybridExtensions 0% 0%
Weaviate.Client.QueryClientNearTextExtensions 0% 0%
Weaviate.Client.ReplicationsClient 88.2% 55.1%
Weaviate.Client.RequiresWeaviateVersionAttribute 100%
Weaviate.Client.RerankerConfigFactory 14.2% 100%
Weaviate.Client.Rest.EnumMemberJsonConverter`1 0% 0%
Weaviate.Client.Rest.EnumMemberJsonConverterFactory 0%
Weaviate.Client.Rest.HttpResponseMessageExtensions 82.9% 62.5%
Weaviate.Client.Rest.InvalidEnumWireFormatException 0%
Weaviate.Client.Rest.WeaviateEndpoints 78.9% 75.8%
Weaviate.Client.Rest.WeaviateRestClient 88.6% 53.3%
Weaviate.Client.Rest.WeaviateRestClientException 0% 0%
Weaviate.Client.Rest.WeaviateRestServerException 0% 0%
Weaviate.Client.Rest.WeaviateUnexpectedStatusCodeException 100%
Weaviate.Client.RetryPolicy 50% 42.8%
Weaviate.Client.RolesClient 90.3% 50%
Weaviate.Client.Serialization.Converters.BlobPropertyConverter 58.8% 37.5%
Weaviate.Client.Serialization.Converters.BoolPropertyConverter 64% 40%
Weaviate.Client.Serialization.Converters.DatePropertyConverter 31% 21%
Weaviate.Client.Serialization.Converters.GeoPropertyConverter 28.3% 10.5%
Weaviate.Client.Serialization.Converters.IntPropertyConverter 49.2% 26.3%
Weaviate.Client.Serialization.Converters.NumberPropertyConverter 56.8% 23.5%
Weaviate.Client.Serialization.Converters.ObjectPropertyConverter 18% 7.3%
Weaviate.Client.Serialization.Converters.PhonePropertyConverter 14.4% 5.2%
Weaviate.Client.Serialization.Converters.TextPropertyConverter 36.3% 20.8%
Weaviate.Client.Serialization.Converters.UuidPropertyConverter 61.1% 37.5%
Weaviate.Client.Serialization.PropertyBag 0% 0%
Weaviate.Client.Serialization.PropertyConverterBase 23% 11.9%
Weaviate.Client.Serialization.PropertyConverterRegistry 73.6% 65.5%
Weaviate.Client.TenantsClient 84.5% 37.5%
Weaviate.Client.TokenizeClient 100% 75%
Weaviate.Client.Typed.TypedCollectionClient`1 91.1% 50%
Weaviate.Client.Typed.TypedDataClient`1 49%
Weaviate.Client.Typed.TypedGenerateClient`1 0.8% 0%
Weaviate.Client.Typed.TypedGenerateClientHybridExtensions 0%
Weaviate.Client.Typed.TypedQueryClient`1 10.6% 100%
Weaviate.Client.Typed.TypedQueryClientHybridExtensions 0%
Weaviate.Client.UsersClient 89.4% 66.6%
Weaviate.Client.UsersDatabaseClient 100% 62.5%
Weaviate.Client.UsersOidcClient 4.5%
Weaviate.Client.Validation.TypeValidationException 0%
Weaviate.Client.Validation.TypeValidator 50% 48.6%
Weaviate.Client.Validation.ValidationError 83.3%
Weaviate.Client.Validation.ValidationResult 62.5% 50%
Weaviate.Client.Validation.ValidationWarning 0%
Weaviate.Client.ValidationExtensions 80% 100%
Weaviate.Client.VectorizerFactory 3.1% 0%
Weaviate.Client.VectorizerFactoryMulti 6.2%
Weaviate.Client.WeaviateAuthenticationException 100% 100%
Weaviate.Client.WeaviateAuthorizationException 100% 100%
Weaviate.Client.WeaviateBackupConflictException 100%
Weaviate.Client.WeaviateBadRequestException 0% 0%
Weaviate.Client.WeaviateClient 59.4% 47.9%
Weaviate.Client.WeaviateClientBuilder 56.6% 50%
Weaviate.Client.WeaviateClientBuilderExtensions 0% 0%
Weaviate.Client.WeaviateClientException 33.3%
Weaviate.Client.WeaviateCollectionLimitReachedException 0% 0%
Weaviate.Client.WeaviateConflictException 100%
Weaviate.Client.WeaviateDefaults 100%
Weaviate.Client.WeaviateException 66.6%
Weaviate.Client.WeaviateExtensions 78.4% 52.9%
Weaviate.Client.WeaviateExternalModuleProblemException 0% 0%
Weaviate.Client.WeaviateFeatureNotSupportedException 0% 0%
Weaviate.Client.WeaviateModuleNotAvailableException 0% 0%
Weaviate.Client.WeaviateNotFoundException 14.8% 0%
Weaviate.Client.WeaviateServerException 66.6%
Weaviate.Client.WeaviateTimeoutException 0% 0%
Weaviate.Client.WeaviateUnprocessableEntityException 100% 100%
Weaviate.Client.WeaviateVersionMismatchException 76.9% 50%
Weaviate.Client.Analyzers - 0%
Name Line Branch
Weaviate.Client.Analyzers 0% 0%
Weaviate.Client.Analyzers.AggregatePropertySuffixAnalyzer 0% 0%
Weaviate.Client.Analyzers.AutoArrayUsageAnalyzer 0% 0%
Weaviate.Client.Analyzers.HybridSearchNullParametersAnalyzer 0% 0%
Weaviate.Client.Analyzers.RequiresVersionEnsureCallAnalyzer 0% 0%
Weaviate.Client.Analyzers.VectorizerFactoryAnalyzer 0% 0%
Weaviate.Client.VectorData - 50.3%
Name Line Branch
Weaviate.Client.VectorData 50.3% 31.2%
Weaviate.Client.VectorData.DependencyInjection.WeaviateVectorDataServiceCollectionExtensions 0% 0%
Weaviate.Client.VectorData.Filters.WeaviateFilterTranslator 29.2% 19.5%
Weaviate.Client.VectorData.Mapping.AttributeBasedRecordMapper`1 59.7% 50%
Weaviate.Client.VectorData.Mapping.DataPropertyInfo 100%
Weaviate.Client.VectorData.Mapping.DynamicRecordMapper 0% 0%
Weaviate.Client.VectorData.Mapping.RecordPropertyModel 54.9% 43.1%
Weaviate.Client.VectorData.Mapping.VectorDataSchemaBuilder 41.3% 18.7%
Weaviate.Client.VectorData.Mapping.VectorPropertyInfo 85.7%
Weaviate.Client.VectorData.WeaviateVectorStore 61.2% 33.3%
Weaviate.Client.VectorData.WeaviateVectorStoreCollection`2 71.3% 44.7%
Weaviate.Client.VectorData.WeaviateVectorStoreCollectionOptions 0%
Weaviate.Client.VectorData.WeaviateVectorStoreOptions 0%

- New docs/TOKENIZE_API_USAGE.md covers both `client.Tokenize.Text` and
  `collection.Tokenize.Property`, analyzer config (ASCII folding,
  stopwords), the result shape, and common usage patterns.
- Link the guide from README under "Additional Guides".
- Add an "Unreleased" CHANGELOG entry for the tokenize endpoints.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mpartipilo marked this pull request as ready for review April 21, 2026 17:48
Port weaviate-python-client PR #2006 on top of the tokenize-endpoint
stack for Weaviate 1.37.0:

- Property.TextAnalyzer: pin ASCII folding and stopword preset per
  property at index time. Reuses the TextAnalyzerConfig record already
  introduced for /v1/tokenize so tokenize-at-query and index-at-insert
  stay aligned. Propagates through nested properties via Property->
  NestedProperties recursion.
- InvertedIndexConfig.StopwordPresets: named preset->word-list map on
  the collection inverted-index config. Properties reference presets
  via TextAnalyzer.StopwordPreset. Round-trips through create + update.
- InvertedIndexConfigUpdate.StopwordPresets: mirrors the set accessor
  on the update wrapper so c.InvertedIndexConfig.StopwordPresets = ...
  works inside collection.Config.Update(...).
- Preflight in CollectionsClient.Create: detects either feature in the
  incoming schema and throws WeaviateVersionMismatchException when the
  connected server is older than 1.37.0, before any REST call.
- Rename TokenizeAnalyzerConfig -> TextAnalyzerConfig: same shape now
  serves both the tokenize endpoint and the property-level analyzer,
  matching the server type name and Python naming.
- Integration tests in TestCollectionTextAnalyzer.cs cover preset
  round-trip, update, referenced-removal rejection, ascii-fold combos,
  and version-gate behaviour.
- CHANGELOG + docs/TOKENIZE_API_USAGE.md extended with worked examples
  for the schema-side analyzer and stopword presets.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
mpartipilo changed the title from "feat: add tokenize endpoint support (Weaviate 1.37.0+)" to "feat: tokenize endpoints + property-level TextAnalyzer + StopwordPresets (Weaviate 1.37.0+)" on Apr 21, 2026
…pwordPresets rejections

The `StopwordPresets_RemoveInUse_RejectedByServer` and
`StopwordPresets_RemoveReferencedByNested_RejectedByServer` tests expected
`WeaviateClientException`, but the server returns HTTP 422 which the client
maps to `WeaviateUnprocessableEntityException : WeaviateServerException`.
The test names already indicate these are server-side rejections — align the
assertions with the actual (and correct) exception type.
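
The type relationship behind this fix can be sketched with stand-in classes. Everything below is hypothetical: the `...SketchException` types only mirror the names mentioned above, and modeling the client and server exceptions as unrelated branches is an assumption made for illustration, not a statement about the library's real hierarchy.

```csharp
using System;

// Hypothetical stand-ins; the real exception hierarchy may be shaped differently.
public class ClientSketchException : Exception
{
    public ClientSketchException(string message) : base(message) { }
}

public class ServerSketchException : Exception
{
    public ServerSketchException(string message) : base(message) { }
}

// Mirrors "WeaviateUnprocessableEntityException : WeaviateServerException".
public sealed class UnprocessableEntitySketchException : ServerSketchException
{
    public UnprocessableEntitySketchException(string message) : base(message) { }
}

public static class StatusMappingSketch
{
    // Maps an HTTP status code to the exception a caller would observe in this sketch.
    public static Exception ForStatus(int statusCode) => statusCode switch
    {
        422 => new UnprocessableEntitySketchException("HTTP 422: request rejected by the server"),
        _ => new ServerSketchException($"HTTP {statusCode}"),
    };
}
```

Under these assumptions, an assertion that expects the client-side type never matches a 422 rejection, while one that expects the unprocessable-entity type (or its server-side base) does.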

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>