
feat(api): add /v1/detokenize endpoint #9620

Open
Dennisadira wants to merge 1 commit into mudler:master from Dennisadira:feat/detokenize-endpoint

Conversation

@Dennisadira
Contributor

Summary

Closes #1649. This mirrors /v1/tokenize in the inverse direction: it takes a list of token IDs and returns the detokenized text. Requested by @benniekiss in the issue thread for "complete API workflow" use cases that need to turn token IDs back into text without local processing.

The proto/handler shape was discussed in #1649 (comment). @benniekiss reacted positively; landing this with the strict-mirror-of-tokenize precedent in mind. Happy to adjust if the proto naming or response shape should differ.

What's added

  • Proto (backend/backend.proto): new Detokenize(DetokenizeRequest) returns (DetokenizeResponse) RPC, with DetokenizeRequest{repeated int32 tokens} and DetokenizeResponse{string content}. The Go bindings are regenerated by make protogen-go (gitignored as usual).
  • llama.cpp backend (backend/cpp/llama-cpp/grpc-server.cpp): handler that calls common_token_to_piece per token and concatenates the pieces, the same primitive TokenizeString already uses internally in the same file.
  • Other backends: inherit the default Unimplemented from pkg/grpc/base.Base — same pattern as Detect, Rerank, etc. Backends can opt in later.
  • Go plumbing: pkg/grpc/{interface,server,backend,client,embed}.go + pkg/grpc/base/base.go updated alongside their TokenizeString counterparts.
  • HTTP: POST /v1/detokenize in core/http/endpoints/localai/detokenize.go and core/http/routes/localai.go. Request {"model": "...", "tokens": [...]}, response {"content": "..."}.
  • Auth: entry in RouteFeatureRegistry gated by the existing FeatureTokenize — no new feature flag.
  • Discovery: added under ai_functions in the routes index.
  • Swagger regenerated; authentication.md updated to list the new endpoint.

Test plan

  • make protogen-go regenerates clean
  • go build ./core/... ./pkg/grpc/... clean
  • go vet ./core/... ./pkg/grpc/... clean
  • go test -c -o /dev/null ./core/services/nodes/... clean (the existing testcontainers-based suite needs Docker; only updated the two interface mocks so the test package still compiles)
  • make swagger regenerates with the new endpoint visible
  • Manual round-trip: POST /v1/tokenize then POST /v1/detokenize returns the original text on a llama.cpp model

Assisted-by: Claude:claude-opus-4-7

Closes mudler#1649.

Mirror of the existing /v1/tokenize path, requested by @benniekiss in
the issue thread for "complete API workflow" use cases that need to
turn token IDs back into text without local processing.

- Add Detokenize gRPC RPC with DetokenizeRequest{tokens} /
  DetokenizeResponse{content} messages.
- Implement in the llama.cpp backend using common_token_to_piece, the
  same primitive TokenizeString already uses internally.
- Other backends inherit the default Unimplemented from base.Base, in
  line with how Detect, Rerank, etc. are gated per-backend.
- Wire up the Go gRPC interface, server, client, and in-process embed
  wrapper alongside their TokenizeString counterparts.
- Add the schema types, ModelDetokenize wrapper, HTTP handler, route
  registration, RouteFeatureRegistry entry (gated by FeatureTokenize so
  no new feature flag is needed), and the discovery map entry under
  ai_functions.
- Regenerated swagger reflects the new endpoint and types.
- Update authentication.md to list /v1/detokenize alongside /v1/tokenize.

Assisted-by: Claude:claude-opus-4-7
        return grpc::Status::OK;
    }

    grpc::Status Detokenize(ServerContext* context, const backend::DetokenizeRequest* request, backend::DetokenizeResponse* response) override {
Owner


This requires a test addition to our e2e-backend test suite, where we exercise a mocked backend via the API.



Development

Successfully merging this pull request may close these issues: Tokenization endpoint (#1649).
