TGLuong/vector-embedding-search

Test Embedding

Rust demo service for semantic search over user behavior text using fastembed, Axum, SeaORM, Postgres, and pgvector.

The project stores short behavior descriptions, turns them into embeddings, and searches for behavior records by semantic similarity instead of exact keyword matching.

How Embedding Search Works

Traditional search usually compares words directly. Embedding search converts text into vectors, then ranks results by the distance between those vectors, so texts with similar meaning score as close.

This service follows this flow:

  1. A client sends user behavior text to POST /api/v0/user.
  2. The app uses fastembed with AllMiniLML6V2 to convert the behavior text into a 384-dimension vector.
  3. The original behavior text and its vector are stored in Postgres.
  4. A client sends search text to GET /api/v0/search.
  5. The app embeds the search text with the same model.
  6. Postgres ranks stored rows by cosine distance with pgvector using embedding <=> query_embedding.
  7. The API returns the top 5 most similar behavior records.

Example: a search for "likes outdoor activities" can match a record like "went hiking on the weekend" even though the two share no keywords.
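To make steps 6 and 7 concrete, here is a minimal in-process sketch of what ranking by cosine distance means. It is not the service's code: in the service the ranking happens inside Postgres through the pgvector `<=>` operator. The function names `cosine_distance` and `top_k` are hypothetical and chosen only for illustration.

```rust
/// Cosine distance between two vectors: 1 - cos(angle between a and b).
/// Returns None when the lengths differ or either vector has zero norm.
fn cosine_distance(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(1.0 - dot / (norm_a * norm_b))
}

/// Indices of the `k` stored vectors closest to `query`, nearest first.
/// Rows whose distance is undefined are skipped.
fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .filter_map(|(i, v)| cosine_distance(query, v).map(|d| (i, d)))
        .collect();
    // Smaller distance means more similar, so sort ascending.
    scored.sort_by(|a, b| a.1.total_cmp(&b.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}
```

A distance near 0 means the two vectors point in almost the same direction, and a distance of 1 means they are orthogonal. Calling `top_k(&query, &stored, 5)` mirrors the "top 5" limit described above.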

Architecture

HTTP client
    |
    v
Axum API (`src/transport/http`)
    |
    v
Storage trait (`src/storage.rs`)
    |
    +--> fastembed model (`src/embedding`)
    |
    +--> Postgres + pgvector (`src/storage/postgres.rs`)

Main components:

  • src/main.rs: CLI entry point for starting the server and running migrations.
  • src/transport/http: Axum routes and HTTP server setup.
  • src/embedding: wrapper around the fastembed text embedding model.
  • src/storage: storage interface, Postgres implementation, SeaORM models, and migrations.
  • docker-compose.yaml: local Postgres database using the pgvector/pgvector image.
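
The layering above can be illustrated with a small sketch in which the HTTP layer would depend only on a storage trait, and the backend behind it embeds and ranks. Everything here is an assumption made for illustration: the trait shape, the method names, `InMemoryStorage`, and `toy_embed` are hypothetical, and the real trait in `src/storage.rs` may differ (for example, it is probably async and returns errors). The toy embedder only counts a few keywords and is not semantic; it stands in for the fastembed model.

```rust
/// Hypothetical storage interface; the real trait may look different.
trait Storage {
    fn add_behavior(&mut self, user_id: &str, behavior: &str);
    fn search(&self, query: &str, limit: usize) -> Vec<(String, String)>;
}

/// In-memory stand-in for the Postgres backend.
struct InMemoryStorage {
    embed: fn(&str) -> Vec<f32>,           // stand-in for the fastembed model
    rows: Vec<(String, String, Vec<f32>)>, // (user_id, behavior, embedding)
}

fn cosine_distance(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (na * nb)
}

impl Storage for InMemoryStorage {
    fn add_behavior(&mut self, user_id: &str, behavior: &str) {
        // Embed once at write time and keep the vector next to the text.
        let embedding = (self.embed)(behavior);
        self.rows
            .push((user_id.to_string(), behavior.to_string(), embedding));
    }

    fn search(&self, query: &str, limit: usize) -> Vec<(String, String)> {
        // Embed the query with the same embedder, then rank by distance.
        let q = (self.embed)(query);
        let mut scored: Vec<(f32, &(String, String, Vec<f32>))> = self
            .rows
            .iter()
            .map(|row| (cosine_distance(&q, &row.2), row))
            .collect();
        scored.sort_by(|a, b| a.0.total_cmp(&b.0));
        scored
            .into_iter()
            .take(limit)
            .map(|(_, row)| (row.0.clone(), row.1.clone()))
            .collect()
    }
}

/// Toy embedder: keyword counts plus a constant component so the vector
/// is never all zeros. Not semantic; for illustration only.
fn toy_embed(text: &str) -> Vec<f32> {
    let lower = text.to_lowercase();
    let mut v: Vec<f32> = ["rust", "hiking", "async"]
        .iter()
        .map(|w| lower.matches(*w).count() as f32)
        .collect();
    v.push(1.0);
    v
}
```

Constructing `InMemoryStorage { embed: toy_embed, rows: Vec::new() }` gives a backend that can be exercised without Postgres, which is the main benefit of putting a trait between the routes and the database.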

API Demo

Start by adding a few behavior records:

curl -X POST http://127.0.0.1:8080/api/v0/user \
  -H 'content-type: application/json' \
  -d '{"user_id":"user-1","behavior":"watched several videos about Rust web services"}'
curl -X POST http://127.0.0.1:8080/api/v0/user \
  -H 'content-type: application/json' \
  -d '{"user_id":"user-2","behavior":"read articles about hiking trails and camping gear"}'
curl -X POST http://127.0.0.1:8080/api/v0/user \
  -H 'content-type: application/json' \
  -d '{"user_id":"user-3","behavior":"searched for async programming tutorials"}'

Then search by meaning:

curl -X GET http://127.0.0.1:8080/api/v0/search \
  -H 'content-type: text/plain' \
  --data 'learning backend development with Rust'

The result should prioritize behavior records that are semantically close to the query, such as the Rust web services and async programming records.

Local Development

Prerequisites

  • Rust toolchain
  • Docker
  • Docker Compose

Start Postgres

docker compose up -d

This starts Postgres on localhost:5432 with:

  • user: app
  • password: app
  • database: app

Start the Server

cargo run -- start

By default, the server listens on 127.0.0.1:8080 and runs migrations automatically.

The first run may download the embedding model into .fastembed_cache.

Run Migrations Manually

cargo run -- migrate up
cargo run -- migrate down

The migration creates:

  • the vector Postgres extension
  • the user_behavior table
  • a vector(384) embedding column
  • an HNSW index using vector_cosine_ops
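
As a rough sketch of what such a migration and the matching search query could look like in SQL, the statements below use standard pgvector syntax held in Rust string constants. They are not copied from the project: the column names `id`, `user_id`, and `behavior`, the index name, and the helper `to_pgvector_literal` are assumptions, and the real migration is written with SeaORM and may differ.

```rust
/// Assumed DDL, for illustration only.
const CREATE_EXTENSION: &str = "CREATE EXTENSION IF NOT EXISTS vector";

const CREATE_TABLE: &str = "CREATE TABLE IF NOT EXISTS user_behavior (
    id BIGSERIAL PRIMARY KEY,
    user_id TEXT NOT NULL,
    behavior TEXT NOT NULL,
    embedding vector(384) NOT NULL
)";

/// The operator class must match the operator used at query time:
/// vector_cosine_ops serves ORDER BY ... <=> (cosine distance).
const CREATE_INDEX: &str = "CREATE INDEX IF NOT EXISTS user_behavior_embedding_idx
    ON user_behavior USING hnsw (embedding vector_cosine_ops)";

/// Assumed search query: nearest rows first, limited to 5.
const SEARCH: &str = "SELECT user_id, behavior FROM user_behavior
    ORDER BY embedding <=> $1::vector LIMIT 5";

/// Format a vector in pgvector's text input form, e.g. [1,2.5,-0.25].
fn to_pgvector_literal(v: &[f32]) -> String {
    let parts: Vec<String> = v.iter().map(|x| x.to_string()).collect();
    format!("[{}]", parts.join(","))
}
```

The text literal is one way to bind the query embedding as the `$1` parameter. A driver with native pgvector support could bind the vector directly instead.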

Configuration

Configuration is available through CLI flags and environment variables.

Setting                              Default                                  Description
DATABASE / --database                postgresql://app:app@localhost:5432/app  Postgres connection string
HTTP_ADDR / --http-addr              127.0.0.1:8080                           HTTP listen address
MIGRATION / --migration              true                                     Run migrations when starting the server
EMBEDDING_BATCH / --embedding-batch  unset                                    Optional embedding batch size passed to fastembed
STEP / --step                        unset                                    Optional migration step count for migrate up or migrate down

Example:

HTTP_ADDR=127.0.0.1:3000 cargo run -- start
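
The way a setting falls back from flag to environment variable to default can be sketched as follows. The precedence shown (CLI flag first, then environment variable, then the default) is an assumption based on common CLI conventions, not something this README states, and the `resolve` and `http_addr` helpers are hypothetical.

```rust
use std::net::{AddrParseError, SocketAddr};

/// Pick the first value that is present: flag, then env var, then default.
fn resolve(cli: Option<&str>, env_value: Option<&str>, default: &str) -> String {
    cli.or(env_value).unwrap_or(default).to_string()
}

/// Resolve and parse the HTTP listen address, using the documented default.
fn http_addr(cli: Option<&str>, env_value: Option<&str>) -> Result<SocketAddr, AddrParseError> {
    resolve(cli, env_value, "127.0.0.1:8080").parse()
}
```

In a real binary the environment value would come from something like `std::env::var("HTTP_ADDR").ok().as_deref()`. Passing it in as an argument keeps the function easy to test.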

Implementation Notes

  • Embedding model: AllMiniLML6V2
  • Embedding dimension: 384
  • Search distance: cosine distance through pgvector
  • Search limit: top 5 records
  • Database access: SeaORM with Postgres
  • HTTP framework: Axum

Local generated data is ignored by git:

  • target/
  • data/
  • .fastembed_cache/

Troubleshooting

If Postgres cannot start, check whether another process is already using port 5432.

If the API cannot connect to the database, make sure the Postgres container started by docker compose up -d is running and the database URL matches postgresql://app:app@localhost:5432/app.

If the first request or startup is slow, the embedding model may still be downloading or initializing.

If you want to reset local database state, stop Postgres and remove the local data/ directory:

docker compose down
rm -rf data
docker compose up -d
