A modular image retrieval system that uses BLIP for image captioning and configurable embedding models for semantic search.
- Modular Architecture: Easily enable/disable embedding models via configuration
- Multiple Embedding Models: Support for Qwen, GTE-multilingual, and EmbeddingGemma
- Flexible Configuration: Choose which models to use without code changes
- Incremental Indexing: Only processes new images, skips already indexed ones
- On-the-fly Search: Search additional folders without pre-indexing
- Install dependencies using uv:

  ```bash
  uv sync
  ```

- Configure which models to use by editing `config.py`:

  ```python
  EMBEDDING_MODELS = {
      "qwen": False,   # Qwen/Qwen3-Embedding-0.6B
      "gte": True,     # Alibaba-NLP/gte-multilingual-base (DEFAULT)
      "gemma": False,  # google/embeddinggemma-300m
  }
  ```

- Make sure you have a CUDA-compatible GPU for optimal performance (optional, works on CPU too).
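For reference, the scripts only ever load models whose flag is `True`. A tiny helper in that spirit (hypothetical — the repo's actual loader may differ):

```python
# Hypothetical helper mirroring how the scripts might read config.py;
# the repo's actual loading code may differ.
EMBEDDING_MODELS = {
    "qwen": False,
    "gte": True,
    "gemma": False,
}

def enabled_models(config):
    """Return the names of all models flagged True in the config dict."""
    return [name for name, on in config.items() if on]

# enabled_models(EMBEDDING_MODELS) -> ['gte']
```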
Run the indexing script to process all images in a folder:

```bash
uv run index_images.py
```

The script will:
- Show which models are enabled
- Ask for a folder path containing images
- Generate captions for all images using BLIP
- Create embeddings using all enabled models
- Store embeddings and metadata in the `embeddings_data/` folder
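The incremental-indexing behavior described above (only new images are processed) can be sketched as follows. `caption_fn` and `embed_fn` are hypothetical stand-ins for BLIP captioning and the enabled embedding models, and the real script may store metadata differently:

```python
import json
import os

def index_folder(folder, metadata_path, caption_fn, embed_fn):
    """Index only images not already recorded in the metadata file.

    caption_fn and embed_fn are hypothetical stand-ins for BLIP
    and the enabled embedding models (sketch, not the repo's code)."""
    meta = {"images": {}}
    if os.path.exists(metadata_path):
        with open(metadata_path) as f:
            meta = json.load(f)
    for name in sorted(os.listdir(folder)):
        if not name.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        path = os.path.join(folder, name)
        if path in meta["images"]:
            continue  # incremental indexing: already processed, skip
        caption = caption_fn(path)  # BLIP would caption the image here
        meta["images"][path] = {
            "caption": caption,
            "embedding": embed_fn(caption),  # vector(s) from enabled models
        }
    with open(metadata_path, "w") as f:
        json.dump(meta, f)
    return meta
```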
Run the search script to find similar images:

```bash
uv run search_images.py
```

The script will:
- Show which models are enabled
- Optionally index images from an additional folder
- Ask for a search query
- Find the top 3 most similar images using all enabled models
- Display the images with their captions and similarity scores
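Under the hood, a top-3 search typically ranks stored embeddings by cosine similarity to the query embedding. A minimal, dependency-free sketch of that ranking step (function names are illustrative, not the repo's actual API):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, index, k=3):
    """index: {image_path: embedding}. Return the k best (path, vec) pairs."""
    return sorted(index.items(),
                  key=lambda kv: cosine(query_vec, kv[1]),
                  reverse=True)[:k]
```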
```
ImageRetrival/
├── config.py            # Configuration file - enable/disable models here
├── embedding_models.py  # Modular embedding model classes
├── index_images.py      # Image indexing script
├── search_images.py     # Image search script
├── embeddings_data/     # Stored embeddings and metadata
│   ├── embeddings_*.npy # NumPy arrays of embeddings (one per model)
│   └── metadata.json    # Image paths, captions, and timing info
└── README.md            # This file
```
Edit `config.py` to enable or disable embedding models:

```python
EMBEDDING_MODELS = {
    "qwen": False,   # Set to True to enable
    "gte": True,     # Currently enabled (default)
    "gemma": False,  # Set to True to enable
}
```
- **Qwen** (`qwen`): `Qwen/Qwen3-Embedding-0.6B`
  - Good accuracy, moderate speed
  - Supports query-specific encoding
- **GTE-multilingual** (`gte`): `Alibaba-NLP/gte-multilingual-base`
  - Fastest performance
  - Multilingual support
  - Currently enabled by default
- **EmbeddingGemma** (`gemma`): `google/embeddinggemma-300m`
  - Moderate speed
  - Built-in similarity computation
  - Separate query/document encoding
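The per-model differences above (some models support query-specific encoding, others encode queries and documents the same way) suggest a base-class shape like the following. This is a guess at the interface in `embedding_models.py`, not its actual code:

```python
from abc import ABC, abstractmethod

class BaseEmbeddingModel(ABC):
    """Guessed shape of the shared base class; the real one in
    embedding_models.py may differ."""

    def __init__(self, model_name):
        self.model_name = model_name

    @abstractmethod
    def encode_documents(self, texts):
        """Embed a list of caption strings."""

    def encode_query(self, text):
        # Models without query-specific encoding can fall back to
        # encoding the query like a document; models with separate
        # query/document encoding (e.g. Gemma) would override this.
        return self.encode_documents([text])[0]

# Toy subclass just to show the contract (embeds a string as its length).
class LengthEmbedding(BaseEmbeddingModel):
    def encode_documents(self, texts):
        return [[float(len(t))] for t in texts]
```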
- `config.py`: Configuration file to enable/disable embedding models
- `embedding_models.py`: Modular embedding model implementations
- `index_images.py`: Indexes images by generating captions and embeddings
- `search_images.py`: Searches through indexed images and displays results
- `embeddings_data/`: Directory where embeddings and metadata are stored
- `embeddings_*.npy`: NumPy arrays of embeddings (one file per enabled model)
- `metadata.json`: JSON file with image paths, captions, timing, and enabled models
- Python >= 3.12
- CUDA-compatible GPU (recommended, but works on CPU)
- `transformers` >= 4.51.0
- `sentence-transformers` >= 3.0.0
- `torch` >= 2.0.0
- `numpy` >= 1.24.0
- `matplotlib` >= 3.7.0
- `pillow` >= 10.0.0
To add a new embedding model:

1. Add the model configuration to `config.py`:

   ```python
   EMBEDDING_MODELS = {
       "new_model": False,  # Add your new model
   }

   MODEL_CONFIGS = {
       "new_model": {
           "name": "model-name/path",
           # ... other config
       }
   }
   ```

2. Create a model class in `embedding_models.py`:

   ```python
   class NewModelEmbedding(BaseEmbeddingModel):
       def __init__(self):
           super().__init__("model-name/path")

       # Implement required methods
   ```

3. Register it in the factory function in `embedding_models.py`:

   ```python
   model_classes = {
       "new_model": NewModelEmbedding,
       # ...
   }
   ```

- Only enabled models will be loaded and used
- Embeddings are stored separately for each model
- The system automatically handles missing embeddings (regenerates if needed)
- All enabled models are used simultaneously for comparison
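The last note — using all enabled models simultaneously for comparison — could look roughly like this. Names and data shapes here are hypothetical; the repo's real search code may organize this differently:

```python
def dot(a, b):
    """Dot product of two equal-length vectors (stand-in similarity)."""
    return sum(x * y for x, y in zip(a, b))

def search_all_models(query, models, indexes, k=3):
    """Run one query through every enabled model and collect per-model
    top-k (path, score) lists for side-by-side comparison.

    models:  {model_name: object with encode_query(text) -> vector}
    indexes: {model_name: {image_path: embedding}}
    """
    results = {}
    for name, model in models.items():
        qvec = model.encode_query(query)
        scored = sorted(((path, dot(qvec, vec))
                         for path, vec in indexes[name].items()),
                        key=lambda ps: ps[1], reverse=True)
        results[name] = scored[:k]
    return results
```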