CodeMatrix1/LocalRAG
LocalRAG

🧠 AI-Powered Document Chunking and Querying Project

This project updates a database with new documents, splits them into smaller chunks, and answers a given query text by retrieving the most relevant chunks and generating a response from them. Its core features are document chunking, database updating, and querying, built on a Chroma vector database and pre-trained language models. The aim is an efficient and effective way to manage and query large documents.

🚀 Features

  • Document chunking: splits documents into smaller chunks based on a specified size
  • Database updating: updates the database with new document chunks
  • Querying: queries the database using a given query text to generate a response
  • Vector database: uses a Chroma vector database to store and query document chunks
  • Pre-trained language models: uses pre-trained language models for generating responses
  • Embeddings generation: generates embeddings for document chunks and query texts
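The chunking step above can be sketched in plain Python. The project itself uses langchain's RecursiveCharacterTextSplitter; this minimal sketch (with assumed chunk_size and overlap parameters) only illustrates the fixed-size-with-overlap idea:

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, each overlapping the previous one."""
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either chunk; RecursiveCharacterTextSplitter additionally tries to break on natural separators such as paragraphs and sentences.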

🛠️ Tech Stack

  • langchain.vectorstores for Chroma vector database
  • langchain.text_splitter for RecursiveCharacterTextSplitter
  • langchain.document_loaders.pdf for PyPDFDirectoryLoader
  • Embeddings.py (local module) for embeddings generation
  • transformers for loading pre-trained language models
  • torch for GPU acceleration and tensor operations
  • sentence_transformers for natural language processing tasks
  • huggingface_hub for downloading models from the Hugging Face Hub
  • pathlib for handling file paths
  • os for interacting with the operating system
  • dotenv for loading environment variables
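Retrieval from the vector store boils down to comparing the query embedding against the stored chunk embeddings, typically by cosine similarity. The project delegates this to Chroma; a minimal, dependency-free sketch of the comparison itself:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

A sentence-transformers model such as all-MiniLM-L12-v2 maps each chunk and each query to a fixed-length vector, so chunks whose vectors score closest to the query vector are the ones returned as context.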

📦 Installation

To install the project, follow these steps:

  1. Clone the repository using git clone
  2. Install the required dependencies using pip install -r requirements.txt
  3. Download the required models using the following scripts:
  • installations/mistral_install.py or installations/phi2_download.py (whichever is compatible with your hardware)
  • installations/embeddings_install.py

💻 Usage

To use the project, follow these steps:

  1. Update the database with new documents using update_db.py
  2. Query the database with a given query text using query_rag.py
  3. Evaluate the response generated by query_rag.py using eval_resp.py
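The query step follows the usual RAG pattern: retrieve the top-scoring chunks, stitch them into a prompt, and hand the prompt to the language model. This sketch substitutes a toy word-overlap scorer for real vector search, and the prompt template is an assumption rather than the one from the repository:

```python
# Hypothetical prompt template; the repository's actual template may differ.
PROMPT_TEMPLATE = """Answer the question based only on the following context:

{context}

---

Question: {question}"""

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    def score(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))
    return sorted(chunks, key=score, reverse=True)[:k]

def build_prompt(query: str, chunks: list[str], k: int = 3) -> str:
    """Join the top-k retrieved chunks into the context block of the prompt."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=query)
```

In the real pipeline, retrieval is a Chroma similarity search over stored embeddings, and the assembled prompt is passed to the locally downloaded Mistral or Phi-2 model to generate the response.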

📂 Project Structure

.
├── update_db.py
├── query_rag.py
├── eval_resp.py
├── Embeddings.py
├── installations
│   ├── mistral_install.py
│   ├── phi2_download.py
│   ├── embeddings_install.py
├── models
│   ├── mistral
│   ├── phi-2
│   ├── all-MiniLM-L12-v2
