CodeMatrix1/LocalRAG
LocalRAG

🧠 AI-Powered Document Chunking and Querying Project

This project updates a database with new documents, splits them into smaller chunks, and answers a given query text by retrieving the most relevant chunks and generating a response from them. Its core features are document chunking, database updating, and querying, built on a Chroma vector database and pre-trained language models. The aim is an efficient and effective way to manage and query large documents.

🚀 Features

  • Document chunking: splits documents into smaller chunks based on a specified size
  • Database updating: updates the database with new document chunks
  • Querying: queries the database using a given query text to generate a response
  • Vector database: uses a Chroma vector database to store and query document chunks
  • Pre-trained language models: uses pre-trained language models for generating responses
  • Embeddings generation: generates embeddings for document chunks and query texts
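The chunking step above can be sketched in plain Python. The project itself uses langchain's RecursiveCharacterTextSplitter; this minimal sketch (with assumed chunk_size and overlap parameters) only illustrates the fixed-size-with-overlap idea:

```python
def chunk_text(text: str, chunk_size: int = 100, overlap: int = 20) -> list[str]:
    """Split text into fixed-size chunks, each overlapping the previous one."""
    chunks = []
    step = chunk_size - overlap  # how far the window advances each iteration
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
    return chunks
```

Overlap keeps sentences that straddle a chunk boundary retrievable from either chunk; RecursiveCharacterTextSplitter additionally tries to break on natural separators such as paragraphs and sentences.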

🛠️ Tech Stack

  • langchain.vectorstores for Chroma vector database
  • langchain.text_splitter for RecursiveCharacterTextSplitter
  • langchain.document_loaders.pdf for PyPDFDirectoryLoader
  • Embeddings.py (local module) for embeddings generation
  • transformers for loading pre-trained language models
  • torch for GPU acceleration and tensor operations
  • sentence_transformers for natural language processing tasks
  • huggingface_hub for downloading models from the Hugging Face Hub
  • pathlib for handling file paths
  • os for interacting with the operating system
  • dotenv for loading environment variables
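Retrieval from the vector store boils down to comparing the query embedding against the stored chunk embeddings, typically by cosine similarity. The project delegates this to Chroma; a minimal, dependency-free sketch of the comparison itself:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

A sentence-transformers model such as all-MiniLM-L12-v2 maps each chunk and each query to a fixed-length vector, so chunks whose vectors score closest to the query vector are the ones returned as context.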

📦 Installation

To install the project, follow these steps:

  1. Clone the repository using git clone
  2. Install the required dependencies using pip install -r requirements.txt
  3. Download the required models using the following scripts:
  • installations/mistral_install.py or installations/phi2_download.py (whichever is compatible with your hardware)
  • installations/embeddings_install.py

💻 Usage

To use the project, follow these steps:

  1. Update the database with new documents using update_db.py
  2. Query the database with a given query text using query_rag.py
  3. Evaluate the response generated by query_rag.py using eval_resp.py
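The query step follows the usual RAG pattern: retrieve the top-scoring chunks, stitch them into a prompt, and hand the prompt to the language model. This sketch substitutes a toy word-overlap scorer for real vector search, and the prompt template is an assumption rather than the one from the repository:

```python
# Hypothetical prompt template; the repository's actual template may differ.
PROMPT_TEMPLATE = """Answer the question based only on the following context:

{context}

---

Question: {question}"""

def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Rank chunks by word overlap with the query (stand-in for vector search)."""
    query_words = set(query.lower().split())
    def score(chunk: str) -> int:
        return len(query_words & set(chunk.lower().split()))
    return sorted(chunks, key=score, reverse=True)[:k]

def build_prompt(query: str, chunks: list[str], k: int = 3) -> str:
    """Join the top-k retrieved chunks into the context block of the prompt."""
    context = "\n\n".join(retrieve(query, chunks, k))
    return PROMPT_TEMPLATE.format(context=context, question=query)
```

In the real pipeline, retrieval is a Chroma similarity search over stored embeddings, and the assembled prompt is passed to the locally downloaded Mistral or Phi-2 model to generate the response.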

📂 Project Structure

.
├── update_db.py
├── query_rag.py
├── eval_resp.py
├── Embeddings.py
├── installations
│   ├── mistral_install.py
│   ├── phi2_download.py
│   ├── embeddings_install.py
├── models
│   ├── mistral
│   ├── phi-2
│   ├── all-MiniLM-L12-v2
