A HIPAA-compliant medical chatbot that uses CyborgDB encrypted vector search to build a secure RAG pipeline in which medical record embeddings remain encrypted end-to-end.
This application demonstrates encryption-in-use for AI/ML workloads, preventing vector inversion attacks by keeping embeddings encrypted throughout storage and retrieval. Medical professionals can interact with patient data through a secure chatbot without exposing sensitive vector representations.
- 🔐 End-to-End Encryption: Medical embeddings encrypted with AES-256-GCM before storage
- 🏥 HIPAA Compliance: Full audit logging and role-based access control
- 🤖 RAG Pipeline: Retrieval-Augmented Generation with encrypted similarity search
- ⚡ Real-time Chat: WebSocket-based streaming responses
- 📊 Performance Metrics: Built-in benchmarking for query latency and throughput
- 🐳 Docker Ready: Complete containerization with docker-compose
graph TB
A[React Frontend] -->|WebSocket| B[NestJS Backend]
A -->|REST API| B
B --> C[Auth Module - JWT + RBAC]
B --> D[RAG Service]
D --> E[Embeddings Service]
D --> F[Encryption Service]
D --> G[CyborgDB Adapter]
G -->|Encrypted Vectors| H[(CyborgDB)]
B --> I[(PostgreSQL)]
I --> J[Medical Records]
I --> K[Audit Logs]
I --> L[Users]
style H fill:#ff6b6b
style F fill:#4ecdc4
style C fill:#45b7d1
Backend:
- NestJS (TypeScript)
- PostgreSQL with TypeORM
- CyborgDB (Encrypted Vector Search)
- JWT Authentication
- WebSocket (Socket.io)
Frontend:
- React 18
- TypeScript
- Vite
- TailwindCSS
- Socket.io-client
Embedding Models:
- OpenAI Embeddings API (configurable)
- Ollama (local deployment option)
- Node.js 18+ and npm
- Docker and Docker Compose
- PostgreSQL 14+ (if running without Docker)
- CyborgDB instance or credentials
Create .env files in both backend and frontend directories:
Backend (backend/.env):
# Copy from backend/.env.example and fill in values
NODE_ENV=development
PORT=3000
# Database
DATABASE_HOST=localhost
DATABASE_PORT=5432
DATABASE_USER=postgres
DATABASE_PASSWORD=your_password
DATABASE_NAME=encrypted_medical_rag
# CyborgDB
CYBORGDB_API_URL=https://api.cyborgdb.com
CYBORGDB_API_KEY=your_cyborgdb_api_key
CYBORGDB_INDEX_NAME=medical-embeddings
# Encryption
ENCRYPTION_KEY=your-256-bit-hex-key-here
ENCRYPTION_ALGORITHM=aes-256-gcm
# JWT
JWT_SECRET=your-jwt-secret-key
JWT_EXPIRATION=24h
# Embeddings (OpenAI or Ollama)
EMBEDDING_PROVIDER=openai
OPENAI_API_KEY=your_openai_api_key
# OR for Ollama:
# EMBEDDING_PROVIDER=ollama
# OLLAMA_BASE_URL=http://localhost:11434
Frontend (frontend/.env):
VITE_API_BASE_URL=http://localhost:3000
VITE_WS_URL=ws://localhost:3000
# Start all services
docker-compose -f docker/docker-compose.yml up --build
# Frontend: http://localhost:5173
# Backend API: http://localhost:3000
# API Docs: http://localhost:3000/api
Backend:
cd backend
npm install
npm run start:dev
Frontend:
cd frontend
npm install
npm run dev
Navigate to http://localhost:5173 and create an account. Default roles:
- user: Can view and query medical records
- doctor: Can create and manage medical records
- admin: Full access + audit log viewing
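The three default roles can be read as a permission table. The following is a minimal sketch of that idea, not the project's actual Auth module: the permission names are hypothetical, and it assumes that doctor inherits the user permissions and that admin inherits everything.

```typescript
// Hypothetical permission names; the project's real RBAC model may differ.
type Role = "user" | "doctor" | "admin";
type Permission =
  | "records:read"
  | "records:query"
  | "records:write"
  | "audit:read";

// Assumption: doctor inherits user permissions, admin inherits all of them.
const ROLE_PERMISSIONS: Record<Role, ReadonlySet<Permission>> = {
  user: new Set<Permission>(["records:read", "records:query"]),
  doctor: new Set<Permission>(["records:read", "records:query", "records:write"]),
  admin: new Set<Permission>([
    "records:read",
    "records:query",
    "records:write",
    "audit:read",
  ]),
};

// Returns true when the given role is allowed to perform the action.
function can(role: Role, permission: Permission): boolean {
  return ROLE_PERMISSIONS[role].has(permission);
}
```

A route guard could call `can(user.role, "audit:read")` before serving the audit log endpoint.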
Use the API or UI to upload medical documents. The system will:
- Chunk the document
- Generate embeddings using OpenAI/Ollama
- Encrypt embeddings with AES-256-GCM
- Store encrypted vectors in CyborgDB
- Store metadata in PostgreSQL
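The first step, chunking, could look roughly like the sketch below. It assumes fixed-size character windows with a small overlap; the function name and default sizes are illustrative, and the project may instead chunk by tokens or sentences.

```typescript
// Splits text into character windows of `chunkSize` that overlap by `overlap`.
// Assumption: character-based chunking, chosen here only for simplicity.
function chunkText(text: string, chunkSize = 500, overlap = 50): string[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in [0, chunkSize)");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}
```

The overlap keeps a sentence that straddles a boundary visible in both neighbouring chunks, at the cost of storing some text twice.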
# Example API call
curl -X POST http://localhost:3000/api/medical-records \
  -H "Authorization: Bearer YOUR_JWT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "patientId": "P001",
    "content": "Patient presents with...",
    "recordType": "clinical_note"
  }'
Open the chat interface and ask medical questions:
- "What are the symptoms for patient P001?"
- "Show me the latest lab results"
The RAG pipeline will:
- Generate embedding for your query
- Search encrypted vectors in CyborgDB
- Decrypt results in-memory only
- Assemble context
- Stream LLM response
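The context assembly step is the one that decides what the LLM actually sees. Here is a minimal sketch of one way to do it, assuming each retrieved chunk carries a similarity score where higher means more relevant and that the prompt has a simple character budget. The `RetrievedChunk` shape, the separator, and the budget are all assumptions of this example.

```typescript
interface RetrievedChunk {
  text: string;  // decrypted chunk text, held in memory only
  score: number; // similarity score; higher is more relevant (assumption)
}

const SEPARATOR = "\n---\n";

// Picks the highest-scoring chunks that fit within maxChars and joins them.
function assembleContext(chunks: RetrievedChunk[], maxChars: number): string {
  const ranked = [...chunks].sort((a, b) => b.score - a.score); // copy, then sort
  const selected: string[] = [];
  let used = 0;
  for (const chunk of ranked) {
    const cost = chunk.text.length + (selected.length > 0 ? SEPARATOR.length : 0);
    if (used + cost > maxChars) continue; // skip chunks that would overflow
    selected.push(chunk.text);
    used += cost;
  }
  return selected.join(SEPARATOR);
}
```

If the vector store returns distances rather than similarities, the sort order would need to be reversed.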
# Unit tests
cd backend
npm test
# End-to-end tests
cd backend
npm run test:e2e
# Performance benchmarks
cd scripts/performance-tests
npm install
npm run benchmark
This will generate a performance report with:
- Query latency (p50, p95, p99)
- Throughput (queries/second)
- Memory usage
- Encryption/decryption overhead
Expected Performance:
- Query latency: <100ms (p95)
- Throughput: >100 queries/second
- Encryption overhead: <10ms per operation
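To make the latency percentiles and throughput figures concrete, the sketch below shows how they can be derived from raw per-query timings. It uses the nearest-rank percentile definition; the benchmark script may use a different method (for example interpolation), and the function names here are illustrative.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of all
// samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new RangeError("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100); // 1-based rank
  return sorted[rank - 1];
}

// Summarizes one benchmark run: latency percentiles plus queries per second.
function summarize(latenciesMs: number[], wallClockMs: number) {
  return {
    p50: percentile(latenciesMs, 50),
    p95: percentile(latenciesMs, 95),
    p99: percentile(latenciesMs, 99),
    throughputQps: latenciesMs.length / (wallClockMs / 1000),
  };
}
```

A run meets the p95 target listed above when `summarize(...).p95` is below 100.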
- ✅ Encryption at Rest: All embeddings encrypted before storage
- ✅ Encryption in Transit: HTTPS/WSS for all communications
- ✅ Access Control: Role-based access with permission checks
- ✅ Audit Logging: Complete audit trail for all data access
- ✅ Authentication: JWT-based auth with secure password hashing
- ✅ Data Minimization: Only necessary data in memory
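This README does not say which password hashing algorithm the Auth module uses. As an illustration of what "secure password hashing" typically involves (a per-user random salt, a deliberately slow key-derivation function, and a constant-time comparison), here is a sketch built on Node's built-in scrypt. The choice of scrypt and the `salt:hash` storage format are assumptions of this example.

```typescript
import { randomBytes, scryptSync, timingSafeEqual } from "node:crypto";

// Returns "saltHex:hashHex". The storage format is an assumption of this sketch.
function hashPassword(password: string): string {
  const salt = randomBytes(16); // fresh random salt for every password
  const hash = scryptSync(password, salt, 64);
  return `${salt.toString("hex")}:${hash.toString("hex")}`;
}

// Re-derives the hash with the stored salt and compares in constant time.
function verifyPassword(password: string, stored: string): boolean {
  const [saltHex, hashHex] = stored.split(":");
  if (!saltHex || !hashHex) return false;
  const expected = Buffer.from(hashHex, "hex");
  if (expected.length === 0) return false; // reject malformed stored values
  const actual = scryptSync(password, Buffer.from(saltHex, "hex"), expected.length);
  return timingSafeEqual(actual, expected);
}
```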
Medical Record → Chunking → Embedding Generation
                    ↓
          AES-256-GCM Encryption
                    ↓
            CyborgDB Storage
                    ↓
          (encrypted at rest)
                    ↓
    Query → Similarity Search (encrypted)
                    ↓
          Decrypt in-memory ONLY
                    ↓
        Context Assembly → LLM
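The encrypt and decrypt steps in this flow can be sketched with Node's built-in `crypto` module. This is an illustration under stated assumptions rather than the project's Encryption Service: the `iv || tag || ciphertext` layout and the function names are made up for this example, and how CyborgDB performs similarity search over encrypted vectors is not covered here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

const IV_BYTES = 12;  // 96-bit nonce, the recommended size for GCM
const TAG_BYTES = 16; // 128-bit authentication tag

// Parses a 64-character hex string (256 bits), as ENCRYPTION_KEY suggests.
function parseKey(hex: string): Buffer {
  if (!/^[0-9a-fA-F]{64}$/.test(hex)) throw new Error("key must be 64 hex characters");
  return Buffer.from(hex, "hex");
}

// Output layout (an assumption of this sketch): iv || tag || ciphertext.
function encryptEmbedding(embedding: Float32Array, key: Buffer): Buffer {
  const iv = randomBytes(IV_BYTES); // must never repeat under the same key
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  // View the floats as raw bytes (platform byte order, no copy).
  const plain = Buffer.from(embedding.buffer, embedding.byteOffset, embedding.byteLength);
  const ciphertext = Buffer.concat([cipher.update(plain), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), ciphertext]);
}

// Throws if the key is wrong or the payload was modified.
function decryptEmbedding(payload: Buffer, key: Buffer): Float32Array {
  const iv = payload.subarray(0, IV_BYTES);
  const tag = payload.subarray(IV_BYTES, IV_BYTES + TAG_BYTES);
  const ciphertext = payload.subarray(IV_BYTES + TAG_BYTES);
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  const plain = Buffer.concat([decipher.update(ciphertext), decipher.final()]);
  // Copy into a fresh, aligned buffer before viewing the bytes as floats.
  const copy = new Uint8Array(plain);
  return new Float32Array(copy.buffer, 0, copy.byteLength / 4);
}
```

Because GCM is authenticated, a tampered payload fails at `decipher.final()` instead of yielding a silently corrupted vector.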
Once running, visit http://localhost:3000/api for interactive Swagger documentation.
- POST /api/auth/register - Register new user
- POST /api/auth/login - Login and get JWT
- GET /api/medical-records - List medical records
- POST /api/medical-records - Create medical record
- POST /api/chat/message - Send chat message
- GET /api/logs/audit - View audit logs (admin only)
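For callers working in TypeScript, a request to the create-record endpoint might be assembled as in the sketch below. The body fields are taken from the curl example earlier in this README; the helper name and the `HttpRequest` shape are hypothetical, and the response format is not specified here, so it is left out.

```typescript
interface NewMedicalRecord {
  patientId: string;
  content: string;
  recordType: string;
}

// Plain description of a request, kept separate from the HTTP client itself.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the POST /api/medical-records request for an authenticated caller.
function buildCreateRecordRequest(
  baseUrl: string,
  token: string,
  record: NewMedicalRecord,
): HttpRequest {
  return {
    url: new URL("/api/medical-records", baseUrl).toString(),
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(record),
  };
}

// Usage with the global fetch available in Node.js 18+:
//   const { url, ...init } = buildCreateRecordRequest("http://localhost:3000", jwt, record);
//   const response = await fetch(url, init);
```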
encrypted-medical-rag-ai/
├── backend/
│   ├── src/
│   │   ├── modules/
│   │   │   ├── auth/              # JWT authentication & RBAC
│   │   │   ├── cyborgdb/          # CyborgDB integration
│   │   │   ├── encryption/        # AES-256-GCM encryption
│   │   │   ├── embeddings/        # OpenAI/Ollama integration
│   │   │   ├── medical-records/
│   │   │   ├── rag/               # RAG pipeline
│   │   │   ├── chat/              # WebSocket chat
│   │   │   └── logs/              # Audit logging
│   │   ├── config/
│   │   ├── app.module.ts
│   │   └── main.ts
│   ├── test/
│   └── package.json
├── frontend/
│   ├── src/
│   │   ├── components/
│   │   ├── pages/
│   │   ├── services/
│   │   └── main.tsx
│   └── package.json
├── scripts/
│   └── performance-tests/
├── docker/
│   └── docker-compose.yml
└── README.md
- Create a new module in backend/src/modules/
- Register in app.module.ts
- Add corresponding frontend components in frontend/src/
- Update tests and documentation
- CyborgDB Integration: This implementation uses an adapter pattern. You'll need to integrate the actual CyborgDB SDK based on their documentation.
- Embedding Models: The application supports OpenAI and Ollama. For production, consider:
  - Rate limiting for API calls
  - Caching frequently used embeddings
  - Batch processing for large datasets
- Scalability: Current implementation is single-instance. For production:
  - Use Redis for session management
  - Implement horizontal scaling with load balancer
  - Consider message queue for async processing
- Vector Inversion Protection: While encryption prevents direct vector inversion, consider:
  - Rate limiting queries to prevent inference attacks
  - Monitoring for suspicious access patterns
  - Regular key rotation
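Rate limiting queries, suggested above as a defence against inference attacks, can start as a small in-process limiter. The sketch below uses a sliding window per caller; the class name, the limits, and keying by user ID are assumptions. Since state lives in one process, a multi-instance deployment would need a shared store instead.

```typescript
// Sliding-window limiter: at most `limit` calls per `windowMs` for each key.
class SlidingWindowRateLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly now: () => number;

  // `now` is injectable so tests can control the clock.
  constructor(limit: number, windowMs: number, now: () => number = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the call if the caller is under the limit.
  tryAcquire(key: string): boolean {
    const t = this.now();
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((ts) => t - ts < this.windowMs);
    const allowed = recent.length < this.limit;
    if (allowed) recent.push(t);
    this.hits.set(key, recent);
    return allowed;
  }
}
```

For example, `new SlidingWindowRateLimiter(30, 60_000)` would allow each user 30 queries per minute; the right numbers depend on real usage patterns.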
Run the benchmarking suite to evaluate performance at a chosen dataset size and query concurrency:
cd scripts/performance-tests
npm run benchmark -- --dataset-size 10000 --concurrent-queries 100
Results will be saved to performance-report.json with metrics:
- Average query latency
- Throughput (QPS)
- Encryption/decryption time
- Memory usage
- p50, p95, p99 latency percentiles
This is a demonstration project. For production use:
- Complete CyborgDB integration with official SDK
- Implement comprehensive error handling
- Add monitoring and alerting
- Conduct security audit
- Implement key rotation strategy
- Add data backup and recovery
MIT License - see LICENSE file for details
For issues or questions:
- Check the documentation
- Review audit logs for errors
- Run diagnostic scripts in scripts/
- Open an issue on GitHub
Built with ❤️ for secure AI applications
Preventing vector inversion attacks through encryption-in-use