This document provides comprehensive API documentation for the enhanced RAG (Retrieval-Augmented Generation) system with source attribution and confidence scoring.
The enhanced RAG system provides several API endpoints for:
- Document management and processing
- Conversation handling with source attribution
- Enhanced response generation with confidence scoring
- Query processing with metadata and statistics
Base URL: http://localhost:8000/api/v1
Currently, the system uses basic authentication or API keys. Include authentication headers with all requests:
Authorization: Bearer <your-api-key>
Content-Type: application/json
POST /documents/upload
Upload and process a document for RAG retrieval.
Request Body (multipart/form-data):
| Field | Type | Required | Description |
|---|---|---|---|
| `file` | File | ✓ | Document file (PDF, DOCX, TXT, MD, HTML) |
| `metadata` | JSON | ✗ | Additional document metadata |
| `processing_options` | JSON | ✗ | Custom processing configuration |
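As a rough illustration of the fields in the table above, the sketch below prepares the form parts for an upload without sending anything. The helper name `build_upload_parts` is hypothetical, and sending `metadata` and `processing_options` as JSON-encoded strings is an assumption taken from the curl example, not confirmed server behavior.

```python
import json
import mimetypes
from pathlib import Path


def build_upload_parts(file_path, metadata=None, processing_options=None):
    """Prepare multipart/form-data parts for POST /documents/upload.

    Hypothetical helper: only `file` is required; the optional JSON
    fields are serialized to strings, as in the curl example.
    """
    path = Path(file_path)
    # Guess the MIME type from the extension; fall back to a generic type.
    content_type = mimetypes.guess_type(path.name)[0] or "application/octet-stream"
    parts = {"file": {"filename": path.name, "content_type": content_type}}
    if metadata is not None:
        parts["metadata"] = json.dumps(metadata)
    if processing_options is not None:
        parts["processing_options"] = json.dumps(processing_options)
    return parts


parts = build_upload_parts(
    "document.pdf",
    metadata={"category": "legal", "priority": "high"},
)
print(parts["file"])
print(parts["metadata"])
```

The resulting dictionary can be handed to whichever HTTP library performs the actual multipart encoding.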
Example Request:
curl -X POST \
http://localhost:8000/api/v1/documents/upload \
-H "Authorization: Bearer your-api-key" \
-F "file=@document.pdf" \
-F "metadata={\"category\": \"legal\", \"priority\": \"high\"}"
Response:
{
"success": true,
"document_id": "doc_20250125_abc123",
"filename": "document.pdf",
"file_size": 1024000,
"content_type": "application/pdf",
"upload_date": "2025-01-25T10:30:00Z",
"processing_status": "processing",
"estimated_completion": "2025-01-25T10:32:00Z"
}
GET /documents/{document_id}/status
Get the processing status of a document.
Response:
{
"document_id": "doc_20250125_abc123",
"filename": "document.pdf",
"processing_status": "completed",
"chunk_count": 45,
"processing_time": 120.5,
"metadata": {
"pages": 10,
"word_count": 2500,
"language": "en"
}
}
GET /documents
List all uploaded documents with filtering options.
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `status` | string | ✗ | Filter by processing status |
| `content_type` | string | ✗ | Filter by content type |
| `limit` | integer | ✗ | Number of results (default: 50) |
| `offset` | integer | ✗ | Pagination offset (default: 0) |
| `sort_by` | string | ✗ | Sort field (upload_date, filename, file_size) |
| `sort_order` | string | ✗ | Sort order (asc, desc) |
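The `limit` and `offset` parameters, together with the `has_more` flag in the response, suggest a simple paging loop. The sketch below is one way a client might walk the full list. `documents_url`, `iter_documents`, and the `fetch_page` callback are hypothetical names, and the stand-in `fake_fetch` replaces a real HTTP call so the example runs offline.

```python
from urllib.parse import parse_qs, urlencode, urlparse

BASE_URL = "http://localhost:8000/api/v1"


def documents_url(**params):
    """Build a GET /documents URL, dropping parameters left as None."""
    query = {k: v for k, v in params.items() if v is not None}
    url = f"{BASE_URL}/documents"
    return f"{url}?{urlencode(query)}" if query else url


def iter_documents(fetch_page, limit=50, **filters):
    """Yield documents page by page until `has_more` is false.

    `fetch_page` is a caller-supplied function (an assumption of this
    sketch) that takes a URL and returns the decoded JSON response.
    """
    offset = 0
    while True:
        page = fetch_page(documents_url(limit=limit, offset=offset, **filters))
        yield from page["documents"]
        if not page.get("has_more") or not page["documents"]:
            break
        offset += len(page["documents"])


def fake_fetch(url):
    """Stand-in server holding five documents, used only for this demo."""
    query = parse_qs(urlparse(url).query)
    offset, limit = int(query["offset"][0]), int(query["limit"][0])
    stop = min(offset + limit, 5)
    docs = [{"document_id": f"doc_{i}"} for i in range(offset, stop)]
    return {"documents": docs, "total_count": 5, "has_more": stop < 5}


ids = [d["document_id"] for d in iter_documents(fake_fetch, limit=2, status="completed")]
print(ids)
```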
Response:
{
"documents": [
{
"document_id": "doc_20250125_abc123",
"filename": "document.pdf",
"file_size": 1024000,
"content_type": "application/pdf",
"upload_date": "2025-01-25T10:30:00Z",
"processing_status": "completed",
"chunk_count": 45
}
],
"total_count": 150,
"has_more": true
}
DELETE /documents/{document_id}
Delete a document and all its associated data.
Response:
{
"success": true,
"message": "Document deleted successfully",
"document_id": "doc_20250125_abc123"
}
POST /conversations
Start a new conversation with an initial query.
Request Body:
{
"initial_question": "What is the address of the property being purchased?",
"filters": {
"content_type": "application/pdf",
"document_ids": ["doc_001", "doc_002"],
"date_range": {
"start": "2024-01-01",
"end": "2025-01-25"
}
},
"search_mode": "multi_document",
"options": {
"include_sources": true,
"max_sources": 10,
"confidence_threshold": 0.5
}
}
Response:
{
"success": true,
"conversation_id": "conv_20250125_xyz789",
"answer": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT.",
"sources": [
{
"citation_id": "cite_001",
"document_id": "doc_001",
"document_name": "Purchase_Agreement.pdf",
"chunk_id": "chunk_doc001_005",
"page_number": 1,
"paragraph_number": 3,
"excerpt": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT. The buyers are Colombe De Rotalier and James Algar.",
"relevance_score": 0.89,
"highlights": [
{
"start": 41,
"end": 79
}
]
}
],
"attributions": [
{
"answerStart": 41,
"answerEnd": 79,
"sourceId": "doc_001",
"sourceStart": 41,
"sourceEnd": 79,
"confidence": 0.89
}
],
"metadata": {
"chunksAnalyzed": 15,
"documentsSearched": 5,
"processingTime": 3.82,
"retrievalTime": 1.2,
"generationTime": 2.62,
"model": "gpt-4",
"temperature": 0.3,
"maxTokens": 1000
}
}
POST /conversations/{conversation_id}/messages
Send a message to an existing conversation.
Request Body:
{
"message": "Who are the buyers?",
"filters": {
"document_ids": ["doc_001"]
},
"include_sources": true,
"search_mode": "multi_document"
}
Response: (Same format as conversation creation)
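A follow-up request like the one above could be assembled with the Python standard library as sketched below. The function name `build_message_request` is hypothetical, and the request is only built, not sent, so the example does not need a running server.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"


def build_message_request(conversation_id, message, api_key, document_ids=None):
    """Build (but do not send) a POST to /conversations/{id}/messages."""
    body = {
        "message": message,
        "include_sources": True,
        "search_mode": "multi_document",
    }
    if document_ids:
        body["filters"] = {"document_ids": list(document_ids)}
    return urllib.request.Request(
        url=f"{BASE_URL}/conversations/{conversation_id}/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_message_request(
    "conv_20250125_xyz789", "Who are the buyers?", "your-api-key", ["doc_001"]
)
print(request.get_method(), request.full_url)
# Against a live server the request could then be sent with:
# with urllib.request.urlopen(request) as resp:
#     reply = json.load(resp)
```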
GET /conversations/{conversation_id}/history
Retrieve the full conversation history.
Response:
{
"conversation_id": "conv_20250125_xyz789",
"start_time": "2025-01-25T10:30:00Z",
"total_turns": 3,
"messages": [
{
"role": "user",
"content": "What is the address of the property?",
"timestamp": "2025-01-25T10:30:00Z"
},
{
"role": "assistant",
"content": "The property is located at 172 McLeod Road, London, SE2 0BT.",
"sources": [...],
"attributions": [...],
"metadata": {...},
"timestamp": "2025-01-25T10:30:05Z"
}
],
"filters": {
"content_type": "application/pdf"
}
}
POST /query/enhanced
Process a query using the enhanced RAG system with advanced features.
Request Body:
{
"query": "What are the key terms of the purchase agreement?",
"options": {
"use_hybrid_search": true,
"enable_reranking": true,
"enable_compression": true,
"max_chunks": 20,
"similarity_threshold": 0.7,
"include_reasoning": true
},
"filters": {
"document_types": ["legal", "contracts"],
"date_range": {
"start": "2024-01-01",
"end": "2025-01-25"
}
}
}
Response:
{
"success": true,
"query_id": "query_abc123def",
"answer": "The key terms of the purchase agreement include...",
"reasoning": {
"query_type": "factual",
"confidence": 0.92,
"reasoning_chain": [
"Identified query as factual information request",
"Retrieved relevant contract documents",
"Extracted key terms from purchase agreement sections"
]
},
"sources": [
{
"citation_id": "cite_002",
"document_id": "doc_001",
"document_name": "Purchase_Agreement.pdf",
"chunk_id": "chunk_doc001_010",
"page_number": 3,
"paragraph_number": 2,
"excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
"relevance_score": 0.94,
"confidence_score": 0.91,
"highlights": [
{
"start": 4,
"end": 30
}
],
"rerank_score": 0.87
}
],
"attributions": [
{
"answerStart": 45,
"answerEnd": 71,
"sourceId": "doc_001",
"sourceStart": 4,
"sourceEnd": 30,
"confidence": 0.91,
"attribution_type": "direct_quote"
}
],
"metadata": {
"chunksAnalyzed": 25,
"documentsSearched": 8,
"processingTime": 4.15,
"retrievalTime": 1.8,
"reranking_time": 0.95,
"generationTime": 1.4,
"model": "gpt-4",
"temperature": 0.3,
"maxTokens": 1500,
"hybrid_weights": {
"semantic": 0.7,
"lexical": 0.3
},
"query_classification": {
"type": "factual",
"confidence": 0.92,
"subcategories": ["contract_terms", "legal_documents"]
}
}
}
POST /query/{query_id}/rerun
Re-run a previous query with modified parameters.
Request Body:
{
"options": {
"clear_cache": true,
"expand_search": true,
"temperature": 0.5,
"max_chunks": 30
}
}
GET /citations/{citation_id}
Get detailed information about a specific citation.
Response:
{
"citation_id": "cite_002",
"document_id": "doc_001",
"document_name": "Purchase_Agreement.pdf",
"chunk_id": "chunk_doc001_010",
"page_number": 3,
"paragraph_number": 2,
"excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
"full_context": "This agreement outlines the terms... The purchase price is £850,000, with completion scheduled for March 15, 2025. The buyer agrees to...",
"relevance_score": 0.94,
"confidence_score": 0.91,
"metadata": {
"section": "financial_terms",
"extracted_at": "2025-01-25T09:15:00Z"
}
}
GET /documents/{document_id}/preview
Get a preview of the document with highlighted relevant sections.
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `query` | string | ✗ | Query to highlight relevant sections |
| `chunk_ids` | string | ✗ | Comma-separated chunk IDs to highlight |
| `page` | integer | ✗ | Specific page to preview |
Response:
{
"document_id": "doc_001",
"document_name": "Purchase_Agreement.pdf",
"total_pages": 10,
"preview_data": {
"page_number": 3,
"text_content": "This agreement outlines the terms...",
"highlights": [
{
"start": 45,
"end": 71,
"type": "query_match",
"confidence": 0.91
}
],
"metadata": {
"word_count": 450,
"char_count": 2800
}
}
}
GET /statistics
Get overall system statistics.
Response:
{
"documents": {
"total_count": 150,
"by_status": {
"completed": 140,
"processing": 8,
"failed": 2
},
"by_type": {
"application/pdf": 120,
"application/vnd.openxmlformats-officedocument.wordprocessingml.document": 25,
"text/plain": 5
},
"total_size_mb": 2450.5,
"total_chunks": 15670
},
"conversations": {
"total_count": 1250,
"active_count": 45,
"avg_turns_per_conversation": 4.2
},
"performance": {
"avg_query_time": 3.2,
"avg_retrieval_time": 1.1,
"avg_generation_time": 2.1,
"cache_hit_rate": 0.65
},
"last_updated": "2025-01-25T10:30:00Z"
}
GET /analytics/queries
Get analytics data for queries and conversations.
Query Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
| `start_date` | string | ✗ | Start date (ISO 8601) |
| `end_date` | string | ✗ | End date (ISO 8601) |
| `group_by` | string | ✗ | Grouping (hour, day, week, month) |
Response:
{
"time_period": {
"start": "2025-01-20T00:00:00Z",
"end": "2025-01-25T23:59:59Z"
},
"query_stats": {
"total_queries": 450,
"unique_conversations": 180,
"avg_confidence_score": 0.78,
"query_types": {
"factual": 280,
"analytical": 120,
"comparative": 50
}
},
"performance_metrics": {
"avg_response_time": 3.2,
"p95_response_time": 8.1,
"p99_response_time": 15.3,
"error_rate": 0.02
},
"daily_breakdown": [
{
"date": "2025-01-25",
"query_count": 95,
"avg_response_time": 3.1,
"avg_confidence": 0.82
}
]
}
All enhanced RAG responses follow this structure:
interface EnhancedRAGResponse {
success: boolean;
query_id?: string;
conversation_id?: string;
answer: string;
sources: SourceCitation[];
attributions: Attribution[];
metadata: ResponseMetadata;
reasoning?: ReasoningInfo;
error?: string;
}
interface SourceCitation {
citation_id: string;
document_id: string;
document_name: string;
chunk_id: string;
page_number?: number;
paragraph_number?: number;
excerpt: string;
relevance_score: number;
confidence_score?: number;
rerank_score?: number;
highlights: TextHighlight[];
metadata?: Record<string, any>;
}
interface Attribution {
answerStart: number;
answerEnd: number;
sourceId: string;
sourceStart: number;
sourceEnd: number;
confidence: number;
attribution_type?: 'direct_quote' | 'paraphrase' | 'inference';
}
interface TextHighlight {
start: number;
end: number;
type?: 'query_match' | 'relevant_context';
confidence?: number;
}
interface ResponseMetadata {
chunksAnalyzed: number;
documentsSearched: number;
processingTime: number;
retrievalTime: number;
generationTime: number;
reranking_time?: number;
model: string;
temperature: number;
maxTokens: number;
query_classification?: QueryClassification;
hybrid_weights?: HybridWeights;
}
interface ReasoningInfo {
query_type: string;
confidence: number;
reasoning_chain: string[];
thought_process?: string;
}
The attribution system maps specific segments of the generated answer to their source documents:
- Answer Segments: Character-level offsets in the generated answer
- Source Mapping: Corresponding text segments in source documents
- Confidence Scores: Reliability measure for each attribution
- Highlight Coordinates: Precise text highlighting information
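To make the offset mapping concrete, the sketch below resolves one attribution into its answer segment and source segment. It assumes offsets are zero-based with an exclusive end (like Python slices) and that source offsets index into the cited excerpt. Neither convention is confirmed here, so both should be checked against real responses. The `resolve` helper and the sample values are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Attribution:
    # Field names mirror the JSON keys so Attribution(**item) works directly.
    answerStart: int
    answerEnd: int
    sourceId: str
    sourceStart: int
    sourceEnd: int
    confidence: float
    attribution_type: Optional[str] = None


def resolve(answer, excerpts, attribution):
    """Return the (answer segment, source segment) pair for one attribution.

    Assumes zero-based, end-exclusive offsets and that `sourceId` keys
    into `excerpts`; both are assumptions of this sketch.
    """
    answer_segment = answer[attribution.answerStart:attribution.answerEnd]
    source_text = excerpts[attribution.sourceId]
    source_segment = source_text[attribution.sourceStart:attribution.sourceEnd]
    return answer_segment, source_segment


answer = "The purchase price is £850,000."
excerpts = {
    "doc_001": "The purchase price is £850,000, with completion scheduled for March 15, 2025..."
}
phrase = "£850,000"
# Offsets are computed here instead of hard-coded; a real response carries them.
item = {
    "answerStart": answer.find(phrase),
    "answerEnd": answer.find(phrase) + len(phrase),
    "sourceId": "doc_001",
    "sourceStart": excerpts["doc_001"].find(phrase),
    "sourceEnd": excerpts["doc_001"].find(phrase) + len(phrase),
    "confidence": 0.91,
}
answer_segment, source_segment = resolve(answer, excerpts, Attribution(**item))
print(answer_segment, "<-", source_segment)
```

A UI can use the same two slices to highlight the answer text and the matching passage in the source panel.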
{
"success": false,
"error": {
"code": "DOCUMENT_NOT_FOUND",
"message": "The specified document could not be found",
"details": {
"document_id": "doc_invalid",
"requested_at": "2025-01-25T10:30:00Z"
}
}
}
| Code | HTTP Status | Description |
|---|---|---|
| `DOCUMENT_NOT_FOUND` | 404 | Document does not exist |
| `DOCUMENT_PROCESSING` | 202 | Document still processing |
| `INVALID_QUERY` | 400 | Query format or content invalid |
| `CONVERSATION_NOT_FOUND` | 404 | Conversation ID not found |
| `RATE_LIMIT_EXCEEDED` | 429 | Too many requests |
| `INSUFFICIENT_SOURCES` | 200 | Query successful but limited sources |
| `PROCESSING_ERROR` | 500 | Internal processing error |
| `AUTHENTICATION_FAILED` | 401 | Invalid or missing authentication |
| `AUTHORIZATION_FAILED` | 403 | Insufficient permissions |
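One way a client might turn the error envelope and codes above into exceptions is sketched below. The class `RAGAPIError`, the helper `raise_for_error`, and the choice of which codes count as retryable are assumptions made for illustration. The sketch also accepts a plain-string `error`, since the response interface declares `error` as an optional string.

```python
# Codes this sketch treats as worth retrying; the selection is an assumption.
RETRYABLE = {"RATE_LIMIT_EXCEEDED", "PROCESSING_ERROR", "DOCUMENT_PROCESSING"}


class RAGAPIError(Exception):
    """Hypothetical exception carrying the API error code and details."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.details = details or {}
        self.retryable = code in RETRYABLE


def raise_for_error(payload):
    """Raise RAGAPIError for an error envelope; otherwise return the payload."""
    error = payload.get("error")
    if payload.get("success") is not False or error is None:
        return payload
    if isinstance(error, dict):
        raise RAGAPIError(
            error.get("code", "UNKNOWN"),
            error.get("message", ""),
            error.get("details"),
        )
    # Fall back for responses where `error` is a plain string.
    raise RAGAPIError("UNKNOWN", str(error))


error_payload = {
    "success": False,
    "error": {
        "code": "DOCUMENT_NOT_FOUND",
        "message": "The specified document could not be found",
        "details": {"document_id": "doc_invalid"},
    },
}
try:
    raise_for_error(error_payload)
except RAGAPIError as exc:
    caught = exc
    print(exc.code, exc.retryable, exc.details)
```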
The API implements rate limiting to ensure fair usage:
- Document Upload: 10 requests per minute
- Query Processing: 60 requests per minute
- Conversation Messages: 100 requests per minute
- Statistics/Analytics: 30 requests per minute
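A client can avoid hitting these limits by tracking its own request timestamps. The sketch below is a minimal sliding-window guard using the per-minute figures listed above. Whether the server counts requests in sliding or fixed windows is not stated, so this is only an approximation, and the category names and the `SlidingWindowLimiter` class are hypothetical.

```python
import time
from collections import deque

# Per-minute limits copied from the list above; the keys are made-up labels.
LIMITS_PER_MINUTE = {"upload": 10, "query": 60, "message": 100, "stats": 30}


class SlidingWindowLimiter:
    """Client-side guard that tracks request timestamps per category."""

    def __init__(self, limits=LIMITS_PER_MINUTE, window=60.0, clock=time.monotonic):
        self.limits = dict(limits)
        self.window = window
        self.clock = clock
        self.sent = {name: deque() for name in self.limits}

    def wait_time(self, category):
        """Seconds to wait before the next request in `category` (0.0 if free)."""
        now = self.clock()
        stamps = self.sent[category]
        # Drop timestamps that have left the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) < self.limits[category]:
            return 0.0
        return self.window - (now - stamps[0])

    def record(self, category):
        """Note that a request in `category` was just sent."""
        self.sent[category].append(self.clock())


# A fake clock keeps the demo deterministic and instant.
fake_now = [0.0]
limiter = SlidingWindowLimiter(clock=lambda: fake_now[0])
for _ in range(10):
    limiter.record("upload")
print(limiter.wait_time("upload"))
fake_now[0] = 45.0
print(limiter.wait_time("upload"))
```

In real use the default `time.monotonic` clock applies, and the caller sleeps for `wait_time(...)` seconds before sending and then calls `record(...)`.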
Rate limit headers are included in responses:
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1706180400
Register webhook URLs to receive notifications about document processing:
POST /webhooks/register
{
"url": "https://yourapp.com/webhooks/documents",
"events": ["document.processed", "document.failed"],
"secret": "your-webhook-secret"
}
Webhook payload:
{
"event": "document.processed",
"timestamp": "2025-01-25T10:32:00Z",
"data": {
"document_id": "doc_20250125_abc123",
"filename": "document.pdf",
"processing_status": "completed",
"chunk_count": 45,
"processing_time": 120.5
}
}
from rag_client import RAGClient

# Initialize client
client = RAGClient(
    base_url="http://localhost:8000/api/v1",
    api_key="your-api-key"
)

# Upload document
document = client.documents.upload(
    file_path="contract.pdf",
    metadata={"category": "legal"}
)

# Start conversation
conversation = client.conversations.create(
    initial_question="What is the purchase price?",
    filters={"document_ids": [document.document_id]}
)
print(f"Answer: {conversation.answer}")
for source in conversation.sources:
    print(f"Source: {source.document_name} (confidence: {source.confidence_score})")

import { RAGClient } from '@company/rag-client';
const client = new RAGClient({
baseUrl: 'http://localhost:8000/api/v1',
apiKey: 'your-api-key'
});
// Upload document
const document = await client.documents.upload({
file: fileObject,
metadata: { category: 'legal' }
});
// Process query with enhanced features
const response = await client.query.enhanced({
query: 'What are the key terms?',
options: {
useHybridSearch: true,
enableReranking: true,
maxChunks: 20
}
});
console.log('Answer:', response.answer);
console.log('Sources:', response.sources.length);
console.log('Processing time:', response.metadata.processingTime);
- Update API Calls: Add new fields for enhanced features
- Handle New Response Format: Process attribution and metadata
- Update UI Components: Integrate source panels and confidence visualization
- Configure Enhanced Features: Enable hybrid search, reranking, compression
- Response format now includes an `attributions` array
- Source citations have an additional `confidence_score` field
- Metadata structure expanded with performance metrics
- Query classification added to responses
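Client code written against the earlier format may need to tolerate responses with or without these additions. The sketch below normalizes a response so the newer fields are always present. It assumes older responses simply omit `attributions`, `confidence_score`, and `query_classification`, which is not confirmed here, and `normalize_response` is a hypothetical helper.

```python
def normalize_response(payload):
    """Return a copy of a response with the newer fields filled in.

    Assumption of this sketch: older responses omit `attributions`,
    `confidence_score`, and `query_classification` entirely.
    """
    normalized = dict(payload)
    normalized.setdefault("attributions", [])
    # Give every source a `confidence_score` key, None when it is unknown.
    normalized["sources"] = [
        {**source, "confidence_score": source.get("confidence_score")}
        for source in payload.get("sources", [])
    ]
    metadata = dict(payload.get("metadata", {}))
    metadata.setdefault("query_classification", None)
    normalized["metadata"] = metadata
    return normalized


legacy = {
    "success": True,
    "answer": "The property is located at 172 McLeod Road, London, SE2 0BT.",
    "sources": [{"citation_id": "cite_001", "relevance_score": 0.89}],
    "metadata": {"processingTime": 3.82},
}
result = normalize_response(legacy)
print(result["attributions"], result["sources"][0]["confidence_score"])
```

Because the function copies instead of mutating, the original payload stays available for logging or debugging.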
The API maintains backward compatibility for:
- Basic query endpoints without enhanced features
- Document upload and management
- Simple conversation history retrieval
- Document Embeddings: Cached for 7 days
- Query Results: Cached for 1 hour (configurable)
- Conversation Context: Cached during session
- Batch Operations: Upload multiple documents together
- Filter Usage: Use specific filters to reduce search scope
- Confidence Thresholds: Set appropriate thresholds for quality
- Chunking Strategy: Optimize chunk size for your content type
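As an illustration of the confidence-threshold tip, the sketch below filters a response's sources on the client side. Falling back to `relevance_score` when `confidence_score` is absent is a choice made for this example, not documented behavior. The entry `cite_003` is invented sample data.

```python
def filter_sources(sources, threshold=0.5):
    """Keep sources whose score meets `threshold`, best first.

    Uses `confidence_score` when present and falls back to
    `relevance_score`; that fallback is a choice made for this sketch.
    """
    def score(source):
        value = source.get("confidence_score")
        return source["relevance_score"] if value is None else value

    kept = [s for s in sources if score(s) >= threshold]
    return sorted(kept, key=score, reverse=True)


sources = [
    {"citation_id": "cite_001", "relevance_score": 0.89},
    {"citation_id": "cite_002", "relevance_score": 0.94, "confidence_score": 0.91},
    {"citation_id": "cite_003", "relevance_score": 0.42},  # invented low-scoring entry
]
kept_ids = [s["citation_id"] for s in filter_sources(sources, threshold=0.5)]
print(kept_ids)
```

The same cutoff can also be sent to the server through `confidence_threshold` in the conversation options, which avoids transferring sources that would be discarded anyway.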
GET /health
{
"status": "healthy",
"version": "2.0.0",
"timestamp": "2025-01-25T10:30:00Z",
"services": {
"database": "healthy",
"vector_store": "healthy",
"embedding_service": "healthy",
"llm_service": "healthy"
},
"metrics": {
"uptime_seconds": 86400,
"memory_usage_mb": 2048,
"active_connections": 25
}
}
GET /metrics
Returns Prometheus-compatible metrics for monitoring.
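A monitoring script might start from the `/health` payload shown earlier. The sketch below reports which services are not healthy. The helper names and the `"degraded"` status value are assumptions, since only `"healthy"` appears in the example response.

```python
def unhealthy_services(health):
    """Return the names of services whose status is not "healthy", sorted."""
    services = health.get("services", {})
    return sorted(name for name, status in services.items() if status != "healthy")


def is_ready(health):
    """True when the overall status and every listed service are healthy."""
    return health.get("status") == "healthy" and not unhealthy_services(health)


sample = {
    "status": "healthy",
    "version": "2.0.0",
    "services": {
        "database": "healthy",
        "vector_store": "degraded",  # hypothetical status value
        "embedding_service": "healthy",
        "llm_service": "healthy",
    },
}
print(is_ready(sample), unhealthy_services(sample))
```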
- Keys can be scoped to specific operations
- Support for key rotation
- Rate limiting per key
- Usage analytics per key
- Document encryption at rest
- PII detection and masking
- Audit logging for all operations
- GDPR compliance features
For API support:
- Documentation: https://docs.company.com/rag-api
- Status Page: https://status.company.com
- Support Email: api-support@company.com
- GitHub Issues: https://github.com/company/rag-api/issues