API Documentation - Enhanced RAG System

This document describes the API for the enhanced RAG (Retrieval-Augmented Generation) system, which adds source attribution and confidence scoring to generated answers.

Overview

The enhanced RAG system provides several API endpoints for:

Document management and processing
Conversation handling with source attribution
Enhanced response generation with confidence scoring
Query processing with metadata and statistics

Base URL

http://localhost:8000/api/v1

Authentication

The system currently uses basic authentication or API keys. Include authentication headers with every request; the example below passes an API key as a bearer token:

Authorization: Bearer <your-api-key>
Content-Type: application/json
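
As a minimal sketch, the headers above can be attached from Python using only the standard library. The helper name `build_request` is illustrative and not part of the API; the request is constructed here but not sent.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"


def build_request(path, api_key, body=None, method="GET"):
    """Return a urllib Request carrying the authentication headers.

    build_request is a hypothetical helper, not something the API provides.
    """
    # JSON-encode the body only when one is supplied.
    data = json.dumps(body).encode("utf-8") if body is not None else None
    return urllib.request.Request(
        url=f"{BASE_URL}{path}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_request("/documents", api_key="your-api-key")
```

Passing the result to `urllib.request.urlopen` would send it. The upload endpoint expects multipart/form-data rather than JSON, so this helper would not suit it as written.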

Endpoints

Document Management

Upload Document

POST /documents/upload

Upload and process a document for RAG retrieval.

Request Body (multipart/form-data):

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `file` | File | ✓ | Document file (PDF, DOCX, TXT, MD, HTML) |
| `metadata` | JSON | ✗ | Additional document metadata |
| `processing_options` | JSON | ✗ | Custom processing configuration |

Example Request:

curl -X POST \
  http://localhost:8000/api/v1/documents/upload \
  -H "Authorization: Bearer your-api-key" \
  -F "file=@document.pdf" \
  -F "metadata={\"category\": \"legal\", \"priority\": \"high\"}"

Response:

{
  "success": true,
  "document_id": "doc_20250125_abc123",
  "filename": "document.pdf",
  "file_size": 1024000,
  "content_type": "application/pdf",
  "upload_date": "2025-01-25T10:30:00Z",
  "processing_status": "processing",
  "estimated_completion": "2025-01-25T10:32:00Z"
}

Get Document Status

GET /documents/{document_id}/status

Get the processing status of a document.

Response:

{
  "document_id": "doc_20250125_abc123",
  "filename": "document.pdf",
  "processing_status": "completed",
  "chunk_count": 45,
  "processing_time": 120.5,
  "metadata": {
    "pages": 10,
    "word_count": 2500,
    "language": "en"
  }
}
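
Because uploads are processed asynchronously, a client usually polls this endpoint until processing finishes. The following is a sketch under stated assumptions: `fetch_status` is a hypothetical callable that returns the parsed JSON body of the status endpoint, and any `processing_status` other than `"processing"` is treated as final.

```python
import time


def wait_until_processed(fetch_status, document_id, interval=2.0, timeout=300.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until processing_status leaves "processing" or the timeout passes.

    fetch_status is a caller-supplied function (an assumption of this sketch)
    that takes a document ID and returns the parsed status response.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status(document_id)
        # Any value other than "processing" is treated as a final state.
        if status["processing_status"] != "processing":
            return status
        if clock() >= deadline:
            raise TimeoutError(f"{document_id} still processing after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting. Registering a webhook avoids polling altogether.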

List Documents

GET /documents

List uploaded documents, with optional filtering, sorting, and pagination.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `status` | string | ✗ | Filter by processing status |
| `content_type` | string | ✗ | Filter by content type |
| `limit` | integer | ✗ | Number of results (default: 50) |
| `offset` | integer | ✗ | Pagination offset (default: 0) |
| `sort_by` | string | ✗ | Sort field (upload_date, filename, file_size) |
| `sort_order` | string | ✗ | Sort order (asc, desc) |

Response:

{
  "documents": [
    {
      "document_id": "doc_20250125_abc123",
      "filename": "document.pdf",
      "file_size": 1024000,
      "content_type": "application/pdf",
      "upload_date": "2025-01-25T10:30:00Z",
      "processing_status": "completed",
      "chunk_count": 45
    }
  ],
  "total_count": 150,
  "has_more": true
}
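
The `limit`, `offset`, and `has_more` fields together allow a client to walk the full document list. A minimal sketch follows, assuming a hypothetical `fetch_page` callable that performs the GET request and returns the parsed response body.

```python
def iter_documents(fetch_page, limit=50, **filters):
    """Yield every document, following offset pagination until has_more is false.

    fetch_page is a caller-supplied function (an assumption of this sketch)
    that accepts the query parameters as keyword arguments.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset, **filters)
        documents = page["documents"]
        yield from documents
        # Stop on the last page, or on an empty page to avoid looping forever.
        if not page.get("has_more") or not documents:
            break
        offset += len(documents)
```

Advancing the offset by the number of documents actually returned, rather than by `limit`, keeps the walk correct if the server returns a short page.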

Delete Document

DELETE /documents/{document_id}

Delete a document and all its associated data.

Response:

{
  "success": true,
  "message": "Document deleted successfully",
  "document_id": "doc_20250125_abc123"
}

Conversation Management

Create Conversation

POST /conversations

Start a new conversation with an initial question.

Request Body:

{
  "initial_question": "What is the address of the property being purchased?",
  "filters": {
    "content_type": "application/pdf",
    "document_ids": ["doc_001", "doc_002"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2025-01-25"
    }
  },
  "search_mode": "multi_document",
  "options": {
    "include_sources": true,
    "max_sources": 10,
    "confidence_threshold": 0.5
  }
}

Response:

{
  "success": true,
  "conversation_id": "conv_20250125_xyz789",
  "answer": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT.",
  "sources": [
    {
      "citation_id": "cite_001",
      "document_id": "doc_001",
      "document_name": "Purchase_Agreement.pdf",
      "chunk_id": "chunk_doc001_005",
      "page_number": 1,
      "paragraph_number": 3,
      "excerpt": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT. The buyers are Colombe De Rotalier and James Algar.",
      "relevance_score": 0.89,
      "highlights": [
        {
          "start": 43,
          "end": 75
        }
      ]
    }
  ],
  "attributions": [
    {
      "answerStart": 43,
      "answerEnd": 75,
      "sourceId": "doc_001",
      "sourceStart": 43,
      "sourceEnd": 75,
      "confidence": 0.89
    }
  ],
  "metadata": {
    "chunksAnalyzed": 15,
    "documentsSearched": 5,
    "processingTime": 3.82,
    "retrievalTime": 1.2,
    "generationTime": 2.62,
    "model": "gpt-4",
    "temperature": 0.3,
    "maxTokens": 1000
  }
}
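
Each entry in `sources` carries enough information to build a readable citation. The helper below is an illustrative sketch, not part of the API; it treats `page_number` and `paragraph_number` as optional.

```python
def format_citation(source):
    """Build a short human-readable reference from one `sources` entry.

    format_citation is a hypothetical helper written for this example.
    """
    parts = [source["document_name"]]
    # Page and paragraph numbers are optional, so add them only when present.
    if source.get("page_number") is not None:
        parts.append(f"p. {source['page_number']}")
    if source.get("paragraph_number") is not None:
        parts.append(f"para. {source['paragraph_number']}")
    return f"{', '.join(parts)} (relevance {source['relevance_score']:.2f})"
```

For the source in the example response above, this yields `Purchase_Agreement.pdf, p. 1, para. 3 (relevance 0.89)`.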

Send Message

POST /conversations/{conversation_id}/messages

Send a message to an existing conversation.

Request Body:

{
  "message": "Who are the buyers?",
  "filters": {
    "document_ids": ["doc_001"]
  },
  "include_sources": true,
  "search_mode": "multi_document"
}

Response: (Same format as conversation creation)

Get Conversation History

GET /conversations/{conversation_id}/history

Retrieve the full conversation history.

Response:

{
  "conversation_id": "conv_20250125_xyz789",
  "start_time": "2025-01-25T10:30:00Z",
  "total_turns": 3,
  "messages": [
    {
      "role": "user",
      "content": "What is the address of the property?",
      "timestamp": "2025-01-25T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "The property is located at 172 McLeod Road, London, SE2 0BT.",
      "sources": [...],
      "attributions": [...],
      "metadata": {...},
      "timestamp": "2025-01-25T10:30:05Z"
    }
  ],
  "filters": {
    "content_type": "application/pdf"
  }
}

Enhanced Query Processing

Process Query (Enhanced)

POST /query/enhanced

Process a query using the enhanced RAG system with advanced features.

Request Body:

{
  "query": "What are the key terms of the purchase agreement?",
  "options": {
    "use_hybrid_search": true,
    "enable_reranking": true,
    "enable_compression": true,
    "max_chunks": 20,
    "similarity_threshold": 0.7,
    "include_reasoning": true
  },
  "filters": {
    "document_types": ["legal", "contracts"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2025-01-25"
    }
  }
}

Response:

{
  "success": true,
  "query_id": "query_abc123def",
  "answer": "The key terms of the purchase agreement include...",
  "reasoning": {
    "query_type": "factual",
    "confidence": 0.92,
    "reasoning_chain": [
      "Identified query as factual information request",
      "Retrieved relevant contract documents",
      "Extracted key terms from purchase agreement sections"
    ]
  },
  "sources": [
    {
      "citation_id": "cite_002",
      "document_id": "doc_001",
      "document_name": "Purchase_Agreement.pdf",
      "chunk_id": "chunk_doc001_010",
      "page_number": 3,
      "paragraph_number": 2,
      "excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
      "relevance_score": 0.94,
      "confidence_score": 0.91,
      "highlights": [
        {
          "start": 4,
          "end": 30
        }
      ],
      "rerank_score": 0.87
    }
  ],
  "attributions": [
    {
      "answerStart": 45,
      "answerEnd": 71,
      "sourceId": "doc_001",
      "sourceStart": 4,
      "sourceEnd": 30,
      "confidence": 0.91,
      "attribution_type": "direct_quote"
    }
  ],
  "metadata": {
    "chunksAnalyzed": 25,
    "documentsSearched": 8,
    "processingTime": 4.15,
    "retrievalTime": 1.8,
    "reranking_time": 0.95,
    "generationTime": 1.4,
    "model": "gpt-4",
    "temperature": 0.3,
    "maxTokens": 1500,
    "hybrid_weights": {
      "semantic": 0.7,
      "lexical": 0.3
    },
    "query_classification": {
      "type": "factual",
      "confidence": 0.92,
      "subcategories": ["contract_terms", "legal_documents"]
    }
  }
}
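
The timing fields in `metadata` can be turned into a per-stage breakdown. This sketch assumes that `retrievalTime`, `reranking_time`, and `generationTime` are the stages that make up `processingTime`. The example response is consistent with that reading, but the API does not state it.

```python
def timing_breakdown(metadata):
    """Return each stage's share of processingTime plus any unaccounted time.

    Assumes the three stage timings are components of processingTime.
    """
    total = metadata["processingTime"]
    if total <= 0:
        raise ValueError("processingTime must be positive")
    stages = {
        "retrieval": metadata.get("retrievalTime", 0.0),
        # reranking_time is only present when reranking is enabled.
        "reranking": metadata.get("reranking_time", 0.0),
        "generation": metadata.get("generationTime", 0.0),
    }
    shares = {name: seconds / total for name, seconds in stages.items()}
    unaccounted = total - sum(stages.values())
    return shares, unaccounted
```

A large `unaccounted` value would suggest time spent in stages the response does not report separately.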

Re-run Query

POST /query/{query_id}/rerun

Re-run a previous query with modified parameters.

Request Body:

{
  "options": {
    "clear_cache": true,
    "expand_search": true,
    "temperature": 0.5,
    "max_chunks": 30
  }
}

Citations and Sources

Get Citation Details

GET /citations/{citation_id}

Get detailed information about a specific citation.

Response:

{
  "citation_id": "cite_002",
  "document_id": "doc_001",
  "document_name": "Purchase_Agreement.pdf",
  "chunk_id": "chunk_doc001_010",
  "page_number": 3,
  "paragraph_number": 2,
  "excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
  "full_context": "This agreement outlines the terms... The purchase price is £850,000, with completion scheduled for March 15, 2025. The buyer agrees to...",
  "relevance_score": 0.94,
  "confidence_score": 0.91,
  "metadata": {
    "section": "financial_terms",
    "extracted_at": "2025-01-25T09:15:00Z"
  }
}

Get Document Preview

GET /documents/{document_id}/preview

Get a preview of the document with highlighted relevant sections.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `query` | string | ✗ | Query to highlight relevant sections |
| `chunk_ids` | string | ✗ | Comma-separated chunk IDs to highlight |
| `page` | integer | ✗ | Specific page to preview |

Response:

{
  "document_id": "doc_001",
  "document_name": "Purchase_Agreement.pdf",
  "total_pages": 10,
  "preview_data": {
    "page_number": 3,
    "text_content": "This agreement outlines the terms...",
    "highlights": [
      {
        "start": 45,
        "end": 71,
        "type": "query_match",
        "confidence": 0.91
      }
    ],
    "metadata": {
      "word_count": 450,
      "char_count": 2800
    }
  }
}
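
To render a preview, a client needs to apply the `highlights` offsets to `text_content`. The sketch below assumes zero-based character offsets with an exclusive `end` and no overlapping spans; the API does not state this convention explicitly. The marker strings are placeholders for whatever markup the UI uses.

```python
def apply_highlights(text, highlights, open_mark="[[", close_mark="]]"):
    """Wrap each highlighted span of text in markers.

    Assumes zero-based offsets, an exclusive end, and non-overlapping spans.
    """
    pieces = []
    cursor = 0
    for span in sorted(highlights, key=lambda h: h["start"]):
        start, end = span["start"], span["end"]
        if start < cursor or end > len(text) or start > end:
            raise ValueError(f"invalid or overlapping highlight: {span}")
        pieces.append(text[cursor:start])  # untouched text before the span
        pieces.append(open_mark + text[start:end] + close_mark)
        cursor = end
    pieces.append(text[cursor:])  # remainder after the last span
    return "".join(pieces)
```

Sorting the spans first means the caller can pass highlights in any order.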

Statistics and Analytics

Get System Statistics

GET /statistics

Get overall system statistics.

Response:

{
  "documents": {
    "total_count": 150,
    "by_status": {
      "completed": 140,
      "processing": 8,
      "failed": 2
    },
    "by_type": {
      "application/pdf": 120,
      "application/vnd.openxmlformats-officedocument.wordprocessingml.document": 25,
      "text/plain": 5
    },
    "total_size_mb": 2450.5,
    "total_chunks": 15670
  },
  "conversations": {
    "total_count": 1250,
    "active_count": 45,
    "avg_turns_per_conversation": 4.2
  },
  "performance": {
    "avg_query_time": 3.2,
    "avg_retrieval_time": 1.1,
    "avg_generation_time": 2.1,
    "cache_hit_rate": 0.65
  },
  "last_updated": "2025-01-25T10:30:00Z"
}

Get Query Analytics

GET /analytics/queries

Get analytics data for queries and conversations.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `start_date` | string | ✗ | Start date (ISO 8601) |
| `end_date` | string | ✗ | End date (ISO 8601) |
| `group_by` | string | ✗ | Grouping (hour, day, week, month) |

Response:

{
  "time_period": {
    "start": "2025-01-20T00:00:00Z",
    "end": "2025-01-25T23:59:59Z"
  },
  "query_stats": {
    "total_queries": 450,
    "unique_conversations": 180,
    "avg_confidence_score": 0.78,
    "query_types": {
      "factual": 280,
      "analytical": 120,
      "comparative": 50
    }
  },
  "performance_metrics": {
    "avg_response_time": 3.2,
    "p95_response_time": 8.1,
    "p99_response_time": 15.3,
    "error_rate": 0.02
  },
  "daily_breakdown": [
    {
      "date": "2025-01-25",
      "query_count": 95,
      "avg_response_time": 3.1,
      "avg_confidence": 0.82
    }
  ]
}
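
The query parameters for this endpoint can be assembled as in the sketch below. It formats dates as UTC with a `Z` suffix, matching the timestamps in the example response; whether the server accepts other ISO 8601 forms is not specified. The function names are illustrative.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

ALLOWED_GROUPINGS = ("hour", "day", "week", "month")


def to_iso8601(moment):
    """Format a timezone-aware datetime as UTC, e.g. 2025-01-20T00:00:00Z."""
    if moment.tzinfo is None:
        raise ValueError("datetime must be timezone-aware")
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def analytics_url(base_url, start=None, end=None, group_by=None):
    """Build the GET /analytics/queries URL, omitting unset parameters."""
    params = {}
    if start is not None:
        params["start_date"] = to_iso8601(start)
    if end is not None:
        params["end_date"] = to_iso8601(end)
    if group_by is not None:
        if group_by not in ALLOWED_GROUPINGS:
            raise ValueError(f"group_by must be one of {ALLOWED_GROUPINGS}")
        params["group_by"] = group_by
    query = urlencode(params)  # percent-encodes the colons in timestamps
    url = f"{base_url}/analytics/queries"
    return f"{url}?{query}" if query else url
```

Rejecting unknown `group_by` values on the client side surfaces typos before a request is made.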

Enhanced Response Format

Standard Response Structure

All enhanced RAG responses follow this structure:

interface EnhancedRAGResponse {
  success: boolean;
  query_id?: string;
  conversation_id?: string;
  answer: string;
  sources: SourceCitation[];
  attributions: Attribution[];
  metadata: ResponseMetadata;
  reasoning?: ReasoningInfo;
  error?: string;
}

interface SourceCitation {
  citation_id: string;
  document_id: string;
  document_name: string;
  chunk_id: string;
  page_number?: number;
  paragraph_number?: number;
  excerpt: string;
  relevance_score: number;
  confidence_score?: number;
  rerank_score?: number;
  highlights: TextHighlight[];
  metadata?: Record<string, any>;
}

interface Attribution {
  answerStart: number;
  answerEnd: number;
  sourceId: string;
  sourceStart: number;
  sourceEnd: number;
  confidence: number;
  attribution_type?: 'direct_quote' | 'paraphrase' | 'inference';
}

interface TextHighlight {
  start: number;
  end: number;
  type?: 'query_match' | 'relevant_context';
  confidence?: number;
}

interface ResponseMetadata {
  chunksAnalyzed: number;
  documentsSearched: number;
  processingTime: number;
  retrievalTime: number;
  generationTime: number;
  reranking_time?: number;
  model: string;
  temperature: number;
  maxTokens: number;
  query_classification?: QueryClassification;
  hybrid_weights?: HybridWeights;
}

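interface ReasoningInfo {
  query_type: string;
  confidence: number;
  reasoning_chain: string[];
  thought_process?: string;
}

// Shapes inferred from the enhanced query response example above.
interface QueryClassification {
  type: string;
  confidence: number;
  subcategories: string[];
}

interface HybridWeights {
  semantic: number;
  lexical: number;
}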

Attribution Mapping

The attribution system maps specific segments of the generated answer to their source documents:

Answer Segments: Character-level offsets in the generated answer
Source Mapping: Corresponding text segments in source documents
Confidence Scores: Reliability measure for each attribution
Highlight Coordinates: Precise text highlighting information
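
The mapping described above can be resolved on the client as in this sketch. It assumes that `sourceId` holds a `document_id`, that `sourceStart` and `sourceEnd` index into that source's `excerpt`, and that offsets are zero-based with an exclusive end. These are readings of the example responses, not documented guarantees.

```python
def resolve_attributions(answer, sources, attributions):
    """Pair each attributed answer segment with the source text it maps to.

    Assumes sourceId is a document_id and source offsets index into excerpt.
    """
    excerpts = {}
    for source in sources:
        # If a document appears more than once, keep its first excerpt.
        excerpts.setdefault(source["document_id"], source["excerpt"])

    resolved = []
    for attr in attributions:
        source_text = excerpts.get(attr["sourceId"])
        resolved.append({
            "answer_text": answer[attr["answerStart"]:attr["answerEnd"]],
            # None signals that no matching source was returned.
            "source_text": (
                None if source_text is None
                else source_text[attr["sourceStart"]:attr["sourceEnd"]]
            ),
            "confidence": attr["confidence"],
        })
    return resolved
```

When several chunks of the same document are cited, a lookup keyed on `document_id` alone is ambiguous, so a real client would need a finer-grained key if the API provides one.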

Error Handling

Error Response Format

{
  "success": false,
  "error": {
    "code": "DOCUMENT_NOT_FOUND",
    "message": "The specified document could not be found",
    "details": {
      "document_id": "doc_invalid",
      "requested_at": "2025-01-25T10:30:00Z"
    }
  }
}

Common Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| `DOCUMENT_NOT_FOUND` | 404 | Document does not exist |
| `DOCUMENT_PROCESSING` | 202 | Document still processing |
| `INVALID_QUERY` | 400 | Query format or content invalid |
| `CONVERSATION_NOT_FOUND` | 404 | Conversation ID not found |
| `RATE_LIMIT_EXCEEDED` | 429 | Too many requests |
| `INSUFFICIENT_SOURCES` | 200 | Query successful but limited sources |
| `PROCESSING_ERROR` | 500 | Internal processing error |
| `AUTHENTICATION_FAILED` | 401 | Invalid or missing authentication |
| `AUTHORIZATION_FAILED` | 403 | Insufficient permissions |
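
A client can convert the error format above into an exception so that callers do not have to inspect every response body. This is a minimal sketch; the `RAGAPIError` class and `raise_for_error` function are hypothetical names, not part of any published SDK.

```python
class RAGAPIError(Exception):
    """Carries the code, message, and details from an error response."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def raise_for_error(body):
    """Raise RAGAPIError if body is an error response; otherwise return body."""
    # Some responses omit "success" entirely, so only an explicit false counts.
    if body.get("success") is False and isinstance(body.get("error"), dict):
        error = body["error"]
        raise RAGAPIError(
            error.get("code", "UNKNOWN"),
            error.get("message", ""),
            error.get("details"),
        )
    return body
```

Callers can then branch on `exc.code`, for example treating `DOCUMENT_NOT_FOUND` differently from `RATE_LIMIT_EXCEEDED`.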

Rate Limiting

The API implements rate limiting to ensure fair usage:

Document Upload: 10 requests per minute
Query Processing: 60 requests per minute
Conversation Messages: 100 requests per minute
Statistics/Analytics: 30 requests per minute

Rate limit headers are included in responses:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1706180400
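
A client can use these headers to decide how long to pause before the next request. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which the sample value suggests but the API does not state, and that `headers` is a plain dictionary keyed by the exact header names shown.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request (0 if quota remains).

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0.0
    # With no reset header, fall back to "now" so the wait is zero.
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, float(reset_at - now))
```

Most HTTP libraries expose headers through a case-insensitive mapping, in which case the lookups work regardless of capitalization.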

Webhooks

Document Processing Webhooks

POST /webhooks/register

Register a URL to receive notifications for document processing events.

Request Body:

{
  "url": "https://yourapp.com/webhooks/documents",
  "events": ["document.processed", "document.failed"],
  "secret": "your-webhook-secret"
}

Webhook Payload Example

{
  "event": "document.processed",
  "timestamp": "2025-01-25T10:32:00Z",
  "data": {
    "document_id": "doc_20250125_abc123",
    "filename": "document.pdf",
    "processing_status": "completed",
    "chunk_count": 45,
    "processing_time": 120.5
  }
}
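
On the receiving side, a handler can route each payload by its `event` field. This is a minimal sketch of that dispatch step; the function and handler names are illustrative. It does not verify the webhook `secret`, because the signing scheme is not described here.

```python
import json


def handle_webhook(raw_body, handlers):
    """Parse a webhook payload and call the handler registered for its event.

    Returns the handler's result, or None when the event has no handler.
    """
    payload = json.loads(raw_body)
    handler = handlers.get(payload["event"])
    if handler is None:
        return None  # ignore events this receiver did not subscribe to
    return handler(payload["data"], payload["timestamp"])


def on_processed(data, timestamp):
    """Example handler: summarize a finished document."""
    return f"{data['document_id']} ready with {data['chunk_count']} chunks"


HANDLERS = {"document.processed": on_processed}
```

A production receiver would authenticate the request before calling `handle_webhook` and respond quickly, deferring slow work to a background task.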

SDK Integration Examples

Python SDK

from rag_client import RAGClient

# Initialize client
client = RAGClient(
    base_url="http://localhost:8000/api/v1",
    api_key="your-api-key"
)

# Upload document
document = client.documents.upload(
    file_path="contract.pdf",
    metadata={"category": "legal"}
)

# Start conversation
conversation = client.conversations.create(
    initial_question="What is the purchase price?",
    filters={"document_ids": [document.document_id]}
)

print(f"Answer: {conversation.answer}")
for source in conversation.sources:
    print(f"Source: {source.document_name} (confidence: {source.confidence_score})")

JavaScript SDK

import { RAGClient } from '@company/rag-client';

const client = new RAGClient({
  baseUrl: 'http://localhost:8000/api/v1',
  apiKey: 'your-api-key'
});

// Upload document
const document = await client.documents.upload({
  file: fileObject,
  metadata: { category: 'legal' }
});

// Process query with enhanced features
const response = await client.query.enhanced({
  query: 'What are the key terms?',
  options: {
    useHybridSearch: true,
    enableReranking: true,
    maxChunks: 20
  }
});

console.log('Answer:', response.answer);
console.log('Sources:', response.sources.length);
console.log('Processing time:', response.metadata.processingTime);

Migration Guide

Upgrading from Basic RAG

Update API Calls: Add new fields for enhanced features
Handle New Response Format: Process attribution and metadata
Update UI Components: Integrate source panels and confidence visualization
Configure Enhanced Features: Enable hybrid search, reranking, compression

Breaking Changes in v2.0

Response format now includes an attributions array
Source citations have an additional confidence_score field
Metadata structure is expanded with performance metrics
Query classification is added to responses

Backward Compatibility

The API maintains backward compatibility for:

Basic query endpoints without enhanced features
Document upload and management
Simple conversation history retrieval

Performance Optimization

Caching Strategy

Document Embeddings: Cached for 7 days
Query Results: Cached for 1 hour (configurable)
Conversation Context: Cached during session

Best Practices

Batch Operations: Upload multiple documents together
Filter Usage: Use specific filters to reduce search scope
Confidence Thresholds: Set appropriate thresholds for quality
Chunking Strategy: Optimize chunk size for your content type

Monitoring and Observability

Health Checks

GET /health

Report the overall service status and the health of each dependency.

Response:

{
  "status": "healthy",
  "version": "2.0.0",
  "timestamp": "2025-01-25T10:30:00Z",
  "services": {
    "database": "healthy",
    "vector_store": "healthy",
    "embedding_service": "healthy",
    "llm_service": "healthy"
  },
  "metrics": {
    "uptime_seconds": 86400,
    "memory_usage_mb": 2048,
    "active_connections": 25
  }
}
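
A monitoring script can reduce the health response to the list of failing dependencies. The sketch below assumes that `"healthy"` is the only passing value; the names of other states are not documented here.

```python
def unhealthy_services(health):
    """Return the sorted names of services whose status is not "healthy".

    Assumes any value other than "healthy" indicates a problem.
    """
    services = health.get("services", {})
    return sorted(name for name, status in services.items() if status != "healthy")


def is_healthy(health):
    """True when the overall status and every listed service are healthy."""
    return health.get("status") == "healthy" and not unhealthy_services(health)
```

Checking the individual services as well as the top-level `status` guards against a response where the two disagree.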

Metrics Endpoint

GET /metrics

Returns Prometheus-compatible metrics for monitoring.

Security

API Key Management

Keys can be scoped to specific operations
Support for key rotation
Rate limiting per key
Usage analytics per key

Data Privacy

Document encryption at rest
PII detection and masking
Audit logging for all operations
GDPR compliance features

Support and Contact

For API support:

Documentation: https://docs.company.com/rag-api
Status Page: https://status.company.com
Support Email: api-support@company.com
GitHub Issues: https://github.com/company/rag-api/issues
