API Documentation - Enhanced RAG System

This document describes the API of the enhanced RAG (Retrieval-Augmented Generation) system, including its source attribution and confidence scoring features.

Overview

The enhanced RAG system provides several API endpoints for:

  • Document management and processing
  • Conversation handling with source attribution
  • Enhanced response generation with confidence scoring
  • Query processing with metadata and statistics

Base URL

http://localhost:8000/api/v1

Authentication

The system currently authenticates requests with basic authentication or an API key. When using an API key, send these headers with every JSON request (multipart uploads set their own Content-Type):

Authorization: Bearer <your-api-key>
Content-Type: application/json
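
As a minimal sketch, the headers above could be attached with Python's standard library as follows. The `build_request` helper is hypothetical and not part of any SDK described here; it only constructs the request and does not send it.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/api/v1"  # base URL given in this document


def build_request(path, api_key, payload=None, method="GET"):
    """Build (but do not send) an authenticated JSON request."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    request.add_header("Authorization", f"Bearer {api_key}")
    request.add_header("Content-Type", "application/json")
    return request


req = build_request("/documents", "your-api-key")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call against a running server.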

Endpoints

Document Management

Upload Document

POST /documents/upload

Upload and process a document for RAG retrieval.

Request Body (multipart/form-data):

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | File | | Document file (PDF, DOCX, TXT, MD, HTML) |
| metadata | JSON | | Additional document metadata |
| processing_options | JSON | | Custom processing configuration |

Example Request:

curl -X POST \
  http://localhost:8000/api/v1/documents/upload \
  -H "Authorization: Bearer your-api-key" \
  -F "file=@document.pdf" \
  -F "metadata={\"category\": \"legal\", \"priority\": \"high\"}"

Response:

{
  "success": true,
  "document_id": "doc_20250125_abc123",
  "filename": "document.pdf",
  "file_size": 1024000,
  "content_type": "application/pdf",
  "upload_date": "2025-01-25T10:30:00Z",
  "processing_status": "processing",
  "estimated_completion": "2025-01-25T10:32:00Z"
}

Get Document Status

GET /documents/{document_id}/status

Get the processing status of a document.

Response:

{
  "document_id": "doc_20250125_abc123",
  "filename": "document.pdf",
  "processing_status": "completed",
  "chunk_count": 45,
  "processing_time": 120.5,
  "metadata": {
    "pages": 10,
    "word_count": 2500,
    "language": "en"
  }
}
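
Because uploads are processed asynchronously, a client typically polls this endpoint until the status changes. The sketch below assumes "processing" is the only non-terminal status; `wait_until_processed` and its `fetch_status` callable are hypothetical names, and the demo uses canned responses rather than a live server.

```python
import time


def wait_until_processed(fetch_status, document_id, poll_seconds=2.0,
                         max_polls=30, sleep=time.sleep):
    """Poll until the document leaves the "processing" state.

    fetch_status is a caller-supplied callable that performs
    GET /documents/{document_id}/status and returns the decoded JSON body.
    """
    for _ in range(max_polls):
        status = fetch_status(document_id)
        if status["processing_status"] != "processing":
            return status
        sleep(poll_seconds)
    raise TimeoutError(f"{document_id} still processing after {max_polls} polls")


# Demo with canned responses standing in for the server.
_canned = iter([
    {"document_id": "doc_20250125_abc123", "processing_status": "processing"},
    {"document_id": "doc_20250125_abc123", "processing_status": "completed",
     "chunk_count": 45},
])
result = wait_until_processed(lambda _id: next(_canned), "doc_20250125_abc123",
                              sleep=lambda seconds: None)
print(result["processing_status"])
```

Registering a webhook avoids polling altogether when the client can receive callbacks.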

List Documents

GET /documents

List all uploaded documents with filtering options.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| status | string | | Filter by processing status |
| content_type | string | | Filter by content type |
| limit | integer | | Number of results (default: 50) |
| offset | integer | | Pagination offset (default: 0) |
| sort_by | string | | Sort field (upload_date, filename, file_size) |
| sort_order | string | | Sort order (asc, desc) |

Response:

{
  "documents": [
    {
      "document_id": "doc_20250125_abc123",
      "filename": "document.pdf",
      "file_size": 1024000,
      "content_type": "application/pdf",
      "upload_date": "2025-01-25T10:30:00Z",
      "processing_status": "completed",
      "chunk_count": 45
    }
  ],
  "total_count": 150,
  "has_more": true
}
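
The `limit`, `offset`, and `has_more` fields are enough to walk the full listing. The following is a sketch only: `list_documents_url`, `iter_documents`, and the `fetch_page` callable are hypothetical helpers, and the demo pages through an in-memory list instead of calling the API.

```python
from urllib.parse import urlencode


def list_documents_url(base_url, limit=50, offset=0, **filters):
    """Build the GET /documents URL for one page of results."""
    params = {"limit": limit, "offset": offset, **filters}
    return f"{base_url}/documents?{urlencode(params)}"


def iter_documents(fetch_page, limit=50):
    """Yield every document, following has_more until the listing ends.

    fetch_page(limit, offset) is a caller-supplied callable returning the
    decoded JSON body of GET /documents.
    """
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        yield from page["documents"]
        if not page.get("has_more") or not page["documents"]:
            break
        offset += len(page["documents"])


# Demo against an in-memory list standing in for the server.
_all_docs = [{"document_id": f"doc_{i:03d}"} for i in range(5)]


def _fake_fetch(limit, offset):
    chunk = _all_docs[offset:offset + limit]
    return {"documents": chunk, "total_count": len(_all_docs),
            "has_more": offset + limit < len(_all_docs)}


ids = [doc["document_id"] for doc in iter_documents(_fake_fetch, limit=2)]
url = list_documents_url("http://localhost:8000/api/v1", limit=2, status="completed")
print(ids)
print(url)
```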

Delete Document

DELETE /documents/{document_id}

Delete a document and all its associated data.

Response:

{
  "success": true,
  "message": "Document deleted successfully",
  "document_id": "doc_20250125_abc123"
}

Conversation Management

Create Conversation

POST /conversations

Start a new conversation with an initial query.

Request Body:

{
  "initial_question": "What is the address of the property being purchased?",
  "filters": {
    "content_type": "application/pdf",
    "document_ids": ["doc_001", "doc_002"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2025-01-25"
    }
  },
  "search_mode": "multi_document",
  "options": {
    "include_sources": true,
    "max_sources": 10,
    "confidence_threshold": 0.5
  }
}

Response:

{
  "success": true,
  "conversation_id": "conv_20250125_xyz789",
  "answer": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT.",
  "sources": [
    {
      "citation_id": "cite_001",
      "document_id": "doc_001",
      "document_name": "Purchase_Agreement.pdf",
      "chunk_id": "chunk_doc001_005",
      "page_number": 1,
      "paragraph_number": 3,
      "excerpt": "The property being purchased is located at 172 McLeod Road, London, SE2 0BT. The buyers are Colombe De Rotalier and James Algar.",
      "relevance_score": 0.89,
      "highlights": [
        {
          "start": 43,
          "end": 75
        }
      ]
    }
  ],
  "attributions": [
    {
      "answerStart": 43,
      "answerEnd": 75,
      "sourceId": "doc_001",
      "sourceStart": 43,
      "sourceEnd": 75,
      "confidence": 0.89
    }
  ],
  "metadata": {
    "chunksAnalyzed": 15,
    "documentsSearched": 5,
    "processingTime": 3.82,
    "retrievalTime": 1.2,
    "generationTime": 2.62,
    "model": "gpt-4",
    "temperature": 0.3,
    "maxTokens": 1000
  }
}

Send Message

POST /conversations/{conversation_id}/messages

Send a message to an existing conversation.

Request Body:

{
  "message": "Who are the buyers?",
  "filters": {
    "document_ids": ["doc_001"]
  },
  "include_sources": true,
  "search_mode": "multi_document"
}

Response: (Same format as conversation creation)

Get Conversation History

GET /conversations/{conversation_id}/history

Retrieve the full conversation history.

Response:

{
  "conversation_id": "conv_20250125_xyz789",
  "start_time": "2025-01-25T10:30:00Z",
  "total_turns": 3,
  "messages": [
    {
      "role": "user",
      "content": "What is the address of the property?",
      "timestamp": "2025-01-25T10:30:00Z"
    },
    {
      "role": "assistant",
      "content": "The property is located at 172 McLeod Road, London, SE2 0BT.",
      "sources": [...],
      "attributions": [...],
      "metadata": {...},
      "timestamp": "2025-01-25T10:30:05Z"
    }
  ],
  "filters": {
    "content_type": "application/pdf"
  }
}

Enhanced Query Processing

Process Query (Enhanced)

POST /query/enhanced

Process a query using the enhanced RAG system with advanced features.

Request Body:

{
  "query": "What are the key terms of the purchase agreement?",
  "options": {
    "use_hybrid_search": true,
    "enable_reranking": true,
    "enable_compression": true,
    "max_chunks": 20,
    "similarity_threshold": 0.7,
    "include_reasoning": true
  },
  "filters": {
    "document_types": ["legal", "contracts"],
    "date_range": {
      "start": "2024-01-01",
      "end": "2025-01-25"
    }
  }
}

Response:

{
  "success": true,
  "query_id": "query_abc123def",
  "answer": "The key terms of the purchase agreement include...",
  "reasoning": {
    "query_type": "factual",
    "confidence": 0.92,
    "reasoning_chain": [
      "Identified query as factual information request",
      "Retrieved relevant contract documents",
      "Extracted key terms from purchase agreement sections"
    ]
  },
  "sources": [
    {
      "citation_id": "cite_002",
      "document_id": "doc_001",
      "document_name": "Purchase_Agreement.pdf",
      "chunk_id": "chunk_doc001_010",
      "page_number": 3,
      "paragraph_number": 2,
      "excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
      "relevance_score": 0.94,
      "confidence_score": 0.91,
      "highlights": [
        {
          "start": 4,
          "end": 30
        }
      ],
      "rerank_score": 0.87
    }
  ],
  "attributions": [
    {
      "answerStart": 45,
      "answerEnd": 71,
      "sourceId": "doc_001",
      "sourceStart": 4,
      "sourceEnd": 30,
      "confidence": 0.91,
      "attribution_type": "direct_quote"
    }
  ],
  "metadata": {
    "chunksAnalyzed": 25,
    "documentsSearched": 8,
    "processingTime": 4.15,
    "retrievalTime": 1.8,
    "reranking_time": 0.95,
    "generationTime": 1.4,
    "model": "gpt-4",
    "temperature": 0.3,
    "maxTokens": 1500,
    "hybrid_weights": {
      "semantic": 0.7,
      "lexical": 0.3
    },
    "query_classification": {
      "type": "factual",
      "confidence": 0.92,
      "subcategories": ["contract_terms", "legal_documents"]
    }
  }
}
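
On the client side, the per-source scores in this response can be used to decide which citations to display. The sketch below is one possible approach, not documented behavior: `rank_sources` is a hypothetical helper, and falling back to `relevance_score` when `confidence_score` is absent is an assumption. The second and third sample sources are invented for illustration.

```python
def rank_sources(sources, min_confidence=0.5):
    """Keep sources at or above min_confidence, best first.

    Uses confidence_score when present and falls back to relevance_score
    (an assumption), since confidence_score is optional in the response.
    """
    def score(source):
        return source.get("confidence_score", source["relevance_score"])

    kept = [s for s in sources if score(s) >= min_confidence]
    return sorted(kept, key=score, reverse=True)


sources = [
    {"citation_id": "cite_003", "relevance_score": 0.62},                            # hypothetical
    {"citation_id": "cite_002", "relevance_score": 0.94, "confidence_score": 0.91},
    {"citation_id": "cite_004", "relevance_score": 0.80, "confidence_score": 0.40},  # hypothetical
]
ranked_ids = [s["citation_id"] for s in rank_sources(sources)]
print(ranked_ids)
```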

Re-run Query

POST /query/{query_id}/rerun

Re-run a previous query with modified parameters.

Request Body:

{
  "options": {
    "clear_cache": true,
    "expand_search": true,
    "temperature": 0.5,
    "max_chunks": 30
  }
}

Citations and Sources

Get Citation Details

GET /citations/{citation_id}

Get detailed information about a specific citation.

Response:

{
  "citation_id": "cite_002",
  "document_id": "doc_001",
  "document_name": "Purchase_Agreement.pdf",
  "chunk_id": "chunk_doc001_010",
  "page_number": 3,
  "paragraph_number": 2,
  "excerpt": "The purchase price is £850,000, with completion scheduled for March 15, 2025...",
  "full_context": "This agreement outlines the terms... The purchase price is £850,000, with completion scheduled for March 15, 2025. The buyer agrees to...",
  "relevance_score": 0.94,
  "confidence_score": 0.91,
  "metadata": {
    "section": "financial_terms",
    "extracted_at": "2025-01-25T09:15:00Z"
  }
}

Get Document Preview

GET /documents/{document_id}/preview

Get a preview of the document with highlighted relevant sections.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| query | string | | Query to highlight relevant sections |
| chunk_ids | string | | Comma-separated chunk IDs to highlight |
| page | integer | | Specific page to preview |

Response:

{
  "document_id": "doc_001",
  "document_name": "Purchase_Agreement.pdf",
  "total_pages": 10,
  "preview_data": {
    "page_number": 3,
    "text_content": "This agreement outlines the terms...",
    "highlights": [
      {
        "start": 45,
        "end": 71,
        "type": "query_match",
        "confidence": 0.91
      }
    ],
    "metadata": {
      "word_count": 450,
      "char_count": 2800
    }
  }
}

Statistics and Analytics

Get System Statistics

GET /statistics

Get overall system statistics.

Response:

{
  "documents": {
    "total_count": 150,
    "by_status": {
      "completed": 140,
      "processing": 8,
      "failed": 2
    },
    "by_type": {
      "application/pdf": 120,
      "application/vnd.openxmlformats-officedocument.wordprocessingml.document": 25,
      "text/plain": 5
    },
    "total_size_mb": 2450.5,
    "total_chunks": 15670
  },
  "conversations": {
    "total_count": 1250,
    "active_count": 45,
    "avg_turns_per_conversation": 4.2
  },
  "performance": {
    "avg_query_time": 3.2,
    "avg_retrieval_time": 1.1,
    "avg_generation_time": 2.1,
    "cache_hit_rate": 0.65
  },
  "last_updated": "2025-01-25T10:30:00Z"
}

Get Query Analytics

GET /analytics/queries

Get analytics data for queries and conversations.

Query Parameters:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| start_date | string | | Start date (ISO 8601) |
| end_date | string | | End date (ISO 8601) |
| group_by | string | | Grouping (hour, day, week, month) |

Response:

{
  "time_period": {
    "start": "2025-01-20T00:00:00Z",
    "end": "2025-01-25T23:59:59Z"
  },
  "query_stats": {
    "total_queries": 450,
    "unique_conversations": 180,
    "avg_confidence_score": 0.78,
    "query_types": {
      "factual": 280,
      "analytical": 120,
      "comparative": 50
    }
  },
  "performance_metrics": {
    "avg_response_time": 3.2,
    "p95_response_time": 8.1,
    "p99_response_time": 15.3,
    "error_rate": 0.02
  },
  "daily_breakdown": [
    {
      "date": "2025-01-25",
      "query_count": 95,
      "avg_response_time": 3.1,
      "avg_confidence": 0.82
    }
  ]
}

Enhanced Response Format

Standard Response Structure

All enhanced RAG responses follow this structure:

interface EnhancedRAGResponse {
  success: boolean;
  query_id?: string;
  conversation_id?: string;
  answer: string;
  sources: SourceCitation[];
  attributions: Attribution[];
  metadata: ResponseMetadata;
  reasoning?: ReasoningInfo;
  error?: string;
}

interface SourceCitation {
  citation_id: string;
  document_id: string;
  document_name: string;
  chunk_id: string;
  page_number?: number;
  paragraph_number?: number;
  excerpt: string;
  relevance_score: number;
  confidence_score?: number;
  rerank_score?: number;
  highlights: TextHighlight[];
  metadata?: Record<string, any>;
}

interface Attribution {
  answerStart: number;
  answerEnd: number;
  sourceId: string;
  sourceStart: number;
  sourceEnd: number;
  confidence: number;
  attribution_type?: 'direct_quote' | 'paraphrase' | 'inference';
}

interface TextHighlight {
  start: number;
  end: number;
  type?: 'query_match' | 'relevant_context';
  confidence?: number;
}

interface ResponseMetadata {
  chunksAnalyzed: number;
  documentsSearched: number;
  processingTime: number;
  retrievalTime: number;
  generationTime: number;
  reranking_time?: number;
  model: string;
  temperature: number;
  maxTokens: number;
  query_classification?: QueryClassification;
  hybrid_weights?: HybridWeights;
}

interface ReasoningInfo {
  query_type: string;
  confidence: number;
  reasoning_chain: string[];
  thought_process?: string;
}

Attribution Mapping

The attribution system maps specific segments of the generated answer to their source documents:

  1. Answer Segments: Character-level offsets in the generated answer
  2. Source Mapping: Corresponding text segments in source documents
  3. Confidence Scores: Reliability measure for each attribution
  4. Highlight Coordinates: Precise text highlighting information
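
The mapping above can be sketched as a lookup from each attribution to the text it covers. This example assumes 0-based, end-exclusive character offsets, and that `sourceStart`/`sourceEnd` index into the `excerpt` of the source whose `document_id` equals `sourceId`. Neither is stated explicitly in this document, and `resolve_attributions` is a hypothetical helper.

```python
def resolve_attributions(answer, sources, attributions):
    """Pair each attributed answer span with the source text it maps to.

    Assumes end-exclusive offsets into the answer and into the matching
    source excerpt. If several sources share a document_id, the last one
    wins in this simplified lookup.
    """
    excerpts = {s["document_id"]: s["excerpt"] for s in sources}
    resolved = []
    for attr in attributions:
        excerpt = excerpts.get(attr["sourceId"], "")
        resolved.append({
            "answer_text": answer[attr["answerStart"]:attr["answerEnd"]],
            "source_text": excerpt[attr["sourceStart"]:attr["sourceEnd"]],
            "confidence": attr["confidence"],
        })
    return resolved


answer = "The purchase price is £850,000."  # illustrative answer text
sources = [{
    "document_id": "doc_001",
    "excerpt": "The purchase price is £850,000, with completion scheduled "
               "for March 15, 2025...",
}]
attributions = [{"answerStart": 4, "answerEnd": 30, "sourceId": "doc_001",
                 "sourceStart": 4, "sourceEnd": 30, "confidence": 0.91}]
resolved = resolve_attributions(answer, sources, attributions)
for item in resolved:
    print(item)
```

A UI can use the resolved pairs to highlight the answer span and the matching source passage side by side.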

Error Handling

Error Response Format

{
  "success": false,
  "error": {
    "code": "DOCUMENT_NOT_FOUND",
    "message": "The specified document could not be found",
    "details": {
      "document_id": "doc_invalid",
      "requested_at": "2025-01-25T10:30:00Z"
    }
  }
}

Common Error Codes

| Code | HTTP Status | Description |
|------|-------------|-------------|
| DOCUMENT_NOT_FOUND | 404 | Document does not exist |
| DOCUMENT_PROCESSING | 202 | Document still processing |
| INVALID_QUERY | 400 | Query format or content invalid |
| CONVERSATION_NOT_FOUND | 404 | Conversation ID not found |
| RATE_LIMIT_EXCEEDED | 429 | Too many requests |
| INSUFFICIENT_SOURCES | 200 | Query successful but limited sources |
| PROCESSING_ERROR | 500 | Internal processing error |
| AUTHENTICATION_FAILED | 401 | Invalid or missing authentication |
| AUTHORIZATION_FAILED | 403 | Insufficient permissions |
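
A client can turn the error body into an exception so callers do not have to inspect `success` by hand. This is a sketch under assumptions: `RAGAPIError` and `raise_for_error` are hypothetical names, and the string branch exists only because the `EnhancedRAGResponse` interface declares `error` as a string while the error format above shows an object.

```python
class RAGAPIError(Exception):
    """Raised when a response body reports success == false."""

    def __init__(self, code, message, details=None):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message
        self.details = details or {}


def raise_for_error(body):
    """Return body unchanged on success; raise RAGAPIError otherwise."""
    if body.get("success", True):  # bodies without a success field pass through
        return body
    error = body.get("error") or {}
    if isinstance(error, str):  # tolerate the plain-string form as well
        error = {"code": "UNKNOWN", "message": error}
    raise RAGAPIError(error.get("code", "UNKNOWN"),
                      error.get("message", ""), error.get("details"))


sample_error = {
    "success": False,
    "error": {
        "code": "DOCUMENT_NOT_FOUND",
        "message": "The specified document could not be found",
        "details": {"document_id": "doc_invalid"},
    },
}
try:
    raise_for_error(sample_error)
except RAGAPIError as exc:
    print(exc.code, exc.details)
```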

Rate Limiting

The API implements rate limiting to ensure fair usage:

  • Document Upload: 10 requests per minute
  • Query Processing: 60 requests per minute
  • Conversation Messages: 100 requests per minute
  • Statistics/Analytics: 30 requests per minute

Rate limit headers are included in responses:

X-RateLimit-Limit: 60
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1706180400
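
These headers let a client decide how long to back off. The sketch below assumes `X-RateLimit-Reset` is a Unix timestamp in seconds, which matches the magnitude of the example value but is not stated here. `seconds_until_reset` is a hypothetical helper that expects the headers as a plain dict with exactly the casing shown.

```python
import time


def seconds_until_reset(headers, now=None):
    """Return how long to wait before retrying, or 0.0 if quota remains.

    Assumes X-RateLimit-Reset is a Unix timestamp in seconds.
    """
    now = time.time() if now is None else now
    if int(headers.get("X-RateLimit-Remaining", "1")) > 0:
        return 0.0
    return max(0.0, float(headers["X-RateLimit-Reset"]) - now)


exhausted = {"X-RateLimit-Limit": "60", "X-RateLimit-Remaining": "0",
             "X-RateLimit-Reset": "1706180400"}
wait = seconds_until_reset(exhausted, now=1706180370)
print(wait)
```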

Webhooks

Document Processing Webhooks

Register webhook URLs to receive notifications about document processing:

POST /webhooks/register

{
  "url": "https://yourapp.com/webhooks/documents",
  "events": ["document.processed", "document.failed"],
  "secret": "your-webhook-secret"
}

Webhook Payload Example

{
  "event": "document.processed",
  "timestamp": "2025-01-25T10:32:00Z",
  "data": {
    "document_id": "doc_20250125_abc123",
    "filename": "document.pdf",
    "processing_status": "completed",
    "chunk_count": 45,
    "processing_time": 120.5
  }
}
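
The registration request carries a `secret`, but this document does not describe how deliveries are signed. A common scheme, assumed here purely for illustration, is a hex-encoded HMAC-SHA256 of the raw request body sent in a header. The helper names below are hypothetical, and the actual signing scheme and header name would need to be confirmed against the server implementation.

```python
import hashlib
import hmac
import json


def sign_payload(secret, raw_body):
    """Hex HMAC-SHA256 of the raw request body (assumed scheme)."""
    return hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()


def verify_webhook(secret, raw_body, received_signature):
    """Compare expected and received signatures in constant time."""
    return hmac.compare_digest(sign_payload(secret, raw_body), received_signature)


raw = json.dumps({"event": "document.processed",
                  "data": {"document_id": "doc_20250125_abc123"}}).encode("utf-8")
signature = sign_payload("your-webhook-secret", raw)  # what a sender would attach
valid = verify_webhook("your-webhook-secret", raw, signature)
forged = verify_webhook("wrong-secret", raw, signature)
print(valid, forged)
```

Verification should run on the raw bytes as received, before any JSON parsing, because re-serializing the payload can change its byte representation.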

SDK Integration Examples

Python SDK

from rag_client import RAGClient

# Initialize client
client = RAGClient(
    base_url="http://localhost:8000/api/v1",
    api_key="your-api-key"
)

# Upload document
document = client.documents.upload(
    file_path="contract.pdf",
    metadata={"category": "legal"}
)

# Start conversation
conversation = client.conversations.create(
    initial_question="What is the purchase price?",
    filters={"document_ids": [document.document_id]}
)

print(f"Answer: {conversation.answer}")
for source in conversation.sources:
    # relevance_score is always present; confidence_score is optional in SourceCitation
    print(f"Source: {source.document_name} (relevance: {source.relevance_score})")

JavaScript SDK

import { RAGClient } from '@company/rag-client';

const client = new RAGClient({
  baseUrl: 'http://localhost:8000/api/v1',
  apiKey: 'your-api-key'
});

// Upload document
const document = await client.documents.upload({
  file: fileObject,
  metadata: { category: 'legal' }
});

// Process query with enhanced features
const response = await client.query.enhanced({
  query: 'What are the key terms?',
  options: {
    useHybridSearch: true,
    enableReranking: true,
    maxChunks: 20
  }
});

console.log('Answer:', response.answer);
console.log('Sources:', response.sources.length);
console.log('Processing time:', response.metadata.processingTime);

Migration Guide

Upgrading from Basic RAG

  1. Update API Calls: Add new fields for enhanced features
  2. Handle New Response Format: Process attribution and metadata
  3. Update UI Components: Integrate source panels and confidence visualization
  4. Configure Enhanced Features: Enable hybrid search, reranking, compression

Breaking Changes in v2.0

  • Response format now includes attributions array
  • Source citations have additional confidence_score field
  • Metadata structure expanded with performance metrics
  • Query classification added to responses

Backward Compatibility

The API maintains backward compatibility for:

  • Basic query endpoints without enhanced features
  • Document upload and management
  • Simple conversation history retrieval

Performance Optimization

Caching Strategy

  • Document Embeddings: Cached for 7 days
  • Query Results: Cached for 1 hour (configurable)
  • Conversation Context: Cached during session

Best Practices

  1. Batch Operations: Upload multiple documents together
  2. Filter Usage: Use specific filters to reduce search scope
  3. Confidence Thresholds: Set appropriate thresholds for quality
  4. Chunking Strategy: Optimize chunk size for your content type

Monitoring and Observability

Health Checks

GET /health

{
  "status": "healthy",
  "version": "2.0.0",
  "timestamp": "2025-01-25T10:30:00Z",
  "services": {
    "database": "healthy",
    "vector_store": "healthy",
    "embedding_service": "healthy",
    "llm_service": "healthy"
  },
  "metrics": {
    "uptime_seconds": 86400,
    "memory_usage_mb": 2048,
    "active_connections": 25
  }
}
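
A monitoring script might reduce this response to the list of services that need attention. In the sketch below, `unhealthy_services` is a hypothetical helper and "degraded" is an invented status value; only "healthy" appears in this document.

```python
def unhealthy_services(health):
    """Return the sorted names of services whose status is not "healthy"."""
    return sorted(name for name, state in health.get("services", {}).items()
                  if state != "healthy")


health = {
    "status": "healthy",
    "services": {
        "database": "healthy",
        "vector_store": "degraded",  # hypothetical non-healthy value
        "embedding_service": "healthy",
        "llm_service": "healthy",
    },
}
failing = unhealthy_services(health)
print(failing)
```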

Metrics Endpoint

GET /metrics

Returns Prometheus-compatible metrics for monitoring.


Security

API Key Management

  • Keys can be scoped to specific operations
  • Support for key rotation
  • Rate limiting per key
  • Usage analytics per key

Data Privacy

  • Document encryption at rest
  • PII detection and masking
  • Audit logging for all operations
  • GDPR compliance features

Support and Contact

For API support: