GPU Accelerated | CUDA 12.6 | Python 3.10 | FastAPI | Docker

Sentences Chunker


The Sentences Chunker intelligently splits text documents into optimally-sized chunks while preserving sentence boundaries and semantic integrity. It is powered by WTPSplit (Where's the Point? Self-Supervised Multilingual Punctuation-Agnostic Sentence Segmentation), which delivers accurate sentence boundary detection across 85+ languages without requiring language-specific models or punctuation.

Traditional text chunkers often break sentences arbitrarily at token limits, destroying context and meaning. This degrades downstream tasks such as embedding generation, retrieval-augmented generation (RAG), and language model processing. The Sentences Chunker avoids these failures by combining neural sentence boundary detection with token-aware chunking: chunks respect natural language boundaries while adhering to strict token limits, with configurable sentence overlap to maintain context across chunk boundaries.

Whether you're building RAG pipelines, preparing training data, or optimizing text for LLM consumption, the Sentences Chunker provides a robust, production-ready solution for intelligent text segmentation.

Key Features

  • Advanced Sentence Segmentation: Powered by WTPSplit's neural models for state-of-the-art sentence boundary detection across 85+ languages.
  • Flexible Model Selection: Choose from multiple SaT (Segment any Text) models optimized for different speed/accuracy trade-offs (1-layer to 12-layer variants).
  • Precise Token Control: Enforce strict token limits per chunk while preserving sentence integrity.
  • Configurable Sentence Overlap: Maintain contextual continuity between chunks with customizable overlap settings.
  • Strict Mode with Smart Suggestions: Get intelligent parameter recommendations when chunking constraints cannot be met.
  • GPU Acceleration: CUDA-enabled for fast processing with automatic GPU/CPU detection.
  • Persistent Model Caching: Models are saved to disk after first download for instant subsequent loads.
  • Comprehensive Metadata: Detailed statistics on chunking results including token distribution and processing metrics.
  • Universal REST API with FastAPI: Modern, high-performance API interface with automatic documentation, data validation, and seamless integration capabilities for any system or language.
  • Markdown Structure-Aware Chunking: Split Markdown documents by header hierarchy with optional isolation of code blocks and tables, YAML frontmatter extraction, and configurable token limit enforcement.
  • Docker Integration: Easy deployment with GPU/CPU profiles and automatic hardware detection.

How the Text Chunking Algorithm Works

The Pipeline

The Sentences Chunker implements a sophisticated multi-stage pipeline that combines neural sentence segmentation with intelligent chunking:

  1. The application exposes a REST API where users upload text documents with parameters for token limits, overlap settings, and model selection.
  2. Text preprocessing handles single line breaks while preserving paragraph boundaries for optimal sentence detection.
  3. The WTPSplit SaT model performs neural sentence segmentation, handling complex cases like abbreviations, URLs, and multilingual text.
  4. Each sentence is tokenized using the cl100k_base encoding (compatible with modern LLMs) to calculate precise token counts.
  5. An intelligent chunking algorithm groups sentences while respecting the maximum token limit.
  6. Optional sentence overlap is applied between chunks to maintain context continuity.
  7. In strict mode, the system validates all constraints and provides smart parameter suggestions if limits cannot be met.
  8. The API returns structured JSON with chunks, token counts, and comprehensive metadata.
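The core of steps 4-6 can be sketched as a greedy, token-aware grouping loop. This is a simplified illustration, not the service's actual code: `count_tokens` stands in for the real cl100k_base tokenizer, and all names are hypothetical.

```python
def chunk_sentences(sentences, max_chunk_tokens, count_tokens):
    """Group sentences into chunks without exceeding max_chunk_tokens.

    Sentences are appended greedily; a new chunk starts whenever the next
    sentence would overflow the limit. An oversized single sentence still
    becomes its own chunk (the overflow case reported in overflow_details).
    """
    chunks, current, current_tokens = [], [], 0
    for sentence in sentences:
        n = count_tokens(sentence)
        if current and current_tokens + n > max_chunk_tokens:
            chunks.append(" ".join(current))
            current, current_tokens = [], 0
        current.append(sentence)
        current_tokens += n
    if current:
        chunks.append(" ".join(current))
    return chunks

# Whitespace word count stands in for the real cl100k_base tokenizer.
approx = lambda s: len(s.split())
print(chunk_sentences(["One two.", "Three four five.", "Six."], 5, approx))
# ['One two. Three four five.', 'Six.']
```

Note that sentence boundaries are never crossed: a sentence either fits in the current chunk or starts the next one.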

Intelligent Overlap Management

The overlap feature ensures contextual continuity between chunks, critical for RAG applications:

# Extract last N sentences from previous chunk for overlap
overlap_start_index = max(0, len(previous_chunk_sentences) - configured_overlap_sentences)
overlap_sentences = previous_chunk_sentences[overlap_start_index:]
overlap_tokens_count = sum(s["token_count"] for s in overlap_sentences)

How it works:

  1. When a chunk is finalized, the last N sentences are extracted as overlap
  2. These sentences become the starting point for the next chunk
  3. The overlap ensures context flows naturally between chunks
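The hand-off between consecutive chunks can be exercised with a small runnable sketch. The sentence dicts mirror the `token_count` shape used in the snippet above; the function name is illustrative, not the service's API.

```python
def start_next_chunk(previous_chunk_sentences, configured_overlap_sentences):
    """Seed the next chunk with the last N sentences of the previous one."""
    overlap_start_index = max(0, len(previous_chunk_sentences) - configured_overlap_sentences)
    overlap_sentences = previous_chunk_sentences[overlap_start_index:]
    overlap_tokens_count = sum(s["token_count"] for s in overlap_sentences)
    return overlap_sentences, overlap_tokens_count

prev = [
    {"text": "A first point.", "token_count": 4},
    {"text": "A second point.", "token_count": 4},
    {"text": "A closing point.", "token_count": 4},
]
seed, tokens = start_next_chunk(prev, configured_overlap_sentences=2)
print([s["text"] for s in seed], tokens)  # last two sentences, 8 tokens
```

Setting `configured_overlap_sentences=0` yields an empty seed, which is how overlap is disabled.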

Edge case handling: When overlap would exceed token limits, the system:

  • In normal mode: Creates the chunk with a warning in overflow_details
  • In strict mode: Calculates optimal parameters and suggests adjustments with 30% safety margins

Strict Mode and Parameter Optimization

Strict mode ensures absolute compliance with constraints and provides intelligent suggestions:

# Calculate suggested max_chunk_tokens with 30% safety margin
tokens_with_margin = base_required_tokens * (1 + SUGGESTION_SAFETY_MARGIN_PERCENT)
suggested_max_t = _round_up_to_nearest_multiple(tokens_with_margin, 100)

# Provide actionable suggestions
suggestions = {
    "suggested_max_chunk_tokens": suggested_max_t,
    "suggested_overlap_sentences": optimal_overlap,
    "message": detailed_explanation
}

Comparison with Traditional Chunking

| Feature | Traditional Chunking | Sentences Chunker |
|---|---|---|
| Sentence Detection | Basic regex or newline splits | Neural model with 85+ language support |
| Boundary Preservation | Often breaks mid-sentence | Always preserves sentence boundaries |
| Token Counting | Approximate or character-based | Precise tiktoken-based counting |
| Overlap Handling | Fixed character/token overlap | Intelligent sentence-based overlap |
| Error Handling | Basic validation | Smart parameter suggestions |
| Model Persistence | Downloads every run | Caches models to disk |
| Markdown Awareness | None; treats Markdown as plain text | Header-based splitting with code block/table isolation and YAML frontmatter extraction |
| GPU Support | Rarely implemented | Automatic GPU/CPU detection |

Advantages of the Solution

Superior Sentence Segmentation

WTPSplit's neural models provide:

  • Language Agnostic: Works across 85+ languages without configuration
  • Punctuation Agnostic: Handles text without punctuation marks
  • Context Aware: Understands abbreviations, URLs, and special cases
  • Configurable Confidence: Adjustable split threshold for different use cases

Markdown Structure-Aware Chunking

Purpose-built Markdown processing provides:

  • Header-Based Splitting: Split documents by header hierarchy (levels 1-6) to preserve section semantics
  • Element Isolation: Optionally extract code blocks and tables as standalone chunks for specialized processing
  • YAML Frontmatter Extraction: Automatically parse and merge frontmatter metadata into every chunk
  • Token Limit Enforcement: Split oversized text on sentence boundaries, code on line boundaries, and tables on row boundaries
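Header-based splitting of the kind described above can be approximated with a short regex sketch. This is a simplification for illustration only: the real implementation must also ignore headers that appear inside fenced code blocks, which this version does not.

```python
import re

def split_by_headers(markdown: str, header_level: int = 1):
    """Split markdown into (title, body) sections at headers up to header_level.

    A hypothetical sketch: headers of level <= header_level start a new
    section; deeper headers stay inside the enclosing section's body.
    """
    pattern = re.compile(rf"^(#{{1,{header_level}}})\s+(.+)$", re.MULTILINE)
    sections, last, last_title = [], 0, None
    for m in pattern.finditer(markdown):
        if m.start() > last:
            sections.append((last_title, markdown[last:m.start()].strip()))
        last, last_title = m.start(), m.group(2)
    sections.append((last_title, markdown[last:].strip()))
    return [s for s in sections if s[1]]

doc = "# Intro\nHello.\n## Detail\nMore.\n# Chapter 2\nBye."
print([title for title, _ in split_by_headers(doc, header_level=1)])
# ['Intro', 'Chapter 2']
```

With `header_level=2`, the `## Detail` section would become its own chunk instead of staying inside `Intro`.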

Optimal for RAG and LLM Applications

Generated chunks are ideal for modern NLP pipelines:

  • Semantic Integrity: Complete sentences preserve meaning and context
  • Token Precision: Exact token counts ensure compatibility with model limits
  • Context Windows: Overlap maintains continuity for retrieval tasks
  • Metadata Rich: Detailed statistics for pipeline optimization

Performance and Scalability

Production-ready features include:

  • GPU Acceleration: Automatic CUDA detection and optimization
  • Model Caching: Models cached to disk after first download; in-memory cache with 1-hour timeout
  • Thread-Safe Operations: Thread-safe model cache using RLock for concurrent requests
  • Async Processing: FastAPI with uvloop for high concurrency
  • Memory Management: Efficient GPU memory handling with automatic cleanup
  • Token Counting: Precise tiktoken-based counting using cl100k_base encoding
  • Docker Profiles: Separate CPU/GPU deployments with health checks
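The thread-safe, time-limited model cache described above can be sketched with stdlib primitives. Class and method names here are hypothetical, and the real service caches loaded SaT models rather than plain strings.

```python
import threading
import time

class ModelCache:
    """Thread-safe in-memory cache with an expiry timeout.

    An RLock guards the entry map so concurrent requests never load the
    same model twice; entries older than the timeout are reloaded.
    """
    def __init__(self, timeout_seconds: float = 3600.0):
        self._lock = threading.RLock()
        self._entries = {}  # name -> (model, load_time)
        self._timeout = timeout_seconds

    def get(self, name, loader):
        with self._lock:
            entry = self._entries.get(name)
            now = time.monotonic()
            if entry is None or now - entry[1] > self._timeout:
                self._entries[name] = (loader(name), now)
            return self._entries[name][0]

cache = ModelCache(timeout_seconds=3600)
model = cache.get("sat-12l-sm", loader=lambda n: f"loaded:{n}")
print(model)  # loaded:sat-12l-sm
```

Holding the lock across the load serializes first-time loads; a production variant might release it during the (slow) download and only lock the map updates.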

Installation and Deployment

Prerequisites

  • Docker and Docker Compose (for Docker deployment)
  • NVIDIA GPU with CUDA support (recommended for performance)
  • NVIDIA Container Toolkit (for GPU passthrough in Docker)
  • Python 3.10-3.12 (for local installation)

Getting the Code

Before proceeding with any installation method, clone the repository:

git clone https://github.com/smart-models/Sentences-Chunker.git
cd Sentences-Chunker

Local Installation with Uvicorn

  1. Create a virtual environment:

    python -m venv venv
    source venv/bin/activate  # On Linux/Mac

    For Windows users:

    • Using Command Prompt:
    venv\Scripts\activate.bat
    • Using PowerShell:
    # If you encounter execution policy restrictions, run this once per session:
    Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope Process
    
    # Then activate the virtual environment:
    venv\Scripts\Activate.ps1

    Note: PowerShell's default security settings may prevent script execution. The above command temporarily allows scripts for the current session only, which is safer than changing system-wide settings.

  2. Install dependencies:

    pip install -r requirements.txt

    Note: For GPU support, ensure you install the correct PyTorch version:

    pip install --extra-index-url https://download.pytorch.org/whl/cu126 torch==2.6.0+cu126
  3. Run the FastAPI server:

    uvicorn sentences_chunker:app --reload
  4. The API will be available at http://localhost:8000.

    Access the API documentation and interactive testing interface at http://localhost:8000/docs.

Docker Deployment (Recommended)

Option A: Pre-built Image from GitHub Container Registry

The easiest way to deploy is using our pre-built Docker images published to GitHub Container Registry.

  1. Pull the latest image:

    docker pull ghcr.io/smart-models/sentences-chunker:latest
  2. Run with GPU acceleration (recommended, requires NVIDIA GPU + drivers):

    docker run -d \
      --name sentences-chunker \
      --gpus all \
      -p 8000:8000 \
      -v $(pwd)/hf_cache:/app/models \
      -v $(pwd)/logs:/app/logs \
      ghcr.io/smart-models/sentences-chunker:latest

    Windows PowerShell:

    docker run -d `
      --name sentences-chunker `
      --gpus all `
      -p 8000:8000 `
      -v ${PWD}/hf_cache:/app/models `
      -v ${PWD}/logs:/app/logs `
      ghcr.io/smart-models/sentences-chunker:latest
  3. Run on CPU only (fallback for systems without GPU):

    docker run -d \
      --name sentences-chunker \
      -p 8000:8000 \
      -v $(pwd)/hf_cache:/app/models \
      -v $(pwd)/logs:/app/logs \
      ghcr.io/smart-models/sentences-chunker:latest
  4. Use a specific version (recommended for production):

    # Replace v1.0.0 with your desired version
    docker pull ghcr.io/smart-models/sentences-chunker:v1.0.0
    docker run -d --gpus all -p 8000:8000 \
      ghcr.io/smart-models/sentences-chunker:v1.0.0
  5. Verify the service is running:

    curl http://localhost:8000/
  6. Stop and remove the container:

    docker stop sentences-chunker
    docker rm sentences-chunker

Option B: Build Locally from Source

For developers who want to build from source or customize the image.

  1. Clone the repository:

    git clone https://github.com/smart-models/Sentences-Chunker.git
    cd Sentences-Chunker
  2. Create required directories for persistent storage:

    # Linux/macOS
    mkdir -p hf_cache logs
    
    # Windows CMD
    mkdir hf_cache
    mkdir logs
    
    # Windows PowerShell
    New-Item -ItemType Directory -Path hf_cache -Force
    New-Item -ItemType Directory -Path logs -Force
    # Or with PowerShell alias
    mkdir -Force hf_cache, logs
  3. Deploy with Docker Compose:

    GPU-accelerated deployment (requires NVIDIA GPU and drivers):

    cd docker
    docker compose --profile gpu up -d

    CPU-only deployment (works on all machines):

    cd docker
    docker compose --profile cpu up -d

    Stopping the service:

    Important: Always match the --profile flag between your up and down commands:

    # To stop GPU deployment
    docker compose --profile gpu down
    
    # To stop CPU deployment
    docker compose --profile cpu down

    This ensures Docker Compose correctly identifies and manages the specific set of containers you intended to control.

  4. The API will be available at http://localhost:8000.

    Access the API documentation and interactive testing interface at http://localhost:8000/docs.

Using the API

Authentication

The API supports optional Bearer token authentication via the API_TOKEN environment variable.

| API_TOKEN value | Behavior |
|---|---|
| Not set or empty | Authentication disabled; all endpoints are open |
| Set to any value | All POST endpoints require the Authorization: Bearer <token> header |

The GET / health check endpoint is always accessible regardless of authentication settings.

Local setup:

# Linux/Mac
export API_TOKEN=your-secret-token
uvicorn sentences_chunker:app --reload

# Windows PowerShell
$env:API_TOKEN="your-secret-token"
uvicorn sentences_chunker:app --reload

# Windows CMD
set API_TOKEN=your-secret-token
uvicorn sentences_chunker:app --reload

Docker setup:

Copy the example environment file and set your token:

cd docker
cp .env.example .env
# Edit .env and set API_TOKEN=your-secret-token

Docker Compose automatically reads the .env file and passes API_TOKEN to the container.

Authenticated request example:

curl -X POST "http://localhost:8000/file-chunker/" \
  -H "Authorization: Bearer your-secret-token" \
  -F "file=@document.txt"

If the token is missing or incorrect, the API returns 403 Forbidden:

{
  "detail": "Invalid or missing API token"
}

API Endpoints

  • POST /file-chunker/: Chunks a text document into segments based on specified token limits with optional sentence overlap.

    File Size Limit: Maximum 50 MB per request

    Parameters:

    • file: The text file to be chunked (supports .txt and .md formats)
    • model_name: WTPSplit SaT model to use (default: sat-12l-sm). Available models: sat-1l, sat-1l-sm, sat-3l, sat-3l-sm, sat-6l, sat-6l-sm, sat-9l, sat-12l, sat-12l-sm
    • split_threshold: Confidence threshold for sentence boundaries (0.0-1.0, default: 0.5)
    • max_chunk_tokens: Maximum tokens per chunk (integer > 0, default: 500)
    • overlap_sentences: Number of sentences to overlap between chunks (integer ≥ 0, where 0 disables overlap, default: 1)
    • strict_mode: If true, enforces all constraints strictly (boolean, default: false)
    • chunk_metadata_json: JSON string with custom metadata to merge into each chunk (optional)

    Response: Returns a JSON object containing:

    • chunks: Array of text segments with token counts, IDs, overflow details, and any custom metadata fields
    • metadata: Comprehensive processing statistics
  • POST /markdown-chunker/: Chunks Markdown documents by header structure with optional token limit enforcement.

    File Size Limit: Maximum 50 MB per request

    Parameters:

    • file: The markdown file to be chunked (supports the .md format)
    • header_level: Split depth (1-6, default: 1). Specifies up to which header level to split: 1=#, 2=##, 3=###, 4=####, 5=#####, 6=######
    • max_chunk_tokens: Token threshold for overflow detection/enforcement (integer > 0, default: 500)
    • chunk_metadata_json: JSON string with custom metadata to merge into each chunk (optional)
    • isolate_code_blocks: If true, isolates code blocks into separate chunks (default: false)
    • isolate_tables: If true, isolates tables into separate chunks (default: false)
    • enforce_text_limit: If true, splits text sections on sentence boundaries when over limit (default: false)
    • enforce_code_limit: If true, splits code blocks on line boundaries when over limit (default: false)
    • enforce_table_limit: If true, splits tables on row boundaries when over limit (default: false)

    YAML Frontmatter Support: The endpoint automatically extracts YAML frontmatter (delimited by ---) from Markdown files and merges all fields into each chunk's metadata at the top level. Supports simple key-value pairs, lists, and multiline strings. Complex structures (nested objects, arrays) are converted to string representations. Uses PyYAML if available for full YAML support, with a simple fallback parser for basic key-value pairs. Frontmatter is preserved in the first chunk's content when preserve_frontmatter=True (default behavior).

    Response: Returns a JSON object containing:

    • chunks: Array of markdown chunks with token counts, IDs, optional overflow details, split metadata, YAML frontmatter fields (if present), and any custom metadata fields
    • metadata: Includes configured_max_chunk_tokens and n_chunks_exceeding_limit with token statistics
  • POST /split-sentences/: Splits text into individual sentences without chunking.

    File Size Limit: Maximum 50 MB per request

    Parameters:

    • file: The text file to split (supports .txt and .md formats)
    • model_name: WTPSplit SaT model to use (default: sat-12l-sm). Available models: sat-1l, sat-1l-sm, sat-3l, sat-3l-sm, sat-6l, sat-6l-sm, sat-9l, sat-12l, sat-12l-sm
    • split_threshold: Confidence threshold for boundaries (0.0-1.0, default: 0.5)
    • chunk_metadata_json: JSON string with custom metadata to merge into each chunk (optional)

    Response: Returns sentences as individual chunks with token counts, metadata, and any custom metadata fields.

  • GET /: Health check endpoint that returns service status, API version, GPU availability, and default model name.
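The simple fallback frontmatter parser mentioned for the /markdown-chunker/ endpoint might look roughly like this. It is a sketch for flat key-value pairs only; as noted above, PyYAML handles full YAML when available, and the function name here is illustrative.

```python
import re

def extract_frontmatter(markdown: str):
    """Split '---'-delimited frontmatter from the body.

    Fallback path: parse only flat "key: value" lines, skipping list items
    and indented continuations; return (metadata, remaining body).
    """
    m = re.match(r"^---\n(.*?)\n---\n?", markdown, re.DOTALL)
    if not m:
        return {}, markdown
    meta = {}
    for line in m.group(1).splitlines():
        if ":" in line and not line.startswith((" ", "-")):
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    return meta, markdown[m.end():]

doc = "---\ntitle: My Document\nauthor: John Doe\n---\n# Introduction\nBody text."
meta, body = extract_frontmatter(doc)
print(meta)  # {'title': 'My Document', 'author': 'John Doe'}
```

The extracted fields are then merged into each chunk's metadata at the top level, matching the response examples shown later in this README.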

Available SaT Models

The API supports multiple WTPSplit SaT (Segment any Text) models with different speed/accuracy trade-offs:

  • sat-1l, sat-1l-sm: Fastest, suitable for quick processing with lower accuracy
  • sat-3l, sat-3l-sm: Balanced speed and accuracy
  • sat-6l, sat-6l-sm: Good accuracy with reasonable speed
  • sat-9l: Higher accuracy, slower processing
  • sat-12l, sat-12l-sm: Best accuracy, slowest processing (recommended for production)

Models with -sm suffix are smaller variants optimized for memory efficiency. The default model sat-12l-sm provides the best balance of accuracy and resource usage for most applications.

Example API Call using cURL

# Basic file chunking with defaults
curl -X POST "http://localhost:8000/file-chunker/" \
  -F "file=@document.txt" 

# Advanced chunking with all parameters
curl -X POST "http://localhost:8000/file-chunker/?\
max_chunk_tokens=1024&\
overlap_sentences=2&\
model_name=sat-6l&\
split_threshold=0.6&\
strict_mode=true" \
  -F "file=@document.txt" \
  -H "accept: application/json"

# Split into sentences only
curl -X POST "http://localhost:8000/split-sentences/?model_name=sat-3l" \
  -F "file=@document.txt"

# Chunking with custom metadata (merged into each chunk)
curl -X POST "http://localhost:8000/file-chunker/?chunk_metadata_json={\"doc_id\":\"123\",\"source\":\"api\"}" \
  -F "file=@document.txt"

# Markdown chunking with header level 2 and code block isolation
curl -X POST "http://localhost:8000/markdown-chunker/?\
header_level=2&\
isolate_code_blocks=true&\
max_chunk_tokens=1000" \
  -F "file=@document.md"

# Markdown chunking with token limit enforcement
curl -X POST "http://localhost:8000/markdown-chunker/?\
header_level=3&\
enforce_text_limit=true&\
enforce_code_limit=true&\
max_chunk_tokens=500" \
  -F "file=@document.md"

# Health check with model information
curl http://localhost:8000/

Example API Call using Python

import requests
import json

# API configuration
api_url = 'http://localhost:8000/file-chunker/'
file_path = 'document.txt'  # Your input text file

# Chunking parameters
params = {
    'max_chunk_tokens': 512,
    'overlap_sentences': 2,
    'model_name': 'sat-12l-sm',  # Best accuracy
    'split_threshold': 0.5,
    'strict_mode': True  # Enforce constraints
}

try:
    with open(file_path, 'rb') as f:
        files = {'file': (file_path, f, 'text/plain')}
        response = requests.post(api_url, files=files, params=params)
        response.raise_for_status()

        result = response.json()
        
        # Handle successful response
        print(f"Successfully chunked into {result['metadata']['n_chunks']} chunks")
        print(f"Average tokens per chunk: {result['metadata']['avg_tokens_per_chunk']}")
        print(f"Processing time: {result['metadata']['processing_time']:.2f}s")
        
        # Save results
        with open('chunks_output.json', 'w', encoding='utf-8') as out:
            json.dump(result, out, indent=2, ensure_ascii=False)

except requests.exceptions.HTTPError as e:
    if e.response.status_code == 400:
        # Handle strict mode violations
        error_detail = e.response.json()['detail']
        if isinstance(error_detail, dict) and error_detail.get('chunk_process') == 'failed':
            print("Chunking failed due to constraints. Suggestions:")
            if error_detail.get('suggested_token_limit'):
                print(f"  - Try max_chunk_tokens={error_detail['suggested_token_limit']}")
            if error_detail.get('suggested_overlap_value') is not None:
                print(f"  - Try overlap_sentences={error_detail['suggested_overlap_value']}")
    else:
        print(f"API error: {e}")
except Exception as e:
    print(f"Error: {e}")

Response Format

File Chunker Response

A successful chunking operation returns a FileChunkingResult object:

{
  "chunks": [
    {
      "text": "This is the first chunk containing complete sentences...",
      "token_count": 487,
      "id": 1,
      "overlap_sentences_count": 0,
      "overflow_details": null,
      "doc_id": "123",
      "source": "api"
    },
    {
      "text": "The second chunk starts with overlap sentences from the previous chunk...",
      "token_count": 502,
      "id": 2,
      "overlap_sentences_count": 2,
      "overflow_details": null,
      "doc_id": "123",
      "source": "api"
    }
  ],
  "metadata": {
    "configured_max_chunk_tokens": 512,
    "configured_overlap_sentences": 2,
    "n_input_sentences": 150,
    "avg_tokens_per_input_sentence": 24,
    "max_tokens_in_input_sentence": 89,
    "min_tokens_in_input_sentence": 5,
    "n_chunks": 8,
    "avg_tokens_per_chunk": 495,
    "max_tokens_in_chunk": 512,
    "min_tokens_in_chunk": 234,
    "sat_model_name": "sat-12l-sm",
    "split_threshold": 0.5,
    "source": "document.txt",
    "processing_time": 2.34
  }
}

Note: The doc_id and source fields in the example above are custom metadata fields that were passed via the chunk_metadata_json parameter. The overlap_sentences_count field shows how many sentences from the previous chunk are repeated in the current chunk. These fields are merged directly into each chunk at the top level.

Markdown Chunker Response with YAML Frontmatter

When processing Markdown files with YAML frontmatter, all frontmatter fields are automatically extracted and added to each chunk:

{
  "chunks": [
    {
      "text": "---\ntitle: My Document\nauthor: John Doe\ntags:\n  - test\n  - markdown\n---\n\n# Introduction\n\nThis is the first section...",
      "token_count": 245,
      "id": 1,
      "title": "My Document",
      "author": "John Doe",
      "tags": "['test', 'markdown']",
      "Header 1": "Introduction"
    },
    {
      "text": "# Chapter 2\n\nThis is the second section...",
      "token_count": 312,
      "id": 2,
      "title": "My Document",
      "author": "John Doe",
      "tags": "['test', 'markdown']",
      "Header 1": "Chapter 2"
    }
  ],
  "metadata": {
    "configured_header_level": 1,
    "configured_chunk_size": null,
    "configured_chunk_overlap": 0,
    "configured_max_chunk_tokens": 500,
    "n_chunks_exceeding_limit": 0,
    "encoding_name": "cl100k_base",
    "n_chunks": 2,
    "avg_tokens_per_chunk": 278,
    "max_tokens_in_chunk": 312,
    "min_tokens_in_chunk": 245,
    "source": "document.md",
    "processing_time": 1.23
  }
}

Note: The title, author, and tags fields are extracted from the YAML frontmatter and merged into each chunk. The frontmatter content is preserved in the first chunk's text by default.

For strict mode violations, the API returns a 400 status with suggestions:

{
  "detail": {
    "chunk_process": "failed",
    "single_sentence_too_large": false,
    "overlap_too_large": true,
    "suggested_token_limit": 800,
    "suggested_overlap_value": 1
  }
}

HTTP Status Codes

The API uses standard HTTP status codes to indicate success or failure:

| Status Code | Description | Common Causes |
|---|---|---|
| 200 OK | Request successful | Valid request with proper parameters |
| 400 Bad Request | Invalid input or constraints cannot be met | Strict mode violations, invalid model name, malformed JSON metadata, invalid file type |
| 403 Forbidden | Authentication failed | Missing or invalid Authorization: Bearer <token> header when API_TOKEN is set |
| 413 Payload Too Large | File exceeds size limit | File larger than 50 MB |
| 422 Unprocessable Entity | Validation error | Invalid parameter types or values |
| 500 Internal Server Error | Server-side error | WTPSplit model loading failure, unexpected exceptions |

Error Response Format: All error responses include a detail field with specific information about the error:

{
  "detail": "Error description or structured error object"
}

Contributing

The Sentences Chunker is an open-source project that thrives on community contributions. Your involvement is not just welcome, it's fundamental to the project's growth, innovation, and long-term success.

Whether you're fixing bugs, improving documentation, adding new features, or sharing ideas, every contribution helps build a better tool for everyone. We believe in the power of collaborative development and welcome participants of all skill levels.

If you're interested in contributing:

  1. Fork the repository
  2. Create a development environment with all dependencies
  3. Make your changes
  4. Add tests if applicable
  5. Ensure all tests pass
  6. Submit a pull request

Happy Sentence Chunking!

