High-performance, embedded Key-Value engine built with Rust 🦀
Implementing LSM-Tree architecture with a focus on SOLID principles, observability, and performance.
ApexStore is a modern, Rust-based storage engine designed for write-heavy workloads. It combines the durability of write-ahead logging (WAL) with the efficiency of Log-Structured Merge-Tree (LSM-Tree) architecture.
Built from the ground up using SOLID principles, it provides a production-grade storage solution that is easy to reason about, test, and maintain, while delivering the performance expected from a systems-level language.
Used in production by TeamCode, an autonomous AI coding agent platform that relies on ApexStore for reliable, low-latency key-value storage.
Where established engines like RocksDB and LevelDB are large and complex, ApexStore offers:
- Educational Clarity: A clean, modular implementation of LSM-Tree that serves as a blueprint for high-performance systems.
- Strict SOLID Compliance: Leveraging Rust's ownership model to enforce clear boundaries between MemTable, WAL, and SSTable layers.
- Observability First: Built-in real-time metrics for memory, disk usage, and WAL health.
- Modern Defaults: Native LZ4 compression, Bloom Filters, encryption-at-rest (AES-GCM), and 45+ tunable parameters via environment variables.
- Security Hardened: TLS/HTTPS support, CORS enforcement, rate limiting, per-IP connection limits, audit logging, and CSRF protection.
Run locally:
`cargo bench --all-features` (HTML reports at `target/criterion/`)
Auto-updated by CI on 2026-05-26 19:35 UTC.
No results parsed; check the run artifacts.
Measured on Intel Core i5-9300H @ 2.40GHz, 16 GB DDR4 2667 MHz, HDD SATA 1TB (v2.1.39), using `cargo bench --bench mixed_bench -- --sample-size 10`
| Benchmark | Size | Median | Throughput | Change vs previous |
|---|---|---|---|---|
| YCSB Type A (50% write / 50% read) | 10K | 952.83 µs | 1.05 Mops/s | no change |
| YCSB Type A (50% write / 50% read) | 100K | 2.706 ms | 369.6 Kops/s | +49% throughput |
| YCSB Type B (5% write / 95% read) | 10K | 814.90 µs | 1.23 Mops/s | |
| YCSB Type B (5% write / 95% read) | 100K | 1.409 ms | 710.0 Kops/s | |
| YCSB Type C (100% read) | 10K | 334.70 µs | 2.99 Mops/s | +9.4% throughput |
| YCSB Type C (100% read) | 100K | 745.36 µs | 1.34 Mops/s | +12.4% throughput |
| YCSB Type C (100% read) | 1M | 1.290 ms | 775.0 Kops/s | (new) |

| Benchmark | Size | Median | Throughput | Change vs previous |
|---|---|---|---|---|
| Balanced (mixed workload) | 10K | 1.080 ms | 925.9 Kops/s | |
| Balanced (mixed workload) | 100K | 2.831 ms | 353.2 Kops/s | (no baseline) |
| Read Heavy (read-intensive) | 10K | 811.91 µs | 1.23 Mops/s | +15.7% throughput |
| Read Heavy (read-intensive) | 100K | 1.777 ms | 562.7 Kops/s | (no baseline) |
| Write Heavy (write-intensive) | 10K | 1.187 ms | 82.3 KiB/s | |
| Write Heavy (write-intensive) | 100K | 3.486 ms | 28.0 KiB/s | (no baseline) |
Hardware note: the results above are conservative because they were measured on a SATA HDD rather than NVMe. On NVMe, expect 2-4× better throughput for I/O-bound operations.
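The ops/s figures in the Throughput column are consistent with each measured iteration executing 1,000 operations, with Size being the number of keys in the dataset. That operation count is an inference from the published numbers, not something the benchmark description states, and the helper names below are illustrative. Under that assumption, the figures can be approximately reproduced (small differences come from rounding of the published medians):

```rust
/// Operations per second, given the number of operations in one measured
/// iteration and that iteration's median wall-clock time in seconds.
/// The 1,000-operation count used with this helper is an assumption.
pub fn ops_per_sec(ops_per_iter: u64, median_secs: f64) -> f64 {
    ops_per_iter as f64 / median_secs
}

/// Formats a rate in the same style as the tables (Kops/s or Mops/s).
pub fn format_rate(rate: f64) -> String {
    if rate >= 1.0e6 {
        format!("{:.2} Mops/s", rate / 1.0e6)
    } else {
        format!("{:.1} Kops/s", rate / 1.0e3)
    }
}
```

For example, `format_rate(ops_per_sec(1_000, 952.83e-6))` corresponds to the YCSB Type A 10K row, and `format_rate(ops_per_sec(1_000, 1.080e-3))` to the Balanced 10K row.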
Key Insights:
- `WAL_SYNC_MODE=async` provides 16x throughput vs. fsync (trades durability for speed)
- Cache hit rate > 80% when `block_cache_size_mb` > 256
- Bloom filter rejects 99.2% of non-existent key lookups
- Optimal `memtable_max_size` is 16-32 MB for write-heavy workloads
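To illustrate why a Bloom filter can reject most lookups for missing keys before any disk read, here is a minimal sketch using only the standard library. The `BloomFilter` type, its sizing, and the double-hashing scheme are assumptions for illustration and are not taken from ApexStore's implementation:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Illustrative fixed-size Bloom filter (hypothetical, not ApexStore's type).
pub struct BloomFilter {
    bits: Vec<u64>,
    num_bits: u64,
    num_hashes: u64,
}

impl BloomFilter {
    /// `num_bits` is rounded up to a multiple of 64; both arguments must be non-zero.
    pub fn new(num_bits: usize, num_hashes: u32) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        let words = (num_bits + 63) / 64;
        BloomFilter {
            bits: vec![0; words],
            num_bits: (words * 64) as u64,
            num_hashes: num_hashes as u64,
        }
    }

    /// Derives the probe positions for a key with double hashing: h1 + i * h2.
    fn positions(&self, key: &[u8]) -> Vec<u64> {
        let mut h1 = DefaultHasher::new();
        key.hash(&mut h1);
        let mut h2 = DefaultHasher::new();
        h2.write_u64(0x9E37_79B9_7F4A_7C15); // salt so the second hash differs
        key.hash(&mut h2);
        let a = h1.finish();
        let b = h2.finish() | 1; // odd stride, so probes do not collapse onto one bit
        (0..self.num_hashes)
            .map(|i| a.wrapping_add(i.wrapping_mul(b)) % self.num_bits)
            .collect()
    }

    /// Sets every probe bit for the key.
    pub fn insert(&mut self, key: &[u8]) {
        for p in self.positions(key) {
            self.bits[(p / 64) as usize] |= 1u64 << (p % 64);
        }
    }

    /// `false` means the key was definitely never inserted;
    /// `true` means it may be present (false positives are possible).
    pub fn may_contain(&self, key: &[u8]) -> bool {
        self.positions(key)
            .into_iter()
            .all(|p| (self.bits[(p / 64) as usize] & (1u64 << (p % 64))) != 0)
    }
}
```

A reader would consult `may_contain` before touching an SSTable and skip the file entirely on `false`. The false-positive rate depends on the bit count, hash count, and number of inserted keys.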
- MemTable: Concurrent `DashMap`-backed in-memory store with lock-free reads and writes to different keys.
- Write-Ahead Log (WAL): ACID-compliant durability with configurable sync interval and group commit support.
- SSTable V2: Block-based storage with Sparse Indexing, LZ4 Compression, and AES-GCM encryption.
- Bloom Filters: Drastically reduces unnecessary disk I/O for read operations.
- Crash Recovery: Automatic WAL replay on startup + SSTable auto-repair (truncated files detected and quarantined).
- Encryption at Rest: AES-256-GCM encryption enabled by default with configurable keys.
- Range Deletion: Efficient range tombstone support with compaction-aware filtering.
- TLS/HTTPS: Built-in rustls-based HTTPS with configurable certificates and ports.
- Authentication: Bearer token-based auth enabled by default.
- CORS: Configurable cross-origin resource sharing with restrictive defaults.
- Rate Limiting: Sharded per-endpoint rate limiting with per-IP connection limits.
- Audit Logging: Structured audit events with principal tracking for every API operation.
- CSRF Protection: Content-Type guard middleware for state-changing requests.
- Secrets Management: Constant-time token comparison to prevent timing attacks.
- File Permissions: Data files created with 0600/0700 permissions (owner-only access).
- Interactive CLI: REPL interface with token management (`token create`, `token list`, `token revoke`).
- REST API: Full HTTP API with JSON payloads, batch operations, and paginated scans.
- Admin Dashboard: Real-time web dashboard with live metrics and auto-refresh via fetch().
- WebSocket Sync: Real-time bidirectional sync with authentication.
- GraphQL API: Playground with production guard (disabled when auth enabled).
- Change Data Capture (CDC): HTTP webhook delivery with configurable retry, auth, and timeout.
- Unit Tests: 550+ unit tests covering all engine operations.
- Property-Based Tests: `proptest` for engine invariants (put/get/delete roundtrip, multi-key independence).
- Fuzz Testing: `cargo-fuzz` targets for WAL frame format and SSTable block decoding.
- Chaos Testing: Fault injection tests for I/O failures, corruption handling, and crash recovery.
- Randomized Testing: Competitive test with reference `HashMap` model for linearizability verification.
- Integration Tests: SSTable roundtrip, CLI pagination, restart recovery, stress simulation.
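The randomized test against a reference `HashMap` model can be sketched as follows. `ToyStore` is a hypothetical stand-in for the engine, and the generator and operation mix are assumptions. The check is single-threaded, so it only verifies sequential agreement with the model, not linearizability under concurrency:

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the engine: newer layers shadow older ones,
/// and a `None` value is a tombstone.
#[derive(Default)]
pub struct ToyStore {
    active: BTreeMap<u8, Option<u32>>,
    frozen: Vec<BTreeMap<u8, Option<u32>>>, // oldest first
}

impl ToyStore {
    pub fn put(&mut self, k: u8, v: u32) {
        self.active.insert(k, Some(v));
    }
    pub fn delete(&mut self, k: u8) {
        self.active.insert(k, None);
    }
    /// Simulates a flush by freezing the active table.
    pub fn flush(&mut self) {
        let table = std::mem::take(&mut self.active);
        self.frozen.push(table);
    }
    /// Looks in the active table first, then frozen tables newest to oldest.
    pub fn get(&self, k: u8) -> Option<u32> {
        std::iter::once(&self.active)
            .chain(self.frozen.iter().rev())
            .find_map(|t| t.get(&k).copied())
            .flatten()
    }
}

/// Small deterministic xorshift64 generator so failures are reproducible.
struct XorShift(u64);

impl XorShift {
    fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Applies `steps` random operations to the store and to a `HashMap` model,
/// returning the index of the first step at which a read disagrees.
pub fn check_against_model(seed: u64, steps: usize) -> Result<(), usize> {
    let mut rng = XorShift(seed.max(1)); // xorshift state must be non-zero
    let mut store = ToyStore::default();
    let mut model: HashMap<u8, u32> = HashMap::new();
    for step in 0..steps {
        let r = rng.next_u64();
        let key = (r >> 8) as u8 % 16;
        match r % 4 {
            0 => {
                let v = (r >> 16) as u32;
                store.put(key, v);
                model.insert(key, v);
            }
            1 => {
                store.delete(key);
                model.remove(&key);
            }
            2 => store.flush(),
            _ => {} // read-only step
        }
        if store.get(key) != model.get(&key).copied() {
            return Err(step);
        }
    }
    Ok(())
}
```

Because the generator is seeded, a failing step index can be replayed exactly, which is the main reason to prefer a deterministic generator in this kind of test.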
The engine follows a modular architecture where each component has a single responsibility:
graph TB
subgraph "Interface Layer"
CLI[CLI / REPL]
API[REST API Server]
WS[WebSocket Sync]
end
subgraph "Security Layer"
TLS[TLS/HTTPS]
Auth[Bearer Auth]
RateLimit[Rate Limiter]
Audit[Audit Log]
CORS[CORS Middleware]
end
subgraph "Core Domain"
Engine[LSM Engine]
MemTable[MemTable<br/>DashMap Concurrent]
LogRecord[LogRecord<br/>Data Model]
Compaction[Compaction<br/>Strategy]
end
subgraph "Storage Layer"
WAL[Write-Ahead Log<br/>Durability]
SST[SSTable Manager<br/>V2 Format]
Builder[SSTable Builder<br/>LZ4 + AES-GCM]
Quarantine[Quarantine<br/>Auto-Repair]
end
subgraph "Infrastructure"
Codec[Serialization<br/>Bincode]
Metrics[Prometheus Metrics]
Error[Error Handling]
Config[Configuration<br/>Environment]
Degradation[Degradation<br/>Manager]
end
subgraph "Testing"
Proptest[Property Tests]
Fuzz[Fuzz Testing]
Chaos[Chaos Testing]
end
CLI --> Auth --> Engine
API --> TLS --> Auth --> Engine
WS --> Auth --> Engine
Engine --> WAL
Engine --> MemTable
MemTable -->|Flush| Builder
Builder --> SST
Engine -->|Read| MemTable
Engine -->|Read| SST
SST -->|Corrupt| Quarantine
WAL -.->|Recovery| MemTable
Engine --> Compaction
Engine --> Degradation
Engine --> Config
Engine --> Metrics
SST --> Codec
Builder --> Codec
WAL --> Codec
API --> RateLimit
API --> Audit
API --> CORS
style Engine fill:#f9a,stroke:#333,stroke-width:3px
style WAL fill:#9cf,stroke:#333,stroke-width:2px
style SST fill:#9cf,stroke:#333,stroke-width:2px
style TLS fill:#6c6,stroke:#333,stroke-width:2px
style Quarantine fill:#fc6,stroke:#333,stroke-width:2px
style Proptest fill:#cfc,stroke:#333,stroke-width:1px
style Fuzz fill:#cfc,stroke:#333,stroke-width:1px
style Chaos fill:#cfc,stroke:#333,stroke-width:1px
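The write, flush, read, and recovery edges in the diagram can be sketched with an in-memory model. This is a simplified illustration under stated assumptions: `MiniLsm` and its methods are hypothetical names, the WAL and SSTables are plain in-memory collections rather than files, and the flush threshold counts entries instead of bytes:

```rust
use std::collections::BTreeMap;

/// One WAL entry; `value == None` marks a delete (tombstone).
#[derive(Clone, Debug, PartialEq)]
pub struct LogRecord {
    pub key: String,
    pub value: Option<String>,
}

type Table = BTreeMap<String, Option<String>>;

/// Hypothetical in-memory stand-in for the engine.
#[derive(Default)]
pub struct MiniLsm {
    wal: Vec<LogRecord>,  // records not yet flushed
    memtable: Table,      // mutable, sorted
    sstables: Vec<Table>, // immutable, oldest first
    memtable_max_entries: usize,
}

impl MiniLsm {
    pub fn new(memtable_max_entries: usize) -> Self {
        MiniLsm { memtable_max_entries, ..Default::default() }
    }

    /// Write path: log first, then apply to the MemTable, then flush if full.
    pub fn write(&mut self, key: &str, value: Option<&str>) {
        let rec = LogRecord { key: key.to_string(), value: value.map(str::to_string) };
        self.wal.push(rec.clone());
        self.memtable.insert(rec.key, rec.value);
        if self.memtable.len() >= self.memtable_max_entries {
            self.flush();
        }
    }

    /// Flush: the MemTable becomes an immutable table and the WAL is truncated.
    pub fn flush(&mut self) {
        if self.memtable.is_empty() {
            return;
        }
        self.sstables.push(std::mem::take(&mut self.memtable));
        self.wal.clear();
    }

    /// Read path: MemTable first, then tables from newest to oldest.
    /// A tombstone in a newer layer hides older values.
    pub fn get(&self, key: &str) -> Option<&str> {
        std::iter::once(&self.memtable)
            .chain(self.sstables.iter().rev())
            .find_map(|t| t.get(key))
            .and_then(|v| v.as_deref())
    }

    /// Simulated crash: the MemTable is lost, then rebuilt by replaying the WAL.
    pub fn crash_and_recover(&mut self) {
        self.memtable.clear();
        for rec in &self.wal {
            self.memtable.insert(rec.key.clone(), rec.value.clone());
        }
    }
}
```

Logging before applying the write is what makes recovery possible: anything acknowledged but not yet flushed can be rebuilt from the WAL, while flushed data is already in an immutable table.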
- Rust 1.70+: Install via rustup.rs
# Clone and enter
git clone https://github.com/ElioNeto/ApexStore.git && cd ApexStore
# Build and Start REPL
cargo run --release
# Available commands:
# > put user:1 "John Doe"
# > get user:1
# > stats
# Generate self-signed certificates
openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes
# Start HTTPS server
TLS_ENABLED=true TLS_CERT_PATH=cert.pem TLS_KEY_PATH=key.pem cargo run --release --bin server
Run ApexStore as a standalone API server:
# Start with Docker Compose
docker-compose up -d
# Manual run with custom config
docker run -d \
--name apexstore-server \
-p 8080:8080 \
-e MEMTABLE_MAX_SIZE=33554432 \
-e TLS_ENABLED=true \
-e TLS_CERT_PATH=/certs/cert.pem \
-e TLS_KEY_PATH=/certs/key.pem \
-v apexstore-data:/data \
-v ./certs:/certs:ro \
  elioneto/apexstore:latest

| Method | Endpoint | Description |
|---|---|---|
| `POST` | `/keys` | Insert/Update: `{"key": "k1", "value": "v1"}` |
| `GET` | `/keys/{key}` | Retrieve value |
| `GET` | `/health/check` | Comprehensive health (uptime, engine mode, memtable stats) |
| `GET` | `/stats` | Engine telemetry (memory, disk, WAL, write/read amplification) |
| `DELETE` | `/keys/{key}` | Delete a key |
| `POST` | `/keys/batch` | Batch insert/update |
| `POST` | `/admin/flush` | Force memtable flush |
| `POST` | `/admin/compact` | Force compaction |
ApexStore is configured via environment variables:
| Variable | Default | Description |
|---|---|---|
| `TLS_ENABLED` | `false` | Enable HTTPS |
| `TLS_CERT_PATH` | (none) | Path to TLS certificate (PEM) |
| `TLS_KEY_PATH` | (none) | Path to TLS private key (PEM) |
| `TLS_PORT` | `443` | HTTPS port |
| `AUTH_ENABLED` | `true` | Enable bearer token authentication |
| `CORS_ENABLED` | `false` | Enable CORS middleware |
| `RATE_LIMIT_ENABLED` | `true` | Enable rate limiting |
| `MAX_CONNECTIONS_PER_IP` | `100` | Max concurrent connections per IP |
| `ENCRYPTION_ENABLED` | `true` | Enable data encryption at rest |
| `WAL_SYNC_INTERVAL` | `4` | WAL fsync interval (writes between syncs) |
See docs/CONFIGURATION.md for a complete list.
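As a rough illustration of environment-driven configuration with the defaults from the table, the sketch below loads a few of the variables. The `Settings` struct, the `parse_or` helper, and the accepted value formats (for example, only `true`/`false` for booleans) are assumptions and may differ from ApexStore's actual loader:

```rust
use std::env;
use std::str::FromStr;

/// A subset of the documented settings (hypothetical struct for illustration).
#[derive(Debug, PartialEq)]
pub struct Settings {
    pub tls_enabled: bool,
    pub tls_port: u16,
    pub auth_enabled: bool,
    pub max_connections_per_ip: u32,
    pub wal_sync_interval: u32,
}

/// Parses `name` via `lookup`, falling back to `default` when the variable
/// is unset or its value does not parse.
fn parse_or<T: FromStr>(
    lookup: &dyn Fn(&str) -> Option<String>,
    name: &str,
    default: T,
) -> T {
    lookup(name)
        .and_then(|raw| raw.trim().parse().ok())
        .unwrap_or(default)
}

impl Settings {
    /// Builds settings from any key-to-value lookup, which keeps tests
    /// independent of the real process environment.
    pub fn from_lookup(lookup: &dyn Fn(&str) -> Option<String>) -> Self {
        Settings {
            tls_enabled: parse_or(lookup, "TLS_ENABLED", false),
            tls_port: parse_or(lookup, "TLS_PORT", 443),
            auth_enabled: parse_or(lookup, "AUTH_ENABLED", true),
            max_connections_per_ip: parse_or(lookup, "MAX_CONNECTIONS_PER_IP", 100),
            wal_sync_interval: parse_or(lookup, "WAL_SYNC_INTERVAL", 4),
        }
    }

    /// Reads the process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(&|name: &str| env::var(name).ok())
    }
}
```

Falling back silently on an unparsable value is a choice made here for brevity. A stricter loader might reject invalid values at startup instead.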
ApexStore/
├── src/
│   ├── core/       # LSM Engine, MemTable (DashMap), Compaction, Domain logic
│   ├── storage/    # WAL, SSTable V2, Block Builder, Encryption, Prefix Compression
│   ├── infra/      # Codec, Error Handling, Config, Metrics, Scrubber, CDC
│   ├── api/        # Actix-Web Server, Auth, Rate Limiter, Audit, Health, CORS
│   └── cli/        # REPL + Token management commands
├── docs/           # Detailed documentation & Architecture
├── tests/          # Integration, Chaos, Proptest, Fuzz test suites
├── fuzz/           # cargo-fuzz targets for WAL and SSTable
└── Dockerfile      # Multi-stage build
# All tests (unit + integration + proptest + chaos)
cargo test --all-features
# Property-based tests (engine invariants)
cargo test proptest --all-features
# Chaos tests (fault tolerance)
cargo test chaos_ --all-features
# Fuzz testing (requires nightly)
cargo +nightly fuzz run wal -- -runs=10000
cargo +nightly fuzz run sstable -- -runs=10000
# Linting & formatting
cargo clippy --all-targets --all-features -- -D warnings
cargo fmt --all -- --check
# Security audit
cargo audit
ApexStore uses trunk-based development with automated releases:
graph LR
A[Feature Branch] -->|Open PR| B[CI Validation]
    B -->|Pass| C[Merge to main]
C --> D[Auto Release]
D --> E[v2.1.X]
| Stage | What it checks |
|---|---|
| Rustfmt | Code formatting |
| Clippy | Lint warnings (deny by default) |
| Build and Docs | Compilation + documentation generation |
| Run Tests | Full test suite (550+ tests) |
| Security Audit | `cargo audit` for dependency vulnerabilities |
| Benchmarks | Performance regression gates (Write, Read, Scan, Mixed, Stress) |
| report-status | Summary with root cause analysis on failure |
- Create a feature branch from `main`
- Open a PR: CI runs all stages above
- Merge the PR: the version auto-increments, and a tag and GitHub release are created
Read: MIGRATION_GUIDE.md for team workflow
Details: .github/workflows/README.md
- SSTable V2 with compression & Bloom Filters
- REST API & Feature Flags
- Global Block Cache
- Trunk-based CI/CD with auto-release
- v2.2: Concurrent read optimization (RwLock engine core)
- v2.3: WAL I/O decoupling, DashMap MemTable
- Security audit resolution (40+ issues): TLS, encryption, auth, rate limiting, CSRF, audit
- Testing infrastructure: proptest, fuzz, chaos testing
- v3.0: Leveled/Tiered Compaction Strategies
- v3.1: Distributed replication & consensus
- v3.2: SQL query layer via Apache DataFusion
Contributions are what make the open-source community an amazing place! Please check our Contributing Guidelines.
- Fork the Project
- Create your Feature Branch (`git checkout -b feat/amazing-feature`)
- Commit your Changes (`git commit -m 'feat: add amazing feature'`)
- Push to the Branch (`git push origin feat/amazing-feature`)
- Open a Pull Request to `main`
- CI will auto-release on merge
Distributed under the MIT License. See LICENSE for more information.
TeamCode, an autonomous AI coding agent platform, uses ApexStore as its embedded storage engine for reliable, low-latency key-value storage, managing task state, context, and agent coordination data at scale.
Elio Neto - GitHub - netoo.elio@hotmail.com
Demo: lsm-admin-dev.up.railway.app
Built with 🦀 Rust and ❤️ for high-performance storage systems
