diff --git a/README.md b/README.md
index 1b9bcd7..740b71e 100644
--- a/README.md
+++ b/README.md
@@ -1,5 +1,5 @@
-
-
+
+
ApexStore
@@ -27,6 +27,8 @@ ApexStore is a modern, Rust-based storage engine designed for write-heavy worklo
Built from the ground up using **SOLID principles**, it provides a production-grade storage solution that is easy to reason about, test, and maintain, while delivering the performance expected from a systems-level language.
+> **Used in production by [TeamCode](https://github.com/ElioNeto/teamcode)**, an autonomous AI coding agent platform that relies on ApexStore for reliable, low-latency key-value storage.
+
## Why ApexStore?
While industry giants like RocksDB or LevelDB focus on extreme complexity, ApexStore offers:
@@ -34,7 +36,8 @@ While industry giants like RocksDB or LevelDB focus on extreme complexity, ApexS
- **Educational Clarity**: A clean, modular implementation of LSM-Tree that serves as a blueprint for high-performance systems.
- **Strict SOLID Compliance**: Leveraging Rust's ownership model to enforce clear boundaries between MemTable, WAL, and SSTable layers.
- **Observability First**: Built-in real-time metrics for memory, disk usage, and WAL health.
-- **Modern Defaults**: Native LZ4 compression, Bloom Filters, and 35+ tunable parameters via environment variables.
+- **Modern Defaults**: Native LZ4 compression, Bloom Filters, encryption-at-rest (AES-GCM), and 45+ tunable parameters via environment variables.
+- **Security Hardened**: TLS/HTTPS support, CORS enforcement, rate limiting, per-IP connection limits, audit logging, and CSRF protection.
## Performance Benchmarks
@@ -49,40 +52,6 @@ While industry giants like RocksDB or LevelDB focus on extreme complexity, ApexS
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-
### YCSB Mixed Workload: `mixed_bench`
*Measured on **Intel Core i5-9300H @ 2.40GHz**, 16 GB DDR4 2667 MHz, HDD SATA 1TB (v2.1.39), using `cargo bench --bench mixed_bench -- --sample-size 10`*
@@ -121,17 +90,39 @@ While industry giants like RocksDB or LevelDB focus on extreme complexity, ApexS
## Key Features
### Storage Engine
-- **MemTable**: In-memory BTreeMap with configurable size limits.
-- **Write-Ahead Log (WAL)**: ACID-compliant durability with configurable sync modes.
-- **SSTable V2**: Block-based storage with Sparse Indexing and LZ4 Compression.
+- **MemTable**: Concurrent `DashMap`-backed in-memory store; sharded locking lets reads and writes to different keys proceed with minimal contention.
+- **Write-Ahead Log (WAL)**: ACID-compliant durability with configurable sync interval and group commit support.
+- **SSTable V2**: Block-based storage with Sparse Indexing, LZ4 Compression, and AES-GCM encryption.
- **Bloom Filters**: Drastically reduce unnecessary disk I/O for read operations.
-- **Crash Recovery**: Automatic WAL replay on startup to ensure zero data loss.
+- **Crash Recovery**: Automatic WAL replay on startup + SSTable auto-repair (truncated files detected and quarantined).
+- **Encryption at Rest**: AES-256-GCM encryption enabled by default with configurable keys.
+- **Range Deletion**: Efficient range tombstone support with compaction-aware filtering.
+
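To illustrate how a Bloom filter lets a read skip an SSTable that cannot contain a key, here is a minimal sketch using only the standard library. It is not ApexStore's implementation: the `BloomFilter` type, its method names, and the choice of `DefaultHasher` with a seed are all assumptions made for this example.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical Bloom filter: a bit array probed by several seeded hashes.
pub struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    /// Creates an empty filter; both sizes are clamped to at least 1.
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        BloomFilter {
            bits: vec![false; num_bits.max(1)],
            num_hashes: num_hashes.max(1),
        }
    }

    /// Maps (seed, key) to a bit position.
    fn index(&self, key: &[u8], seed: u64) -> usize {
        let mut hasher = DefaultHasher::new();
        seed.hash(&mut hasher);
        key.hash(&mut hasher);
        (hasher.finish() % self.bits.len() as u64) as usize
    }

    /// Sets every bit the key hashes to.
    pub fn insert(&mut self, key: &[u8]) {
        for seed in 0..self.num_hashes {
            let i = self.index(key, seed);
            self.bits[i] = true;
        }
    }

    /// Returns false only if the key was definitely never inserted;
    /// true means "possibly present" and the caller must still read the table.
    pub fn may_contain(&self, key: &[u8]) -> bool {
        (0..self.num_hashes).all(|seed| self.bits[self.index(key, seed)])
    }
}
```

A read path would call `may_contain` for each candidate SSTable and touch the disk only when it returns `true`. False positives are possible, false negatives are not.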
+### Security
+- **TLS/HTTPS**: Built-in rustls-based HTTPS with configurable certificates and ports.
+- **Authentication**: Bearer token-based auth enabled by default.
+- **CORS**: Configurable cross-origin resource sharing with restrictive defaults.
+- **Rate Limiting**: Sharded per-endpoint rate limiting with per-IP connection limits.
+- **Audit Logging**: Structured audit events with principal tracking for every API operation.
+- **CSRF Protection**: Content-Type guard middleware for state-changing requests.
+- **Secrets Management**: Constant-time token comparison to prevent timing attacks.
+- **File Permissions**: Data files created with 0600/0700 permissions (owner-only access).
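The constant-time token comparison mentioned above can be sketched as follows. This is an illustrative version, not the project's code: the function name `constant_time_eq` is hypothetical, and a production system would more likely rely on a vetted crate such as `subtle`, since a compiler is free to optimize a hand-written loop.

```rust
/// Compares two byte strings without returning early on the first mismatch,
/// so the running time does not reveal how many leading bytes matched.
pub fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // The length is not treated as secret here: tokens of different
    // lengths can never match.
    if a.len() != b.len() {
        return false;
    }
    // Accumulate the XOR of every byte pair; any difference sets a bit.
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    diff == 0
}
```

An ordinary `==` on slices may stop at the first differing byte, which is what a timing attack exploits; the loop above always visits every byte of equal-length inputs.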
### Access Patterns
-- **Interactive CLI**: REPL interface for development and debugging.
-- **REST API**: Full HTTP API with JSON payloads for microservices.
-- **Batch Operations**: Efficient bulk inserts and updates.
-- **Search Capabilities**: Prefix and substring search (Optimized iterators coming in v2.0).
+- **Interactive CLI**: REPL interface with token management (`token create`, `token list`, `token revoke`).
+- **REST API**: Full HTTP API with JSON payloads, batch operations, and paginated scans.
+- **Admin Dashboard**: Real-time web dashboard with live metrics and auto-refresh via fetch().
+- **WebSocket Sync**: Real-time bidirectional sync with authentication.
+- **GraphQL API**: Playground with production guard (disabled when auth enabled).
+- **Change Data Capture (CDC)**: HTTP webhook delivery with configurable retry, auth, and timeout.
+
+### Testing Infrastructure
+- **Unit Tests**: 550+ unit tests covering all engine operations.
+- **Property-Based Tests**: `proptest` for engine invariants (put/get/delete roundtrip, multi-key independence).
+- **Fuzz Testing**: `cargo-fuzz` targets for WAL frame format and SSTable block decoding.
+- **Chaos Testing**: Fault injection tests for I/O failures, corruption handling, and crash recovery.
+- **Randomized Testing**: Competitive test with reference `HashMap` model for linearizability verification.
+- **Integration Tests**: SSTable roundtrip, CLI pagination, restart recovery, stress simulation.
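The idea of checking an engine against a reference `HashMap` model can be sketched with the standard library alone. Everything here is an assumption for illustration: `ToyStore` is a hypothetical stand-in for the engine (it records deletes as tombstones), `XorShift` is a small seeded generator used instead of an external crate, and the check is sequential, so it demonstrates model equivalence rather than full linearizability under concurrency.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical stand-in for the engine under test: deletes are tombstones (None).
#[derive(Default)]
pub struct ToyStore {
    entries: BTreeMap<u8, Option<u32>>,
}

impl ToyStore {
    pub fn put(&mut self, key: u8, value: u32) {
        self.entries.insert(key, Some(value));
    }
    pub fn delete(&mut self, key: u8) {
        self.entries.insert(key, None);
    }
    pub fn get(&self, key: u8) -> Option<u32> {
        self.entries.get(&key).copied().flatten()
    }
}

/// Small xorshift64 generator so a failing run can be replayed from its seed.
pub struct XorShift(u64);

impl XorShift {
    pub fn next_u64(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Applies random put/delete/get operations to both the store and the model,
/// and reports the first step at which they disagree.
pub fn run_model_check(seed: u64, steps: usize) -> Result<(), String> {
    let mut rng = XorShift(seed.max(1)); // xorshift state must be non-zero
    let mut store = ToyStore::default();
    let mut model: HashMap<u8, u32> = HashMap::new();
    for step in 0..steps {
        let key = (rng.next_u64() % 16) as u8;
        match rng.next_u64() % 3 {
            0 => {
                let value = rng.next_u64() as u32;
                store.put(key, value);
                model.insert(key, value);
            }
            1 => {
                store.delete(key);
                model.remove(&key);
            }
            _ => {} // read-only step
        }
        let expected = model.get(&key).copied();
        let actual = store.get(key);
        if actual != expected {
            return Err(format!(
                "step {step}: key {key} expected {expected:?}, got {actual:?}"
            ));
        }
    }
    Ok(())
}
```

Keeping the key space small (16 keys here) forces frequent overwrites and deletes of the same key, which is where tombstone handling bugs tend to surface.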
## Architecture
@@ -142,44 +133,77 @@ graph TB
subgraph "Interface Layer"
CLI[CLI / REPL]
API[REST API Server]
+ WS[WebSocket Sync]
+ end
+
+ subgraph "Security Layer"
+ TLS[TLS/HTTPS]
+ Auth[Bearer Auth]
+ RateLimit[Rate Limiter]
+ Audit[Audit Log]
+ CORS[CORS Middleware]
end
subgraph "Core Domain"
Engine[LSM Engine]
- MemTable[MemTable<br/>BTreeMap]
+ MemTable[MemTable<br/>DashMap Concurrent]
LogRecord[LogRecord<br/>Data Model]
+ Compaction[Compaction<br/>Strategy]
end
subgraph "Storage Layer"
WAL[Write-Ahead Log<br/>Durability]
SST[SSTable Manager<br/>V2 Format]
- Builder[SSTable Builder<br/>Compression]
+ Builder[SSTable Builder<br/>LZ4 + AES-GCM]
+ Quarantine[Quarantine<br/>Auto-Repair]
end
subgraph "Infrastructure"
Codec[Serialization<br/>Bincode]
+ Metrics[Prometheus Metrics]
Error[Error Handling]
Config[Configuration<br/>Environment]
+ Degradation[Degradation<br/>Manager]
+ end
+
+ subgraph "Testing"
+ Proptest[Property Tests]
+ Fuzz[Fuzz Testing]
+ Chaos[Chaos Testing]
end
- CLI --> Engine
- API --> Engine
+ CLI --> Auth --> Engine
+ API --> TLS --> Auth --> Engine
+ WS --> Auth --> Engine
Engine --> WAL
Engine --> MemTable
MemTable -->|Flush| Builder
Builder --> SST
Engine -->|Read| MemTable
Engine -->|Read| SST
+ SST -->|Corrupt| Quarantine
WAL -.->|Recovery| MemTable
+ Engine --> Compaction
+ Engine --> Degradation
Engine --> Config
+ Engine --> Metrics
SST --> Codec
Builder --> Codec
WAL --> Codec
+ API --> RateLimit
+ API --> Audit
+ API --> CORS
+
style Engine fill:#f9a,stroke:#333,stroke-width:3px
style WAL fill:#9cf,stroke:#333,stroke-width:2px
style SST fill:#9cf,stroke:#333,stroke-width:2px
+ style TLS fill:#6c6,stroke:#333,stroke-width:2px
+ style Quarantine fill:#fc6,stroke:#333,stroke-width:2px
+ style Proptest fill:#cfc,stroke:#333,stroke-width:1px
+ style Fuzz fill:#cfc,stroke:#333,stroke-width:1px
+ style Chaos fill:#cfc,stroke:#333,stroke-width:1px
```
## Quick Start
@@ -201,6 +225,15 @@ cargo run --release
# > stats
```
+### Server with TLS
+```bash
+# Generate self-signed certificates
+openssl req -x509 -newkey rsa:4096 -keyout key.pem -out cert.pem -days 365 -nodes
+
+# Start HTTPS server
+TLS_ENABLED=true TLS_CERT_PATH=cert.pem TLS_KEY_PATH=key.pem cargo run --release --bin server
+```
+
## Docker Deployment
Run ApexStore as a standalone API server:
@@ -214,7 +247,11 @@ docker run -d \
--name apexstore-server \
-p 8080:8080 \
-e MEMTABLE_MAX_SIZE=33554432 \
+ -e TLS_ENABLED=true \
+ -e TLS_CERT_PATH=/certs/cert.pem \
+ -e TLS_KEY_PATH=/certs/key.pem \
-v apexstore-data:/data \
+ -v "$(pwd)/certs:/certs:ro" \
elioneto/apexstore:latest
```
@@ -224,29 +261,70 @@ docker run -d \
|--------|----------|-------------|
| `POST` | `/keys` | Insert/Update: `{"key": "k1", "value": "v1"}` |
| `GET` | `/keys/{key}` | Retrieve value |
-| `GET` | `/stats/all` | Full telemetry (Memory, Disk, WAL) |
+| `GET` | `/health/check` | Comprehensive health (uptime, engine mode, memtable stats) |
+| `GET` | `/stats` | Engine telemetry (memory, disk, WAL, write/read amplification) |
+| `DELETE` | `/keys/{key}` | Delete a key |
+| `POST` | `/keys/batch` | Batch insert/update |
+| `POST` | `/admin/flush` | Force memtable flush |
+| `POST` | `/admin/compact` | Force compaction |
+
+## Configuration
+
+ApexStore is configured via environment variables:
+
+| Variable | Default | Description |
+|----------|---------|-------------|
+| `TLS_ENABLED` | `false` | Enable HTTPS |
+| `TLS_CERT_PATH` | (none) | Path to TLS certificate (PEM) |
+| `TLS_KEY_PATH` | (none) | Path to TLS private key (PEM) |
+| `TLS_PORT` | `443` | HTTPS port |
+| `AUTH_ENABLED` | `true` | Enable bearer token authentication |
+| `CORS_ENABLED` | `false` | Enable CORS middleware |
+| `RATE_LIMIT_ENABLED` | `true` | Enable rate limiting |
+| `MAX_CONNECTIONS_PER_IP` | `100` | Max concurrent connections per IP |
+| `ENCRYPTION_ENABLED` | `true` | Enable data encryption at rest |
+| `WAL_SYNC_INTERVAL` | `4` | WAL fsync interval (writes between syncs) |
+
+See [docs/CONFIGURATION.md](docs/CONFIGURATION.md) for a complete list.
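As a rough illustration of environment-driven configuration with the defaults from the table above, the following sketch reads a few of the variables and falls back to the documented default when one is unset or unparsable. The `ServerConfig` struct, the function names, and the accepted boolean spellings (`true`/`1`, `false`/`0`) are assumptions for this example, not ApexStore's actual loader.

```rust
use std::env;
use std::str::FromStr;

/// Hypothetical subset of the settings listed in the table above.
#[derive(Debug, PartialEq)]
pub struct ServerConfig {
    pub tls_enabled: bool,
    pub tls_port: u16,
    pub auth_enabled: bool,
    pub max_connections_per_ip: u32,
    pub wal_sync_interval: u32,
}

/// Parses a boolean flag; anything unrecognized falls back to the default.
fn parse_bool(raw: Option<String>, default: bool) -> bool {
    match raw.as_deref().map(str::trim) {
        Some("true") | Some("1") => true,
        Some("false") | Some("0") => false,
        _ => default,
    }
}

/// Parses a numeric setting; missing or invalid input falls back to the default.
fn parse_num<T: FromStr>(raw: Option<String>, default: T) -> T {
    raw.and_then(|s| s.trim().parse().ok()).unwrap_or(default)
}

/// Builds the config from any lookup function, which keeps it testable
/// without mutating the process environment.
pub fn load_config(lookup: impl Fn(&str) -> Option<String>) -> ServerConfig {
    ServerConfig {
        tls_enabled: parse_bool(lookup("TLS_ENABLED"), false),
        tls_port: parse_num(lookup("TLS_PORT"), 443),
        auth_enabled: parse_bool(lookup("AUTH_ENABLED"), true),
        max_connections_per_ip: parse_num(lookup("MAX_CONNECTIONS_PER_IP"), 100),
        wal_sync_interval: parse_num(lookup("WAL_SYNC_INTERVAL"), 4),
    }
}

/// Convenience wrapper that reads from the real process environment.
pub fn load_config_from_env() -> ServerConfig {
    load_config(|name| env::var(name).ok())
}
```

Passing the lookup as a closure is a design choice for this sketch: tests can supply a fixed map of values, while the server would call `load_config_from_env()` at startup.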
## Project Structure
```
ApexStore/
├── src/
-│   ├── core/ # LSM Engine, MemTable, Domain logic
-│   ├── storage/ # WAL, SSTable V2, Block Builder
-│   ├── infra/ # Codec, Error Handling, Config
-│   ├── api/ # Actix-Web Server & Handlers
-│   └── cli/ # REPL Implementation
+│   ├── core/ # LSM Engine, MemTable (DashMap), Compaction, Domain logic
+│   ├── storage/ # WAL, SSTable V2, Block Builder, Encryption, Prefix Compression
+│   ├── infra/ # Codec, Error Handling, Config, Metrics, Scrubber, CDC
+│   ├── api/ # Actix-Web Server, Auth, Rate Limiter, Audit, Health, CORS
+│   └── cli/ # REPL + Token management commands
├── docs/ # Detailed documentation & Architecture
-├── tests/ # Integration test suite
+├── tests/ # Integration, Chaos, Proptest, Fuzz test suites
+├── fuzz/ # cargo-fuzz targets for WAL and SSTable
└── Dockerfile # Multi-stage build
```
## Testing & Quality
```bash
-cargo test # Run all tests
-cargo clippy -- -D warnings # Linting
-cargo fmt # Formatting
+# All tests (unit + integration + proptest + chaos)
+cargo test --all-features
+
+# Property-based tests (engine invariants)
+cargo test proptest --all-features
+
+# Chaos tests (fault tolerance)
+cargo test chaos_ --all-features
+
+# Fuzz testing (requires nightly)
+cargo +nightly fuzz run wal -- -runs=10000
+cargo +nightly fuzz run sstable -- -runs=10000
+
+# Linting & formatting
+cargo clippy --all-targets --all-features -- -D warnings
+cargo fmt --all -- --check
+
+# Security audit
+cargo audit
```
## CI/CD & Development Workflow
@@ -261,11 +339,23 @@ graph LR
D --> E[v2.1.X]
```
+### PR Validation Pipeline
+
+| Stage | What it checks |
+|-------|----------------|
+| `Rustfmt` | Code formatting |
+| `Clippy` | Lint warnings (deny by default) |
+| `Build and Docs` | Compilation + documentation generation |
+| `Run Tests` | Full test suite (550+ tests) |
+| `Security Audit` | `cargo audit` for dependency vulnerabilities |
+| `Benchmarks` | Performance regression gates (Write, Read, Scan, Mixed, Stress) |
+| `report-status` | Summary with root cause analysis on failure |
+
### Development Flow
1. **Create feature branch** from `main`
-2. **Open PR**: CI runs `cargo fmt`, `clippy`, `test`, `build`
-3. **Merge PR**: Auto-increments version in `Cargo.toml`, creates tag & GitHub release
+2. **Open PR**: CI runs all stages above
+3. **Merge PR**: Auto-increments version, creates tag & GitHub release
**Read:** [`MIGRATION_GUIDE.md`](MIGRATION_GUIDE.md) for team workflow
**Details:** [`.github/workflows/README.md`](.github/workflows/README.md)
@@ -276,9 +366,13 @@ graph LR
- [x] REST API & Feature Flags
- [x] Global Block Cache
- [x] Trunk-based CI/CD with auto-release
-- [ ] **v2.2**: Storage iterators for range queries
-- [ ] **v2.3**: Concurrent read optimization
+- [x] **v2.2**: Concurrent read optimization (RwLock engine core)
+- [x] **v2.3**: WAL I/O decoupling, DashMap MemTable
+- [x] Security audit resolution (40+ issues): TLS, encryption, auth, rate limiting, CSRF, audit
+- [x] Testing infrastructure: proptest, fuzz, chaos testing
- [ ] **v3.0**: Leveled/Tiered Compaction Strategies
+- [ ] **v3.1**: Distributed replication & consensus
+- [ ] **v3.2**: SQL query layer via Apache DataFusion
## Contributing
@@ -295,6 +389,24 @@ Contributions are what make the open-source community an amazing place! Please c
Distributed under the MIT License. See `LICENSE` for more information.
+## Powered By
+
+[TeamCode](https://github.com/ElioNeto/teamcode), an autonomous AI coding agent platform, uses ApexStore as its embedded storage engine for reliable, low-latency key-value storage, managing task state, context, and agent coordination data at scale.
+
## Contact
**Elio Neto** - [GitHub](https://github.com/ElioNeto) - netoo.elio@hotmail.com