tidesdb/mwbench
==============================================================================
mwbench -- a mixed-workload bench for tidesdb
==============================================================================
OVERVIEW
--------
mwbench drives a tidesdb instance through a long sequential ingest while a
pool of reader threads continuously probes the same column family with point
gets, iterator seeks, and short range scans. It records per-op latency
histograms, db-wide stats, level distribution, clock-cache hit rate, and
on-disk size every N seconds, then runs a small delete-then-compact reclaim
experiment at the tail so you can see how tombstone removal and full
compaction move disk usage and read latency.
Every value read back is recomputed from its key and byte-compared against
what was stored, so any lost write or corrupted byte shows up as a counter
in the CSV. Reads that hit deliberately-deleted keys are flagged and
excluded from the data-loss signal.
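The verification idea above can be sketched in a few lines of Python. This is an illustration only: the bench itself is C, and the SHA-256 derivation, the function names, and the result labels below are assumptions made for the example, not the scheme main.c actually uses.

```python
import hashlib


def make_value(key_id: int, value_size: int) -> bytes:
    """Derive a deterministic payload from the key id (hypothetical scheme)."""
    out = bytearray()
    counter = 0
    while len(out) < value_size:
        # Each SHA-256 digest contributes 32 bytes of reproducible filler.
        out.extend(hashlib.sha256(f"{key_id}:{counter}".encode()).digest())
        counter += 1
    return bytes(out[:value_size])


def classify_read(key_id, stored, value_size, deleted):
    """Classify one probe result; `stored` is None when the key was not found."""
    if key_id in deleted:
        # Deliberately deleted keys are flagged and kept out of the loss signal.
        return "deleted_probe"
    if stored is None:
        return "unexpected_miss"
    if stored != make_value(key_id, value_size):
        return "mismatch"
    return "ok"
```

Because the expected value is a pure function of the key, the reader needs no shadow copy of the data set to detect a lost or corrupted write.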
One run produces:
  out/run_YYYYMMDD_HHMMSS/samples.csv
      time-series, one row per sample interval, tagged with a phase column
  out/run_YYYYMMDD_HHMMSS/delete_experiment.csv
      four discrete snapshots (peak, post-delete, post-compact, settle)
The sibling plot.py renders both into ten PNGs.
BUILD
-----
mwbench is a single .c file linked against tidesdb and pthreads, built
with CMake. The CMake project name and resulting binary are mwbench --
historical artifact; same program.
cmake -S . -B build \
    -DTIDESDB_ROOT=/path/to/tidesdb \
    -DTIDESDB_BUILD=/path/to/tidesdb/build
cmake --build build -j
That produces ./build/mwbench. Every invocation below is written as
./mwbench for readability -- substitute ./build/mwbench (or symlink it):
ln -s build/mwbench mwbench
TIDESDB_ROOT and TIDESDB_BUILD default to the paths baked into
CMakeLists.txt; override on the cmake line if your tidesdb checkout
lives elsewhere. The build embeds an rpath pointing at the tidesdb
build dir so libtidesdb.so is found at runtime without LD_LIBRARY_PATH.
plot.py needs python3 and matplotlib:
pip install matplotlib
RUNNING
-------
Defaults target a 1 TiB ingest, which takes hours. For a smoke test use
the --quick preset (1 GiB target, short cooldowns):
./mwbench --quick
A normal-sized run on a fast SSD:
./mwbench --target-gib 64 --write-threads 8 --read-threads 4 \
    --data-dir /mnt/ssd/mwbench/data \
    --out-dir /mnt/ssd/mwbench/out
When the run finishes, point plot.py at the newest run dir (or let it find
the newest one itself):
./plot.py # auto-picks newest ./out/run_*
./plot.py /path/to/run_dir # explicit
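For scripting around the bench, the "newest run dir" lookup can be approximated as below. This is a sketch, not plot.py's actual logic (which is not shown here); newest_run_dir is a hypothetical helper. It relies only on the fact that run_YYYYMMDD_HHMMSS names sort lexicographically in chronological order.

```python
from pathlib import Path
from typing import Optional


def newest_run_dir(out_dir: str = "out") -> Optional[Path]:
    """Return the most recent run_* directory under out_dir, or None."""
    # Fixed-width timestamps mean a plain name sort is also a time sort.
    runs = sorted(p for p in Path(out_dir).glob("run_*") if p.is_dir())
    return runs[-1] if runs else None
```

Sorting by name rather than by mtime keeps the result stable even if an older run dir is touched later, for example by re-rendering its PNGs.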
OPTIONS
-------
The rightmost column is the default.
Workload size:
  --target-bytes N             total bytes to ingest             1 TiB
  --target-gib N               same, expressed in GiB            --
  --value-size N               per-value payload size, bytes     1024
  --batch-size N               keys per writer txn               256
Concurrency:
  --write-threads N            parallel writer threads           4
  --read-threads N             parallel reader threads           2
  --flush-threads N            tidesdb flush worker count        4
  --compaction-threads N       tidesdb compaction worker count   4
Engine tuning:
  --write-buffer N             memtable size, bytes              512 MiB
  --block-cache N              clock-cache size, bytes           8 GiB
Sampling and timing:
  --sample-interval-sec N      seconds between CSV rows          10
  --range-scan-len N           keys per range probe              100
  --cooldown-sec N             idle window after writers finish  30
  --delete-compact-wait-sec N  seconds to sit in the compaction  60
                               phase after tidesdb_compact()
Paths and lifecycle:
  --data-dir PATH              tidesdb data directory
  --out-dir PATH               parent of the run_* subdirs
  --resume                     open existing data dir instead of fresh
  --keep-data                  do not wipe data dir on clean exit
Presets:
  --quick                      1 GiB target, 2 s sample interval,
                               5 s cooldown, 10 s compact wait
Notes:
* data-dir is wiped on close by default; pass --keep-data to inspect it.
* out-dir is never overwritten; each invocation creates a new run_* dir.
* --resume only skips the wipe-before-open step; key id allocation still
starts from zero, so resumed runs only make sense for inspecting an
already-populated db, not for extending one.
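To get a feel for what a given --target-bytes means in keys and transactions, the arithmetic can be sketched as follows. One assumption is baked in and is not confirmed by this README: the target is treated as counting value payload bytes only, ignoring key bytes and per-record overhead. workload_shape is a hypothetical helper, not part of the bench.

```python
GIB = 1024 ** 3


def workload_shape(target_bytes: int, value_size: int = 1024, batch_size: int = 256):
    """Return (keys, txns) for a run, assuming only value bytes count toward the target."""
    keys = target_bytes // value_size
    txns = -(-keys // batch_size)  # ceiling division: a partial last batch is still a txn
    return keys, txns


print(workload_shape(1024 * GIB))  # default 1 TiB target -> (1073741824, 4194304)
print(workload_shape(1 * GIB))     # --quick 1 GiB target -> (1048576, 4096)
```

Under that assumption the default run writes about a billion keys, which is why --quick exists.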
PHASES
------
Each samples.csv row carries a phase tag so plot.py can shade the time
axis. The run walks through these in order:
  0  ingest        writers active, readers probing live keys
  1  cooldown      writers stopped, db quiescing, readers continue
  2  delete        100 evenly-spaced keys removed; deleted set published
                   to readers so a configurable fraction of probes
                   intentionally hits tombstones
  3  post-delete   settle window, snapshot taken
  4  compaction    tidesdb_compact() driven, readers still live
  5  post-compact  final settle and snapshot
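Shading a time axis by phase amounts to collapsing consecutive rows with the same tag into spans. A minimal sketch, assuming the phase column holds the numeric ids listed above (it may hold names instead; adjust the lookup if so). PHASE_NAMES and phase_spans are illustrative names, not taken from plot.py.

```python
PHASE_NAMES = {
    0: "ingest",
    1: "cooldown",
    2: "delete",
    3: "post-delete",
    4: "compaction",
    5: "post-compact",
}


def phase_spans(rows):
    """Collapse (elapsed_s, phase_id) rows into (name, start_s, end_s) spans."""
    spans = []
    for elapsed, phase in rows:
        name = PHASE_NAMES.get(phase, f"phase-{phase}")
        if spans and spans[-1][0] == name:
            # Same phase as the previous row: extend the current span.
            spans[-1] = (name, spans[-1][1], elapsed)
        else:
            spans.append((name, elapsed, elapsed))
    return spans
```

Each resulting span maps directly onto one shaded region, for example via matplotlib's Axes.axvspan(start_s, end_s).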
delete_experiment.csv has one row per snapshot, labelled:
  peak                 end of ingest, just before any deletes
  post_delete          after the 100 deletes have been applied
  post_compact         after the explicit compaction completes
  post_compact_settle  short settle window after compaction
OUTPUT FILES
------------
samples.csv columns (per row):
elapsed_s, phase, bytes_written, keys_written, write_mibs,
point_p50_us, point_p95_us, point_p99_us, point_ops_s, point_misses,
seek_p50_us, seek_p95_us, seek_p99_us, seek_ops_s, seek_misses,
range_p50_us, range_p95_us, range_p99_us, range_ops_s, range_misses,
mismatches, deleted_keys_published,
disk_bytes, data_size_bytes, memtable_bytes,
sstable_count, immutable_count, open_sstables,
global_seq, txn_memory_bytes, memory_pressure_level,
flush_qsize, compact_qsize,
cf_num_levels, cf_total_keys, cf_tombstones, cf_tombstone_ratio,
cf_max_density, cf_max_density_level, cf_memtable_size,
cf_avg_key, cf_avg_val,
lvl0_ssts, lvl1_ssts, lvl2_ssts, lvl3_ssts, lvl4_ssts, lvl5_ssts,
cache_hits, cache_misses, cache_hit_rate
delete_experiment.csv columns:
phase, disk_bytes, data_bytes, sstable_count, tombstones,
tombstone_ratio
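Both files are ordinary CSVs, so ad-hoc questions can be answered without plot.py. The sketch below uses only the column names and snapshot labels listed in this README; the helper names are hypothetical, and the exact cell formatting (for example whether phase is written as a number or a name) is an assumption to check against a real run.

```python
import csv


def worst_p99_by_phase(fileobj, column="point_p99_us"):
    """Map each phase tag in samples.csv to the largest value seen in `column`."""
    worst = {}
    for row in csv.DictReader(fileobj, skipinitialspace=True):
        phase = row["phase"]
        worst[phase] = max(worst.get(phase, 0.0), float(row[column]))
    return worst


def reclaimed_bytes(fileobj):
    """Disk bytes recovered between the peak and post_compact_settle snapshots."""
    disk = {
        row["phase"]: int(row["disk_bytes"])
        for row in csv.DictReader(fileobj, skipinitialspace=True)
    }
    return disk["peak"] - disk["post_compact_settle"]
```

Usage: open the file in text mode and pass the handle, e.g. worst_p99_by_phase(open(run_dir / "samples.csv", newline="")), where run_dir is a pathlib.Path. A negative result from reclaimed_bytes means disk usage grew across the experiment.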
PNGs written by plot.py into the same run dir:
  ingest.png             written GiB + write MiB/s, phase-shaded
  latency.png            p50/p95/p99 per op type, log y
  read_ops.png           point/seek/range ops-per-sec, log y
  db_stats.png           sstable count, immutables, memtable, queues
  level_dist.png         stacked L0-L5 sstable counts
  cache.png              clock-cache hit rate + cumulative hits/misses
  disk.png               du disk size vs db-stats data size vs memtable
  integrity.png          unexpected misses + value mismatches
  p99_vs_size.png        p99 latency vs ingested bytes
  delete_experiment.png  bar chart of the four snapshots
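Since cache_hits and cache_misses are cumulative, a per-interval hit rate has to be derived from the deltas between consecutive samples. A small sketch of that calculation follows; interval_hit_rates is a hypothetical helper, and whether plot.py plots the cumulative or the per-interval rate is not stated here.

```python
def interval_hit_rates(hits, misses):
    """Per-interval hit rate from two equal-length lists of cumulative counters."""
    rates = []
    for i in range(1, len(hits)):
        delta_hits = hits[i] - hits[i - 1]
        delta_misses = misses[i] - misses[i - 1]
        total = delta_hits + delta_misses
        # No cache traffic in the interval: the rate is undefined, not zero.
        rates.append(delta_hits / total if total else None)
    return rates
```

The per-interval view reacts to phase changes much faster than the cumulative rate, which is dominated by the long ingest phase.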
EXAMPLES
--------
Smoke test, ~1 GiB, just to verify the binary and plot pipeline:
./mwbench --quick
./plot.py
Medium overnight run, 256 GiB, larger memtable, 8 writers and 4 readers:
./mwbench --target-gib 256 \
    --write-threads 8 --read-threads 4 \
    --write-buffer $((1024 * 1024 * 1024)) \
    --block-cache $((16 * 1024 * 1024 * 1024)) \
    --sample-interval-sec 15
Read-heavy probe with minimal writer pressure (good for steady-state
latency curves):
./mwbench --target-gib 32 \
    --write-threads 1 --read-threads 8 \
    --range-scan-len 50
Long compaction-watch run -- give the compactor 10 minutes after the
deletes so the reclaim curve is well sampled:
./mwbench --target-gib 128 \
    --delete-compact-wait-sec 600 \
    --sample-interval-sec 5
Keep the data dir so you can poke at it after the run:
./mwbench --quick --keep-data --data-dir ./scratch/data
Custom output location (e.g. on a separate drive from the data dir, so the
CSVs aren't on the same spindle as the workload):
./mwbench --target-gib 64 \
    --data-dir /mnt/nvme/mwbench/data \
    --out-dir /home/me/mwbench-runs
CONSOLE OUTPUT
--------------
A startup banner echoes the resolved config. During the run you get a
one-line-per-second progress tick plus a fuller line on every sample
interval, e.g.
t= 120.0s GiB= 3.42 W= 291.5 MiB/s pt(p50/p99)= 18.3/ 92.4 us
sk(p99)= 410.1 rg(p99)= 1820.6 ssts=37 L0imm=2 mtab=412.0MiB
miss=0/0 corrupt=0
At exit it prints an ingest summary (keys, bytes, time, throughput) and
the two CSV paths.
TROUBLESHOOTING
---------------
Disk fills up mid-run:
Lower --target-gib, or move --data-dir to a bigger volume. The data
dir holds the working set; out-dir is small (CSVs + PNGs).
Reader latency dominated by L0:
Raise --flush-threads so immutable memtables drain sooner, raise
--write-buffer so each flush produces fewer, larger L0 sstables, or
raise --compaction-threads so L0->L1 doesn't back up.
mismatches > 0 in samples.csv:
Real data corruption. The mismatches column counts byte-compare
failures on values that were definitely written and not deleted.
Reproduce with the same config and file an issue.
unexpected misses creeping up before the delete phase:
Indicates lost writes. The reader only probes key ids it knows have
been published as written, so a miss in the ingest phase is real.
plot.py "missing samples.csv":
The run aborted early or you pointed plot.py at the wrong dir. By
default it picks the newest out/run_*; pass the run dir explicitly
to override.
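The two integrity checks above are easy to automate after a run. A minimal sketch, with these assumptions: the ingest phase is tagged "0" in the phase column, and the *_misses columns count unexpected misses. Neither is confirmed by this README, and integrity_problems is a hypothetical helper.

```python
import csv

MISS_COLUMNS = ("point_misses", "seek_misses", "range_misses")


def integrity_problems(fileobj, ingest_phase="0"):
    """Return (elapsed_s, kind) for every samples.csv row that looks suspicious."""
    problems = []
    for row in csv.DictReader(fileobj, skipinitialspace=True):
        elapsed = float(row["elapsed_s"])
        if int(row["mismatches"]) > 0:
            problems.append((elapsed, "mismatch"))
        # Before the delete phase nothing has been removed, so any miss is suspect.
        if row["phase"] == ingest_phase and any(int(row[c]) > 0 for c in MISS_COLUMNS):
            problems.append((elapsed, "miss_during_ingest"))
    return problems
```

An empty list means neither symptom appeared; anything else gives the elapsed time of the first bad sample, which is useful context when filing an issue.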
FILES
-----
  main.c                    the bench
  CMakeLists.txt            build definition (produces build/mwbench)
  plot.py                   CSV -> PNG renderer
  out/run_YYYYMMDD_HHMMSS/  per-run output (CSVs + PNGs)
==============================================================================