==============================================================================
                  mwbench -- a mixed-workload bench for tidesdb
==============================================================================


OVERVIEW
--------

mwbench drives a tidesdb instance through a long sequential ingest while a
pool of reader threads continuously probes the same column family with point
gets, iterator seeks, and short range scans. It records per-op latency
histograms, db-wide stats, level distribution, clock-cache hit rate, and
on-disk size every N seconds, then runs a small delete-then-compact reclaim
experiment at the tail so you can see how tombstone removal and full
compaction move disk usage and read latency.

Every value read back is recomputed from its key and byte-compared against
what was stored, so any lost write or corrupted byte shows up as a counter
in the CSV. Reads that hit deliberately-deleted keys are flagged and
excluded from the data-loss signal.
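
The self-checking idea above can be sketched in a few lines of Python. This
is an illustration only: the SHA-256 based derivation, the 8-byte key
encoding, and the names expected_value and classify_read are assumptions
made for this example, not the scheme main.c uses.

```python
import hashlib

def expected_value(key_id: int, value_size: int) -> bytes:
    """Derive a deterministic payload from a key id (hypothetical scheme)."""
    seed = key_id.to_bytes(8, "big")
    out = bytearray()
    counter = 0
    # Stretch the seed with a counter until the payload is long enough.
    while len(out) < value_size:
        out += hashlib.sha256(seed + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(out[:value_size])

def classify_read(key_id, stored, value_size, deleted_ids):
    """Return 'ok', 'expected_miss', 'miss', or 'mismatch' for one read."""
    if stored is None:
        # A miss only counts as data loss if the key was never deleted.
        return "expected_miss" if key_id in deleted_ids else "miss"
    if stored != expected_value(key_id, value_size):
        return "mismatch"
    return "ok"
```

Because the value is a pure function of the key, the reader needs no side
table of stored values; it only needs the set of deliberately-deleted ids.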

One run produces:

    out/run_YYYYMMDD_HHMMSS/samples.csv             time-series, one row per
                                                    sample interval, tagged
                                                    with a phase column

    out/run_YYYYMMDD_HHMMSS/delete_experiment.csv   four discrete snapshots
                                                    (peak, post-delete,
                                                    post-compact, settle)

The sibling plot.py renders both into ten PNGs.


BUILD
-----

mwbench is a single .c file linked against tidesdb and pthreads, built
with CMake. The CMake project name and resulting binary are mwbench --
historical artifact; same program.

    cmake -S . -B build \
          -DTIDESDB_ROOT=/path/to/tidesdb \
          -DTIDESDB_BUILD=/path/to/tidesdb/build
    cmake --build build -j

That produces ./build/mwbench. Every invocation below is written as
./mwbench for readability -- substitute ./build/mwbench (or symlink it):

    ln -s build/mwbench mwbench

TIDESDB_ROOT and TIDESDB_BUILD default to the paths baked into
CMakeLists.txt; override on the cmake line if your tidesdb checkout
lives elsewhere. The build embeds an rpath pointing at the tidesdb
build dir so libtidesdb.so is found at runtime without LD_LIBRARY_PATH.

plot.py needs python3 and matplotlib:

    pip install matplotlib


RUNNING
-------

Defaults target a 1 TiB ingest, which takes hours. For a smoke test use
the --quick preset (1 GiB target, short cooldowns):

    ./mwbench --quick

A normal-sized run on a fast SSD:

    ./mwbench --target-gib 64 --write-threads 8 --read-threads 4 \
              --data-dir /mnt/ssd/mwbench/data \
              --out-dir  /mnt/ssd/mwbench/out

When the run finishes, point plot.py at the newest run dir (or let it find
the newest one itself):

    ./plot.py                       # auto-picks newest ./out/run_*
    ./plot.py /path/to/run_dir      # explicit
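
If you want the same "newest run" lookup in your own scripts, a minimal
sketch follows. It assumes that sorting the run_YYYYMMDD_HHMMSS names
lexicographically is good enough (the timestamp format makes that
equivalent to chronological order); plot.py may well use a different rule,
such as mtime. The function names are made up for this example.

```python
from pathlib import Path

def newest_run_name(names):
    """Pick the lexicographically greatest run_* name, or None if there is none."""
    runs = sorted(n for n in names if n.startswith("run_"))
    return runs[-1] if runs else None

def newest_run_dir(out_dir="out"):
    """Return the newest run_* subdirectory of out_dir, or None."""
    out = Path(out_dir)
    if not out.is_dir():
        return None
    name = newest_run_name(p.name for p in out.iterdir() if p.is_dir())
    return out / name if name else None
```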


OPTIONS
-------

Workload size:

    --target-bytes N            total bytes to ingest                 1 TiB
    --target-gib   N            same, expressed in GiB                  --
    --value-size   N            per-value payload size, bytes         1024
    --batch-size   N            keys per writer txn                    256

Concurrency:

    --write-threads N           parallel writer threads                  4
    --read-threads  N           parallel reader threads                  2
    --flush-threads N           tidesdb flush worker count               4
    --compaction-threads N      tidesdb compaction worker count          4

Engine tuning:

    --write-buffer  N           memtable size, bytes              512 MiB
    --block-cache   N           clock-cache size, bytes             8 GiB

Sampling and timing:

    --sample-interval-sec N     seconds between CSV rows                10
    --range-scan-len     N      keys per range probe                   100
    --cooldown-sec       N      idle window after writers finish        30
    --delete-compact-wait-sec N seconds to sit in the compaction        60
                                phase after tidesdb_compact()

Paths and lifecycle:

    --data-dir PATH             tidesdb data directory
    --out-dir  PATH             parent of the run_* subdirs
    --resume                    open existing data dir instead of fresh
    --keep-data                 do not wipe data dir on clean exit

Presets:

    --quick                     1 GiB target, 2 s sample interval,
                                5 s cooldown, 10 s compact wait

Notes:

  * data-dir is wiped on close by default; pass --keep-data to inspect it.
  * out-dir is never overwritten; each invocation creates a new run_* dir.
  * --resume only skips the wipe-before-open step; key id allocation still
    starts from zero, so resumed runs only make sense for inspecting an
    already-populated db, not for extending one.
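
For a rough feel of what the workload-size options imply, the sketch below
turns a byte target into a key count and a writer-txn count. It assumes the
target counts value payload bytes only (key bytes and engine overhead
ignored), which may not match how main.c accounts for bytes; ingest_plan is
a name invented for this example.

```python
GIB = 1024 ** 3

def ingest_plan(target_bytes, value_size=1024, batch_size=256):
    """Return (keys, txns) needed to reach target_bytes of value payload."""
    keys = -(-target_bytes // value_size)   # ceiling division
    txns = -(-keys // batch_size)
    return keys, txns

# Defaults from the table above: 1 TiB target, 1024-byte values, 256 keys/txn.
default_keys, default_txns = ingest_plan(1024 * GIB)
```

Under that assumption the default run writes 2^30 keys in 2^22 txns, and
--quick writes 2^20 keys in 2^12 txns.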


PHASES
------

Each samples.csv row carries a phase tag so plot.py can shade the time
axis. The run walks through these in order:

    0  ingest         writers active, readers probing live keys
    1  cooldown       writers stopped, db quiescing, readers continue
    2  delete         100 evenly-spaced keys removed; deleted set
                      published to readers so a configurable fraction of
                      probes intentionally hits tombstones
    3  post-delete    settle window, snapshot taken
    4  compaction     tidesdb_compact() driven, readers still live
    5  post-compact   final settle and snapshot

delete_experiment.csv has one row per snapshot, labelled:

    peak                end of ingest, just before any deletes
    post_delete         after the 100 deletes have been applied
    post_compact        after the explicit compaction completes
    post_compact_settle short settle window after compaction
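
To shade a time axis by phase, the per-row tags first have to be collapsed
into contiguous spans. The sketch below shows one way to do that; it
assumes the phase column holds the numeric ids listed above, and it is not
taken from plot.py.

```python
PHASE_NAMES = {
    0: "ingest", 1: "cooldown", 2: "delete",
    3: "post-delete", 4: "compaction", 5: "post-compact",
}

def phase_spans(samples):
    """Collapse time-ordered (elapsed_s, phase_id) pairs into (name, start, end)."""
    spans = []
    for t, phase in samples:
        name = PHASE_NAMES.get(phase, str(phase))
        if spans and spans[-1][0] == name:
            # Same phase as the previous row: extend the current span.
            spans[-1] = (name, spans[-1][1], t)
        else:
            spans.append((name, t, t))
    return spans
```

Each resulting tuple can be handed to something like matplotlib's axvspan
to draw the shaded band.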


OUTPUT FILES
------------

samples.csv columns (per row):

    elapsed_s, phase, bytes_written, keys_written, write_mibs,
    point_p50_us, point_p95_us, point_p99_us, point_ops_s, point_misses,
    seek_p50_us,  seek_p95_us,  seek_p99_us,  seek_ops_s,  seek_misses,
    range_p50_us, range_p95_us, range_p99_us, range_ops_s, range_misses,
    mismatches, deleted_keys_published,
    disk_bytes, data_size_bytes, memtable_bytes,
    sstable_count, immutable_count, open_sstables,
    global_seq, txn_memory_bytes, memory_pressure_level,
    flush_qsize, compact_qsize,
    cf_num_levels, cf_total_keys, cf_tombstones, cf_tombstone_ratio,
    cf_max_density, cf_max_density_level, cf_memtable_size,
    cf_avg_key, cf_avg_val,
    lvl0_ssts, lvl1_ssts, lvl2_ssts, lvl3_ssts, lvl4_ssts, lvl5_ssts,
    cache_hits, cache_misses, cache_hit_rate

delete_experiment.csv columns:

    phase, disk_bytes, data_bytes, sstable_count, tombstones,
    tombstone_ratio
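
A quick way to read the reclaim experiment without plotting is to compare
each later snapshot's disk_bytes against the peak. The sketch below assumes
delete_experiment.csv starts with a header row using the column names
above and that disk_bytes is an integer; disk_delta_vs_peak is a name
invented for this example.

```python
import csv

def disk_delta_vs_peak(lines):
    """Map each post-peak snapshot label to (disk_bytes - peak disk_bytes)."""
    rows = {r["phase"]: r for r in csv.DictReader(lines, skipinitialspace=True)}
    peak = int(rows["peak"]["disk_bytes"])
    report = {}
    for label in ("post_delete", "post_compact", "post_compact_settle"):
        if label in rows:
            report[label] = int(rows[label]["disk_bytes"]) - peak
    return report

# Usage (path is illustrative):
#   with open("out/run_YYYYMMDD_HHMMSS/delete_experiment.csv", newline="") as f:
#       print(disk_delta_vs_peak(f))
```

A negative delta means the snapshot uses less disk than the peak.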

PNGs written by plot.py into the same run dir:

    ingest.png            written GiB + write MiB/s, phase-shaded
    latency.png           p50/p95/p99 per op type, log y
    read_ops.png          point/seek/range ops-per-sec, log y
    db_stats.png          sstable count, immutables, memtable, queues
    level_dist.png        stacked L0-L5 sstable counts
    cache.png             clock-cache hit rate + cumulative hits/misses
    disk.png              du disk size vs db-stats data size vs memtable
    integrity.png         unexpected misses + value mismatches
    p99_vs_size.png       p99 latency vs ingested bytes
    delete_experiment.png bar chart of the four snapshots


EXAMPLES
--------

Smoke test, ~1 GiB, just to verify the binary and plot pipeline:

    ./mwbench --quick
    ./plot.py

Medium overnight run, 256 GiB, larger memtable, 8 writers and 4 readers:

    ./mwbench --target-gib 256 \
              --write-threads 8 --read-threads 4 \
              --write-buffer $((1024 * 1024 * 1024)) \
              --block-cache  $((16 * 1024 * 1024 * 1024)) \
              --sample-interval-sec 15

Read-heavy probe with minimal writer pressure (good for steady-state
latency curves):

    ./mwbench --target-gib 32 \
              --write-threads 1 --read-threads 8 \
              --range-scan-len 50

Long compaction-watch run -- give the compactor 10 minutes after the
deletes so the reclaim curve is well sampled:

    ./mwbench --target-gib 128 \
              --delete-compact-wait-sec 600 \
              --sample-interval-sec 5

Keep the data dir so you can poke at it after the run:

    ./mwbench --quick --keep-data --data-dir ./scratch/data

Custom output location (e.g. on a separate drive from the data dir, so
CSV writes don't share a device with the workload):

    ./mwbench --target-gib 64 \
              --data-dir /mnt/nvme/mwbench/data \
              --out-dir  /home/me/mwbench-runs


CONSOLE OUTPUT
--------------

A startup banner echoes the resolved config. During the run you get a
one-line-per-second progress tick plus a fuller line on every sample
interval, e.g.

    t=  120.0s  GiB=  3.42  W=  291.5 MiB/s  pt(p50/p99)=  18.3/  92.4 us
    sk(p99)=  410.1  rg(p99)= 1820.6  ssts=37  L0imm=2  mtab=412.0MiB
    miss=0/0 corrupt=0

At exit it prints an ingest summary (keys, bytes, time, throughput) and
the two CSV paths.


TROUBLESHOOTING
---------------

Disk fills up mid-run:
    Lower --target-gib, or move --data-dir to a bigger volume. The data
    dir holds the working set; out-dir is small (CSVs + PNGs).

Reader latency dominated by L0:
    Raise --flush-threads so immutable memtables drain faster, raise
    --write-buffer so each flush produces fewer, larger L0 sstables, or
    raise --compaction-threads so L0->L1 doesn't back up.

mismatches > 0 in samples.csv:
    Real data corruption. The mismatches column counts byte-compare
    failures on values that were definitely written and not deleted.
    Reproduce with the same config and file an issue.

unexpected misses creeping up before the delete phase:
    Indicates lost writes. The reader only probes key ids it knows have
    been published as written, so a miss in the ingest phase is real.

plot.py "missing samples.csv":
    The run aborted early or you pointed plot.py at the wrong dir. By
    default it picks the newest out/run_*; pass the run dir explicitly
    to override.
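
The two integrity checks above (mismatches > 0, and misses before the
delete phase) are easy to automate. The sketch below assumes samples.csv
has a header row with the column names listed under OUTPUT FILES, and it
accepts either numeric phase ids or phase names because the exact encoding
of the phase column is not stated here. integrity_alerts is a name made up
for this example.

```python
import csv

PRE_DELETE = {"0", "1", "ingest", "cooldown"}
MISS_COLS = ("point_misses", "seek_misses", "range_misses")

def integrity_alerts(lines):
    """Return (elapsed_s, reason) pairs for rows that signal data loss."""
    alerts = []
    for row in csv.DictReader(lines, skipinitialspace=True):
        t = row["elapsed_s"]
        if int(row["mismatches"]) > 0:
            alerts.append((t, "mismatch"))
        # Before any deletes, every probed key should exist, so any miss is real.
        pre_delete = row["phase"].strip() in PRE_DELETE
        if pre_delete and any(int(row[c]) > 0 for c in MISS_COLS):
            alerts.append((t, "pre-delete miss"))
    return alerts
```

If the counters turn out to be cumulative, every row after the first bad
one will also be reported; the first alert is the one to look at.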


FILES
-----

    main.c                   the bench
    CMakeLists.txt           build definition (produces build/mwbench)
    plot.py                  CSV -> PNG renderer
    out/run_YYYYMMDD_HHMMSS/ per-run output (CSVs + PNGs)


==============================================================================
