You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
fix(distributed): routing strategies for sharded tables (issues 03, 09)
Resolves issue 09 (P0) and lands its prerequisite (issue 03, P1) in
one cohesive change because the strategy syntax extension naturally
lives in the shared tool config parser introduced by issue 03.
Issue 09 (P0) — single-shard route by shard key returned wrong
(empty) results for some integer keys. Root cause: ShardMap used
std::hash<int64_t> which on libstdc++ is the identity function, so
route(K) collapsed to K mod num_shards. The two-shard demo loaded
data by range (1-5 -> shard1, 6-10 -> shard2), which is misaligned
with hash routing for 6 of 10 keys. Wrong-results-without-error path.
Fixes:
- Replace std::hash<int64_t> with FNV-1a 64-bit. Deterministic across
compilers, regression-guarded by ShardMapHashTest.NotIdentityFunction.
- Add RoutingStrategy enum to TableShardConfig: HASH (default,
back-compat), RANGE (sorted upper-inclusive entries; first match;
values above max route to last entry's shard, matching
PARTITION BY RANGE LESS THAN MAXVALUE), LIST (explicit
value -> shard_index; supports int and string keys; misses fall
back to shard 0). The enum is the deliberate seam where a future
PROXYSQL_RULES strategy will plug in to delegate to ProxySQL's
mysql_query_rules table when this engine is embedded inside ProxySQL.
- Extend parse_shard_spec to accept an optional strategy qualifier:
table:key:backend1,backend2 (HASH, implicit)
table:key:hash:backend1,backend2 (HASH, explicit)
table:key:range:5=shard1,10=shard2 (RANGE)
table:key:list:1=a,2=a,6=b (LIST, int keys)
table:key:list:us-east=a,us-west=b (LIST, string keys)
Repeated backend names across entries are interned once.
- Update scripts/run_sharding_demo.sh and scripts/test_sqlengine.sh
to declare range routing matching the demo's data placement. The
two previously failing assertions in test_sqlengine.sh sharded
('route to shard1 (Eve)', 'route returns one row') pass without
further change. Suite now reports 16/16 sharded, 32/34 overall
(the two remaining failures are issue 08, separate).
Issue 03 (P1) — extract shared backend and shard configuration
parsing. tool_config_parser.{h,cpp} centralises parse_backend_url
and parse_shard_spec; sqlengine, mysql_server, bench_distributed and
engine_stress_test all delegate to it instead of duplicating the
logic. ~135 lines removed per tool.
Tests:
- New tests/test_shard_map.cpp — 14 cases covering HASH determinism,
FNV-1a divergence from identity (regression guard), RANGE matching
demo placement, RANGE above-max fallback, RANGE accepts unsorted
input and sorts internally, RANGE rejects spurious string lookups,
LIST int + string keys, LIST miss fallback, default-strategy is
HASH, out-of-range shard_index clamping, unknown table.
- 11 new cases in tests/test_ssl_config.cpp covering the new --shard
syntax (HASH default, explicit hash/range/list, malformed entries,
non-int range bound, empty body rejection, backend interning).
- scripts/test_sqlengine.sh sharded: 16/16 with range routing.
Docs:
- docs/issues/09 marked Resolved with Resolution Notes section.
- docs/issues/README.md status line updated.
- docs/architecture-and-status.md adds a Routing Strategies subsection
to section 6 and renumbers connection-pool to 6.4.
- docs/sqlengine.md documents the strategy qualifier and updates the
scatter/gather recipe to use range routing.
|`ShardMap` — unsharded tables + three routing strategies (HASH / RANGE / LIST) for sharded tables | ✅ |`shard_map.h`(FNV-1a hash; RANGE int-keyed; LIST int + string)|
The `--shard` flag accepts the strategy as an optional qualifier (back-compat: `table:key:s1,s2` defaults to `HASH`). The `RoutingStrategy` enum is the seam where a future `PROXYSQL_RULES` strategy will plug in to delegate to ProxySQL's `mysql_query_rules`.
|`SELECT … FROM unsharded`| Single REMOTE_SCAN to the pinned backend. |
436
-
|`SELECT … FROM sharded WHERE shard_key = K`| Routed REMOTE_SCAN to the one shard. |
448
+
|`SELECT … FROM sharded WHERE shard_key = K`| Routed REMOTE_SCAN to the one shard chosen by the configured strategy.|
437
449
|`SELECT … FROM sharded` (no key predicate) | Scatter REMOTE_SCAN to every shard, gather locally. |
438
450
|`SELECT col, AGG(x) … GROUP BY col`| Two-phase: per-shard partial aggregate + local MERGE_AGGREGATE. |
439
451
|`SELECT … ORDER BY k LIMIT n`| Per-shard ORDER BY + LIMIT n; local MERGE_SORT + LIMIT. |
440
452
|`SELECT … FROM sharded JOIN sharded ON …`| Materialize one side, hash-join locally (HASH_JOIN + REMOTE_SCAN inputs). |
441
453
|`INSERT/UPDATE/DELETE on sharded`| Distributed DML: one REMOTE_SCAN per affected shard, in DML mode. |
442
454
443
-
### 6.3 Connection pool + pinning
455
+
### 6.4 Connection pool + pinning
444
456
445
457
```mermaid
446
458
flowchart TB
@@ -572,8 +584,8 @@ Source: `docs/issues/README.md`. Status reflects the working tree on 2026-04-17.
572
584
573
585
1.**[01] Distributed 2PC must require safe session pinning** — 🟦 in working tree.
574
586
*Add `allows_unpinned_distributed_2pc()` flag. Default false. Pooled executors without pinning cannot enlist.*
575
-
9.**[09] Single-shard route by shard key misdirects for some keys** — 📋 open. Surfaced 2026-04-18 by `scripts/test_sqlengine.sh sharded`.
576
-
*`WHERE id = 5` against `users:id:shard1,shard2` returns empty rows; `id = 7` works. Wrong-results-without-error path. Either the router and the demo's data placement disagree, or `DistributedPlanner` ignores the shard key.*
587
+
9.**[09] Single-shard route by shard key misdirects for some keys** — ✅ resolved 2026-04-18.
588
+
*Root cause: `std::hash<int64_t>` on libstdc++ is identity, so route `= K mod num_shards`; demo data placement was range-based. Fix: replace with FNV-1a 64-bit hash and add a `RoutingStrategy` enum to `TableShardConfig` (HASH default, RANGE, LIST). The new `--shard "table:key:range:5=s1,10=s2"` syntax matches the demo's range placement. Engine is now ProxySQL-shaped — a future `PROXYSQL_RULES` strategy can join the same enum.*
**Hash function.** Replaced `std::hash<int64_t>` (libstdc++ identity → `K mod num_shards`) with FNV-1a 64-bit applied to all 8 bytes of the key. Deterministic across compilers, doesn't have the small-key collision pattern, regression-guarded by `ShardMapHashTest.NotIdentityFunction`.
110
+
111
+
**Routing strategy.** Added `RoutingStrategy` enum to `TableShardConfig` with three values:
-`RANGE` — ordered `(upper_inclusive, shard_index)` entries; first match wins; values exceeding the largest bound route to the last entry's shard (matches the MySQL `PARTITION BY RANGE LESS THAN MAXVALUE` idiom). Integer-keyed only today.
115
+
-`LIST` — explicit `value -> shard_index` map. Supports both int and string keys. Misses fall back to shard 0.
116
+
117
+
**Demo aligned with routing.**`scripts/start_sharding_demo.sh` loads users 1-5 to shard1 and 6-10 to shard2; `scripts/run_sharding_demo.sh` and `scripts/test_sqlengine.sh` now declare `--shard "users:id:range:5=shard1,10=shard2"` (and the analogous `orders` line) so the routing matches the data placement. The two previously failing route assertions in `scripts/test_sqlengine.sh sharded` (`route to shard1 (Eve)`, `route returns one row`) pass without further change.
118
+
119
+
**ProxySQL alignment.** The strategy enum is the deliberate seam for a future `PROXYSQL_RULES` value that delegates to ProxySQL's `mysql_query_rules` table. The existing planner code paths in `distributed_planner.h` (`literal_to_shard_index`) call through `ShardMap` unchanged, so adding a new strategy is a one-place addition rather than a cross-cutting refactor.
120
+
121
+
**Shard-spec syntax extension.**`parse_shard_spec` in `src/sql_engine/tool_config_parser.cpp` accepts an optional strategy qualifier:
table:key:list:1=shard1,6=shard2 # LIST (int keys)
128
+
table:key:list:us-east=a,us-west=b # LIST (string keys)
129
+
```
130
+
131
+
A repeated backend in `range:` or `list:` is interned once into `cfg.shards`.
132
+
133
+
## Test Coverage
134
+
135
+
-`tests/test_shard_map.cpp` — 14 cases covering HASH determinism, FNV-1a divergence from identity (regression guard), RANGE matching demo placement, RANGE above-max fallback, RANGE accepts unsorted input and sorts internally, RANGE rejects spurious string lookups, LIST int + string keys, LIST miss fallback, default-strategy is HASH, out-of-range shard_index clamping, unknown table.
136
+
-`tests/test_ssl_config.cpp` — 11 added cases for the new `--shard` syntax (HASH default, explicit hash/range/list, malformed entries, non-int range bound, empty body rejection, backend interning).
137
+
-`scripts/test_sqlengine.sh sharded` — 16/16 with the new range routing.
138
+
139
+
## Future Work
140
+
141
+
- Add a `PROXYSQL_RULES` strategy that consults `mysql_query_rules`-shaped data when the engine is embedded inside ProxySQL.
142
+
- Allow RANGE on string keys (lexicographic upper bounds) if a real use case arises.
143
+
- Allow multiple key columns per table for composite sharding.
144
+
- Allow a "shard not found" sentinel so unrouteable values can be made an error rather than silently routed.
Copy file name to clipboardExpand all lines: docs/issues/README.md
+1-1Lines changed: 1 addition & 1 deletion
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -7,7 +7,7 @@ This directory is the local working backlog for implementation gaps identified f
7
7
### P0
8
8
9
9
1.[Distributed 2PC must require safe session pinning](01-distributed-2pc-safe-session-pinning.md) — implemented in current working tree
10
-
9.[Single-shard route by shard key misdirects for some keys](09-shard-key-route-misdirection.md) — open; surfaced 2026-04-18 by `scripts/test_sqlengine.sh sharded`
10
+
9.[Single-shard route by shard key misdirects for some keys](09-shard-key-route-misdirection.md) — resolved 2026-04-18 (FNV-1a hash + RoutingStrategy enum: HASH/RANGE/LIST)
Backend names refer to the `name=` value from the backend URLs. A table with one backend is unsharded but pinned. Two or more backends turns on hash-based sharding by `SHARD_KEY`.
61
+
`STRATEGY` is optional and selects a `RoutingStrategy` from `ShardMap`. Backend names refer to the `name=` value from the backend URLs. The strategy determines what `ROUTING_BODY` looks like:
|`range:`|`upper=backend,upper=backend,...` (sorted internally) |`users:id:range:5=shard1,10=shard2,MAXVALUE on last`|
68
+
|`list:`|`value=backend,value=backend,...` (int or string) |`users:id:list:1=shard1,6=shard2` or `users:region:list:us-east=a,us-west=b`|
64
69
65
-
```text
66
-
--shard "users:id:shard1,shard2,shard3"
67
-
```
70
+
Notes:
71
+
72
+
-`hash` (default) uses **FNV-1a 64-bit** modulo `num_shards`. Deterministic across compilers.
73
+
-`range` is integer-keyed only. Values above the largest `upper` route to the last entry's backend (MySQL `LESS THAN MAXVALUE` idiom).
74
+
-`list` accepts both int and string keys. A miss falls back to shard 0.
75
+
- Repeated backends across `range:` or `list:` entries are interned once, so the order of distinct backends in the resulting shard list is the order of first appearance.
68
76
69
77
### Multiple flags
70
78
@@ -184,13 +192,13 @@ Useful smoke test that the executor and connection pool can talk to a real backe
184
192
185
193
### 5.4 Sharded query with scatter/gather
186
194
187
-
Two backends, one sharded table:
195
+
Two backends, one sharded table. Pick the routing strategy that matches how the data was loaded — for the demo set (1-5 on shard1, 6-10 on shard2) that's `range`, not `hash`:
0 commit comments