@AndreasHolt
Contributor

What changed?

  • Statistics for shards are now stored under executor keys at <prefix>/<namespace>/executors/<executorID>/statistics instead of under /shards/.../statistics. GetState is also updated to support this.

  • DeleteShardStats deletes shard stats by editing those executor maps and committing a single txn.

  • We no longer batch the deletes, as I don't believe we will reach etcd's per-transaction op limit anymore (stale stats under ~128 different executors in one pass doesn't seem plausible).

  • Deleted the now-unused helpers and tests for shard keys.

  • The Subscribe logic was updated so that changes to the statistics key are treated like heartbeats/assigned_state changes (so they don't trigger rebalances), and the etcd store tests were updated to use the new executor-keyed setup.
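To illustrate the list above, here is a minimal sketch of the executor-keyed layout, not the code from this PR. It assumes the per-executor statistics value is a JSON map from shard ID to stats. The `ShardStats` type, its field, the function names, and the `/heartbeat` and `/assigned_state` key suffixes are hypothetical; only the `<prefix>/<namespace>/executors/<executorID>/statistics` path comes from the description.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// ShardStats is a hypothetical per-shard statistics record.
type ShardStats struct {
	SmoothedLoad float64 `json:"smoothed_load"`
}

// ExecutorStatsKey builds the statistics key for one executor, following
// the <prefix>/<namespace>/executors/<executorID>/statistics layout.
func ExecutorStatsKey(prefix, namespace, executorID string) string {
	return fmt.Sprintf("%s/%s/executors/%s/statistics", prefix, namespace, executorID)
}

// PruneShardStats deletes the given shard IDs from an executor's stats map
// (mutating it in place) and returns the re-serialized map. The bool reports
// whether anything was removed, so callers can skip writing unchanged maps.
func PruneShardStats(stats map[string]ShardStats, stale []string) ([]byte, bool, error) {
	changed := false
	for _, shardID := range stale {
		if _, ok := stats[shardID]; ok {
			delete(stats, shardID)
			changed = true
		}
	}
	if !changed {
		return nil, false, nil
	}
	out, err := json.Marshal(stats)
	return out, true, err
}

// TriggersRebalance reports whether a change to key should wake up the
// rebalancer. Keys that only carry liveness or telemetry are ignored.
// The suffixes are assumptions made for this sketch.
func TriggersRebalance(key string) bool {
	for _, suffix := range []string{"/heartbeat", "/assigned_state", "/statistics"} {
		if strings.HasSuffix(key, suffix) {
			return false
		}
	}
	return true
}
```

With this shape, a cleanup pass would call `PruneShardStats` once per affected executor and write each changed map back as one put in the single transaction.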

Why?
Reduce load on etcd.

How did you test it?
Unit tests and running the canary.

Potential risks

  • The etcd schema for shard statistics has changed, so any leftover stats keys under the shard prefix will be ignored by the new code.

  • DeleteShardStats now issues all per-executor updates in a single transaction. In an extreme scenario with many executors in one cleanup pass, this could hit etcd's op limit and leave some stale stats until the next run. Consider whether it's worth adding batching if we think this is ever possible.

  • Concurrent AssignShard calls to the same executor now share a single stats map without a CAS on the stats key, so shard telemetry (not assignments) could be lost on the rare occasion of a race, which might affect future metrics or load-based decisions. We made this trade-off because it's only telemetry, and we don't want it to cause assignments to retry.
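If batching ever turns out to be needed for the single-transaction risk above, it could be sketched roughly as follows. This is not part of the PR. `ChunkOps` and `maxTxnOps` are hypothetical names, and the value 128 assumes the cluster runs with etcd's default `--max-txn-ops` setting.

```go
package main

// maxTxnOps mirrors etcd's default --max-txn-ops value of 128.
// Assumption: the cluster has not overridden that default.
const maxTxnOps = 128

// ChunkOps splits ops into consecutive batches of at most size elements,
// so each batch can be committed in its own transaction. It returns nil
// for an empty input or a non-positive size.
func ChunkOps[T any](ops []T, size int) [][]T {
	if size <= 0 {
		return nil
	}
	var batches [][]T
	for start := 0; start < len(ops); start += size {
		end := start + size
		if end > len(ops) {
			end = len(ops)
		}
		batches = append(batches, ops[start:end])
	}
	return batches
}
```

The trade-off is that splitting the work gives up atomicity across batches: a failure partway through would leave some executors' stats cleaned and others not, which the next cleanup pass would have to pick up.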

Release notes

Documentation Changes

…ove old fetching logic of shard-level stats

Signed-off-by: Andreas Holt <6665487+AndreasHolt@users.noreply.github.com>