[server] Add request queue size metric per processor #3232

Open
zuston wants to merge 1 commit into apache:main from zuston:queueSizeProcessor

Member

@zuston zuston commented Apr 29, 2026

Purpose

Currently, each connection is bound to a dedicated RPC processor. From the aggregated RPC queue metrics alone, we cannot determine whether high latency is caused by one specific processor's queue being blocked.
When latency is high, I want to be able to identify whether a particular processor's queue is congested or stuck.
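
To make the gap concrete, here is a small sketch with made-up queue depths. The class and method names are hypothetical and not part of Fluss, and it assumes the aggregated metric is a plain sum over processors. Under that assumption, a balanced load and a single stuck processor produce the same aggregated value, so only a per-processor reading can tell them apart.

```java
import java.util.Arrays;

/** Illustrative only: the queue depths below are made-up numbers, not measurements. */
class QueueSkewSketch {

    /** Assumed shape of an aggregated queue-size metric: the sum over all processors. */
    static int aggregate(int[] queueSizes) {
        return Arrays.stream(queueSizes).sum();
    }

    /** Index of the processor with the deepest queue (the first one wins on ties). */
    static int deepestQueue(int[] queueSizes) {
        int best = 0;
        for (int i = 1; i < queueSizes.length; i++) {
            if (queueSizes[i] > queueSizes[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        int[] balanced = {25, 25, 25, 25}; // load spread evenly over four processors
        int[] skewed = {0, 100, 0, 0};     // one processor holds the whole backlog

        System.out.println("aggregate (balanced) = " + aggregate(balanced));
        System.out.println("aggregate (skewed)   = " + aggregate(skewed));
        System.out.println("deepest queue (skewed): processor " + deepestQueue(skewed));
    }
}
```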

Brief change log

Tests

API and Format

Documentation

Contributor

Copilot AI left a comment

Pull request overview

This PR adds per-RPC-processor visibility into request queue depth by exporting a requestQueueSize gauge scoped by processor index, enabling operators to pinpoint queue congestion to a specific processor rather than relying on only aggregated queue metrics.

Changes:

  • Register a per-processor requestQueueSize gauge under a request_processor_index metric group keyed by processor_index.
  • Extend RequestsMetrics with a helper to register gauges in a keyed child metric group.
  • Document the new coordinator/tabletserver per-processor queue size metric in the observability metrics guide.
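
The keyed-child-group registration listed above can be sketched roughly as follows. This is a stand-alone illustration under stated assumptions, not the code from this PR: `SketchMetricGroup`, its `addGroup`, `gauge`, and `read` methods, the dotted scope format, and the use of `BlockingQueue` as the request queue are all hypothetical stand-ins. Only the `processor_index` key and the `requestQueueSize` gauge name are taken from the change description.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.function.Supplier;

/** Hypothetical stand-in for a metric group; this is NOT the Fluss metrics API. */
class SketchMetricGroup {
    private final String scope;
    private final Map<String, Supplier<Integer>> gauges; // shared by parent and children

    private SketchMetricGroup(String scope, Map<String, Supplier<Integer>> gauges) {
        this.scope = scope;
        this.gauges = gauges;
    }

    static SketchMetricGroup root() {
        return new SketchMetricGroup("", new LinkedHashMap<>());
    }

    /** Returns a child group whose scope carries the key/value pair. */
    SketchMetricGroup addGroup(String key, String value) {
        String child = scope.isEmpty() ? key + "=" + value : scope + "." + key + "=" + value;
        return new SketchMetricGroup(child, gauges);
    }

    /** Registers a gauge directly under this group's scope. */
    void gauge(String name, Supplier<Integer> supplier) {
        gauges.put(scope.isEmpty() ? name : scope + "." + name, supplier);
    }

    /** Overload in the spirit of the described helper: register in a keyed child group. */
    void gauge(String key, String value, String name, Supplier<Integer> supplier) {
        addGroup(key, value).gauge(name, supplier);
    }

    /** Evaluates the gauge lazily, at read time. */
    int read(String fullName) {
        return gauges.get(fullName).get();
    }

    Iterable<String> names() {
        return gauges.keySet();
    }
}

class PerProcessorGaugeSketch {

    /** Registers one queue-size gauge per processor, keyed by the processor index. */
    static SketchMetricGroup register(List<BlockingQueue<String>> queues) {
        SketchMetricGroup metrics = SketchMetricGroup.root();
        for (int i = 0; i < queues.size(); i++) {
            BlockingQueue<String> queue = queues.get(i);
            metrics.gauge("processor_index", String.valueOf(i), "requestQueueSize", queue::size);
        }
        return metrics;
    }

    public static void main(String[] args) {
        List<BlockingQueue<String>> queues = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            queues.add(new ArrayBlockingQueue<>(16));
        }
        SketchMetricGroup metrics = register(queues);

        // Simulate a backlog on processor 1 only; gauges see it because they read lazily.
        queues.get(1).add("request-a");
        queues.get(1).add("request-b");

        for (String name : metrics.names()) {
            System.out.println(name + " = " + metrics.read(name));
        }
    }
}
```

Because each gauge holds a method reference to its own queue, the value is computed when the metric is read, so a backlog that builds up after registration still shows up under the matching `processor_index`.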

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| website/docs/maintenance/observability/monitor-metrics.md | Documents the new request_processor_index-scoped requestQueueSize gauge for coordinator and tabletserver. |
| fluss-rpc/src/main/java/org/apache/fluss/rpc/netty/server/RequestsMetrics.java | Adds an overload to register a gauge in a keyed child metric group (key/value). |
| fluss-rpc/src/main/java/org/apache/fluss/rpc/netty/server/RequestProcessorPool.java | Registers requestQueueSize per processor (keyed by processor_index) in addition to the existing aggregated gauge. |

Comment on lines +64 to +68
    requestsMetrics.gauge(
            PROCESSOR_INDEX,
            String.valueOf(i),
            MetricNames.REQUEST_QUEUE_SIZE,
            requestChannel::requestsCount);

2 participants