Skip to content

[Enhancement] Cache OTel Attributes objects in broker/remoting metrics hot path #10481

@wang-jiahua

Description

@wang-jiahua

Before Creating the Enhancement Request

  • I have confirmed that this should be classified as an enhancement rather than a bug/feature.

Summary

After PR #10443 introduces static AttributeKey instances, the next layer of allocation in the metrics hot path is the Attributes object itself. For repeated parameter combinations (e.g. same (topic, messageType, isSystem) or same (requestCode, responseCode, isLongPolling, result)), the broker keeps rebuilding identical immutable Attributes objects on every message/RPC. This issue proposes a two-level cache (volatile inline cache + ConcurrentHashMap) to reuse those Attributes objects.

It also includes a concurrency fix for RemotingCodeDistributionHandler's inline cache, which is reachable from multiple Netty EventLoop threads via @Sharable and would otherwise have a torn-read window between two independent volatile fields.

Motivation

JFR heap dump analysis on a running broker shows:

  • BrokerMetricsManager.incTopicMessage is called once per send. Each call rebuilds an Attributes for the same (topic, messageType, isSystem) tuple. With ~50 unique combinations under steady-state and 100k QPS, ~99.95% of these Attributes objects are exact duplicates.
  • Each Attributes.builder().put().put().build() allocates ~200–300 bytes (builder, internal ArrayList<Object>, sort/dedup temporaries, final immutable backing array). 100k QPS × 250 B ≈ 25 MB/s metrics-only allocation rate.
  • RemotingMetricsManager builds a 4-tag Attributes for every RPC response, also dominated by repeats.
  • RemotingCodeDistributionHandler already maintains a per-code LongAdder map; it would benefit from an inline cache to avoid ConcurrentHashMap.get on every report.

Describe the Solution You'd Like

Two-level cache structure (used in both broker and remoting):

inline cache (volatile)         → near-100% hit in single-tenant scenario
   │ miss
   ↓
ConcurrentHashMap<key, Attributes>  → multi-tenant fallback
  1. BrokerMetricsManager.getOrBuildTopicAttributes(topic, messageType, isSystem) — 4-field volatile inline cache + CHM keyed by topic|messageType|isSystem.
  2. RemotingMetricsManager.getOrBuildAttributes(requestCode, responseCode, isLongPolling, result) — long-packed cache key (35 bits across 4 args, no String allocation), volatile inline cache + CHM.
  3. RemotingCodeDistributionHandler — replace two independent volatile fields with an immutable CachedCounter { final int code; final LongAdder adder; } holder published through a single volatile CachedCounter. Single volatile read per inc() returns a self-consistent snapshot, eliminating a torn-read between lastCode and lastAdder under @Sharable concurrent access.

Describe Alternatives You've Considered

  • String concatenation as cache key (e.g. topic + "|" + msgType + "|" + isSystem) — rejected because each lookup would itself allocate a String + char[], defeating the purpose of caching.
  • Single volatile Attributes cache without CHM fallback — rejected because in multi-tenant scenarios it would constantly be replaced and degenerate to a per-call build.
  • Two independent volatile fields for RemotingCodeDistributionHandler — initially proposed, rejected after review identified a torn-read between writing lastCode and lastAdder.

Additional Context

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions