[Feature] Discuss DoS Protection Strategy for Constant Call APIs #6682

@warku123

Description

Summary

Nodes with vm.supportConstant = true expose triggerConstantContract and estimateEnergy APIs that are free for callers but consume node CPU. We ran pressure tests on a private network and found that a single machine with zero TRX can degrade a 16-vCPU node to 1.5% success rate within seconds using concurrent requests.

This issue presents test data and aims to discuss: should java-tron provide built-in concurrency protection for these APIs, and if so, should it be enabled by default or configured by operators?

Problem

Motivation

triggerConstantContract is widely used for reading smart contract state without broadcasting transactions. Because it is free, attackers can send high volumes of complex calls to exhaust node CPU. We investigated whether the existing configuration parameters effectively mitigate this.

Current State

Background: Real-world Energy Consumption

We measured the Top 100 most-called contracts on TRON mainnet (456 view/pure functions):

| Statistic | Energy |
|-----------|--------|
| Min       | 261    |
| Average   | 648    |
| P50       | 471    |
| P90       | 1,024  |
| P95       | 1,458  |
| Max       | 4,818  |

The current default maxEnergyLimitForConstant = 100M provides a ~20,000x safety margin over the observed maximum. However, as shown in Test 3 below, this parameter is not the real execution boundary — CPU time is.
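For reference, the knobs discussed here live in the node's `config.conf`. A sketch, under the assumption that the key names below match current java-tron releases (verify against your node's version):

```hocon
# Sketch of the relevant constant-call settings (key names per our reading
# of java-tron's config.conf; confirm against your release before relying on them)
vm {
  supportConstant = true                  # exposes triggerConstantContract / estimateEnergy
  maxEnergyLimitForConstant = 100000000   # 100M default; Test 3 shows CPU time, not this, is the binding limit
  estimateEnergy = true
}
```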


We ran a controlled DoS test suite on an AWS EC2 c6a.4xlarge instance running java-tron 4.8.1.

Test environment

| Item | Value |
|------|-------|
| Instance | AWS EC2 c6a.4xlarge (16 vCPU, 32 GB RAM) |
| java-tron version | 4.8.1 |
| Test contract | `EnergyLevelFlexible.consumeWithCount(uint256)` (see Appendix) |
| Test duration | 10 seconds per concurrency level |

Test 1: Baseline Concurrent Attack

Default configuration with supportConstant=true and no concurrent-request limiting. The test contract executes a configurable number of hash operations:

  • Light payload (count=1000): ~0.7M Energy, ~25ms per call
  • Heavy payload (count=1500): ~1.0M Energy, ~36ms per call — closer to the CPU time limit

Light payload (count=1000)

| Concurrency | Success Rate | Throughput (req/s) | Avg Latency (ms) | P99 Latency (ms) |
|-------------|--------------|--------------------|------------------|------------------|
| 1  | 100.0% | 40.7  | 25  | 34  |
| 10 | 100.0% | 303.4 | 33  | 45  |
| 20 | 99.7%  | 300.3 | 66  | 85  |
| 30 | 100.0% | 300.4 | 100 | 118 |

Heavy payload (count=1500)

| Concurrency | Success Rate | Throughput (req/s) | Avg Latency (ms) | P99 Latency (ms) |
|-------------|--------------|--------------------|------------------|------------------|
| 1  | 99.9% | 28.0  | 36  | 43  |
| 10 | 56.9% | 123.6 | 46  | 52  |
| 20 | 49.9% | 106.3 | 94  | 112 |
| 30 | 49.5% | 105.4 | 140 | 155 |

With a heavy payload, success rate drops to ~50% at concurrency 10+. An attacker only needs to increase the count parameter to maximize per-request CPU cost.


Test 2: GlobalPreemptibleAdapter Protection

GlobalPreemptibleAdapter uses a semaphore (tryAcquire(2, TimeUnit.SECONDS)) to limit concurrent execution. Excess requests queue for up to 2 seconds rather than being rejected immediately.

| Config | Concurrency | Success Rate | Throughput (req/s) | Avg Latency (ms) | P99 Latency (ms) |
|--------|-------------|--------------|--------------------|------------------|------------------|
| Baseline (no protection) | 300 | 1.5% | 10.4 | 348 | 4212 |
| GlobalPreemptibleAdapter permit=10 | 300 | 100.0% | 285.1 | 546 | 3933 |

Without protection, 300 concurrent connections collapse the node to 1.5% success. With permit=10, the node stays stable at ~285 req/s with 100% success. No requests were rejected because each call completes in ~35 ms, freeing permits quickly enough for every queued request to acquire one within the 2-second window.
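The semaphore pattern can be sketched with plain `java.util.concurrent` (this is an illustration of the mechanism, not java-tron's actual GlobalPreemptibleAdapter class; the class and method names below are ours):

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

// Minimal sketch of the GlobalPreemptibleAdapter idea: a fixed permit pool
// guards constant-call execution; excess requests wait for a slot up to a
// bounded window instead of piling onto the VM thread pool.
class ConstantCallGuard {
    private final Semaphore permits;

    ConstantCallGuard(int permitCount) {
        this.permits = new Semaphore(permitCount);
    }

    /** Runs the call if a permit frees up within waitMs, else rejects it. */
    boolean tryExecute(Runnable call, long waitMs) throws InterruptedException {
        if (!permits.tryAcquire(waitMs, TimeUnit.MILLISECONDS)) {
            return false; // bounded wait: reject instead of queueing indefinitely
        }
        try {
            call.run();
            return true;
        } finally {
            permits.release(); // a ~35 ms call frees its permit quickly
        }
    }
}
```

Because each constant call holds its permit only for the call's runtime, a small permit count still sustains high throughput, matching the permit=10 result above.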


Test 3: maxEnergyLimitForConstant Is Not the Real Limit

We swept the count parameter under different maxEnergyLimitForConstant configurations:

| Energy Limit | Count | Success Rate | Avg Energy | Avg Latency (ms) | Error |
|--------------|-------|--------------|------------|------------------|-------|
| 100M | 100  | 100% | 67636   | 37 | - |
| 100M | 500  | 100% | 340287  | 30 | - |
| 100M | 1000 | 100% | 689012  | 47 | - |
| 100M | 1500 | 100% | 1046525 | 43 | - |
| 10M  | 100  | 100% | 67636   | 34 | - |
| 10M  | 500  | 100% | 340287  | 26 | - |
| 10M  | 1000 | 100% | 689012  | 48 | - |
| 10M  | 1500 | 100% | 1046525 | 46 | - |
| 5M   | 100  | 100% | 67636   | 15 | - |
| 5M   | 500  | 100% | 340287  | 39 | - |
| 5M   | 1000 | 100% | 689012  | 27 | - |
| 5M   | 1500 | 100% | 1046525 | 37 | - |
| 3M   | 100  | 100% | 67636   | 34 | - |
| 3M   | 500  | 100% | 340287  | 29 | - |
| 3M   | 1000 | 100% | 689012  | 54 | - |
| 3M   | 1500 | 100% | 1046525 | 41 | - |

Regardless of maxEnergyLimitForConstant (100M, 10M, 5M, or 3M), the Energy limit is never the binding constraint. CPU timeout (OutOfTimeException) is always what stops execution first. Lowering the Energy limit is a semantic cleanup, not a security fix.


Test 4: QPS Blocking Mode Does Not Help

The default QpsRateLimiterAdapter uses Guava RateLimiter.acquire() which blocks but never rejects:

| Config | Concurrency | Success Rate | Throughput (req/s) | Rejected |
|--------|-------------|--------------|--------------------|----------|
| Default (global.qps=50000, blocking) | 300 | 1.5%   | 10.4  | 0 |
| Low QPS (global.qps=100, blocking)   | 30  | 10.7%  | 58.8  | 0 |
| GlobalPreemptibleAdapter permit=10   | 300 | 100.0% | 285.1 | 0 |

Even with global.qps=100, the blocking limiter still only achieves 10.7% success. The fundamental issue: RateLimiter.acquire() queues all requests and exhausts the thread pool, regardless of the QPS setting. Only GlobalPreemptibleAdapter effectively limits concurrent execution.
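The failure mode can be reproduced in miniature with a bounded worker pool and a blocking gate standing in for `RateLimiter.acquire()` (illustrative only, not java-tron code): once every worker thread is parked inside the blocking limiter, all further requests can only queue.

```java
import java.util.concurrent.*;

// Sketch of the Test 4 failure mode: a blocking rate limiter parks worker
// threads instead of rejecting requests, so a bounded pool fills with
// parked threads and every additional request lands in the queue.
class BlockingLimiterDemo {
    static int queuedUnderBlockingLimiter(int workers, int requests) throws Exception {
        CountDownLatch gate = new CountDownLatch(1); // stands in for RateLimiter.acquire()
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                workers, workers, 0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
        for (int i = 0; i < requests; i++) {
            pool.submit(() -> {
                try { gate.await(); } catch (InterruptedException ignored) { }
            });
        }
        Thread.sleep(200); // let the workers pick up tasks and park on the gate
        int queued = pool.getQueue().size(); // everything beyond `workers` is stuck waiting
        gate.countDown();
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        return queued;
    }
}
```

With 4 workers and 20 requests, 16 requests sit in the queue no matter what the QPS setting is; only a fail-fast or bounded-wait acquire (as in GlobalPreemptibleAdapter) sheds load.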


Test 5: estimateEnergy CPU Amplification

Comparing triggerConstantContract vs estimateEnergy with identical contract calls (count=1000, 10 concurrent):

| Endpoint | Success Rate | Throughput (req/s) | Avg Latency (ms) |
|----------|--------------|--------------------|------------------|
| triggerConstantContract | 99.0%  | 292.4 | 34  |
| estimateEnergy          | 100.0% | 33.3  | 299 |

estimateEnergy is ~9x slower due to binary search retries (estimateEnergyMaxRetry=3), each executing the full EVM contract. Any concurrency protection should cover both endpoints.
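The amplification can be sketched as a binary search in which every probe re-runs the contract in full. This is our simplified model, not java-tron's exact `estimateEnergy` implementation (the observed ~9x suggests the real path does additional work per retry beyond this minimal count):

```java
// Simplified sketch: estimateEnergy-style binary search where each probe
// executes the full contract, so CPU cost is a multiple of one plain
// triggerConstantContract call. Retry count mirrors estimateEnergyMaxRetry=3.
class EstimateSketch {
    static int executions = 0;

    // Pretend EVM run: succeeds iff the trial limit covers the actual cost.
    static boolean execute(long limit, long actualCost) {
        executions++;
        return limit >= actualCost;
    }

    static long estimate(long low, long high, long actualCost, int maxRetry) {
        execute(high, actualCost); // initial run at the energy cap
        for (int i = 0; i < maxRetry; i++) { // each retry re-executes the contract
            long mid = (low + high) / 2;
            if (execute(mid, actualCost)) high = mid; else low = mid;
        }
        return high;
    }
}
```

Even in this minimal model, one estimate costs maxRetry + 1 full executions, which is why a lower permit value for `EstimateEnergyServlet` is proposed below.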


Test 6: maxConnectionAge and maxConcurrentCallsPerConnection (Code Analysis)

These gRPC-only parameters were investigated via code analysis (empirical testing planned as follow-up):

| Parameter | Default | Risk |
|-----------|---------|------|
| maxConcurrentCallsPerConnection | Integer.MAX_VALUE | Unlimited concurrent calls per connection via HTTP/2 multiplexing |
| maxConnectionAgeInMillis | Long.MAX_VALUE | Connections never expire |

A single gRPC connection can bypass maxConnectionsWithSameIp=2 and exhaust all rpcThread workers.
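If these parameters are exposed the way other rpc settings are, tightening them would look roughly like the fragment below. The placement and key names are our assumption from the code analysis, not verified config syntax; the values are arbitrary examples:

```hocon
# Sketch only: parameter names from the code analysis above; placement in
# config.conf and value choices are assumptions to be verified per version
node {
  rpc {
    maxConcurrentCallsPerConnection = 100   # default Integer.MAX_VALUE
    maxConnectionAgeInMillis = 60000        # default Long.MAX_VALUE (connections never expire)
  }
}
```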

Limitations or Risks

If left unaddressed, publicly accessible API nodes remain vulnerable to trivial DoS attacks that require no TRX and no on-chain transactions.

Proposed Solution

Reference: Comparison with Ethereum Geth

Ethereum's Geth faces the same problem with eth_call and eth_estimateGas. Geth provides per-call protections (RPCGasCap = 50M, RPCEVMTimeout = 5s, BatchRequestLimit = 1000) but no native concurrent-execution limit, likely because Go's goroutine model (a few KB per goroutine) tolerates concurrent load better than Java's thread-per-request model (~1 MB of stack per thread). In practice, Geth operators rely on external infrastructure (Nginx, cloud load balancers) for concurrency control.

java-tron already has a built-in mechanism — GlobalPreemptibleAdapter — that can provide this protection natively.

Proposed Design

Based on our test results, GlobalPreemptibleAdapter is the most effective built-in defense. We'd like community input on the deployment strategy.

Key Changes

| # | Item | Current | Suggestion |
|---|------|---------|------------|
| 1 | GlobalPreemptibleAdapter for TriggerConstantContract | Not configured | Enable with configurable permit value |
| 2 | GlobalPreemptibleAdapter for EstimateEnergyServlet | Not configured | Enable with a lower permit (higher CPU cost per request) |
| 3 | maxEnergyLimitForConstant | 100M | Consider lowering for semantic alignment (not a security fix) |

Impact

  • Security: Limits the CPU that free constant calls can consume concurrently.
  • Stability: Prevents node collapse under concurrent load.
  • Performance: Normal queries are unaffected — permit=10 still delivers ~285 req/s.

Compatibility

  • Breaking Change: No.
  • Default Behavior Change: Depends on discussion outcome.
  • Migration Required: No.

Additional Notes

We'd like to discuss the following questions with the community:

  1. Default-on vs operator-configured? Should GlobalPreemptibleAdapter be enabled by default, or should operators enable it when needed? Enabling by default protects out-of-the-box but may affect high-concurrency use cases; leaving it off means most operators remain unprotected unless they discover the option.

  2. Node-level vs external protection? Some operators may prefer to handle rate limiting externally (Nginx, cloud LB). Should java-tron focus on providing the capability and documenting it, rather than enabling it by default?

  3. What is a reasonable permit value? Our test used permit=10 on 16 vCPU, achieving ~285 req/s with 100% success. Should the default be tied to CPU cores (e.g., cores / 2), or a fixed conservative value?

  4. Should maxEnergyLimitForConstant be lowered? The Top 100 contracts peak at only 4,818 Energy vs the 100M default (~20,000x gap). Lowering doesn't improve DoS resilience but better reflects real-world usage.
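For question 3, a cores-derived default could be as simple as the sketch below (the derivation rule is purely a discussion starter, not an agreed design):

```java
// Hypothetical default for the constant-call permit count: half the
// available vCPUs, floored at 1 so small nodes still serve requests.
class PermitDefault {
    static int defaultPermits(int vCpus) {
        return Math.max(1, vCpus / 2);
    }
}
```

On the 16-vCPU test machine this yields 8, close to the permit=10 value that held 100% success in Test 2.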

  • Do you have ideas regarding implementation? Yes
  • Are you willing to implement this feature? Yes

Appendix: Test Contract

The consumeWithCount(uint) function performs count iterations of keccak256 hashing and arithmetic, allowing precise control over Energy consumption and CPU time per call.

```solidity
// Solidity ^0.4.24 (abi.encodePacked requires >= 0.4.24; compatible with TRON TVM)
pragma solidity ^0.4.24;

contract EnergyLevelFlexible {
    function consumeWithCount(uint count) public pure returns (uint) {
        uint result = 1;
        // Each iteration hashes and mutates `result`, so per-call CPU time
        // scales linearly with `count` (capped at 500,000 iterations).
        for (uint i = 0; i < count && i < 500000; i++) {
            result = uint(keccak256(abi.encodePacked(result, i)));
            result = result * 3 + i;
            if (result > 1000000000000) {
                result = result / 2;
            }
        }
        return result;
    }
}
```
