fix: deflake //rs/dogecoin/ckdoge/minter:integration_tests#10346
fix: deflake //rs/dogecoin/ckdoge/minter:integration_tests#10346gregorydemay wants to merge 9 commits into
Conversation
The ckDOGE withdrawal flow asserts that the fee returned by the `estimate_withdrawal_fee` query equals the fee used in the transaction that the minter actually submits. The query reflects the last refreshed median fee percentiles, which the minter only updates once its periodic refresh task runs after the dogecoin canister is synced. The estimate was therefore captured while the median was still the initialization default, racing the refresh and producing a fee mismatch only on the first withdrawal of a setup. Introduce a mandatory `minter_await_fee_refresh` step at the start of the withdrawal flow that drives the refresh and waits until the fee estimate stabilizes before the withdrawal request is submitted. The type-state makes the step impossible to skip for any caller reaching `expect_withdrawal_request_accepted`. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This PR aims to deflake ckDOGE withdrawal integration tests by ensuring the minter’s fee-percentile refresh runs (and the fee estimate stabilizes) before submitting the first withdrawal request, avoiding a race between estimate_withdrawal_fee and the minter’s periodic refresh.
Changes:
- Introduces a mandatory
minter_await_fee_refreshstep in the withdrawal flow type-state. - Adds a shared refresh interval constant to drive the periodic fee-percentiles refresh in PocketIC time.
- Updates ckDOGE minter integration tests to use the new mandatory flow step.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| rs/dogecoin/ckdoge/test_utils/src/lib.rs | Adds a constant used to advance PocketIC time past the minter’s refresh cadence. |
| rs/dogecoin/ckdoge/test_utils/src/flow/withdrawal.rs | Adds a new withdrawal-flow step that advances time/ticks until fee estimates stabilize. |
| rs/dogecoin/ckdoge/minter/tests/tests.rs | Updates integration tests to call the new mandatory fee-refresh step before approval/retrieval. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
The standalone fee-estimation test does not go through the withdrawal flow and was hitting the same stale median fee percentiles. Move the refresh helper onto MinterCanister so both the withdrawal flow and the estimation test can warm up the fee state. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Addresses review feedback: comparing fee estimates cannot distinguish an already-fresh estimate from one whose refresh has not yet succeeded, so a stale estimate could slip through. Wait until the minter logs a successful `estimate_fee_per_vbyte` refresh, and fail loudly if none occurs. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
… latency After a second upgrade, both ProcessLogic and RefreshFeePercentiles are rescheduled at the current time T0. The upgrade's single tick fires ProcessLogic (lower enum order). When await_fee_refresh then runs, tick 1 fires RefreshFeePercentiles (starts XNet call) and tick 2 fires ProcessLogic again (T0+5s < T0+361s), consuming an extra tick. The remaining 8 ticks were insufficient for the full XNet round trip through the management canister to the bitcoin adapter and back. Increasing to 100 gives ample headroom while still returning early once the refresh completes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
After minter install (in setup_tasks()) and after each upgrade, both ProcessLogic and RefreshFeePercentiles are scheduled at T0. Previously only one tick was called, firing only ProcessLogic (lower enum order), leaving RefreshFeePercentiles at T0. It would then fire during the first DogecoinSyncGuard tick in a subsequent mine_blocks call, sending a fee- percentile XNet call to the dogecoin canister while it was busy syncing new blocks. This put the canister in a persistent "not fully synced" state that prevented DogecoinSyncGuard from observing the target block height, causing it to time out after 1000 ticks. Add 20 ticks in Setup::new (after install) and in upgrade() so that: - Tick 1 fires ProcessLogic synchronously - Tick 2 fires RefreshFeePercentiles and starts the XNet call - Remaining ticks let the XNet round-trip complete After this, the RefreshFeePercentiles timer is at T0+360s, well outside any DogecoinSyncGuard window, so mine_blocks never sees the concurrent fee-percentile call. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mbjorkqvist
left a comment
There was a problem hiding this comment.
Thanks @gregorydemay! Claude found some nits, but overall LGTM!
| // Tick 1 fires ProcessLogic; tick 2 fires RefreshFeePercentiles and starts the XNet call; | ||
| // the remaining ticks let the XNet round-trip complete. After this the timer is at T0+360s, | ||
| // far beyond any DogecoinSyncGuard window. | ||
| for _ in 0..20 { |
There was a problem hiding this comment.
🤖 Minor: this 8-line comment + for _ in 0..20 { env.tick() } loop is duplicated almost verbatim in minter.rs upgrade() (post-upgrade drain). Consider extracting a shared helper (e.g. drain_startup_tasks(env)) so the explanation lives in one place and the magic 20 can't drift between the two sites.
There was a problem hiding this comment.
🤖 Done in 59fc26b. Extracted a shared drain_startup_tasks(env) helper (in lib.rs) that holds the explanation in one place and puts the magic 20 behind a named STARTUP_DRAIN_TICKS const. Both Setup::new and upgrade() now call it, so the two copies can no longer drift.
Extract a shared drain_startup_tasks helper used by Setup::new and upgrade(), strengthen the FEE_PERCENTILES_REFRESH_INTERVAL coupling note, and clarify the await_fee_refresh panic message. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
|
✅ No security or compliance issues detected. Reviewed everything up to 59fc26b. Security Overview
Detected Code Changes| Change Type | Relevant files ... (code changes summary truncated to fit VCS comment limits.) |
Purpose
The ckDOGE withdrawal integration tests (
should_resubmit_transactionandshould_resubmit_transaction_when_many_utxos) are flaky: they intermittently fail with a withdrawal-fee mismatch between theestimate_withdrawal_feequery and the fee in theSentDogeTransactionevent (e.g.dogecoin_fee: 1vs376000000).Root cause
estimate_withdrawal_feereflects the minter's last refreshed median fee percentiles. That value starts at the initialization default and is only updated once the minter's periodic refresh task runs successfully after the dogecoin canister is synced. The withdrawal flow captures the estimate before the withdrawal's own submission refreshes the median, so on the first withdrawal of a setup the estimate races the refresh — passing when a refresh already landed, failing otherwise.Fix
Add a mandatory
minter_await_fee_refreshstep at the start of the withdrawal flow that drives the fee-percentile refresh and waits until the estimate stabilizes, before the withdrawal request is submitted (kept before submission so it cannot trigger an early transaction). The flow's type-state makes the step impossible to skip for any caller reachingexpect_withdrawal_request_accepted.Successfully tried out 100 times with
Supersedes #10336.
This PR was created following the steps in
.claude/skills/fix-flaky-tests/SKILL.md.