Replace Mutex with CountDownLatch to fix ANR on cold start #7996

Merged
mrober merged 3 commits into firebase:main from
jrodiz:hotfix/crashlytics-anr-mutex-class-loading
Apr 1, 2026

@jrodiz jrodiz commented Mar 30, 2026

Summary

  • Fix ANR during cold start on budget devices (Realme, Vivo) caused by CrashlyticsRegistrar.<clinit> eagerly
    loading the entire kotlinx.coroutines synchronization infrastructure on the main thread
  • Replace kotlinx.coroutines.sync.Mutex with java.util.concurrent.CountDownLatch in
    FirebaseSessionsDependencies — a JVM primitive with zero class-loading overhead and identical one-shot gate
    semantics
  • Update test to use runBlocking for real-time timeout behavior with the blocking CountDownLatch.await() call

Problem (#7882)
When ComponentDiscovery loads CrashlyticsRegistrar via Class.forName() on the main thread, the static block
calls FirebaseSessionsDependencies.addDependency(), which creates a Mutex(locked = true). This cascades into
loading ~10 coroutines classes (SemaphoreAndMutexImpl, SemaphoreSegment, ConcurrentLinkedListNode,
SystemPropsKt, etc.) via <clinit> chains, all before Application.onCreate(). On budget devices where APK
class loading is slow, this may exceed the ANR timeout.
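
For readers less familiar with static-initializer cascades, here is a minimal sketch in plain Java. The class names are hypothetical stand-ins, not the SDK code: `Registrar` plays the role of CrashlyticsRegistrar and `HeavyDependency` stands in for the coroutines classes behind Mutex. It only illustrates that a one-argument `Class.forName()` runs the static block on the calling thread, and that the static block in turn initializes whatever it touches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only.
final class ClinitCascadeDemo {
    static final List<String> EVENTS = new ArrayList<>();

    static final class HeavyDependency {
        static { EVENTS.add("HeavyDependency.<clinit>"); }
        static void touch() { /* first call forces class initialization */ }
    }

    static final class Registrar {
        static {
            EVENTS.add("Registrar.<clinit> start");
            HeavyDependency.touch(); // initializes HeavyDependency on this thread
            EVENTS.add("Registrar.<clinit> end");
        }
    }

    // Mirrors reflective discovery: the one-argument Class.forName
    // initializes the class, so its static block runs before this returns.
    static List<String> load() {
        try {
            // A class literal alone does not initialize the class.
            Class.forName(Registrar.class.getName());
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
        return EVENTS;
    }

    public static void main(String[] args) {
        load().forEach(System.out::println);
    }
}
```

The nested initialization completes before `Class.forName()` returns, which is why any slow class loading in the chain is charged to the caller's thread.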

Fix
The Mutex was only used as a one-shot gate (create locked, unlock once when registered, waiters proceed).
CountDownLatch(1) provides the same semantics and, as a JVM bootstrap class, requires no additional class loading.
runInterruptible { latch.await() } preserves suspend/cancellation support in getRegisteredSubscribers().
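
The one-shot gate pattern can be sketched in plain Java as follows. This is an illustration under assumptions, not the actual FirebaseSessionsDependencies code: `OneShotGate`, `register`, and `awaitSubscriber` are hypothetical names. It also shows that a thread blocked in `await()` responds to interruption, which is the property `runInterruptible` depends on to turn coroutine cancellation into a thread interrupt.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical one-shot gate: created closed, opened once, never closed again.
final class OneShotGate {
    private final CountDownLatch registered = new CountDownLatch(1);
    private volatile String subscriber;

    void register(String name) {
        subscriber = name;
        registered.countDown(); // further calls are no-ops once the count is 0
    }

    boolean isOpen() {
        return registered.getCount() == 0;
    }

    // Blocks until register() has been called; returns null on timeout.
    String awaitSubscriber(long timeoutMs) throws InterruptedException {
        return registered.await(timeoutMs, TimeUnit.MILLISECONDS) ? subscriber : null;
    }

    // Returns true if a thread waiting on a closed gate exits via InterruptedException.
    static boolean waiterIsInterruptible() {
        OneShotGate gate = new OneShotGate();
        AtomicBoolean interrupted = new AtomicBoolean(false);
        Thread waiter = new Thread(() -> {
            try {
                gate.awaitSubscriber(60_000);
            } catch (InterruptedException e) {
                interrupted.set(true);
            }
        });
        waiter.start();
        waiter.interrupt();
        try {
            waiter.join(5_000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return interrupted.get();
    }

    public static void main(String[] args) throws InterruptedException {
        OneShotGate gate = new OneShotGate();
        System.out.println("open before register: " + gate.isOpen());
        gate.register("crashlytics");
        System.out.println("subscriber: " + gate.awaitSubscriber(100));
        System.out.println("waiter interruptible: " + waiterIsInterruptible());
    }
}
```

Because the latch never resets, every waiter that arrives after `register()` passes straight through, which matches the "unlock once, waiters proceed" usage described above.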

I have also added test cases that measure the load time of both approaches. Here are the results:

=== Mutex class-loading cascade (from ANR stack traces) ===
Classes loaded when calling Mutex(locked=true):
  loaded kotlinx.coroutines.sync.Mutex (493us)
  loaded kotlinx.coroutines.sync.MutexImpl (4319us)
  loaded kotlinx.coroutines.sync.SemaphoreAndMutexImpl (2us)
  loaded kotlinx.coroutines.sync.SemaphoreSegment (1544us)
  loaded kotlinx.coroutines.sync.SemaphoreKt (1392us)
  loaded kotlinx.coroutines.internal.ConcurrentLinkedListNode (2us)
  loaded kotlinx.coroutines.internal.ConcurrentLinkedListKt (673us)
  loaded kotlinx.coroutines.internal.Segment (1us)
  loaded kotlinx.coroutines.internal.SystemPropsKt (1us)
  loaded kotlinx.coroutines.internal.SystemPropsKt__SystemPropsKt (1us)
Mutex total: 8431us for 10 classes

=== CountDownLatch class-loading (fix) ===
CountDownLatch is a JDK bootstrap class — already loaded by the VM.
Creating CountDownLatch(1) triggers ZERO additional class loading.

=== Actual construction timing ===
Mutex(locked=true): 1591us
CountDownLatch(1): 2us

=== Functional equivalence ===
Mutex.isLocked = true
CountDownLatch.count = 1
After unlock/countDown:
Mutex.isLocked = false
CountDownLatch.count = 0

=== Impact on budget devices ===
Mutex loads 10 classes via <clinit> chain.
CountDownLatch loads 0 additional classes.
On budget Realme/Vivo devices, class loading from APK is 10-100x slower.
The Mutex cascade during CrashlyticsRegistrar.<clinit> pushes past the ANR timeout.
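
As a rough idea of how per-class timings like those above could be collected, here is a minimal sketch in plain Java. It is an assumption about the approach, not the test added in this PR: `ClassLoadTimer` and `loadMicros` are hypothetical names. Timing a kotlinx.coroutines class this way requires that library on the classpath, and a class that was already loaded will report only the lookup cost.

```java
// Hypothetical timing helper; not the test code from this PR.
final class ClassLoadTimer {

    // Returns the wall-clock time, in microseconds, taken to load and
    // initialize the named class through the caller's class loader.
    static long loadMicros(String className) {
        long start = System.nanoTime();
        try {
            Class.forName(className);
        } catch (ClassNotFoundException e) {
            throw new IllegalArgumentException("class not on classpath: " + className, e);
        }
        return (System.nanoTime() - start) / 1_000;
    }

    public static void main(String[] args) {
        String name = "java.util.concurrent.CountDownLatch";
        System.out.println("loaded " + name + " (" + loadMicros(name) + "us)");
    }
}
```

Single measurements of this kind are noisy, so numbers from one run on one device should be read as indicative only.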

CrashlyticsRegistrar.<clinit> calls FirebaseSessionsDependencies.addDependency()
which creates a kotlinx.coroutines.sync.Mutex. This triggers loading the entire
coroutines synchronization class chain on the main thread during Class.forName(),
causing ANRs on budget devices before Application.onCreate().

Replace Mutex with java.util.concurrent.CountDownLatch — a JVM primitive with
zero class-loading overhead that provides identical one-shot gate semantics.

Fixes firebase#7882

mrober previously approved these changes Apr 1, 2026
@jrodiz jrodiz force-pushed the hotfix/crashlytics-anr-mutex-class-loading branch from ebad1de to 7facb69 April 1, 2026 18:56
@mrober mrober enabled auto-merge (squash) April 1, 2026 18:59
@mrober mrober merged commit 7f0a4f2 into firebase:main Apr 1, 2026
23 of 25 checks passed
@github-actions github-actions bot mentioned this pull request Apr 2, 2026