fix(android): recover default-interface index via Os.if_nametoindex#8889
fix(android): recover default-interface index via Os.if_nametoindex#8889garmr-ulfr wants to merge 4 commits into
Conversation
NetworkInterface.getByName relies on the interface-enumeration syscall, which is restricted on some Android versions/ROMs (returns null or throws there). When it failed, checkDefaultInterfaceUpdate exhausted its retries without ever calling updateDefaultInterface, so sing-box never received a default interface and every outbound failed with "no available network interface". Fall back to Os.if_nametoindex when getByName cannot resolve the index, mirroring the already-hardened getInterfaces sibling, and retry only when the index is still unresolvable. Also early-return on a null interface name instead of burning the retry budget on getByName(null).
📝 WalkthroughWalkthroughDefaultNetworkMonitor.kt is refactored to resolve default network interfaces using a single-thread executor instead of GlobalScope coroutines, adding stale-listener guarding, retry-based interface name/index resolution with a NetworkInterface/Os.if_nametoindex fallback. Bugs.kt, which exposed an unused Android stack-fix flag, is deleted. ChangesDefault Network Monitor Refactor
Removal of Unused Stack Fix Utility
Estimated code review effort: 3 (Moderate) | ~25 minutes Sequence Diagram(s)sequenceDiagram
participant NetworkCallback
participant checkDefaultInterfaceUpdate
participant updateExecutor
participant resolveDefaultInterface
participant listener
NetworkCallback->>checkDefaultInterfaceUpdate: network change detected
checkDefaultInterfaceUpdate->>updateExecutor: enqueue resolution task
updateExecutor->>resolveDefaultInterface: resolve interface (with retries)
resolveDefaultInterface-->>updateExecutor: ResolvedInterface or null
updateExecutor->>listener: notifyDefaultInterface (if listener still valid)
Related PRs: None found referencing this specific change. Suggested labels: android, refactor, network Suggested reviewers: None determined from provided information. 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Pull request overview
This PR addresses an Android-specific failure mode where sing-box never receives a default network interface index, causing all outbounds/DNS to fail with “no available network interface” even while the VPN appears connected. It hardens DefaultNetworkMonitor.checkDefaultInterfaceUpdate to recover the interface index on restricted Android ROMs by falling back to Os.if_nametoindex.
Changes:
- Add
Os.if_nametoindex(interfaceName)fallback whenNetworkInterface.getByNamecannot resolve an interface index. - Avoid retrying when the interface name is
nullby early-returning before entering the retry loop. - Retry only when the resolved index is still not usable (
<= 0).
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
getLinkProperties (and its interfaceName) can briefly return null right after a network change. Resolving it once before the loop meant a transient null abandoned the update entirely, leaving the default interface unset for that network change. Resolve inside the retry loop instead, matching upstream, so a transient null is retried rather than dropped.
* fix(android): deliver default-interface updates synchronously checkDefaultInterfaceUpdate dispatched each updateDefaultInterface call to GlobalScope.launch(Dispatchers.IO), so successive default-network changes could be applied out of order, pinning a stale interface index (Go side is last-writer-wins). The golang/go#68760 stack workaround that the launch was duplicating is already handled on the Go side (libbox SetupOptions.FixAndroidStack, enabled by radiance), so delivering synchronously preserves the network-change ordering that DefaultNetworkListener already guarantees. Remove the now-unused Bugs.fixAndroidStack helper. * refactor(android): serialize interface updates on a background thread Default-network callbacks are delivered on the main looper, so resolving the interface (which sleeps between retries and calls into Go over JNI) and delivering the update ran on the UI thread. Move that work onto a single-thread executor: it stays off the UI thread while still applying updates in network-change order, which a shared pool would not guarantee. Extract the resolution into resolveDefaultInterface/getInterfaceIndex/ sleepBeforeRetry helpers, snapshot the listener on the caller thread and mark it @volatile so the worker reliably observes setListener writes, and recover via if_nametoindex when getByName throws (not just when it returns null). Add logging when the interface is resolved, cleared, or fails to resolve. * fix(android): skip interface updates for a torn-down listener notifyDefaultInterface re-checks that the snapshotted listener is still the active one before calling into libbox, so an update queued or being resolved across a setListener teardown can no longer invoke a closed handler. Also tidy the resolution logging: split the getByName attempt from the if_nametoindex fallback so the warning names the actual failing call and the fallback runs once, and drop the fixed attempt-count from the failure log since resolution can bail early on interruption. * fix(android): guard if_nametoindex fallback behind API 26 Os.if_nametoindex was added in API 26, but the app supports API 24+. Skip the name-based fallback on API 24/25 with an SDK_INT check instead of letting the call fail, and drop the runCatching that was masking it.
There was a problem hiding this comment.
🧹 Nitpick comments (2)
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt (2)
72-135: 🚀 Performance & Scalability | 🔵 Trivial | ⚖️ Poor tradeoffSequential per-event retry loop can back up the single-thread executor during network flapping.
Each unresolved network-change event can take up to ~1s (10 attempts × 100ms) before giving up. Since
updateExecutoris single-threaded and FIFO, several quick successive network changes could queue up delayed resolutions, causing interface updates to lag behind the actual network state (though final ordering/consistency is preserved by the stale-listener guard). Consider short-circuiting/cancelling a still-queued or in-flight resolution when a newer network-change event arrives, if this becomes an issue in practice.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt` around lines 72 - 135, The single-threaded retry work in DefaultNetworkMonitor can queue up stale network resolutions during rapid flapping. Update checkDefaultInterfaceUpdate and the updateExecutor flow so a newer network-change event can supersede any still-queued or in-flight resolveDefaultInterface work, while keeping the existing stale-listener guard in notifyDefaultInterface. Use the existing Network-based callback path and the resolveDefaultInterface/sleepBeforeRetry loop to short-circuit obsolete attempts as soon as a newer event arrives.
25-30: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low valueConsider
@VolatilefordefaultNetworkgiven the cross-thread write/read pattern.
defaultNetworkis written from the network-callback thread (Line 43) and read viarequire()/setListener(), which can be invoked from arbitrary threads. The comment immediately below (Lines 27-29) explicitly documents this exact visibility concern forlistenerbutdefaultNetworkdoesn't get the same treatment, despite the same read/write pattern across threads.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt` around lines 25 - 30, `defaultNetwork` in `DefaultNetworkMonitor` has the same cross-thread write/read pattern as `listener`, but it is missing visibility guarantees. Mark `defaultNetwork` with `@Volatile` (or otherwise synchronize its access) so updates from the network-callback thread are safely visible to `require()` and `setListener()` callers, matching the concurrency handling already documented for `listener`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In
`@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt`:
- Around line 72-135: The single-threaded retry work in DefaultNetworkMonitor
can queue up stale network resolutions during rapid flapping. Update
checkDefaultInterfaceUpdate and the updateExecutor flow so a newer
network-change event can supersede any still-queued or in-flight
resolveDefaultInterface work, while keeping the existing stale-listener guard in
notifyDefaultInterface. Use the existing Network-based callback path and the
resolveDefaultInterface/sleepBeforeRetry loop to short-circuit obsolete attempts
as soon as a newer event arrives.
- Around line 25-30: `defaultNetwork` in `DefaultNetworkMonitor` has the same
cross-thread write/read pattern as `listener`, but it is missing visibility
guarantees. Mark `defaultNetwork` with `@Volatile` (or otherwise synchronize its
access) so updates from the network-callback thread are safely visible to
`require()` and `setListener()` callers, matching the concurrency handling
already documented for `listener`.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 21fc1cd7-b519-49f9-9254-446cd820bbf6
📒 Files selected for processing (2)
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.ktandroid/app/src/main/kotlin/org/getlantern/lantern/utils/Bugs.kt
💤 Files with no reviewable changes (1)
- android/app/src/main/kotlin/org/getlantern/lantern/utils/Bugs.kt
Problem
On Android, every sing-box outbound (
direct, the auto-selector, and proxy outbounds) and DNS failed withno available network interface, while the UI still reported the VPN as Connected. Observed on Android 10 / API 29 (Samsung SM-G960U1); the condition persisted for the whole session with zero successful connections, and the auto-selector churned through servers marking eachhard_demoted— a consequence, not the cause.Root cause
DefaultNetworkMonitor.checkDefaultInterfaceUpdateis the only path that delivers the default network interface to sing-box (viaupdateDefaultInterface). It resolved the interface index solely throughNetworkInterface.getByName(name).index, which relies on the interface-enumeration syscall. That syscall is restricted on some Android versions/ROMs, wheregetByNamereturns null or throws. On failure the 10-retry loop exhausted and returned without ever callingupdateDefaultInterface, leaving sing-box with no default interface — so every outbound failed to bind.The sibling
PlatformInterfaceWrapper.getInterfaces()was already hardened for this exact case with anOs.if_nametoindexfallback;checkDefaultInterfaceUpdatewas not.Fix
Fall back to
Os.if_nametoindex(interfaceName)whengetByNamecannot resolve the index, mirroring the hardened sibling, and retry only when the index is still unresolvable (<= 0). Also early-return on a null interface name instead of burning the retry budget ongetByName(null).Testing
Needs a build + on-device check on an API 29 device to confirm.
Summary by CodeRabbit