Skip to content

fix(android): recover default-interface index via Os.if_nametoindex#8889

Open
garmr-ulfr wants to merge 4 commits into
mainfrom
garmr/fix/android-default-interface-index
Open

fix(android): recover default-interface index via Os.if_nametoindex#8889
garmr-ulfr wants to merge 4 commits into
mainfrom
garmr/fix/android-default-interface-index

Conversation

@garmr-ulfr

@garmr-ulfr garmr-ulfr commented Jun 26, 2026

Copy link
Copy Markdown
Contributor

Problem

On Android, every sing-box outbound (direct, the auto-selector, and proxy outbounds) and DNS failed with no available network interface, while the UI still reported the VPN as Connected. Observed on Android 10 / API 29 (Samsung SM-G960U1); the condition persisted for the whole session with zero successful connections, and the auto-selector churned through servers marking each hard_demoted — a consequence, not the cause.

Root cause

DefaultNetworkMonitor.checkDefaultInterfaceUpdate is the only path that delivers the default network interface to sing-box (via updateDefaultInterface). It resolved the interface index solely through NetworkInterface.getByName(name).index, which relies on the interface-enumeration syscall. That syscall is restricted on some Android versions/ROMs, where getByName returns null or throws. On failure the 10-retry loop exhausted and returned without ever calling updateDefaultInterface, leaving sing-box with no default interface — so every outbound failed to bind.

The sibling PlatformInterfaceWrapper.getInterfaces() was already hardened for this exact case with an Os.if_nametoindex fallback; checkDefaultInterfaceUpdate was not.

Fix

Fall back to Os.if_nametoindex(interfaceName) when getByName cannot resolve the index, mirroring the hardened sibling, and retry only when the index is still unresolvable (<= 0). Also early-return on a null interface name instead of burning the retry budget on getByName(null).

Testing

Needs a build + on-device check on an API 29 device to confirm.

Summary by CodeRabbit

  • Bug Fixes
    • Improved network detection reliability, helping the app recognize the active connection more consistently.
    • Made interface resolution more resilient on Android, reducing missed updates and stale network state.
    • Added fallback handling for cases where interface details aren’t available through the usual system path.

NetworkInterface.getByName relies on the interface-enumeration syscall,
which is restricted on some Android versions/ROMs (returns null or throws
there). When it failed, checkDefaultInterfaceUpdate exhausted its retries
without ever calling updateDefaultInterface, so sing-box never received a
default interface and every outbound failed with "no available network
interface".

Fall back to Os.if_nametoindex when getByName cannot resolve the index,
mirroring the already-hardened getInterfaces sibling, and retry only when
the index is still unresolvable. Also early-return on a null interface
name instead of burning the retry budget on getByName(null).
@coderabbitai

coderabbitai Bot commented Jun 26, 2026

Copy link
Copy Markdown

Review Change Stack

📝 Walkthrough

Walkthrough

DefaultNetworkMonitor.kt is refactored to resolve default network interfaces using a single-thread executor instead of GlobalScope coroutines, adding stale-listener guarding, retry-based interface name/index resolution with a NetworkInterface/Os.if_nametoindex fallback. Bugs.kt, which exposed an unused Android stack-fix flag, is deleted.

Changes

Default Network Monitor Refactor

Layer / File(s) Summary
Constants and resolved-interface data holder
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt
Adds android.system.Os import, internal configuration constants, and a ResolvedInterface data holder alongside the volatile defaultNetwork field.
Executor-based dispatch and stale-listener guard
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt
Replaces GlobalScope coroutine usage with a process-lifetime single-thread updateExecutor in checkDefaultInterfaceUpdate, and adds notifyDefaultInterface to avoid invoking a stale/changed listener.
Interface name/index resolution with retry
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt
resolveDefaultInterface re-reads interfaceName from LinkProperties each attempt and derives the index via getInterfaceIndex, which tries NetworkInterface.getByName then falls back to Os.if_nametoindex; sleepBeforeRetry handles delay/interruption between attempts.

Removal of Unused Stack Fix Utility

Layer / File(s) Summary
Bugs object removal
android/app/src/main/kotlin/org/getlantern/lantern/utils/Bugs.kt
Deletes the Bugs object and its fixAndroidStack property, removing the SDK-version-based stack fix flag.

Estimated code review effort: 3 (Moderate) | ~25 minutes

Sequence Diagram(s)

sequenceDiagram
  participant NetworkCallback
  participant checkDefaultInterfaceUpdate
  participant updateExecutor
  participant resolveDefaultInterface
  participant listener

  NetworkCallback->>checkDefaultInterfaceUpdate: network change detected
  checkDefaultInterfaceUpdate->>updateExecutor: enqueue resolution task
  updateExecutor->>resolveDefaultInterface: resolve interface (with retries)
  resolveDefaultInterface-->>updateExecutor: ResolvedInterface or null
  updateExecutor->>listener: notifyDefaultInterface (if listener still valid)
Loading

Related PRs: None found referencing this specific change.

Suggested labels: android, refactor, network

Suggested reviewers: None determined from provided information.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly summarizes the main Android fix: recovering the default-interface index via Os.if_nametoindex.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch garmr/fix/android-default-interface-index

Comment @coderabbitai help to get the list of available commands.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses an Android-specific failure mode where sing-box never receives a default network interface index, causing all outbounds/DNS to fail with “no available network interface” even while the VPN appears connected. It hardens DefaultNetworkMonitor.checkDefaultInterfaceUpdate to recover the interface index on restricted Android ROMs by falling back to Os.if_nametoindex.

Changes:

  • Add Os.if_nametoindex(interfaceName) fallback when NetworkInterface.getByName cannot resolve an interface index.
  • Avoid retrying when the interface name is null by early-returning before entering the retry loop.
  • Retry only when the resolved index is still not usable (<= 0).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

garmr-ulfr and others added 2 commits June 26, 2026 13:35
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
getLinkProperties (and its interfaceName) can briefly return null right
after a network change. Resolving it once before the loop meant a transient
null abandoned the update entirely, leaving the default interface unset for
that network change. Resolve inside the retry loop instead, matching
upstream, so a transient null is retried rather than dropped.
* fix(android): deliver default-interface updates synchronously

checkDefaultInterfaceUpdate dispatched each updateDefaultInterface call
to GlobalScope.launch(Dispatchers.IO), so successive default-network
changes could be applied out of order, pinning a stale interface index
(Go side is last-writer-wins).

The golang/go#68760 stack workaround that the launch was duplicating is
already handled on the Go side (libbox SetupOptions.FixAndroidStack,
enabled by radiance), so delivering synchronously preserves the
network-change ordering that DefaultNetworkListener already guarantees.

Remove the now-unused Bugs.fixAndroidStack helper.

* refactor(android): serialize interface updates on a background thread

Default-network callbacks are delivered on the main looper, so resolving
the interface (which sleeps between retries and calls into Go over JNI)
and delivering the update ran on the UI thread. Move that work onto a
single-thread executor: it stays off the UI thread while still applying
updates in network-change order, which a shared pool would not guarantee.

Extract the resolution into resolveDefaultInterface/getInterfaceIndex/
sleepBeforeRetry helpers, snapshot the listener on the caller thread and
mark it @volatile so the worker reliably observes setListener writes, and
recover via if_nametoindex when getByName throws (not just when it returns
null). Add logging when the interface is resolved, cleared, or fails to
resolve.

* fix(android): skip interface updates for a torn-down listener

notifyDefaultInterface re-checks that the snapshotted listener is still
the active one before calling into libbox, so an update queued or being
resolved across a setListener teardown can no longer invoke a closed
handler.

Also tidy the resolution logging: split the getByName attempt from the
if_nametoindex fallback so the warning names the actual failing call and
the fallback runs once, and drop the fixed attempt-count from the
failure log since resolution can bail early on interruption.

* fix(android): guard if_nametoindex fallback behind API 26

Os.if_nametoindex was added in API 26, but the app supports API 24+.
Skip the name-based fallback on API 24/25 with an SDK_INT check instead
of letting the call fail, and drop the runCatching that was masking it.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (2)
android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt (2)

72-135: 🚀 Performance & Scalability | 🔵 Trivial | ⚖️ Poor tradeoff

Sequential per-event retry loop can back up the single-thread executor during network flapping.

Each unresolved network-change event can take up to ~1s (10 attempts × 100ms) before giving up. Since updateExecutor is single-threaded and FIFO, several quick successive network changes could queue up delayed resolutions, causing interface updates to lag behind the actual network state (though final ordering/consistency is preserved by the stale-listener guard). Consider short-circuiting/cancelling a still-queued or in-flight resolution when a newer network-change event arrives, if this becomes an issue in practice.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt`
around lines 72 - 135, The single-threaded retry work in DefaultNetworkMonitor
can queue up stale network resolutions during rapid flapping. Update
checkDefaultInterfaceUpdate and the updateExecutor flow so a newer
network-change event can supersede any still-queued or in-flight
resolveDefaultInterface work, while keeping the existing stale-listener guard in
notifyDefaultInterface. Use the existing Network-based callback path and the
resolveDefaultInterface/sleepBeforeRetry loop to short-circuit obsolete attempts
as soon as a newer event arrives.

25-30: 🩺 Stability & Availability | 🔵 Trivial | 💤 Low value

Consider @Volatile for defaultNetwork given the cross-thread write/read pattern.

defaultNetwork is written from the network-callback thread (Line 43) and read via require()/setListener(), which can be invoked from arbitrary threads. The comment immediately below (Lines 27-29) explicitly documents this exact visibility concern for listener but defaultNetwork doesn't get the same treatment, despite the same read/write pattern across threads.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt`
around lines 25 - 30, `defaultNetwork` in `DefaultNetworkMonitor` has the same
cross-thread write/read pattern as `listener`, but it is missing visibility
guarantees. Mark `defaultNetwork` with `@Volatile` (or otherwise synchronize its
access) so updates from the network-callback thread are safely visible to
`require()` and `setListener()` callers, matching the concurrency handling
already documented for `listener`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt`:
- Around line 72-135: The single-threaded retry work in DefaultNetworkMonitor
can queue up stale network resolutions during rapid flapping. Update
checkDefaultInterfaceUpdate and the updateExecutor flow so a newer
network-change event can supersede any still-queued or in-flight
resolveDefaultInterface work, while keeping the existing stale-listener guard in
notifyDefaultInterface. Use the existing Network-based callback path and the
resolveDefaultInterface/sleepBeforeRetry loop to short-circuit obsolete attempts
as soon as a newer event arrives.
- Around line 25-30: `defaultNetwork` in `DefaultNetworkMonitor` has the same
cross-thread write/read pattern as `listener`, but it is missing visibility
guarantees. Mark `defaultNetwork` with `@Volatile` (or otherwise synchronize its
access) so updates from the network-callback thread are safely visible to
`require()` and `setListener()` callers, matching the concurrency handling
already documented for `listener`.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro Plus

Run ID: 21fc1cd7-b519-49f9-9254-446cd820bbf6

📥 Commits

Reviewing files that changed from the base of the PR and between 3a26323 and 940b673.

📒 Files selected for processing (2)
  • android/app/src/main/kotlin/org/getlantern/lantern/service/DefaultNetworkMonitor.kt
  • android/app/src/main/kotlin/org/getlantern/lantern/utils/Bugs.kt
💤 Files with no reviewable changes (1)
  • android/app/src/main/kotlin/org/getlantern/lantern/utils/Bugs.kt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants