Skip to content

Android: Closeable lifecycle, error reporting, thread safety, and tes…#19229

Draft
psiddh wants to merge 1 commit intopytorch:mainfrom
psiddh:android-combined-v2
Draft

Android: Closeable lifecycle, error reporting, thread safety, and tes…#19229
psiddh wants to merge 1 commit intopytorch:mainfrom
psiddh:android-combined-v2

Conversation

@psiddh
Copy link
Copy Markdown
Contributor

@psiddh psiddh commented Apr 30, 2026

…t coverage

Combines all previously reverted Android improvement PRs (#18669, #19012, #19028, #19092, #19099, #19124) plus new Module lifecycle tests into a single atomic change.

Error reporting: all sync errors throw exceptions instead of returning status codes or logging silently. Module.loadMethod() throws ExecutorchRuntimeException on failure. LlmModule.generate() and load() throw on error. Native methods renamed with "Native" suffix; public wrappers check status and throw.

Lifecycle: Module, LlmModule, and TrainingModule implement Closeable for try-with-resources. LlmModule adds ReentrantLock to serialize access to non-thread-safe native state. stop() remains lock-free (C++ atomic flag). TrainingModule replaces Log.e silent failures with IllegalStateException.

Error consistency: ExecuTorchRuntime.validateFilePath throws IllegalArgumentException. SGD throws IllegalStateException. AsrModule throws ExecutorchRuntimeException. Cause-chaining constructor added to ExecutorchRuntimeException.

JNI safety: Module and LlmModule constructors wrapped in try-catch to surface native initialization failures. LlmModule.load() uses throwExecutorchException with diagnostic detail. All @DoNotStrip annotations preserved on JNI-called methods; new @DoNotStrip added to renamed private native methods (generateNative, resetContextNative, loadNative).

Tests: fix 4 @ignored Module tests by providing required input tensor. Add 13 new lifecycle/API coverage tests. Add LlmModule use-after-close and idempotent-close tests. Fix ModelRunner crash on loadMethod failure.

This PR was authored with the help of Claude.

Copilot AI review requested due to automatic review settings April 30, 2026 16:38
@pytorch-bot
Copy link
Copy Markdown

pytorch-bot Bot commented Apr 30, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19229

Note: Links to docs will display an error until the docs builds have been completed.

✅ You can merge normally! (2 Unrelated Failures)

As of commit 04727f0 with merge base e84a418 (image):

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 30, 2026
@github-actions
Copy link
Copy Markdown

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Consolidates several Android API improvements across the ExecuTorch Java/Kotlin wrappers and JNI layer, focusing on exception-based error reporting, explicit lifecycle management (Closeable), and basic thread-safety guarantees, plus expanded instrumentation test coverage.

Changes:

  • Standardize synchronous error handling to throw exceptions (vs returning status codes / silent logging) across Module, LlmModule, training/ASR wrappers, and JNI glue.
  • Add/solidify lifecycle management via Closeable (close()/idempotent destroy patterns) and serialize access to non-thread-safe native state in LlmModule via ReentrantLock.
  • Update JNI registrations and Java native method naming (*Native), and expand Android instrumentation tests to cover lifecycle and API behavior.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/ModelRunner.java Handle Module.loadMethod() throwing by capturing an error code and early-returning safely; ensure destroy() in finally.
extension/android/jni/jni_layer_llama.cpp Add constructor exception-to-Java conversion; improve load() error reporting; rename registered natives (generateNative, loadNative, resetContextNative).
extension/android/jni/jni_layer.cpp Wrap native Module construction in exception mapping; throw on unsupported input EValue type codes; rename registered natives (etdumpNative, getMethodsNative).
extension/android/executorch_android/src/main/java/org/pytorch/executorch/training/TrainingModule.java Implement Closeable; replace silent failures with IllegalStateException on use-after-destroy.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/training/SGD.java Throw IllegalStateException (vs generic RuntimeException) when optimizer is destroyed.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java Implement Closeable; add locking + destroyed checks; convert many status-returning APIs to void + exception; rename native methods to *Native.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrModule.kt Throw ExecutorchRuntimeException with error code on creation/transcription failures.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/Module.java Implement Closeable; make loadMethod() throw; add locked wrappers around additional APIs; rename native methods to *Native.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecutorchRuntimeException.java Add error-code documentation, ALREADY_LOADED, improved message prefix, and cause-chaining constructor.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecuTorchRuntime.java Tighten file-path validation and switch to IllegalArgumentException.
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/ModuleInstrumentationTest.kt Un-ignore and fix tests by providing required dummy input; add new API/lifecycle coverage.
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/LlmModuleInstrumentationTest.kt Update tests for exception-based load; add close() lifecycle tests.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@psiddh psiddh force-pushed the android-combined-v2 branch from a96bd6b to 6ffdd62 Compare April 30, 2026 16:47
Copilot AI review requested due to automatic review settings April 30, 2026 17:05
@psiddh psiddh force-pushed the android-combined-v2 branch from 787239e to 9f8afe1 Compare April 30, 2026 17:06
@psiddh psiddh requested a review from JacobSzwejbka April 30, 2026 17:06
@psiddh psiddh marked this pull request as draft April 30, 2026 17:10
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR consolidates several previously reverted Android improvements into one atomic update, standardizing ExecuTorch Android APIs around exception-based error reporting, explicit Closeable lifecycles, and improved thread-safety—along with significant new/updated instrumentation test coverage.

Changes:

  • Shift Android APIs away from status-code returns/silent failures to consistent exception throwing (Java/Kotlin + JNI).
  • Add/align Closeable lifecycle semantics (try-with-resources) and introduce locking around non-thread-safe native state (notably for LlmModule).
  • Update JNI registrations to match renamed *Native methods and expand Android instrumentation tests (including new lifecycle tests).

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/ModelRunner.java Adapts benchmark load-path to exception-based Module.loadMethod() and ensures destroy on success path.
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/LlmModelRunner.java Adapts load-path to exception-based LlmModule.load() (but generate path still needs updates).
extension/android/jni/jni_layer_llama.cpp Adds constructor error surfacing, improves load error reporting, and updates native method registrations (*Native).
extension/android/jni/jni_layer.cpp Adds constructor error surfacing, throws on unsupported EValue input type, and updates native registrations (etdumpNative, getMethodsNative).
extension/android/executorch_android/src/main/java/org/pytorch/executorch/training/TrainingModule.java Implements Closeable and replaces silent failures with exceptions.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/training/SGD.java Uses IllegalStateException for destroyed optimizer usage.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java Implements Closeable, adds locking, converts sync APIs to throw-on-error wrappers, and renames JNI methods to *Native.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/asr/AsrModule.kt Switches to ExecutorchRuntimeException for native creation/transcription failures.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/Module.java Implements Closeable, adds throw-on-error wrappers and lock-guarded public methods, and renames JNI methods to *Native.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecutorchRuntimeException.java Adds richer docs, new error code constant, improved message format, and a cause-chaining constructor.
extension/android/executorch_android/src/main/java/org/pytorch/executorch/ExecuTorchRuntime.java Makes file-path validation stricter and consistently IllegalArgumentException-based.
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/ModuleInstrumentationTest.kt Un-ignores/fixes Module tests and adds broader API/lifecycle coverage (load modes, log buffer, etdump, destroyed-state).
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/LlmModuleInstrumentationTest.kt Updates to new throwing APIs and adds lifecycle tests (use-after-close, idempotent close).
Comments suppressed due to low confidence (1)

extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/LlmModelRunner.java:102

  • LlmModule.generate(...) now throws on failure, but the MESSAGE_GENERATE branch doesn't catch exceptions. An exception here will crash the HandlerThread and can prevent onGenerationStopped() from running / results from being recorded. Wrap the generate call in try/catch (ideally extracting an error code from ExecutorchRuntimeException) and route the failure to the callback (e.g., call onGenerationStopped() and/or add an explicit error callback).
      mLlmModelRunner.mCallback.onModelLoaded(status);
    } else if (msg.what == MESSAGE_GENERATE) {
      mLlmModelRunner.mModule.generate((String) msg.obj, mLlmModelRunner);
      mLlmModelRunner.mCallback.onGenerationStopped();

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +54 to +58
@Test
@Throws(IOException::class, URISyntaxException::class)
fun testMethodMetadata() {
val module = Module.load(getTestFilePath(TEST_FILE_NAME))
module.destroy()
Comment on lines +322 to +331
mLock.lock();
try {
checkNotDestroyed();
int err = generateNative(prompt, seqLen, llmCallback, echo, temperature, numBos, numEos);
if (err != 0) {
throw ExecutorchRuntimeException.makeExecutorchException(err, "Failed to generate");
}
} finally {
mLock.unlock();
}
@meta-codesync
Copy link
Copy Markdown
Contributor

meta-codesync Bot commented Apr 30, 2026

@psiddh has imported this pull request. If you are a Meta employee, you can view this in D103233465.

…t coverage

Combines all previously reverted Android improvement PRs (pytorch#18669,
pytorch#19012, pytorch#19028, pytorch#19092, pytorch#19099, pytorch#19124) plus new Module lifecycle
tests into a single atomic change.

Error reporting: all sync errors throw exceptions instead of
returning status codes or logging silently. Module.loadMethod()
throws ExecutorchRuntimeException on failure. LlmModule.generate()
and load() throw on error. Native methods renamed with "Native"
suffix; public wrappers check status and throw.

Lifecycle: Module, LlmModule, and TrainingModule implement
Closeable for try-with-resources. LlmModule adds ReentrantLock
to serialize access to non-thread-safe native state. stop()
remains lock-free (C++ atomic flag). TrainingModule replaces
Log.e silent failures with IllegalStateException.

Error consistency: ExecuTorchRuntime.validateFilePath throws
IllegalArgumentException. SGD throws IllegalStateException.
AsrModule throws ExecutorchRuntimeException. Cause-chaining
constructor added to ExecutorchRuntimeException.

JNI safety: Module and LlmModule constructors wrapped in
try-catch to surface native initialization failures.
LlmModule.load() uses throwExecutorchException with diagnostic
detail. All @DoNotStrip annotations preserved on JNI-called
methods; new @DoNotStrip added to renamed private native methods
(generateNative, resetContextNative, loadNative).

Tests: fix 4 @ignored Module tests by providing required input
tensor. Add 13 new lifecycle/API coverage tests. Add LlmModule
use-after-close and idempotent-close tests. Fix ModelRunner crash
on loadMethod failure.

This PR was authored with the help of Claude.
@psiddh psiddh force-pushed the android-combined-v2 branch from a44e428 to 04727f0 Compare April 30, 2026 18:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants