feat: Extension installation module by csmith49 · Pull Request #2811 · OpenHands/software-agent-sdk

csmith49 · 2026-04-13T14:58:35Z

A human has tested these changes.

Why

Skills and plugins have repeated but slightly-divergent implementations of installing/loading. This PR attempts to unify both with one mechanism that can be reused for other extensions in the future.

This PR is a follow-up to #2801 and should be merged after.

Summary

New openhands.sdk.extensions.installation module that provides the interface for installing/loading extensions
Tests for the above
Updates to openhands.sdk.plugin.installed and openhands.sdk.skills.installed to use the new machinery while exposing the old interface

Breaking Changes

InstalledPluginInfo.install_path / InstalledSkillInfo.install_path type changed from str to Path.
Downstream code doing string operations on info.install_path (e.g. info.install_path.startswith(...)) will need to use str(info.install_path) or switch to Path APIs. Path is the better type here and existing tests already use it correctly (e.g. install_path / "skills").
InstalledPluginsMetadata and InstalledSkillsMetadata removed from public exports.
These were internal implementation details. The user-facing functions and InstalledPluginInfo / InstalledSkillInfo type aliases remain exported.
Skill-specific fields dropped from InstalledSkillInfo.
The old InstalledSkillInfo had license, compatibility, metadata, allowed_tools fields. These are gone since it is now aliased to InstallationInfo. These fields were never accessed in production code.

How to Test

Unit/integration tests all use temp files to actually stress the saving/loading.

Type

Agent Server images for this PR

• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server

Variants & Base Images

Variant	Architectures	Base Image	Docs / Tags
java	amd64, arm64	`eclipse-temurin:17-jdk`	Link
python	amd64, arm64	`nikolaik/python-nodejs:python3.13-nodejs22-slim`	Link
golang	amd64, arm64	`golang:1.21-bookworm`	Link

Pull (multi-arch manifest)

# Each variant is a multi-arch manifest supporting both amd64 and arm64
docker pull ghcr.io/openhands/agent-server:907c4b8-python

Run

docker run -it --rm \
  -p 8000:8000 \
  --name agent-server-907c4b8-python \
  ghcr.io/openhands/agent-server:907c4b8-python

All tags pushed for this build

ghcr.io/openhands/agent-server:907c4b8-golang-amd64
ghcr.io/openhands/agent-server:907c4b8-golang_tag_1.21-bookworm-amd64
ghcr.io/openhands/agent-server:907c4b8-golang-arm64
ghcr.io/openhands/agent-server:907c4b8-golang_tag_1.21-bookworm-arm64
ghcr.io/openhands/agent-server:907c4b8-java-amd64
ghcr.io/openhands/agent-server:907c4b8-eclipse-temurin_tag_17-jdk-amd64
ghcr.io/openhands/agent-server:907c4b8-java-arm64
ghcr.io/openhands/agent-server:907c4b8-eclipse-temurin_tag_17-jdk-arm64
ghcr.io/openhands/agent-server:907c4b8-python-amd64
ghcr.io/openhands/agent-server:907c4b8-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-amd64
ghcr.io/openhands/agent-server:907c4b8-python-arm64
ghcr.io/openhands/agent-server:907c4b8-nikolaik_s_python-nodejs_tag_python3.13-nodejs22-slim-arm64
ghcr.io/openhands/agent-server:907c4b8-golang
ghcr.io/openhands/agent-server:907c4b8-java
ghcr.io/openhands/agent-server:907c4b8-python

About Multi-Architecture Support

Each variant tag (e.g., 907c4b8-python) is a multi-arch manifest supporting both amd64 and arm64
Docker automatically pulls the correct architecture for your platform
Individual architecture tags (e.g., 907c4b8-python-amd64) are also available if needed

Cover parse_extension_source, get_cache_path, fetch, fetch_with_resolution, SourceType enum, and repo_path handling for the new shared extensions module. Co-authored-by: openhands <openhands@all-hands.dev>

…h.py - Add class docstring to SourceType enum describing each variant. - Update parse_extension_source docstring: fix arg description, return types, and example function names. - Fix _apply_subpath docstring example from 'plugin repository' to 'extension repository'. - Fix get_cache_path docstring references from plugin to extension. - Rename plugin_path variable to ext_path in fetch_with_resolution. Co-authored-by: openhands <openhands@all-hands.dev>

Both modules now delegate all fetch logic to the shared extensions.fetch module while preserving their public interfaces: - plugin/fetch.py: parse_plugin_source, get_cache_path, fetch_plugin, fetch_plugin_with_resolution, PluginFetchError, DEFAULT_CACHE_DIR - skills/fetch.py: fetch_skill, fetch_skill_with_resolution, SkillFetchError, DEFAULT_CACHE_DIR ExtensionFetchError is caught and re-raised as the domain-specific error type with 'extension' replaced by 'plugin'/'skill' in messages. Co-authored-by: openhands <openhands@all-hands.dev>

- Remove parse_plugin_source and get_cache_path from plugin/fetch.py; callers should use extensions.fetch directly. - Slim test_plugin_fetch.py from 1038 to 100 lines: keep only PluginFetchError wrapping, DEFAULT_CACHE_DIR, and Plugin.fetch() tests. - Move git infrastructure tests (clone, update, checkout, locking, GitHelper errors, get_default_branch) to tests/sdk/git/test_cached_repo.py. - Update test_installed_plugins.py to import from extensions.fetch. Co-authored-by: openhands <openhands@all-hands.dev>

The except block was discarding the ExtensionFetchError message and replacing it with a generic 'Failed to fetch plugin'. Now it carries the original message through with 'extension' replaced by 'plugin'. Co-authored-by: openhands <openhands@all-hands.dev>

enyst · 2026-04-23T19:25:31Z

@OpenHands /codereview this PR, and tell us directly on github: WDYT?

openhands-ai · 2026-04-23T19:25:46Z

I'm on it! enyst can track my progress at all-hands.dev

enyst

🟡 Acceptable - the refactor itself looks sound after the latest fixes. I walked the new installation/metadata path, compared the wrapper behavior against the old plugin/skill implementations, and ran uv run pytest tests/sdk/extensions/installation/test_installation_info.py tests/sdk/extensions/installation/test_installation_manager.py tests/sdk/extensions/installation/test_installation_metadata.py tests/sdk/extensions/installation/test_installation_utils.py tests/sdk/plugin/test_installed_plugins.py tests/sdk/skills/test_installed_skills.py -q locally (80 passed). I don't currently see a code-level blocker in the diff.

[TESTING GAPS]

[PR description] No Evidence: the PR template explicitly asks for end-to-end runtime evidence, but the body still only says the unit/integration tests use temp files. Please add an Evidence section with the exact real install/load command(s) you ran and the resulting output. If this PR is also claiming to address the legacy metadata issue discussed in review, include a short before/after reproduction for that path too.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟡 MEDIUM
Breaking SDK surface plus install/load refactor, so compatibility matters. The current implementation and tests look reasonably solid; the remaining concern is missing runtime proof in the PR description rather than the code itself.

VERDICT:
✅ Worth merging once the missing runtime evidence is added

KEY INSIGHT:
The risky part here is API/persistence compatibility, and the current code/tests cover that path reasonably well; the remaining gap is proof in the PR description, not the implementation.

This review was generated by an AI assistant (OpenHands) on behalf of the user.

openhands-ai · 2026-04-23T19:30:33Z

Here’s the final summary of my work since the last summary:

I reviewed PR feat: Extension installation module #2811 in context by checking the PR metadata, linked issue, issue comments, review threads, and the current diff.
I inspected the refactored extension-installation code and compared the current plugin/skill wrapper behavior against the previous implementation.
I ran the targeted validation suite:
- uv run pytest tests/sdk/extensions/installation/test_installation_info.py tests/sdk/extensions/installation/test_installation_manager.py tests/sdk/extensions/installation/test_installation_metadata.py tests/sdk/extensions/installation/test_installation_utils.py tests/sdk/plugin/test_installed_plugins.py tests/sdk/skills/test_installed_skills.py -q
- Result: 80 passed
I posted a GitHub PR review comment here:
- feat: Extension installation module #2811 (review)

Status checklist:

Reviewed the current PR state and addressed the request directly on GitHub
Validated the relevant test coverage locally
Gave a clear verdict on the PR
Followed the instruction not to change code unless necessary
Included AI disclosure in the GitHub-posted review
Did not make any repository changes, commits, or pushes

Conciseness:

No code changes were made, so there are no extraneous edits to revert
The only new artifact was the GitHub review comment requested by the user

Bottom line:

The request was fully addressed. I reviewed the PR and posted my assessment on GitHub.
My conclusion was that the code looks acceptable after the latest fixes, but the PR description still lacks the explicit runtime evidence requested by the PR template, so I left a COMMENT review rather than approving.

all-hands-bot

🟡 Acceptable — Clean refactor with solid architecture, but requires human review + evals before merge per repo policy.

[EVAL RISK]

This refactors plugin/skill loading infrastructure
Per repo guidelines: plugin/skill loading changes require human review + lightweight evals
Recommendation: Have a human maintainer run evals to verify no behavior regression before merge

[ARCHITECTURAL ASSESSMENT]
✅ Good taste: The generic InstallationManager[T] abstraction elegantly eliminates ~650 lines of duplicated code. The protocol-based interface with type parameters is exactly the right level of abstraction.

✅ Data Structures: Using Pydantic models for metadata, Path for install_path (improvement over str), and context manager for metadata sessions - all solid choices.

✅ Testing: Comprehensive coverage with real disk I/O testing the full persistence round-trip. Not just mocks.

✅ Backward Compatibility: Legacy metadata migration properly handles old {"plugins": ...} and {"skills": ...} formats.

[BREAKING CHANGES]
✅ Properly Documented: Breaking changes are reasonable API improvements (str→Path, removed internal metadata classes).

[RISK ASSESSMENT]
⚠️ Risk Assessment: 🟡 MEDIUM — Infrastructure refactor needs eval verification to ensure no behavior change.

VERDICT:
✅ Code quality is excellent, architecture is sound
❌ Cannot approve without evals per repo policy

KEY INSIGHT:
Well-executed refactor that successfully unifies plugin/skill installation through clean abstraction, but the infrastructure nature requires eval verification before merge.

all-hands-bot

✅ QA Report: PASS

Functionally verified the extension installation module unification. All existing plugin and skill installation APIs work correctly through the new generic framework.

Does this PR achieve its stated goal?

Yes. The PR successfully unifies plugin and skill installation into a single generic openhands.sdk.extensions.installation module. The refactor consolidates previously duplicated code while maintaining full backward compatibility for existing public APIs. All documented breaking changes are verified and intentional.

Phase	Result
Environment Setup	✅ All dependencies installed, project builds successfully
CI Status	✅ All core checks passing (sdk-tests, tools-tests, workspace-tests, agent-server-tests, pre-commit, coverage-report)
Functional Verification	✅ All installation operations verified end-to-end

Functional Verification

Test 1: Plugin Installation API (Backward Compatibility)

Verified the existing openhands.sdk.plugin.installed API continues to work:

Commands executed:

from openhands.sdk.plugin.installed import install_plugin, list_installed_plugins, uninstall_plugin, enable_plugin, disable_plugin, get_installed_plugin

# Install plugin from local path
info = install_plugin(str(test_plugin_source), installed_dir=installed_dir)

# Verify install_path is Path (breaking change)
assert isinstance(info.install_path, Path)  # ✓ PASS

# List, enable/disable, get, uninstall all work
plugins = list_installed_plugins(installed_dir=installed_dir)  # ✓ Returns 1 plugin
disable_plugin("test-plugin", installed_dir=installed_dir)  # ✓ Success
enable_plugin("test-plugin", installed_dir=installed_dir)  # ✓ Success
uninstall_plugin("test-plugin-2", installed_dir=installed_dir)  # ✓ Success

Result: All plugin installation operations work correctly. The breaking change (install_path type changed from str to Path) is verified.

Test 2: Skill Installation API (Backward Compatibility)

Verified the existing openhands.sdk.skills.installed API continues to work:

Commands executed:

from openhands.sdk.skills.installed import install_skill, list_installed_skills, load_installed_skills, disable_skill, enable_skill, uninstall_skill

# Install skill from local path
info = install_skill(str(test_skill_source), installed_dir=installed_dir)

# Verify install_path is Path (breaking change)
assert isinstance(info.install_path, Path)  # ✓ PASS

# Load installed skills (only enabled ones)
skills = load_installed_skills(installed_dir=installed_dir)  # ✓ Returns 1 skill

# Disable/enable filtering works
disable_skill("test-skill", installed_dir=installed_dir)  # ✓ Success
load_installed_skills(installed_dir=installed_dir)  # ✓ Returns 0 (disabled excluded)
enable_skill("test-skill", installed_dir=installed_dir)  # ✓ Success
load_installed_skills(installed_dir=installed_dir)  # ✓ Returns 1 (enabled included)

Result: All skill installation operations work correctly. The breaking change (install_path type changed from str to Path) is verified.

Test 3: Generic InstallationManager (New Unified Framework)

Verified the new generic framework works with custom extension types:

Commands executed:

from openhands.sdk.extensions.installation import InstallationManager, InstallationInterface
from pydantic import BaseModel

# Define custom extension type
class CustomExtension(BaseModel):
    name: str
    version: str
    description: str
    custom_field: str = "custom_value"

# Define loader
class CustomExtensionInterface(InstallationInterface[CustomExtension]):
    @staticmethod
    def load_from_dir(extension_dir: Path) -> CustomExtension:
        return CustomExtension.model_validate_json((extension_dir / "extension.json").read_text())

# Create manager
manager = InstallationManager(
    installation_dir=installed_dir,
    installation_interface=CustomExtensionInterface(),
)

# All operations work with custom type
info = manager.install(str(ext_source))  # ✓ Installed
extensions = manager.load_installed()  # ✓ Returns CustomExtension instances
assert extensions[0].custom_field == "test_value"  # ✓ Custom fields preserved

Result: The generic InstallationManager successfully works with custom extension types, proving the unification is complete and extensible.

Test 4: Metadata Migration (Legacy Compatibility)

Verified legacy metadata formats are migrated automatically:

Commands executed:

from openhands.sdk.extensions.installation import InstallationMetadata
import json

# Write old "plugins" format
legacy_plugins = {
    "plugins": {
        "my-plugin": {"name": "my-plugin", "version": "1.0.0", ...}
    }
}
metadata_file.write_text(json.dumps(legacy_plugins))

# Load - should migrate to "extensions"
metadata = InstallationMetadata.load_from_dir(installed_dir)
assert "my-plugin" in metadata.extensions  # ✓ PASS

# Test "skills" key migration
legacy_skills = {"skills": {"my-skill": {...}}}
# ✓ Also migrates to "extensions"

# Test conflict resolution (extensions wins)
conflict = {
    "extensions": {"shared": {"version": "3.0.0", ...}},
    "plugins": {"shared": {"version": "1.0.0", ...}}
}
# ✓ extensions key takes precedence

Result: Legacy plugins and skills metadata keys are automatically migrated to the new extensions key. Conflicts are resolved correctly (explicit extensions wins).

Test 5: Self-Healing Metadata

Verified metadata automatically syncs with filesystem:

Commands executed:

from openhands.sdk.plugin.installed import list_installed_plugins

# Manually create plugin directory without metadata entry
untracked_plugin = installed_dir / "untracked-plugin"
create_test_plugin(untracked_plugin, "untracked-plugin")

# list_installed_plugins discovers it
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "untracked-plugin" in [p.name for p in plugins]  # ✓ Discovered
assert plugins[0].source == "local"  # ✓ Correct default source

# Add stale metadata entry (directory doesn't exist)
metadata["extensions"]["deleted-plugin"] = {...}

# list_installed_plugins removes stale entry
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "deleted-plugin" not in [p.name for p in plugins]  # ✓ Cleaned up

# Invalid directory names are skipped
invalid_dir = installed_dir / "Invalid_Name"
invalid_dir.mkdir()
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "Invalid_Name" not in [p.name for p in plugins]  # ✓ Skipped

Result: Metadata automatically discovers untracked extension directories, removes stale entries for deleted directories, and skips invalid directory names.

Issues Found

None.

github-actions · 2026-04-24T20:21:00Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2026-04-24 20:36:05 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	27.6s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	21.5s	$0.03
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	12.7s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	27.9s	$0.02
01_standalone_sdk/09_pause_example.py	✅ PASS	12.7s	$0.01
01_standalone_sdk/10_persistence.py	✅ PASS	34.4s	$0.03
01_standalone_sdk/11_async.py	✅ PASS	30.0s	$0.04
01_standalone_sdk/12_custom_secrets.py	✅ PASS	10.6s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	50.6s	$0.05
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 20s	$0.17
01_standalone_sdk/17_image_input.py	✅ PASS	17.1s	$0.01
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	17.2s	$0.02
01_standalone_sdk/19_llm_routing.py	❌ FAIL Exit code 1	4m 21s	--
01_standalone_sdk/20_stuck_detector.py	✅ PASS	15.6s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	10.2s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	22.5s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	1m 7s	$0.01
01_standalone_sdk/24_planning_agent_workflow.py	✅ PASS	3m 23s	$0.23
01_standalone_sdk/25_agent_delegation.py	✅ PASS	51.7s	$0.06
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	27.6s	$0.06
01_standalone_sdk/28_ask_agent_example.py	✅ PASS	35.8s	$0.04
01_standalone_sdk/29_llm_streaming.py	✅ PASS	40.0s	$0.04
01_standalone_sdk/30_tom_agent.py	✅ PASS	9.7s	$0.01
01_standalone_sdk/31_iterative_refinement.py	✅ PASS	3m 47s	$0.26
01_standalone_sdk/32_configurable_security_policy.py	✅ PASS	22.2s	$0.02
01_standalone_sdk/34_critic_example.py	❌ FAIL Timed out after 600 seconds	10m 0s	--
01_standalone_sdk/36_event_json_to_openai_messages.py	✅ PASS	11.7s	$0.01
01_standalone_sdk/37_llm_profile_store/main.py	✅ PASS	6.8s	$0.00
01_standalone_sdk/38_browser_session_recording.py	✅ PASS	36.9s	$0.03
01_standalone_sdk/39_llm_fallback.py	✅ PASS	10.7s	$0.01
01_standalone_sdk/40_acp_agent_example.py	✅ PASS	30.2s	$0.12
01_standalone_sdk/41_task_tool_set.py	✅ PASS	29.7s	$0.03
01_standalone_sdk/42_file_based_subagents.py	✅ PASS	1m 41s	$0.09
01_standalone_sdk/43_mixed_marketplace_skills/main.py	✅ PASS	6.9s	$0.00
01_standalone_sdk/44_model_switching_in_convo.py	✅ PASS	9.0s	$0.01
01_standalone_sdk/45_parallel_tool_execution.py	✅ PASS	3m 11s	$0.42
01_standalone_sdk/46_agent_settings.py	✅ PASS	10.8s	$0.01
01_standalone_sdk/47_defense_in_depth_security.py	✅ PASS	3.6s	$0.00
01_standalone_sdk/48_conversation_fork.py	✅ PASS	15.0s	$0.00
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	36.0s	$0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 50s	$0.09
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	49.3s	$0.05
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	✅ PASS	1m 42s	$0.05
02_remote_agent_server/07_convo_with_cloud_workspace.py	✅ PASS	30.4s	$0.03
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py	✅ PASS	3m 40s	$0.03
02_remote_agent_server/09_acp_agent_with_remote_runtime.py	✅ PASS	51.9s	$0.14
02_remote_agent_server/10_cloud_workspace_share_credentials.py	✅ PASS	33.2s	$0.05
02_remote_agent_server/11_conversation_fork.py	✅ PASS	45.5s	$0.00
04_llm_specific_tools/01_gpt5_apply_patch_preset.py	✅ PASS	30.1s	$0.03
04_llm_specific_tools/02_gemini_file_tools.py	✅ PASS	44.2s	$0.06
05_skills_and_plugins/01_loading_agentskills/main.py	✅ PASS	19.1s	$0.02
05_skills_and_plugins/02_loading_plugins/main.py	❌ FAIL Exit code 1	20.4s	--

❌ Some tests failed

Total: 52 | Passed: 49 | Failed: 3 | Total Cost: $2.52

Failed examples:

examples/01_standalone_sdk/19_llm_routing.py: Exit code 1
examples/01_standalone_sdk/34_critic_example.py: Timed out after 600 seconds
examples/05_skills_and_plugins/02_loading_plugins/main.py: Exit code 1

View full workflow run

The installation metadata format was unified under "extensions" but the loading_plugins example still referenced the old "plugins" key. Co-authored-by: openhands <openhands@all-hands.dev>

github-actions · 2026-04-27T16:05:51Z

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

Generated: 2026-04-27 16:21:58 UTC

Example	Status	Duration	Cost
01_standalone_sdk/02_custom_tools.py	✅ PASS	24.1s	$0.03
01_standalone_sdk/03_activate_skill.py	✅ PASS	25.3s	$0.03
01_standalone_sdk/05_use_llm_registry.py	✅ PASS	12.3s	$0.01
01_standalone_sdk/07_mcp_integration.py	✅ PASS	30.4s	$0.03
01_standalone_sdk/09_pause_example.py	✅ PASS	17.7s	$0.02
01_standalone_sdk/10_persistence.py	✅ PASS	35.8s	$0.03
01_standalone_sdk/11_async.py	✅ PASS	34.8s	$0.04
01_standalone_sdk/12_custom_secrets.py	✅ PASS	9.9s	$0.01
01_standalone_sdk/13_get_llm_metrics.py	✅ PASS	34.3s	$0.03
01_standalone_sdk/14_context_condenser.py	✅ PASS	2m 24s	$0.15
01_standalone_sdk/17_image_input.py	✅ PASS	17.5s	$0.01
01_standalone_sdk/18_send_message_while_processing.py	✅ PASS	20.9s	$0.02
01_standalone_sdk/19_llm_routing.py	✅ PASS	18.7s	$0.02
01_standalone_sdk/20_stuck_detector.py	✅ PASS	15.3s	$0.02
01_standalone_sdk/21_generate_extraneous_conversation_costs.py	✅ PASS	10.4s	$0.00
01_standalone_sdk/22_anthropic_thinking.py	✅ PASS	14.7s	$0.01
01_standalone_sdk/23_responses_reasoning.py	✅ PASS	2m 32s	$0.02
01_standalone_sdk/24_planning_agent_workflow.py	❌ FAIL Timed out after 600 seconds	10m 0s	--
01_standalone_sdk/25_agent_delegation.py	✅ PASS	1m 5s	$0.08
01_standalone_sdk/26_custom_visualizer.py	✅ PASS	19.3s	$0.02
01_standalone_sdk/28_ask_agent_example.py	✅ PASS	32.1s	$0.03
01_standalone_sdk/29_llm_streaming.py	✅ PASS	34.0s	$0.03
01_standalone_sdk/30_tom_agent.py	✅ PASS	9.1s	$0.01
01_standalone_sdk/31_iterative_refinement.py	✅ PASS	4m 3s	$0.31
01_standalone_sdk/32_configurable_security_policy.py	✅ PASS	21.8s	$0.02
01_standalone_sdk/34_critic_example.py	✅ PASS	2m 11s	$0.17
01_standalone_sdk/36_event_json_to_openai_messages.py	✅ PASS	11.3s	$0.01
01_standalone_sdk/37_llm_profile_store/main.py	✅ PASS	4.0s	$0.00
01_standalone_sdk/38_browser_session_recording.py	✅ PASS	28.8s	$0.04
01_standalone_sdk/39_llm_fallback.py	✅ PASS	11.4s	$0.01
01_standalone_sdk/40_acp_agent_example.py	✅ PASS	26.4s	$0.12
01_standalone_sdk/41_task_tool_set.py	✅ PASS	27.0s	$0.03
01_standalone_sdk/42_file_based_subagents.py	✅ PASS	1m 25s	$0.09
01_standalone_sdk/43_mixed_marketplace_skills/main.py	✅ PASS	3.3s	$0.00
01_standalone_sdk/44_model_switching_in_convo.py	✅ PASS	8.8s	$0.01
01_standalone_sdk/45_parallel_tool_execution.py	❌ FAIL Timed out after 600 seconds	10m 0s	--
01_standalone_sdk/46_agent_settings.py	✅ PASS	11.4s	$0.01
01_standalone_sdk/47_defense_in_depth_security.py	✅ PASS	3.4s	$0.00
01_standalone_sdk/48_conversation_fork.py	✅ PASS	17.6s	$0.01
02_remote_agent_server/01_convo_with_local_agent_server.py	✅ PASS	31.7s	$0.03
02_remote_agent_server/02_convo_with_docker_sandboxed_server.py	✅ PASS	1m 45s	$0.06
02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py	✅ PASS	1m 29s	$0.08
02_remote_agent_server/04_convo_with_api_sandboxed_server.py	❌ FAIL Exit code 1	1m 40s	--
02_remote_agent_server/07_convo_with_cloud_workspace.py	✅ PASS	39.2s	$0.04
02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py	✅ PASS	3m 11s	$0.02
02_remote_agent_server/09_acp_agent_with_remote_runtime.py	✅ PASS	3m 20s	$0.12
02_remote_agent_server/10_cloud_workspace_share_credentials.py	✅ PASS	30.1s	$0.05
02_remote_agent_server/11_conversation_fork.py	✅ PASS	35.4s	$0.00
04_llm_specific_tools/01_gpt5_apply_patch_preset.py	✅ PASS	23.5s	$0.02
04_llm_specific_tools/02_gemini_file_tools.py	✅ PASS	38.4s	$0.07
05_skills_and_plugins/01_loading_agentskills/main.py	✅ PASS	14.4s	$0.02
05_skills_and_plugins/02_loading_plugins/main.py	✅ PASS	16.3s	$0.02

❌ Some tests failed

Total: 52 | Passed: 49 | Failed: 3 | Total Cost: $1.97

Failed examples:

examples/01_standalone_sdk/24_planning_agent_workflow.py: Timed out after 600 seconds
examples/01_standalone_sdk/45_parallel_tool_execution.py: Timed out after 600 seconds
examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1

View full workflow run

csmith49 · 2026-04-27T16:27:13Z

Looks like the only blocker from the (numerous) AI reviews is evidence that this works end-to-end. I've evaluated these changes locally while developing, and the relevant example is now successful. I'd encourage one of the potential reviewers to check out this branch and test as well.

There are also several failing examples -- to the best of my knowledge, these are unrelated to the changes in this branch. They're due to timeouts and other stochastic failure modes.

all-hands-bot

✅ QA Report: PASS

All functional verification passed. The PR successfully unifies plugin and skill installation into a generic extension mechanism while maintaining backward compatibility.

Does this PR achieve its stated goal?

Yes. The PR set out to unify plugin and skill installation/loading mechanisms into a single reusable framework, and it delivers on that goal. The new openhands.sdk.extensions.installation module provides a generic InstallationManager that works with any extension type (plugins, skills, or custom types), and both the plugin and skill APIs have been successfully refactored to delegate to this unified machinery. All existing functionality works, backward compatibility is maintained (except for documented breaking changes), and legacy metadata files are automatically migrated.

Phase	Result
Environment Setup	✅ Build succeeded, all dependencies installed
CI Status	✅ All 30 checks passing
Functional Verification	✅ All 5 verification tests passed

Functional Verification

Test 1: Generic InstallationManager with Custom Extension Type

Objective: Verify the new generic InstallationManager works with arbitrary extension types.

Setup: Created a custom Widget extension type with its own loader interface.

Ran:

uv run python /tmp/test_generic_installation.py

Result:

[1] Installing widget from local path...
✓ Installed: my-widget v1.0.0
  Source: /tmp/tmpu3ioolfw/source/my-widget
  Install path: /tmp/tmpu3ioolfw/installed/my-widget
  Enabled: True

[2] Listing installed extensions...
✓ Found 1 extension(s)
  - my-widget v1.0.0 (enabled=True)

[3] Loading installed extensions...
✓ Loaded 1 widget(s)
  - my-widget: A test widget

[4] Disabling extension...
✓ Disable succeeded
  Loaded after disable: 0 widget(s)

[5] Re-enabling extension...
✓ Enable succeeded
  Loaded after enable: 1 widget(s)

[6] Getting specific extension info...
✓ Found extension: my-widget
  Version: 1.0.0

[7] Uninstalling extension...
✓ Uninstall succeeded
  Remaining extensions: 0

Interpretation: The generic InstallationManager successfully handles a custom extension type, demonstrating the PR's goal of creating a reusable installation framework.

Test 2: Plugin Installation API Backward Compatibility

Objective: Verify existing plugin installation code continues to work after the refactor.

Ran:

uv run python /tmp/test_plugin_backward_compat.py

Result:

[1] Installing plugin...
✓ Installed: test-plugin v1.0.0
  Type of install_path: PosixPath
  install_path value: /tmp/tmpiwmz3whs/installed/test-plugin
  ✓ Confirmed: install_path is now Path (breaking change documented)

[2] Listing installed plugins...
✓ Found 1 plugin(s)
  - test-plugin v1.0.0

[3] Loading installed plugins...
✓ Loaded 1 plugin(s)
  - test-plugin: Test plugin

[4] Disabling plugin...
✓ Disable succeeded

[5] Re-enabling plugin...
✓ Enable succeeded

[6] Getting plugin info...
✓ Found plugin: test-plugin

[7] Uninstalling plugin...
✓ Uninstall succeeded

Interpretation: All plugin installation APIs (install_plugin, list_installed_plugins, load_installed_plugins, enable_plugin, disable_plugin, get_installed_plugin, uninstall_plugin) work correctly. The documented breaking change (install_path is now Path instead of str) is confirmed.

Test 3: Skill Installation API Backward Compatibility

Objective: Verify existing skill installation code continues to work after the refactor.

Ran:

uv run python /tmp/test_skill_backward_compat.py

Result:

[1] Installing skill...
✓ Installed: test-skill
  Type of install_path: PosixPath
  ✓ Confirmed: install_path is now Path (breaking change documented)

[2] Listing installed skills...
✓ Found 1 skill(s)
  - test-skill

[3] Loading installed skills...
✓ Loaded 1 skill(s)

[4] Disabling skill...
✓ Disable succeeded

[5] Re-enabling skill...
✓ Enable succeeded

[6] Getting skill info...
✓ Found skill: test-skill

[7] Uninstalling skill...
✓ Uninstall succeeded

Interpretation: All skill installation APIs work correctly with the same functionality as before the refactor.

Test 4: Updated Plugin Loading Example

Objective: Verify the example code updated in the PR runs successfully.

Ran:

uv run python examples/05_skills_and_plugins/02_loading_plugins/main.py

Result:

DEMO 5: Disabling/Enabling plugins
============================================================

After disable
-------------
Tracked plugins:
  - code-quality (enabled=False)
  - skills-237e139ce4d65e8f (enabled=True)

After re-enable
---------------
Tracked plugins:
  - code-quality (enabled=True)
  - skills-237e139ce4d65e8f (enabled=True)

Metadata file:
{
  "extensions": {
    "code-quality": { ... },
    "skills-237e139ce4d65e8f": { ... }
  }
}

============================================================
EXAMPLE COMPLETED SUCCESSFULLY
============================================================
EXAMPLE_COST: 0.0631

Interpretation: The example runs successfully end-to-end. The PR's change to the example (updating assertions from metadata["plugins"] to metadata["extensions"]) is correct and the metadata file now uses the new "extensions" key.

Test 5: Legacy Metadata Migration

Objective: Verify old metadata files with "plugins" or "skills" keys are automatically migrated to the new "extensions" format.

Ran:

uv run python /tmp/test_metadata_migration.py

Result:

[1] Testing migration of legacy 'plugins' key...
WARNING: Migrating legacy 'plugins' key to 'extensions'
✓ Legacy 'plugins' key successfully migrated to 'extensions'

[2] Testing migration of legacy 'skills' key...
WARNING: Migrating legacy 'skills' key to 'extensions'
✓ Legacy 'skills' key successfully migrated to 'extensions'

[3] Testing migration with both legacy keys...
✓ Both 'plugins' and 'skills' keys migrated to 'extensions'

[4] Testing load from file with legacy key...
✓ Legacy file successfully loaded and migrated
✓ Saved file uses new 'extensions' format

Interpretation: The PR includes automatic migration logic that converts old metadata formats to the new unified format. Existing users' installed plugins and skills will be automatically upgraded when the metadata is loaded.

Issues Found

None.

all-hands-bot

🟡 Acceptable — Clean refactor eliminating 655 lines of duplication, but has critical data loss issue in legacy migration and several important correctness/security gaps that should be addressed before merge.

Eval Risk Flag: This PR refactors plugin/skill loading infrastructure. Per repo guidelines, changes to plugin/skill loading require human review + lightweight evals before merge to verify no behavior regression.

VascoSch92 · 2026-04-27T17:59:19Z

It seems good for me...

- Resolve installation_dir to absolute path in InstallationManager.__post_init__ to prevent breakage if working directory changes - Add parametrized path traversal tests for validate_extension_name - Add test for legacy key conflict when both plugins and skills share a name Co-authored-by: openhands <openhands@all-hands.dev>

csmith49 and others added 30 commits April 10, 2026 09:49

extensions module, entry point stubs

a71198a

full fetch logic

4858e59

Port fetch tests to tests/sdk/extensions/test_fetch.py

dae2ddc

Cover parse_extension_source, get_cache_path, fetch, fetch_with_resolution, SourceType enum, and repo_path handling for the new shared extensions module. Co-authored-by: openhands <openhands@all-hands.dev>

minor

5187d87

Merge branch 'main' into feat/extensions-utils

6679478

cleaning up error messages

6460535

initial installation manager api

717c854

type strengthening

ee9fe39

relaxing type def

a469127

installation manager docs first pass, default metadata added

9399e24

utils file, simplifying interface to install manger

a1973a7

update/get

6fd1e1e

metadata validation

0a3e154

re-org into module

005b30a

removing unused utils

c3911bc

genericized metadata

f83540a

interface instead of badly linked generics

c48c7cf

refining generic in manager

cb30b92

list/load

c9ff175

enable/disable

509dd24

install/uninstall

7786088

minor todos

1d52b89

readme and init file, initial

3db00dc

initial utils tests

add5d69

rename installation info

a813f19

rename extension protocol and installation interface

f398ea9

csmith49 added 2 commits April 23, 2026 09:47

Merge branch 'main' into feat/installed-extensions

c9b3dd1

Merge branch 'main' into feat/installed-extensions

319e57d

enyst reviewed Apr 23, 2026

View reviewed changes

xingyaoww requested a review from all-hands-bot April 24, 2026 20:04

all-hands-bot reviewed Apr 24, 2026

View reviewed changes

xingyaoww added the test-examples Run all applicable "examples/" files. Expensive operation. label Apr 24, 2026

Merge branch 'main' into feat/installed-extensions

ae9e801

all-hands-bot reviewed Apr 24, 2026

View reviewed changes

VascoSch92 mentioned this pull request Apr 26, 2026

pr-review action hangs ~15m when sub-agent review delegation is enabled OpenHands/extensions#208

Open

csmith49 and others added 2 commits April 27, 2026 09:08

Merge branch 'main' into feat/installed-extensions

b4b58c1

fix(examples): update metadata key from "plugins" to "extensions"

601baa2

The installation metadata format was unified under "extensions" but the loading_plugins example still referenced the old "plugins" key. Co-authored-by: openhands <openhands@all-hands.dev>

csmith49 added test-examples Run all applicable "examples/" files. Expensive operation. and removed test-examples Run all applicable "examples/" files. Expensive operation. labels Apr 27, 2026

Merge branch 'main' into feat/installed-extensions

adcea6e

csmith49 requested a review from all-hands-bot April 27, 2026 17:50

all-hands-bot reviewed Apr 27, 2026

View reviewed changes

VascoSch92 approved these changes Apr 27, 2026

View reviewed changes

csmith49 and others added 2 commits April 27, 2026 12:20

Merge branch 'main' into feat/installed-extensions

907c4b8

csmith49 merged commit 0719217 into main Apr 28, 2026
32 checks passed

csmith49 deleted the feat/installed-extensions branch April 28, 2026 14:00

csmith49 mentioned this pull request Apr 28, 2026

refactor: Centralize MCP utils in openhands.sdk.mcp #2848

Open

6 tasks

Conversation

csmith49 commented Apr 13, 2026 • edited by github-actions Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Breaking Changes

How to Test

Type

Uh oh!

enyst commented Apr 23, 2026

Uh oh!

openhands-ai Bot commented Apr 23, 2026

Uh oh!

enyst left a comment

Choose a reason for hiding this comment

Uh oh!

openhands-ai Bot commented Apr 23, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Plugin Installation API (Backward Compatibility)

Test 2: Skill Installation API (Backward Compatibility)

Test 3: Generic InstallationManager (New Unified Framework)

Test 4: Metadata Migration (Legacy Compatibility)

Test 5: Self-Healing Metadata

Issues Found

Uh oh!

github-actions Bot commented Apr 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

❌ Some tests failed

Uh oh!

github-actions Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

🔄 Running Examples with openhands/claude-haiku-4-5-20251001

❌ Some tests failed

Uh oh!

csmith49 commented Apr 27, 2026

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

✅ QA Report: PASS

Does this PR achieve its stated goal?

Test 1: Generic InstallationManager with Custom Extension Type

Test 2: Plugin Installation API Backward Compatibility

Test 3: Skill Installation API Backward Compatibility

Test 4: Updated Plugin Loading Example

Test 5: Legacy Metadata Migration

Issues Found

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

VascoSch92 commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

csmith49 commented Apr 13, 2026 •

edited by github-actions Bot

Loading

github-actions Bot commented Apr 24, 2026 •

edited

Loading

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`

github-actions Bot commented Apr 27, 2026 •

edited

Loading

🔄 Running Examples with `openhands/claude-haiku-4-5-20251001`