Conversation
Cover parse_extension_source, get_cache_path, fetch, fetch_with_resolution, SourceType enum, and repo_path handling for the new shared extensions module. Co-authored-by: openhands <openhands@all-hands.dev>
…h.py - Add class docstring to SourceType enum describing each variant. - Update parse_extension_source docstring: fix arg description, return types, and example function names. - Fix _apply_subpath docstring example from 'plugin repository' to 'extension repository'. - Fix get_cache_path docstring references from plugin to extension. - Rename plugin_path variable to ext_path in fetch_with_resolution. Co-authored-by: openhands <openhands@all-hands.dev>
Both modules now delegate all fetch logic to the shared extensions.fetch module while preserving their public interfaces: - plugin/fetch.py: parse_plugin_source, get_cache_path, fetch_plugin, fetch_plugin_with_resolution, PluginFetchError, DEFAULT_CACHE_DIR - skills/fetch.py: fetch_skill, fetch_skill_with_resolution, SkillFetchError, DEFAULT_CACHE_DIR ExtensionFetchError is caught and re-raised as the domain-specific error type with 'extension' replaced by 'plugin'/'skill' in messages. Co-authored-by: openhands <openhands@all-hands.dev>
- Remove parse_plugin_source and get_cache_path from plugin/fetch.py; callers should use extensions.fetch directly. - Slim test_plugin_fetch.py from 1038 to 100 lines: keep only PluginFetchError wrapping, DEFAULT_CACHE_DIR, and Plugin.fetch() tests. - Move git infrastructure tests (clone, update, checkout, locking, GitHelper errors, get_default_branch) to tests/sdk/git/test_cached_repo.py. - Update test_installed_plugins.py to import from extensions.fetch. Co-authored-by: openhands <openhands@all-hands.dev>
The except block was discarding the ExtensionFetchError message and replacing it with a generic 'Failed to fetch plugin'. Now it carries the original message through with 'extension' replaced by 'plugin'. Co-authored-by: openhands <openhands@all-hands.dev>
|
@OpenHands /codereview this PR, and tell us directly on github: WDYT? |
|
I'm on it! enyst can track my progress at all-hands.dev |
enyst
left a comment
There was a problem hiding this comment.
🟡 Acceptable - the refactor itself looks sound after the latest fixes. I walked the new installation/metadata path, compared the wrapper behavior against the old plugin/skill implementations, and ran uv run pytest tests/sdk/extensions/installation/test_installation_info.py tests/sdk/extensions/installation/test_installation_manager.py tests/sdk/extensions/installation/test_installation_metadata.py tests/sdk/extensions/installation/test_installation_utils.py tests/sdk/plugin/test_installed_plugins.py tests/sdk/skills/test_installed_skills.py -q locally (80 passed). I don't currently see a code-level blocker in the diff.
[TESTING GAPS]
- [PR description] No Evidence: the PR template explicitly asks for end-to-end runtime evidence, but the body still only says the unit/integration tests use temp files. Please add an
Evidencesection with the exact real install/load command(s) you ran and the resulting output. If this PR is also claiming to address the legacy metadata issue discussed in review, include a short before/after reproduction for that path too.
[RISK ASSESSMENT]
- [Overall PR]
⚠️ Risk Assessment: 🟡 MEDIUM
Breaking SDK surface plus install/load refactor, so compatibility matters. The current implementation and tests look reasonably solid; the remaining concern is missing runtime proof in the PR description rather than the code itself.
VERDICT:
✅ Worth merging once the missing runtime evidence is added
KEY INSIGHT:
The risky part here is API/persistence compatibility, and the current code/tests cover that path reasonably well; the remaining gap is proof in the PR description, not the implementation.
This review was generated by an AI assistant (OpenHands) on behalf of the user.
|
Here’s the final summary of my work since the last summary:
Status checklist:
Conciseness:
Bottom line:
|
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — Clean refactor with solid architecture, but requires human review + evals before merge per repo policy.
[EVAL RISK]
- This refactors plugin/skill loading infrastructure
- Per repo guidelines: plugin/skill loading changes require human review + lightweight evals
- Recommendation: Have a human maintainer run evals to verify no behavior regression before merge
[ARCHITECTURAL ASSESSMENT]
✅ Good taste: The generic InstallationManager[T] abstraction elegantly eliminates ~650 lines of duplicated code. The protocol-based interface with type parameters is exactly the right level of abstraction.
✅ Data Structures: Using Pydantic models for metadata, Path for install_path (improvement over str), and context manager for metadata sessions - all solid choices.
✅ Testing: Comprehensive coverage with real disk I/O testing the full persistence round-trip. Not just mocks.
✅ Backward Compatibility: Legacy metadata migration properly handles old {"plugins": ...} and {"skills": ...} formats.
[BREAKING CHANGES]
✅ Properly Documented: Breaking changes are reasonable API improvements (str→Path, removed internal metadata classes).
[RISK ASSESSMENT]
VERDICT:
✅ Code quality is excellent, architecture is sound
❌ Cannot approve without evals per repo policy
KEY INSIGHT:
Well-executed refactor that successfully unifies plugin/skill installation through clean abstraction, but the infrastructure nature requires eval verification before merge.
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
Functionally verified the extension installation module unification. All existing plugin and skill installation APIs work correctly through the new generic framework.
Does this PR achieve its stated goal?
Yes. The PR successfully unifies plugin and skill installation into a single generic openhands.sdk.extensions.installation module. The refactor consolidates previously duplicated code while maintaining full backward compatibility for existing public APIs. All documented breaking changes are verified and intentional.
| Phase | Result |
|---|---|
| Environment Setup | ✅ All dependencies installed, project builds successfully |
| CI Status | ✅ All core checks passing (sdk-tests, tools-tests, workspace-tests, agent-server-tests, pre-commit, coverage-report) |
| Functional Verification | ✅ All installation operations verified end-to-end |
Functional Verification
Test 1: Plugin Installation API (Backward Compatibility)
Verified the existing openhands.sdk.plugin.installed API continues to work:
Commands executed:
from openhands.sdk.plugin.installed import install_plugin, list_installed_plugins, uninstall_plugin, enable_plugin, disable_plugin, get_installed_plugin
# Install plugin from local path
info = install_plugin(str(test_plugin_source), installed_dir=installed_dir)
# Verify install_path is Path (breaking change)
assert isinstance(info.install_path, Path) # ✓ PASS
# List, enable/disable, get, uninstall all work
plugins = list_installed_plugins(installed_dir=installed_dir) # ✓ Returns 1 plugin
disable_plugin("test-plugin", installed_dir=installed_dir) # ✓ Success
enable_plugin("test-plugin", installed_dir=installed_dir) # ✓ Success
uninstall_plugin("test-plugin-2", installed_dir=installed_dir) # ✓ SuccessResult: All plugin installation operations work correctly. The breaking change (install_path type changed from str to Path) is verified.
Test 2: Skill Installation API (Backward Compatibility)
Verified the existing openhands.sdk.skills.installed API continues to work:
Commands executed:
from openhands.sdk.skills.installed import install_skill, list_installed_skills, load_installed_skills, disable_skill, enable_skill, uninstall_skill
# Install skill from local path
info = install_skill(str(test_skill_source), installed_dir=installed_dir)
# Verify install_path is Path (breaking change)
assert isinstance(info.install_path, Path) # ✓ PASS
# Load installed skills (only enabled ones)
skills = load_installed_skills(installed_dir=installed_dir) # ✓ Returns 1 skill
# Disable/enable filtering works
disable_skill("test-skill", installed_dir=installed_dir) # ✓ Success
load_installed_skills(installed_dir=installed_dir) # ✓ Returns 0 (disabled excluded)
enable_skill("test-skill", installed_dir=installed_dir) # ✓ Success
load_installed_skills(installed_dir=installed_dir) # ✓ Returns 1 (enabled included)Result: All skill installation operations work correctly. The breaking change (install_path type changed from str to Path) is verified.
Test 3: Generic InstallationManager (New Unified Framework)
Verified the new generic framework works with custom extension types:
Commands executed:
from openhands.sdk.extensions.installation import InstallationManager, InstallationInterface
from pydantic import BaseModel
# Define custom extension type
class CustomExtension(BaseModel):
name: str
version: str
description: str
custom_field: str = "custom_value"
# Define loader
class CustomExtensionInterface(InstallationInterface[CustomExtension]):
@staticmethod
def load_from_dir(extension_dir: Path) -> CustomExtension:
return CustomExtension.model_validate_json((extension_dir / "extension.json").read_text())
# Create manager
manager = InstallationManager(
installation_dir=installed_dir,
installation_interface=CustomExtensionInterface(),
)
# All operations work with custom type
info = manager.install(str(ext_source)) # ✓ Installed
extensions = manager.load_installed() # ✓ Returns CustomExtension instances
assert extensions[0].custom_field == "test_value" # ✓ Custom fields preservedResult: The generic InstallationManager successfully works with custom extension types, proving the unification is complete and extensible.
Test 4: Metadata Migration (Legacy Compatibility)
Verified legacy metadata formats are migrated automatically:
Commands executed:
from openhands.sdk.extensions.installation import InstallationMetadata
import json
# Write old "plugins" format
legacy_plugins = {
"plugins": {
"my-plugin": {"name": "my-plugin", "version": "1.0.0", ...}
}
}
metadata_file.write_text(json.dumps(legacy_plugins))
# Load - should migrate to "extensions"
metadata = InstallationMetadata.load_from_dir(installed_dir)
assert "my-plugin" in metadata.extensions # ✓ PASS
# Test "skills" key migration
legacy_skills = {"skills": {"my-skill": {...}}}
# ✓ Also migrates to "extensions"
# Test conflict resolution (extensions wins)
conflict = {
"extensions": {"shared": {"version": "3.0.0", ...}},
"plugins": {"shared": {"version": "1.0.0", ...}}
}
# ✓ extensions key takes precedenceResult: Legacy plugins and skills metadata keys are automatically migrated to the new extensions key. Conflicts are resolved correctly (explicit extensions wins).
Test 5: Self-Healing Metadata
Verified metadata automatically syncs with filesystem:
Commands executed:
from openhands.sdk.plugin.installed import list_installed_plugins
# Manually create plugin directory without metadata entry
untracked_plugin = installed_dir / "untracked-plugin"
create_test_plugin(untracked_plugin, "untracked-plugin")
# list_installed_plugins discovers it
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "untracked-plugin" in [p.name for p in plugins] # ✓ Discovered
assert plugins[0].source == "local" # ✓ Correct default source
# Add stale metadata entry (directory doesn't exist)
metadata["extensions"]["deleted-plugin"] = {...}
# list_installed_plugins removes stale entry
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "deleted-plugin" not in [p.name for p in plugins] # ✓ Cleaned up
# Invalid directory names are skipped
invalid_dir = installed_dir / "Invalid_Name"
invalid_dir.mkdir()
plugins = list_installed_plugins(installed_dir=installed_dir)
assert "Invalid_Name" not in [p.name for p in plugins] # ✓ SkippedResult: Metadata automatically discovers untracked extension directories, removes stale entries for deleted directories, and skips invalid directory names.
Issues Found
None.
🔄 Running Examples with
|
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 27.6s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 21.5s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 12.7s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 27.9s | $0.02 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 12.7s | $0.01 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 34.4s | $0.03 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 30.0s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 10.6s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 50.6s | $0.05 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 20s | $0.17 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 17.1s | $0.01 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 17.2s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ❌ FAIL Exit code 1 |
4m 21s | -- |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 15.6s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 10.2s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 22.5s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 1m 7s | $0.01 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ✅ PASS | 3m 23s | $0.23 |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 51.7s | $0.06 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 27.6s | $0.06 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 35.8s | $0.04 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 40.0s | $0.04 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 9.7s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 3m 47s | $0.26 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 22.2s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ❌ FAIL Timed out after 600 seconds |
10m 0s | -- |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 11.7s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 6.8s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 36.9s | $0.03 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 10.7s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 30.2s | $0.12 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 29.7s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 1m 41s | $0.09 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 6.9s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 9.0s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ✅ PASS | 3m 11s | $0.42 |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 10.8s | $0.01 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.6s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 15.0s | $0.00 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 36.0s | $0.03 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 50s | $0.09 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 49.3s | $0.05 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ✅ PASS | 1m 42s | $0.05 |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 30.4s | $0.03 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 40s | $0.03 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 51.9s | $0.14 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 33.2s | $0.05 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 45.5s | $0.00 |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 30.1s | $0.03 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 44.2s | $0.06 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 19.1s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ❌ FAIL Exit code 1 |
20.4s | -- |
❌ Some tests failed
Total: 52 | Passed: 49 | Failed: 3 | Total Cost: $2.52
Failed examples:
- examples/01_standalone_sdk/19_llm_routing.py: Exit code 1
- examples/01_standalone_sdk/34_critic_example.py: Timed out after 600 seconds
- examples/05_skills_and_plugins/02_loading_plugins/main.py: Exit code 1
The installation metadata format was unified under "extensions" but the loading_plugins example still referenced the old "plugins" key. Co-authored-by: openhands <openhands@all-hands.dev>
🔄 Running Examples with
|
| Example | Status | Duration | Cost |
|---|---|---|---|
| 01_standalone_sdk/02_custom_tools.py | ✅ PASS | 24.1s | $0.03 |
| 01_standalone_sdk/03_activate_skill.py | ✅ PASS | 25.3s | $0.03 |
| 01_standalone_sdk/05_use_llm_registry.py | ✅ PASS | 12.3s | $0.01 |
| 01_standalone_sdk/07_mcp_integration.py | ✅ PASS | 30.4s | $0.03 |
| 01_standalone_sdk/09_pause_example.py | ✅ PASS | 17.7s | $0.02 |
| 01_standalone_sdk/10_persistence.py | ✅ PASS | 35.8s | $0.03 |
| 01_standalone_sdk/11_async.py | ✅ PASS | 34.8s | $0.04 |
| 01_standalone_sdk/12_custom_secrets.py | ✅ PASS | 9.9s | $0.01 |
| 01_standalone_sdk/13_get_llm_metrics.py | ✅ PASS | 34.3s | $0.03 |
| 01_standalone_sdk/14_context_condenser.py | ✅ PASS | 2m 24s | $0.15 |
| 01_standalone_sdk/17_image_input.py | ✅ PASS | 17.5s | $0.01 |
| 01_standalone_sdk/18_send_message_while_processing.py | ✅ PASS | 20.9s | $0.02 |
| 01_standalone_sdk/19_llm_routing.py | ✅ PASS | 18.7s | $0.02 |
| 01_standalone_sdk/20_stuck_detector.py | ✅ PASS | 15.3s | $0.02 |
| 01_standalone_sdk/21_generate_extraneous_conversation_costs.py | ✅ PASS | 10.4s | $0.00 |
| 01_standalone_sdk/22_anthropic_thinking.py | ✅ PASS | 14.7s | $0.01 |
| 01_standalone_sdk/23_responses_reasoning.py | ✅ PASS | 2m 32s | $0.02 |
| 01_standalone_sdk/24_planning_agent_workflow.py | ❌ FAIL Timed out after 600 seconds |
10m 0s | -- |
| 01_standalone_sdk/25_agent_delegation.py | ✅ PASS | 1m 5s | $0.08 |
| 01_standalone_sdk/26_custom_visualizer.py | ✅ PASS | 19.3s | $0.02 |
| 01_standalone_sdk/28_ask_agent_example.py | ✅ PASS | 32.1s | $0.03 |
| 01_standalone_sdk/29_llm_streaming.py | ✅ PASS | 34.0s | $0.03 |
| 01_standalone_sdk/30_tom_agent.py | ✅ PASS | 9.1s | $0.01 |
| 01_standalone_sdk/31_iterative_refinement.py | ✅ PASS | 4m 3s | $0.31 |
| 01_standalone_sdk/32_configurable_security_policy.py | ✅ PASS | 21.8s | $0.02 |
| 01_standalone_sdk/34_critic_example.py | ✅ PASS | 2m 11s | $0.17 |
| 01_standalone_sdk/36_event_json_to_openai_messages.py | ✅ PASS | 11.3s | $0.01 |
| 01_standalone_sdk/37_llm_profile_store/main.py | ✅ PASS | 4.0s | $0.00 |
| 01_standalone_sdk/38_browser_session_recording.py | ✅ PASS | 28.8s | $0.04 |
| 01_standalone_sdk/39_llm_fallback.py | ✅ PASS | 11.4s | $0.01 |
| 01_standalone_sdk/40_acp_agent_example.py | ✅ PASS | 26.4s | $0.12 |
| 01_standalone_sdk/41_task_tool_set.py | ✅ PASS | 27.0s | $0.03 |
| 01_standalone_sdk/42_file_based_subagents.py | ✅ PASS | 1m 25s | $0.09 |
| 01_standalone_sdk/43_mixed_marketplace_skills/main.py | ✅ PASS | 3.3s | $0.00 |
| 01_standalone_sdk/44_model_switching_in_convo.py | ✅ PASS | 8.8s | $0.01 |
| 01_standalone_sdk/45_parallel_tool_execution.py | ❌ FAIL Timed out after 600 seconds |
10m 0s | -- |
| 01_standalone_sdk/46_agent_settings.py | ✅ PASS | 11.4s | $0.01 |
| 01_standalone_sdk/47_defense_in_depth_security.py | ✅ PASS | 3.4s | $0.00 |
| 01_standalone_sdk/48_conversation_fork.py | ✅ PASS | 17.6s | $0.01 |
| 02_remote_agent_server/01_convo_with_local_agent_server.py | ✅ PASS | 31.7s | $0.03 |
| 02_remote_agent_server/02_convo_with_docker_sandboxed_server.py | ✅ PASS | 1m 45s | $0.06 |
| 02_remote_agent_server/03_browser_use_with_docker_sandboxed_server.py | ✅ PASS | 1m 29s | $0.08 |
| 02_remote_agent_server/04_convo_with_api_sandboxed_server.py | ❌ FAIL Exit code 1 |
1m 40s | -- |
| 02_remote_agent_server/07_convo_with_cloud_workspace.py | ✅ PASS | 39.2s | $0.04 |
| 02_remote_agent_server/08_convo_with_apptainer_sandboxed_server.py | ✅ PASS | 3m 11s | $0.02 |
| 02_remote_agent_server/09_acp_agent_with_remote_runtime.py | ✅ PASS | 3m 20s | $0.12 |
| 02_remote_agent_server/10_cloud_workspace_share_credentials.py | ✅ PASS | 30.1s | $0.05 |
| 02_remote_agent_server/11_conversation_fork.py | ✅ PASS | 35.4s | $0.00 |
| 04_llm_specific_tools/01_gpt5_apply_patch_preset.py | ✅ PASS | 23.5s | $0.02 |
| 04_llm_specific_tools/02_gemini_file_tools.py | ✅ PASS | 38.4s | $0.07 |
| 05_skills_and_plugins/01_loading_agentskills/main.py | ✅ PASS | 14.4s | $0.02 |
| 05_skills_and_plugins/02_loading_plugins/main.py | ✅ PASS | 16.3s | $0.02 |
❌ Some tests failed
Total: 52 | Passed: 49 | Failed: 3 | Total Cost: $1.97
Failed examples:
- examples/01_standalone_sdk/24_planning_agent_workflow.py: Timed out after 600 seconds
- examples/01_standalone_sdk/45_parallel_tool_execution.py: Timed out after 600 seconds
- examples/02_remote_agent_server/04_convo_with_api_sandboxed_server.py: Exit code 1
|
Looks like the only blocker from the (numerous) AI reviews is evidence that this works end-to-end. I've evaluated these changes locally while developing, and the relevant example is now successful. I'd encourage one of the potential reviewers to check out this branch and test as well. There are also several failing examples -- to the best of my knowledge, these are unrelated to the changes in this branch. They're due to timeouts and other stochastic failure modes. |
all-hands-bot
left a comment
There was a problem hiding this comment.
✅ QA Report: PASS
All functional verification passed. The PR successfully unifies plugin and skill installation into a generic extension mechanism while maintaining backward compatibility.
Does this PR achieve its stated goal?
Yes. The PR set out to unify plugin and skill installation/loading mechanisms into a single reusable framework, and it delivers on that goal. The new openhands.sdk.extensions.installation module provides a generic InstallationManager that works with any extension type (plugins, skills, or custom types), and both the plugin and skill APIs have been successfully refactored to delegate to this unified machinery. All existing functionality works, backward compatibility is maintained (except for documented breaking changes), and legacy metadata files are automatically migrated.
| Phase | Result |
|---|---|
| Environment Setup | ✅ Build succeeded, all dependencies installed |
| CI Status | ✅ All 30 checks passing |
| Functional Verification | ✅ All 5 verification tests passed |
Functional Verification
Test 1: Generic InstallationManager with Custom Extension Type
Objective: Verify the new generic InstallationManager works with arbitrary extension types.
Setup: Created a custom Widget extension type with its own loader interface.
Ran:
uv run python /tmp/test_generic_installation.pyResult:
[1] Installing widget from local path...
✓ Installed: my-widget v1.0.0
Source: /tmp/tmpu3ioolfw/source/my-widget
Install path: /tmp/tmpu3ioolfw/installed/my-widget
Enabled: True
[2] Listing installed extensions...
✓ Found 1 extension(s)
- my-widget v1.0.0 (enabled=True)
[3] Loading installed extensions...
✓ Loaded 1 widget(s)
- my-widget: A test widget
[4] Disabling extension...
✓ Disable succeeded
Loaded after disable: 0 widget(s)
[5] Re-enabling extension...
✓ Enable succeeded
Loaded after enable: 1 widget(s)
[6] Getting specific extension info...
✓ Found extension: my-widget
Version: 1.0.0
[7] Uninstalling extension...
✓ Uninstall succeeded
Remaining extensions: 0
Interpretation: The generic InstallationManager successfully handles a custom extension type, demonstrating the PR's goal of creating a reusable installation framework.
Test 2: Plugin Installation API Backward Compatibility
Objective: Verify existing plugin installation code continues to work after the refactor.
Ran:
uv run python /tmp/test_plugin_backward_compat.pyResult:
[1] Installing plugin...
✓ Installed: test-plugin v1.0.0
Type of install_path: PosixPath
install_path value: /tmp/tmpiwmz3whs/installed/test-plugin
✓ Confirmed: install_path is now Path (breaking change documented)
[2] Listing installed plugins...
✓ Found 1 plugin(s)
- test-plugin v1.0.0
[3] Loading installed plugins...
✓ Loaded 1 plugin(s)
- test-plugin: Test plugin
[4] Disabling plugin...
✓ Disable succeeded
[5] Re-enabling plugin...
✓ Enable succeeded
[6] Getting plugin info...
✓ Found plugin: test-plugin
[7] Uninstalling plugin...
✓ Uninstall succeeded
Interpretation: All plugin installation APIs (install_plugin, list_installed_plugins, load_installed_plugins, enable_plugin, disable_plugin, get_installed_plugin, uninstall_plugin) work correctly. The documented breaking change (install_path is now Path instead of str) is confirmed.
Test 3: Skill Installation API Backward Compatibility
Objective: Verify existing skill installation code continues to work after the refactor.
Ran:
uv run python /tmp/test_skill_backward_compat.pyResult:
[1] Installing skill...
✓ Installed: test-skill
Type of install_path: PosixPath
✓ Confirmed: install_path is now Path (breaking change documented)
[2] Listing installed skills...
✓ Found 1 skill(s)
- test-skill
[3] Loading installed skills...
✓ Loaded 1 skill(s)
[4] Disabling skill...
✓ Disable succeeded
[5] Re-enabling skill...
✓ Enable succeeded
[6] Getting skill info...
✓ Found skill: test-skill
[7] Uninstalling skill...
✓ Uninstall succeeded
Interpretation: All skill installation APIs work correctly with the same functionality as before the refactor.
Test 4: Updated Plugin Loading Example
Objective: Verify the example code updated in the PR runs successfully.
Ran:
uv run python examples/05_skills_and_plugins/02_loading_plugins/main.pyResult:
DEMO 5: Disabling/Enabling plugins
============================================================
After disable
-------------
Tracked plugins:
- code-quality (enabled=False)
- skills-237e139ce4d65e8f (enabled=True)
After re-enable
---------------
Tracked plugins:
- code-quality (enabled=True)
- skills-237e139ce4d65e8f (enabled=True)
Metadata file:
{
"extensions": {
"code-quality": { ... },
"skills-237e139ce4d65e8f": { ... }
}
}
============================================================
EXAMPLE COMPLETED SUCCESSFULLY
============================================================
EXAMPLE_COST: 0.0631
Interpretation: The example runs successfully end-to-end. The PR's change to the example (updating assertions from metadata["plugins"] to metadata["extensions"]) is correct and the metadata file now uses the new "extensions" key.
Test 5: Legacy Metadata Migration
Objective: Verify old metadata files with "plugins" or "skills" keys are automatically migrated to the new "extensions" format.
Ran:
uv run python /tmp/test_metadata_migration.pyResult:
[1] Testing migration of legacy 'plugins' key...
WARNING: Migrating legacy 'plugins' key to 'extensions'
✓ Legacy 'plugins' key successfully migrated to 'extensions'
[2] Testing migration of legacy 'skills' key...
WARNING: Migrating legacy 'skills' key to 'extensions'
✓ Legacy 'skills' key successfully migrated to 'extensions'
[3] Testing migration with both legacy keys...
✓ Both 'plugins' and 'skills' keys migrated to 'extensions'
[4] Testing load from file with legacy key...
✓ Legacy file successfully loaded and migrated
✓ Saved file uses new 'extensions' format
Interpretation: The PR includes automatic migration logic that converts old metadata formats to the new unified format. Existing users' installed plugins and skills will be automatically upgraded when the metadata is loaded.
Issues Found
None.
all-hands-bot
left a comment
There was a problem hiding this comment.
🟡 Acceptable — Clean refactor eliminating 655 lines of duplication, but has critical data loss issue in legacy migration and several important correctness/security gaps that should be addressed before merge.
Eval Risk Flag: This PR refactors plugin/skill loading infrastructure. Per repo guidelines, changes to plugin/skill loading require human review + lightweight evals before merge to verify no behavior regression.
|
It seems good for me... |
- Resolve installation_dir to absolute path in InstallationManager.__post_init__ to prevent breakage if working directory changes - Add parametrized path traversal tests for validate_extension_name - Add test for legacy key conflict when both plugins and skills share a name Co-authored-by: openhands <openhands@all-hands.dev>
Why
Skills and plugins have repeated but slightly-divergent implementations of installing/loading. This PR attempts to unify both with one mechanism that can be reused for other extensions in the future.
This PR is a follow-up to #2801 and should be merged after.
Summary
openhands.sdk.extensions.installationmodule that provides the interface for installing/loading extensionsopenhands.sdk.plugin.installedandopenhands.sdk.skills.installedto use the new machinery while exposing the old interfaceBreaking Changes
InstalledPluginInfo.install_path/InstalledSkillInfo.install_pathtype changed fromstrtoPath.Downstream code doing string operations on
info.install_path(e.g.info.install_path.startswith(...)) will need to usestr(info.install_path)or switch toPathAPIs.Pathis the better type here and existing tests already use it correctly (e.g.install_path / "skills").InstalledPluginsMetadataandInstalledSkillsMetadataremoved from public exports.These were internal implementation details. The user-facing functions and
InstalledPluginInfo/InstalledSkillInfotype aliases remain exported.Skill-specific fields dropped from
InstalledSkillInfo.The old
InstalledSkillInfohadlicense,compatibility,metadata,allowed_toolsfields. These are gone since it is now aliased toInstallationInfo. These fields were never accessed in production code.How to Test
Unit/integration tests all use temp files to actually stress the saving/loading.
Type
Agent Server images for this PR
• GHCR package: https://github.com/OpenHands/agent-sdk/pkgs/container/agent-server
Variants & Base Images
eclipse-temurin:17-jdknikolaik/python-nodejs:python3.13-nodejs22-slimgolang:1.21-bookwormPull (multi-arch manifest)
# Each variant is a multi-arch manifest supporting both amd64 and arm64 docker pull ghcr.io/openhands/agent-server:907c4b8-pythonRun
All tags pushed for this build
About Multi-Architecture Support
907c4b8-python) is a multi-arch manifest supporting both amd64 and arm64907c4b8-python-amd64) are also available if needed