Skip to content

Fix ACPAgent dropping image content from user messages#3027

Open
sjathin wants to merge 6 commits intoOpenHands:mainfrom
sjathin:sdk-2982
Open

Fix ACPAgent dropping image content from user messages#3027
sjathin wants to merge 6 commits intoOpenHands:mainfrom
sjathin:sdk-2982

Conversation

@sjathin
Copy link
Copy Markdown
Contributor

@sjathin sjathin commented May 1, 2026

  • A human has tested these changes.

Why

ACPAgent._build_acp_prompt() only extracts TextContent from user messages and silently drops ImageContent blocks. The ACP prompt() method supports ImageContentBlock but it was never wired up.

Summary

  • Changed _build_acp_prompt() to return list[TextContentBlock | ImageContentBlock] instead of str, preserving image content as ACP ImageContentBlock
  • Added _image_url_to_acp_block() helper to convert data URIs and plain URLs to ACP image blocks
  • Updated step() to pass content blocks directly to conn.prompt() instead of wrapping in a single text_block()

Issue Number

Closes #2982

How to Test

uv run python -m pytest tests/sdk/agent/test_acp_agent.py -v -k "not test_static_secret"

168 tests pass. New tests cover:

  • test_build_acp_prompt_includes_image_content — mixed text + image message
  • TestImageUrlToAcpBlock — data URI parsing, plain URL passthrough, invalid URI handling

Manually verified with a real PNG image that base64 data round-trips correctly through the conversion.

Video/Screenshots

N/A — backend logic change.

Type

  • Bug fix
  • Feature
  • Refactor
  • Breaking change
  • Docs / chore

Notes

The _image_url_to_acp_block helper handles two URL formats:

  • Data URIs (data:image/png;base64,...): parsed into data + mime_type fields
  • Plain URLs (https://...): passed via the uri field with a generic MIME type for the ACP server to resolve

Jathin Sreenivas added 2 commits May 1, 2026 02:34
_build_acp_prompt now returns a list of ACP content blocks (text + image)
instead of a single joined string. ImageContent URLs (data URIs and plain
URLs) are converted to ACP ImageContentBlock via the new
_image_url_to_acp_block helper. step() passes the blocks directly to
conn.prompt().

Closes OpenHands#2982
Copy link
Copy Markdown
Collaborator

@all-hands-bot all-hands-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Taste Rating: 🟢 Good taste - Clean implementation that eliminates the special case of dropped image content.

[EVAL CONSIDERATION]
This PR fixes a bug where image content was silently dropped, but it could affect agent behavior in multimodal scenarios. While the fix is correct, I recommend running lightweight evals before merging to ensure no unexpected performance impacts in benchmarks that use image inputs.

[RISK ASSESSMENT]

  • [Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a correctness fix for existing ACP functionality. The change is well-isolated to ACPAgent, has good test coverage, and includes proper error handling for malformed data URIs. The risk is low because it's fixing missing functionality rather than changing existing behavior.

VERDICT:
Worth merging after eval check: Implementation is solid, tests are comprehensive, and this fixes a real bug where images were being silently dropped.

KEY INSIGHT:
The change from str to list[ContentBlock] eliminates a special case by preserving all content types, which is a cleaner architecture than text-only extraction.

@xingyaoww xingyaoww requested a review from simonrosenberg May 3, 2026 02:23
@xingyaoww
Copy link
Copy Markdown
Collaborator

@OpenHands do we have existing example tests for ACP agent? If so, can we modify that example test to add an image as input so we can thoroughly test the ACP system to make sure that it can actually handle image inputs?

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 3, 2026

I'm on it! xingyaoww can track my progress at all-hands.dev

Adds a second conversation turn that sends a Message with TextContent +
ImageContent (using the OpenHands logo URL) so the example test exercises
the new multimodal image support end-to-end through the ACP pipeline.

Co-authored-by: openhands <openhands@all-hands.dev>
Copy link
Copy Markdown
Collaborator

Yes, there is an existing example test at examples/01_standalone_sdk/40_acp_agent_example.py. I've added an image input turn to it in commit a8f28ea — it now sends a Message with TextContent + ImageContent (using the OpenHands logo URL) as a second conversation turn, which exercises the new multimodal image support end-to-end through the ACP pipeline.

The example now covers:

  1. Text-only turn — the original file listing task
  2. Image + text turn — sends an image URL with a description prompt via ImageContent
  3. ask_agent — the existing fork_session side-question

This will be validated when the example test suite runs (--run-examples).

This comment was generated by an AI agent (OpenHands).

@openhands-ai
Copy link
Copy Markdown

openhands-ai Bot commented May 3, 2026

Summary

The reviewer asked whether there are existing example tests for the ACP agent and requested adding an image input to that example to end-to-end test the new image support.

What was done

I modified the existing ACP agent example (examples/01_standalone_sdk/40_acp_agent_example.py) to add an image input turn, pushed the commit to the PR branch, and replied to the reviewer's comment on the PR.

Checklist

  • Identified existing ACP example testexamples/01_standalone_sdk/40_acp_agent_example.py exists and is discoverable by the example test runner
  • Added image input turn — New conversation turn sends a Message with TextContent + ImageContent using the OpenHands logo URL (same image used in 17_image_input.py)
  • Exercises the PR's changes end-to-end — The image URL goes through _build_acp_prompt()_image_url_to_acp_block()conn.prompt(), validating the full ACP image pipeline
  • All 169 existing unit tests pass — Verified with pytest tests/sdk/agent/test_acp_agent.py
  • Pre-commit checks pass — Linting, formatting, type checking all clean
  • Pushed to PR branch — Commit a8f28eab pushed to sjathin/software-agent-sdk:sdk-2982
  • Replied to reviewer — Left a comment on PR Fix ACPAgent dropping image content from user messages #3027 explaining the changes

Conciseness

The changes are minimal and focused — only one file was modified (40_acp_agent_example.py) with:

  • 3 new imports (ImageContent, Message, TextContent)
  • 1 constant (IMAGE_URL)
  • ~12 lines for the new image input turn
  • Minor docstring update

No extraneous changes were made.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ACPAgent drops image content from user messages

4 participants