Add Set-of-Marks overlay for VLM element grounding #230

Merged: JE-Chen merged 1 commit into dev from feat/set-of-marks-batch on Jun 19, 2026

@JE-Chen (Member) commented Jun 19, 2026

Round-7 follow-up, batch 12 — the standard Set-of-Marks VLM-grounding format, wired through all five layers (facade, AC_*, MCP, Script Builder) with headless tests + EN/Zh v22 docs + README sections.

Feature (utils/set_of_marks)

  • Number + render + resolve: mark_elements (assign 1..N plus centre/role/text), render_marks (draw numbered red boxes on a PNG, using Pillow), resolve_mark (number -> element). Pure and unit-testable with synthetic elements.
  • Mark-then-click loop: mark_screen(app_name?, render_path?) numbers the live a11y tree (caches the marks, optionally saves an overlay screenshot); mark_click(n) resolves a number from the cache and clicks the element centre. AC_mark_screen / AC_mark_click + ac_*.
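
As a rough illustration of the two bullets above, here is a minimal, pure-stdlib sketch of the number/resolve/click flow. It is not the code merged in this PR: the element shape (a dict with a hypothetical `box` key of `(left, top, width, height)`), the `Mark` dataclass, and the helper names `mark_screen_from` and `mark_click_with` are assumptions made for the example. The real mark_screen reads the live a11y tree and the real mark_click drives the actual mouse, whereas this sketch takes the elements and the click function as arguments so it can run headless.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Mark:
    """One numbered element: its mark number, click point, role, and text."""
    number: int
    center: tuple
    role: str
    text: str


def mark_elements(elements: Iterable[dict]) -> list:
    """Assign 1..N to elements that have a usable box; skip invalid ones."""
    marks = []
    for element in elements:
        box = element.get("box")  # assumed shape: (left, top, width, height)
        if not box or len(box) != 4 or box[2] <= 0 or box[3] <= 0:
            continue  # no box or zero-sized box: not clickable, so not numbered
        left, top, width, height = box
        marks.append(Mark(
            number=len(marks) + 1,
            center=(left + width // 2, top + height // 2),
            role=str(element.get("role", "")),
            text=str(element.get("text", "")),
        ))
    return marks


def resolve_mark(marks: list, number: int) -> Optional[Mark]:
    """Map a mark number back to its element, or None if it is unknown."""
    for mark in marks:
        if mark.number == number:
            return mark
    return None


# Module-level cache, mutated in place so no `global` statement is needed.
_cached_marks: list = []


def mark_screen_from(elements: Iterable[dict]) -> list:
    """Number the given elements and remember them for later clicks."""
    _cached_marks[:] = mark_elements(elements)
    return list(_cached_marks)


def mark_click_with(number: int, click: Callable[[int, int], None]) -> bool:
    """Resolve a number from the cache and click its centre via `click`."""
    mark = resolve_mark(_cached_marks, number)
    if mark is None:
        return False
    click(*mark.center)
    return True
```

Passing the click function in keeps the loop testable without a display: a test can supply a lambda that records coordinates, much like the monkeypatched mouse mentioned under Verification.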

Why

Top queued modality item across the research rounds: Set-of-Marks is the de facto input format for modern GUI agents (UI-TARS/OmniParser/VisualWebArena) and turns the existing VLM locator into a reliable "pick a number" loop. Pure-stdlib + Pillow (already a dep), no model needed.

Verification

  • test/unit_test/headless/test_set_of_marks_batch.py — 6 tests pass (numbering/skip-invalid, resolve, PNG render, click via supplied marks + monkeypatched mouse, wiring).
  • ruff clean (avoided the global statement via in-place list mutation); radon no CC≥C; bandit clean; import je_auto_control PySide6-free.
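
The PNG render check listed above might look roughly like the following. This is a sketch under assumptions, not the test file from this PR: `render_marks_sketch` and `blank_png` are hypothetical helpers, and the real render_marks signature may differ. It draws numbered red boxes onto an in-memory PNG with Pillow, so nothing touches the screen or the filesystem.

```python
import io

from PIL import Image, ImageDraw

RED = (255, 0, 0)


def blank_png(width: int, height: int) -> bytes:
    """Build a white synthetic screenshot entirely in memory."""
    buffer = io.BytesIO()
    Image.new("RGB", (width, height), (255, 255, 255)).save(buffer, format="PNG")
    return buffer.getvalue()


def render_marks_sketch(png_bytes: bytes, boxes: dict) -> bytes:
    """Draw a red outline and the mark number for each box on a PNG.

    `boxes` maps mark number -> (left, top, width, height); this shape is
    an assumption made for the example.
    """
    image = Image.open(io.BytesIO(png_bytes)).convert("RGB")
    draw = ImageDraw.Draw(image)
    for number, (left, top, width, height) in boxes.items():
        draw.rectangle(
            (left, top, left + width, top + height), outline=RED, width=2
        )
        # Label just inside the top-left corner, using Pillow's default font.
        draw.text((left + 3, top + 3), str(number), fill=RED)
    out = io.BytesIO()
    image.save(out, format="PNG")
    return out.getvalue()
```

A headless test can then reopen the returned bytes and check that the image size is unchanged and that a pixel on a box edge is red.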

@codacy-production

Up to standards ✅

🟢 Issues: 0 new issues

🟢 Metrics: 41 complexity · 2 duplication

@JE-Chen JE-Chen merged commit b96a650 into dev Jun 19, 2026
16 checks passed
@JE-Chen JE-Chen deleted the feat/set-of-marks-batch branch June 19, 2026 08:56