10 changes: 10 additions & 0 deletions README.md
@@ -13,6 +13,7 @@

## Table of Contents

- [What's new (2026-06-19) — Native UI Control](#whats-new-2026-06-19--native-ui-control)
- [What's new (2026-06-19)](#whats-new-2026-06-19)
- [What's new (2026-06-18)](#whats-new-2026-06-18)
- [What's new (2026-06-17)](#whats-new-2026-06-17)
@@ -59,6 +60,15 @@

---

## What's new (2026-06-19) — Native UI Control

Object-level desktop automation: read and drive native controls through the OS accessibility API (by name / role / app / **AutomationId**) instead of clicking pixels or OCR-ing text, which is far more reliable for native apps because controls are addressed by identity rather than by screen position. The accessibility layer previously only listed, found, and clicked elements; it now also acts on them through their control patterns. Ships through the full stack (facade, `AC_*` executor commands, MCP tools, Script Builder) with a Windows UIAutomation backend; backends that cannot perform an action raise `AccessibilityNotAvailableError`. Full reference: [`docs/source/Eng/doc/new_features/v7_features_doc.rst`](docs/source/Eng/doc/new_features/v7_features_doc.rst).

- **Read / set value** — `control_get_value` / `control_set_value` (`AC_control_get_value` / `AC_control_set_value`): read a textbox/combo value (no OCR) and set it in one call (no per-key typing).
- **Invoke / toggle** — `control_invoke` / `control_toggle` (`AC_control_invoke` / `AC_control_toggle`): press a button or flip a checkbox via its control pattern.
- **Read a table/grid** — `read_control_table` (`AC_read_table`): scrape a grid/list/table control into rows of cell strings — desktop data extraction without OCR.
- Every call targets a control by `name` / `role` / `app_name` / `automation_id`; `automation_id` (the stable Windows identifier) survives layout and localization changes.
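
The `AC_*` commands listed above can be combined into one script. The sketch below only builds and serializes such a script; it assumes the executor accepts each action as a `[command, keyword-dict]` pair, which these notes do not state, and `native_ui_action` is a hypothetical helper rather than part of the library. The keyword names follow the matcher and `value` fields described in this change.

```python
import json


def native_ui_action(command, **kwargs):
    """Build one action as [command, kwargs] (assumed executor shape)."""
    return [command, dict(kwargs)]


# A small sign-in flow followed by a table read.
actions = [
    native_ui_action("AC_control_set_value",
                     value="alice@example.com", automation_id="emailField"),
    native_ui_action("AC_control_toggle", name="Remember me"),
    native_ui_action("AC_control_invoke", name="Sign in", app_name="myapp.exe"),
    native_ui_action("AC_read_table", name="Results", app_name="myapp.exe"),
]

# Serialize to JSON and parse it back to confirm nothing is lost.
script = json.dumps(actions, indent=2)
round_tripped = json.loads(script)
print(script)
```

Keeping the script as plain data means it can be reviewed or stored before anything touches a real window.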

## What's new (2026-06-19)

Two headless cores that shipped without the rest of their stack are now
10 changes: 10 additions & 0 deletions README/README_zh-CN.md
@@ -12,6 +12,7 @@

## 目录

- [本次更新 (2026-06-19) — 原生 UI 控制](#本次更新-2026-06-19--原生-ui-控制)
- [本次更新 (2026-06-19)](#本次更新-2026-06-19)
- [本次更新 (2026-06-18)](#本次更新-2026-06-18)
- [本次更新 (2026-06-17)](#本次更新-2026-06-17)
@@ -58,6 +59,15 @@

---

## 本次更新 (2026-06-19) — 原生 UI 控制

对象级桌面自动化:通过 OS 无障碍 API(以 name / role / app / **AutomationId** 定位)读取与操作原生控件,而非点像素或 OCR——对原生 app 可靠得多。无障碍层先前只能 list/find/click,现在还能操作。走完整五层(facade、`AC_*`、MCP、Script Builder),提供 Windows UIAutomation 后端;不支持的后端会抛清楚错误。完整参考:[`docs/source/Eng/doc/new_features/v7_features_doc.rst`](../docs/source/Eng/doc/new_features/v7_features_doc.rst)。

- **读取 / 设置值** — `control_get_value` / `control_set_value`(`AC_control_get_value` / `AC_control_set_value`):读 textbox/combo 值(不用 OCR),一次设置值(不必逐键输入)。
- **调用 / 切换** — `control_invoke` / `control_toggle`(`AC_control_invoke` / `AC_control_toggle`):通过控件模式按按钮或切换复选框。
- **读取表格/列表** — `read_control_table`(`AC_read_table`):把 grid/list/table 控件抓成逐行单元格字符串——不用 OCR 的桌面数据提取。
- 以 `name` / `role` / `app_name` / `automation_id`(Windows 稳定标识符)定位,版面/本地化改变也不坏。

## 本次更新 (2026-06-19)

两个早已存在、却没接上其余各层的 headless 核心,现在成为一级功能。两者都新增 facade re-export、`AC_*` 执行器指令、MCP 工具与 Script Builder 项目,并有 headless 测试。完整参考:
10 changes: 10 additions & 0 deletions README/README_zh-TW.md
@@ -12,6 +12,7 @@

## 目錄

- [本次更新 (2026-06-19) — 原生 UI 控制](#本次更新-2026-06-19--原生-ui-控制)
- [本次更新 (2026-06-19)](#本次更新-2026-06-19)
- [本次更新 (2026-06-18)](#本次更新-2026-06-18)
- [本次更新 (2026-06-17)](#本次更新-2026-06-17)
@@ -58,6 +59,15 @@

---

## 本次更新 (2026-06-19) — 原生 UI 控制

物件級桌面自動化:透過 OS 無障礙 API(以 name / role / app / **AutomationId** 定位)讀取與操作原生控制項,而非點像素或 OCR——對原生 app 可靠得多。無障礙層先前只能 list/find/click,現在還能操作。走完整五層(facade、`AC_*`、MCP、Script Builder),提供 Windows UIAutomation 後端;不支援的後端會拋清楚錯誤。完整參考:[`docs/source/Zh/doc/new_features/v7_features_doc.rst`](../docs/source/Zh/doc/new_features/v7_features_doc.rst)。

- **讀取 / 設定值** — `control_get_value` / `control_set_value`(`AC_control_get_value` / `AC_control_set_value`):讀 textbox/combo 值(不用 OCR),一次設定值(不必逐鍵輸入)。
- **呼叫 / 切換** — `control_invoke` / `control_toggle`(`AC_control_invoke` / `AC_control_toggle`):透過控制模式按按鈕或切換核取方塊。
- **讀取表格/清單** — `read_control_table`(`AC_read_table`):把 grid/list/table 控制項抓成逐列儲存格字串——不用 OCR 的桌面資料擷取。
- 以 `name` / `role` / `app_name` / `automation_id`(Windows 穩定識別碼)定位,版面/在地化改變也不壞。

## 本次更新 (2026-06-19)

兩個早已存在、卻沒接上其餘各層的 headless 核心,現在成為一級功能。兩者都新增 facade re-export、`AC_*` 執行器指令、MCP 工具與 Script Builder 項目,並有 headless 測試。完整參考:
95 changes: 95 additions & 0 deletions docs/source/Eng/doc/new_features/v7_features_doc.rst
@@ -0,0 +1,95 @@
==============================================
New Features (2026-06-19) — Native UI Control
==============================================

Object-level desktop automation: read and drive native controls through
the OS accessibility API instead of clicking pixels or OCR-ing text. This
is far more reliable than coordinate/image automation for native apps —
the controls are addressed by name / role / app / **AutomationId**, so
they survive layout changes.

The accessibility layer previously only *listed*, *found*, and *clicked*
elements; it now also *acts* on them via their control patterns. Ships
through the full stack (facade, ``AC_*`` executor commands, MCP tools,
Script Builder), with a Windows UIAutomation backend; backends that can't
perform an action raise a clear ``AccessibilityNotAvailableError``.

.. contents::
   :local:
   :depth: 2


Reading and setting values
==========================

::

    from je_auto_control import control_get_value, control_set_value

    # Read a textbox / combo value directly (no OCR).
    user = control_get_value(name="Username", app_name="myapp.exe")

    # Set a value in one call (no per-key typing / focus dance).
    control_set_value("alice@example.com", automation_id="emailField")

``control_get_value`` returns the control's value as text, or ``None``
when no control matches; ``control_set_value`` writes the value via the
Value pattern and returns ``True`` on success.

Executor commands: ``AC_control_get_value``, ``AC_control_set_value``.
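
Because a read returns ``None`` when nothing matches, a caller may want to check before writing. The sketch below is one way to do that, not the library's own behaviour: ``set_if_changed`` is a hypothetical helper, and the two ``fake_*`` functions stand in for ``control_get_value`` and ``control_set_value`` so the example runs without a GUI.

```python
def set_if_changed(get_value, set_value, wanted, **matchers):
    """Write `wanted` only when a control matches and holds another value."""
    current = get_value(**matchers)
    if current is None:
        return False              # no matching control
    if current == wanted:
        return True               # already correct, skip the write
    return bool(set_value(wanted, **matchers))


# Hypothetical in-memory stand-ins for the real accessibility calls.
fields = {"emailField": "old@example.com"}


def fake_get(automation_id=None, **_):
    return fields.get(automation_id)


def fake_set(value, automation_id=None, **_):
    if automation_id not in fields:
        return False
    fields[automation_id] = value
    return True


updated = set_if_changed(fake_get, fake_set, "alice@example.com",
                         automation_id="emailField")
missing = set_if_changed(fake_get, fake_set, "x", automation_id="noSuchField")
```

In real use the first two arguments would be ``control_get_value`` and ``control_set_value`` imported from ``je_auto_control``.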


Invoking and toggling
=====================

::

    from je_auto_control import control_invoke, control_toggle

    control_invoke(name="Sign in")       # press a button
    control_toggle(name="Remember me")   # flip a checkbox / switch

``control_invoke`` triggers a control's default action (Invoke pattern);
``control_toggle`` flips a checkbox/switch (Toggle pattern). Both return
``True`` on success.

Executor commands: ``AC_control_invoke``, ``AC_control_toggle``.
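
Since both calls report success as ``True``, a multi-step flow can stop at the first step that fails. The following is a minimal sketch of that idea; ``run_until_failure`` and the ``fake_*`` functions are hypothetical names used only for illustration, with the fakes standing in for ``control_toggle`` and ``control_invoke``.

```python
def run_until_failure(steps):
    """Run (label, callable) steps in order; return the first failing label."""
    for label, action in steps:
        if not action():
            return label
    return None


log = []


def fake_toggle(name):
    log.append(("toggle", name))
    return True


def fake_invoke(name):
    log.append(("invoke", name))
    return name != "Sign in"      # pretend the "Sign in" button fails


failed = run_until_failure([
    ("remember", lambda: fake_toggle(name="Remember me")),
    ("sign-in", lambda: fake_invoke(name="Sign in")),
    ("continue", lambda: fake_invoke(name="Continue")),
])
```

The third step never runs here, which is the point: later actions are not attempted against a window that is in an unexpected state.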


Reading tables / grids
======================

::

    from je_auto_control import read_control_table

    rows = read_control_table(name="Results", app_name="myapp.exe")
    # -> [["Sam", "30"], ["Lee", "25"], ...]

``read_control_table`` reads a grid/table/list control into rows of cell
strings via the Grid pattern — reliable desktop data scraping without OCR.

Executor command: ``AC_read_table``.
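
Rows of cell strings are often easier to work with once each row is keyed by column name. The sketch below shows one way to do that on a sample shaped like the output above; ``rows_to_records`` is a hypothetical helper, and whether the real call includes a header row is not stated here, so the header is passed in explicitly.

```python
def rows_to_records(rows, header):
    """Pair each row's cells with column names; extra cells are dropped."""
    return [dict(zip(header, row)) for row in rows]


# Sample data in the shape read_control_table is described as returning.
rows = [["Sam", "30"], ["Lee", "25"]]
records = rows_to_records(rows, header=["name", "age"])

# Cells stay strings; convert explicitly where a number is needed.
ages = [int(record["age"]) for record in records]
```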


Targeting controls
==================

Every call accepts the same matchers; pass whichever combination uniquely
identifies the control:

* ``name`` — the control's accessible name / label.
* ``role`` — the control type.
* ``app_name`` — the owning application (e.g. ``notepad.exe``).
* ``automation_id`` — the most stable identifier (Windows AutomationId),
  unaffected by layout or localization.
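
One possible policy for choosing matchers is to prefer ``automation_id`` whenever it is known and fall back to ``name`` / ``role`` otherwise. The sketch below encodes that policy; ``pick_matchers`` is a hypothetical helper, the ``"button"`` role string is illustrative only, and how the library combines several matchers is not covered by these notes.

```python
def pick_matchers(automation_id=None, name=None, role=None, app_name=None):
    """Return keyword arguments, preferring the stable automation_id."""
    if automation_id:
        chosen = {"automation_id": automation_id}
    else:
        chosen = {key: value
                  for key, value in (("name", name), ("role", role))
                  if value}
        if not chosen:
            raise ValueError("need automation_id, name, or role")
    if app_name:
        chosen["app_name"] = app_name   # narrow the search to one app
    return chosen


by_id = pick_matchers(automation_id="emailField", name="Email")
by_name = pick_matchers(name="Sign in", role="button", app_name="myapp.exe")

try:
    pick_matchers()
    rejected_empty = False
except ValueError:
    rejected_empty = True
```

The result can be passed straight to any of the calls, for example ``control_invoke(**by_name)``.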


Platforms
=========

A Windows UIAutomation backend (via ``comtypes``) implements every action
described above. On platforms / backends without a control driver yet, the
calls raise ``AccessibilityNotAvailableError`` with a clear message rather
than silently failing. The backend is swappable, so the logic is
unit-tested with an injected fake, with no real GUI required.
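
Because unsupported backends raise instead of failing silently, a caller can catch the error and fall back to another strategy. The sketch below shows that pattern under stated assumptions: the exception class is a local stand-in so the example runs anywhere (real code would import ``AccessibilityNotAvailableError`` from ``je_auto_control``), and ``invoke_or_fallback`` plus both demo functions are hypothetical.

```python
class AccessibilityNotAvailableError(RuntimeError):
    """Local stand-in for the library's exception of the same name."""


def invoke_or_fallback(invoke, fallback, **matchers):
    """Try the accessibility action; use `fallback` if the backend lacks it."""
    try:
        return bool(invoke(**matchers))
    except AccessibilityNotAvailableError:
        return bool(fallback(**matchers))


fallback_calls = []


def unsupported_invoke(**matchers):
    # Simulates a platform that has no control driver yet.
    raise AccessibilityNotAvailableError("no control driver on this backend")


def image_click_fallback(**matchers):
    # Placeholder for an image- or coordinate-based click.
    fallback_calls.append(matchers)
    return True


ok = invoke_or_fallback(unsupported_invoke, image_click_fallback,
                        name="Sign in")
```

Passing the action in as a callable mirrors the injected-fake approach mentioned above: the same wrapper can be exercised in tests without a real window.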
1 change: 1 addition & 0 deletions docs/source/Eng/eng_index.rst
@@ -29,6 +29,7 @@ Comprehensive guides for all AutoControl features.
doc/new_features/v4_features_doc
doc/new_features/v5_features_doc
doc/new_features/v6_features_doc
doc/new_features/v7_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
89 changes: 89 additions & 0 deletions docs/source/Zh/doc/new_features/v7_features_doc.rst
@@ -0,0 +1,89 @@
====================================
新功能 (2026-06-19) — 原生 UI 控制
====================================

物件級桌面自動化:透過 OS 無障礙 API 讀取與操作原生控制項,而非點像素或
OCR 文字。對原生 app 而言,這比座標/影像自動化**可靠得多**——控制項以
name / role / app / **AutomationId** 定位,因此版面改變也不會壞。

無障礙層先前只能 *列出*、*尋找*、*點擊* 元素;現在還能透過控制模式
*操作* 它們。走完整五層(facade、``AC_*`` 執行器指令、MCP 工具、Script
Builder),並提供 Windows UIAutomation 後端;無法執行該動作的後端會拋出
清楚的 ``AccessibilityNotAvailableError``。

.. contents::
   :local:
   :depth: 2


讀取與設定值
============

::

    from je_auto_control import control_get_value, control_set_value

    # 直接讀 textbox / combo 的值(不用 OCR)。
    user = control_get_value(name="Username", app_name="myapp.exe")

    # 一次設定值(不必逐鍵輸入 / 處理焦點)。
    control_set_value("alice@example.com", automation_id="emailField")

``control_get_value`` 回傳控制項的值(無相符時回傳 ``None``);
``control_set_value`` 透過 Value pattern 寫入,成功回傳 ``True``。

執行器指令:``AC_control_get_value``、``AC_control_set_value``。


呼叫與切換
==========

::

    from je_auto_control import control_invoke, control_toggle

    control_invoke(name="Sign in")       # 按下按鈕
    control_toggle(name="Remember me")   # 切換核取方塊 / 開關

``control_invoke`` 觸發控制項的預設動作(Invoke pattern);
``control_toggle`` 切換核取方塊/開關(Toggle pattern)。兩者成功皆回傳
``True``。

執行器指令:``AC_control_invoke``、``AC_control_toggle``。


讀取表格 / 清單
================

::

    from je_auto_control import read_control_table

    rows = read_control_table(name="Results", app_name="myapp.exe")
    # -> [["Sam", "30"], ["Lee", "25"], ...]

``read_control_table`` 透過 Grid pattern 把 grid/table/list 控制項讀成
逐列的儲存格字串——不用 OCR 的可靠桌面資料抓取。

執行器指令:``AC_read_table``。


定位控制項
==========

每個呼叫都接受相同的比對條件——提供能唯一辨識控制項的任意組合:

* ``name`` — 控制項的無障礙名稱 / 標籤。
* ``role`` — 控制項型別。
* ``app_name`` — 所屬應用程式(例如 ``notepad.exe``)。
* ``automation_id`` — 最穩定的識別碼(Windows AutomationId),不受版面或
  在地化影響。


平台
====

Windows UIAutomation 後端(透過 ``comtypes``)實作全部四個動作。在尚無
控制驅動的平台/後端上,呼叫會拋出帶清楚訊息的
``AccessibilityNotAvailableError``,而非默默失敗。後端可抽換,因此邏輯以
注入的 fake 後端做單元測試——不需真實 GUI。
1 change: 1 addition & 0 deletions docs/source/Zh/zh_index.rst
@@ -29,6 +29,7 @@ AutoControl 所有功能的完整使用指南。
doc/new_features/v4_features_doc
doc/new_features/v5_features_doc
doc/new_features/v6_features_doc
doc/new_features/v7_features_doc
doc/ocr_backends/ocr_backends_doc
doc/observability/observability_doc
doc/operations_layer/operations_layer_doc
6 changes: 5 additions & 1 deletion je_auto_control/__init__.py
@@ -43,8 +43,10 @@
 from je_auto_control.utils.accessibility import (
     AccessibilityElement, AccessibilityNotAvailableError,
     AccessibilityRecorder, AXRecorderEvent, AXTreeNode,
-    click_accessibility_element, dump_accessibility_tree,
+    click_accessibility_element, control_get_value, control_invoke,
+    control_set_value, control_toggle, dump_accessibility_tree,
     find_accessibility_element, list_accessibility_elements,
+    read_control_table,
 )
# VLM element locator (headless)
from je_auto_control.utils.vision import (
@@ -544,6 +546,8 @@ def start_autocontrol_gui(*args, **kwargs):
"AccessibilityRecorder", "AXRecorderEvent", "AXTreeNode",
"click_accessibility_element", "dump_accessibility_tree",
"find_accessibility_element", "list_accessibility_elements",
"control_get_value", "control_set_value", "control_invoke",
"control_toggle", "read_control_table",
# VLM locator
"VLMNotAvailableError", "locate_by_description", "click_by_description",
"verify_description",
35 changes: 35 additions & 0 deletions je_auto_control/gui/script_builder/command_schema.py
@@ -569,7 +569,42 @@
))


def _add_native_control_specs(specs: List[CommandSpec]) -> None:
    fields = (
        FieldSpec("name", FieldType.STRING, optional=True),
        FieldSpec("role", FieldType.STRING, optional=True),
        FieldSpec("app_name", FieldType.STRING, optional=True),
        FieldSpec("automation_id", FieldType.STRING, optional=True),
    )
    specs.append(CommandSpec(
        "AC_control_get_value", "Native UI", "Get Control Value",
        fields=fields,
        description="Read a native control's value via the accessibility API.",
    ))
    specs.append(CommandSpec(
        "AC_control_set_value", "Native UI", "Set Control Value",
        fields=(FieldSpec("value", FieldType.STRING),) + fields,
        description="Set a native control's value directly (no per-key typing).",
    ))
    specs.append(CommandSpec(
        "AC_control_invoke", "Native UI", "Invoke Control",
        fields=fields,
        description="Invoke a native control (e.g. press a button).",
    ))
    specs.append(CommandSpec(
        "AC_control_toggle", "Native UI", "Toggle Control",
        fields=fields,
        description="Toggle a native control (e.g. a checkbox).",
    ))
    specs.append(CommandSpec(
        "AC_read_table", "Native UI", "Read Table / Grid",
        fields=fields,
        description="Read a grid/table/list control as rows of cell strings.",
    ))


def _add_misc_specs(specs: List[CommandSpec]) -> None:
    _add_native_control_specs(specs)
    specs.append(CommandSpec(
        "AC_shell_command", "Shell", "Shell Command",
        fields=(FieldSpec("shell_command", FieldType.STRING),),
6 changes: 5 additions & 1 deletion je_auto_control/utils/accessibility/__init__.py
@@ -1,8 +1,10 @@
"""Cross-platform accessibility-tree widget location + recording."""
 from je_auto_control.utils.accessibility.accessibility_api import (
     AccessibilityElement, AccessibilityNotAvailableError, AXTreeNode,
-    click_accessibility_element, dump_accessibility_tree,
+    click_accessibility_element, control_get_value, control_invoke,
+    control_set_value, control_toggle, dump_accessibility_tree,
     find_accessibility_element, list_accessibility_elements,
+    read_control_table,
 )
from je_auto_control.utils.accessibility.recorder import (
AXRecorderEvent, AccessibilityRecorder,
@@ -18,4 +20,6 @@
"AXTreeWalker", "click_accessibility_element", "count_nodes",
"dump_accessibility_tree", "find_accessibility_element",
"list_accessibility_elements", "max_depth",
"control_get_value", "control_set_value", "control_invoke",
"control_toggle", "read_control_table",
]