diff --git a/README.md b/README.md index 76dbb0a6..c596c4dd 100644 --- a/README.md +++ b/README.md @@ -13,6 +13,7 @@ ## Table of Contents +- [What's new (2026-06-19) — Office I/O](#whats-new-2026-06-19--office-io) - [What's new (2026-06-19) — Agent Toolkit](#whats-new-2026-06-19--agent-toolkit) - [What's new (2026-06-19) — Authoring & Debugging](#whats-new-2026-06-19--authoring--debugging) - [What's new (2026-06-19) — Test & Tooling Batch](#whats-new-2026-06-19--test--tooling-batch) @@ -66,6 +67,16 @@ --- +## What's new (2026-06-19) — Office I/O + +Headless read/write for Excel/Word/PowerPoint, full stack (facade, `AC_*`, MCP, Script Builder). Optional extra: `pip install je_auto_control[office]`. Full reference: [`docs/source/Eng/doc/new_features/v14_features_doc.rst`](docs/source/Eng/doc/new_features/v14_features_doc.rst). + +- **Excel** — `read_workbook` / `write_workbook` (`AC_read_workbook` / `AC_write_workbook`, `ac_read_workbook` / `ac_write_workbook`): read an `.xlsx` worksheet into row dicts (first row = keys) and write rows back, no GUI. +- **Word** — `read_document` / `write_document` (`AC_read_document` / `AC_write_document`): read/write `.docx` paragraphs. +- **PowerPoint** — `read_presentation` / `write_presentation` (`AC_read_presentation` / `AC_write_presentation`): read per-slide text; write slides as `{title, body:[...]}`. + +The backing libraries (`openpyxl`/`python-docx`/`python-pptx`) are optional — each call raises a clear error if missing, and `import je_auto_control` pulls none of them. + ## What's new (2026-06-19) — Agent Toolkit Three pure-stdlib tools for LLM/agent-driven automation, full stack (facade, `AC_*`, MCP, Script Builder). Full reference: [`docs/source/Eng/doc/new_features/v13_features_doc.rst`](docs/source/Eng/doc/new_features/v13_features_doc.rst). 
diff --git a/README/README_zh-CN.md b/README/README_zh-CN.md index 22c0a8bf..a81d8af9 100644 --- a/README/README_zh-CN.md +++ b/README/README_zh-CN.md @@ -12,6 +12,7 @@ ## 目录 +- [本次更新 (2026-06-19) — Office 读写](#本次更新-2026-06-19--office-读写) - [本次更新 (2026-06-19) — Agent 工具组](#本次更新-2026-06-19--agent-工具组) - [本次更新 (2026-06-19) — 编写与调试](#本次更新-2026-06-19--编写与调试) - [本次更新 (2026-06-19) — 测试与工具三件套](#本次更新-2026-06-19--测试与工具三件套) @@ -65,6 +66,16 @@ --- +## 本次更新 (2026-06-19) — Office 读写 + +Excel/Word/PowerPoint 的 headless 读写,走完整五层(facade、`AC_*`、MCP、Script Builder)。可选 extra:`pip install je_auto_control[office]`。完整参考:[`docs/source/Zh/doc/new_features/v14_features_doc.rst`](../docs/source/Zh/doc/new_features/v14_features_doc.rst)。 + +- **Excel** — `read_workbook` / `write_workbook`(`AC_read_workbook` / `AC_write_workbook`、`ac_read_workbook` / `ac_write_workbook`):把 `.xlsx` 工作表读成数据行字典(第一行为键)并写回,不需 GUI。 +- **Word** — `read_document` / `write_document`(`AC_read_document` / `AC_write_document`):读写 `.docx` 段落。 +- **PowerPoint** — `read_presentation` / `write_presentation`(`AC_read_presentation` / `AC_write_presentation`):读取每张幻灯片文本;以 `{title, body:[...]}` 写入幻灯片。 + +背后函式库(`openpyxl`/`python-docx`/`python-pptx`)为可选——缺少时每个调用会抛出清楚错误,且 `import je_auto_control` 不会载入它们。 + ## 本次更新 (2026-06-19) — Agent 工具组 三项供 LLM / agent 驱动自动化使用的纯标准库工具,走完整五层(facade、`AC_*`、MCP、Script Builder)。完整参考:[`docs/source/Zh/doc/new_features/v13_features_doc.rst`](../docs/source/Zh/doc/new_features/v13_features_doc.rst)。 diff --git a/README/README_zh-TW.md b/README/README_zh-TW.md index df17b00b..f5e0ad8e 100644 --- a/README/README_zh-TW.md +++ b/README/README_zh-TW.md @@ -12,6 +12,7 @@ ## 目錄 +- [本次更新 (2026-06-19) — Office 讀寫](#本次更新-2026-06-19--office-讀寫) - [本次更新 (2026-06-19) — Agent 工具組](#本次更新-2026-06-19--agent-工具組) - [本次更新 (2026-06-19) — 編寫與除錯](#本次更新-2026-06-19--編寫與除錯) - [本次更新 (2026-06-19) — 測試與工具三件套](#本次更新-2026-06-19--測試與工具三件套) @@ -65,6 +66,16 @@ --- +## 本次更新 (2026-06-19) — Office 讀寫 + +Excel/Word/PowerPoint 的 headless 
讀寫,走完整五層(facade、`AC_*`、MCP、Script Builder)。可選 extra:`pip install je_auto_control[office]`。完整參考:[`docs/source/Zh/doc/new_features/v14_features_doc.rst`](../docs/source/Zh/doc/new_features/v14_features_doc.rst)。 + +- **Excel** — `read_workbook` / `write_workbook`(`AC_read_workbook` / `AC_write_workbook`、`ac_read_workbook` / `ac_write_workbook`):把 `.xlsx` 工作表讀成資料列字典(第一列為鍵)並寫回,不需 GUI。 +- **Word** — `read_document` / `write_document`(`AC_read_document` / `AC_write_document`):讀寫 `.docx` 段落。 +- **PowerPoint** — `read_presentation` / `write_presentation`(`AC_read_presentation` / `AC_write_presentation`):讀取每張投影片文字;以 `{title, body:[...]}` 寫入投影片。 + +背後函式庫(`openpyxl`/`python-docx`/`python-pptx`)為可選——缺少時每個呼叫會丟出清楚錯誤,且 `import je_auto_control` 不會載入它們。 + ## 本次更新 (2026-06-19) — Agent 工具組 三項供 LLM / agent 驅動自動化使用的純標準庫工具,走完整五層(facade、`AC_*`、MCP、Script Builder)。完整參考:[`docs/source/Zh/doc/new_features/v13_features_doc.rst`](../docs/source/Zh/doc/new_features/v13_features_doc.rst)。 diff --git a/dev_requirements.txt b/dev_requirements.txt index 25376a11..8ac5598d 100644 --- a/dev_requirements.txt +++ b/dev_requirements.txt @@ -9,6 +9,11 @@ qt-material==2.17 mss==10.2.0 defusedxml==0.7.1 +# Office I/O ([office] extra) — exercised by the headless Office tests. +openpyxl==3.1.5 +python-docx==1.2.0 +python-pptx==1.0.2 + # Quality tooling — used by .github/workflows/quality.yml and locally. ruff==0.15.14 bandit==1.9.4 diff --git a/docs/source/Eng/doc/new_features/v14_features_doc.rst b/docs/source/Eng/doc/new_features/v14_features_doc.rst new file mode 100644 index 00000000..b47cb329 --- /dev/null +++ b/docs/source/Eng/doc/new_features/v14_features_doc.rst @@ -0,0 +1,65 @@ +============================================ +New Features (2026-06-19) — Office I/O +============================================ + +Headless read/write for Office documents — Excel (``.xlsx``), Word +(``.docx``), and PowerPoint (``.pptx``) — so flows can ingest a row-set or +emit a report without driving the GUI. 
Wired through the full stack +(facade, ``AC_*`` executor commands, MCP tools, Script Builder). + +The backing libraries (``openpyxl`` / ``python-docx`` / ``python-pptx``) +are an **optional** dependency:: + + pip install je_auto_control[office] + +Each function raises a clear ``RuntimeError`` if its library is missing, +so the core package stays lean and ``import je_auto_control`` pulls none of +them. + +.. contents:: + :local: + :depth: 2 + + +Excel +===== + +:: + + from je_auto_control import read_workbook, write_workbook + + write_workbook("people.xlsx", [{"name": "Ada", "age": 36}], sheet="P") + rows = read_workbook("people.xlsx", sheet="P") # [{'name': 'Ada', ...}] + +The first row supplies the dict keys; ``sheet`` defaults to the active +sheet. Commands: ``AC_read_workbook`` / ``AC_write_workbook`` (and +``ac_read_workbook`` / ``ac_write_workbook``). + + +Word +==== + +:: + + from je_auto_control import read_document, write_document + + write_document("report.docx", ["Title", "First line", "Second line"]) + paragraphs = read_document("report.docx")["paragraphs"] + +Commands: ``AC_read_document`` / ``AC_write_document``. + + +PowerPoint +========== + +:: + + from je_auto_control import read_presentation, write_presentation + + write_presentation("deck.pptx", [ + {"title": "Intro", "body": ["bullet one", "bullet two"]}, + ]) + slides = read_presentation("deck.pptx")["slides"] # per-slide text runs + +Each slide spec is ``{title, body:[...]}`` on a "Title and Content" +layout. Commands: ``AC_read_presentation`` / ``AC_write_presentation``. diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst index acc430f5..1dee96bd 100644 --- a/docs/source/Eng/eng_index.rst +++ b/docs/source/Eng/eng_index.rst @@ -36,6 +36,7 @@ Comprehensive guides for all AutoControl features. 
doc/new_features/v11_features_doc doc/new_features/v12_features_doc doc/new_features/v13_features_doc + doc/new_features/v14_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/docs/source/Zh/doc/new_features/v14_features_doc.rst b/docs/source/Zh/doc/new_features/v14_features_doc.rst new file mode 100644 index 00000000..e8524d15 --- /dev/null +++ b/docs/source/Zh/doc/new_features/v14_features_doc.rst @@ -0,0 +1,63 @@ +==================================== +新功能 (2026-06-19) — Office 讀寫 +==================================== + +Office 文件的 headless 讀寫——Excel(``.xlsx``)、Word(``.docx``)、 +PowerPoint(``.pptx``)——讓流程不必驅動 GUI 就能吃進資料列或產出報表。 +走完整五層(facade、``AC_*`` 執行器指令、MCP 工具、Script Builder)。 + +背後的函式庫(``openpyxl`` / ``python-docx`` / ``python-pptx``)是 +**可選**相依:: + + pip install je_auto_control[office] + +缺少對應函式庫時,每個函式都會丟出清楚的 ``RuntimeError``,因此核心 +套件維持精簡,``import je_auto_control`` 不會載入任何一個。 + +.. contents:: + :local: + :depth: 2 + + +Excel +===== + +:: + + from je_auto_control import read_workbook, write_workbook + + write_workbook("people.xlsx", [{"name": "Ada", "age": 36}], sheet="P") + rows = read_workbook("people.xlsx", sheet="P") # [{'name': 'Ada', ...}] + +第一列作為 dict 的鍵;``sheet`` 預設為作用中工作表。指令: +``AC_read_workbook`` / ``AC_write_workbook``(以及 ``ac_read_workbook`` / +``ac_write_workbook``)。 + + +Word +==== + +:: + + from je_auto_control import read_document, write_document + + write_document("report.docx", ["標題", "第一行", "第二行"]) + paragraphs = read_document("report.docx")["paragraphs"] + +指令:``AC_read_document`` / ``AC_write_document``。 + + +PowerPoint +========== + +:: + + from je_auto_control import read_presentation, write_presentation + + write_presentation("deck.pptx", [ + {"title": "簡介", "body": ["重點一", "重點二"]}, + ]) + slides = read_presentation("deck.pptx")["slides"] # 每張投影片的文字 + +每張投影片規格為 ``{title, body:[...]}``,採用「標題及內容」版面。指令: +``AC_read_presentation`` / 
``AC_write_presentation``。 diff --git a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst index 80d9273a..2a98d7b9 100644 --- a/docs/source/Zh/zh_index.rst +++ b/docs/source/Zh/zh_index.rst @@ -36,6 +36,7 @@ AutoControl 所有功能的完整使用指南。 doc/new_features/v11_features_doc doc/new_features/v12_features_doc doc/new_features/v13_features_doc + doc/new_features/v14_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index 1336a438..1d5251e5 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -133,6 +133,11 @@ ) # A2A (agent-to-agent) agent card from je_auto_control.utils.a2a import build_agent_card, write_agent_card +# Headless Office I/O (optional [office] extra: openpyxl/python-docx/pptx) +from je_auto_control.utils.office import ( + read_document, read_presentation, read_workbook, + write_document, write_presentation, write_workbook, +) # Background popup/interrupt watchdog (unattended automation) from je_auto_control.utils.watchdog import ( PopupWatchdog, WatchdogRule, default_popup_watchdog, @@ -542,6 +547,9 @@ def start_autocontrol_gui(*args, **kwargs): "Skill", "SkillLibrary", "assess_text", "redact_text", "scan_text", "build_agent_card", "write_agent_card", + "read_workbook", "write_workbook", + "read_document", "write_document", + "read_presentation", "write_presentation", # MCP server "AuditLogger", "HttpMCPServer", "MCPContent", "MCPPrompt", "MCPPromptArgument", "MCPResource", "MCPServer", "MCPTool", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index d1e1ae15..516237b6 100644 --- a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -656,6 +656,42 @@ def _add_misc_specs(specs: List[CommandSpec]) -> None: _add_tooling_specs(specs) 
_add_authoring_specs(specs) _add_agent_specs(specs) + _add_office_specs(specs) + + +def _add_office_specs(specs: List[CommandSpec]) -> None: + xlsx = FieldSpec("path", FieldType.FILE_PATH) + specs.append(CommandSpec( + "AC_read_workbook", "Office", "Excel: Read Workbook", + fields=(xlsx, FieldSpec("sheet", FieldType.STRING, optional=True)), + description="Read an .xlsx worksheet into rows (needs [office] extra).", + )) + specs.append(CommandSpec( + "AC_write_workbook", "Office", "Excel: Write Workbook", + fields=(xlsx, FieldSpec("sheet", FieldType.STRING, optional=True, + default="Sheet1")), + description="Write 'rows' (JSON view) to an .xlsx file.", + )) + specs.append(CommandSpec( + "AC_read_document", "Office", "Word: Read Document", + fields=(xlsx,), + description="Read a .docx file's paragraphs (needs [office] extra).", + )) + specs.append(CommandSpec( + "AC_write_document", "Office", "Word: Write Document", + fields=(xlsx,), + description="Write 'paragraphs' (JSON view) to a .docx file.", + )) + specs.append(CommandSpec( + "AC_read_presentation", "Office", "PowerPoint: Read", + fields=(xlsx,), + description="Read a .pptx file's per-slide text (needs [office]).", + )) + specs.append(CommandSpec( + "AC_write_presentation", "Office", "PowerPoint: Write", + fields=(xlsx,), + description="Write 'slides' (JSON view) to a .pptx file.", + )) def _add_authoring_specs(specs: List[CommandSpec]) -> None: diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index dd40bde0..6228c98a 100644 --- a/je_auto_control/utils/executor/action_executor.py +++ b/je_auto_control/utils/executor/action_executor.py @@ -2475,6 +2475,43 @@ def _agent_card(path: Optional[str] = None) -> Dict[str, Any]: return {"card": build_agent_card()} +def _read_workbook(path: str, sheet: str = "") -> Dict[str, Any]: + """Adapter: read an .xlsx worksheet into rows.""" + from je_auto_control.utils.office import read_workbook + return {"rows": 
read_workbook(path, sheet=sheet)} + + +def _write_workbook(path: str, rows: List[Dict[str, Any]], + sheet: str = "Sheet1") -> Dict[str, Any]: + """Adapter: write rows to an .xlsx file.""" + from je_auto_control.utils.office import write_workbook + return {"path": write_workbook(path, rows, sheet=sheet)} + + +def _read_document(path: str) -> Dict[str, Any]: + """Adapter: read a .docx file's paragraphs.""" + from je_auto_control.utils.office import read_document + return read_document(path) + + +def _write_document(path: str, paragraphs: List[str]) -> Dict[str, Any]: + """Adapter: write paragraphs to a .docx file.""" + from je_auto_control.utils.office import write_document + return {"path": write_document(path, paragraphs)} + + +def _read_presentation(path: str) -> Dict[str, Any]: + """Adapter: read a .pptx file's per-slide text.""" + from je_auto_control.utils.office import read_presentation + return read_presentation(path) + + +def _write_presentation(path: str, slides: List[Any]) -> Dict[str, Any]: + """Adapter: write slides to a .pptx file.""" + from je_auto_control.utils.office import write_presentation + return {"path": write_presentation(path, slides)} + + class Executor: """ Executor @@ -2655,6 +2692,12 @@ def __init__(self): "AC_skill_search": _skill_search, "AC_guard_text": _guard_text, "AC_agent_card": _agent_card, + "AC_read_workbook": _read_workbook, + "AC_write_workbook": _write_workbook, + "AC_read_document": _read_document, + "AC_write_document": _write_document, + "AC_read_presentation": _read_presentation, + "AC_write_presentation": _write_presentation, "AC_a11y_record_start": _a11y_record_start, "AC_a11y_record_stop": _a11y_record_stop, "AC_a11y_record_events": _a11y_record_events, diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index e59fe9fd..7e32a4ac 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py 
@@ -1941,6 +1941,69 @@ def a2a_tools() -> List[MCPTool]: ] +def office_tools() -> List[MCPTool]: + _P = {"path": {"type": "string"}} + return [ + MCPTool( + name="ac_read_workbook", + description=("Read an Excel (.xlsx) worksheet into rows (first row " + "= keys). 'sheet' defaults to the active sheet. " + "Requires the [office] extra."), + input_schema=schema({"sheet": {"type": "string"}, **_P}, + required=["path"]), + handler=h.read_workbook, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_write_workbook", + description=("Write rows (list of objects) to an Excel (.xlsx) " + "file. Requires the [office] extra."), + input_schema=schema({ + "rows": {"type": "array", "items": {"type": "object"}}, + "sheet": {"type": "string"}, **_P}, + required=["path", "rows"]), + handler=h.write_workbook, + annotations=SIDE_EFFECT_ONLY, + ), + MCPTool( + name="ac_read_document", + description=("Read a Word (.docx) file's paragraph texts. " + "Requires the [office] extra."), + input_schema=schema(dict(_P), required=["path"]), + handler=h.read_document, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_write_document", + description=("Write paragraphs (list of strings) to a Word " + "(.docx) file. Requires the [office] extra."), + input_schema=schema({ + "paragraphs": {"type": "array", "items": {"type": "string"}}, + **_P}, required=["path", "paragraphs"]), + handler=h.write_document, + annotations=SIDE_EFFECT_ONLY, + ), + MCPTool( + name="ac_read_presentation", + description=("Read a PowerPoint (.pptx) file's per-slide text. " + "Requires the [office] extra."), + input_schema=schema(dict(_P), required=["path"]), + handler=h.read_presentation, + annotations=READ_ONLY, + ), + MCPTool( + name="ac_write_presentation", + description=("Write slides (each {title, body:[...]}) to a " + "PowerPoint (.pptx) file. 
Requires the [office] " + "extra."), + input_schema=schema({"slides": {"type": "array"}, **_P}, + required=["path", "slides"]), + handler=h.write_presentation, + annotations=SIDE_EFFECT_ONLY, + ), + ] + + def unattended_tools() -> List[MCPTool]: return [ MCPTool( @@ -2973,7 +3036,7 @@ def media_assert_tools() -> List[MCPTool]: unattended_tools, work_queue_tools, synthetic_data_tools, mcp_registry_tools, test_selection_tools, element_repository_tools, flow_debugger_tools, - skill_library_tools, guardrail_tools, a2a_tools, + skill_library_tools, guardrail_tools, a2a_tools, office_tools, screen_record_tools, process_and_shell_tools, remote_desktop_tools, gamepad_tools, usb_passthrough_tools, assertion_tools, data_source_tools, diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index 223459da..1466e0ae 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -927,6 +927,36 @@ def agent_card(path=None): return {"card": build_agent_card()} +def read_workbook(path, sheet=""): + from je_auto_control.utils.office import read_workbook as _read + return {"rows": _read(path, sheet=sheet)} + + +def write_workbook(path, rows, sheet="Sheet1"): + from je_auto_control.utils.office import write_workbook as _write + return {"path": _write(path, rows, sheet=sheet)} + + +def read_document(path): + from je_auto_control.utils.office import read_document as _read + return _read(path) + + +def write_document(path, paragraphs): + from je_auto_control.utils.office import write_document as _write + return {"path": _write(path, paragraphs)} + + +def read_presentation(path): + from je_auto_control.utils.office import read_presentation as _read + return _read(path) + + +def write_presentation(path, slides): + from je_auto_control.utils.office import write_presentation as _write + return {"path": _write(path, slides)} + + def vlm_locate(description: str, 
screen_region: Optional[List[int]] = None, model: Optional[str] = None) -> Optional[List[int]]: diff --git a/je_auto_control/utils/office/__init__.py b/je_auto_control/utils/office/__init__.py new file mode 100644 index 00000000..3a5b7932 --- /dev/null +++ b/je_auto_control/utils/office/__init__.py @@ -0,0 +1,11 @@ +"""Headless read/write for Office documents (Excel / Word / PowerPoint).""" +from je_auto_control.utils.office.office import ( + read_document, read_presentation, read_workbook, + write_document, write_presentation, write_workbook, +) + +__all__ = [ + "read_workbook", "write_workbook", + "read_document", "write_document", + "read_presentation", "write_presentation", +] diff --git a/je_auto_control/utils/office/office.py b/je_auto_control/utils/office/office.py new file mode 100644 index 00000000..1ca164bf --- /dev/null +++ b/je_auto_control/utils/office/office.py @@ -0,0 +1,151 @@ +"""Headless read/write for Office documents (Excel / Word / PowerPoint). + +A top automation need is reading and writing spreadsheets and documents +without driving the GUI. This module wraps the de-facto libraries +(``openpyxl`` / ``python-docx`` / ``python-pptx``) behind a small, +serialisable API so flows can ingest an ``.xlsx`` row-set or emit a +``.docx`` report headlessly. + +Those libraries are an **optional** dependency: install them with +``pip install je_auto_control[office]``. Each function raises a clear +:class:`RuntimeError` when the backing library is missing, so the core +package stays lean and import-time stays Qt-free / dependency-free. 
+""" +from pathlib import Path +from typing import Any, Dict, List + +_HINT = "pip install je_auto_control[office]" + + +def _openpyxl() -> Any: + """Import openpyxl (optional Excel backend) or raise a helpful error.""" + try: + import openpyxl + except ImportError as error: + raise RuntimeError(f"Excel I/O requires openpyxl ({_HINT}).") from error + return openpyxl + + +def _docx() -> Any: + """Import python-docx (optional Word backend) or raise a helpful error.""" + try: + import docx + except ImportError as error: + raise RuntimeError( + f"Word I/O requires python-docx ({_HINT}).") from error + return docx + + +def _pptx() -> Any: + """Import python-pptx (optional PPT backend) or raise a helpful error.""" + try: + import pptx + except ImportError as error: + raise RuntimeError( + f"PowerPoint I/O requires python-pptx ({_HINT}).") from error + return pptx + + +def _existing(path: str) -> Path: + resolved = Path(path).expanduser() + if not resolved.is_file(): + raise FileNotFoundError(f"no such file: {resolved}") + return resolved + + +# --- Excel (.xlsx) -------------------------------------------------------- + +def read_workbook(path: str, sheet: str = "") -> List[Dict[str, Any]]: + """Read a worksheet into a list of dicts (first row supplies the keys). + + ``sheet`` defaults to the active sheet. 
+ """ + openpyxl = _openpyxl() + workbook = openpyxl.load_workbook(filename=str(_existing(path)), + read_only=True, data_only=True) + try: + worksheet = workbook[sheet] if sheet else workbook.active + rows_iter = worksheet.iter_rows(values_only=True) + header = next(rows_iter, None) + if header is None: + return [] + keys = [str(cell) for cell in header] + return [dict(zip(keys, values)) for values in rows_iter] + finally: + workbook.close() + + +def write_workbook(path: str, rows: List[Dict[str, Any]], + sheet: str = "Sheet1") -> str: + """Write ``rows`` (list of dicts) to an ``.xlsx`` file; return the path.""" + openpyxl = _openpyxl() + workbook = openpyxl.Workbook() + worksheet = workbook.active + worksheet.title = sheet + rows = list(rows) + if rows: + keys = list(rows[0].keys()) + worksheet.append(keys) + for row in rows: + worksheet.append([row.get(key) for key in keys]) + target = Path(path).expanduser() + workbook.save(str(target)) + return str(target.resolve()) + + +# --- Word (.docx) --------------------------------------------------------- + +def read_document(path: str) -> Dict[str, List[str]]: + """Read a ``.docx`` file's paragraph texts.""" + docx = _docx() + document = docx.Document(str(_existing(path))) + return {"paragraphs": [para.text for para in document.paragraphs]} + + +def write_document(path: str, paragraphs: List[str]) -> str: + """Write ``paragraphs`` to a ``.docx`` file; return the path.""" + docx = _docx() + document = docx.Document() + for paragraph in paragraphs: + document.add_paragraph(str(paragraph)) + target = Path(path).expanduser() + document.save(str(target)) + return str(target.resolve()) + + +# --- PowerPoint (.pptx) --------------------------------------------------- + +def read_presentation(path: str) -> Dict[str, List[List[str]]]: + """Read a ``.pptx`` file's text: one list of shape texts per slide.""" + pptx = _pptx() + presentation = pptx.Presentation(str(_existing(path))) + slides = [] + for slide in presentation.slides: +
slides.append([shape.text for shape in slide.shapes + if shape.has_text_frame and shape.text]) + return {"slides": slides} + + +def _add_slide(presentation: Any, layout: Any, spec: Any) -> None: + slide = presentation.slides.add_slide(layout) + title = spec.get("title", "") if isinstance(spec, dict) else str(spec) + body = spec.get("body", []) if isinstance(spec, dict) else [] + if slide.shapes.title is not None: + slide.shapes.title.text = str(title) + if body: + frame = slide.placeholders[1].text_frame + frame.text = str(body[0]) + for line in body[1:]: + frame.add_paragraph().text = str(line) + + +def write_presentation(path: str, slides: List[Any]) -> str: + """Write ``slides`` (each ``{title, body:[...]}``) to a ``.pptx`` file.""" + pptx = _pptx() + presentation = pptx.Presentation() + layout = presentation.slide_layouts[1] # "Title and Content" + for spec in slides: + _add_slide(presentation, layout, spec) + target = Path(path).expanduser() + presentation.save(str(target)) + return str(target.resolve()) diff --git a/pyproject.toml b/pyproject.toml index adae9052..2db13ee3 100644 --- a/pyproject.toml +++ b/pyproject.toml @@ -70,6 +70,7 @@ webrtc = ["aiortc>=1.14.0", "av>=14.0.0"] signaling = ["fastapi>=0.115", "uvicorn>=0.32"] discovery = ["zeroconf>=0.130"] pdf = ["pypdf>=4.0"] +office = ["openpyxl>=3.1", "python-docx>=1.1", "python-pptx>=0.6"] [tool.bandit] exclude_dirs = [ diff --git a/test/unit_test/headless/test_office_batch.py b/test/unit_test/headless/test_office_batch.py new file mode 100644 index 00000000..21a64140 --- /dev/null +++ b/test/unit_test/headless/test_office_batch.py @@ -0,0 +1,82 @@ +"""Headless tests for Office I/O (Excel / Word / PowerPoint). 
+ +The document round-trips require the optional [office] extra +(openpyxl / python-docx / python-pptx) and skip when it is missing; the +wiring/facade tests always run (they only check registration).""" +import pytest + +import je_auto_control as ac +from je_auto_control.utils.office import ( + read_document, read_presentation, read_workbook, + write_document, write_presentation, write_workbook) + + +# --- document round-trips (need the [office] extra) ---------------------- + +def test_excel_roundtrip(tmp_path): + pytest.importorskip("openpyxl") + path = str(tmp_path / "data.xlsx") + rows = [{"name": "Ada", "age": 36}, {"name": "Bo", "age": 41}] + write_workbook(path, rows, sheet="People") + loaded = read_workbook(path, sheet="People") + assert loaded == rows + + +def test_word_roundtrip(tmp_path): + pytest.importorskip("docx") + path = str(tmp_path / "doc.docx") + paragraphs = ["Title line", "Body one", "Body two"] + write_document(path, paragraphs) + assert read_document(path)["paragraphs"] == paragraphs + + +def test_powerpoint_roundtrip(tmp_path): + pytest.importorskip("pptx") + path = str(tmp_path / "deck.pptx") + write_presentation(path, [{"title": "Intro", "body": ["alpha", "beta"]}]) + slides = read_presentation(path)["slides"] + flat = " ".join(slides[0]) + assert "Intro" in flat and "alpha" in flat and "beta" in flat + + +def test_read_missing_file_raises(): + pytest.importorskip("openpyxl") + with pytest.raises(FileNotFoundError): + read_workbook("does-not-exist-12345.xlsx") + + +def test_executor_roundtrip(tmp_path): + pytest.importorskip("openpyxl") + path = str(tmp_path / "e.xlsx") + ac.execute_action([["AC_write_workbook", { + "path": path, "rows": [{"a": 1, "b": 2}]}]]) + rec = ac.execute_action([["AC_read_workbook", {"path": path}]]) + assert any("'a': 1" in str(v) for v in rec.values()) + + +# --- wiring (always runs) ------------------------------------------------- + +def test_command_wiring(): + known = ac.executor.known_commands() + assert
{"AC_read_workbook", "AC_write_workbook", "AC_read_document", + "AC_write_document", "AC_read_presentation", + "AC_write_presentation"} <= known + from je_auto_control.utils.mcp_server.tools import ( + build_default_tool_registry) + names = {t.name for t in build_default_tool_registry()} + assert {"ac_read_workbook", "ac_write_workbook", "ac_read_document", + "ac_write_document", "ac_read_presentation", + "ac_write_presentation"} <= names + from je_auto_control.gui.script_builder.command_schema import _build_specs + cmds = {s.command for s in _build_specs()} + assert {"AC_read_workbook", "AC_write_workbook", "AC_read_document", + "AC_write_document", "AC_read_presentation", + "AC_write_presentation"} <= cmds + + +def test_facade_exports(): + for attr in ("read_workbook", "write_workbook", "read_document", + "write_document", "read_presentation", + "write_presentation"): + assert hasattr(ac, attr) + assert attr in ac.__all__
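
Reviewer note: the two ideas this change leans on (an optional backend imported lazily and reported with a clear RuntimeError, and a first row that supplies the dict keys) can be sketched without any Office library installed. This is a minimal illustration, not the code in office.py: the helper names `require` and `rows_to_dicts` are hypothetical, and only the install hint string and the header-row logic are taken from the diff above.

```python
import importlib
from typing import Any, Dict, Iterable, List, Sequence

# Install hint copied from the diff; everything else here is illustrative.
HINT = "pip install je_auto_control[office]"


def require(module_name: str, feature: str) -> Any:
    """Hypothetical helper: import an optional backend or raise RuntimeError."""
    try:
        return importlib.import_module(module_name)
    except ImportError as error:
        # Chain the original error so the traceback still shows what was missing.
        raise RuntimeError(
            f"{feature} requires {module_name} ({HINT}).") from error


def rows_to_dicts(rows: Iterable[Sequence[Any]]) -> List[Dict[str, Any]]:
    """Hypothetical helper: the first row supplies the keys for later rows."""
    rows_iter = iter(rows)
    header = next(rows_iter, None)
    if header is None:
        return []  # an empty sheet yields no rows
    keys = [str(cell) for cell in header]
    return [dict(zip(keys, values)) for values in rows_iter]


if __name__ == "__main__":
    print(rows_to_dicts([("name", "age"), ("Ada", 36)]))
    try:
        require("no_such_office_backend", "Demo I/O")
    except RuntimeError as error:
        print(error)
```

Because the import happens inside the helper rather than at module top level, importing the package itself never touches the optional libraries, which is the property the facade test relies on.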