diff --git a/README.md b/README.md index b818fbdb..9f8d3ac1 100644 --- a/README.md +++ b/README.md @@ -13,6 +13,7 @@ ## Table of Contents +- [What's new (2026-06-21) — Chaos Experiments](#whats-new-2026-06-21--chaos-experiments) - [What's new (2026-06-21) — JSON Contract & Snapshot Matching](#whats-new-2026-06-21--json-contract--snapshot-matching) - [What's new (2026-06-21) — SLSA Build Provenance](#whats-new-2026-06-21--slsa-build-provenance) - [What's new (2026-06-21) — Feature Flags](#whats-new-2026-06-21--feature-flags) @@ -123,6 +124,12 @@ --- +## What's new (2026-06-21) — Chaos Experiments + +Inject faults, verify the system holds. Full reference: [`docs/source/Eng/doc/new_features/v71_features_doc.rst`](docs/source/Eng/doc/new_features/v71_features_doc.rst). + +- **`ChaosExperiment` / `run_experiment` / `Probe` / `latency_fault` / `exception_fault`** (`AC_run_chaos`): `resilience` *recovers* from failures; this *causes* them and checks a steady-state hypothesis still holds (Chaos Toolkit lifecycle — verify before, inject faults, verify after, roll back LIFO). Probes/faults/rollbacks are callables; the clock/RNG/sleep are injectable so experiments run **deterministically** in tests with no real failures or sleeping. `AC_run_chaos` drives an action-list spec. Pure-stdlib. + ## What's new (2026-06-21) — JSON Contract & Snapshot Matching Match, diff and snapshot JSON payloads. Full reference: [`docs/source/Eng/doc/new_features/v70_features_doc.rst`](docs/source/Eng/doc/new_features/v70_features_doc.rst). 
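Review note on the README claim that the RNG and sleep are injectable so experiments run deterministically: the sketch below illustrates that property in isolation. It does not import `je_auto_control`. `make_latency_fault` and `injection_pattern` are hypothetical stand-ins written for this note, and they only mirror the sampling logic of the `latency_fault` factory added later in this diff.

```python
import random
import time
from typing import Callable, Dict, List, Optional


def make_latency_fault(delay_s: float, rate: float,
                       rng: Optional[random.Random] = None,
                       sleep: Callable[[float], None] = time.sleep
                       ) -> Callable[[], Dict[str, object]]:
    """Hypothetical stand-in: delay by ``delay_s`` with probability ``rate``."""
    generator = rng or random.Random()

    def apply() -> Dict[str, object]:
        # random() is in [0, 1), so rate=1.0 always injects and rate=0.0 never does.
        if generator.random() < rate:
            sleep(delay_s)
            return {"injected": "latency", "delay_s": delay_s}
        return {"injected": None}

    return apply


def injection_pattern(seed: int, runs: int = 8) -> List[bool]:
    """Record which applications injected, using a seeded RNG and a fake sleep."""
    slept: List[float] = []  # the fake sleep only records the requested delay
    apply = make_latency_fault(2.0, 0.5, rng=random.Random(seed),
                               sleep=slept.append)
    pattern = [apply()["injected"] == "latency" for _ in range(runs)]
    assert len(slept) == sum(pattern)  # one fake sleep per injection
    return pattern


# Same seed, same injection pattern, and no real sleeping takes place.
print(injection_pattern(seed=42) == injection_pattern(seed=42))
```

Passing a seeded `random.Random` keeps a partial `rate` reproducible across test runs, while the recording sleep lets a test assert on the requested delays without waiting.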
diff --git a/README/README_zh-CN.md b/README/README_zh-CN.md index 2c16ab9e..719c86c8 100644 --- a/README/README_zh-CN.md +++ b/README/README_zh-CN.md @@ -12,6 +12,7 @@ ## 目录 +- [本次更新 (2026-06-21) — 混沌实验](#本次更新-2026-06-21--混沌实验) - [本次更新 (2026-06-21) — JSON 合约与快照比对](#本次更新-2026-06-21--json-合约与快照比对) - [本次更新 (2026-06-21) — SLSA 构建来源证明](#本次更新-2026-06-21--slsa-构建来源证明) - [本次更新 (2026-06-21) — 功能旗标](#本次更新-2026-06-21--功能旗标) @@ -122,6 +123,12 @@ --- +## 本次更新 (2026-06-21) — 混沌实验 + +注入故障、验证系统仍成立。完整参考:[`docs/source/Zh/doc/new_features/v71_features_doc.rst`](../docs/source/Zh/doc/new_features/v71_features_doc.rst)。 + +- **`ChaosExperiment` / `run_experiment` / `Probe` / `latency_fault` / `exception_fault`**(`AC_run_chaos`):`resilience` 从失败中*恢复*;这则*制造*失败并检查稳态假设是否仍成立(Chaos Toolkit 生命周期 —— 之前验证、注入故障、之后验证、LIFO 回滚)。探针/故障/回滚皆为 callable;时钟/RNG/sleep 可注入,因此实验在测试中**确定地**执行,无真正失败或睡眠。`AC_run_chaos` 以动作列表 spec 驱动。纯标准库。 + ## 本次更新 (2026-06-21) — JSON 合约与快照比对 比对、取差异与快照 JSON 内容。完整参考:[`docs/source/Zh/doc/new_features/v70_features_doc.rst`](../docs/source/Zh/doc/new_features/v70_features_doc.rst)。 diff --git a/README/README_zh-TW.md b/README/README_zh-TW.md index 190f9451..61925780 100644 --- a/README/README_zh-TW.md +++ b/README/README_zh-TW.md @@ -12,6 +12,7 @@ ## 目錄 +- [本次更新 (2026-06-21) — 混沌實驗](#本次更新-2026-06-21--混沌實驗) - [本次更新 (2026-06-21) — JSON 合約與快照比對](#本次更新-2026-06-21--json-合約與快照比對) - [本次更新 (2026-06-21) — SLSA 建置來源證明](#本次更新-2026-06-21--slsa-建置來源證明) - [本次更新 (2026-06-21) — 功能旗標](#本次更新-2026-06-21--功能旗標) @@ -122,6 +123,12 @@ --- +## 本次更新 (2026-06-21) — 混沌實驗 + +注入故障、驗證系統仍成立。完整參考:[`docs/source/Zh/doc/new_features/v71_features_doc.rst`](../docs/source/Zh/doc/new_features/v71_features_doc.rst)。 + +- **`ChaosExperiment` / `run_experiment` / `Probe` / `latency_fault` / `exception_fault`**(`AC_run_chaos`):`resilience` 從失敗中*復原*;這則*製造*失敗並檢查穩態假設是否仍成立(Chaos Toolkit 生命週期 —— 之前驗證、注入故障、之後驗證、LIFO 回滾)。探針/故障/回滾皆為 callable;時鐘/RNG/sleep 可注入,因此實驗在測試中**具決定性**地執行,無真正失敗或睡眠。`AC_run_chaos` 以動作清單 spec 驅動。純標準函式庫。 + ## 
本次更新 (2026-06-21) — JSON 合約與快照比對 比對、取差異與快照 JSON 內容。完整參考:[`docs/source/Zh/doc/new_features/v70_features_doc.rst`](../docs/source/Zh/doc/new_features/v70_features_doc.rst)。 diff --git a/docs/source/Eng/doc/new_features/v71_features_doc.rst b/docs/source/Eng/doc/new_features/v71_features_doc.rst new file mode 100644 index 00000000..01431ae5 --- /dev/null +++ b/docs/source/Eng/doc/new_features/v71_features_doc.rst @@ -0,0 +1,51 @@ +Chaos Experiments +================= + +``resilience`` *recovers* from failures (retry, circuit breaker); this is the +inverse — it *causes* controlled failures and checks that a steady-state +hypothesis still holds. Modelled on the Chaos Toolkit lifecycle: verify steady +state **before**, run the **method** (fault activities), verify steady state +**after**, then always run **rollbacks** (LIFO). It returns a journal. + +Probes, faults and rollbacks are caller-supplied callables, and the clock / RNG +/ sleep are injectable, so an experiment runs deterministically in tests with +fakes — no real failures, no real sleeping. Pure standard library (``random`` + +``time``); imports no ``PySide6``. + +Headless API +------------ + +.. code-block:: python + + from je_auto_control import ( + ChaosExperiment, Probe, run_experiment, latency_fault) + + experiment = ChaosExperiment( + title="checkout survives slow payments", + probes=[Probe("service_up", check_health, tolerance=True), + Probe("p95_latency", measure_p95, tolerance=[0, 500])], + method=[latency_fault("payment_delay", delay_s=2.0, rate=0.5)], + rollbacks=[restore_network]) + + journal = run_experiment(experiment) + if journal["deviated"]: + print("hypothesis broke under fault:", journal["status"]) + +A ``Probe`` returns a value checked against its ``tolerance`` (a literal, a +``[low, high]`` range, or a predicate callable). 
``run_experiment`` verifies the +hypothesis first — if it fails, the status is ``failed-before-method`` and +neither the method nor the rollbacks run — then applies each fault, re-verifies +(setting ``deviated`` and status ``deviated`` if it no longer holds), and runs +rollbacks LIFO in a ``finally``. Probe/fault/rollback errors are caught and recorded in the +journal rather than crashing the run. ``latency_fault`` and ``exception_fault`` +are ready-made fault factories with an injectable RNG (rate) and sleep. + +Executor command +---------------- + +``AC_run_chaos`` takes a ``spec`` (object or JSON string) whose probes, method +and rollbacks are **action lists** — ``{title, probes:[{name, action:[AC...]}], +method:[{name, action:[AC...]}], rollbacks:[[AC...]]}`` — and returns the +journal. A probe's steady state holds when its actions run without error. The +same operation is exposed as the MCP tool ``ac_run_chaos`` and as a Script +Builder command under **Flow**. diff --git a/docs/source/Eng/eng_index.rst b/docs/source/Eng/eng_index.rst index 26a7e787..83be642f 100644 --- a/docs/source/Eng/eng_index.rst +++ b/docs/source/Eng/eng_index.rst @@ -93,6 +93,7 @@ Comprehensive guides for all AutoControl features. 
doc/new_features/v68_features_doc doc/new_features/v69_features_doc doc/new_features/v70_features_doc + doc/new_features/v71_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/docs/source/Zh/doc/new_features/v71_features_doc.rst b/docs/source/Zh/doc/new_features/v71_features_doc.rst new file mode 100644 index 00000000..71bd265f --- /dev/null +++ b/docs/source/Zh/doc/new_features/v71_features_doc.rst @@ -0,0 +1,43 @@ +混沌實驗(Chaos Experiments) +============================ + +``resilience`` 從失敗中*復原*(retry、circuit breaker);這是相反的一面 —— 它*製造*受控的失敗, +並檢查穩態假設是否仍成立。仿照 Chaos Toolkit 生命週期:**之前**驗證穩態、執行**方法**(故障活動)、 +**之後**再驗證穩態,然後永遠執行**回滾**(LIFO)。它回傳一份 journal。 + +探針、故障與回滾皆為呼叫端提供的 callable,且時鐘 / RNG / sleep 可注入,因此實驗在測試中以假物件 +具決定性地執行 —— 沒有真正的失敗、沒有真正的睡眠。純標準函式庫(``random`` + ``time``);不匯入 +``PySide6``。 + +無頭 API +-------- + +.. code-block:: python + + from je_auto_control import ( + ChaosExperiment, Probe, run_experiment, latency_fault) + + experiment = ChaosExperiment( + title="checkout survives slow payments", + probes=[Probe("service_up", check_health, tolerance=True), + Probe("p95_latency", measure_p95, tolerance=[0, 500])], + method=[latency_fault("payment_delay", delay_s=2.0, rate=0.5)], + rollbacks=[restore_network]) + + journal = run_experiment(experiment) + if journal["deviated"]: + print("假設在故障下被打破:", journal["status"]) + +``Probe`` 回傳一個值,並以其 ``tolerance``(字面值、``[low, high]`` 範圍,或述詞 callable)檢查。 +``run_experiment`` 先驗證假設 —— 若失敗,狀態為 ``failed-before-method`` 且方法不會執行 —— 接著 +套用每個故障、再次驗證(若不再成立則設定 ``deviated`` 與狀態 ``deviated``),並在 ``finally`` 中 +永遠以 LIFO 執行回滾。探針/故障/回滾的錯誤會被捕捉並記錄在 journal,而非讓執行崩潰。 +``latency_fault`` 與 ``exception_fault`` 是現成的故障工廠,具可注入的 RNG(rate)與 sleep。 + +執行器命令 +---------- + +``AC_run_chaos`` 接受一份 ``spec``(物件或 JSON 字串),其探針、方法與回滾皆為**動作清單** —— +``{title, probes:[{name, action:[AC...]}], method:[{name, action:[AC...]}], rollbacks:[[AC...]]}`` +—— 並回傳 
journal。當探針的動作執行而無錯誤時,其穩態即成立。同一操作亦以 MCP 工具 +``ac_run_chaos`` 以及 Script Builder 中 **Flow** 分類下的命令提供。 diff --git a/docs/source/Zh/zh_index.rst b/docs/source/Zh/zh_index.rst index 5e8c4d74..2e05d813 100644 --- a/docs/source/Zh/zh_index.rst +++ b/docs/source/Zh/zh_index.rst @@ -93,6 +93,7 @@ AutoControl 所有功能的完整使用指南。 doc/new_features/v68_features_doc doc/new_features/v69_features_doc doc/new_features/v70_features_doc + doc/new_features/v71_features_doc doc/ocr_backends/ocr_backends_doc doc/observability/observability_doc doc/operations_layer/operations_layer_doc diff --git a/je_auto_control/__init__.py b/je_auto_control/__init__.py index 0eab1026..d7c97e67 100644 --- a/je_auto_control/__init__.py +++ b/je_auto_control/__init__.py @@ -360,6 +360,11 @@ from je_auto_control.utils.json_contract import ( MatchReport, diff_json, match_json, normalize_json, snapshot_json, ) +# Deterministic chaos experiments (steady-state hypothesis + fault injection) +from je_auto_control.utils.chaos import ( + ChaosExperiment, Fault, Probe, exception_fault, latency_fault, + run_experiment, +) # Background popup/interrupt watchdog (unattended automation) from je_auto_control.utils.watchdog import ( PopupWatchdog, WatchdogRule, default_popup_watchdog, @@ -853,6 +858,8 @@ def start_autocontrol_gui(*args, **kwargs): "build_provenance", "subject_for", "subject_for_bytes", "verify_provenance", "write_provenance", "MatchReport", "diff_json", "match_json", "normalize_json", "snapshot_json", + "ChaosExperiment", "Fault", "Probe", "exception_fault", "latency_fault", + "run_experiment", # MCP server "AuditLogger", "HttpMCPServer", "MCPContent", "MCPPrompt", "MCPPromptArgument", "MCPResource", "MCPServer", "MCPTool", diff --git a/je_auto_control/gui/script_builder/command_schema.py b/je_auto_control/gui/script_builder/command_schema.py index 8e1c926b..010ba04b 100644 --- a/je_auto_control/gui/script_builder/command_schema.py +++ b/je_auto_control/gui/script_builder/command_schema.py @@ -1388,6 +1388,16 
@@ def _add_misc_specs(specs: List[CommandSpec]) -> None: ), description="Verify a JWT (alg allowlist + exp/nbf/aud); returns {ok, claims}.", )) + specs.append(CommandSpec( + "AC_run_chaos", "Flow", "Run Chaos Experiment", + fields=( + FieldSpec("spec", FieldType.STRING, + placeholder='{"title": "...", "probes": [{"name": "p", ' + '"action": [...]}], "method": [{"name": "f", ' + '"action": [...]}], "rollbacks": [[...]]}'), + ), + description="Verify steady state, inject faults, re-verify, roll back.", + )) specs.append(CommandSpec( "AC_run_saga", "Flow", "Run Saga (Compensating Rollback)", fields=( diff --git a/je_auto_control/utils/chaos/__init__.py b/je_auto_control/utils/chaos/__init__.py new file mode 100644 index 00000000..1723fe42 --- /dev/null +++ b/je_auto_control/utils/chaos/__init__.py @@ -0,0 +1,10 @@ +"""Deterministic chaos experiments (steady-state hypothesis + fault injection).""" +from je_auto_control.utils.chaos.chaos import ( + ChaosExperiment, Fault, Probe, exception_fault, latency_fault, + run_experiment, +) + +__all__ = [ + "ChaosExperiment", "Fault", "Probe", "exception_fault", "latency_fault", + "run_experiment", +] diff --git a/je_auto_control/utils/chaos/chaos.py b/je_auto_control/utils/chaos/chaos.py new file mode 100644 index 00000000..a1c45a40 --- /dev/null +++ b/je_auto_control/utils/chaos/chaos.py @@ -0,0 +1,143 @@ +"""Deterministic chaos experiments: steady-state hypothesis + fault injection. + +``resilience`` *recovers* from failures (retry, circuit breaker); this is the +inverse — it *causes* controlled failures and checks a steady-state hypothesis +still holds. Modelled on the Chaos Toolkit lifecycle: verify steady state +before, run the method (fault activities), verify steady state after, then +always run rollbacks (LIFO). Returns a journal. 
+ +Probes, faults and rollbacks are caller-supplied callables, and the clock / +RNG / sleep are injectable, so an experiment runs deterministically in tests +with fakes — no real failures, no real sleeping. Pure standard library +(``random`` + ``time``); imports no ``PySide6``. +""" +import random +import time +from dataclasses import dataclass, field +from typing import Any, Callable, Dict, List, Optional, Sequence + + +@dataclass +class Probe: + """A steady-state probe: ``call`` returns a value checked against ``tolerance``.""" + + name: str + call: Callable[[], Any] + tolerance: Any = True + + +@dataclass +class Fault: + """A fault-injection activity run during the experiment method.""" + + name: str + apply: Callable[[], Any] + + +@dataclass +class ChaosExperiment: + """An experiment: a steady-state hypothesis, a method, and rollbacks.""" + + title: str + probes: Sequence[Probe] = () + method: Sequence[Fault] = () + rollbacks: Sequence[Callable[[], Any]] = field(default_factory=tuple) + + +def _check_tolerance(value: Any, tolerance: Any) -> bool: + if callable(tolerance): + return bool(tolerance(value)) + if isinstance(tolerance, (list, tuple)) and len(tolerance) == 2: + return tolerance[0] <= value <= tolerance[1] + return value == tolerance + + +def _verify_probes(probes: Sequence[Probe]) -> Dict[str, Any]: + results: List[Dict[str, Any]] = [] + ok = True + for probe in probes: + try: + value = probe.call() + met = _check_tolerance(value, probe.tolerance) + results.append({"name": probe.name, "ok": met, "value": value}) + except Exception as exc: # pylint: disable=broad-exception-caught + met = False + results.append({"name": probe.name, "ok": False, + "error": str(exc)}) + ok = ok and met + return {"ok": ok, "probes": results} + + +def _apply_fault(fault: Fault) -> Dict[str, Any]: + try: + return {"name": fault.name, "ok": True, "result": fault.apply()} + except Exception as exc: # pylint: disable=broad-exception-caught + return {"name": fault.name, "ok": 
False, "error": str(exc)} + + +def _run_rollbacks(rollbacks: Sequence[Callable[[], Any]]) -> List[Dict[str, Any]]: + results: List[Dict[str, Any]] = [] + for rollback in reversed(list(rollbacks)): + try: + rollback() + results.append({"ok": True}) + except Exception as exc: # pylint: disable=broad-exception-caught + results.append({"ok": False, "error": str(exc)}) + return results + + +def run_experiment(experiment: ChaosExperiment, *, + clock: Callable[[], float] = time.monotonic) -> Dict[str, Any]: + """Run ``experiment`` and return a journal dict (Chaos-Toolkit shape).""" + start = clock() + before = _verify_probes(experiment.probes) + journal: Dict[str, Any] = { + "title": experiment.title, + "steady_states": {"before": before, "after": None}, + "run": [], "rollbacks": [], "deviated": False, "status": "completed", + } + if not before["ok"]: + journal["status"] = "failed-before-method" + journal["duration"] = clock() - start + return journal + try: + for fault in experiment.method: + journal["run"].append(_apply_fault(fault)) + after = _verify_probes(experiment.probes) + journal["steady_states"]["after"] = after + journal["deviated"] = not after["ok"] + if not after["ok"]: + journal["status"] = "deviated" + finally: + journal["rollbacks"] = _run_rollbacks(experiment.rollbacks) + journal["duration"] = clock() - start + return journal + + +def latency_fault(name: str, *, delay_s: float, rate: float = 1.0, + rng: Optional[random.Random] = None, + sleep: Callable[[float], None] = time.sleep) -> Fault: + """A fault that sleeps ``delay_s`` with probability ``rate``.""" + generator = rng or random.Random() # nosec B311 # reason: non-crypto chaos rate sampling + + def apply() -> Dict[str, Any]: + if generator.random() < rate: + sleep(delay_s) + return {"injected": "latency", "delay_s": delay_s} + return {"injected": None} + + return Fault(name=name, apply=apply) + + +def exception_fault(name: str, *, exc: type = RuntimeError, + message: str = "chaos", rate: float = 1.0, 
+ rng: Optional[random.Random] = None) -> Fault: + """A fault that raises ``exc`` with probability ``rate``.""" + generator = rng or random.Random() # nosec B311 # reason: non-crypto chaos rate sampling + + def apply() -> Dict[str, Any]: + if generator.random() < rate: + raise exc(message) + return {"injected": None} + + return Fault(name=name, apply=apply) diff --git a/je_auto_control/utils/executor/action_executor.py b/je_auto_control/utils/executor/action_executor.py index 4defc6c3..6df58aa5 100644 --- a/je_auto_control/utils/executor/action_executor.py +++ b/je_auto_control/utils/executor/action_executor.py @@ -2928,6 +2928,37 @@ def _rate_limit(name: str, rate: float = 1.0, capacity: float = 1.0, "wait": round(bucket.time_until_available(float(n)), 4)} +def _chaos_probe_call(actions: List[Any]) -> Any: + def call() -> bool: + executor.execute_action(list(actions), raise_on_error=True) + return True + return call + + +def _chaos_fault_apply(actions: List[Any]) -> Any: + def apply() -> Dict[str, Any]: + return executor.execute_action(list(actions), raise_on_error=True) + return apply + + +def _run_chaos(spec: Any) -> Dict[str, Any]: + """Adapter: run a chaos experiment whose probes/method/rollbacks are actions.""" + import json + from je_auto_control.utils.chaos import ( + ChaosExperiment, Fault, Probe, run_experiment) + if isinstance(spec, str): + spec = json.loads(spec) + probes = [Probe(p.get("name", "probe"), _chaos_probe_call(p["action"]), True) + for p in spec.get("probes", [])] + method = [Fault(f.get("name", "fault"), _chaos_fault_apply(f["action"])) + for f in spec.get("method", [])] + rollbacks = [_chaos_fault_apply(actions) + for actions in spec.get("rollbacks", [])] + experiment = ChaosExperiment(spec.get("title", "chaos"), probes, method, + rollbacks) + return run_experiment(experiment) + + def _match_json(actual: Any, expected: Any, partial: bool = False, match_type: bool = False) -> Dict[str, Any]: """Adapter: match a JSON payload against an 
expected one (relaxed rules).""" @@ -3933,6 +3964,7 @@ def __init__(self): "AC_verify_provenance": _verify_provenance, "AC_match_json": _match_json, "AC_diff_json": _diff_json, + "AC_run_chaos": _run_chaos, "AC_unified_diff": _unified_diff, "AC_apply_unified": _apply_unified, "AC_three_way_merge": _three_way_merge, diff --git a/je_auto_control/utils/mcp_server/tools/_factories.py b/je_auto_control/utils/mcp_server/tools/_factories.py index b8ca94ab..765519f7 100644 --- a/je_auto_control/utils/mcp_server/tools/_factories.py +++ b/je_auto_control/utils/mcp_server/tools/_factories.py @@ -3343,6 +3343,21 @@ def rate_limit_tools() -> List[MCPTool]: ] +def chaos_tools() -> List[MCPTool]: + return [ + MCPTool( + name="ac_run_chaos", + description=("Run a chaos experiment 'spec' {title, probes:[{name, " + "action}], method:[{name, action}], rollbacks:[[...]]}" + " — verify steady state, inject faults, re-verify, roll " + "back. Returns the journal {status, deviated, ...}."), + input_schema=schema({"spec": {"type": "object"}}, ["spec"]), + handler=h.run_chaos, + annotations=READ_ONLY, + ), + ] + + def json_contract_tools() -> List[MCPTool]: return [ MCPTool( @@ -4773,7 +4788,7 @@ def media_assert_tools() -> List[MCPTool]: jsonpath_tools, json_schema_tools, vuln_scan_tools, vex_tools, license_policy_tools, jwt_tools, rate_limit_tools, json_patch_tools, search_index_tools, stats_tools, recurrence_tools, text_diff_tools, - feature_flag_tools, provenance_tools, json_contract_tools, + feature_flag_tools, provenance_tools, json_contract_tools, chaos_tools, saga_tools, decision_table_tools, locator_repair_tools, pii_text_tools, sarif_tools, screen_record_tools, diff --git a/je_auto_control/utils/mcp_server/tools/_handlers.py b/je_auto_control/utils/mcp_server/tools/_handlers.py index 9fc9dd58..3a29363d 100644 --- a/je_auto_control/utils/mcp_server/tools/_handlers.py +++ b/je_auto_control/utils/mcp_server/tools/_handlers.py @@ -1675,6 +1675,11 @@ def diff_json(actual, expected): 
return {"diffs": _diff(actual, expected)} +def run_chaos(spec): + from je_auto_control.utils.executor.action_executor import _run_chaos + return _run_chaos(spec) + + def build_provenance(paths, builder_id="je_auto_control"): from je_auto_control.utils.provenance import build_provenance, subject_for subjects = [subject_for(path) for path in paths] diff --git a/test/unit_test/headless/test_chaos_batch.py b/test/unit_test/headless/test_chaos_batch.py new file mode 100644 index 00000000..43a25ac5 --- /dev/null +++ b/test/unit_test/headless/test_chaos_batch.py @@ -0,0 +1,107 @@ +"""Headless tests for the chaos experiment runner. Pure stdlib, no Qt.""" +import json +import random + +import pytest + +import je_auto_control as ac +from je_auto_control.utils.chaos import ( + ChaosExperiment, Fault, Probe, exception_fault, latency_fault, + run_experiment) + + +def test_steady_state_holds_no_deviation(): + state = {"healthy": True} + experiment = ChaosExperiment( + "ok", + probes=[Probe("health", lambda: state["healthy"], True)], + method=[latency_fault("lat", delay_s=0.5, sleep=lambda _d: None)], + rollbacks=[lambda: state.update(healthy=True)]) + journal = run_experiment(experiment) + assert journal["status"] == "completed" + assert journal["deviated"] is False + assert len(journal["rollbacks"]) == 1 + + +def test_failed_before_method_bails(): + journal = run_experiment(ChaosExperiment( + "bad", probes=[Probe("p", lambda: False, True)], + method=[Fault("x", lambda: 1)])) + assert journal["status"] == "failed-before-method" + assert journal["run"] == [] + + +def test_deviation_after_method(): + counter = {"n": 0} + + def probe(): + counter["n"] += 1 + return counter["n"] < 2 # ok before, fails after + + journal = run_experiment(ChaosExperiment( + "dev", probes=[Probe("p", probe, True)], + method=[Fault("break", lambda: "boom")])) + assert journal["status"] == "deviated" + assert journal["deviated"] is True + + +def test_tolerance_range_and_predicate(): + assert 
run_experiment(ChaosExperiment( + "r", probes=[Probe("lat", lambda: 50, [0, 100])]))[ + "steady_states"]["before"]["ok"] is True + assert run_experiment(ChaosExperiment( + "p", probes=[Probe("even", lambda: 4, lambda v: v % 2 == 0)]))[ + "steady_states"]["before"]["ok"] is True + + +def test_exception_fault_recorded(): + fault = exception_fault("boom", rate=1.0, rng=random.Random(0)) + journal = run_experiment(ChaosExperiment( + "ex", probes=[Probe("p", lambda: True, True)], method=[fault])) + assert journal["run"][0]["ok"] is False + assert "error" in journal["run"][0] + + +def test_rollbacks_run_lifo(): + order = [] + run_experiment(ChaosExperiment( + "lifo", probes=[Probe("p", lambda: True, True)], method=[], + rollbacks=[lambda: order.append(1), lambda: order.append(2)])) + assert order == [2, 1] + + +def test_injectable_clock(): + journal = run_experiment( + ChaosExperiment("c", probes=[Probe("p", lambda: True, True)]), + clock=lambda: 5.0) + assert journal["duration"] == pytest.approx(0.0) + + +# --- wiring --------------------------------------------------------------- + +def test_executor_round_trip(): + spec = { + "title": "exec-chaos", + "probes": [{"name": "noop", "action": [["AC_sleep", {"seconds": 0}]]}], + "method": [{"name": "noop", "action": [["AC_sleep", {"seconds": 0}]]}], + "rollbacks": [[["AC_sleep", {"seconds": 0}]]], + } + rec = ac.execute_action([["AC_run_chaos", {"spec": json.dumps(spec)}]]) + journal = next(v for v in rec.values() if isinstance(v, dict)) + assert journal["status"] == "completed" + assert len(journal["rollbacks"]) == 1 + + +def test_wiring(): + assert "AC_run_chaos" in ac.executor.known_commands() + from je_auto_control.utils.mcp_server.tools import build_default_tool_registry + assert "ac_run_chaos" in {t.name for t in build_default_tool_registry()} + from je_auto_control.gui.script_builder.command_schema import _build_specs + assert "AC_run_chaos" in {s.command for s in _build_specs()} + + +def test_facade_exports(): + for 
attr in ("ChaosExperiment", "Probe", "Fault", "run_experiment", + "latency_fault", "exception_fault"): + assert hasattr(ac, attr) + assert attr in ac.__all__
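Review note on tolerance matching: the tests above cover the literal, range, and predicate forms, but two edge cases may be worth pinning down. The sketch below is a standalone copy of the `_check_tolerance` logic from `chaos.py` in this diff, renamed `check_tolerance` for illustration, so it runs without the package installed. Whether the two edge cases are intended behavior is an open question for the author, not something this note asserts.

```python
from typing import Any


def check_tolerance(value: Any, tolerance: Any) -> bool:
    """Local copy of the diff's tolerance check, for experimentation only."""
    if callable(tolerance):
        return bool(tolerance(value))
    if isinstance(tolerance, (list, tuple)) and len(tolerance) == 2:
        return tolerance[0] <= value <= tolerance[1]
    return value == tolerance


# The three documented forms: literal, [low, high] range, predicate.
print(check_tolerance("up", "up"))
print(check_tolerance(120, [0, 500]))
print(check_tolerance(4, lambda v: v % 2 == 0))

# Edge case 1: literals compare with plain ==, so the integer 1 satisfies
# the default tolerance True.
print(check_tolerance(1, True))

# Edge case 2: a two-element list is always read as a range, never as a
# literal, so a probe returning ["a", "b"] cannot match tolerance ["a", "b"].
try:
    check_tolerance(["a", "b"], ["a", "b"])
except TypeError as exc:
    print("range comparison failed:", type(exc).__name__)
```

In the runner itself this `TypeError` would be caught by `_verify_probes` and recorded as a failed probe with an `error` entry, so the experiment would report `failed-before-method` rather than crash. A predicate tolerance such as `lambda v: v == ["a", "b"]` sidesteps the ambiguity.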