Add deterministic chaos experiment runner #279
Merged
`resilience` recovers from failures; this PR causes controlled ones and checks that a steady-state hypothesis still holds (Chaos Toolkit lifecycle: verify before, inject faults, verify after, roll back LIFO). Probes, faults and rollbacks are callables, and the clock, RNG and sleep are injectable, so experiments run deterministically in tests. Wired through the facade, the `AC_run_chaos` executor command (action-list spec), the MCP tool and the Script Builder.
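The lifecycle described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: `run_experiment_sketch`, the `"completed"` status, the `duration` key and the shape of `steady_states` are assumptions made for the example, and tolerance is simplified to a literal equality check. Only the class fields, the journal key names and `failed-before-method` come from the description.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class Probe:
    name: str
    call: Callable[[], Any]
    tolerance: Any  # simplified here: a literal compared with ==


@dataclass
class Fault:
    name: str
    apply: Callable[[], Any]


@dataclass
class ChaosExperiment:
    title: str
    probes: List[Probe]
    method: List[Fault]
    rollbacks: List[Callable[[], Any]] = field(default_factory=list)


def _steady(probes):
    """True when every probe returns its tolerance; a raising probe counts as a failure."""
    for probe in probes:
        try:
            if probe.call() != probe.tolerance:
                return False
        except Exception:
            return False
    return True


def run_experiment_sketch(experiment, *, clock=time.monotonic):
    """Hypothetical runner: verify, inject faults, verify again, roll back LIFO."""
    started = clock()
    journal = {"status": "completed", "deviated": False, "run": [],
               "rollbacks": [], "steady_states": {}}
    try:
        before = _steady(experiment.probes)
        journal["steady_states"]["before"] = before
        if not before:
            journal["status"] = "failed-before-method"
        else:
            for fault in experiment.method:
                try:
                    fault.apply()
                    journal["run"].append({"name": fault.name, "status": "ok"})
                except Exception as exc:  # fault errors are recorded, not raised
                    journal["run"].append({"name": fault.name, "status": "error",
                                           "error": repr(exc)})
            after = _steady(experiment.probes)
            journal["steady_states"]["after"] = after
            journal["deviated"] = not after
    finally:
        # Assumption: rollbacks run in every case, last registered first.
        for rollback in reversed(experiment.rollbacks):
            try:
                rollback()
                journal["rollbacks"].append("ok")
            except Exception as exc:
                journal["rollbacks"].append(repr(exc))
    journal["duration"] = clock() - started  # "duration" is an illustrative key
    return journal


# Usage with a fake clock: nothing real is broken and nothing sleeps.
state = {"up": True}
order = []
experiment = ChaosExperiment(
    title="service stays up",
    probes=[Probe("is_up", lambda: state["up"], True)],
    method=[Fault("take_down", lambda: state.update(up=False))],
    rollbacks=[lambda: order.append("first"), lambda: order.append("second")],
)
ticks = iter([0.0, 5.0])
journal = run_experiment_sketch(experiment, clock=lambda: next(ticks))
```

With the fake clock the run is fully repeatable: the fault flips the probed state, so the journal reports a deviation, and the rollbacks are recorded in reverse registration order.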
Up to standards ✅🟢 Issues
| Metric | Results |
|---|---|
| Complexity | 43 |
| Duplication | 0 |



Summary
`resilience` recovers from failures (retry, circuit breaker); this is the inverse: it causes controlled failures and checks a steady-state hypothesis still holds. Modelled on the Chaos Toolkit lifecycle: verify steady state before, run the method (fault activities), verify after, then always run rollbacks (LIFO). Returns a journal.

- `ChaosExperiment(title, probes, method, rollbacks)`, `Probe(name, call, tolerance)`, `Fault(name, apply)`.
- `run_experiment(experiment, *, clock=)`: journal `{status, deviated, run, rollbacks, steady_states}`; `failed-before-method` if the hypothesis fails up front.
- `latency_fault` / `exception_fault`: fault factories with injectable RNG (rate) + sleep.
- `tolerance` is a literal, a `[low, high]` range, or a predicate.

Probes/faults/rollbacks are caller callables; clock/RNG/sleep injectable → deterministic tests (no real failures or sleeping). Probe/fault errors are caught and recorded. Pure stdlib (`random` + `time`).

Five-layer wiring

- `je_auto_control/utils/chaos/__init__.py` + `__all__`
- `AC_run_chaos`: action-list spec `{title, probes:[{name, action}], method:[{name, action}], rollbacks:[[...]]}`
- `ac_run_chaos`

Tests & docs

- `test/unit_test/headless/test_chaos_batch.py` (10 tests: no-deviation, failed-before-method, deviation, tolerance range/predicate, exception fault, LIFO rollbacks, injectable clock, executor)
- Lint clean: ruff / pylint / bandit (B311 nosec, repo convention) / radon.
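
For readers unfamiliar with the pattern, the tolerance rule and the injectable fault factories from the summary might look something like the sketch below. The names `within_tolerance`, `latency_fault_sketch` and `exception_fault_sketch`, and all parameter names (`delay`, `rate`, `rng`, `sleep`), are hypothetical; the PR's actual signatures are not shown on this page. The point is only that passing the RNG and the sleep function as arguments lets a test decide whether a fault fires and observe the delay without waiting.

```python
import random
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Fault:
    name: str
    apply: Callable[[], Any]


def within_tolerance(value, tolerance):
    """Check a probe value against a predicate, a [low, high] range, or a literal."""
    if callable(tolerance):
        return bool(tolerance(value))
    # Assumption: any two-item list/tuple is read as an inclusive range.
    if isinstance(tolerance, (list, tuple)) and len(tolerance) == 2:
        low, high = tolerance
        return low <= value <= high
    return value == tolerance


def latency_fault_sketch(name, delay, *, rate=1.0, rng=random.random, sleep=time.sleep):
    """Fault that sleeps for `delay` seconds with probability `rate`."""
    def apply():
        if rng() < rate:  # rng returns a float in [0.0, 1.0)
            sleep(delay)
            return True
        return False
    return Fault(name, apply)


def exception_fault_sketch(name, exc, *, rate=1.0, rng=random.random):
    """Fault that raises `exc` with probability `rate`."""
    def apply():
        if rng() < rate:
            raise exc
        return False
    return Fault(name, apply)


# Usage: a fixed "RNG" and a recording "sleep" make the outcome repeatable.
slept = []
slow = latency_fault_sketch("slow_disk", 0.25, rate=0.5,
                            rng=lambda: 0.1, sleep=slept.append)
fired = slow.apply()  # 0.1 < 0.5, so the fault fires and records the delay
```

One design consequence worth noting: because a two-item list is treated as a range in this sketch, a probe whose expected value is itself a two-element list would need a predicate instead.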