You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
🤖 ci: simplify flaky bash special characters test (#527)
## Problem
The integration test `should handle bash command with special
characters` was flaky in CI. Investigation revealed two issues:
1. **The test was testing AI escaping behavior** rather than bash
execution functionality - it expected the LLM to properly escape shell
special characters (`$`, backticks, quotes), which is unpredictable
2. **gpt-5-mini was not reliably calling the bash tool** - tests would
complete but the tool call would be missing
## Solution
1. Removed the flaky special characters test entirely - it wasn't
testing our code
2. Switched all tests from `gpt-5-mini` to `claude-haiku-4-5` - Haiku is
faster and more reliable for tool use
## Testing
Ran all 6 tests 3x locally - all passed:
- Run 1: 6/6 passed (25.9s)
- Run 2: 6/6 passed (28.3s)
- Run 3: 6/6 passed (26.3s)
Tests now complete quickly and reliably with the Haiku model.
_Generated with `cmux`_
0 commit comments