What's the problem this feature will solve?
When using a RaisesExc check function (either within a with RaisesGroup block or, less commonly, directly via with RaisesExc), the error messages can be vague, which complicates debugging.[1] For example:
    def test() -> None:
        def check_BarError(exc: BarError, /) -> bool:
            return (str(exc) == "a" and isinstance(exc.__cause__, FooError))

        with pytest.RaisesGroup(
            pytest.RaisesExc(BarError, check=check_BarError),
        ):
            do_something()
In the check function, either condition, or both, can fail. But a failure of either one is reported the same way:
    [ ... ]

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
      File "/tmp/tmp.DPffGtQk0J/eg.py", line 33, in test
        with pytest.RaisesGroup(
             ~~~~~~~~~~~~~~~~~~^
            pytest.RaisesExc(BarError, check=check_BarError),
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        ):
        ^
      File "/tmp/tmp.DPffGtQk0J/.direnv/python-3.14/lib/python3.14/site-packages/_pytest/raises.py", line 1432, in __exit__
        fail(f"Raised exception {group_str} did not match: {self._fail_reason}")
        ~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      File "/tmp/tmp.DPffGtQk0J/.direnv/python-3.14/lib/python3.14/site-packages/_pytest/outcomes.py", line 163, in __call__
        raise Failed(msg=reason, pytrace=pytrace)
    Failed: Raised exception group did not match: RaisesExc(BarError, check=<function test.<locals>.check_BarError at 0x7f5aac5a8930>): check did not return True

    During handling of the above exception, another exception occurred:

        def test() -> None:
            def check_BarError(exc: BarError, /) -> bool:
                return (str(exc) == "a" and isinstance(exc.__cause__, FooError))

    >       with pytest.RaisesGroup(
                pytest.RaisesExc(BarError, check=check_BarError),
            ):
    E       Failed: Raised exception group did not match: RaisesExc(BarError, check=<function test.<locals>.check_BarError at 0x7f5aac5a8930>): check did not return True
The reported failure reason is ambiguous and does not make debugging easier.[2]
Using assertion rewriting in the example check function makes failures clearer:
    def test() -> None:
        def check_BarError(exc: BarError, /) -> bool:
            assert str(exc) == "a"
            assert isinstance(exc.__cause__, FooError)
            return True

        with pytest.RaisesGroup(
            pytest.RaisesExc(BarError, check=check_BarError),
        ):
            do_something()
We get clearer failure reports, e.g.
    [... truncated ...]

    exc = BarError('not right')

        def check_BarError(exc: BarError, /) -> bool:
    >       assert str(exc) == "a"
    E       AssertionError: assert 'not right' == 'a'
    E
    E         - a
    E         + not right
and
    [... truncated ...]

    exc = BarError('a')

        def check_BarError(exc: BarError, /) -> bool:
            assert str(exc) == "a"
    >       assert isinstance(exc.__cause__, FooError)
    E       AssertionError: assert False
    E        +  where False = isinstance(ValueError(), FooError)
    E        +    where ValueError() = BarError('a').__cause__
Using assertions in a RaisesExc's check function works well for a RaisesGroup containing only one RaisesExc (and for a bare with raises / with RaisesExc), but putting several of them into one RaisesGroup causes RaisesGroup to misbehave: whether the group matches depends on the order of the raised exceptions. A small example:
    import itertools
    import math

    import pytest


    # math.perm(2) == 2, so there is one test case per ordering of the two exceptions.
    @pytest.mark.parametrize("order", tuple(range(math.perm(2))))
    def test(order: int) -> None:
        def check_foo(exc: ValueError, /) -> bool:
            assert exc.args[0] == "foo"
            return True

        def check_bar(exc: ValueError, /) -> bool:
            assert exc.args[0] == "bar"
            return True

        with pytest.RaisesGroup(
            pytest.RaisesExc(ValueError, check=check_foo),
            pytest.RaisesExc(ValueError, check=check_bar),
        ):
            raise ExceptionGroup(
                "",
                tuple(
                    itertools.permutations(
                        (
                            ValueError("foo"),
                            ValueError("bar"),
                        )
                    )
                )[order],
            )
One of the orders passes and the other fails. Ideally, both would pass, since RaisesGroup is documented as order-agnostic (modulo potential issues related to greedy matching).
Describe the solution you'd like
It would be great if RaisesExc(..., check=i_use_assertions) worked in RaisesGroups that contain multiple RaisesExcs, not just one.
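To make the requested behavior concrete, here is a minimal, stdlib-only sketch. Everything in it is hypothetical: `greedy_match` is a toy stand-in for an order-agnostic matcher and is not pytest's algorithm, and `soft_check` is an illustrative wrapper that is not part of pytest. It assumes the order dependence comes from an AssertionError escaping a check instead of being treated as "no match", which has not been verified against pytest's implementation.

```python
import functools
from collections.abc import Callable, Iterable

Check = Callable[[BaseException], bool]


def soft_check(check: Check) -> Check:
    """Hypothetical wrapper: report an AssertionError from `check` as False.

    The assertion messages are kept on `wrapper.failures` so they are not lost.
    """
    failures: list[str] = []

    @functools.wraps(check)
    def wrapper(exc: BaseException, /) -> bool:
        try:
            return check(exc)
        except AssertionError as error:
            failures.append(f"{check.__name__}({exc!r}): {error}")
            return False

    wrapper.failures = failures  # plain function attribute, for later inspection
    return wrapper


def greedy_match(excs: Iterable[BaseException], checks: Iterable[Check]) -> bool:
    """Toy matcher (an assumption, not pytest's code): each check claims the
    first remaining exception that it accepts."""
    remaining = list(excs)
    for check in checks:
        for exc in remaining:
            if check(exc):
                remaining.remove(exc)
                break
        else:
            return False  # this check accepted none of the remaining exceptions
    return not remaining


def check_foo(exc: BaseException, /) -> bool:
    assert exc.args[0] == "foo", f"expected 'foo', got {exc.args[0]!r}"
    return True


def check_bar(exc: BaseException, /) -> bool:
    assert exc.args[0] == "bar", f"expected 'bar', got {exc.args[0]!r}"
    return True
```

In this toy model, `greedy_match([ValueError("bar"), ValueError("foo")], [check_foo, check_bar])` raises AssertionError on the first pairing, while the same call with `soft_check(check_foo)` and `soft_check(check_bar)` keeps searching and matches in either order. Whether passing `check=soft_check(check_foo)` to pytest.RaisesExc behaves the same way is an untested assumption.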
Alternative solutions
Alternative solutions I've tried:
- Don't use assertions in check functions.
  This made debugging test failures tedious. Pytest's assertion rewriting is very convenient! I usually ended up putting breakpoint()s in check functions and checking properties manually in a REPL.
- Don't use RaisesGroup, and don't reorder group members during matching.
  This is a no-go because (with async runtimes) the tests would then also have tested the scheduling order, i.e. it would have caused scheduling-order-dependent test failures.
- Don't use RaisesGroup, and reorder group members manually during matching.
  This is what I've currently settled on doing in these cases. Applied to the most recent example, it looks like this:
    import itertools
    import math

    import _pytest.raises
    import pytest


    def assert_matches(
        exception: BaseException, raises: _pytest.raises.AbstractRaises[BaseException]
    ) -> None:
        assert raises.matches(exception), raises.fail_reason


    @pytest.mark.parametrize("order", tuple(range(math.perm(2))))
    def test(order: int) -> None:
        def check_foo(exc: ValueError, /) -> bool:
            assert exc.args[0] == "foo"
            return True

        def check_bar(exc: ValueError, /) -> bool:
            assert exc.args[0] == "bar"
            return True

        def check_group(exc: ExceptionGroup[Exception], /) -> bool:
            assert len(exc.exceptions) == 2
            for exceptions in itertools.permutations(exc.exceptions):
                try:
                    # TODO: https://github.com/python/mypy/issues/19304: Remove the
                    # dummy variables.
                    assert_matches(
                        exceptions[0], _ := pytest.RaisesExc(ValueError, check=check_foo)
                    )
                    assert_matches(
                        exceptions[1], _ := pytest.RaisesExc(ValueError, check=check_bar)
                    )
                except AssertionError:  # cov: ignore
                    continue
                else:
                    return True
            return False  # cov: ignore

        with pytest.raises(ExceptionGroup, check=check_group):
            raise ExceptionGroup(
                "",
                tuple(
                    itertools.permutations(
                        (
                            ValueError("foo"),
                            ValueError("bar"),
                        )
                    )
                )[order],
            )
  It works well, but it's a bit tedious to write.[3]
  It could also be done by using with raises(ExceptionGroup) as exc_info and asserting properties of exc_info.value, but that is just as tedious.
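The manual reordering can be factored into a reusable helper so that each test only lists its checks. The following is a minimal sketch under stated assumptions: `assert_any_order` and the two demo checks are hypothetical names that are not part of pytest, and the helper uses only the standard library. It tries every permutation, so it has the same factorial cost as the manual workaround.

```python
import itertools
from collections.abc import Callable, Sequence

# A check signals a mismatch by raising AssertionError (or returning False).
Check = Callable[[BaseException], object]


def assert_any_order(
    exceptions: Sequence[BaseException], checks: Sequence[Check]
) -> None:
    """Pass if some ordering of `exceptions` satisfies `checks` pairwise."""
    assert len(exceptions) == len(checks), (
        f"expected {len(checks)} exceptions, got {len(exceptions)}"
    )
    attempts: list[str] = []
    for candidate in itertools.permutations(exceptions):
        try:
            for exc, check in zip(candidate, checks):
                assert check(exc) is not False, f"{check.__name__} returned False"
        except AssertionError as error:
            # Remember why this ordering was rejected, then try the next one.
            attempts.append(f"{candidate!r}: {error}")
        else:
            return
    raise AssertionError("no ordering matched:\n" + "\n".join(attempts))


def is_foo(exc: BaseException, /) -> None:
    assert exc.args[0] == "foo", f"expected 'foo', got {exc.args[0]!r}"


def is_bar(exc: BaseException, /) -> None:
    assert exc.args[0] == "bar", f"expected 'bar', got {exc.args[0]!r}"
```

A test could then catch the group with with pytest.raises(ExceptionGroup) as exc_info and call `assert_any_order(exc_info.value.exceptions, [is_foo, is_bar])`. When no ordering matches, the final message lists the reason each ordering was rejected. Outside pytest's assertion rewriting, only explicit assertion messages such as the ones in the demo checks are available.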
Footnotes
[1] Especially when the check in question only fails occasionally and/or only in remote CI jobs where you can't easily reach for PDB.
[2] In this particular example, the full traceback of the failure does include the exception's message and the line where its __cause__ was raised, so it can be deduced which was the issue, but (in some cases that I hit) with more complicated checks and more complicated tracebacks from service tasks in task groups, it becomes harder to eke out that information from the long traceback (or the Failed and accompanying traceback simply doesn't contain the information to deduce that).
[3] The permutation does in principle blow up quicker than pytest's greedy resolver does, but it does theoretically also have the advantage of avoiding some of the cases where pytest's resolver is too greedy and fails to find the (or a) permutation that succeeds.