Ensure subtest's context kwargs are JSON serializable #13963

nicoddemus · 2025-11-13T22:32:27Z

Convert all the values of SubtestContext.kwargs to strings using saferepr.

This complies with the requirement that the returned dict from pytest_report_to_serializable is serializable to JSON, at the cost of losing type information for objects that are natively supported by JSON.

Fixes pytest-dev/pytest-xdist#1273

Convert all the values of `SubtestContext.kwargs` to strings using `saferepr`. This complies with the requirement that the returned dict from `pytest_report_to_serializable` is serializable to JSON, at the cost of losing type information for objects that are natively supported by JSON. Fixes pytest-dev/pytest-xdist#1273

bluetech

A couple of comments for your consideration.

bluetech · 2025-11-13T22:50:10Z

src/_pytest/subtests.py

-        return dataclasses.asdict(self)
+        result = dataclasses.asdict(self)
+        # Brute-force the returned kwargs dict to be JSON serializable (pytest-dev/pytest-xdist#1273).
+        result["kwargs"] = {k: saferepr(v) for (k, v) in result["kwargs"].items()}


IMO we should strive to have the property that from_json(to_json(report)) == report. In this case it means that we should have self.kwargs itself should be serializable, not just in to_json.

IMO we should strive to have the property that from_json(to_json(report)) == report.

I agree this is desirable, but I don't see how we can accomplish this in a general manner, for any type of object.

In unittest subtests, kwargs is permitted to contain any type of object, because it is not serialized to anything and used only for reporting.

I don't see how we can support that in functions that serialize to JSON.

Perhaps I'm missing something, could you elaborate a bit?

What I'm thinking is that kwargs itself contain strings, not the objects themselves, that would of course make it serializable. It is not exposed in the API so should be fine? Unless the original objects are needed for some reason?

Well once they are serialized and come back unserialized on the other side, unfortunately they might not be useful depending on the intent. For reporting this should be fine, but if a plugin is using the serialization hooks and expecting the actual objects to be returned this would be a problem.

We can perhaps try to keep the concrete types for objects we know are safe (int, float, etc.), but for other objects I don't see another solution.

We can perhaps try to keep the concrete types for objects we know are safe (int, float, etc.)

Implemented that, take a look.

OK, I defer to your judgment on the representation.

However, I still think that from_json(to_json(report)) == report is important.

Well once they are serialized and come back unserialized on the other side, unfortunately they might not be useful depending on the intent. For reporting this should be fine, but if a plugin is using the serialization hooks and expecting the actual objects to be returned this would be a problem.

If the report itself keeps the kwargs as dict[str, json-serializable] (i.e. performs the conversion in init, not in to_serializable), then the property holds. Is there a need to keep the "raw" objects?

If the report itself keeps the kwargs as dict[str, json-serializable] (i.e. performs the conversion in init, not in to_serializable), then the property holds. Is there a need to keep the "raw" objects?

I think we should keep the original objects, because doesn't seem right to me for us to arbitrarily convert the values to something else during __init__ just on the off-chance to_serializable might be called eventually. We lose data this way. Nothing prevents a custom plugin to expect the report objects to be intact in their own pytest_runtest_logreport implementation.

But something occurred to me: there's nothing in pytest_report_to_serializable that says the data returned needs to be JSON-native, only JSON-compatible. This means we can convert the data in the report in whenever format we want, as long as the resulting data can be converted into JSON.

With this in mind, I think we can simply use pickle to convert the kwargs into a str, which we can then easily deserialize during pytest_report_from_serializable. This will hold the from_serializable(to_serializable(report)) == report property, and should support most data types. pickle has its problems, but is the standard way to serialize/deserialize data in Python.

Done, please take a look.

One downside that did not occur to me is that local objects (such as MyEnum used in the test) normally cannot be pickled. I don't think this will be a problem in production, however.

We lose data this way. Nothing prevents a custom plugin to expect the report objects to be intact in their own pytest_runtest_logreport implementation.

I somewhat think that an arbitrary Python value would be harder for a plugin to handle than a JSON-native value. AFAIK the main/intended use case for the pytest_runtest_logreport hook is to report the outcome to some other place (terminal, junitxml, etc.) and so what the plugin mostly cares about is the nodeid and outcome and metadata. In case of subtests, I'm not sure how many plugins would want to do something special with the kwargs formatting beyond repring it.

Regarding pickle, I never really used it, but I assumed it wasn't used in pytest for a reason. E.g. execnet doesn't use it for serialization, and pytest also went with json-like for report serialization. Maybe because it's Python specific, or binary, or has multiple versions, or not secure? For this reason my vague sense is that we should avoid it but I don't have a concrete argument.

I somewhat think that an arbitrary Python value would be harder for a plugin to handle than a JSON-native value.

Plugins are supposed to be "looking" at the value returned by pytest_report_to_serializable, the value is only meant to be JSON serializable so plugins have some guidance on how it is supposed to be implemented. Ultimately, plugins are expected to pass the returned value over to pytest_report_from_serializable and then use the returned object.

AFAIK the main/intended use case for the pytest_runtest_logreport hook is to report the outcome to some other place (terminal, junitxml, etc.) and so what the plugin mostly cares about is the nodeid and outcome and metadata. In case of subtests, I'm not sure how many plugins would want to do something special with the kwargs formatting beyond repring it.

Perhaps, but I wouldn't just assume that: there is nothing in the docs or the design that says that we will convert the kwargs using saferepr (and lose the original information). I can envision a plugin saving that data to some other logging system, and attempting to recover it later using pytest_report_from_serializable. If we convert things using saferepr from under it we lose the original metadata.

Regarding pickle, I never really used it, but I assumed it wasn't used in pytest for a reason. E.g. execnet doesn't use it for serialization, and pytest also went with json-like for report serialization. Maybe because it's Python specific, or binary, or has multiple versions, or not secure? For this reason my vague sense is that we should avoid it but I don't have a concrete argument.

The reasons are:

It is Python specific; but here we are ultimately implementing hooks in Python, so I don't see this as an issue.

It has multiple versions, and does not play well with Python 2 -> 3 objects; I believe this was the main motivator for execnet using a custom system.

It is not secure, because it can execute arbitrary code (like __init__ of some classes if they are pickled, __setstate__, etc). But ultimately we are executing custom Python code anyway directly (the hooks), so this point is moot (we are already executing arbitrary code).

I think we are running in circles here:

We want from_json(to_json(kwargs)) == kwargs and I agree with that.

You seem to suggest to use saferepr when creating* the SubTestReport, which I disagree with: we shouldn't be changing the data like that just because eventually we might call a separate hook for serialization.

Perhaps @RonnyPfannschmidt @Pierre-Sassoulas @The-Compiler want to chime in and have other suggestions or share their opinions on what a correct approach would look like?

src/_pytest/subtests.py

nicoddemus force-pushed the xdist-subtests-serialization branch from a3cf834 to 7585f45 Compare November 13, 2025 22:35

psf-chronographer bot added the bot:chronographer:provided (automation) changelog entry is part of PR label Nov 13, 2025

nicoddemus added the backport 9.0.x apply to PRs at any point; backports the changes to the 9.0.x branch label Nov 13, 2025

bluetech reviewed Nov 13, 2025

View reviewed changes

nicoddemus force-pushed the xdist-subtests-serialization branch 2 times, most recently from 53f50a5 to f8da2cb Compare November 22, 2025 12:23

Best-effort to convert to JSON

fdfa22f

nicoddemus force-pushed the xdist-subtests-serialization branch from f8da2cb to fdfa22f Compare November 22, 2025 14:47

Use pickle

b5b4035

nicoddemus requested a review from bluetech November 27, 2025 19:07

jhillairet mentioned this pull request Nov 30, 2025

Temporary fix of continuous integration scikit-rf/scikit-rf#1357

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Uh oh!

Ensure subtest's context kwargs are JSON serializable #13963

Ensure subtest's context kwargs are JSON serializable #13963

nicoddemus commented Nov 13, 2025

Uh oh!

bluetech left a comment

Uh oh!

bluetech Nov 13, 2025

Uh oh!

nicoddemus Nov 21, 2025 •

edited

Loading

Uh oh!

bluetech Nov 22, 2025

Uh oh!

nicoddemus Nov 22, 2025

Uh oh!

nicoddemus Nov 22, 2025

Uh oh!

bluetech Nov 22, 2025

Uh oh!

nicoddemus Nov 25, 2025 •

edited

Loading

Uh oh!

nicoddemus Nov 27, 2025

Uh oh!

bluetech Nov 29, 2025

Uh oh!

nicoddemus Dec 3, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Ensure subtest's context kwargs are JSON serializable #13963

Are you sure you want to change the base?

Ensure subtest's context kwargs are JSON serializable #13963

Conversation

nicoddemus commented Nov 13, 2025

Uh oh!

bluetech left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nicoddemus Nov 21, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nicoddemus Nov 25, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nicoddemus Nov 21, 2025 •

edited

Loading

nicoddemus Nov 25, 2025 •

edited

Loading