Serialize Rust data structures to / from pods #396

Open: dgulotta wants to merge 12 commits into main from serialize-to-pod
dgulotta (Collaborator) commented Aug 25, 2025

This PR allows Rust data structures to be serialized to, and deserialized from, the POD types Value and Dictionary. There are some example tests in middleware/serialization.rs.

The implementation seems good enough to cover most use cases, but there are a few ways that it could be improved:

  • Ideally, serializing a Value would return the same Value, and serializing any of RawValue, Point, SecretKey, Dictionary, Set, Array would just wrap it in a TypedValue in the obvious way. This behavior is implemented for RawValue, Point, and SecretKey, but not for Dictionary, Set, Array and Value. Implementing the remaining four would be feasible, but the code would be ugly.
  • If the user happens to create their own newtype struct named RawValue, Point, or SecretKey, then the serializer will think that it is one of the POD types. We could fix this by having our types pass custom names to Serializer::serialize_newtype_struct.
  • It would be convenient to have a serializer and deserializer for Dictionary, and maybe to have deserializers for owned types in addition to reference types.
  • If the deserializer receives a request for a specific data type, and it can't fulfill that request, it will send whatever it has and let the Deserialize implementation decide if it can do anything with that. It seems customary to return an error instead.
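
To illustrate the name-collision point above: serde hands a serializer only a name string for a newtype struct (the name parameter of Serializer::serialize_newtype_struct), so a name-only check cannot tell a user's Point from the POD one. The sketch below is a toy model, not the PR's code; NewtypeKind, both classify functions, and the exact prefixed strings are assumptions made for illustration.

```rust
/// What a serializer might conclude about a newtype struct when all it sees
/// is the name string (toy model, not the PR's code).
#[derive(Debug, PartialEq)]
enum NewtypeKind {
    PodBuiltin,
    UserDefined,
}

/// Name-only check with bare names: a user's own `struct Point(..)` is
/// indistinguishable from the POD type, because both report "Point".
fn classify_unprefixed(name: &str) -> NewtypeKind {
    match name {
        "RawValue" | "Point" | "SecretKey" => NewtypeKind::PodBuiltin,
        _ => NewtypeKind::UserDefined,
    }
}

/// Name check against reserved, prefixed names. The exact strings are an
/// assumption; the idea is only that the built-in types report names that
/// are not valid Rust identifiers, so a derived user type cannot match.
fn classify_prefixed(name: &str) -> NewtypeKind {
    match name {
        "pod2::RawValue" | "pod2::Point" | "pod2::SecretKey" => NewtypeKind::PodBuiltin,
        _ => NewtypeKind::UserDefined,
    }
}
```

With bare names, classify_unprefixed("Point") reports PodBuiltin even for a user type; with reserved names, classify_prefixed("Point") reports UserDefined and only the prefixed string is treated as built in.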

Where possible, I tried to mimic serde_json behavior, but a few things had to be done differently since POD does not have null or floating point numbers.

  • () and unit structs are serialized as an empty Array (the JSON serializer uses null)
  • None is serialized as the string "None" (the JSON serializer uses null)
  • Some(x) is serialized as a dictionary mapping "Some" to x (the JSON serializer just uses x)
    • Should we use an empty Array/Set for None and a one-element Array/Set for Some(x) instead?
  • f32 and f64 are serialized as strings using scientific notation
  • Bytes are serialized as a base64 encoded string (the JSON serializer uses an array of numbers, but I figured that we may not want to build a large Merkle tree for a byte array)
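
As a rough illustration of the float encoding mentioned above, here is a minimal sketch using only the standard library. The helper names encode_f64 and decode_f64 are hypothetical, and the exact string format the PR produces may differ; this only shows that a scientific-notation string can carry an f64 through a value model that has no floating point type.

```rust
use std::num::ParseFloatError;

/// Encode an f64 as a scientific-notation string (hypothetical helper).
/// With no precision given, Rust's `{:e}` formatting emits the shortest
/// digit string that identifies the value.
fn encode_f64(x: f64) -> String {
    format!("{:e}", x)
}

/// Decode a string produced by `encode_f64` back into an f64.
fn decode_f64(s: &str) -> Result<f64, ParseFloatError> {
    s.parse::<f64>()
}
```

For example, encode_f64(1234.5) yields "1.2345e3", and decode_f64 applied to that string returns 1234.5 again.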

dgulotta linked an issue on Aug 25, 2025 that may be closed by this pull request
dgulotta (Collaborator, Author) commented Sep 9, 2025

  • Added a deserializer for Dictionary
  • Added pod2:: prefix to serde names of RawValue, Point, SecretKey so the serializer won't mistake user-defined types with these names for the POD types
  • Changed () to serialize to Bool(false)
  • Changed None to serialize to an empty set and Some(x) to serialize to a one-element set
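
The Option and unit encodings listed above can be sketched with a toy value type. ToyValue is a stand-in invented for this example and is not pod2's TypedValue; encode_option, decode_option, and encode_unit are likewise hypothetical names, and the element type is fixed to i64 to keep the sketch short.

```rust
use std::collections::BTreeSet;

/// Toy stand-in for a POD value (NOT the pod2 `TypedValue` type).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum ToyValue {
    Bool(bool),
    Int(i64),
    Set(BTreeSet<ToyValue>),
}

/// `()` is encoded as `Bool(false)`.
fn encode_unit() -> ToyValue {
    ToyValue::Bool(false)
}

/// `None` becomes the empty set; `Some(x)` becomes a one-element set.
fn encode_option(opt: Option<i64>) -> ToyValue {
    let mut set = BTreeSet::new();
    if let Some(x) = opt {
        set.insert(ToyValue::Int(x));
    }
    ToyValue::Set(set)
}

/// Inverse of `encode_option`; rejects non-sets and sets with more than
/// one element.
fn decode_option(v: &ToyValue) -> Result<Option<i64>, String> {
    match v {
        ToyValue::Set(set) => {
            let mut it = set.iter();
            match (it.next(), it.next()) {
                (None, _) => Ok(None),
                (Some(ToyValue::Int(x)), None) => Ok(Some(*x)),
                (Some(other), None) => Err(format!("expected Int, got {:?}", other)),
                (Some(_), Some(_)) => Err("expected at most one element".to_string()),
            }
        }
        other => Err(format!("expected Set, got {:?}", other)),
    }
}
```

In this model, decode_option(&encode_option(Some(7))) gives Ok(Some(7)), and the empty set decodes to Ok(None).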

dgulotta marked this pull request as ready for review on September 9, 2025, 19:08
dgulotta (Collaborator, Author) commented

As @robknight has pointed out, it makes more sense to represent a struct containing an Option by skipping the field if the value is None. I'm not sure if this needs to be done now, or if it can wait until we have a particular need for it.

Since the serialization feature is intended to be "something that does what you want most of the time" rather than "the canonical way of converting Rust data structures to POD", maybe it would make more sense to have newtype structs for the deserializers, rather than having &TypedValue and &Dictionary implement Deserializer directly.

dgulotta (Collaborator, Author) commented

The latest commit makes it so that Options inside of structs are flattened: MyStruct{ field: Some(x) } is serialized as { "field" : x } and MyStruct{ field: None } is serialized as {}.
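
A minimal sketch of that flattening, using a plain map as a stand-in for a POD dictionary. ToyDict, to_dict, and from_dict are hypothetical names for this example, and the field type is fixed to i64; the PR's serializer handles this generically through serde.

```rust
use std::collections::BTreeMap;

/// Example struct with one optional field.
struct MyStruct {
    field: Option<i64>,
}

/// Toy dictionary from string keys to integers (not pod2's `Dictionary`).
type ToyDict = BTreeMap<String, i64>;

/// `Some(x)` is written as the entry "field" -> x; `None` writes nothing.
fn to_dict(s: &MyStruct) -> ToyDict {
    let mut d = ToyDict::new();
    if let Some(x) = s.field {
        d.insert("field".to_string(), x);
    }
    d
}

/// A missing key reads back as `None`.
fn from_dict(d: &ToyDict) -> MyStruct {
    MyStruct {
        field: d.get("field").copied(),
    }
}
```

One trade-off of flattening in a model like this is that a nested Option<Option<T>> field could no longer distinguish Some(None) from None, since both would leave the key absent.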

dgulotta (Collaborator, Author) commented Sep 14, 2025

I looked into an alternate approach that uses procedural macros instead of serde. Code here:
https://github.com/dgulotta/pod2-derive
https://github.com/0xPARC/pod2/tree/more-try-from

Advantages of the serde approach:

  • Works with almost any type that implements Serialize/Deserialize
  • Gets some customization for free via serde attributes
  • Does not require a separate crate for procedural macros

Advantages of the procedural macro approach:

  • Cleaner code, especially the handling of conversions between Value and pod2 types (making serde Value-to-Value (de)serialization a no-op would be a real mess, and even the approach of using serde newtype structs to handle Point/RawValue/SecretKey is a bit hacky)
  • Can differentiate between Vec and HashSet without any hints from the user. The serde approach will serialize HashSet to TypedValue::Array unless the user specifies otherwise, and it also makes it very difficult to stop Vec from accepting a TypedValue::Set or HashSet from accepting a TypedValue::Array.
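
To make the Vec/HashSet point concrete: with dedicated conversion traits, each Rust type gets its own impl, so the array/set distinction is decided by the type rather than by a shared sequence interface. The sketch below is a toy model and not the code in pod2-derive; ToySeq, ToToySeq, and FromToySeq are hypothetical names.

```rust
use std::collections::{BTreeSet, HashSet};

/// Toy value with distinct array and set variants (not pod2's `TypedValue`).
#[derive(Debug, Clone, PartialEq, Eq)]
enum ToySeq {
    Array(Vec<i64>),
    Set(BTreeSet<i64>),
}

/// Hypothetical conversion traits that a derive macro could target.
trait ToToySeq {
    fn to_toy_seq(&self) -> ToySeq;
}

trait FromToySeq: Sized {
    fn from_toy_seq(v: &ToySeq) -> Result<Self, String>;
}

impl ToToySeq for Vec<i64> {
    fn to_toy_seq(&self) -> ToySeq {
        ToySeq::Array(self.clone())
    }
}

impl ToToySeq for HashSet<i64> {
    fn to_toy_seq(&self) -> ToySeq {
        ToySeq::Set(self.iter().copied().collect())
    }
}

impl FromToySeq for Vec<i64> {
    // A Vec only accepts an Array; a Set is rejected.
    fn from_toy_seq(v: &ToySeq) -> Result<Self, String> {
        match v {
            ToySeq::Array(items) => Ok(items.clone()),
            ToySeq::Set(_) => Err("expected Array, got Set".to_string()),
        }
    }
}

impl FromToySeq for HashSet<i64> {
    // A HashSet only accepts a Set; an Array is rejected.
    fn from_toy_seq(v: &ToySeq) -> Result<Self, String> {
        match v {
            ToySeq::Set(items) => Ok(items.iter().copied().collect()),
            ToySeq::Array(_) => Err("expected Set, got Array".to_string()),
        }
    }
}
```

Here a Vec converts to ToySeq::Array and a HashSet to ToySeq::Set with no user hints, and each type refuses the other variant on the way back.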

dgulotta (Collaborator, Author) commented

After trying the procedural macro approach for a while, I concluded that it's probably better in principle but also more time-consuming to implement. So the serde approach seems more realistic for now.

Development

Successfully merging this pull request may close these issues.

Make it easier to deserialize data from a TypedValue
