Implement full Dict API (Put, Get, Update, Pop, Contains, Len, Keys, Values, Items, Clear, FromName, Delete) using og-rek + cloudpickle hybrid serialization for cross-language compatibility with Python.
PR Summary
Medium Risk
Overview: Implements specialized pickle serialization for Dict keys to match Python.
Written by Cursor Bugbot for commit 460afe3.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
- encodeLong1: guard against empty Bytes() for n=-1 (abs-1==0), which would panic on data[len(data)-1] with an index out of range.
- cloudpickleToOgRek: only strip trailing 0x94 as MEMOIZE when the leading opcode is a string/bytes type, mirroring ogrekToCloudpickle. Previously, a BININT payload ending in 0x94 (e.g. int -1811939328) would have its data byte falsely stripped.
Thanks @sri-thorobid. While we are planning to add support for Dict or some other key-value primitive to the Go and JS SDKs, there is a substantial amount of complexity around serialization stability that you've picked up on here. It's likely that we're going to approach this using CBOR serialization, as we do for Function payloads, which is built to be cross-language.
@mwaskom That's what I figured. I just raised this in case others need it for now for primitive keys; we are using it in our fork until CBOR compatibility lands. Feel free to close the PR, as it will still be searchable. Thanks for the view into the roadmap, though! Excited for not having to manage a fork :-).
Summary
- Dict API: `Put`, `Get`, `Update`, `Pop`, `Contains`, `Len`, `Keys`, `Values`, `Items`, `Clear`, `FromName`, `Delete`
- Serialization matching `cloudpickle.dumps(value, protocol=4)`
- `DictService` on `Client` (`client.Dicts`)

Serialization
Why this complexity?
Modal's Dict server matches keys by byte-equality of their serialized pickle representation. When Python writes `d[42] = "hello"`, the key is stored as the exact bytes produced by `cloudpickle.dumps(42, protocol=4)`. For Go to read that key back, it must produce the identical byte sequence: not just a semantically equivalent pickle, but byte-for-byte the same output.

This is non-trivial because Go's og-rek pickle library and Python's cloudpickle produce structurally different output for the same values:

| Aspect | og-rek (Go) | cloudpickle (Python) |
|---|---|---|
| Large integers | `I` opcode: `I1234567890\n` | `LONG1` opcode: `0x8a` + binary two's complement LE bytes |
| Bytes (`[]byte`) | `builtins.bytearray()` constructor | `SHORT_BINBYTES` (`C`) or `BINBYTES` (`B`) opcode |
| Memoization | No `MEMOIZE` after string/bytes opcodes | `MEMOIZE` (`0x94`) after each string/bytes value |
| Framing | No `FRAME` wrapper | `FRAME` (`0x95`) + 8-byte LE length prefix for payloads ≥ 4 bytes |

Without handling these differences, a Go client writing `dict[42] = "value"` would store the key under different bytes than Python expects, making the entry invisible to Python readers (and vice versa).

How it works
Keys and values are serialized using og-rek's pickle encoder, then post-processed by `ogrekToCloudpickle` to match cloudpickle's protocol 4 output:

- `I` opcode → `LONG1` for integers outside int32 range
- `bytearray()` constructor → bare `SHORT_BINBYTES`/`BINBYTES` for `[]byte`
- Append `MEMOIZE` after string/bytes opcodes
- Add `FRAME` header if ≥ 4 bytes

Deserialization uses og-rek's decoder with the inverse transform (`cloudpickleToOgRek`).

Go key/value types
| Go type | Pickle opcode(s) | Deserialized as |
|---|---|---|
| `nil` | `NONE` | `pickle.None{}` |
| `bool` | `NEWTRUE`/`NEWFALSE` | `bool` |
| `int`, `int8`–`int64` | `BININT1`/`BININT2`/`BININT`/`LONG1` | `int64` (≤ int32) or `*big.Int` |
| `uint8`–`uint64` | `BININT1`/`BININT2`/`BININT`/`LONG1` | `int64` (≤ int32) or `*big.Int` |
| `float32`, `float64` | `BINFLOAT` | `float64` |
| `string` | `SHORT_BINUNICODE`/`BINUNICODE` | `string` |
| `[]byte` | `SHORT_BINBYTES`/`BINBYTES` | `pickle.Bytes` (`type Bytes string`) |
| `map[any]any` | `EMPTY_DICT` + `SETITEMS` | `map[interface{}]interface{}` |
| `[]any` | `EMPTY_LIST` + `APPENDS` | `[]interface{}` |

Note that deserialized types differ from input types due to og-rek's type mapping (e.g. all ints widen to `int64`, `None` becomes `pickle.None{}` not `nil`, `[]byte` becomes `pickle.Bytes`). Callers should use type assertions accordingly.

Files changed
- `modal-go/dict.go`
- `modal-go/dict_serialization_test.go`
- `modal-go/test/dict_test.go`
- `modal-go/client.go`: `Dicts` field on `Client`

Test plan
- `cloudpickle.dumps` output
- `ogrekToCloudpickle`/`cloudpickleToOgRek` inverse transform