When a background task fails with a litellm exception (rate limit, timeout, connection error, etc.), the docket worker cannot round-trip the exception through cloudpickle for the result queue.
What happens
docket/worker.py calls cloudpickle.dumps(e) on any exception raised by a failed task. The dumps call succeeds, but cloudpickle.loads() later tries to reconstruct the exception by calling ExceptionClass.__init__() with only the exception's saved args (the default BaseException.__reduce__ behavior), not the original constructor arguments. litellm exception classes require message, model, and llm_provider as positional args, so this raises TypeError.
The worker can't store or report the error. The task silently disappears.
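The mechanism can be shown without litellm installed. The sketch below uses a hypothetical stand-in class, FakeProviderError, whose constructor mirrors the signature described above. It is not a real litellm class, and the assumption that litellm forwards only the message to Exception.__init__ is for illustration. The code applies the default reduce tuple by hand, which is the same call that loads() makes.

```python
class FakeProviderError(Exception):
    """Hypothetical stand-in with a litellm-style constructor."""

    def __init__(self, message, model, llm_provider):
        self.message = message
        self.model = model
        self.llm_provider = llm_provider
        # Only the message is forwarded, so self.args == (message,)
        super().__init__(message)


exc = FakeProviderError("rate limited", "gpt-4", "openai")

# Default BaseException.__reduce__: (class, args, instance __dict__)
cls, args, state = exc.__reduce__()

# Unpickling rebuilds the object as cls(*args) before restoring state,
# so the required model and llm_provider arguments are missing.
try:
    cls(*args)
    load_error = None
except TypeError as err:
    load_error = err

print(args)
print(type(load_error).__name__)
```

The attributes are present in the saved state, but they are applied only after the constructor call, which is where the failure occurs.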
Affected classes
All litellm exception types: APIConnectionError, RateLimitError, Timeout, ServiceUnavailableError, BadRequestError, AuthenticationError, NotFoundError, ContentPolicyViolationError, InternalServerError, BadGatewayError, PermissionDeniedError, UnprocessableEntityError, APIError, APIResponseValidationError, ContextWindowExceededError.
Reproduction
import cloudpickle
import litellm
exc = litellm.exceptions.RateLimitError(
message="rate limited", model="gpt-4", llm_provider="openai"
)
data = cloudpickle.dumps(exc)
cloudpickle.loads(data) # TypeError: __init__() missing required positional arguments
Fix
Monkey-patch __reduce__ on litellm exception classes so cloudpickle reconstructs them via Exception.__new__() + __dict__ restoration, bypassing __init__. See agent_memory_server/litellm_pickle_compat.py.
The root cause is upstream in litellm: its exception classes don't implement the pickle protocol methods. This has been reported there separately.
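A minimal sketch of the approach described in this section, not the contents of litellm_pickle_compat.py. The helper names (_rebuild_exception, _reduce_without_init, patch_pickle_support) and the stand-in class FakeProviderError are hypothetical. The sketch uses the standard library pickle module on the assumption that cloudpickle honors a custom __reduce__ in the same way. Passing args to Exception.__new__ is an extra choice made here so that exc.args survives the round trip.

```python
import pickle


class FakeProviderError(Exception):
    """Hypothetical stand-in; not a real litellm class."""

    def __init__(self, message, model, llm_provider):
        self.message = message
        self.model = model
        self.llm_provider = llm_provider
        super().__init__(message)


def _rebuild_exception(cls, args, state):
    # Allocate the instance without running cls.__init__,
    # then restore the saved attributes.
    exc = Exception.__new__(cls, *args)
    exc.__dict__.update(state)
    return exc


def _reduce_without_init(self):
    # Tell pickle to rebuild via _rebuild_exception instead of cls(*args).
    return (_rebuild_exception, (type(self), self.args, dict(self.__dict__)))


def patch_pickle_support(*exception_classes):
    # Monkey-patch __reduce__ on each class; subclasses inherit it.
    for cls in exception_classes:
        cls.__reduce__ = _reduce_without_init


patch_pickle_support(FakeProviderError)

original = FakeProviderError("rate limited", "gpt-4", "openai")
restored = pickle.loads(pickle.dumps(original))
print(type(restored).__name__, restored.model, restored.llm_provider)
```

With real litellm classes, the patch function would be called once at import time with the classes listed under "Affected classes", before any worker pickles an exception.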