Open
Description
Scalar evaluation is essentially carried out as binary matmuls. This is not obvious from the code, but it is effectively what happens after the VMAP transformation.
Kernels are generally not optimized for binary matmul, so one option is to use floating-point matmul instead.
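To make the floating-point option concrete, here is a minimal NumPy sketch. It is not the project's code: the function names are hypothetical, and it assumes "binary matmul" means a product of 0/1 matrices reduced mod 2. If the intended semantics are boolean OR/AND instead, replacing `% 2` with `> 0` should give the corresponding result.

```python
import numpy as np


def binary_matmul_via_float(a, b):
    """Multiply two 0/1 matrices mod 2 using a float32 matmul (hypothetical helper)."""
    # The float32 product counts matching 1-bits per output entry.
    # These counts are exact as long as the inner dimension stays below 2**24.
    counts = a.astype(np.float32) @ b.astype(np.float32)
    return (counts.astype(np.int64) % 2).astype(np.uint8)


def binary_matmul_reference(a, b):
    """Integer reference implementation used only to check the float path."""
    return ((a.astype(np.int64) @ b.astype(np.int64)) % 2).astype(np.uint8)


rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(64, 128), dtype=np.uint8)
b = rng.integers(0, 2, size=(128, 32), dtype=np.uint8)

result = binary_matmul_via_float(a, b)
```

The float path trades memory for speed: every bit is stored as a 32-bit float, but the multiplication itself can go through a standard optimized matmul routine.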
Another strategy is bit-packing, which would also save a significant amount of memory.
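A rough sketch of what bit-packing could look like, again in NumPy and with hypothetical helper names. It assumes 0/1 entries and a product reduced mod 2, packs eight entries into each byte with `np.packbits`, and replaces the multiply-accumulate with a bitwise AND followed by a popcount.

```python
import numpy as np

# Lookup table: number of set bits in each possible byte value.
POPCOUNT = np.array([bin(i).count("1") for i in range(256)], dtype=np.uint8)


def pack_rows(m):
    """Pack a 0/1 matrix row-wise, eight entries per byte (zero-padded at the end)."""
    return np.packbits(m.astype(np.uint8), axis=1)


def packed_matmul_mod2(a_packed, bt_packed):
    """Multiply packed A (rows of A) by packed B^T (columns of B), reduced mod 2."""
    # AND every row of A with every column of B, byte by byte: shape (n, m, k/8).
    anded = a_packed[:, None, :] & bt_packed[None, :, :]
    # Popcount of the AND equals the integer dot product of the two bit vectors.
    counts = POPCOUNT[anded].sum(axis=-1, dtype=np.int64)
    return (counts % 2).astype(np.uint8)


rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(16, 100), dtype=np.uint8)
b = rng.integers(0, 2, size=(100, 12), dtype=np.uint8)

a_packed = pack_rows(a)      # shape (16, 13)
bt_packed = pack_rows(b.T)   # shape (12, 13)
product = packed_matmul_mod2(a_packed, bt_packed)
```

Relative to one byte per entry the packed form is about 8x smaller, and about 32x smaller than float32 storage. The broadcasted intermediate in this sketch is itself memory-hungry, so a real kernel would fuse the AND and popcount steps rather than materialize it.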
Some profiling should be done as part of this ticket.
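As a possible starting point for that profiling, the following sketch times an integer matmul against a float32 matmul on the same 0/1 data using `timeit`. The matrix size, dtypes, and the `bench` helper are illustrative assumptions, not measurements or code from this project.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=(128, 128), dtype=np.uint8)
b = rng.integers(0, 2, size=(128, 128), dtype=np.uint8)

# Same 0/1 data in two candidate representations.
a_i, b_i = a.astype(np.int32), b.astype(np.int32)
a_f, b_f = a.astype(np.float32), b.astype(np.float32)


def bench(fn, repeat=5, number=10):
    """Return the best observed time per call, in seconds."""
    return min(timeit.repeat(fn, repeat=repeat, number=number)) / number


timings = {
    "int32 matmul": bench(lambda: a_i @ b_i),
    "float32 matmul": bench(lambda: a_f @ b_f),
}

for name, seconds in timings.items():
    print(f"{name}: {seconds * 1e6:.1f} us per call")
```

If the real code path runs under JAX, the timed function would also need to call `.block_until_ready()` on its result, since JAX dispatches work asynchronously and the timer would otherwise stop too early.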