Identified in #67 / #71. Complementary to #71's closed-form-order() shortcut.
Background
SymmetryGroup.order() (and elements()) memoize their results on the instance via _order / _elements slots (src/whest/_perm_group.py:327-331). On the same instance, the second .order() call is ~1μs.
But the cache is per-instance, not per-equivalence-class. SymmetryGroup __eq__ and __hash__ are value-based on the canonical action (src/whest/_perm_group.py:407-411), so two independently constructed S_n instances compare equal but each carry their own empty cache. That's the path PR #67's c58454d97 (fixed in ed50c3853) hit: direct_product_groups / restrict_group_to_axes allocate fresh SymmetryGroup objects on every _counted_ufunc_outer / tensordot call, so the composition pays the cold O(|G|) walk every time even though all the cost-model paths really want is a single canonical group per equivalence class.
Reproducer
import time, whest as we
a = we.SymmetryGroup.symmetric(axes=tuple(range(8)))
b = we.SymmetryGroup.symmetric(axes=tuple(range(8)))
print(a == b) # True (value equality)
print(a is b) # False (separate instances)
print(hash(a) == hash(b)) # True
t = time.perf_counter()
a.order()
print(f"a cold: {(time.perf_counter() - t) * 1000:.0f}ms") # ~1800 ms
# Independently constructed b — pays the cold cost again:
t = time.perf_counter()
b.order()
print(f"b cold: {(time.perf_counter() - t) * 1000:.0f}ms") # ~1800 ms (no shared cache)
Suggested approach
Intern SymmetryGroup instances by canonical action: a process-wide WeakValueDictionary[CanonicalAction, SymmetryGroup] (or equivalent) so every equal-by-canonical-action group resolves to the same Python object — thereby sharing one _order / _elements cache.
Sketch:
# src/whest/_perm_group.py
import weakref
_GROUP_INTERN: weakref.WeakValueDictionary[
tuple, # canonical action key
"SymmetryGroup",
] = weakref.WeakValueDictionary()
class SymmetryGroup:
def __new__(cls, *generators, axes=None):
# Build the prospective instance, compute its canonical action key,
# then return the interned singleton when one already exists.
prospective = super().__new__(cls)
prospective._init_state(generators, axes=axes) # the body of today's __init__
key = prospective._canonical_axis_action()
existing = _GROUP_INTERN.get(key)
if existing is not None:
return existing
_GROUP_INTERN[key] = prospective
return prospective
__init__ becomes a no-op for re-used instances (or a guard via an _initialized flag) since __new__ already populated state. All factories (.symmetric, .cyclic, .dihedral, .young, .from_generators, .direct_product) and helpers in _symmetry_utils.py (intersect_groups, direct_product_groups, restrict_group_to_axes, remap_group_axes, embed_group, broadcast_group) inherit interning automatically.
Implications and design considerations
__eq__ becomes identity-equivalent — a == b iff a is b for interned instances. The existing value-based __eq__ would still work and remain a correctness fallback for non-interned sources (e.g. unpickled / cross-process / future code paths that bypass interning).
- Hash stability —
__hash__ is already canonical-action-based, so the existing dict lookup logic works directly.
- Memory —
WeakValueDictionary lets unused groups GC naturally; only currently-referenced groups stay interned.
- Thread safety — needs a lock around the
__new__ get-or-insert, or rely on the GIL for CPython (fine for now).
- Mutation —
SymmetryGroup is supposed to be immutable, but verify nothing inside _perm_group.py rebinds _order / _elements after construction in a way that would corrupt the shared cache. (Today _dimino-driven population is the only mutation; that's idempotent and safe to share.)
- Equality with non-interned instances — until every construction path goes through
__new__'s intern-check, value-equality must keep working. Don't replace __eq__ with id-equality.
Relationship to #71
Two complementary fixes for the same family of perf issues:
| Fix |
Helps when |
Cost |
Closed-form order() for known kinds (#71) |
Group is constructed via a known factory (symmetric, cyclic, dihedral, young, direct_product of known kinds). Even fresh instances are cheap. |
A _known_kind tag and ~10 lines of factorial / gcd / product math. |
| Interning (this issue) |
Group is reconstructed identically across calls (any kind, including from_generators). Caches are shared. |
A WeakValueDictionary and a __new__ override. |
They stack: with both in place, known-kind groups are O(1) regardless of identity; unknown-kind groups are O(|G|) once per equivalence class per process; pathological calls that keep producing new equivalence classes still hit the degree-12 bail from PR #67.
#71's closed-form shortcut is preferable as the first fix because it doesn't change any visible semantics. Interning is a slightly heavier change (touches __new__, locking, weak references) but covers cases #71 doesn't (e.g. groups loaded from disk via from_payload, or composed via from_generators).
Acceptance criteria
SymmetryGroup.symmetric(axes=tuple(range(8))) constructed twice resolves to the same Python object (a is b).
- The second instance's
.order() is O(1) (test with time.perf_counter()).
- All existing tests in
tests/test_symmetry_group_api.py, tests/test_perm_group_coverage.py, and tests/test_symmetry_transport.py pass unchanged.
- Round-trip via
to_payload / from_payload returns the interned canonical instance.
- A new test pinning the interning behaviour in
tests/test_symmetry_group_api.py.
Related
Identified in #67 / #71. Complementary to #71's closed-form-
order()shortcut.Background
SymmetryGroup.order()(andelements()) memoize their results on the instance via_order/_elementsslots (src/whest/_perm_group.py:327-331). On the same instance, the second.order()call is ~1μs.But the cache is per-instance, not per-equivalence-class.
SymmetryGroup__eq__and__hash__are value-based on the canonical action (src/whest/_perm_group.py:407-411), so two independently constructedS_ninstances compare equal but each carry their own empty cache. That's the path PR #67'sc58454d97(fixed ined50c3853) hit:direct_product_groups/restrict_group_to_axesallocate freshSymmetryGroupobjects on every_counted_ufunc_outer/tensordotcall, so the composition pays the cold O(|G|) walk every time even though all the cost-model paths really want is a single canonical group per equivalence class.Reproducer
Suggested approach
Intern
SymmetryGroupinstances by canonical action: a process-wideWeakValueDictionary[CanonicalAction, SymmetryGroup](or equivalent) so every equal-by-canonical-action group resolves to the same Python object — thereby sharing one_order/_elementscache.Sketch:
__init__becomes a no-op for re-used instances (or a guard via an_initializedflag) since__new__already populated state. All factories (.symmetric,.cyclic,.dihedral,.young,.from_generators,.direct_product) and helpers in_symmetry_utils.py(intersect_groups,direct_product_groups,restrict_group_to_axes,remap_group_axes,embed_group,broadcast_group) inherit interning automatically.Implications and design considerations
__eq__becomes identity-equivalent —a == biffa is bfor interned instances. The existing value-based__eq__would still work and remain a correctness fallback for non-interned sources (e.g. unpickled / cross-process / future code paths that bypass interning).__hash__is already canonical-action-based, so the existing dict lookup logic works directly.WeakValueDictionarylets unused groups GC naturally; only currently-referenced groups stay interned.__new__get-or-insert, or rely on the GIL for CPython (fine for now).SymmetryGroupis supposed to be immutable, but verify nothing inside_perm_group.pyrebinds_order/_elementsafter construction in a way that would corrupt the shared cache. (Today_dimino-driven population is the only mutation; that's idempotent and safe to share.)__new__'s intern-check, value-equality must keep working. Don't replace__eq__withid-equality.Relationship to #71
Two complementary fixes for the same family of perf issues:
order()for known kinds (#71)symmetric,cyclic,dihedral,young,direct_productof known kinds). Even fresh instances are cheap._known_kindtag and ~10 lines of factorial / gcd / product math.from_generators). Caches are shared.WeakValueDictionaryand a__new__override.They stack: with both in place, known-kind groups are O(1) regardless of identity; unknown-kind groups are O(|G|) once per equivalence class per process; pathological calls that keep producing new equivalence classes still hit the degree-12 bail from PR #67.
#71's closed-form shortcut is preferable as the first fix because it doesn't change any visible semantics. Interning is a slightly heavier change (touches
__new__, locking, weak references) but covers cases #71 doesn't (e.g. groups loaded from disk viafrom_payload, or composed viafrom_generators).Acceptance criteria
SymmetryGroup.symmetric(axes=tuple(range(8)))constructed twice resolves to the same Python object (a is b)..order()is O(1) (test withtime.perf_counter()).tests/test_symmetry_group_api.py,tests/test_perm_group_coverage.py, andtests/test_symmetry_transport.pypass unchanged.to_payload/from_payloadreturns the interned canonical instance.tests/test_symmetry_group_api.py.Related
_is_oversized_for_cost_model+CostFallbackWarningbail, the workaround that this issue (and SymmetryGroup.order() is O(|G|) — closed-form shortcut for known group families #71) would let us tighten / lift.order()for known group families. Complementary fix to this one.unique_elements_for_shape). Same family of fixes; that one memoizes a derived quantity, this one canonicalizes the group itself.