Compound NumPy / linalg functions over-count FLOPs via internal ufuncs on WhestArray inputs #69

@spMohanty

Identified in #67 — flagged as a known limitation in the PR's "Explicitly out of scope" section.

Symptom

A user-facing np.linalg.cond(whest_array) call is correctly routed to we.linalg.cond by the __array_function__ allowlist (added in #67). For compound functions that have no dedicated whest wrapper, however, the protocol path strips the input to a plain np.ndarray and runs _np.<func>(stripped). NumPy then fires multiple internal ufunc dispatches, none of which re-enter the protocol path because the operands are now plain ndarrays. Whest still charges the wrapper-level dense outer cost, but the internal ufuncs contribute nothing to the count. Net effect: the total charged is whatever the wrapper estimates, which can be an under-count or an over-count relative to the internal ufunc work.
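The bypass can be reproduced without whest at all. The sketch below uses a hypothetical `CountingArray` subclass as a stand-in for WhestArray (it is not whest's implementation, and its "one unit per output element" cost is a toy assumption). It shows that a ufunc called on the wrapped array is counted, while a compound NumPy function called on the stripped array is not, even though it runs ufuncs internally.

```python
import numpy as np


class CountingArray(np.ndarray):
    """Hypothetical stand-in for WhestArray; not whest's real implementation."""

    flops = 0  # toy counter: one unit per output element of each ufunc call

    def __array_ufunc__(self, ufunc, method, *inputs, **kwargs):
        # Strip every CountingArray operand to a plain ndarray view so the
        # inner call does not re-enter this hook.
        def strip(x):
            return x.view(np.ndarray) if isinstance(x, CountingArray) else x

        inputs = tuple(strip(x) for x in inputs)
        if kwargs.get("out") is not None:
            kwargs["out"] = tuple(strip(x) for x in kwargs["out"])
        result = getattr(ufunc, method)(*inputs, **kwargs)
        CountingArray.flops += int(np.size(result))
        return result


a = np.arange(6.0).view(CountingArray)

# Direct ufunc on the wrapped array: enters the protocol and is counted.
CountingArray.flops = 0
np.add(a, a)
counted_direct = CountingArray.flops

# Compound function on the stripped array: np.var uses ufuncs internally,
# but its operands are plain ndarrays, so the hook never fires.
CountingArray.flops = 0
np.var(a.view(np.ndarray))
counted_stripped = CountingArray.flops

print(counted_direct, counted_stripped)  # 6 0
```

Whatever cost a wrapper charges around the second call is therefore the only cost recorded for it.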

The risk surface today is small because #67's allowlist routes most compound ops to whest's own wrappers. But any path that ends in a raw _np.<func>(stripped, ...) call (e.g. fallbacks, ops we deliberately don't model, future numpy additions) can hit this.

Reproducer (illustrative — needs a real-world hit to confirm)

import whest as we, numpy as np
A = we.random.randn(50, 50)
with we.BudgetContext(flop_budget=int(1e10)) as bc:
    np.linalg.cond(A)             # routes via whest's wrapper — counted correctly
    # In contrast, anything that ends in `_np.<compound>(_to_base_ndarray(A))`
    # internally without a dedicated whest wrapper will double-charge or under-charge.

A concrete reproducer would help here — whoever picks this up should grep for _to_base_ndarray(...) followed by _np.<func>(...) in linalg/, _pointwise.py, etc., construct a per-call-stack model, and identify which call paths over- vs. under-charge.
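As a starting point for that grep, here is a rough heuristic scanner. It assumes the strip helper is spelled `_to_base_ndarray(` and the NumPy alias is `_np`, as in the description above; the `window` parameter, the function name `find_raw_numpy_calls`, and the sample source are hypothetical. It only flags candidate sites and will produce false positives and negatives.

```python
import re

# Heuristic pattern for a raw NumPy call such as `_np.linalg.cond(`.
RAW_CALL = re.compile(r"_np\.((?:\w+\.)*\w+)\(")
STRIP_HELPER = "_to_base_ndarray("


def find_raw_numpy_calls(source: str, window: int = 3):
    """Return (line_number, numpy_function) pairs for `_np.<func>(` calls that
    appear on, or within `window` lines after, a `_to_base_ndarray(` use."""
    hits = []
    last_strip = None
    for lineno, line in enumerate(source.splitlines(), start=1):
        if STRIP_HELPER in line:
            last_strip = lineno
        if last_strip is not None and lineno - last_strip <= window:
            for match in RAW_CALL.finditer(line):
                hits.append((lineno, match.group(1)))
    return hits


# Hypothetical source text for illustration; not copied from whest.
sample = '''\
def cond(x, p=None):
    base = _to_base_ndarray(x)
    return _np.linalg.cond(base, p)

def untouched(x):
    return x + 1
'''
hits = find_raw_numpy_calls(sample)
print(hits)  # [(3, 'linalg.cond')]
```

Each hit still needs a manual read to decide whether a wrapper-level cost is charged around the call.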

Suggested approach

  1. Audit every site where whest charges a wrapper-level FLOP cost AND then calls a _np.<compound>(stripped) function. Compare the wrapper cost to the sum of ufunc costs the underlying numpy implementation would charge if every ufunc entered the protocol.
  2. Where they don't match, either:
    • Match numpy's internal-ufunc count by adjusting the wrapper cost formula, or
    • Add a dedicated whest wrapper that drives the compound op via whest-counted primitives (preferred when feasible).
  3. Add a regression test that pins wrapper-cost ≈ sum-of-internal-ufunc-cost for each audited op.
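The comparison in steps 1 and 3 could be sketched roughly as follows. `AuditRow`, the 5% relative tolerance, the op names, and all cost numbers are placeholders chosen for illustration, not measured whest costs; how the two costs are obtained for a real op is left open.

```python
import math
from dataclasses import dataclass


@dataclass
class AuditRow:
    op: str
    wrapper_cost: int   # what the wrapper charges today
    internal_cost: int  # sum of ufunc costs if every ufunc were counted

    def verdict(self, rel_tol: float = 0.05) -> str:
        """Classify the wrapper cost as 'match', 'over', or 'under'."""
        if math.isclose(self.wrapper_cost, self.internal_cost, rel_tol=rel_tol):
            return "match"
        return "over" if self.wrapper_cost > self.internal_cost else "under"


# Placeholder numbers for illustration only.
rows = [
    AuditRow("op_a", wrapper_cost=1000, internal_cost=1020),
    AuditRow("op_b", wrapper_cost=3000, internal_cost=1000),
    AuditRow("op_c", wrapper_cost=500, internal_cost=2000),
]

for row in rows:
    print(f"{row.op:<6} {row.wrapper_cost:>6} {row.internal_cost:>6} {row.verdict()}")
```

A regression test for an audited op would then assert that its row's verdict is "match", or carry an explicit justification when it is not.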

Acceptance criteria

  • A documented audit table listing each compound op and whether its current cost formula over-counts, under-counts, or matches numpy's internal-ufunc decomposition.
  • For each over-counter, a fix or an explicit comment justifying the over-count.

Labels

  • area:core: Core whest API, counting, ndarray, and dispatch/wrapping paths
  • bug: Something isn't working
  • priority:p3: Someday / aspirational
  • topic:flop-accounting: FLOP counting, budget deduction, cost models, and accounting policy
