Skip to content

Reorganize compile module and remove deprecated functionality#1351

Merged
ricardoV94 merged 11 commits intopymc-devs:v3from
ricardoV94:reduce_function_overhead
Apr 8, 2026
Merged

Reorganize compile module and remove deprecated functionality#1351
ricardoV94 merged 11 commits intopymc-devs:v3from
ricardoV94:reduce_function_overhead

Conversation

@ricardoV94
Copy link
Copy Markdown
Member

@ricardoV94 ricardoV94 commented Apr 8, 2025

Removing all the deprecated stuff that nobody uses

Update: Many of the performance-critical changes were already adopted in spin-off, so this PR is mostly deprecating old stuff and refactoring things

Old discussion on perf

I think after this there's not much more fat to trim. We could try to avoid the input/output list thing with linkers that don't need it but we still need to sort them/ recognize kwargs which is pretty much what's left.

The C part of the call takes 100ns, so the "overhead" in this case is 200-250ns.
For reference, calling numpy.zeros(100) takes like ~250ns, and a python identity function is ~20ns.

Results of the new benchmark

Before
--------------------------------------------------------------------------------------------------------------------- benchmark: 4 tests ---------------------------------------------------------------------------------------------------------------------
Name (time in ns)                                                                Min                    Max                  Mean              StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=True]        524.0000 (1.0)      55,453.8000 (8.24)       574.7866 (1.0)      326.9482 (3.11)       560.5500 (1.0)      24.5500 (1.0)      301;2121    1,739.7760 (1.0)       82217          20
test_identity_function_overhead_benchmark[cvm-trust_input=True]             590.9999 (1.13)      6,733.0000 (1.0)        654.5138 (1.14)     105.2706 (1.0)        650.9999 (1.16)     30.9999 (1.26)       89;486    1,527.8516 (0.88)      19552           1
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=False]     1,432.0001 (2.73)     31,008.0000 (4.61)     1,608.2195 (2.80)     296.4835 (2.82)     1,583.0001 (2.82)     50.0002 (2.04)     722;1368      621.8057 (0.36)      37638           1
test_identity_function_overhead_benchmark[cvm-trust_input=False]          1,463.0000 (2.79)     11,541.9998 (1.71)     1,614.7948 (2.81)     329.0091 (3.13)     1,582.9999 (2.82)     60.0000 (2.44)       78;152      619.2737 (0.36)       3899           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
After
--------------------------------------------------------------------------------------------------------------------- benchmark: 4 tests ---------------------------------------------------------------------------------------------------------------------
Name (time in ns)                                                                Min                    Max                  Mean              StdDev                Median                IQR            Outliers  OPS (Kops/s)            Rounds  Iterations
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=True]        322.7857 (1.0)       5,672.7143 (1.0)        358.2689 (1.0)       75.7251 (1.0)        354.9286 (1.0)      12.2143 (1.0)     1548;2274    2,791.1995 (1.0)      199641          14
test_identity_function_overhead_benchmark[cvm-trust_input=True]             439.9999 (1.36)     15,960.0002 (2.81)       505.0075 (1.41)     135.9068 (1.79)       500.9999 (1.41)     29.9999 (2.46)      136;793    1,980.1684 (0.71)      32064           1
test_identity_function_overhead_benchmark[cvm_nogc-trust_input=False]     1,312.0000 (4.06)      8,876.9998 (1.56)     1,446.4540 (4.04)     179.0074 (2.36)     1,442.0000 (4.06)     49.9999 (4.09)      114;493      691.3459 (0.25)      36521           1
test_identity_function_overhead_benchmark[cvm-trust_input=False]          1,382.0002 (4.28)      8,705.9998 (1.53)     1,525.5059 (4.26)     198.9647 (2.63)     1,512.9999 (4.26)     50.9999 (4.18)       42;142      655.5202 (0.23)       7106           1
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------


I also removed the strict zip, because passing strict=False is more than 100ns slower than not specifying it at all. Check out with this snippet:

a = tuple(range(10))
b = tuple(range(9))
%timeit tuple(zip(a, b, strict=False))
%timeit tuple(zip(a, b))
587 ns ± 8.29 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
458 ns ± 17.4 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

Closes #1332
Does the same as and closes #1976
Closes #1647


📚 Documentation preview 📚: https://pytensor--1351.org.readthedocs.build/en/1351/

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 2 times, most recently from 0dc3147 to 782bc2d Compare April 9, 2025 18:07
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 4 times, most recently from 9f026c7 to 4c1f269 Compare April 24, 2025 17:56
@ricardoV94 ricardoV94 changed the title Reduce pytensor function call overhead Remove deprecated PyTensor function functionality and reduce overhead Apr 24, 2025
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 3 times, most recently from a15d0e5 to 482cded Compare April 25, 2025 17:05
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 482cded to b5b538f Compare March 18, 2026 09:37
@ricardoV94 ricardoV94 changed the base branch from main to v3 March 18, 2026 09:38
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 7 times, most recently from 2e801bc to aaf2419 Compare March 19, 2026 13:16
@ricardoV94 ricardoV94 marked this pull request as ready for review March 19, 2026 14:48
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from aaf2419 to 7033561 Compare March 24, 2026 11:33
@ricardoV94 ricardoV94 requested a review from Armavica March 24, 2026 15:02
@Armavica
Copy link
Copy Markdown
Member

@ricardoV94 What is this test_overhead_benchmark function?

@ricardoV94
Copy link
Copy Markdown
Member Author

Something that changed name in #1993 probably, let me find and rerun

@ricardoV94
Copy link
Copy Markdown
Member Author

Ah dropped during rebase, it was new to this PR?...

@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 7033561 to a6fd083 Compare March 24, 2026 16:29
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 6 times, most recently from 7c1c3db to ae25e2a Compare April 5, 2026 21:55
@ricardoV94
Copy link
Copy Markdown
Member Author

mypy failures look unrelated...

@ricardoV94 ricardoV94 changed the title Reorganize pytensor/compile module and remove deprecated functionality Reorganize compile module and remove deprecated functionality Apr 6, 2026
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch 2 times, most recently from 7cc06ff to 2b325e6 Compare April 6, 2026 14:26
@ricardoV94
Copy link
Copy Markdown
Member Author

Second round, scope creep so original reviews are pretty much moot

@@ -225,56 +138,6 @@ Every element of the inputs list will be upgraded to an In instance if necessary

- a tuple ``(name, r)`` will be ``In(r, name=name)``
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO : Remove this as well

Remove pfunc as a separate entry point. function() now directly calls
construct_function_ins_and_outs + FunctionMaker, eliminating an
unnecessary layer of indirection.

Also:
- Inline _pfunc_param_to_in as a local closure (param_to_in), drop
  unused strict parameter
- Fix stale docstring in construct_function_ins_and_outs

Remove redundant tests between TestFunction (ex-TestPfunc) and
TestFunctionIn:
- test_param_strict (duplicate of test_in_strict)
- test_param_mutable (duplicate of test_in_mutable)
- test_param_allow_downcast_int (duplicate of test_in_allow_downcast_int)
- test_param_allow_downcast_floatX (duplicate of test_in_allow_downcast_floatX)
- test_param_allow_downcast_vector_floatX (duplicate of test_in_allow_downcast_vector_floatX)
- test_explicit_shared_input (duplicate of test_no_shared_as_input)
Only used internally in construct_function_ins_and_outs, replace with
direct isinstance(updates, dict) checks at the two call sites.
Replace orig_function usage in Scan with direct mode.function_maker()
call. The profiling that orig_function did is already handled by
FunctionMaker itself.
Subsume std_fgraph as a staticmethod on FunctionMaker and inline
_optcheck_fgraph in DebugMode._Maker where it was the only caller.

Also fixes a bug where the clone parameter was computed but not
passed to FunctionGraph.__init__.
Replace with flat modules:
- compile/maker.py: function(), FunctionMaker
- compile/executor.py: Function class, pickle support,
AliasedMemoryError
- compile/aliasing.py: Supervisor, alias/view/deepcopy utilities
- compile/rebuild.py: rebuild_collect_shared,
construct_function_ins_and_outs

Also moves get_info_on_inputs into Function as a static method and
inlines fgraph_updated_vars at its only call site.
- compile/debugmode.py -> compile/debug/debugmode.py
- compile/nanguardmode.py -> compile/debug/nanguardmode.py
- compile/monitormode.py -> compile/debug/monitormode.py
- compile/profiling.py -> compile/debug/profiling.py
- Move function_dump into compile/debug/dump.py
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 2b325e6 to 85bdc1a Compare April 8, 2026 14:53
@ricardoV94 ricardoV94 force-pushed the reduce_function_overhead branch from 5e652bf to 6ad2880 Compare April 8, 2026 15:09
@ricardoV94 ricardoV94 merged commit 2f2e86a into pymc-devs:v3 Apr 8, 2026
66 checks passed
@ricardoV94 ricardoV94 deleted the reduce_function_overhead branch April 8, 2026 15:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pytensor.compile.function.types.py is a puzzling file name Do not allow passing None for arbitrary arguments in Function

4 participants