perf: avoid MeasurementUnit enum boxing allocations#5218
perf: avoid MeasurementUnit enum boxing allocations#5218jamescrosswell wants to merge 7 commits into
Conversation
Avoid enum boxing and ToLowerInvariant allocations in MeasurementUnit by storing enum kind/value and using precomputed lowercase unit names. Verified: dotnet test SentryNoMobile.slnf (fails only on known baseline net48 Sentry.AspNet.Tests issues reproducible on main). #skip-changelog Co-Authored-By: goose <goose@aaif.dev>
There was a problem hiding this comment.
thought:
This is a similar implementation a coding agent on my end suggested when I gave it a try when looking at it with a friend of mine just last weekend 😉.
I'm not too certain if I'm happy with this approach, though:
- the size of the struct is increasing
- we embed more data into the assembly (via the
string-arrays)
On the other hand, this (or a similar) implementation is necessary, in order to not change the IEquatable-semantics.
We were then experimenting with the struct only having a string field,
but that would change the Equality-semantics, where
MeasurementUnit.Duration.Millisecond == MeasurementUnit.Custom("millisecond")- old/this: would not be equal, because constructed differently
- only with
stringfield: would now become equal, being a behavioral change
There was a problem hiding this comment.
wait ... actually ... I believe I am wrong concerning the size ... at least partially wrong
- previous
System.Enum+System.String- on 64-bit: 8-byte-reference + 8-byte reference = 16B
- on 32-bit: 4-byte-reference + 4-byte reference = 8B
- this change
UnitKindenum (byte) +System.Int32+System.String- on 64-bit: 1-byte enum + 4-byte int + 3-byte padding, 8-byte reference = 16B
- on 32-bit: 1-byte enum + 3-byte padding + 4-byte int + 4-byte reference + padding = 12B
So we are only increasing the size of the struct by 4B on 32-bit systems/processes.
Yeah ... that's neglectable considering we no longer box to System.Enum.
There was a problem hiding this comment.
about the Equals-Semantics
If we would only keep a string field, assigning the string via a switch on constructions, we would simplify the code, reduce the struct size to 4B/8B (32-/64-bit systems); but we would change the Equals, as it would no longer matter whether the MeasurementUnit was constructed via MeasurementUnit.Custom or the respective implicit-ennum-conversion.
This being a behavioral change, we could simplify MeasurementUnit in the next Major ... or should we keep the Equals-Behavior as is, where there is a difference how the MeasurementUnit was constructed.
What do you think?
There was a problem hiding this comment.
To err on the side of caution (since this API is probably quite widely used already), maybe we delay the behavioural change (and saving the extra 4B) until the next major release.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #5218 +/- ##
==========================================
+ Coverage 74.13% 74.21% +0.08%
==========================================
Files 506 507 +1
Lines 18292 18350 +58
Branches 3576 3583 +7
==========================================
+ Hits 13561 13619 +58
- Misses 3859 3860 +1
+ Partials 872 871 -1 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| Custom = 4 | ||
| } | ||
|
|
||
| private static readonly string[] DurationNames = |
There was a problem hiding this comment.
thought: switch instead of Array-Lookups
My AI assistant also suggested a similar solution.
Instead of embedding these arrays into the assembly and then indexing into these arrays,
we could do a switch in all of the three (Duration, Fraction, Information) implicit conversions and assign the name/string directly, while still keeping the new UnitKind enum so that we don't change Equality-semantics on how the struct is created (MeasurementUnit.Custom or via any of the three implicit "from-enum"-conversions).
There was a problem hiding this comment.
Thanks @Flash0ver. I didn't much like the duplication in the old code at all.
We could do a switch statement, but I think it would suffer from the same duplication and whilst it would reduce heap allocations (likley G2 for statics), we grow the size of the codebase by the same amount we save in heap allocations. I think the end result is having to load more data into L1 cache every time the method executes, which might actually be less performant.
I changed the LLMs code so that the names are generated dynamically in the static constructor (negligible one time hit at startup and allocates a couple of tiny arrays to G2 heap that won't trouble the GC). That makes the code easier to maintain whilst still avoiding the boxing, I believe.
There was a problem hiding this comment.
Dang... Enum.GetNames isn't trim safe.
Seems we have to go with either hard coded arrays or a source generator. Hard coded is probably easier - will add some unit tests to ensure these don't break if we add any enum values in the future.
If we did want to use a source generator, this is actually the example that Andrew Lock uses in his blog about them, so he's already built and tested some we could use OOTB.
Co-Authored-By: goose <goose-noreply@example.com>
Co-Authored-By: goose <goose-noreply@example.com>
Co-Authored-By: goose <goose-noreply@example.com>
Co-Authored-By: Goose <goose-noreply@example.com>
Summary
Implements #4844 by removing avoidable allocations in
MeasurementUnit:Enumstorage with compact internal representation (UnitKind+int value)Enum.ToString().ToLowerInvariant()with precomputed lowercase name lookups for built-in unitsThis removes both enum boxing and per-call lowercase string allocation on the enum-backed path.
Testing
Ran:
dotnet test SentryNoMobile.slnftest/Sentry.AspNet.Tests(net48) has 2 failing tests due toMissingMethodExceptioninSentryAspNetOptionsExtensionsTests, reproducible onmainwith no local changes.Reproduced baseline on main:
dotnet test test/Sentry.AspNet.Tests/Sentry.AspNet.Tests.csproj -f net48main.#skip-changelog