Decompose byte_extract from union through widest member #8950
tautschnig wants to merge 1 commit into diffblue:develop
Pull request overview
Reduces byte_extract lowering blow-ups on unions with non-constant offsets by decomposing the access through a union’s widest member, and adds a regression test to prevent the performance regression reported in #8813.
Changes:
- Add union handling to the non-constant-offset `get_subexpression_at_offset` by recursing via the widest union component.
- Add a new regression test (`union_performance1`) exercising non-constant indexing through same-typed union members.
- Add a `simplify_byte_extract` shortcut that unwraps single-operand struct/union expressions before simplifying.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/util/simplify_expr.cpp | Adds an early simplification path for byte_extract applied to struct/union constructor expressions with a single operand. |
| src/util/pointer_offset_size.cpp | Decomposes non-constant-offset subexpression extraction on unions via the widest component to avoid expensive lowering. |
| regression/cbmc/union_performance1/test.desc | Registers a new regression test for the union non-constant-offset case. |
| regression/cbmc/union_performance1/main.c | Test program that writes through one union arm and reads through another with a symbolic index. |
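To make the access pattern in the table concrete, here is a minimal sketch of a write through one union arm followed by a read through the other at a runtime index. It is not the contents of `main.c`: the names `poly_pair`, `write_a_read_b` and the size `N` are assumptions made for illustration, and under CBMC the index would be left nondeterministic rather than passed as a parameter.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define N 8 /* hypothetical size, chosen only for this sketch */

typedef struct
{
  int32_t coeffs[N];
} poly;

/* Both arms have the same type, so each one covers the whole union. */
typedef union
{
  poly a;
  poly b;
} poly_pair;

/* Write through arm `a`, then read the same element through arm `b`.
   `i` is only known at run time, which mirrors the symbolic index
   that leads to a non-constant offset. */
int32_t write_a_read_b(size_t i, int32_t value)
{
  poly_pair u = {{{0}}};
  assert(i < N);
  u.a.coeffs[i] = value;
  return u.b.coeffs[i];
}
```

Calling `write_a_read_b(3, 42)` stores through `a` and reads through `b`; because both arms are the same type, the read returns the stored value.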
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## develop #8950 +/- ##
========================================
Coverage 80.46% 80.47%
========================================
Files 1704 1704
Lines 188665 188689 +24
Branches 73 73
========================================
+ Hits 151805 151839 +34
+ Misses 36860 36850 -10
When byte_extract is applied to a union-typed expression with a non-constant computed offset, the byte_extract lowering creates a massive expression because it must consider all possible byte layouts. For unions where the widest member covers the entire union, we can instead decompose the access through that member, avoiding the expensive lowering.
Add union handling to the non-constant-offset overload of get_subexpression_at_offset: for symbol and member expressions of union type, recurse into the widest member. Guard this on the widest member's size equalling the union's size (no trailing padding).
The constant-offset overload is left unchanged to preserve existing simplification behavior (e.g., byte_extract(byte_update(...)) patterns used during constant propagation).
On the reproducer from diffblue#8813, the union version now takes 1.0s (previously 94s), matching the struct version at 1.0s.
Fixes: diffblue#8813
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
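The decomposition described above can be illustrated at the C level, as a sketch only: when the widest member is as large as the union, any byte of the union at a runtime offset is also a byte of that member at the same offset. The type `u_exact` and the three helper functions are hypothetical names for this example; they do not correspond to anything in CBMC's sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef union
{
  uint8_t small[4];
  uint32_t wide[4]; /* widest member */
} u_exact;

/* The guard: the widest member must span the entire union
   (no trailing padding beyond it). */
int widest_covers_union(void)
{
  return sizeof(((u_exact *)0)->wide) == sizeof(u_exact);
}

/* Read one byte of the union object at a runtime offset. */
uint8_t byte_of_union(const u_exact *u, size_t off)
{
  uint8_t b;
  assert(off < sizeof *u);
  memcpy(&b, (const unsigned char *)u + off, 1);
  return b;
}

/* Read the same byte, but only by looking at the widest member. */
uint8_t byte_via_widest(const u_exact *u, size_t off)
{
  uint8_t b;
  assert(off < sizeof u->wide);
  memcpy(&b, (const unsigned char *)u->wide + off, 1);
  return b;
}
```

When `widest_covers_union()` holds, the two readers agree for every valid offset, which is the property that lets the access be rewritten in terms of the widest member.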
f9c0f63 to f2244fd
Many thanks @tautschnig! We're trying this out with mldsa-native (pq-code-package/mldsa-native#1016) and will get back to you when we have an understanding of the impact.
Thank you so much for looking into this, @tautschnig!! Below is an example that is closer to the data structures used in mldsa-native. With a struct this takes less than a second, with a union it takes 90 seconds on my M4 MacBook.
typedef struct
{
  int32_t coeffs[256];
} poly;
typedef struct
{
  poly vec[8];
} polyvec;
void bar(polyvec *r)
__contract__(
  requires(memory_no_alias(r, sizeof(polyvec)))
  assigns(memory_slice(r, sizeof(polyvec)))
  ensures(array_abs_bound(r->vec[0].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[1].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[2].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[3].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[4].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[5].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[6].coeffs, 0, 256, 1 << 23))
  ensures(array_abs_bound(r->vec[7].coeffs, 0, 256, 1 << 23))
);
void harness(void)
{
#if 1 /* CHANGE THIS */
  union
#else
  struct
#endif
  {
    polyvec y;
    polyvec h;
  } yh;
  polyvec *y = &yh.y;
  polyvec *h = &yh.h;
  bar(y);
  bar(h);
}
Full example with instructions: sign.c
Thank you @mkannwischer - will take a look!
When byte_extract is applied to a union-typed expression with a non-constant computed offset, the byte_extract lowering creates a massive expression because it must consider all possible byte layouts. For unions where the widest member covers the entire union, we can instead decompose the access through that member, avoiding the expensive lowering.
Add union handling to the non-constant-offset overload of get_subexpression_at_offset: for symbol and member expressions of union type, recurse into the widest member. This is safe because all union members share the same byte layout starting at offset 0.
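As a rough illustration of the offset-0 argument, the following sketch checks that every member of a union starts at offset 0, and measures how many bytes of the union lie beyond its widest member. The type `u_padded` and both helper names are assumptions for this example. On common ABIs the alignment of `int32_t` likely rounds this union up past the 5-byte `tag` member, but the exact amount is implementation-defined, so the sketch computes it rather than hard-coding it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* `tag` is the widest member (5 bytes), but `word` may force a
   stricter alignment and therefore trailing padding. */
typedef union
{
  char tag[5];
  int32_t word;
} u_padded;

/* Every union member begins at the start of the union object. */
int members_start_at_zero(void)
{
  return offsetof(u_padded, tag) == 0 && offsetof(u_padded, word) == 0;
}

/* Bytes of the union that the widest member does not cover. */
size_t bytes_beyond_widest(void)
{
  return sizeof(u_padded) - sizeof(((u_padded *)0)->tag);
}
```

A non-zero result from `bytes_beyond_widest()` marks a union where offsets near the end fall outside the widest member, so going through that member would not cover every byte of the union.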
The constant-offset overload is left unchanged to preserve existing simplification behavior (e.g., byte_extract(byte_update(...)) patterns used during constant propagation).
Fixes: #8813