boolbv_index: handle incomplete extern array types in symbol registration #9055
Open
tautschnig wants to merge 2 commits into
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
Fixes a boolbv symbol-registration bug when flattening index expressions over incomplete extern T arr[] array symbols by avoiding zero-width literal-map entries, and adds a regression test reproducing the kernel _ctype[] case.
Changes:
- Skip `boolbv_mapt::get_literals` registration when array width is unknown during index conversion.
- Duplicate the same guard for both the byte-operator and non-byte-operator index paths.
- Add regression `regression/cbmc/incomplete_extern_array1/` to prevent the invariant failure from recurring.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
| File | Description |
|---|---|
| src/solvers/flattening/boolbv_index.cpp | Avoids registering array symbols in the literal map when width is unknown, preventing zero-width invariant failures. |
| regression/cbmc/incomplete_extern_array1/test.desc | Adds a regression harness asserting success and absence of prior failure strings. |
| regression/cbmc/incomplete_extern_array1/main.c | Reproduces indexing into an incomplete extern array at multiple indices. |
Comment on lines 53 to +64

      const auto &array_width_opt = bv_width.get_width_opt(array_type);
    - (void)map.get_literals(
    -   final_array.get(ID_identifier),
    -   array_type,
    -   array_width_opt.value_or(0));
    + // Skip the registration when the array width is
    + // unknown (e.g. `extern T arr[]`). Passing 0 to
    + // get_literals creates a zero-width entry that
    + // trips its size-equals-width invariant when the
    + // same symbol is, or has been, registered at a
    + // non-zero width via its element-typed access path.
    + if(array_width_opt.has_value())
    + {
    +   (void)map.get_literals(
    +     final_array.get(ID_identifier), array_type, *array_width_opt);
    + }
Comment on lines 80 to +87

      const auto &array_width_opt = bv_width.get_width_opt(array_type);
    - (void)map.get_literals(
    -   array.get(ID_identifier), array_type, array_width_opt.value_or(0));
    + // See comment above for the same case in the
    + // byte-operator branch.
    + if(array_width_opt.has_value())
    + {
    +   (void)map.get_literals(
    +     array.get(ID_identifier), array_type, *array_width_opt);
    + }

    @@ -51,10 +51,17 @@ bvt boolbvt::convert_index(const index_exprt &expr)
      final_array.id() == ID_symbol || final_array.id() == ID_nondet_symbol)
      {
        const auto &array_width_opt = bv_width.get_width_opt(array_type);
    @@ -71,8 +78,13 @@ bvt boolbvt::convert_index(const index_exprt &expr)
      if(array.id() == ID_symbol || array.id() == ID_nondet_symbol)
      {
        const auto &array_width_opt = bv_width.get_width_opt(array_type);

    ^SIGNAL=0$
    ^VERIFICATION SUCCESSFUL$
    --
    ^warning: ignoring
b0f5503 to 2e277ce
…tion
When boolbv flattens an array index expression where the array is a symbol of unbounded array_typet (e.g. an incomplete declaration like 'extern T arr[]'), it tried to register that symbol with boolbv_mapt::get_literals using a width of array_width_opt.value_or(0). Passing 0 created a zero-width entry that tripped the size-equals-width invariant in get_literals when the same symbol was, or had been, registered at a non-zero width via its element-typed access path (e.g. T arr[i] returns a T-width value).
Skip the registration when the array width is unknown. The unknown-width array cannot be flattened anyway; the array decision procedure handles it through the byte-operator and free-variable branches, for which this registration is purely a caching helper.
This was first surfaced by integration/linux scans of kernel sources that include <linux/ctype.h>, which declares 'extern const unsigned char _ctype[]', an incomplete extern array referenced by the is*() classifier macros that expand to '_ctype[c]'. CBMC aborted at boolbv_map.cpp:68 on every such TU.
Regression test: regression/cbmc/incomplete_extern_array1/
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
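The failure mode in this commit message can be illustrated with a small stand-alone model. The sketch below is not CBMC code: `toy_literal_mapt`, `register_unguarded` and `register_guarded` are hypothetical names, and the map only mimics the size-equals-width check described above, under the assumption that the first registration fixes the width of an entry.

```cpp
#include <cstddef>
#include <map>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Toy stand-in for a literal map (NOT CBMC's boolbv_mapt): each identifier
// owns a fixed number of literal slots, set by the first registration.
class toy_literal_mapt
{
public:
  // Registers `id` at `width` on first use; later calls must agree on width.
  const std::vector<int> &
  get_literals(const std::string &id, std::size_t width)
  {
    // emplace leaves an existing entry untouched.
    auto result = entries.emplace(id, std::vector<int>(width, 0));
    const std::vector<int> &slots = result.first->second;
    if(slots.size() != width)
      throw std::logic_error("literal map size must equal requested width");
    return slots;
  }

  bool contains(const std::string &id) const
  {
    return entries.count(id) != 0;
  }

private:
  std::map<std::string, std::vector<int>> entries;
};

// Old behaviour (modelled): an unknown width silently becomes 0.
void register_unguarded(
  toy_literal_mapt &map,
  const std::string &id,
  std::optional<std::size_t> width)
{
  (void)map.get_literals(id, width.value_or(0));
}

// New behaviour (modelled): skip registration when the width is unknown.
void register_guarded(
  toy_literal_mapt &map,
  const std::string &id,
  std::optional<std::size_t> width)
{
  if(width.has_value())
    (void)map.get_literals(id, *width);
}
```

In this model, calling `register_unguarded` with `std::nullopt` creates a zero-slot entry, so a later `get_literals` call at width 8 throws. With `register_guarded` nothing is registered, and the later call creates the entry at the width it asks for.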
The incremental SMT2 back-end keyed its set of already-declared functions on the full expression (irept). An incomplete `extern T arr[]` symbol can be reached via two expressions that share the same SSA identifier but differ only in their (sort-equivalent) array-type irep -- e.g. _ctype[c] and _ctype[c + 1] from <linux/ctype.h>. Those are distinct expression keys, so both reached send_function_definition and emitted a second (declare-fun |_ctype#1| () (Array ...)), which z3 rejects with "constant '_ctype#1' (with the given signature) already declared".
Deduplicate by SSA identifier in send_function_definition: if the symbol is already in identifier_table, map the later expression to the existing declaration instead of re-declaring it.
With this, regression/cbmc/incomplete_extern_array1 also passes under the incremental SMT2 back-end, so its no-new-smt tag is removed.
Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
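The difference between keying on the whole expression and keying on the SSA identifier can be sketched in isolation. This is a simplified model rather than the back-end's code: `toy_exprt`, `declare_by_expression` and `declare_by_identifier` are hypothetical, and a plain string stands in for the array-type irep.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified expression: an SSA identifier plus a printed
// form of its type. Real ireps are far richer than this.
struct toy_exprt
{
  std::string identifier;
  std::string type_repr;
};

// Keyed on the whole expression: two expressions with the same identifier
// but different type representations both trigger a declaration.
std::vector<std::string>
declare_by_expression(const std::vector<toy_exprt> &exprs)
{
  std::set<std::pair<std::string, std::string>> seen;
  std::vector<std::string> declared;
  for(const auto &e : exprs)
  {
    if(seen.insert({e.identifier, e.type_repr}).second)
      declared.push_back(e.identifier);
  }
  return declared;
}

// Keyed on the identifier only: a later expression naming an identifier
// that is already known reuses the earlier declaration.
std::vector<std::string>
declare_by_identifier(const std::vector<toy_exprt> &exprs)
{
  std::set<std::string> seen;
  std::vector<std::string> declared;
  for(const auto &e : exprs)
  {
    if(seen.insert(e.identifier).second)
      declared.push_back(e.identifier);
  }
  return declared;
}
```

Given two `toy_exprt` values that both name `_ctype#1` but carry different `type_repr` strings, the first function reports two declarations and the second reports one, which corresponds to the duplicate `declare-fun` described above and its removal.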
2e277ce to 232ff03
Codecov Report
✅ All modified and coverable lines are covered by tests.
Coverage Diff
| | develop | #9055 | +/- |
|---|---|---|---|
| Coverage | 80.68% | 80.69% | |
| Files | 1714 | 1714 | |
| Lines | 189501 | 189515 | +14 |
| Branches | 73 | 73 | |
| Hits | 152902 | 152920 | +18 |
| Misses | 36599 | 36595 | -4 |