Linking: structural-equivalence-aware type comparison#9061
Conversation
There was a problem hiding this comment.
Pull request overview
Note
Copilot was unable to run its full agentic suite in this review.
This PR improves linker behavior when comparing types across translation units by adding a structural-equivalence check for struct/union types whose irept encodings differ due to differing completion states (forward-declared vs complete), reducing unnecessary renaming and spurious conflicts.
Changes:
- Add a recursive structural-equivalence predicate that follows tag references across both namespaces and use it as a permissive “last chance” check in type-collision handling.
- Thread a
src_ns(namespace over the source symbol table) into collision/renaming paths so new-side tag references can be resolved during comparisons. - Add regression test
Linking12covering the forward-declared-vs-complete nested-anonymous-tag pattern; adjust an existing ansi-c regression expectation.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/linking/linking_class.h | Extend type-renaming/type-duplication APIs to accept src_ns for cross-namespace tag resolution. |
| src/linking/linking.cpp | Implement and wire in structurally_equivalent, and construct src_ns during linking / symbol copying. |
| regression/cbmc/Linking12/* | New regression capturing the reported structural-equivalence scenario. |
| regression/cbmc/Linking10/test.desc | Add/clarify regression coverage around parameter replacement accumulation. |
| regression/cbmc/Linking11/* | Add regression ensuring renamed struct tags in function parameters are updated during linking. |
| regression/ansi-c/linking_conflicts2/test.desc | Update expected exit code / diagnostic text for a conflict case. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // Both are tag references of the same kind: follow into their definitions. | ||
| if( | ||
| t1.id() == t2.id() && (t1.id() == ID_struct_tag || | ||
| t1.id() == ID_union_tag || t1.id() == ID_c_enum_tag)) | ||
| { | ||
| const irep_idt &id1 = to_tag_type(t1).get_identifier(); | ||
| const irep_idt &id2 = to_tag_type(t2).get_identifier(); | ||
|
|
||
| // Cycle guard: once we have started comparing this pair, treat them as | ||
| // equivalent at any deeper occurrence to avoid infinite recursion on | ||
| // mutually-recursive struct definitions like list_head. | ||
| std::string key = id2string(id1) + "\x01" + id2string(id2); | ||
| if(!visited.insert(key).second) | ||
| return true; | ||
|
|
||
| return structurally_equivalent( | ||
| follow_tags_symbols(main_ns, t1), | ||
| follow_tags_symbols(src_ns, t2), | ||
| main_ns, | ||
| src_ns, | ||
| visited); | ||
| } |
| // If either side is incomplete, accept the pair: linking will keep the | ||
| // complete one. We do not require the C source tag to match here because | ||
| // anonymous inner types may carry differing synthetic identifiers across | ||
| // translation units. | ||
| if(inc1 || inc2) | ||
| return true; |
| if(info.old_symbol.type.id() == ID_code) | ||
| { | ||
| diag.warning( | ||
| info.old_symbol, | ||
| info.new_symbol, | ||
| "pointer parameter types differ between declaration and definition"); | ||
| } |
| ^EXIT=6$ | ||
| ^SIGNAL=0$ | ||
| ^\S+\.c(:\d+:\d+|\(\d+\)): error: conflicting function declarations 'foo'$ | ||
| ^file .*other\.c line \d+: duplicate definition of function 'foo'$ |
547b812 to
fd54c79
Compare
Function parameters may require application of type fixes: our linking ensures that all types have a unique tag, and some parameter's types may require such tag updates. We used to do this for objects already, but need to apply it to parameters as well. Also, the order matters: all types need to be sorted out first before attempting objects or functions. Spotted while trying to compile a particular version of Xen 4.11. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
fd54c79 to
029dd1c
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## develop #9061 +/- ##
===========================================
+ Coverage 80.68% 80.69% +0.01%
===========================================
Files 1714 1714
Lines 189501 189577 +76
Branches 73 73
===========================================
+ Hits 152902 152987 +85
+ Misses 36599 36590 -9 ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
When comparing types across translation units, the linker used to treat two struct/union types with the same C source tag as different whenever their irept representations differed at the byte level. In practice this happens for any non-trivial header chain where one TU sees a sub-type as complete and another sees only a forward declaration: type-name canonicalisation propagates the completion state up through any anonymous tag identifiers nested inside the outer type, and the byte-level comparison rejects the pair. The original RENAME_NEW path then cascades through do_type_dependencies() to every type that transitively references the mismatching pair, and through copy_symbols()'s rename_new_symbol pass all references in function/object types are rewritten to ST[...]-style synthetic identifiers. duplicate_code_symbol() compares the rewritten new-side type against the un-rewritten main-side type, sees them as different, and emits 'ignoring conflicting function declarations'. Add a recursive structural-equivalence check that follows tag references through both namespaces and returns true when: - a tag pair resolves (via main_ns / src_ns) to equivalent types; - one side is incomplete and the other complete (linking will keep the complete one); - both are complete struct/union with matching component count and names, and pairwise structurally-equivalent component types. Wire it into needs_renaming_type() and duplicate_type_symbol() as a final pass before defaulting to RENAME_NEW or to the 'unexpected difference between type symbols' error. The existing incomplete-vs-complete special cases retain priority so that the linker continues to merge incomplete declarations into complete ones. Add regression test Linking12 covering the exact pattern: an outer struct containing an anonymous union whose synthetic tag identifier diverges between TUs because of an inner forward-vs-complete type. Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
029dd1c to
7fd4d33
Compare
Depends-on: #7579
When comparing types across translation units, the linker used to treat two struct/union types with the same C source tag as different whenever their irept representations differed at the byte level. In practice this happens for any non-trivial header chain where one TU sees a sub-type as complete and another sees only a forward declaration: type-name canonicalisation propagates the completion state up through any anonymous tag identifiers nested inside the outer type, and the byte-level comparison rejects the pair.
The original RENAME_NEW path then cascades through do_type_dependencies() to every type that transitively references the mismatching pair, and through copy_symbols()'s rename_new_symbol pass all references in function/object types are rewritten to ST[...]-style synthetic identifiers. duplicate_code_symbol() compares the rewritten new-side type against the un-rewritten main-side type, sees them as different, and emits 'ignoring conflicting function declarations'.
Add a recursive structural-equivalence check that follows tag references through both namespaces and returns true when:
Wire it into needs_renaming_type() and duplicate_type_symbol() as a final pass before defaulting to RENAME_NEW or to the 'unexpected difference between type symbols' error. The existing incomplete-vs-complete special cases retain priority so that the linker continues to merge incomplete declarations into complete ones.
Add regression test Linking12 covering the exact pattern: an outer struct containing an anonymous union whose synthetic tag identifier diverges between TUs because of an inner forward-vs-complete type.