Skip to content

Linking: structural-equivalence-aware type comparison#9061

Open
tautschnig wants to merge 2 commits into
diffblue:developfrom
tautschnig:extract/linking-structural-equivalence
Open

Linking: structural-equivalence-aware type comparison#9061
tautschnig wants to merge 2 commits into
diffblue:developfrom
tautschnig:extract/linking-structural-equivalence

Conversation

@tautschnig

Copy link
Copy Markdown
Collaborator

Depends-on: #7579

When comparing types across translation units, the linker used to treat two struct/union types with the same C source tag as different whenever their irept representations differed at the byte level. In practice this happens for any non-trivial header chain where one TU sees a sub-type as complete and another sees only a forward declaration: type-name canonicalisation propagates the completion state up through any anonymous tag identifiers nested inside the outer type, and the byte-level comparison rejects the pair.

The original RENAME_NEW path then cascades through do_type_dependencies() to every type that transitively references the mismatching pair, and through copy_symbols()'s rename_new_symbol pass all references in function/object types are rewritten to ST[...]-style synthetic identifiers. duplicate_code_symbol() compares the rewritten new-side type against the un-rewritten main-side type, sees them as different, and emits 'ignoring conflicting function declarations'.

Add a recursive structural-equivalence check that follows tag references through both namespaces and returns true when:

  • a tag pair resolves (via main_ns / src_ns) to equivalent types;
  • one side is incomplete and the other complete (linking will keep the complete one);
  • both are complete struct/union with matching component count and names, and pairwise structurally-equivalent component types.

Wire it into needs_renaming_type() and duplicate_type_symbol() as a final pass before defaulting to RENAME_NEW or to the 'unexpected difference between type symbols' error. The existing incomplete-vs-complete special cases retain priority so that the linker continues to merge incomplete declarations into complete ones.

Add regression test Linking12 covering the exact pattern: an outer struct containing an anonymous union whose synthetic tag identifier diverges between TUs because of an inner forward-vs-complete type.

  • Each commit message has a non-empty body, explaining why the change was made.
  • Methods or procedures I have added are documented, following the guidelines provided in CODING_STANDARD.md.
  • The feature or user visible behaviour I have added or modified has been documented in the User Guide in doc/cprover-manual/
  • Regression or unit tests are included, or existing tests cover the modified code (in this case I have detailed which ones those are in the commit message).
  • My commit message includes data points confirming performance improvements (if claimed).
  • My PR is restricted to a single feature or bugfix.
  • White-space or formatting changes outside the feature-related changed lines are in commits of their own.

Copilot AI review requested due to automatic review settings June 18, 2026 15:57
@tautschnig tautschnig requested a review from kroening as a code owner June 18, 2026 15:57
@tautschnig tautschnig self-assigned this Jun 18, 2026

Copilot AI left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR improves linker behavior when comparing types across translation units by adding a structural-equivalence check for struct/union types whose irept encodings differ due to differing completion states (forward-declared vs complete), reducing unnecessary renaming and spurious conflicts.

Changes:

  • Add a recursive structural-equivalence predicate that follows tag references across both namespaces and use it as a permissive “last chance” check in type-collision handling.
  • Thread a src_ns (namespace over the source symbol table) into collision/renaming paths so new-side tag references can be resolved during comparisons.
  • Add regression test Linking12 covering the forward-declared-vs-complete nested-anonymous-tag pattern; adjust an existing ansi-c regression expectation.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
src/linking/linking_class.h Extend type-renaming/type-duplication APIs to accept src_ns for cross-namespace tag resolution.
src/linking/linking.cpp Implement and wire in structurally_equivalent, and construct src_ns during linking / symbol copying.
regression/cbmc/Linking12/* New regression capturing the reported structural-equivalence scenario.
regression/cbmc/Linking10/test.desc Add/clarify regression coverage around parameter replacement accumulation.
regression/cbmc/Linking11/* Add regression ensuring renamed struct tags in function parameters are updated during linking.
regression/ansi-c/linking_conflicts2/test.desc Update expected exit code / diagnostic text for a conflict case.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/linking/linking.cpp
Comment on lines +892 to +913
// Both are tag references of the same kind: follow into their definitions.
if(
t1.id() == t2.id() && (t1.id() == ID_struct_tag ||
t1.id() == ID_union_tag || t1.id() == ID_c_enum_tag))
{
const irep_idt &id1 = to_tag_type(t1).get_identifier();
const irep_idt &id2 = to_tag_type(t2).get_identifier();

// Cycle guard: once we have started comparing this pair, treat them as
// equivalent at any deeper occurrence to avoid infinite recursion on
// mutually-recursive struct definitions like list_head.
std::string key = id2string(id1) + "\x01" + id2string(id2);
if(!visited.insert(key).second)
return true;

return structurally_equivalent(
follow_tags_symbols(main_ns, t1),
follow_tags_symbols(src_ns, t2),
main_ns,
src_ns,
visited);
}
Comment thread src/linking/linking.cpp
Comment on lines +945 to +950
// If either side is incomplete, accept the pair: linking will keep the
// complete one. We do not require the C source tag to match here because
// anonymous inner types may carry differing synthetic identifiers across
// translation units.
if(inc1 || inc2)
return true;
Comment thread src/linking/linking.cpp
Comment on lines +502 to +508
if(info.old_symbol.type.id() == ID_code)
{
diag.warning(
info.old_symbol,
info.new_symbol,
"pointer parameter types differ between declaration and definition");
}
Comment on lines +4 to +6
^EXIT=6$
^SIGNAL=0$
^\S+\.c(:\d+:\d+|\(\d+\)): error: conflicting function declarations 'foo'$
^file .*other\.c line \d+: duplicate definition of function 'foo'$
@tautschnig tautschnig force-pushed the extract/linking-structural-equivalence branch from 547b812 to fd54c79 Compare June 18, 2026 19:23
Function parameters may require application of type fixes: our linking
ensures that all types have a unique tag, and some parameter's types may
require such tag updates. We used to do this for objects already, but
need to apply it to parameters as well. Also, the order matters: all
types need to be sorted out first before attempting objects or
functions.

Spotted while trying to compile a particular version of Xen 4.11.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the extract/linking-structural-equivalence branch from fd54c79 to 029dd1c Compare June 18, 2026 19:34
@codecov

codecov Bot commented Jun 19, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 73.25581% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.69%. Comparing base (321ba11) to head (7fd4d33).

Files with missing lines Patch % Lines
src/linking/linking.cpp 72.61% 23 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #9061      +/-   ##
===========================================
+ Coverage    80.68%   80.69%   +0.01%     
===========================================
  Files         1714     1714              
  Lines       189501   189577      +76     
  Branches        73       73              
===========================================
+ Hits        152902   152987      +85     
+ Misses       36599    36590       -9     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

When comparing types across translation units, the linker used to treat
two struct/union types with the same C source tag as different whenever
their irept representations differed at the byte level.  In practice
this happens for any non-trivial header chain where one TU sees a
sub-type as complete and another sees only a forward declaration:
type-name canonicalisation propagates the completion state up through
any anonymous tag identifiers nested inside the outer type, and the
byte-level comparison rejects the pair.

The original RENAME_NEW path then cascades through
do_type_dependencies() to every type that transitively references the
mismatching pair, and through copy_symbols()'s rename_new_symbol pass
all references in function/object types are rewritten to ST[...]-style
synthetic identifiers.  duplicate_code_symbol() compares the rewritten
new-side type against the un-rewritten main-side type, sees them as
different, and emits 'ignoring conflicting function declarations'.

Add a recursive structural-equivalence check that follows tag references
through both namespaces and returns true when:
 - a tag pair resolves (via main_ns / src_ns) to equivalent types;
 - one side is incomplete and the other complete (linking will keep the
   complete one);
 - both are complete struct/union with matching component count and
   names, and pairwise structurally-equivalent component types.

Wire it into needs_renaming_type() and duplicate_type_symbol() as a
final pass before defaulting to RENAME_NEW or to the 'unexpected
difference between type symbols' error.  The existing
incomplete-vs-complete special cases retain priority so that the linker
continues to merge incomplete declarations into complete ones.

Add regression test Linking12 covering the exact pattern: an outer
struct containing an anonymous union whose synthetic tag identifier
diverges between TUs because of an inner forward-vs-complete type.

Co-authored-by: Kiro <kiro-agent@users.noreply.github.com>
@tautschnig tautschnig force-pushed the extract/linking-structural-equivalence branch from 029dd1c to 7fd4d33 Compare June 19, 2026 08:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants