Experimental: Symbol-based usage detection (opt-in)#135
Replace the GetUsedAssemblyReferences approach with a Roslyn analyzer that tracks symbol usage at finer granularity, behind the ReferenceTrimmerUseSymbolAnalysis MSBuild property (opt-in, defaults to false).

The new approach uses RegisterSymbolAction and RegisterOperationAction to track which assemblies contain symbols that the code actually references, rather than relying on the compiler's broader 'used assembly' heuristic, which over-reports usage by treating transitive assembly dependencies as used.

Key design decisions:
- RT0001 (bare Reference): always uses conservative transitive closure to avoid breaking runtime dependencies that lack automatic transitive resolution
- RT0002 (ProjectReference): uses transitive closure only when DisableTransitiveProjectReferences is set; otherwise uses precise detection
- RT0003 (PackageReference): always uses precise symbol-based detection since NuGet handles transitive package deps automatically
- Attribute constructor/named arguments (including typeof) are tracked
- Early exit optimization when all reference assemblies are already tracked

The legacy GetUsedAssemblyReferences code path is preserved as the default. All E2E tests run in both modes via DataRow parameterization (91 pass). Version bumped from 3.4 to 3.5.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
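The per-diagnostic rules in the commit message can be read as a small decision function. The following is only a sketch of that reading, not the code in this PR: the ReferenceKind enum, the DetectionStrategy class and the UseTransitiveClosure method are hypothetical names introduced for illustration.

```csharp
using System;

// Hypothetical names: this only mirrors the rules listed in the commit message.
public enum ReferenceKind
{
    Reference,        // RT0001: bare Reference
    ProjectReference, // RT0002
    PackageReference  // RT0003
}

public static class DetectionStrategy
{
    // Returns true when the conservative transitive closure should be used,
    // false when precise symbol-based detection applies.
    public static bool UseTransitiveClosure(ReferenceKind kind, bool disableTransitiveProjectReferences)
    {
        switch (kind)
        {
            case ReferenceKind.Reference:
                // Bare references have no automatic transitive resolution.
                return true;
            case ReferenceKind.ProjectReference:
                // Closure is only needed when transitive project references are disabled.
                return disableTransitiveProjectReferences;
            case ReferenceKind.PackageReference:
                // NuGet resolves transitive package dependencies on its own.
                return false;
            default:
                throw new ArgumentOutOfRangeException(nameof(kind));
        }
    }
}
```

Under this reading, a ProjectReference is the only kind whose treatment depends on project configuration.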
// Build mappings from reference assembly identities to their metadata reference display paths.
// These are used both for symbol tracking and for the transitive closure computation.
var assemblyToPath = new Dictionary<AssemblyIdentity, string>();
var pathToAssembly = new Dictionary<string, IAssemblySymbol>(StringComparer.OrdinalIgnoreCase);
Consider using a case-insensitive comparer on Windows and a case-sensitive one on non-Windows platforms.
This also applies to the other comparers used in this project.
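One way to act on this suggestion is to pick the path comparer once, based on the current OS. This is a minimal sketch rather than code from the PR; the PathComparers class and its ForCurrentPlatform property are hypothetical names, and the sketch assumes that "non-Windows" should always mean case-sensitive, which is not true of every file system (for example, macOS volumes are commonly case-insensitive).

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical helper: selects a path comparer once, based on the current OS.
public static class PathComparers
{
    public static StringComparer ForCurrentPlatform { get; } =
        RuntimeInformation.IsOSPlatform(OSPlatform.Windows)
            ? StringComparer.OrdinalIgnoreCase // Windows paths are case-insensitive
            : StringComparer.Ordinal;          // assumed case-sensitive elsewhere

    // Example of wiring the comparer into a path-keyed dictionary.
    public static Dictionary<string, TValue> CreatePathMap<TValue>() =>
        new Dictionary<string, TValue>(ForCurrentPlatform);
}
```

A dictionary such as pathToAssembly could then be created with PathComparers.ForCurrentPlatform in place of the hard-coded StringComparer.OrdinalIgnoreCase.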
Linking to dotnet/roslyn#625 as the original issue in the Roslyn repo. Tagging @AlekseyTs (the author of
If I interpret the approach correctly, I think it is likely to under-report references needed for a successful build. There are situations in which the compiler needs information from types or assemblies that aren't explicitly mentioned in code.
@AlekseyTs mind sharing an example? Are you saying there are things that are unavailable to analyze during compilation, or that they're just missing from the current implementation?
I didn't review the implementation in this PR. Based on the description of the approach, I assumed that the implementation records information based on symbols referenced explicitly in source. If that is the case, this approach is likely to under-report references needed for a successful build. There are situations in which the compiler needs information from types or assemblies that aren't explicitly mentioned in code. For example, assemblies with type forwarders aren't referenced explicitly in source, but they are necessary to locate forwarded types. This is just one example; I am pretty sure there are other scenarios.
@AlekseyTs I tried to cover some of the gaps in #138. I think there are two possibilities - either some of the functionality of
Also, isn't type forwarding defined in the code, as an assembly attribute? That's available during compilation.
* type forwarding
* Fill out the gaps
* Add tests
Summary
Adds an experimental symbol-based analysis mode behind ReferenceTrimmerUseSymbolAnalysis (opt-in, defaults to false). The legacy GetUsedAssemblyReferences code path is preserved as the default.

Motivation
GetUsedAssemblyReferences over-reports usage by treating transitive assembly dependencies as "used" even when the project's code doesn't reference them directly.

Approach
Uses RegisterSymbolAction + RegisterOperationAction to track which assemblies contain symbols that code actually references. Safety measures for runtime deps: RT0001 uses conservative transitive closure for bare References; RT0002 respects DisableTransitiveProjectReferences; RT0003 uses precise detection (NuGet handles transitive deps).

Opt-in
<PropertyGroup>
  <ReferenceTrimmerUseSymbolAnalysis>true</ReferenceTrimmerUseSymbolAnalysis>
</PropertyGroup>

Testing
All E2E tests run in both modes via DataRow parameterization (91 pass). New test UnusedDirectReferenceUsedTransitively validates the key improvement.

Rollout plan