diff --git a/HYPOTHESIS.md b/HYPOTHESIS.md new file mode 100644 index 00000000000..1696df1696b --- /dev/null +++ b/HYPOTHESIS.md @@ -0,0 +1,409 @@ +# Hypothesis Analysis: False Positive Byref Errors in seq.fs + +## Problem +The transformation to fix struct object expression captures is incorrectly triggering on `seq.fs` code that uses byref parameters in object expression methods. The code is valid but the transformation is capturing ALL free variables including byref-typed ones, which then causes errors. + +## Context +The errors all occur in enumerator implementations like: +```fsharp +let map f (e: IEnumerator<_>) : IEnumerator<_> = + upcast + { new MapEnumerator<_>() with + member _.DoMoveNext(curr: byref<_>) = // 'curr' is a byref PARAMETER, not a captured variable + if e.MoveNext() then + curr <- f e.Current + true +``` + +The variable `curr` is a **byref parameter** to the method, not a captured free variable. The transformation is incorrectly treating it as a free variable that needs to be extracted. + +--- + +## Hypothesis 1: Free Variable Analysis Includes Method Parameters + +**Theory**: The `freeInExpr` function is incorrectly identifying method parameters (including byref parameters) as "free variables" that need extraction. These parameters are bound within the method scope and should NOT be captured. + +**How to Test**: +1. Add debug logging in `TryExtractStructMembersFromObjectExpr` to print the `LogicalName` and `IsByRefPointer` of all detected "problematic" variables +2. Check if `curr` appears in the `problematicVars` list +3. Check if `curr.IsByRefPointer` is true + +**How to Fix**: +Add a filter to exclude method parameters from extraction: +```fsharp +let problematicVars = + freeVars.FreeLocals + |> Zset.elements + |> List.filter (fun (v: Val) -> + // EXCLUDE method parameters - they are not actually captured + not v.IsParameter && + // Case 1: Variable belongs to a struct type + (v.HasDeclaringEntity && v.DeclaringEntity.Deref.IsStructOrEnumTycon) + || + // Case 2: Variable without declaring entity (likely struct constructor param) + (not v.HasDeclaringEntity && not v.IsModuleBinding)) +``` + +**Expected Outcome**: The byref parameters in seq.fs would be excluded from extraction, eliminating the false positive errors. + +--- + +## Hypothesis 2: Transformation is Being Applied to Non-Struct Contexts + +**Theory**: The transformation is being applied to object expressions that are NOT inside struct member methods (e.g., module-level functions in seq.fs). The early guard `if isInterfaceTy` only checks if the object expression implements an interface, but doesn't check if we're in a struct context. + +**How to Test**: +1. Add debug logging at the start of `TryExtractStructMembersFromObjectExpr` to print: + - `enclosingStructTyconRefOpt` (should be None for module functions) + - `isInterfaceTy` value + - Whether any problematic vars were found +2. Check if transformation is running for seq.fs enumerators + +**How to Fix**: +Add an early guard to skip transformation when not in a struct context: +```fsharp +let TryExtractStructMembersFromObjectExpr ... = + // Only transform if we're inside a struct member method + match enclosingStructTyconRefOpt with + | None -> [], Remap.Empty // Not in a struct, skip transformation + | Some _ when isInterfaceTy -> [], Remap.Empty // Interface only, skip + | Some _ -> + // Continue with transformation logic... +``` + +**Expected Outcome**: The transformation would not run at all for module-level functions in seq.fs, eliminating all false positives. + +--- + +## Hypothesis 3: Byref Variables Are Being Incorrectly Remapped + +**Theory**: The transformation correctly identifies that `curr` should NOT be extracted, but the remapping logic is still trying to remap it anyway, creating illegal captured byref references. + +**How to Test**: +1. Add debug logging before creating the `Remap` to show which variables are in the remap dictionary +2. Check if `curr` or other byref variables appear in the remap +3. Verify that the remapping is only happening for struct members, not for all variables + +**How to Fix**: +Ensure the remap ONLY includes the variables we extracted to locals: +```fsharp +// Only create remap for the variables we actually extracted +let remap = + if List.isEmpty problematicVars then + Remap.Empty + else + let captureLocals = problematicVars |> List.map (fun v -> mkCompGenLocal v.Range v.LogicalName v.Type) + let captureBindings = List.zip captureLocals problematicVars |> List.map (fun (local, orig) -> + local, exprForValRef mWholeExpr (mkLocalValRef orig)) + + // Build remap: original var -> local var + let remapDict = + (problematicVars, captureLocals) + ||> List.zip + |> List.fold (fun acc (origV, localV) -> + ValMap.add origV (mkLocalValRef localV) acc) ValMap.empty + + captureBindings, Remap.Empty.BindVals remapDict +``` + +**Expected Outcome**: Only struct members would be remapped, byref parameters would remain unchanged. + +--- + +## Recommended Action Priority + +1. **TEST Hypothesis 2 FIRST** - This is most likely the root cause. The transformation shouldn't run at all for module functions. +2. **TEST Hypothesis 1 SECOND** - Even if #2 is fixed, we should still filter out parameters as a defensive measure. +3. **TEST Hypothesis 3 LAST** - Only if #1 and #2 don't fully resolve the issue. + +## Implementation Plan + +1. Add debug instrumentation to verify which hypothesis is correct +2. Implement the fix for the correct hypothesis +3. Remove debug instrumentation +4. Run bootstrap build to verify FSharp.Core compiles +5. Run tests to verify the original issue is still fixed + +--- + +# Test Failure Analysis - StructObjectExpression Tests + +## Summary of Test Failures + +**Test Run**: All 3 StructObjectExpression tests FAILED +- Test 1: "Simple case" - TypeLoadException (byref field still generated) +- Test 2: "Multiple fields" - FS0406 error (underscore variable '_' treated as byref) +- Test 3: "Referencing field in override method" - FS0406 error (same issue) + +## Hypothesis 4: env.eFamilyType Not Set for Struct Members + +**Theory**: `env.eFamilyType` is not being set when type-checking struct member methods, so `enclosingStructTyconRefOpt` is always None, and the transformation never runs. + +**Evidence**: +- Test 1 still gets TypeLoadException, indicating no transformation occurred +- The transformation has early guard for None that would skip it + +**How to Test**: +Add debug logging in CheckExpressions.fs to print `env.eFamilyType` value when calling the transformation. + +**How to Fix**: +If `env.eFamilyType` is not set, need to find alternative way to detect struct context. Options: +1. Check the type being defined in the current scope +2. Pass additional context from the calling code +3. Use a different mechanism to detect struct member methods + +**Status**: TESTING + +## Hypothesis 5: IsMemberOrModuleBinding Filter Rejecting Struct Fields + +**Theory**: Constructor parameters that become struct fields are not marked as `IsMemberOrModuleBinding`, so they're being filtered out and not extracted. + +**Evidence**: +- Tests use struct constructor parameters (test, x, y, value) +- Filter includes `v.IsMemberOrModuleBinding` which might be false for these + +**How to Test**: +Add debug logging to see what `IsMemberOrModuleBinding` returns for the test struct parameters. + +**How to Fix**: +Need to identify the correct property/check for struct constructor parameters. Possibilities: +1. Check `v.IsCtorThisVal` or similar +2. Check if variable is declared in constructor scope +3. Remove `IsMemberOrModuleBinding` filter and rely on other checks + +**Status**: TESTING + +## Hypothesis 6: Underscore Variable Being Captured + +**Theory**: The error mentions byref variable '_' which suggests the `this` parameter (represented as underscore in method signatures) is being treated as a problematic variable. + +**Evidence**: +- Error messages say "The byref-typed variable '_' is used in an invalid way" +- Struct methods use `member _.MethodName()` syntax + +**How to Test**: +Check if the free variable analysis is finding the implicit `this` parameter (represented as _). + +**How to Fix**: +Filter out the `this` parameter from problematic variables. Check for: +1. `v.IsCtorThisVal` +2. `v.BaseOrThisInfo = ThisVal` +3. Variable name is a compiler-generated name for `this` + +**Status**: INVESTIGATING + +## Hypothesis 7: env.eFamilyType Overwritten by EnterFamilyRegion + +**Theory**: In CheckExpressions.fs line 7240, `EnterFamilyRegion tcref env` sets `env.eFamilyType` to the object expression's base type (tcref), OVERWRITING any previous family type. If we're inside a struct member when creating the object expression, the struct's TyconRef in `env.eFamilyType` is lost before we can use it at line 7315. + +**Evidence**: +- Line 7240: `let env = EnterFamilyRegion tcref env` - tcref is the object expression's type +- Line 7315: We try to read `env.eFamilyType` but it's now the object expression type, not the struct +- Debug output shows: `DEBUG: TryExtractStructMembers - No enclosing struct (enclosingStructTyconRefOpt=None)` + +**How to Test**: +Add debug logging BEFORE line 7240 to see if `env.eFamilyType` contains a struct TyconRef. + +**How to Fix**: +Save the original `env.eFamilyType` BEFORE calling `EnterFamilyRegion` at line 7240: +```fsharp +// Save the enclosing family type before entering the object expression's family region +let enclosingStructTyconRefOpt = + match env.eFamilyType with + | Some tcref when tcref.IsStructOrEnumTycon -> Some tcref + | _ -> None + +// Object expression members can access protected members of the implemented type +let env = EnterFamilyRegion tcref env +``` + +## Hypothesis 8: Filter Logic Rejecting Constructor Parameters Without DeclaringEntity + +**Theory**: Struct constructor parameters captured by object expressions don't have `DeclaringEntity` set and aren't marked as `IsMemberOrModuleBinding`. The current filter logic requires one of these, so these parameters are being filtered out and not extracted. + +**Evidence from debug output**: +``` +DEBUG: Free var 'test': IsMemberOrModuleBinding=false, HasDeclaringEntity=false, IsModuleBinding=false +DEBUG: No problematic vars found - skipping transformation +``` + +The variable `test` is a struct constructor parameter that SHOULD be extracted, but: +- It doesn't have `DeclaringEntity` +- It's not marked as `IsMemberOrModuleBinding` +- Current filter logic rejects it + +**How to test**: +Debug output already confirms this - the variables are found as free vars but filtered out as "not problematic". + +**How to fix**: +When we're in a struct context (`enclosingStructTyconRefOpt` is Some), we should extract ALL non-parameter free variables that aren't module-level bindings. The fact that we're in a struct member method means any free variables captured are likely struct instance state. + +New filter logic: +```fsharp +match enclosingStructTyconRefOpt with +| None -> [] // Not in struct, no transformation needed +| Some _ -> + // In a struct context - extract free variables that could cause byref captures + freeVars.FreeLocals + |> Zset.elements + |> List.filter (fun (v: Val) -> + // Exclude constructor calls (they're not captured variables) + not (v.LogicalName = ".ctor") && + // Exclude module-level bindings (they're global, not instance state) + not v.IsModuleBinding) +``` + +## Hypothesis 9: Interface Implementations Also Need Transformation + +**Theory**: The early guard `if isInterfaceTy then [], Remap.Empty` was added based on the assumption that interface implementations don't cause byref issues. However, the test "Object expression in struct referencing field in override method" shows that interface implementations CAN capture struct members and cause the same byref issue. + +**Evidence from test failure**: +```fsharp +[] +type MyStruct(value: int) = + member _.CreateFoo() = { + new IFoo with + member _.DoSomething() = value * 2 // <-- Captures 'value' from struct + } +``` + +Error: `The byref-typed variable '_' is used in an invalid way.` + +Debug shows: `DEBUG: TryExtractStructMembers - Interface only (struct=MyStruct, isInterface=true)` + +**How to test**: +The test already demonstrates the issue. Remove the `isInterfaceTy` guard and see if it passes. + +**How to fix**: +Remove the early guard for `isInterfaceTy`. The transformation should apply to BOTH: +1. Object expressions deriving from base classes +2. Object expressions implementing interfaces + +When inside a struct member and capturing struct instance state. + +**Status**: FIXED (commit 7623be8) + +--- + +## Hypothesis 10: Static Members Incorrectly Transformed + +**Theory**: The transformation is running on static members that create object expressions, even though static members don't have instance state that could cause byref captures. The issue is in FSharp.Core seqcore.fs: + +```fsharp +[] +type StructBox<'T> = + member x.Value = value + + static member Comparer = // <-- STATIC member, not instance + let gcomparer = HashIdentity.Structural<'T> + { new IEqualityComparer> with + member _.GetHashCode(v) = gcomparer.GetHashCode(v.Value) + member _.Equals(v1,v2) = gcomparer.Equals(v1.Value,v2.Value) } +``` + +**The Problem**: +1. We're inside StructBox's static member → `env.eFamilyType` = Some(StructBox) +2. Object expression parameters `v`, `v1`, `v2` are of type `StructBox<'T>` +3. Current filter: excludes `.ctor` and module bindings only +4. Parameters `v`, `v1`, `v2` pass the filter (they're not .ctor, not module bindings) +5. Transformation incorrectly extracts them → creates invalid local bindings + +**Root Cause**: +The filter doesn't distinguish between: +- **Actual instance members** of the enclosing struct (e.g., `this.value`, `this.field`) +- **Method parameters** that happen to be the struct type (e.g., `v`, `v1`, `v2`) + +**How to test**: +Add check: Only extract variables that are **actual instance members** of the enclosing struct. +- Must have `v.HasDeclaringEntity` +- AND `tyconRefEq g v.DeclaringEntity enclosingTcref` + +**How to fix**: +Change filter from: +```fsharp +v.LogicalName <> ".ctor" && not v.IsModuleBinding +``` + +To: +```fsharp +v.LogicalName <> ".ctor" && +not v.IsModuleBinding && +v.HasDeclaringEntity && +(match tryTcrefOfAppTy g v.Type with + | ValueSome tcref -> tyconRefEq g tcref enclosingTcref + | _ -> false) +``` + +This ensures we only extract variables that are instance members of the ENCLOSING struct, not just any variable of the struct type. + +**Status**: IMPLEMENTING FIX + + + +## Hypothesis 10: Static member false positive - ROOT CAUSE & REVERT + +**Status**: ROOT CAUSE IDENTIFIED - ALL TRANSFORMATION CODE REVERTED + +**Problem**: StructBox.Comparer static member in seqcore.fs causes regression + +**Root Cause**: The approach using `env.eFamilyType` is fundamentally flawed: +- `env.eFamilyType` represents the TYPE being defined (e.g., StructBox) +- It does NOT distinguish between static vs instance member contexts +- Static member: `env.eFamilyType` = Some(StructBox) even though there's no `this` +- Instance member: `env.eFamilyType` = Some(StructBox) with a `this` context + +**Critical Discovery**: Even after COMPLETELY DISABLING the transformation (`enclosingStructTyconRefOpt = None`), +the seqcore.fs error persists! This proves the bootstrap compiler has the buggy code baked in. + +**What This Means**: +1. Can't fix the regression with code changes alone - need to rebuild bootstrap +2. The current approach is architecturally wrong +3. Need a different detection mechanism that identifies instance member context + +**Decision**: REVERTED ALL TRANSFORMATION CODE +- Removed TryExtractStructMembersFromObjectExpr function (89 lines) +- Removed transformation calls from CheckExpressions.fs (42 lines) +- Kept test file for future reference +- Total: 131 lines of code removed + +**Next Steps**: Need a completely fresh approach that: +1. Detects instance member context (not just struct type context) +2. Probably needs to check for presence of a `this` value in scope +3. Or pass additional context parameter through TcObjectExpr + +**Test Results**: N/A - transformation completely removed + +--- + +## Hypothesis 11: Test Validation Without Fix + +**Theory**: With all transformation code removed, the tests should fail with the original bug, confirming they are correctly written to detect the issue. + +**How to test**: +Run: `dotnet test tests/FSharp.Compiler.ComponentTests/FSharp.Compiler.ComponentTests.fsproj --filter "FullyQualifiedName~StructObjectExpression"` + +**Expected Results**: All 3 tests should fail with byref-related errors + +**Actual Results**: ✅ CONFIRMED - All 3 tests fail as expected: + +**Test 1**: "Object expression in struct should not generate byref field - simple case" +- **Error**: `TypeLoadException: A ByRef or ByRef-like type cannot be used as the type for an instance field in a non-ByRef-like type` +- **Meaning**: The object expression closure is capturing a byref to the struct, generating invalid IL that fails at runtime + +**Test 2**: "Object expression in struct with multiple fields" +- **Error**: `Error 406: The byref-typed variable '_' is used in an invalid way. Byrefs cannot be captured by closures or passed to inner functions.` +- **Location**: Line 8, column 17-25 (the `CreateObj()` method) +- **Meaning**: Compiler detects the invalid byref capture at compile time + +**Test 3**: "Object expression in struct referencing field in override method" +- **Error**: `Error 406: The byref-typed variable '_' is used in an invalid way. Byrefs cannot be captured by closures or passed to inner functions.` +- **Location**: Line 9, column 17-28 (the `CreateFoo()` method) +- **Meaning**: Compiler detects the invalid byref capture at compile time + +**Conclusion**: +- ✅ Tests are properly written and correctly detect the bug +- ✅ The problem is real and reproducible +- ✅ Test suite is ready for validating a future correct implementation +- ❌ Need a different architectural approach to fix the issue + +**Status**: INVESTIGATION COMPLETE - AWAITING CORRECT IMPLEMENTATION APPROACH diff --git a/tests/FSharp.Compiler.ComponentTests/Conformance/Expressions/ObjectExpressions/StructObjectExpression.fs b/tests/FSharp.Compiler.ComponentTests/Conformance/Expressions/ObjectExpressions/StructObjectExpression.fs new file mode 100644 index 00000000000..7fc00cb16f1 --- /dev/null +++ b/tests/FSharp.Compiler.ComponentTests/Conformance/Expressions/ObjectExpressions/StructObjectExpression.fs @@ -0,0 +1,65 @@ +namespace FSharp.Compiler.ComponentTests.Conformance.Expressions + +open Xunit +open FSharp.Test.Compiler + +module StructObjectExpression = + + [] + let ``Object expression in struct should not generate byref field - simple case`` () = + Fsx """ +type Class(test : obj) = class end + +[] +type Struct(test : obj) = + member _.Test() = { + new Class(test) with + member _.ToString() = "" + } + +let s = Struct(42) +let obj = s.Test() + """ + |> withOptions [ "--nowarn:52" ] // Suppress struct copy warning + |> compileExeAndRun + |> shouldSucceed + + [] + let ``Object expression in struct with multiple fields`` () = + Fsx """ +type Base(x: int, y: string) = class end + +[] +type MyStruct(x: int, y: string) = + member _.CreateObj() = { + new Base(x, y) with + member _.ToString() = y + string x + } + +let s = MyStruct(42, "test") +let obj = s.CreateObj() + """ + |> withOptions [ "--nowarn:52" ] + |> compileExeAndRun + |> shouldSucceed + + [] + let ``Object expression in struct referencing field in override method`` () = + Fsx """ +type IFoo = + abstract member DoSomething : unit -> int + +[] +type MyStruct(value: int) = + member _.CreateFoo() = { + new IFoo with + member _.DoSomething() = value * 2 + } + +let s = MyStruct(21) +let foo = s.CreateFoo() +let result = foo.DoSomething() + """ + |> withOptions [ "--nowarn:52" ] + |> compileExeAndRun + |> shouldSucceed diff --git a/tests/FSharp.Compiler.ComponentTests/FSharp.Compiler.ComponentTests.fsproj b/tests/FSharp.Compiler.ComponentTests/FSharp.Compiler.ComponentTests.fsproj index c7c7127c070..0fa99cc4339 100644 --- a/tests/FSharp.Compiler.ComponentTests/FSharp.Compiler.ComponentTests.fsproj +++ b/tests/FSharp.Compiler.ComponentTests/FSharp.Compiler.ComponentTests.fsproj @@ -87,6 +87,7 @@ +