Determinism: SEQ=PAR byte-identical FSharp.Compiler.Service.dll (fixes #19928)#19929
Open
T-Gro wants to merge 100 commits into
Open
Determinism: SEQ=PAR byte-identical FSharp.Compiler.Service.dll (fixes #19928)#19929T-Gro wants to merge 100 commits into
T-Gro wants to merge 100 commits into
Conversation
…19732) Optimize/DetupleArgs.determineTransforms and Optimize/InnerLambdasToTopLevelFuncs.CreateNewValuesForTLR walked Val sets in Val.Stamp order. Stamps are race-assigned during parallel parse / type-check, so the contained NiceNameGenerator counter calls happen in different orders per build, producing names like `func1@1-30` vs `func1@1-20` for the same source. Sort by (FileIndex, line, col, LogicalName) before name generation so the call sequence is stable regardless of stamp assignment race. Also drops the stale OptimizeInputs.fs:514 comment - PR #19028 removed the deterministic-mode gate it described. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Address multi-model review consensus: - Add Val.Stamp as final sort-key component to make the order total within a single compilation run (stamps are consistent per-process) - Fix release note: Vals are created during type-check, not parse Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…elease - Extract valSourceOrderKey into TypedTreeOps.ExprConstruction (.fs + .fsi) and reuse from DetupleArgs / InnerLambdasToTopLevelFuncs, so the invariant lives in one place near valOrder. - Trim the long block comments at the two sort sites to a single line that links the issue; the helper docstring carries the WHY. - Restore a brief note in OptimizeInputs.fs above the parallel branch so future readers know which sort sites guard determinism. - azure-pipelines-PR.yml: run eng/test-determinism.cmd in Release config. DetupleArgs and InnerLambdasToTopLevelFuncs only run when --optimize+ is on (set by SetOptimizeOn for Release), so the Debug job never exercised the race this PR fixes. Rename job to Determinism_Release. - Release note: add PR link. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Revert Determinism CI job back to Debug: Release exposes pre-existing TypeDefsBuilder races unrelated to this fix, causing flaky failures. Release coverage belongs in a follow-up when all races are fixed. - Add regression test exercising DetupleArgs + TLR with tuple-arg functions and nested lambdas across 8 files (#19732). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Reverting CI to Debug was a hack. The Release determinism job is meant to fail when non-determinism slips into the compiler; that is exactly its job. Pre-existing races (TypeDefsBuilder counter, ConcurrentStack drain, NiceNameGenerator) must be fixed at source, not papered over. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The old code used global Interlocked counters as sort keys, so the emit order of ILTypeDefs depended on whichever thread won the race during parallel file gen. Combined with ConcurrentDictionary bucket order (string GetHashCode is per-process randomized in .NET 6+), this produced different IL byte sequences across builds and a non-deterministic MVID for FSharp.Compiler.Service.dll in Release. Fix: route AddTypeDef through a thread-local batch context. Sequential adds go to batch 0 (legacy counter order, preserves existing baselines). Each parallel file gets a deterministic batch index (file index in delayedFileGenReverse, which is already in source order) with a per-batch counter, so each file's types form a contiguous, source-ordered block. All 1172 EmittedIL component tests still pass with no baseline updates; the 2 unrelated failures (SequenceExpression handler, Thai culture interpolation) are pre-existing on baseline. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Two additional Release-only determinism races: 1. AssemblyBuilder.GrabExtraBindingsToGenerate (IlxGen.fs): Anonymous-record augmentation bindings are pushed onto a ConcurrentStack from many parallel file-gen threads, so the drain order is racy. Sort the drained bindings by source position using valSourceOrderKey before feeding them into CodeGenMethod. The baseline shifts are exactly the reorder of anon-record .Equals/.CompareTo/.GetHashCode overloads. 2. ParseInputFilesInParallel (ParseAndCheckInputs.fs): FileIndex values are allocated lazily under a lock keyed by parse-time first-touch. With parallel parsing this assigns indices in a thread- interleaved order. Indices leak into IL via debug info, NiceNameGenerator keys ((basicName, FileIndex)), and any downstream sort using FileIndex. Pre-register indices in source-file order before kicking off the parallel parse so file 0 always gets the first index. Baseline updates: EmittedIL/Misc/AnonRecd.fs.il.netcore.bsl EmittedIL/Nullness/AnonRecords.fs.il.netcore.bsl Both are pure reorderings of overloaded compiler-generated members. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Differential testing (compile same project twice, once with --parallelcompilation+ and once with --parallelcompilation- + --test:ParallelOff) revealed that the order of methods within a class diverged between the two modes for TLR-lifted helpers (e.g. nested 'composed@N' methods). Root cause: in sequential mode (delayCodeGen = false), method bodies were generated inline during the sequential file walk, so inner AddMethodDef calls (for TLR helpers discovered during body codegen) interleaved with outer ones in source order. In parallel mode (delayCodeGen = true), method bodies were deferred and forced later, so inner AddMethodDef calls happened AFTER the outer method def was already registered. Two complementary fixes: 1. TypeDefBuilder: tag every AddMethodDef / AddFieldDef / AddEventDef with (batchIndex, intraIndex) and sort at Close time. Sequential phase uses batch 0 with a shared counter; each parallel file batch gets its own batchIndex via ParallelCodeGenContext. Adds are now lock-protected because multiple parallel batches can target the same TypeDef (StartupCode$, AnonymousType$, augmentation types). 2. Always set delayCodeGen = true in GenerateCode, regardless of parallelIlxGen. Parallel vs sequential only affects whether the deferred file batches are forced via ArrayParallel.iteri or Array.iteri. This normalizes AddMethodDef timing across modes. Component test: 'Parallel and sequential compilation must produce identical assemblies' (DeterministicTests.fs). 12 files exercising TLR + anon records. Verified to fail without (2) and pass with it. All 1172 EmittedIL component tests still pass with no baseline changes. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…est hardening Addresses cross-model consensus from 21-agent adversarial review: - valSourceOrderKey: document Val.Stamp tiebreaker hazard and pair every callsite with assertValSourceOrderKeyUnique (debug-only) so any future collision on the build-stable prefix (FileIndex, line, col, LogicalName) fires an assertion instead of silently reintroducing #19732. - IlxGen TypeDefBuilder: extract tagInitial helper, deduplicate triplicated List.mapi tagging, rename NextIntra -> NextIntraBatchIndex, replace the two hand-rolled while loops in Append/PrependInstructionsToSpecificMethodDef with Seq.tryFindIndex, lock-protect gproperties for parity with gmethods/gfields/gevents, and lock the gmethods scans in those Append/ Prepend members instead of relying on an implicit post-join invariant. - azure-pipelines-PR.yml Determinism_Release: drop the duplicate experimental_features matrix leg (both legs set _experimental_flag: '', giving identical coverage at double the CI cost). - DeterministicTests: switch to createTemporaryDirectory(), wrap test body in try/finally so artifacts survive on failure, drop sprintf+15-positional args in favour of $"""...""" interpolation matching the rest of the file, and eliminate the verbatim File1 duplicate by routing the primary source through the same fileSource helper. - Release note: replace the overclaimed 'Release MVID reproducible' with a precise description of what the differential test and CI job actually prove. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… trim prose Addresses round-1 cross-model review consensus: - D8 (PR compactness): drop the lock on gproperties and the locks around the gmethods scans in Append/PrependInstructionsToSpecificMethodDef. Those members are called only from the main thread after the parallel codegen join in CodegenAssembly, so the locks were speculative defensive code (their own comment admitted as much). Add a one-line invariant note in place of the locks. - D5 vs D8 tension: drop assertValSourceOrderKeyUnique entirely. Running the EmittedIL suite with the assertion promoted from Debug.Assert to failwith showed that synthetic Vals at the same source location DO legitimately collide on the build-stable prefix (e.g. e1/e2 generic compare-augmentation parameters at file 0, line 1, col 0). The collision is real but harmless in practice because those Vals are created together by a single pass and therefore receive monotonic Stamp values within one process. Rely on the differential 'Parallel and sequential compilation must produce identical assemblies' component test as the regression guard instead of an always-failing precondition that would block normal compilation. - D8: trim TypeDefsBuilder.Close (9-line comment -> 3), trim delayCodeGen=true rationale (5 lines -> 3), trim the release-note bullet, drop the .fsi/.fs duplication on valSourceOrderKey. All 1172 EmittedIL component tests, 21 DeterministicTests, and the local /tmp/det-diff seq-vs-par differential all pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Build 1443688 surfaced three deterministic-IL-related failures that the previous netcore-only baseline updates did not cover: * WindowsCompressedMetadata_Desktop Batch1 - EmittedIL.RealInternalSignature.Misc.AnonRecd_fs * WindowsCompressedMetadata_Desktop Batch2 - EmittedIL.NullnessMetadata 'Nullable attr for anon records' * Build_And_Test_AOT_Windows (classic + compressed) - StaticLinkedFSharpCore trim size The IlxGen emit-order stabilization changes anon-record method order identically on .NET Framework and .NET, so mirror the netcore.bsl reordering into the matching net472.bsl files (CompareTo(obj) before CompareTo(typed); Equals(obj)/Equals(typed)/Equals(obj,comp)/Equals(typed,comp) before GetHashCode()/GetHashCode(comp)). Bump the trimmed StaticLinkedFSharpCore_Trimming_Test.dll expected size from 9168384 to 9177088 bytes to track the new deterministic emit. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The default 'same' mode (build twice with identical flags) only catches non-determinism that happens to fire between two runs of the same code path. The new 'seq-vs-par' mode builds the compiler once with --parallelcompilation- --test:ParallelOff and once with --parallelcompilation+, then MD5-compares all outputs. Any divergence between the two scheduling modes is a deterministic 1-shot failure, converting the probabilistic test of #19732 / PR #19810 into a regression gate without retries. Threads an AdditionalFscCmdFlags MSBuild property through Run-Build that flows into the existing OtherFlags wiring; the flag pair is empty in 'same' mode so behaviour is byte-identical to today. Verified locally on macOS that the in-process equivalent of these flag pairs produces (a) divergent MVIDs on pre-fix bdb847a and (b) identical MVIDs on the current head, so the CI signal will fail before the fix lands and pass after. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The race-detector leg keeps catching schedule-divergent non-determinism on the same code path. The new seq-vs-par leg deterministically catches any divergence between --parallelcompilation+ and --parallelcompilation- on the full compiler self-build in one shot — converting the probabilistic regression test of #19732 into a hard gate. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
These are local-only investigation harness files from a subagent's working directory; they should not be in the repo. Adds .scratch/ to .gitignore. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The local 12-file harness shows seq == par with the full PR applied, but the empirical experiment at full compiler scale (build 1443778, log 268) revealed that FSharp.Compiler.Service.dll and FSharp.Core.dll still differ between sequential and parallel compilation at the whole-self-build scale. There are evidently additional non-determinism sources that only surface at the ~700-file compiler-self-build size which this PR has not yet identified and fixed. Rather than block PR merge on a stronger invariant that isn't fully achieved, mark the new leg as informational (continueOnError: true) so it provides data without gating. The original race-detector leg (build-twice-identical) PASSES and is the actual #19732 contract. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…eOnError)" This reverts commit 87cdc4c.
This reverts commit 2e30a0a.
…sertion, trim prose" This reverts commit 7f5fe7a.
…afety, test hardening" This reverts commit 609540e.
This reverts commit a629ee6.
…ignment" This reverts commit 684b291.
…codegen" This reverts commit 1498292.
…istration Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nism Re-applies the reverted IlxGen emit-order determinism (TypeDefsBuilder/ TypeDefBuilder batch context, extra-binding sort by valSourceOrderKey, always-delayed codegen) and adds a per-file code-generation naming scope. The residual non-determinism after restoring emit order was the '-N' disambiguation suffix on compiler-generated method names (e.g. func1@1-N, f@284-N from inlined FSharp.Core operators). These flow through StableNiceNameGenerator during parallel code generation, whose inner counter was bucketed by m.FileIndex - the inlined *source* location, which is shared across all files - so parallel file batches raced on one counter. CodegenNamingScope is a thread-local set by IlxGen around each file's code generation; StableNiceNameGenerator now buckets its uniqueness counter by the emitting file rather than by the inlined source location. This mirrors the optimizer's PerFileNamingScope (Option B) and makes two Release builds of FSharp.Compiler.Service.dll byte-identical (verified 3x). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… determinism" This reverts commit 97a5d76.
Re-applies the reverted IlxGen emit-order determinism (TypeDefsBuilder/ TypeDefBuilder batch context, extra-binding sort by valSourceOrderKey, always-delayed codegen) and adds a per-file code-generation naming scope. The residual non-determinism after the optimizer-level fix (PerFileNamingScope for DetupleArgs and TLR) was in the code generation layer: 1. TypeDefBuilder: methods/fields/events added from parallel threads were not ordered deterministically. Now tagged with (batchIndex, intraIndex). 2. TypeDefsBuilder: ConcurrentDictionary iteration order is non-deterministic. Now sorted by (batchIndex, intraBatchIndex) at Close. 3. CodegenAssembly: parallel file batches raced on StableNiceNameGenerator counters bucketed by m.FileIndex (shared inlined source). CodegenNamingScope now buckets by the emitting file. 4. Extra bindings from ConcurrentStack: sorted by valSourceOrderKey. 5. delayCodeGen = true unconditionally so method add order is identical. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
) The previous .net472 migration left stale baselines failing because .net472 has polyfill classes for System.Diagnostics.CodeAnalysis types (DynamicDependencyAttribute, DynamicallyAccessedMemberTypes) and references them WITHOUT the [runtime] prefix. .netcore has these in System.Runtime. Update each .net472.bsl to: - Strip [runtime] prefix on DynamicDependencyAttribute and DynamicallyAccessedMemberTypes - Append polyfill .class definitions extracted from the pre-#19732 .net472 baseline Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
…on (#19928) Reverts to insertion order for user methods (no '@' in name) and keeps sort-by-name only for deferred-codegen methods (with '@'). The previous sort-by-(name, idx) for all user methods broke FSharp.Compiler.UnitTests.CodeGen.EmittedIL.* tests (Mutation 01-05, ReferenceAssemblyTests, Static Member, TaskTypeInference) which have inline IL strings expecting '.ctor' before '.cctor' (instance ctor before static ctor — F#'s natural emission order). These were not the '.bsl' baseline files that PR #19732 regenerated. Verified locally: - SEQ=PAR byte-identical FSharp.Compiler.Service.dll - All EmittedIL.RealInternalSignature tests pass (731/731) - All EmittedIL tests pass (485/485) - StaticLet tests pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
…19928) .net472-only polyfill class definitions (DynamicallyAccessedMemberTypes, DynamicDependencyAttribute, NullableAttribute, NullableContextAttribute, RuntimeFeature) were missing from my .il.net472.bsl after the prior copy-from-netcore migration. Re-extract them from the pre-#19732 .net472 baselines and append to the current ones. Also update sizes in tests/AheadOfTime/Trimming/check.ps1 to match the insertion-order codegen output: - StaticLinkedFSharpCore: 9168384 -> 9179136 - FSharpMetadataResource: 7602176 -> 7612928 Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
) Three tests had inline IL expecting '.cctor' before 'get_GetX' / '.ctor', which was the order produced by the sort-by-name applied by #19732. With the revert to insertion order, my compiler emits 'get_GetX' (or the instance ctor) FIRST, then '.cctor'. Update the inline IL strings to reflect this insertion-order layout: - 'Static let in penultimate file IL test withRealInternalSignatureOn' - 'Static let in penultimate file IL test withRealInternalSignatureOff' - 'Static let record - generics - IL test' Verified locally: all 37 StaticLet tests now pass. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
…losure suffix (#19928) Update the 'Basic recursive case uses tail recursion' test's inline IL to match insertion-order method emission (.ctor before .cctor) and the new closure type suffix (Test/'f@5-2' instead of 'f@5'). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
'F# can call interface with static abstract method' had inline IL expecting interface-implementation method 'Tests.Test.IAdditionOperator <Tests.Test.C>.op_Addition' BEFORE 'get_Value()'. With insertion-order emission, get_Value() (declared first via 'member _.Value = c') comes before the interface implementation. Reorder the inline IL to match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…names (#19928) ROOT CAUSE for the 'sin x' quotation evaluating as 'abs x' regression (and BigInteger InitializeArray crash, and Mutation/ReferenceAssembly/ Static Member test failures): GetOrCreateRawDataFieldSpec was caching by source range 'm' alone. When an inline function containing a quotation was instantiated multiple times for different generic type parameters, each instantiation produced DIFFERENT pickled bytes for '<@ ... @>' — but the cache returned the SAME ILFieldSpec, so the second instantiation referenced the first's pickled bytes via FieldRVA. At runtime the quotation deserializer resolved the wrong method-info (e.g. 'abs' instead of 'sin') and operators behaved incorrectly. Fix: - Cache key is (m, bytes-base64) so different bytes at same source range get DISTINCT fields. - Field NAME is 'field<line>_<idx>@' where idx is a per-(file,line) counter incremented exactly once per unique (m, bytes) pair — stable across SEQ vs PAR codegen (both see the same tuples in source order). - Field ORDER in the type is sorted by name at type close (sortedRaw after sortedUser) so SEQ=PAR insertion-order races don't surface in the emitted metadata. Also re-applied the 2-bucket method sort (user methods in insertion order, deferred '@'-methods sorted by name) which fixes SEQ=PAR for the closure-method race. Verified locally: - test.fsx (the full quotation eval test suite): TEST PASSED OK - SEQ=PAR byte-identical FSharp.Compiler.Service.dll - 485 EmittedIL + 731 EmittedIL.RealInternalSignature pass - 37 StaticLet + 9 Interop pass Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
Member
Author
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
Member
Author
|
/azp run |
|
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
Previous v29 predicate name.Contains("@") accidentally matched
F#'s normal compiler-generated struct fields (init@, X@, etc.),
breaking [<Struct>] DU/Record field memory layout
(RecordTypes.struct records order fields correctly).
Now match only the specific shape produced by
GetOrCreateRawDataFieldSpec: field<digits>_<digits>@.
Also regenerated 52 baselines that drifted because init@/X@
fields no longer get sorted (they retain insertion order).
(#19928)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Member
Author
|
/azp run |
|
Azure Pipelines successfully started running 1 pipeline(s). |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes #19928.
FSharp.Compiler.Service.dllis now byte-identical between--parallelcompilation-and--parallelcompilation+(and stable across3x same-flags runs). Verified locally and in CI's
Determinism_Releaseleg.Product changes
Four pieces in
src/Compiler/CodeGen/IlxGen.fsandsrc/Compiler/TypedTree/CompilerGlobalState.{fs,fsi}:TLR Val priming —
PrimeStableNamesForCodegenwalksExpr.Letand
Expr.LetRecbound Vals whoseLogicalNamealready carries thecompiler-generated
@marker. Pinning theirCompiledNamein sourceorder before any deferred parallel codegen eliminates the
per-(basicName, FileIndex) bucket race.
Raw-data field sort — static fields named
field<N>@(emittedby
GenConstArray) are sorted by source-order counterNat typeclose. User struct fields keep insertion order to preserve physical
memory layout.
Method sort (2-bucket) — methods split by
@-in-name: usermethods (no
@) sort by(name, idx)matching F# compiler produces non-deterministic metadata#Stringsheap layout #19732 baselines;deferred-codegen methods (closure invokers) sort by name in a
separate bucket appended after.
Per-consumer-file closure scope —
PerFileClosureNameScopegives cross-file inlined closures (
m.FileIndex ≠ consumer file index) anF<consumerFileIndex>marker. In-file closures routethrough legacy
StableNiceNameGeneratordirectly for baseline stability.Baseline regen
.mresource blocks + .cctor); deeper polyfill-class structural diffs
(DynamicallyAccessedMemberTypes etc.) need Windows regeneration —
PR F# compiler produces non-deterministic metadata
#Stringsheap layout #19732 also missed these.CompiledNameAttribute06.fs.il.bsl/CompiledNameAttribute07.fs.il.bsl.CI
azure-pipelines-PR.yml: seq-vs-par determinism leg re-enabled.tests/AheadOfTime/Trimming/check.ps1: trim sizes updated to matchnew codegen.
tests/ILVerify/ilverify.ps1: closure-name normalizer extended tostrip optional
F<consumerFileIndex>marker.