Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[main] Update dependencies from dotnet/runtime #12631

Closed

Conversation

dotnet-maestro[bot]
Copy link
Contributor

@dotnet-maestro dotnet-maestro bot commented Dec 12, 2024

This pull request updates the following dependencies

From https://github.com/dotnet/runtime

  • Subscription: d9f5b309-084f-43b5-02de-08d8b80548e4
  • Build: 20241217.3
  • Date Produced: December 17, 2024 6:40:45 PM UTC
  • Commit: 040cbe276907174316e2cc07b35814b3069874a6
  • Branch: refs/heads/main
Microsoft Reviewers: Open in CodeFlow

…1211.9

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24611.9
@dotnet-maestro dotnet-maestro bot requested a review from a team as a code owner December 12, 2024 13:02
Copy link
Contributor

@dotnet-policy-service dotnet-policy-service bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Go, you big red fire engine!

…1213.1

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24613.1
…1213.10

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24613.10
…1216.1

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24616.1
…1216.9

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24616.9
@lonitra
Copy link
Member

lonitra commented Dec 17, 2024

We are observing access violations / ExecutionEngineExceptions in our tests with this update. I tried to investigate further by debugging our ScrollBarTests suite, which is where I first saw tests aborting. Unfortunately, it is inconsistent on where it is seen surfacing.. I have seen it surface when xunit tries to invoke test methods, doing simple null checks, when trying to create new objects. Whenever there is an access violation typically message says something like:

Exception thrown at 0x00007FF825E2C0D6 (coreclr.dll) in testhost.exe: 0xC0000005: Access violation reading location 0x0000000000000020.

These are the commits included in the first dependency update dotnet/runtime@aa9cd3b...c646425 but it is not obvious to me what may have caused this behavior. @ericstj are you aware of any changes that could have caused this or who can help investigate?

@ericstj
Copy link
Member

ericstj commented Dec 17, 2024

@lonitra any chance you have a dump?

Of those commits identified, the one that seems the most likely is dotnet/runtime@69d8076 cc @VSadov @jkotas.

@jkotas
Copy link
Member

jkotas commented Dec 18, 2024

Of those commits identified, the one that seems the most likely is dotnet/runtime@69d8076

I have confirmed that it is caused by this change. I am reverting it in dotnet/runtime#110801

The crash repros fairly reliably for me locally. It is not obvious what the problem is.

@VSadov
Copy link
Member

VSadov commented Dec 18, 2024

My guess - either winforms scenarios use FLS in a conflicting manner, or something that we do on thread detach in CoreCLR is somehow dependent on holding the loader lock (that would be a bit inconvenient).

…1217.3

Microsoft.Internal.Runtime.WindowsDesktop.Transport , Microsoft.NET.Sdk.IL , Microsoft.NETCore.App.Ref , Microsoft.NETCore.App.Runtime.win-x64 , Microsoft.NETCore.ILAsm , Microsoft.NETCore.ILDAsm , Microsoft.NETCore.Platforms , Microsoft.Win32.Registry.AccessControl , Microsoft.Win32.SystemEvents , runtime.win-x64.Microsoft.NETCore.ILAsm , runtime.win-x86.Microsoft.NETCore.ILAsm , System.CodeDom , System.ComponentModel.Composition , System.ComponentModel.Composition.Registration , System.Configuration.ConfigurationManager , System.Data.Odbc , System.Data.OleDb , System.Diagnostics.EventLog , System.Diagnostics.PerformanceCounter , System.DirectoryServices , System.DirectoryServices.AccountManagement , System.DirectoryServices.Protocols , System.Formats.Nrbf , System.IO.Hashing , System.IO.Packaging , System.IO.Ports , System.Management , System.Reflection.Context , System.Reflection.MetadataLoadContext , System.Resources.Extensions , System.Runtime.Caching , System.Runtime.Serialization.Formatters , System.Security.Cryptography.Pkcs , System.Security.Cryptography.ProtectedData , System.Security.Cryptography.Xml , System.Security.Permissions , System.ServiceModel.Syndication , System.ServiceProcess.ServiceController , System.Speech , System.Text.Encoding.CodePages , System.Text.Encodings.Web , System.Text.Json , System.Threading.AccessControl , System.Windows.Extensions , VS.Redist.Common.NetCore.SharedFramework.x64.10.0
 From Version 10.0.0-alpha.1.24611.1 -> To Version 10.0.0-alpha.1.24617.3
@jkotas
Copy link
Member

jkotas commented Dec 18, 2024

@VSadov The symptoms of this crash are very similar to dotnet/runtime#110442 . One possible explanation is that there is a subtle race condition where we forget to unregister the dying thread and the FLS change just made it more likely to be hit.

@jkotas
Copy link
Member

jkotas commented Dec 18, 2024

I was able to catch the problem under debugger. The problem is that COM apartment shutdown ends up calling into managed code and re-initializes the Thread at the callstack below. The re-initialized Thread won't be unregistered and we will crash later when trying to access it:

combase!PeekTillDone+0x47
combase!OXIDEntry::WaitForApartmentShutdown+0x2b
combase!OXIDEntry::StopServer+0x52
combase!CComApartment::StopServer+0x31
combase!StopThread+0x3d
combase!ApartmentUninitialize+0x91
combase!wCoUninitialize+0x2cf
combase!CoUninitialize+0x197
combase!DoThreadSpecificCleanup+0x5d
combase!FlsThreadCleanupCallback+0x29
ntdll!RtlpFlsCallbackPerform+0xa
ntdll!RtlpFlsDataCleanup+0xff
ntdll!RtlProcessFlsData+0x15
ntdll!LdrShutdownThread+0x49
ntdll!RtlExitUserThread+0x46

@VSadov
Copy link
Member

VSadov commented Dec 18, 2024

By the looks of FlsThreadCleanupCallback it seems like combase uses similar trick with FLS as we do.

Are you seeing a scenario when we are called first (and detach) and then their callback gets called and they end up calling managed code? (it is not clear from the stack what calls managed code)

@jkotas
Copy link
Member

jkotas commented Dec 18, 2024

Are you seeing a scenario when we are called first (and detach) and then their callback gets called and they end up calling managed code?

Yes.

@VSadov
Copy link
Member

VSadov commented Dec 18, 2024

Our assumption that a thread cannot call into managed code after calling our FLS callback is not correct then.
~destructors of c++ thread-local objects could, in theory, have the same issue - anyone can set up one and do whatever in in it.

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

Our assumption that a thread cannot call into managed code after calling our FLS callback

Dtto for regular TLS callbacks - I guess that it is the problem in dotnet/runtime#110442. We wish that it can declared as unsupported scenario, but we have not done a good job failing deterministically in this situation, so there is likely quite a bit out code that depends on it "working".

@VSadov
Copy link
Member

VSadov commented Dec 19, 2024

At high level dotnet/runtime#110442 seems similar - it also seems to be an issue with thread destruction notification. Although it is a different platform with different notification mechanism and from the discussion it appears the notification just did not arrive (or was not acted upon) before TLS was destroyed and then GC happened and crashed. Here the notification did arrive, but the thread found a way to do a reverse pinvoke after that.

Unix issue feels simpler - we should be able to be notified and detach before TLS is destroyed. Something did not work as expected.

The issue with thread termination callback itself entering managed code seems a bit harder to deal with. I do not have any good ideas at the moment.

FLS callbacks are not very commonly used and the documentation says that they should be simple, basically just memory deallocation. Perhaps there is a way to prevent entering managed code from that?

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

Here the notification did arrive, but the thread found a way to do a reverse pinvoke after that.

My guess is that the Unix issue is the same: The notification did arrive, but the thread found a way to do a reverse pinvoke after that via thread static in some native library.

FLS callbacks are not very commonly used and the documentation says that they should be simple, basically just memory deallocation. Perhaps there is a way to prevent entering managed code from that?

We can fail-fast if the thread tries to re-enter managed code after the thread destruction notification (e.g. we can have a bool thread static that is set in the thread destruction notification). But it would be a severe breaking change in combination with the FLS change on Windows.

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

FLS callbacks are not very commonly used and the documentation says that they should be simple, basically just memory deallocation.

The COM FLS fallback is anything but simple. It pumps Windows messages... .

it would be a severe breaking change in combination with the FLS change on Windows.

Brainstorming about possible solutions:

  • Ensure that our FLS slot is allocated after COM FLS slot, so that our callback comes after the COM callback. It would not be pretty, but I am sure that it would fix the winforms crash.
    (or)
  • Do a full attach/detach on attempt to enter managed code after thread destruction notification. It would make JIT_ReversePInvokeExit a bit slower (we would have to check a flag and to do full detach if the flag is set)

@VSadov
Copy link
Member

VSadov commented Dec 19, 2024

it would be a severe breaking change in combination with the FLS change on Windows.

On that I agree. Poisoning a thread after detaching will get us in trouble in exactly same scenarios.

Do a full attach/detach on attempt to enter managed code after thread destruction notification.

In theory it could work. Everything is synchronous as it is local to the thread, so no races to be concerned about.
Possible problems could be:

  • I am not sure if registering an FLS slot from a detach callback is something that API expects. (same applies to any other kind of thread termination notifications - is it ok to reregister in a callback?).
  • might be some subtle issues with managed thread identity - is it the "same" thread after reattachment, is the managed ID the same, are thread statics resurrected somehow, does the thread still own the same locks (although terminating thread while owning locks is mostly UB anyways...).
    It feels like it might need to be a new managed thread, while native part would still remember old things.

It would make JIT_ReversePInvokeExit a bit slower

If the idea is to not reregister with OS at all and just detach on every PInvokeExit, then it eliminates the concern #1 (can we reregister at all), while the concern #2 (thread's identity) may get worse.

Ensure that our FLS slot is allocated after COM FLS slot, so that our callback comes after the COM callback.

If registering FLS slot earlier guarantees that callback is called later (i.e. LIFO), this might be sufficient.

We'd need just to be sure that the FLS is allocated before everything else when a thread is attached.

(another assumption is that anything that allocates FLS slots before calling managed code will not call managed code in its callback - not the 100% robust, but plausible assumption, I think)

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

If registering FLS slot earlier guarantees that callback is called later (i.e. LIFO), this might be sufficient.

It is the other way around: Callbacks registered earlier get called earlier. (As long as there is no FLS slot reuse.) We registered our FLS callback before the COM one, so it got called before the COM one.

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

It feels like it might need to be a new managed thread, while native part would still remember old things.

I think it can be the same managed/native combo. We can use the thread static callback that gets executed under loader lock to tell signal that the managed/native combo is not needed anymore (without entering the coop mode).

@VSadov
Copy link
Member

VSadov commented Dec 19, 2024

Also wonder why we did not see this before.
I guess it is just because FLS callbacks are called before DllMain (or whatever calls threadlocal destructors).

@VSadov
Copy link
Member

VSadov commented Dec 19, 2024

What exactly managed code do they need to run on apartment shutdown? Can we move that to native or fake it and not run at all (conditionally - if thread is detached) ?

@jkotas
Copy link
Member

jkotas commented Dec 19, 2024

What exactly managed code do they need to run on apartment shutdown?

The managed code that I have seen running under WaitForApartmentShutdown was a managed Windows Proc registered by WinForms. It run because of the thread was pumping windows messages and the message happened to be for the window owned by WinForms.

@lonitra
Copy link
Member

lonitra commented Dec 20, 2024

Thank you for investigating this! I'm closing this PR as #12655 has superseded this and contains the fix.

@lonitra lonitra closed this Dec 20, 2024
@dotnet-maestro dotnet-maestro bot deleted the darc-main-c234ee6c-30cc-4517-9cda-e9d43eb82c42 branch December 20, 2024 17:15
@VSadov
Copy link
Member

VSadov commented Dec 22, 2024

I was able to catch the problem under debugger.

@jkotas what was your way to repro this? Did you run a particular test or a directed repro sample?

@jkotas
Copy link
Member

jkotas commented Dec 22, 2024

@jkotas what was your way to repro this? Did you run a particular test or a directed repro sample?

Build and run all winforms tests against the changes in this PR. The crash occurs when running the WinForms unit tests (System.Windows.Forms.Tests). I have not isolated the specific test that triggers the crash. I have used coreclr.dll with custom instrumentation to figure out what's going on.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants