Thread handle leak in Intel driver in GPU process
Issue description
I've noticed on my home laptop that the handle count in the GPU process goes a bit crazy - up to 12,300 handles in one case. I attached windbg and ran !handle and it said that most of the handles were to threads. I then looked at individual handles and the vast majority are to threads in the GPU process, and the reports usually look like this:
0:000> !handle 738 8
Handle 738
Object Specific Information
Thread Id 9860.24d4
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
0:000> !handle 73c 8
Handle 73c
Object Specific Information
Thread Id 9860.c4d4
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
0:000> !handle 740 8
Handle 740
Object Specific Information
Thread Id 9860.9744
Priority 9
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
0:000> !handle 744 8
Handle 744
Object Specific Information
Thread Id 9860.be34
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
0:000> !handle 748 8
Handle 748
Object Specific Information
Thread Id 9860.5e78
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
So... it looks like "somebody" is creating a thread with OpenAdapter10_2 as the start address, and then failing to close the thread handle. This seems like it must be an Intel driver bug. I don't know what the memory consequences are but with 12,300 handles leaked I think they must be non-trivial.
I am discussing this with Intel.
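The per-handle inspection above can be automated once the debugger output is saved to text. The following is a minimal sketch, not part of the original report: the function name `tally_start_addresses` and the `SAMPLE` excerpt are illustrative, and the regular expression assumes each thread handle report contains a "Start Address <hex> <module!symbol>" line as in the output pasted above.

```python
import re
from collections import Counter

# Matches lines such as "Start Address 160ea10 igd10iumd64!OpenAdapter10_2".
# Leading whitespace is tolerated because the debugger usually indents these lines.
START_ADDRESS = re.compile(
    r"^\s*Start Address\s+(?P<addr>[0-9a-fA-F`]+)\s+(?P<symbol>\S+)\s*$"
)


def tally_start_addresses(debugger_output: str) -> Counter:
    """Count thread handles grouped by the symbol at their start address."""
    counts = Counter()
    for line in debugger_output.splitlines():
        match = START_ADDRESS.match(line)
        if match:
            counts[match.group("symbol")] += 1
    return counts


# Two of the handle reports from the issue description, used as sample input.
SAMPLE = """\
0:000> !handle 738 8
Handle 738
Object Specific Information
Thread Id 9860.24d4
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
0:000> !handle 73c 8
Handle 73c
Object Specific Information
Thread Id 9860.c4d4
Priority 8
Base Priority 0
Start Address 160ea10 igd10iumd64!OpenAdapter10_2
"""

if __name__ == "__main__":
    for symbol, count in tally_start_addresses(SAMPLE).most_common():
        print(f"{count:6d}  {symbol}")
```

If one symbol dominates the tally, as igd10iumd64!OpenAdapter10_2 does here, that points at the code that creates the leaked threads.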
,
Jan 12
Twitter thread is here: https://twitter.com/gfxlisa/status/1083798621282156544
,
Jan 17
(6 days ago)
I attached windbg to the GPU process and set a breakpoint on CreateThread like this:
bu kernel32!CreateThreadStub "kc;g"
This means that I get a call stack every time somebody creates a thread in the GPU process. Then I ran !handle, resumed the process, created a tab, started navigating to twitter.com, broke into the debugger, and ran !handle again.
I've attached a summary of the output but the TL;DR is that every time a pixel or vertex shader is created the Intel driver (igd10iumd64) creates a thread and it never closes the thread handle. The output shows that seven threads were created, and the number of Thread handles open went from 3260 to 3267. It's a 100% leak.
The two (partial) call stacks I saw were these - full stacks are in the attached output file:
00 KERNEL32!CreateThreadStub
01 igd10iumd64!OpenAdapter10_2+0x85bb47
02 igd10iumd64!GTPIN_IGC_Instrument+0x16762
03 d3d11!CVertexShader::CLS::FinalConstruct
04 d3d11!CLayeredObjectWithCLS<CVertexShader>::FinalConstruct
05 d3d11!CLayeredObjectWithCLS<CVertexShader>::CreateInstance
06 d3d11!CDevice::CreateLayeredChild
07 d3d11!NDXGI::CDevice::CreateLayeredChild
08 d3d11!NOutermost::CDevice::CreateLayeredChild
00 KERNEL32!CreateThreadStub
01 igd10iumd64!OpenAdapter10_2+0x85bb47
02 igd10iumd64!GTPIN_IGC_Instrument+0x16762
03 d3d11!CPixelShader::CLS::FinalConstruct
04 d3d11!CLayeredObjectWithCLS<CPixelShader>::FinalConstruct
05 d3d11!CLayeredObjectWithCLS<CPixelShader>::CreateInstance
06 d3d11!CDevice::CreateLayeredChild
07 d3d11!NDXGI::CDevice::CreateLayeredChild
08 d3d11!NOutermost::CDevice::CreateLayeredChild
Note that the function names in igd10iumd64 are not meaningful because there are no symbols. If Intel published symbols on a symbol server then I could give more detailed bug reports. The "kc" output removes offsets so I pasted in the results of "k" in order to get the offsets and avoid being misleading. I also attached a minidump on one of the calls.
Please close your thread handles when you are done with them.
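The "100% leak" figure follows from comparing the two !handle snapshots against the number of CreateThread calls logged between them. A small sketch of that arithmetic, using the numbers reported above (the function name `leak_ratio` is illustrative, not from any tool):

```python
def leak_ratio(handles_before: int, handles_after: int, threads_created: int) -> float:
    """Fraction of newly created threads whose handle was still open afterwards."""
    if threads_created <= 0:
        raise ValueError("threads_created must be positive")
    return (handles_after - handles_before) / threads_created


if __name__ == "__main__":
    # Thread handle count went from 3260 to 3267 while seven threads were created.
    ratio = leak_ratio(3260, 3267, 7)
    print(f"{ratio:.0%} of thread handles leaked")  # prints "100% of thread handles leaked"
```

A ratio near zero would mean the handles are being closed. A ratio of 1.0 means no handle created in the interval was released.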
,
Jan 17
(5 days ago)
Thanks Bruce. I would like to spend a little time with the dump file and our private symbols. I did not see a leak on my system. Do you know if it's specific to a web page or web content? I am using the standard Chrome browser download. It seems possible I need a specific version. Thanks, Dave
,
Jan 17
(5 days ago)
I saw this on Chrome stable. The precise version that I am using is embedded in the dump file. The leak may well only happen on certain code paths. Due to crbug.com/893289 (possibly a separate Intel driver bug?) I am running Chrome on that machine with --disable-direct-composition. I was hoping that with the crash dump making it clear where the handles were being created it would then be a matter of figuring out where they were supposed to be freed, and then fixing that code to make sure that they are. If it isn't that straightforward then I can try doing additional debugging (setting breakpoints on the stacks where they should be freed to figure out where the logic is going awry and causing CloseHandle to be skipped) but that of course requires Intel's symbols. Finding where CreateThread is called is easy, but figuring out why CloseHandle is *not* called is much more difficult (for somebody without source and symbols).
,
Jan 17
(5 days ago)
I was able to track this back to a fix made a few months after the driver you are testing with. We expect the fix to resolve your issue; it should be in drivers with version 100.6092 or later. The fix in question, as you suspected, added a missing endthreadex call :).
,
Jan 17
(5 days ago)
Here is a link to a driver that should have the fix already: https://downloadcenter.intel.com/download/28445/Intel-Graphics-Driver-for-Windows-10?product=80939
Please don't install it with the "have disk" method; use the installer program.
Version: DCH 25.20.100.6471
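To check whether an installed driver should already contain the fix, the version can be compared against the "100.6092 or later" threshold mentioned above. This is a rough sketch under one assumption that the thread does not confirm: that only the last two fields of the dotted version string (for example 100.6471 in 25.20.100.6471) matter for the comparison. The function name `has_handle_leak_fix` is hypothetical.

```python
FIRST_FIXED = (100, 6092)  # threshold quoted in the thread: "100.6092 or later"


def has_handle_leak_fix(driver_version: str) -> bool:
    """Compare the last two numeric fields of a dotted driver version to FIRST_FIXED.

    Assumption: the trailing "<major>.<build>" pair is what orders driver releases.
    """
    parts = tuple(int(field) for field in driver_version.strip().split("."))
    if len(parts) < 2:
        raise ValueError(f"expected at least two version fields: {driver_version!r}")
    return parts[-2:] >= FIRST_FIXED


if __name__ == "__main__":
    # Version linked in the thread.
    print(has_handle_leak_fix("25.20.100.6471"))
```

Any version string with a trailing pair below 100.6092 would be reported as lacking the fix under this assumption.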
Comment 1 by bruce.da...@gmail.com
, Jan 11
2.6 MB Download