Chrome: Crash Report - crashpad::Thread::Start
Issue description

Unable to find the crash in Fracas, hence reported from the "Create new issue" link.

Product name: Chrome
Magic Signature: crashpad::Thread::Start

Current link: https://crash.corp.google.com/browse?q=product.name%3D'Chrome'%20AND%20product.version%3D'58.0.3029.33'%20AND%20custom_data.ChromeCrashProto.channel%3D'beta'%20AND%20custom_data.ChromeCrashProto.ptype%3D'crashpad-handler'%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D'crashpad%3A%3AThread%3A%3AStart'%20AND%20ReportID%3D'c6c9adfd60000000'&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#3

Search properties:
product.name: Chrome
product.version: 58.0.3029.33
custom_data.chromecrashproto.channel: beta
custom_data.chromecrashproto.ptype: crashpad-handler
custom_data.chromecrashproto.magic_signature_1.name: crashpad::Thread::Start
reportid: c6c9adfd60000000

Metadata:
Product Name: Chrome
Product Version: 58.0.3029.33
Report ID: c6c9adfd60000000
Report Time: Tue, 28 Mar 2017 14:13:02 GMT
Uptime: 1000 ms
Cumulative Uptime: 0 ms
User Email:
OS Name: Windows NT
OS Version: 6.1.7600
CPU Architecture: x86
CPU Info: GenuineIntel family 6 model 37 stepping 5

Stack Trace
=============================
Thread 0 CRASHED [EXCEPTION_ACCESS_VIOLATION_EXEC @ 0x00000000 ] MAGIC SIGNATURE THREAD
Stack Quality: 91%

0x00000000
0x75f9bd7b (KERNELBASE.dll + 0x0000bd7b ) CreateRemoteThreadEx
0x76bc281c (kernel32.dll + 0x0005281c ) CreateThreadStub
0x00c48ad3 (chrome.exe -thread_win.cc:23 ) crashpad::Thread::Start()
0x00c45ad5 (chrome.exe -session_end_watcher.cc:109 ) crashpad::SessionEndWatcher::SessionEndWatcher()
0x00c2bd63 (chrome.exe -handler_main.cc:367 ) crashpad::`anonymous namespace'::InstallCrashHandler
0x00c2bda5 (chrome.exe -handler_main.cc:376 ) crashpad::HandlerMain(int,char * * const)
0x00c25970 (chrome.exe -run_as_crashpad_handler_win.cc:46 ) crash_reporter::RunAsCrashpadHandler(base::CommandLine const &)
0x00bf20c7 (chrome.exe -chrome_exe_main_win.cc:252 ) wWinMain
0x00c609c7 (chrome.exe -exe_common.inl:253 ) __scrt_common_main_seh
0x76bc1173 (kernel32.dll + 0x00051173 ) BaseThreadInitThunk
0x77cab3f4 (ntdll.dll + 0x0005b3f4 ) __RtlUserThreadStart
0x77cab3c7 (ntdll.dll + 0x0005b3c7 ) _RtlUserThreadStart

1) This is a regression crash seen in the M58 and M59 channels, i.e. in the latest Beta #58.0.3029.33 and the previous Dev #59.0.3047.4.

2) Currently it's the #9 top crashpad-handler crasher, with 11 crashes from 9 unique client IDs.

3) Crashes are seen on the M58 channel as below.
58.0.3029.33   18.64%   11   -- Latest Beta
59.0.3047.4    18.64%   11   -- Prev. Dev

4) Link to list of builds where crashes are seen: https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27crashpad-handler%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27crashpad%3A%3AThread%3A%3AStart%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000

5) Possible suspect from the code search on the crashed file "thread_win.cc", based on recent changes made.
Review-Url: https://codereview.chromium.org/1505213004

mark@: Could you please take a look and see whether this is related to your change?

Note: Adding label ReleaseBlock-Stable as it seems to be a recent regression.

Thanks...!!
,
Mar 30 2017
More general query: https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27crashpad-handler%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27crashpad%3A%3AThread%3A%3AStart%27

These are all 32-bit Windows, and 97% Windows 7 (two reports are from 8.1).

There are three different ways that this shows up:

44   70%   EXCEPTION_ACCESS_VIOLATION_EXEC @ 0x00000000
15   24%   EXCEPTION_BREAKPOINT
 4    6%   EXCEPTION_ACCESS_VIOLATION_READ @ 0x00000000

First, the null-pointer ones, which look like

0x00000000
0x75bdbd7b (KERNELBASE.dll + 0x0000bd7b) CreateRemoteThreadEx
0x7614281c (kernel32.dll + 0x0005281c) CreateThreadStub
0x012d8ad3 (chrome.exe -thread_win.cc:23) crashpad::Thread::Start()

The null-pointer ones look like the same thing, an attempt to execute address 0, but different exception codes are being used.

The majority of these (42/48) are crashing starting a thread for SessionEndWatcher::SessionEndWatcher(), but I also see examples for each of the two WorkerThread instances that HandlerMain starts: CrashReportUploadThread and PruneCrashReportThread. I don’t think that there’s anything particularly wrong with SessionEndWatcher’s use, it’s just the first thread that we try to start via Thread::Start(), so if things are broken, that’s where we’re more likely to see the crash.

Now, the EXCEPTION_BREAKPOINT ones, which look like

0x00f88adc (chrome.exe -thread_win.cc:25) crashpad::Thread::Start()

23   platform_thread_ =
24       CreateThread(nullptr, 0, ThreadEntryThunk, this, 0, nullptr);
25   PCHECK(platform_thread_) << "CreateThread";

In most of these cases, Thread::Start() is starting a thread for one of the two WorkerThread instances in HandlerMain(), but I also see two for the SessionEndWatcher thread as well.

For the null-pointer ones, I don’t know why CreateThread()’s internals would attempt to execute at 0.

For the PCHECK() ones, I don’t know why CreateThread() would be failing.
,
Mar 30 2017
No secrets here.
,
Mar 30 2017
For both the null pointer crashes and the PCHECK()s, the dumps I’ve looked at so far say

0:000> !gle
LastErrorValue: (Win32) 0x2 (2) - The system cannot find the file specified.
LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.
,
Mar 30 2017
I think that the PCHECK() and null pointer execution are the same thing, but on different OS versions. All of the null pointer executions I’ve seen are using kernelbase.dll 6.1.7600.16385, 6.1.7601.17514, or 6.1.7601.23392 (but only one for that last one). All of the PCHECK()s that I’ve seen are using kernelbase.dll 6.1.7600.17206 or 6.1.7601.19135. The vast majority are 6.1.7600.16385, which is Windows 7 RTM. On my fully-patched Windows 7 installation, I have kernelbase.dll 6.1.7601.23677. It looks to me like this is an underlying OS bug that caused CreateThread() to crash (under what circumstances?), was subsequently “fixed” so that CreateThread() would fail instead, and was ultimately fixed so that it’d work reliably. The two crashes from Windows 8 both have bavnt.dll and bavum.dll loaded, Baidu Antivirus, so there’s third-party taint.
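
The version-to-symptom mapping above can be written down as a small lookup. This is only a sketch: the type and function names (FileVersion, ParseVersion, CompareVersions, LookupSymptom) are hypothetical, and the table holds nothing beyond the kernelbase.dll versions listed in this comment.

```cpp
#include <cstdint>

// Four-part file version such as 6.1.7600.16385.
struct FileVersion {
  std::uint32_t part[4];
};

// Parses "a.b.c.d". Sketch only: no validation, missing parts stay 0.
constexpr FileVersion ParseVersion(const char* text) {
  FileVersion v = {{0, 0, 0, 0}};
  int index = 0;
  for (const char* p = text; *p != '\0' && index < 4; ++p) {
    if (*p == '.') {
      ++index;
    } else if (*p >= '0' && *p <= '9') {
      v.part[index] = v.part[index] * 10 + static_cast<std::uint32_t>(*p - '0');
    }
  }
  return v;
}

// Returns -1, 0 or 1, comparing the parts from most to least significant.
constexpr int CompareVersions(const FileVersion& a, const FileVersion& b) {
  for (int i = 0; i < 4; ++i) {
    if (a.part[i] != b.part[i]) return a.part[i] < b.part[i] ? -1 : 1;
  }
  return 0;
}

enum class Symptom { kNullExecution, kCreateThreadFailed, kNotInReports };

struct Observation {
  const char* kernelbase_version;
  Symptom symptom;
};

// Versions and symptoms exactly as listed in the comment above.
constexpr Observation kObservations[] = {
    {"6.1.7600.16385", Symptom::kNullExecution},
    {"6.1.7601.17514", Symptom::kNullExecution},
    {"6.1.7601.23392", Symptom::kNullExecution},
    {"6.1.7600.17206", Symptom::kCreateThreadFailed},
    {"6.1.7601.19135", Symptom::kCreateThreadFailed},
};

constexpr Symptom LookupSymptom(const char* version) {
  const FileVersion wanted = ParseVersion(version);
  for (const Observation& o : kObservations) {
    if (CompareVersions(ParseVersion(o.kernelbase_version), wanted) == 0) {
      return o.symptom;
    }
  }
  return Symptom::kNotInReports;
}

// The fully patched 6.1.7601.23677 is newer than every version in the table.
static_assert(CompareVersions(ParseVersion("6.1.7601.23392"),
                              ParseVersion("6.1.7601.23677")) < 0,
              "23392 predates 23677");
```

Note that sorting by version does not separate the two symptoms cleanly (6.1.7601.23392 still executes at 0), so the table is a record of observations rather than evidence of an ordering.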
,
Mar 30 2017
Seems the same as bug 124839 . Bug 690847 may be a hang variant of the same.
,
Mar 30 2017
Yeah, there's no way CreateThread is generally broken of course, this has to be third party suckage. I don't know if there's much to be done here, other than monitoring the crash rates as they might indicate somewhere we need to do outreach if they change radically (assuming it's AV rather than malware).
,
Mar 30 2017
I didn’t see evidence of third-party taint in most of the crash reports here, just unpatched Windows 7 RTM.
,
Mar 30 2017
,
Mar 30 2017
OK, it's not impossible, but it feels pretty unlikely that CreateThread() is broken-by-default in Win7 RTM and/or SP1? We could compare the dumps of CreateRemoteThreadEx() for RTM vs. 23677 I guess to see if there's any suspicious looking delta... My VM happens to be 16385, here's a uf /i kernelbase!CreateRemoteThreadEx for that version https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e
,
Mar 30 2017
Will and I talked, it’s probably more likely to be third-party-ware doing something like messing with us remotely, without having any of its own modules loaded.
,
Mar 31 2017
#c10, nothing really suspicious. Diff of normalized disassembly: https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e#gistcomment-2042716
,
Mar 31 2017
One thing that’s confusing to me is that all of the reports with kernelbase.dll 6.1.7600.16385 are showing kernelbase.dll + 0xbd7b. But Scott’s disassembly from #c10 has CreateRemoteThreadEx at 0x761b2ef3 (so it loaded at 0x761a0000? Guess.) and it’s 0x30a bytes long. So it begins at 0x12ef3 relative to the module. This can’t be the same module that’s in all of the crash reports, because CreateRemoteThreadEx is in the wrong place.

What does “!lmi kernelbase.dll” give for the PDB GUID and age? The crash reports for this version have debug ID 59D5EEBCB6B044C7A1572DAE49752E1D2.
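
The offset arithmetic above can be checked mechanically. A minimal sketch, assuming the load address 0x761a0000 guessed in this comment; the helper names ModuleOffset and OffsetInFunction are hypothetical.

```cpp
#include <cstdint>

// Distance of an absolute address from the base of the module containing it.
constexpr std::uint32_t ModuleOffset(std::uint32_t address,
                                     std::uint32_t module_base) {
  return address - module_base;
}

// True if `offset` lies inside a function that begins at module offset
// `start` and is `length` bytes long.
constexpr bool OffsetInFunction(std::uint32_t offset, std::uint32_t start,
                                std::uint32_t length) {
  return offset >= start && offset - start < length;
}

// Numbers quoted in the comment above. kGuessedBase is the guessed load
// address, not a value read from a dump.
constexpr std::uint32_t kGuessedBase = 0x761a0000;
constexpr std::uint32_t kFunctionAddress = 0x761b2ef3;
constexpr std::uint32_t kFunctionLength = 0x30a;
constexpr std::uint32_t kReportedOffset = 0xbd7b;

constexpr std::uint32_t kFunctionStart =
    ModuleOffset(kFunctionAddress, kGuessedBase);

static_assert(kFunctionStart == 0x12ef3,
              "CreateRemoteThreadEx begins at +0x12ef3 in that module");
static_assert(
    !OffsetInFunction(kReportedOffset, kFunctionStart, kFunctionLength),
    "+0xbd7b from the crash reports is outside that function");
```

Under that guessed base, the function covers module offsets 0x12ef3 through 0x131fc, which is why +0xbd7b cannot belong to the same build of the module.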
,
Mar 31 2017
Dug out the real symbols for that version (https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e#gistcomment-2043070) and the instruction that’s jumping to 0 is

754bbd76 ff1544134b75 call dword ptr [KERNELBASE!_imp__NtResumeThread (754b1344)]

Obviously it’s highly unusual for that to be 0.
,
Mar 31 2017
But then I remembered that Breakpad’s stackwalker has this bug where it skips over frame 1 if frame 0 has no unwind info and didn’t even get far enough to allocate a stack frame, so I fed go/crash/62798102e0000000 into windbg.

0:000> .ecxr
eax=001ff01c ebx=00000000 ecx=001fefbc edx=76e164f4 esi=00000001 edi=00000004
eip=00000000 esp=001fef50 ebp=001fefe4 iopl=0 nv up ei pl zr na pe nc
cs=001b ss=0023 ds=0023 es=0023 fs=003b gs=0000 efl=00010246
00000000 ?? ???
0:000> k
*** Stack trace for last set context - .thread/.cxr resets it
# ChildEBP RetAddr
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 001fef4c 20025c34 0x0
01 001fefe4 751dbd7c 0x20025c34
02 001ff270 76c4281d KERNELBASE!CreateRemoteThreadEx+0x318
*** WARNING: Unable to verify timestamp for chrome.exe
*** ERROR: Module load completed but symbols could not be loaded for chrome.exe
03 001ff298 00b2b48f kernel32!CreateThreadStub+0x20
04 001ff390 00b0e794 chrome+0x5b48f
05 001ff848 00b082f1 chrome+0x3e794
06 001ff8b0 00ad20c8 chrome+0x382f1
07 001ff9c8 00b434c8 chrome+0x20c8
08 001ffa14 76c41174 chrome+0x734c8
09 001ffa20 76e2b3f5 kernel32!BaseThreadInitThunk+0xe
0a 001ffa60 76e2b3c8 ntdll!__RtlUserThreadStart+0x70
0b 001ffa78 00000000 ntdll!_RtlUserThreadStart+0x1b
0:000> !address 0x20025c34
Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...

Usage: <unknown>
Base Address: 20021000
End Address: 20028000
Region Size: 00007000 ( 28.000 kB)
State: 00001000 MEM_COMMIT
Protect: 00000020 PAGE_EXECUTE_READ
Type: 00020000 MEM_PRIVATE
Allocation Base: 20020000
Allocation Protect: 00000040 PAGE_EXECUTE_READWRITE

Content source: 1 (target), length: 300
0:000> ub 0x751dbd7c l2
KERNELBASE!CreateRemoteThreadEx+0x30c:
751dbd70 ffb5e4fdffff push dword ptr [ebp-21Ch]
751dbd76 ff1544131d75 call dword ptr [KERNELBASE!_imp__NtResumeThread (751d1344)]
0:000>

So it’s not really calling NtResumeThread (which is just a system call stub) at all, it’s doing something that lands it in this region at 0x20021000, and there’s code there that calls 0.

All right, I relent. This does sound a whole lot like third-party taint now.
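
The addresses in this windbg session tie together with simple arithmetic. A small sketch that checks them, using only values printed above; the helper names InRegion and ReturnAddress are hypothetical.

```cpp
#include <cstdint>

// True if `address` lies in the half-open range [base, end).
constexpr bool InRegion(std::uint32_t address, std::uint32_t base,
                        std::uint32_t end) {
  return address >= base && address < end;
}

// Address pushed as the return address by a call instruction that starts at
// `call_address` and is `instruction_length` bytes long.
constexpr std::uint32_t ReturnAddress(std::uint32_t call_address,
                                      std::uint32_t instruction_length) {
  return call_address + instruction_length;
}

// Region reported by !address.
constexpr std::uint32_t kRegionBase = 0x20021000;
constexpr std::uint32_t kRegionEnd = 0x20028000;

// The call shown by ub. Its encoding "ff1544131d75" is six bytes:
// opcode ff 15 followed by a 32-bit absolute operand.
constexpr std::uint32_t kCallAddress = 0x751dbd76;
constexpr std::uint32_t kCallLength = 6;

static_assert(kRegionEnd - kRegionBase == 0x7000, "region size is 28 kB");
static_assert(ReturnAddress(kCallAddress, kCallLength) == 0x751dbd7c,
              "matches RetAddr of frame 01, CreateRemoteThreadEx+0x318");
static_assert(InRegion(0x20025c34, kRegionBase, kRegionEnd),
              "RetAddr of frame 00 is inside the private executable region");
```

So the call through the NtResumeThread import slot really is the last thing KERNELBASE did, and the frame between it and address 0 sits in private, non-module memory.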
,
Mar 31 2017
Oh, right, we capture some interesting memory regions now!
; ================ B E G I N N I N G O F P R O C E D U R E ================
; Variables:
; arg_4: 12
; arg_0: 8
; var_14: -20
; var_1C: -28
sub_20025bda:
20025bda push ebp ; DATA XREF=sub_20025c4e+55
20025bdb mov ebp, esp
20025bdd add esp, 0xffffff78
20025be3 pushal
20025be4 cmp dword [0x2002958e], 0x0
20025beb je loc_20025c27
20025bed cmp dword [0x20029592], 0x0
20025bf4 je loc_20025c27
20025bf6 cmp dword [ebp+arg_0], 0x0
20025bfa je loc_20025c27
20025bfc push 0x0
20025bfe push 0x1c
20025c00 lea eax, dword [ebp+var_1C]
20025c03 push eax
20025c04 push 0x0
20025c06 push dword [ebp+arg_0]
20025c09 call 0x200248db
20025c0e or eax, eax
20025c10 jne loc_20025c27
20025c12 mov eax, dword [0x2002959a]
20025c17 cmp eax, dword [ebp+var_14]
20025c1a je loc_20025c27
20025c1c push dword [ebp+arg_0] ; argument #2 for method sub_20025b36
20025c1f push dword [ebp+var_14] ; argument #1 for method sub_20025b36
20025c22 call sub_20025b36
loc_20025c27:
20025c27 popal ; CODE XREF=sub_20025bda+17, sub_20025bda+26, sub_20025bda+32, sub_20025bda+54, sub_20025bda+64
20025c28 push dword [ebp+arg_4]
20025c2b push dword [ebp+arg_0]
20025c2e call dword [0x2002959e]
20025c34 leave
20025c35 ret 0x8
; endp
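
One way to read the listing above is as a wrapper that optionally does some extra work and then always forwards to whatever pointer is stored at 0x2002959e. The following is a hedged C++ model of that control flow only; the struct and function names are hypothetical, and nothing here claims to know what the called routines actually do.

```cpp
#include <cstdint>

// Values the procedure reads. Field names mirror addresses in the listing.
struct HookInputs {
  std::uint32_t global_2002958e;  // dword [0x2002958e]
  std::uint32_t global_20029592;  // dword [0x20029592]
  std::uint32_t arg_0;            // first stack argument
  std::uint32_t status_200248db;  // eax returned by call 0x200248db
  std::uint32_t var_14;           // local written by that call
  std::uint32_t global_2002959a;  // dword [0x2002959a]
  std::uint32_t global_2002959e;  // dword [0x2002959e], final call target
};

struct HookTrace {
  bool calls_sub_20025b36;         // whether the optional call is reached
  std::uint32_t tail_call_target;  // where "call dword [0x2002959e]" goes
};

// Mirrors the branches: every je/jne to loc_20025c27 skips the optional call,
// and the indirect call at 20025c2e is reached on every path.
constexpr HookTrace ModelSub20025bda(const HookInputs& in) {
  return HookTrace{
      in.global_2002958e != 0 && in.global_20029592 != 0 && in.arg_0 != 0 &&
          in.status_200248db == 0 && in.var_14 != in.global_2002959a,
      in.global_2002959e};
}

// The indirect call sits at 20025c2e and the next instruction at 20025c34,
// so 0x20025c34 is the return address it pushes.
static_assert(0x20025c2e + 6 == 0x20025c34, "six-byte indirect call");
```

If the dword at 0x2002959e holds 0 when the procedure runs, the tail call transfers control to address 0 with 0x20025c34 as the return address, which matches frames 00 and 01 of the windbg stack in the earlier comment.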
,
Mar 31 2017
I went looking through dump 62798102e0000000 for interesting strings and found one:

\Sessions\1\Windows\ApiPortection

The misspelling’s a giveaway, so I pulled a handful of dumps matching the crash signature and looked for that string in them. It shows up in all of the dumps I saw from 6.1.7600.16385 (7) and some other versions, intermittently in the dumps from 6.1.7601.17514 (7sp1), and not at all in the two lone dumps from 6.3.9600.18202 (8.1), although those two did have evidence of other third-party taint.

The Internet doesn’t know much about this string, other than that it shows up in a few samples collected by malware analysis places.

I also pulled a few dumps from bug 698471 and found this string in them too (both 7 and 7sp1).

Looks like crapware.
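
Checking a batch of dumps for that string can be automated with a byte scan. A minimal sketch, assuming the dump contents are already in memory; whether the string is stored as ASCII or UTF-16LE is not stated above, so this looks for both. All names here (kMarker, ContainsMarker, DumpMentionsMarker) are hypothetical.

```cpp
#include <cstddef>

// The object name quoted in the comment above.
constexpr char kMarker[] = "\\Sessions\\1\\Windows\\ApiPortection";

constexpr std::size_t Length(const char* s) {
  std::size_t n = 0;
  while (s[n] != '\0') ++n;
  return n;
}

// Compares `marker` against `data`, where each marker character occupies
// `stride` bytes: the character itself followed by stride - 1 zero bytes.
constexpr bool MatchesAt(const unsigned char* data, const char* marker,
                         std::size_t length, std::size_t stride) {
  for (std::size_t k = 0; k < length; ++k) {
    if (data[k * stride] != static_cast<unsigned char>(marker[k])) return false;
    for (std::size_t pad = 1; pad < stride; ++pad) {
      if (data[k * stride + pad] != 0) return false;
    }
  }
  return true;
}

// stride 1 searches for plain ASCII, stride 2 for UTF-16LE.
constexpr bool ContainsMarker(const unsigned char* data, std::size_t size,
                              const char* marker, std::size_t stride) {
  const std::size_t length = Length(marker);
  const std::size_t needed = length * stride;
  if (length == 0 || needed > size) return false;
  for (std::size_t i = 0; i + needed <= size; ++i) {
    if (MatchesAt(data + i, marker, length, stride)) return true;
  }
  return false;
}

// True if the buffer contains the marker in either encoding.
constexpr bool DumpMentionsMarker(const unsigned char* data, std::size_t size) {
  return ContainsMarker(data, size, kMarker, 1) ||
         ContainsMarker(data, size, kMarker, 2);
}

// Tiny self-check with a shortened marker embedded as UTF-16LE.
constexpr unsigned char kSampleWide[] = {0x00, 'A', 0, 'p', 0, 'i', 0, 0x7f};
static_assert(ContainsMarker(kSampleWide, sizeof(kSampleWide), "Api", 2),
              "UTF-16LE match");
static_assert(!ContainsMarker(kSampleWide, sizeof(kSampleWide), "Api", 1),
              "no contiguous ASCII match in that buffer");
```

The scan is quadratic in the worst case, which is acceptable for a one-off triage pass over a handful of dumps.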
,
Apr 5 2017
This is attributable to third-party software messing with things, and is not actionable.
,
Aug 9 2017
Issue 753760 has been merged into this issue.
Comment 1 by fdoray@chromium.org
, Mar 29 2017