
Issue 706393

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug-Regression




Chrome: Crash Report - crashpad::Thread::Start

Project Member Reported by krajshree@chromium.org, Mar 29 2017

Issue description

I was unable to find this crash in Fracas, so I am reporting it through the Create new issue link.

Product name: Chrome
Magic Signature: crashpad::Thread::Start

Current link:
https://crash.corp.google.com/browse?q=product.name%3D'Chrome'%20AND%20product.version%3D'58.0.3029.33'%20AND%20custom_data.ChromeCrashProto.channel%3D'beta'%20AND%20custom_data.ChromeCrashProto.ptype%3D'crashpad-handler'%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D'crashpad%3A%3AThread%3A%3AStart'%20AND%20ReportID%3D'c6c9adfd60000000'&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#3


Search properties:
product.name: Chrome
product.version: 58.0.3029.33
custom_data.chromecrashproto.channel: beta
custom_data.chromecrashproto.ptype: crashpad-handler
custom_data.chromecrashproto.magic_signature_1.name: crashpad::Thread::Start
reportid: c6c9adfd60000000

Metadata :
Product Name: Chrome
Product Version: 58.0.3029.33
Report ID: c6c9adfd60000000
Report Time: Tue, 28 Mar 2017 14:13:02 GMT
Uptime: 1000 ms
Cumulative Uptime: 0 ms
User Email: 
OS Name: Windows NT
OS Version: 6.1.7600 
CPU Architecture: x86
CPU Info: GenuineIntel family 6 model 37 stepping 5

Stack Trace
=============================
Thread 0 CRASHED [EXCEPTION_ACCESS_VIOLATION_EXEC @ 0x00000000 ] MAGIC SIGNATURE THREAD
Stack Quality: 91%
0x00000000		
0x75f9bd7b	(KERNELBASE.dll + 0x0000bd7b )	CreateRemoteThreadEx
0x76bc281c	(kernel32.dll + 0x0005281c )	CreateThreadStub
0x00c48ad3	(chrome.exe -thread_win.cc:23 )	crashpad::Thread::Start()
0x00c45ad5	(chrome.exe -session_end_watcher.cc:109 )	crashpad::SessionEndWatcher::SessionEndWatcher()
0x00c2bd63	(chrome.exe -handler_main.cc:367 )	crashpad::`anonymous namespace'::InstallCrashHandler
0x00c2bda5	(chrome.exe -handler_main.cc:376 )	crashpad::HandlerMain(int,char * * const)
0x00c25970	(chrome.exe -run_as_crashpad_handler_win.cc:46 )	crash_reporter::RunAsCrashpadHandler(base::CommandLine const &)
0x00bf20c7	(chrome.exe -chrome_exe_main_win.cc:252 )	wWinMain
0x00c609c7	(chrome.exe -exe_common.inl:253 )	__scrt_common_main_seh
0x76bc1173	(kernel32.dll + 0x00051173 )	BaseThreadInitThunk
0x77cab3f4	(ntdll.dll + 0x0005b3f4 )	__RtlUserThreadStart
0x77cab3c7	(ntdll.dll + 0x0005b3c7 )	_RtlUserThreadStart


1) This is a regression crash seen in M58 and M59, i.e. in the latest Beta (58.0.3029.33) and the previous Dev (59.0.3047.4).

2) It is currently the #9 top crashpad-handler crasher, with 11 crashes from 9 unique client IDs.

3) Crashes are seen on the M58 and M59 builds below.

   58.0.3029.33	18.64%	11	-- Latest Beta
   59.0.3047.4	18.64%	11      -- Prev. Dev

4) Link to list of builds where crashes are seen:
https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27crashpad-handler%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27crashpad%3A%3AThread%3A%3AStart%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#samplereports:5,productversion:1000

5) Possible suspect, based on a code search for recent changes to the crashing file "thread_win.cc":
Review-Url: https://codereview.chromium.org/1505213004 

mark@: Could you please check whether this is related to your change?

Note: Adding label ReleaseBlock-Stable as it seems to be a recent regression.

Thanks!
 

Comment 1 by fdoray@chromium.org, Mar 29 2017

Components: -Internals>TaskScheduler

Comment 2 by mark@chromium.org, Mar 30 2017

Cc: scottmg@chromium.org
More general query: https://crash.corp.google.com/browse?q=product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27crashpad-handler%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27crashpad%3A%3AThread%3A%3AStart%27

These are all 32-bit Windows, and 97% Windows 7 (two reports are from 8.1). There are three different ways that this shows up:

44  70%  EXCEPTION_ACCESS_VIOLATION_EXEC @ 0x00000000
15  24%  EXCEPTION_BREAKPOINT
 4   6%  EXCEPTION_ACCESS_VIOLATION_READ @ 0x00000000
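As a sanity check on the breakdown above, here is a minimal C++ sketch that recomputes the percentages from the raw counts. The total of 63 is just the sum of the three rows, and all names here are illustrative rather than taken from Crashpad.

```cpp
#include <cassert>
#include <cmath>

// Rounds count/total to the nearest whole percent.
inline int RoundedPercent(int count, int total) {
  assert(total > 0);
  return static_cast<int>(std::lround(100.0 * count / total));
}

// Counts copied from the three rows of the breakdown.
constexpr int kExecAtNull = 44;   // EXCEPTION_ACCESS_VIOLATION_EXEC @ 0
constexpr int kBreakpoint = 15;   // EXCEPTION_BREAKPOINT
constexpr int kReadAtNull = 4;    // EXCEPTION_ACCESS_VIOLATION_READ @ 0

// Assumption: the table covers every report in the query.
constexpr int kTotal = kExecAtNull + kBreakpoint + kReadAtNull;

// The two null-pointer rows together.
constexpr int kNullPointerTotal = kExecAtNull + kReadAtNull;
```

With these counts, the two 8.1 reports leave 61 of 63 on Windows 7, which rounds to the 97% quoted above, and the two null-pointer rows add up to the 48 used as the denominator in the next paragraph.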

First, the null-pointer ones, which look like

0x00000000		
0x75bdbd7b	(KERNELBASE.dll + 0x0000bd7b)	CreateRemoteThreadEx
0x7614281c	(kernel32.dll + 0x0005281c)	CreateThreadStub
0x012d8ad3	(chrome.exe -thread_win.cc:23)	crashpad::Thread::Start()

The null-pointer ones all look like the same thing, an attempt to execute address 0, but they are reported under two different exception codes (EXCEPTION_ACCESS_VIOLATION_EXEC and EXCEPTION_ACCESS_VIOLATION_READ). The majority of these (42/48) crash while starting a thread for SessionEndWatcher::SessionEndWatcher(), but I also see examples for each of the two WorkerThread instances that HandlerMain starts: CrashReportUploadThread and PruneCrashReportThread. I don’t think that there’s anything particularly wrong with SessionEndWatcher’s use. It’s just the first thread that we try to start via Thread::Start(), so if things are broken, that’s where we’re more likely to see the crash.

Now, the EXCEPTION_BREAKPOINT ones, which look like

0x00f88adc	(chrome.exe -thread_win.cc:25)	crashpad::Thread::Start()

23    platform_thread_ =
24        CreateThread(nullptr, 0, ThreadEntryThunk, this, 0, nullptr);
25    PCHECK(platform_thread_) << "CreateThread";

In most of these cases, Thread::Start() is starting a thread for one of the two WorkerThread instances in HandlerMain(), but I also see two for the SessionEndWatcher thread.

For the null-pointer ones, I don’t know why CreateThread()’s internals would attempt to execute at 0.

For the PCHECK() ones, I don’t know why CreateThread() would be failing.

Comment 3 by mark@chromium.org, Mar 30 2017

Labels: -Restrict-View-Google
No secrets here.

Comment 4 by mark@chromium.org, Mar 30 2017

For both the null pointer crashes and the PCHECK()s, the dumps I’ve looked at so far say

0:000> !gle
LastErrorValue: (Win32) 0x2 (2) - The system cannot find the file specified.
LastStatusValue: (NTSTATUS) 0xc0000034 - Object Name not found.

Comment 5 by mark@chromium.org, Mar 30 2017

I think that the PCHECK() and null pointer execution are the same thing, but on different OS versions.

All of the null pointer executions I’ve seen are using kernelbase.dll 6.1.7600.16385, 6.1.7601.17514, or 6.1.7601.23392 (but only one for that last one).

All of the PCHECK()s that I’ve seen are using kernelbase.dll 6.1.7600.17206 or 6.1.7601.19135.

The vast majority are 6.1.7600.16385, which is Windows 7 RTM.

On my fully-patched Windows 7 installation, I have kernelbase.dll 6.1.7601.23677.

It looks to me like this is an underlying OS bug that caused CreateThread() to crash (under what circumstances?), was subsequently “fixed” so that CreateThread() would fail instead, and was ultimately fixed so that it’d work reliably.
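The version-to-symptom correlation listed above can be written down as a small lookup. This is only a sketch of the observations in this comment, not a claim about what each kernelbase.dll build actually does. The enum and function names are made up for illustration.

```cpp
#include <cassert>
#include <string>

// Symptom seen in the crash reports for a given kernelbase.dll version.
enum class ObservedSymptom {
  kExecuteAtNull,      // CreateThread() internals jump to address 0
  kCreateThreadFails,  // CreateThread() returns null and PCHECK() fires
  kNotInReports,       // version not mentioned in this comment
};

// Maps a kernelbase.dll version string to the symptom observed for it.
inline ObservedSymptom SymptomForKernelbase(const std::string& version) {
  if (version == "6.1.7600.16385" || version == "6.1.7601.17514" ||
      version == "6.1.7601.23392") {
    return ObservedSymptom::kExecuteAtNull;
  }
  if (version == "6.1.7600.17206" || version == "6.1.7601.19135") {
    return ObservedSymptom::kCreateThreadFails;
  }
  return ObservedSymptom::kNotInReports;
}
```

For example, SymptomForKernelbase("6.1.7601.23677") returns kNotInReports, because the fully-patched version mentioned above did not show up among the crashes listed here.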

The two crashes from Windows 8 both have bavnt.dll and bavum.dll loaded, Baidu Antivirus, so there’s third-party taint.

Comment 6 by mark@chromium.org, Mar 30 2017

Seems the same as bug 124839. Bug 690847 may be a hang variant of the same.
Cc: wfh@chromium.org
Yeah, there's no way CreateThread is generally broken, of course; this has to be third-party suckage. I don't know if there's much to be done here, other than monitoring the crash rates, as a radical change might indicate somewhere we need to do outreach (assuming it's AV rather than malware).

Comment 8 by mark@chromium.org, Mar 30 2017

I didn’t see evidence of third-party taint in most of the crash reports here, just unpatched Windows 7 RTM.

Comment 9 by wfh@chromium.org, Mar 30 2017

Labels: Stability-ThirdParty
OK, it's not impossible, but it feels pretty unlikely that CreateThread() is broken-by-default in Win7 RTM and/or SP1?

I guess we could compare the disassembly of CreateRemoteThreadEx() for RTM vs. 23677 to see if there's any suspicious-looking delta... My VM happens to be 16385; here's a uf /i kernelbase!CreateRemoteThreadEx for that version:

https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e

Comment 11 by mark@chromium.org, Mar 30 2017

Will and I talked; it’s more likely to be third-party software messing with us remotely, without having any of its own modules loaded.

Comment 12 by mark@chromium.org, Mar 31 2017

#c10, nothing really suspicious. Diff of normalized disassembly: https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e#gistcomment-2042716

Comment 13 by mark@chromium.org, Mar 31 2017

One thing that’s confusing to me is that all of the reports with kernelbase.dll 6.1.7600.16385 are showing kernelbase.dll + 0xbd7b. But Scott’s disassembly from #c10 has CreateRemoteThreadEx at 0x761b2ef3 (so the module presumably loaded at 0x761a0000; that’s a guess) and it’s 0x30a bytes long. So it begins at 0x12ef3 relative to the module. This can’t be the same module that’s in all of the crash reports, because CreateRemoteThreadEx is in the wrong place.

What does “!lmi kernelbase.dll” give for the PDB GUID and age? The crash reports for this version have debug ID 59D5EEBCB6B044C7A1572DAE49752E1D2.
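The offset arithmetic in this comment can be checked with a short C++ sketch. The load base is the guess made above, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Offset of an absolute address from the base of the module containing it.
constexpr std::uint32_t ModuleOffset(std::uint32_t address,
                                     std::uint32_t module_base) {
  return address - module_base;
}

// True if `offset` falls inside a function that starts at `start_offset`
// and is `length` bytes long.
constexpr bool OffsetInFunction(std::uint32_t offset,
                                std::uint32_t start_offset,
                                std::uint32_t length) {
  return offset >= start_offset && offset - start_offset < length;
}

// Values from the comment. kGuessedBase is a guess, not a measured value.
constexpr std::uint32_t kGuessedBase = 0x761a0000;
constexpr std::uint32_t kFunctionStart = 0x761b2ef3;  // CreateRemoteThreadEx
constexpr std::uint32_t kFunctionLength = 0x30a;
constexpr std::uint32_t kCrashOffset = 0xbd7b;  // kernelbase.dll + 0xbd7b

static_assert(ModuleOffset(kFunctionStart, kGuessedBase) == 0x12ef3,
              "function start relative to the module");
static_assert(!OffsetInFunction(kCrashOffset, 0x12ef3, kFunctionLength),
              "the crash offset is outside this copy of the function");
```

Under that guessed base, the function covers offsets 0x12ef3 up to (but not including) 0x131fd, so an offset of 0xbd7b cannot land inside it. That mismatch is why the disassembly must have come from a different build of the module.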

Comment 14 by mark@chromium.org, Mar 31 2017

Dug out the real symbols for that version (https://gist.github.com/sgraham/f074dbe2965c06a1e6374c584829141e#gistcomment-2043070) and the instruction that’s jumping to 0 is

754bbd76 ff1544134b75    call    dword ptr [KERNELBASE!_imp__NtResumeThread (754b1344)]

Obviously, it’s highly unusual for that to be 0.

Comment 15 by mark@chromium.org, Mar 31 2017

Labels: -ReleaseBlock-Stable
But then I remembered that Breakpad’s stackwalker has this bug where it skips over frame 1 if frame 0 has no unwind info and didn’t even get far enough to allocate a stack frame, so I fed go/crash/62798102e0000000 into windbg.

0:000> .ecxr
eax=001ff01c ebx=00000000 ecx=001fefbc edx=76e164f4 esi=00000001 edi=00000004
eip=00000000 esp=001fef50 ebp=001fefe4 iopl=0         nv up ei pl zr na pe nc
cs=001b  ss=0023  ds=0023  es=0023  fs=003b  gs=0000             efl=00010246
00000000 ??              ???
0:000> k
  *** Stack trace for last set context - .thread/.cxr resets it
 # ChildEBP RetAddr  
WARNING: Frame IP not in any known module. Following frames may be wrong.
00 001fef4c 20025c34 0x0
01 001fefe4 751dbd7c 0x20025c34
02 001ff270 76c4281d KERNELBASE!CreateRemoteThreadEx+0x318
*** WARNING: Unable to verify timestamp for chrome.exe
*** ERROR: Module load completed but symbols could not be loaded for chrome.exe
03 001ff298 00b2b48f kernel32!CreateThreadStub+0x20
04 001ff390 00b0e794 chrome+0x5b48f
05 001ff848 00b082f1 chrome+0x3e794
06 001ff8b0 00ad20c8 chrome+0x382f1
07 001ff9c8 00b434c8 chrome+0x20c8
08 001ffa14 76c41174 chrome+0x734c8
09 001ffa20 76e2b3f5 kernel32!BaseThreadInitThunk+0xe
0a 001ffa60 76e2b3c8 ntdll!__RtlUserThreadStart+0x70
0b 001ffa78 00000000 ntdll!_RtlUserThreadStart+0x1b
0:000> !address 0x20025c34
                                     
Mapping file section regions...
Mapping module regions...
Mapping PEB regions...
Mapping TEB and stack regions...
Mapping heap regions...
Mapping page heap regions...
Mapping other regions...
Mapping stack trace database regions...
Mapping activation context regions...

Usage:                  <unknown>
Base Address:           20021000
End Address:            20028000
Region Size:            00007000 (  28.000 kB)
State:                  00001000          MEM_COMMIT
Protect:                00000020          PAGE_EXECUTE_READ
Type:                   00020000          MEM_PRIVATE
Allocation Base:        20020000
Allocation Protect:     00000040          PAGE_EXECUTE_READWRITE


Content source: 1 (target), length: 300
0:000> ub 0x751dbd7c l2
KERNELBASE!CreateRemoteThreadEx+0x30c:
751dbd70 ffb5e4fdffff    push    dword ptr [ebp-21Ch]
751dbd76 ff1544131d75    call    dword ptr [KERNELBASE!_imp__NtResumeThread (751d1344)]
0:000>

So it’s not really calling NtResumeThread (which is just a system call stub) at all. It’s doing something that lands it in this region at 0x20021000, and there’s code there that calls 0.
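The address reasoning above boils down to two small checks, sketched here in C++ with the values from the windbg output. The helper names are illustrative. The 6-byte call length comes from the encoding shown (ff 15 followed by a 4-byte address).

```cpp
#include <cassert>
#include <cstdint>

// A half-open address range [base, end), as printed by !address.
struct Region {
  std::uint32_t base;
  std::uint32_t end;
};

// True if `address` lies inside `region`.
constexpr bool Contains(Region region, std::uint32_t address) {
  return address >= region.base && address < region.end;
}

// Address of a call instruction, given its return address and its length.
constexpr std::uint32_t CallSite(std::uint32_t return_address,
                                 std::uint32_t call_length) {
  return return_address - call_length;
}

constexpr Region kUnknownRegion = {0x20021000, 0x20028000};
constexpr std::uint32_t kFrame0Return = 0x20025c34;  // RetAddr of frame 00
constexpr std::uint32_t kFrame1Return = 0x751dbd7c;  // RetAddr of frame 01
constexpr std::uint32_t kIndirectCallLength = 6;     // ff 15 + imm32

static_assert(Contains(kUnknownRegion, kFrame0Return),
              "the caller of address 0 lives in the unknown region");
static_assert(CallSite(kFrame1Return, kIndirectCallLength) == 0x751dbd76,
              "frame 01 returns to just after the NtResumeThread call");
```

So the call at 0x751dbd76 in CreateRemoteThreadEx is what transferred control into the private, executable region, and the code there is what called address 0.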

All right, I relent. This does sound a whole lot like third-party taint now.

Comment 16 by mark@chromium.org, Mar 31 2017

Oh, right, we capture some interesting memory regions now!

        ; ================ B E G I N N I N G   O F   P R O C E D U R E ================

        ; Variables:
        ;    arg_4: 12
        ;    arg_0: 8
        ;    var_14: -20
        ;    var_1C: -28


             sub_20025bda:
20025bda         push       ebp                                                 ; DATA XREF=sub_20025c4e+55
20025bdb         mov        ebp, esp
20025bdd         add        esp, 0xffffff78
20025be3         pushal
20025be4         cmp        dword [0x2002958e], 0x0
20025beb         je         loc_20025c27

20025bed         cmp        dword [0x20029592], 0x0
20025bf4         je         loc_20025c27

20025bf6         cmp        dword [ebp+arg_0], 0x0
20025bfa         je         loc_20025c27

20025bfc         push       0x0
20025bfe         push       0x1c
20025c00         lea        eax, dword [ebp+var_1C]
20025c03         push       eax
20025c04         push       0x0
20025c06         push       dword [ebp+arg_0]
20025c09         call       0x200248db
20025c0e         or         eax, eax
20025c10         jne        loc_20025c27

20025c12         mov        eax, dword [0x2002959a]
20025c17         cmp        eax, dword [ebp+var_14]
20025c1a         je         loc_20025c27

20025c1c         push       dword [ebp+arg_0]                                   ; argument #2 for method sub_20025b36
20025c1f         push       dword [ebp+var_14]                                  ; argument #1 for method sub_20025b36
20025c22         call       sub_20025b36

             loc_20025c27:
20025c27         popal                                                          ; CODE XREF=sub_20025bda+17, sub_20025bda+26, sub_20025bda+32, sub_20025bda+54, sub_20025bda+64
20025c28         push       dword [ebp+arg_4]
20025c2b         push       dword [ebp+arg_0]
20025c2e         call       dword [0x2002959e]
20025c34         leave
20025c35         ret        0x8
                        ; endp
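One way to read the tail of this routine (from loc_20025c27 on) is that it re-pushes its two arguments and calls through the pointer stored at 0x2002959e. The crash context (eip=0, with return address 0x20025c34, the instruction right after that call) suggests the stored pointer was null. The C++ sketch below models that forwarding pattern. The names, the signature, and the null check are assumptions for illustration, not a reconstruction of the actual third-party code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical signature: two pointer-sized arguments, as suggested by the
// two pushes before the indirect call and the `ret 0x8` at the end.
using TargetFn = std::int32_t (*)(std::uintptr_t arg0, std::uintptr_t arg4);

// Stand-in for the dword at 0x2002959e: the saved pointer to forward to.
struct Hook {
  TargetFn saved_target = nullptr;
};

enum class ForwardResult { kForwarded, kNullTarget };

// The listing calls through the saved pointer unconditionally. This sketch
// checks it first, so a null pointer is reported instead of executed.
inline ForwardResult Forward(const Hook& hook, std::uintptr_t arg0,
                             std::uintptr_t arg4, std::int32_t* status) {
  if (hook.saved_target == nullptr) {
    return ForwardResult::kNullTarget;  // the listing would jump to 0 here
  }
  *status = hook.saved_target(arg0, arg4);
  return ForwardResult::kForwarded;
}
```

With a default-constructed Hook, Forward() returns kNullTarget. That is the situation where the unguarded call in the listing would end up executing address 0.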

Comment 17 by mark@chromium.org, Mar 31 2017

I went looking through dump 62798102e0000000 for interesting strings and found one:

\Sessions\1\Windows\ApiPortection

The misspelling’s a giveaway, so I pulled a handful of dumps matching the crash signature and looked for that string in them. It shows up in all of the dumps I saw from 6.1.7600.16385 (7) and some other versions, intermittently in the dumps from 6.1.7601.17514 (7sp1), and not at all in the two lone dumps from 6.3.9600.18202 (8.1), although those two did have evidence of other third-party taint.

The Internet doesn’t know much about this string, other than that it shows up in a few samples collected by malware analysis places.

I also pulled a few dumps from bug 698471 and found this string in them too (both 7 and 7sp1).

Looks like crapware.
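For anyone repeating the string search, a minimal C++ sketch of scanning raw dump bytes for the marker follows. It checks both the plain ASCII form and a UTF-16LE form, on the assumption that the string may be stored as a wide string. The function names are made up, and reading the dump file into memory is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Widens an ASCII string to UTF-16LE bytes by appending a zero after each
// character.
inline std::string ToUtf16Le(const std::string& ascii) {
  std::string wide;
  for (char c : ascii) {
    wide.push_back(c);
    wide.push_back('\0');
  }
  return wide;
}

// True if `dump` contains `marker` as raw ASCII or as UTF-16LE.
inline bool ContainsMarker(const std::vector<unsigned char>& dump,
                           const std::string& marker) {
  auto has = [&dump](const std::string& needle) {
    if (needle.empty()) return false;
    return std::search(dump.begin(), dump.end(), needle.begin(), needle.end(),
                       [](unsigned char a, char b) {
                         return a == static_cast<unsigned char>(b);
                       }) != dump.end();
  };
  return has(marker) || has(ToUtf16Le(marker));
}
```

Searching for the misspelled "ApiPortection" rather than the whole path keeps the match independent of the session number in \Sessions\1\.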

Comment 18 by mark@chromium.org, Apr 5 2017

Status: WontFix (was: Assigned)
This is attributable to third-party software messing with things, and is not actionable.

Comment 19 by mark@chromium.org, Aug 9 2017

Issue 753760 has been merged into this issue.
