Can't start new chrome child processes |
|||||||||||||||||
Issue descriptionVersion: 55.0.2853.0 (from attached debugger) OS: Windows 10 What steps will reproduce the problem? (1) Sometimes this happens. Not sure how. (2) (3) What is the expected output? Chrome processes can start What do you see instead? Chrome processes don't start. Please use labels and text to provide additional information. sorry this bug report isn't very good.
,
Sep 9 2016
getlasterror appears to be 0x57 ERROR_INVALID_PARAMETER
,
Sep 9 2016
CreateProcessAsUserW inside sandbox::TargetProcess::Create is failing with ERROR_INVALID_PARAMETER
,
Sep 9 2016
,
Sep 9 2016
locals at time of call to CreateProcessAsUserW are:
0:009> dv
this = 0x000001cc`c28c7c40
exe_path = 0x000001cc`bf36d1d0 "C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application\chrome.exe"
command_line = <value unavailable>
inherit_handles = true
startup_info = 0x00000067`0f2fd850
target_info = 0x00000067`0f2fd800
win_error = 0x00000067`0f2fd9c0
flags = 0x8040c
temp_process_info = struct _PROCESS_INFORMATION
process_info = class base::win::ScopedProcessInformation
entry_point = <value unavailable>
context = struct _CONTEXT
cmd_line = unique_ptr 34 '"'
temp_thread = 0x00000000`00000000
this.lockdown_token_ is: 0x00000000`00007d20
cmd_line is: "C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application\chrome.exe" --type=renderer --enable-features="AutofillProfileCleanup<AutofillProfileCleanup,AutomaticTabDiscarding<AutomaticTabDiscarding,ForceDirectWriteCustomFonts<ForceDirectWriteCustomFonts,*MaterialDesignHistory<MD History,MaterialDesignSettings<MaterialDesignSettings,MaterialDesignUserManager<MaterialDesignUserManager,MediaFoundationH264Encoding<MediaFoundationH264Encoding,OverrideYouTubeFlashEmbed<Override YouTube Flash emed,ParseHTMLOnMainThread<ParseHTMLOnMainThread,PassiveEventListenersDueToFling<PassiveEventListenersDueToFling,PersistentHistograms<PersistentHistograms,PointerEvent<PointerEvent,*PreconnectMore<PreconnectMore,SSLPostQuantumExperiment<SSLPostQuantum,*SafeBrowsingV4LocalDatabaseManagerEnabled<SafeBrowsingV4LocalDatabaseManagerEnabled,SecurityWarningIconUpdate<SecurityWarningIconUpdate,SpeculativeLaunchServiceWorker<SpeculativeLaunchServiceWorker,*SubresourceFilter<SubresourceFilter,SyzyasanDeferredFree<SyzyasanDeferredFree,TimerThrottlingForHiddenFrames<TimerThrottlingForHiddenFrames,TranslateUI2016Q2<TranslateUI2016Q2,*UseGoogleLocalNtp<UseGoogleLocalNtp,V8Ignition<V8Ignition,V8_ES2015_TailCalls<V8_ES2015_TailCalls,*WebRTC-H264WithOpenH264FFmpeg<WebRTC-H264WithOpenH264FFmpeg,token-binding<TokenBinding" --disable-features=BlockSmallPluginContent<PluginPowerSaverTiny,DisableFirstRunAutoImport<DisableFirstRunAutoImport,DocumentWriteEvaluator<DisallowFetchForDocWrittenScriptsInMainFrame,ExpectCTReporting<ExpectCTReporting,NetworkPrediction<NetworkPrediction,NetworkTimeServiceQuerying<NetworkTimeQueries,NewAudioRenderingMixingStrategy<NewAudioRenderingMixingStrategy,PassiveDocumentEventListeners<PassiveDocumentEventListeners,UpdateRendererPriorityOnStartup<UpdateRendererPriorityOnStartup --force-fieldtrials="*AppBannerTriggering/site-engagement-conservative/*AutofillFieldMetadata/Enabled/*AutofillProfileCleanup/Enabled_Dogfood/*AutofillSingleClick/Enabled/*AutomaticTabDiscarding/Enabled_Once_10_Dogfood/BackgroundModeRestart/Control/*BrowserScheduler/Default/*CSSExternalScanner/Enabled_ScanOnly/CaptivePortalInterstitial/Enabled/*ChildAccountDetection/Enabled/ChromeSuggestionsReleaseQualification/Default/ChromeSuggestionsTuning/Default/*ClientSideDetectionModel/Model0/DisableFirstRunAutoImport/Control/*DisallowFetchForDocWrittenScriptsInMainFrame/DocumentWriteScriptBlockGroup2/*EnableMediaRouter/Enabled/EnableMediaRouterWithCastExtension/Enabled/ExpectCTReporting/ExpectCTReportingDisabled_Dogfood/*ExtensionContentVerification/Enforce/*ExtensionDeveloperModeWarning/Enabled/*ForceDirectWriteCustomFonts/forceCustomFonts/ForcePluginPowerSaver/Dogfood/*GFE/Default/InstanceID/Enabled/*MD History/Control/MaterialDesignDownloads/Enabled_Dogfood/*MaterialDesignSettings/Enabled_Dogfood/MaterialDesignUserManager/Enabled/*MediaFoundationH264Encoding/Enabled/*NetworkPrediction/Disabled/*NetworkQualityEstimator/Enabled/*NetworkTimeQueries/NetworkTimeQueriesDisabled/NewAudioRenderingMixingStrategy/Disabled/*OfferUploadCreditCards/Enabled/*OmniboxBundledExperimentV1/DevDogfoodControlR6_Dogfood/*Override YouTube Flash emed/YouTubeFlashRewrite/*ParseHTMLOnMainThread/Enabled_25/*PassiveDocumentEventListeners/Control/*PassiveEventListenersDueToFling/Enabled/PasswordBranding/SmartLockBrandingSavePromptOnly/*PasswordGeneration/Enabled/*PersistentHistograms/EnabledInMemory/*PluginPowerSaverTiny/Control/*PointerEvent/Enabled/*QUIC/Enabled/ReportCertificateErrors/ShowAndPossiblySend/SHA1IdentityUIWarning/Enabled/SHA1ToolbarUIJanuary2016/Warning/SHA1ToolbarUIJanuary2017/Error/*SRTPromptFieldTrial/MonthlyPrompt/*SSLBufferSizeRecv/17408 (Default)/*SSLBufferSizeSend/17408 (Control)/SSLCommonNameMismatchHandling/Control/*SSLPostQuantum/enabled/*SafeBrowsingIncidentReportingService/Enabled/SafeBrowsingUnverifiedDownloads/DisableByParameterMostSbTypes2/*SafeBrowsingUpdateFrequency/Default/*SafeBrowsingV4LocalDatabaseManagerEnabled/Default/SecurityWarningIconUpdate/Enabled/SignInPasswordPromo/Default/*SiteEngagement/Default/SiteIsolationExtensions/Control/*SpeculativeLaunchServiceWorker/Enabled/StackProfiling/Report profiles/*StrictSecureCookies/Enabled/*SubresourceFilter/Disabled/SyncHttpContentCompression/Disabled/SyzyasanDeferredFree/Enabled/*TimerThrottlingForHiddenFrames/Control/*TokenBinding/TokenBinding_Dogfood/*TranslateUI2016Q2/EnabledWithoutParams/*TriggeredResetFieldTrial/On/*UMA-Population-Restrict/dogfood/*UMA-Uniformity-Trial-1-Percent/group_95/*UMA-Uniformity-Trial-10-Percent/group_06/*UMA-Uniformity-Trial-100-Percent/group_01/*UMA-Uniformity-Trial-20-Percent/group_03/*UMA-Uniformity-Trial-5-Percent/group_06/*UMA-Uniformity-Trial-50-Percent/default/*V8CacheStrategiesForCacheStorage/normal/*V8Ignition/Ignition_Staging/*V8_ES2015_TailCalls/Enabled/WebBluetoothBlacklist/BlacklistUpdate1/WebRTC-DisableCpuAdaptation/CpuAdaptationOn/*WebRTC-StunInterPacketDelay/5/WebRTC-SupportVP9SVC/Default/" --primordial-pipe-token=CB0D554754CE3609796675E21D0F9937 --lang=en-US --extension-process --enable-webrtc-hw-h264-encoding --enable-offline-auto-reload --enable-offline-auto-reload-visible-only --blink-settings=disallowFetchForDocWrittenScriptsInMainFrame=false,disallowFetchForDocWrittenScriptsInMainFrameOnSlowConnections=true,parseHTMLOnMainThreadCoalesceChunks=false,parseHTMLOnMainThreadSyncTokenize=false,cssExternalScannerNoPreload=true,cssExternalScannerPreload=false --enable-pinch --device-scale-factor=1 --num-raster-threads=4 --enable-main-frame-before-activation --content-image-texture-target=0,0,3553;0,1,3553;0,2,3553;0,3,3553;0,4,3553;0,5,3553;0,6,3553;0,7,3553;0,8,3553;0,9,3553;0,10,3553;0,11,3553;0,12,3553;0,13,3553;0,14,3553;1,0,3553;1,1,3553;1,2,3553;1,3,3553;1,4,3553;1,5,3553;1,6,3553;1,7,3553;1,8,3553;1,9,3553;1,10,3553;1,11,3553;1,12,3553;1,13,3553;1,14,3553;2,0,3553;2,1,3553;2,2,3553;2,3,3553;2,4,3553;2,5,3553;2,6,3553;2,7,3553;2,8,3553;2,9,3553;2,10,3553;2,11,3553;2,12,3553;2,13,3553;2,14,3553;3,0,3553;3,1,3553;3,2,3553;3,3,3553;3,4,3553;3,5,3553;3,6,3553;3,7,3553;3,8,3553;3,9,3553;3,10,3553;3,11,3553;3,12,3553;3,13,3553;3,14,3553 --mojo-application-channel-token=CB0D554754CE3609796675E21D0F9937 --channel="23260.1625.1101331456\507922842" --mojo-platform-channel-handle=27968 /prefetch:1
I'm not sure why this call would be failing all of a sudden.
,
Sep 9 2016
https://uma.googleplex.com/p/chrome/timeline_v2?sid=abe9c9c825c22d010076c2e0ce7fa1b5 shows SBOX_ERROR_CREATE_PROCESS starting to rise maybe around 54.0.2815.0 maybe?
,
Sep 9 2016
Process.Sandbox.Launch.Error for 0x57 ERROR_INVALID_PARAMETER seems to also spike around the same time, maybe, if you squint. https://uma.googleplex.com/p/chrome/timeline_v2?sid=04d3bf37ca080858b648ecbbe2b66ecc https://codereview.chromium.org/1854323002 is in this revision range, assuming something around 2813 - 2818 maybe. At some point I'm going to have to restart Chrome.
,
Sep 9 2016
FWIW the extension point changes landed in original (browser/renderer) - df585142df9fd58299ef02b64da3381d8b733cb8 - 54.0.2817.0 revert - 406b433e8b3468b8c252a552119c0d65c7be874f - 54.0.2824.0 just renderer - 3087cbe87c1d2cd8a43c55d8ed4d7bbe2d9d9609 - 54.0.2824.0 https://uma.googleplex.com/p/chrome/timeline_v2?sid=271adf8680f87932219e31ba47c5c12e gives a "clearer" regression range possibility of 2813 being the first time seen, so unlikely to be the extension point changes. https://chromium.googlesource.com/chromium/src/+log/54.0.2812.0..54.0.2813.0?pretty=fuller&n=10000 the only change that stands out there is the gn switch https://codereview.chromium.org/2192303002 but I'm not sure how that could have changed these codes unless something subtle changed in the binary. of course this could be something totally random and windowsy, perhaps forshaw@ has some ideas before I give up and restart my browser.
,
Sep 9 2016
If you've still got it running trying to find out which part is failing will be useful. Could you stick breakpoints on: ntdll!NtCreateUserProcess ntdll!RtlCreateProcessParametersEx kernelbase!BasepConvertWin32AttributeList and see what the status code returned in rax is. If not all the places are reached it's something in between. If they all succeed then it's something to do with CSRSS registration or similar which is causing the problem. Of course if it's in the kernel we'll need to have a further poke around. Also can you use !token on the token handle to check what token we're passing in?
,
Sep 9 2016
All returning 0 except ntdll!NtCreateUserProcess which returns c000000d (STATUS_INVALID_PARAMETER)
!token on the lockdown_token handle gives
0:009> !token 0x6110
TS Session ID: 0x1
User: S-1-5-21-39260824-743453154-142223018-554139
User Groups:
00 S-1-5-21-39260824-743453154-142223018-513
Attributes - DenyOnly
01 S-1-1-0
Attributes - DenyOnly
02 S-1-5-21-88736187-1791044414-3698198224-1000
Attributes - DenyOnly
03 S-1-5-32-544
Attributes - DenyOnly
04 S-1-5-32-559
Attributes - DenyOnly
05 S-1-5-32-555
Attributes - DenyOnly
06 S-1-5-32-545
Attributes - DenyOnly
07 S-1-5-4
Attributes - DenyOnly
08 S-1-2-1
Attributes - DenyOnly
09 S-1-5-11
Attributes - DenyOnly
10 S-1-5-15
Attributes - DenyOnly
11 S-1-5-5-0-1265597
Attributes - Mandatory Default Enabled LogonId
12 S-1-2-0
Attributes - DenyOnly
13 S-1-5-21-39260824-743453154-142223018-472178
Attributes - DenyOnly
14 S-1-5-21-39260824-743453154-142223018-485690
Attributes - DenyOnly
15 S-1-5-21-39260824-743453154-142223018-429271
Attributes - DenyOnly
16 S-1-5-21-39260824-743453154-142223018-24369
Attributes - DenyOnly
17 S-1-5-21-39260824-743453154-142223018-387277
Attributes - DenyOnly
18 S-1-5-21-39260824-743453154-142223018-143643
Attributes - DenyOnly
19 S-1-5-21-39260824-743453154-142223018-180283
Attributes - DenyOnly
20 S-1-5-21-39260824-743453154-142223018-451739
Attributes - DenyOnly
21 S-1-5-21-39260824-743453154-142223018-469382
Attributes - DenyOnly
22 S-1-18-1
Attributes - DenyOnly
23 S-1-16-4096
Attributes - GroupIntegrity GroupIntegrityEnabled
Primary Group: S-1-5-21-39260824-743453154-142223018-513
Privs:
Auth ID: 0:1359f1
Impersonation Level: Anonymous
TokenType: Primary
Is restricted token: yes.
Restricted SIDs:
00 S-1-0-0
Attributes - Mandatory Default Enabled
SandBoxInert: 0
Elevation Type: 3 (Limited)
Mandatory Policy: TOKEN_MANDATORY_POLICY_VALID_MASK
Integrity Level: S-1-16-4096
Process Trust Level: (null)
Token Virtualized: Disabled
UIAccess: 0
IsAppContainer: 0
Security Attributes Information:
00 Attribute Name: TSA://ProcUnique
Value Type : TOKEN_SECURITY_ATTRIBUTE_TYPE_UINT64
Value[0] : 1429
Value[1] : 85108765159
Device Groups:
,
Sep 9 2016
I think I did something different today that perhaps might have caused it, but I haven't retested it. 1. install chrome 2. wait for chrome to update 3. while old chrome is still running, launch new_chrome.exe with a new --user-data-dir so you get a new chrome. 5. old chrome is now broken?
,
Sep 9 2016
when I say today I mean yesterday - I definitely manually ran up a new chrome canary with a new user data dir to test another bug (that I needed a clean profile for) and I remember thinking at the time I had to run new_chrome.exe since chrome.exe wasn't there. Perhaps I did also run chrome.exe before new_chrome.exe but it was run with --user-data-dir to a new empty profile directory.
,
Sep 9 2016
It's unexpected that there be new_chrome.exe but not chrome.exe. +fdoray who has been working on update things recently. Do you see any errors in %temp%\chrome_installer.log?
,
Sep 9 2016
I can reliably repro this by following the steps in #11. Running new_chrome.exe manually with a new --user-data-dir while an existing chrome.exe is running causes the existing one to die in a bad way - it seems the chrome.exe from the existing one (4,002,120 bytes) vanishes entirely, so it's not surprising the old one can't start new processes any more.
C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application>dir
Volume in drive C has no label.
Volume Serial Number is 2CD5-AC93
Directory of C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application
09/09/2016 11:26 <DIR> .
09/09/2016 11:26 <DIR> ..
07/09/2016 01:55 <DIR> 55.0.2853.0
09/09/2016 11:26 <DIR> 55.0.2854.2
07/09/2016 01:20 4,002,120 chrome.exe
09/09/2016 11:26 422 chrome.VisualElementsManifest.xml
08/09/2016 18:27 96,500 debug.log
01/06/2016 11:00 <DIR> Dictionaries
08/09/2016 19:47 3,996,488 new_chrome.exe
09/09/2016 11:54 <DIR> SetupMetrics
4 File(s) 8,095,530 bytes
6 Dir(s) 250,339,467,264 bytes free
C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application>new_chrome.exe --user-data-dir=d:\src\temp6
C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application>dir
Volume in drive C has no label.
Volume Serial Number is 2CD5-AC93
Directory of C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application
09/09/2016 13:15 <DIR> .
09/09/2016 13:15 <DIR> ..
07/09/2016 01:55 <DIR> 55.0.2853.0
09/09/2016 11:26 <DIR> 55.0.2854.2
08/09/2016 19:47 3,996,488 chrome.exe
09/09/2016 11:26 422 chrome.VisualElementsManifest.xml
08/09/2016 18:27 96,500 debug.log
01/06/2016 11:00 <DIR> Dictionaries
09/09/2016 11:54 <DIR> SetupMetrics
3 File(s) 4,093,410 bytes
6 Dir(s) 250,337,746,944 bytes free
C:\Users\wfh\AppData\Local\Google\Chrome SxS\Application>
,
Sep 10 2016
Awesome. Thanks for the repro steps. I'll take a look to see what needs to happen.
,
Sep 11 2016
Explanation of why the steps described in comment #11 lead to a broken Chrome: 1. install chrome 2. wait for chrome to update - Creates new_chrome.exe 3. while old chrome is still running, launch new_chrome.exe with a new --user-data-dir so you get a new chrome. - new_chrome.exe launches a --rename-chrome-exe process - The --rename-chrome-exe process runs RenameChromeExecutables which: -- Renames chrome.exe to old_chrome.exe -- Renames new_chrome.exe to chrome.exe -- Tries to delete old_chrome.exe, or, it that fails (which is the case here), moves it to a delete-on-reboot directory. 5. old chrome is now broken? - When the browser process launches a child process, it uses the executable path with which it was itself launched https://cs.chromium.org/chromium/src/content/common/content_paths.cc?l=16. - That means that creating a new tab in the old chrome creates a child process using a path that now refers to the new executable. That isn't supported... but I would have expected a crash or a browser malfunction rather than an ERROR_INVALID_PARAMETER in CreateProcessAsUserW. Before launching a --rename-chrome-exe process, Chrome uses IsBrowserAlreadyRunning() to verify that it isn't already running. Unfortunately, that function supposes that the executable path is always the same (but that isn't the case with chrome.exe and new_chrome.exe). Note that none of this code is new. I can't explain the spike in Process.Sandbox.Launch.Error for 0x57 ERROR_INVALID_PARAMETER.
,
Sep 12 2016
Thanks for taking a look, fdoray. I can think of a few options off the top of my head. Maybe there are better ones down below somewhere: - Change IsBrowserAlreadyRunning so that it uses the path the installation directory rather than to the binary itself. This would suppress the rename dance and also make the "close yr browserz" dialog appear on uninstall if new_chrome.exe was running. - Change browser startup so that if anything other than chrome.exe is launched, the proc relaunches itself as chrome.exe. This would prevent new_chrome.exe from spinning up a browser process (which I don't think we should support). Any other ideas?
,
Sep 12 2016
I agree that we shouldn't support running new_chrome.exe. Some code depends on the chrome.exe name: - https://cs.chromium.org/chromium/src/chrome/browser/lifetime/application_lifetime.cc?l=204 - https://cs.chromium.org/chromium/src/chrome/browser/safe_browsing/incident_reporting/binary_integrity_analyzer_win.cc?l=27
,
Sep 12 2016
Glad we've gotten to the bottom of this. I agree that running new_chrome.exe manually is something we shouldn't support. However, is there a chance that Windows could dynamically update shortcuts or do something funky that could cause new_chrome.exe to be run by accident? Perhaps aborting really early on here would be best.
,
Sep 12 2016
Some developers want to launch Chrome with a different executable name https://groups.google.com/a/chromium.org/d/msg/chromium-dev/eJcSxsAvhTg/i6idCYYbCQAJ . I still don't think we should support that. I planned to add the code the relaunch with chrome.exe in ChromeBrowserMainParts::PreCreateThreadsImpl. whf@: Do you believe that doing it earlier (perhaps in main() https://cs.chromium.org/chromium/src/chrome/app/chrome_exe_main_win.cc?l=213) could prevent Windows from dynamically updating shortcuts? In *theory* the install directory always contains chrome.exe [1]. Because of that, I believe it's ok to launch chrome.exe (without checking if it exists) and exit the current process when the current executable's name isn't chrome.exe. [1] Except for a brief moment when RenameChromeExecutables() runs https://cs.chromium.org/chromium/src/chrome/installer/setup/setup_main.cc?l=396
,
Sep 12 2016
Windows will only dynamically update shortcuts if the chrome.exe filename doesn't exist, but as you say I think it always does except [1] in your message. I don't know if launching chrome.exe if the current executable name is not chrome.exe, is the right thing to do here since I'm not familiar with this code. grt@ might have a comment on this.
,
Sep 13 2016
I think Windows does its shortcut fixups when the shortcut is resolved; i.e., when a user double-clicks it. I think we've already lost if chrome.exe is missing and Windows has launched some other foo_chrome.exe. Maybe we should take a conservative stance and only relaunch when new_chrome.exe or old_chrome.exe is run. This may not break devs who expect to rename chrome.exe. What do you think?
,
Sep 13 2016
See also issue 371087 and issue 495620 .
,
Sep 13 2016
I'm not sure if it's appropriate in this case but if it's mainly down to shell link tracking weirdness then changing to use the ReplaceFile API should sort that out (in theory). That function updates the tracking service so that the new file is determined to be the target file. You can even specify a backup filename so could pass chrome.exe as target, new_chrome.exe and replacement file and old_chrome.exe as backup. Obviously if old_chrome.exe is just trying to spawn chrome.exe we've got a problem, we could eliminate that by manually querying for actual process name (not the one it necessarily started with) which should be reflected in the renamed file but that might not work in the general case.
,
Oct 22 2016
As a datapoint, I've managed to hit this issue
_as long as I don't set breakpoints on CreateProcess and friends_ in the debugger.
0:013>
eax=00000001 ebx=00000000 ecx=0008040c edx=0000040c esi=41981038 edi=3531fc78
eip=00ce2044 esp=0bebe934 ebp=0bebe9a4 iopl=0 nv up ei pl zr na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000246
chrome!sandbox::TargetProcess::Create+0x95:
00ce2044 ff157c10d300 call dword ptr [chrome!_imp__CreateProcessAsUserW (00d3107c)] ds:002b:00d3107c={ADVAPI32!CreateProcessAsUserWStub (770a2c20)}
0:013> dv
this = 0x3531fc78
exe_path = 0x3d2c7c40 "C:\Users\robliao\AppData\Local\Google\Chrome SxS\Application\chrome.exe"
command_line = 0x0008040c "--- memory read error at address 0x0008040c ---"
inherit_handles = true
startup_info = 0x0bebe9e0
target_info = 0x0bebea28
win_error = 0x0bebeb58
flags = 0x8040c
temp_process_info = struct _PROCESS_INFORMATION
process_info = class base::win::ScopedProcessInformation
cmd_line = class std::unique_ptr<wchar_t,base::FreeDeleter>
temp_thread = <value unavailable>
0:013> !gle
LastErrorValue: (Win32) 0x57 (87) - The parameter is incorrect.
LastStatusValue: (NTSTATUS) 0xc000000d - An invalid parameter was passed to a service or function.
Once I set breakpoints, CreateProcess succeeds (and I see the new process), but it's in a very strange state where
1:035> x chrome_child!g_target_services
6978c714 chrome_child!g_target_services = 0x00000000
Nothing obvious shows up in ProcMon either except that Windows does do a lot of the motions of attempting to start the process.
1) Probes to see the file is there (Success, File Locked With Only Readers)
2) Checks HKLM\Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options\chrome.exe\[UseFilter|Debugger|UseLargePages|NodeOptions\ForceWakeChange|PerfOptions]
This is a standard check to prep the process for launch and do things like redirect DLLs via .local and launching with debuggers.
This means at this point, CreateProcessAsUser thinks this has a good chance of succeeding. We fail after this point.
,
Oct 22 2016
Oh, and I'm running Chrome 56.0.2897.1 32-bit SyzyASAN.
,
Dec 21 2016
,
Feb 7 2017
Got a kernel debugger on this and got some more data (until the kernel debugger hung :-( ). The good news is the kd verified what I saw in ProcMon, where Image File Execution options are read. The kd hung before I could get to the failure case (requiring a system restart and losing the repro). I've attached a log of the internals of CreateProcess.
,
Feb 7 2017
Current Status:
The kernel debugger suggests we're failing duplicating the handle table during create process:
17: kd> r
rax=000000003070ff9f rbx=0000000000004cd8 rcx=fffffffff64ab000
rdx=ffffe0005152fc00 rsi=ffffc0012652c360 rdi=0000000000004c00
rip=fffff802a6bf7303 rsp=ffffd0002251ba00 rbp=00000000000000d8
r8=ffffc001264c6001 r9=ffffc0011c9d7000 r10=0000000000000002
r11=ffffc0011b4ad040 r12=ffffc0012652c358 r13=0000000000000003
r14=0000000000000036 r15=ffffc0011c9d7360
iopl=0 nv up ei pl nz na po nc
cs=0010 ss=0018 ds=002b es=002b fs=0053 gs=002b efl=00000206
nt! ?? ::NNGAKEGL::`string'+0x200e3:
fffff802`a6bf7303 41bf0d0000c0 mov r15d,0C000000Dh
STATUS_INVALID_PARAMETER is set here at the ready. Then we end up returning after a few loops.
Stack:
17: kd> kn200
# Child-SP RetAddr Call Site
00 ffffd000`2251ba00 fffff802`a6af0a87 nt! ?? ::NNGAKEGL::`string'+0x200e3
nt!ExDupHandleTable (likely)
01 ffffd000`2251bab0 fffff802`a6a49656 nt!ObInitProcess+0x53
02 ffffd000`2251bb10 fffff802`a6aeeaec nt!PspAllocateProcess+0x88a
03 ffffd000`2251bee0 fffff802`a67cd1a3 nt!NtCreateUserProcess+0x620
04 ffffd000`2251ca90 00007ff9`04ec6824 nt!KiSystemServiceCopyEnd+0x13
05 0000004a`e12fded8 00007ff9`0206e541 ntdll!NtCreateUserProcess+0x14
06 0000004a`e12fdee0 00007ff9`0206c873 KERNELBASE!CreateProcessInternalW+0x14e1
07 0000004a`e12fe950 00007ff9`03ce7f6f KERNELBASE!CreateProcessAsUserW+0x63
08 0000004a`e12fe9c0 00007ff7`a6347070 ADVAPI32!CreateProcessAsUserWStub+0x5f
09 0000004a`e12fea30 00007ff7`a6341963 chrome!sandbox::TargetProcess::Create+0xe0 [c:\b\build\slave\win64-pgo\build\src\sandbox\win\src\target_process.cc @ 112]
0a 0000004a`e12feb10 00007ff8`d719d3c7 chrome!sandbox::BrokerServicesBase::SpawnTarget+0x4c3 [c:\b\build\slave\win64-pgo\build\src\sandbox\win\src\broker_services.cc @ 417]
0b 0000004a`e12fed60 00007ff8`d743d00b chrome_7ff8d70c0000!content::StartSandboxedProcess+0x743 [c:\b\build\slave\win64-pgo\build\src\content\common\sandbox_win.cc @ 811]
0c 0000004a`e12ff0e0 000001d2`e90f2c30 chrome_7ff8d70c0000!content::internal::ChildProcessLauncherHelper::LaunchProcessOnLauncherThread+0x257 [c:\b\build\slave\win64-pgo\build\src\content\browser\child_process_launcher_helper_win.cc @ 79]
,
Feb 8 2017
Some more notes after pondering:
We seem to always fail at about the same place in the ExDupHandleTable.
Optimistically, if I interpret the register as a handle value (seems suggestive of that), we fail here:
0:038> !handle 4cd8
Handle 4cd8
Type File
0:038> !handle 4cd8 f
Handle 4cd8
Type File
Attributes 0
GrantedAccess 0x120089:
ReadControl,Synch
Read/List,ReadEA,ReadAttr
HandleCount 2
PointerCount 65378
No Object Specific Information available
,
Feb 8 2017
Can you grab a copy of Process Hacker (http://processhacker.sourceforge.net/), double click the chrome process and look at the handle table when this occurs. Specially it should a) tell you the path for the file (which I'm betting will be a named pipe anyway) and b) if you right click the columns you can add the Attributes column. This column will show whether the Inherit flag has been set. If the Inherit flag is not set that's a clear indication of a problem.
,
Feb 8 2017
4cd8 is... C:\Windows\Fonts\seguiemj.ttf The inherit flag is _not_ set. Very strange times indeed.
,
Feb 8 2017
I'm assuming this is from the still broken instance of Chrome you had before? If that's the case then at least we can be reasonably confident of the _why_ the create process call is failing. Now the problem is understanding _how_ we end up with a completely bogus handle. Most likely reason is we have a reference counting problem somewhere which causes the original handle to be inherited to be closed and something else comes along later and reallocated the same number. This could take a while to track down, do we have handle verification on the browser process? If not might be worth turning it on and seeing if we get any double handle frees.
,
Feb 8 2017
Yep. This is still from the broken browser process.
Some more stats from Process Hacker:
References 27770
Handles 1
That's a bit different from what !handle is currently reporting:
0:045> !handle 4cd8 f
Handle 4cd8
Type File
Attributes 0
GrantedAccess 0x120089:
ReadControl,Synch
Read/List,ReadEA,ReadAttr
HandleCount 2
PointerCount 60518
No Object Specific Information available
,
Feb 8 2017
Yep! That was the bad handle! I closed the handle from Process Hacker and everyone works again. As you pointed out, now we have to figure out why this was bad in the first place.
,
Feb 8 2017
s/everyone/everything/
,
Feb 8 2017
Next steps: Going to see if AppVerifier can pick up anything for us.
,
Feb 8 2017
Assigning to myself as I'm actively investigating at the moment.
,
Mar 8 2017
xref to consumer/enterprise escalated bug 698307 which sounds similar (no child processes launching https://bugs.chromium.org/p/chromium/issues/detail?id=698307#c3, --no-sandbox avoids problem.
,
Mar 10 2017
This has been happening more frequently as of late. A few people in the office have seen it resolve itself after shutting down a bunch of tabs and waiting a bit (+siggi)
,
Mar 10 2017
,
Mar 10 2017
Yups, I just saw this. Fired up Sawbuck and found error logs like this: ERROR 8900 8924 11:12:20-931 0 Failed to launch child process After a while it resolved itself, here's a copy of the log messages from last failure to launch a process, to the point where it's resolved: VERBOSE 8900 9192 00:00:01-437 0 To buffer: http://chromium-status.appspot.com/current?format=raw VERBOSE 8900 9576 00:00:02-067 0 Downloading types Bookmarks, Preferences, Passwords, Autofill Profiles, Autofill, Autofill Wallet, Autofill Wallet Metadata, Themes, Typed URLs, Extensions, Search Engines, Sessions, Apps, App settings, Extension settings, History Delete Directives, Dictionary, Favicon Images, Favicon Tracking, Device Info, Priority Preferences, Managed Users, Managed User Shared Settings, Encryption Keys, Experiments VERBOSE 8900 9192 00:00:02-467 0 HttpBridge received response code: 200 VERBOSE 8900 9576 00:00:02-467 0 Committing from types Bookmarks, Preferences, Passwords, Autofill Profiles, Autofill, Autofill Wallet, Autofill Wallet Metadata, Themes, Typed URLs, Extensions, Search Engines, Sessions, Apps, App settings, Extension settings, History Delete Directives, Dictionary, Favicon Images, Favicon Tracking, Device Info, Priority Preferences, Managed Users, Managed User Shared Settings, Encryption Keys, Experiments VERBOSE 8900 9192 00:00:02-557 0 HttpBridge received response code: 200 VERBOSE 8900 9192 00:00:14-968 0 To buffer: https://play.google.com/log?format=json&u=0&authuser=0 VERBOSE 8900 9192 00:00:31-439 0 To buffer: http://chromium-status.appspot.com/current?format=raw VERBOSE 8900 9192 00:00:48-721 0 Scheduling upload for between 60 and 300 seconds from now. VERBOSE 8900 9192 00:00:55-803 0 Starting upload to collector 0. VERBOSE 8900 9192 00:00:55-803 0 Uploading report to https://beacons.gvt2.com/domainreliability/upload VERBOSE 8900 9192 00:00:55-902 0 Upload finished with net error 0, response code 200, retry after 0s VERBOSE 8900 9192 00:00:55-902 0 Upload to collector 0 succeeded. VERBOSE 8900 9192 00:01:01-441 0 To buffer: http://chromium-status.appspot.com/current?format=raw VERBOSE 8900 9192 00:01:37-216 0 To buffer: http://chromium-status.appspot.com/current?format=raw VERBOSE 8900 8924 00:01:40-426 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:01:40-426 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. INFORMATION 8900 8924 00:01:43-358 0 "SW registered", source: https://www.google.ca/_/chrome/newtab?espv=2&ie=UTF-8 (6) VERBOSE 8900 8924 00:01:43-446 0 Canonicalized siggi@google.com to siggi@google.com VERBOSE 8900 8924 00:01:43-446 0 Canonicalized siggi@google.com to siggi@google.com VERBOSE 8900 9192 00:01:44-075 0 Scheduling upload for between 60 and 300 seconds from now. VERBOSE 8900 9192 00:01:44-706 0 To buffer: https://play.google.com/log?format=json&authuser=0 VERBOSE 8900 9192 00:01:49-117 0 To buffer: https://play.google.com/log?format=json VERBOSE 8900 8924 00:01:53-542 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:01:53-542 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 9192 00:02:01-454 0 To buffer: http://chromium-status.appspot.com/current?format=raw VERBOSE 8900 9192 00:02:01-455 0 Starting upload to collector 0. VERBOSE 8900 9192 00:02:01-455 0 Uploading report to https://beacons.gvt2.com/domainreliability/upload VERBOSE 8900 9192 00:02:01-549 0 Upload finished with net error 0, response code 200, retry after 0s VERBOSE 8900 9192 00:02:01-549 0 Upload to collector 0 succeeded. VERBOSE 8900 8924 00:02:03-564 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:02:03-564 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:02:07-036 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:02:07-036 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:02:08-183 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. VERBOSE 8900 8924 00:02:08-183 0 CrossDevicePromo::MaybeBrowsingSessionStarted; Ineligible for promo. INFORMATION 8900 8924 00:02:08-276 0 "SW registered", source: https://www.google.ca/_/chrome/newtab?espv=2&ie=UTF-8 (6) VERBOSE 8900 8924 00:02:08-340 0 Canonicalized siggi@google.com to siggi@google.com VERBOSE 8900 8924 00:02:08-340 0 Canonicalized siggi@google.com to siggi@google.com VERBOSE 8900 9192 00:02:09-080 0 Scheduling upload for between 60 and 300 seconds from now. VERBOSE 8900 9192 00:02:09-661 0 To buffer: https://play.google.com/log?format=json&authuser=0 VERBOSE 8900 9192 00:02:10-820 0 Scheduling upload for between 60 and 300 seconds from now. VERBOSE 8900 9576 00:02:11-101 0 Committing from types Bookmarks, Preferences, Passwords, Autofill Profiles, Autofill, Autofill Wallet, Autofill Wallet Metadata, Themes, Typed URLs, Extensions, Search Engines, Sessions, Apps, App settings, Extension settings, History Delete Directives, Dictionary, Favicon Images, Favicon Tracking, Device Info, Priority Preferences, Managed Users, Managed User Shared Settings, Encryption Keys, Experiments VERBOSE 8900 9192 00:02:11-388 0 HttpBridge received response code: 200 VERBOSE 8900 9192 00:02:13-367 0 Scheduling upload for between 59 and 299 seconds from now. VERBOSE 8900 9192 00:02:14-196 0 To buffer: https://googleads.g.doubleclick.net/pagead/id VERBOSE 8900 9192 00:02:14-306 0 To buffer: https://fonts.gstatic.com/s/roboto/v15/CWB0XYA8bzo0kSThX0UTuA.woff2 VERBOSE 8900 9192 00:02:14-496 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=B5C08FAF33465808ABFAF56D6B807105077462BA.81D365841A06DBBCC0BD564BF47C0B33275EE9AB&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=243&key=yt6&mime=video%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124790570795&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=1572274&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=29.988&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=0-134289&rn=0&rbuf=0 VERBOSE 8900 9192 00:02:14-497 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=3D948E752590007AE7A533F17B9511975B866189.28BD91D2E4A51CCB6D9D38A992DE441D31E39958&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=251&key=yt6&mime=audio%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124403610268&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=476348&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=30.041&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=0-76749&rn=1&rbuf=0 VERBOSE 8900 9192 00:02:14-602 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=B5C08FAF33465808ABFAF56D6B807105077462BA.81D365841A06DBBCC0BD564BF47C0B33275EE9AB&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=243&key=yt6&mime=video%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124790570795&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=1572274&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=29.988&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=134290-247017&rn=2&rbuf=2899 VERBOSE 8900 9192 00:02:14-621 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=3D948E752590007AE7A533F17B9511975B866189.28BD91D2E4A51CCB6D9D38A992DE441D31E39958&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=251&key=yt6&mime=audio%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124403610268&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=476348&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=30.041&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=76750-258157&rn=3&rbuf=4727 VERBOSE 8900 9192 00:02:14-644 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=B5C08FAF33465808ABFAF56D6B807105077462BA.81D365841A06DBBCC0BD564BF47C0B33275EE9AB&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=243&key=yt6&mime=video%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124790570795&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=1572274&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=29.988&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=247018-518226&rn=4&rbuf=5338 VERBOSE 8900 9192 00:02:15-857 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=B5C08FAF33465808ABFAF56D6B807105077462BA.81D365841A06DBBCC0BD564BF47C0B33275EE9AB&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=243&key=yt6&mime=video%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124790570795&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=1572274&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=29.988&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=518227-1174039&rn=5&rbuf=9767 VERBOSE 8900 9192 00:02:15-881 0 Scheduling upload for between 60 and 300 seconds from now. VERBOSE 8900 9192 00:02:17-895 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=3D948E752590007AE7A533F17B9511975B866189.28BD91D2E4A51CCB6D9D38A992DE441D31E39958&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=251&key=yt6&mime=audio%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124403610268&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=476348&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=30.041&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=258158-476347&rn=6&rbuf=13099 VERBOSE 8900 9192 00:02:21-395 0 To buffer: https://r13---sn-tt17dnee.googlevideo.com/videoplayback?signature=B5C08FAF33465808ABFAF56D6B807105077462BA.81D365841A06DBBCC0BD564BF47C0B33275EE9AB&requiressl=yes&sparams=clen%2Cctier%2Cdur%2Cgir%2Cid%2Cinitcwndbps%2Cip%2Cipbits%2Citag%2Clmt%2Cmime%2Cmm%2Cmn%2Cms%2Cmv%2Cpl%2Crequiressl%2Csource%2Cupn%2Cexpire&itag=243&key=yt6&mime=video%2Fwebm&pl=48&ctier=L&source=youtube&gir=yes&upn=9VRUC8A3WfE&expire=1489184188&lmt=1489124790570795&id=o-AKQ1igC_ieTjMKoReV23HxrAoDALEsCD_Xx5ukT3Td16&mm=31&ipbits=0&mn=sn-tt17dnee&initcwndbps=9302500&clen=1572274&ip=2620%3A0%3A100c%3A0%3A1c77%3A8670%3A97b5%3A23f6&ms=au&mt=1489162474&dur=29.988&mv=m&alr=yes&keepalive=yes&ratebypass=yes&cpn=T9-mLeZmnXdFvE4V&c=WEB&cver=1.20170309&range=1174040-1572273&rn=7&rbuf=14687 VERBOSE 8900 9576 00:02:23-521 0 Committing from types Bookmarks, Preferences, Passwords, Autofill Profiles, Autofill, Autofill Wallet, Autofill Wallet Metadata, Themes, Typed URLs, Extensions, Search Engines, Sessions, Apps, App settings, Extension settings, History Delete Directives, Dictionary, Favicon Images, Favicon Tracking, Device Info, Priority Preferences, Managed Users, Managed User Shared Settings, Encryption Keys, Experiments VERBOSE 8900 9192 00:02:23-633 0 HttpBridge received response code: 200 VERBOSE 8900 9192 00:02:24-654 0 To buffer: https://youtubei.youtube.com/youtubei/v1/log_interaction?alt=json&key=AIzaSyAO_FJ2SlqU8Q4STEHLGCilw_Y9_11qcW8
,
Mar 21 2017
,
Mar 21 2017
I tried ETW with the instructions from this web page: http://geekswithblogs.net/akraus1/archive/2016/03/14/173308.aspx. This works like a charm for short sessions, but I suspect the data volume will be ugly for catching a once-every-two-weeks event. The stack traces I get are nice and deep, though, so this looks to be a better data source than e.g. !htrace, which captures fairly shallow stack traces.
,
Mar 21 2017
Okay, I did some spelunking through UMA data to see when this starts on appearing. It starts on Aug 3 on canary, and Aug 10th on dev. For beta and stable it starts with M54. Narrowing it down to a specific change, it appears to be between 54.0.2816.0 and 54.0.2817.0, with the revision log here: https://chromium.googlesource.com/chromium/src/+log/54.0.2816.0..54.0.2817.0?pretty=fuller&n=10000 There's a particularly suspicious CL here: https://chromium.googlesource.com/chromium/src/+/5f2967d15fe53b2bd0cec15dd22fdff3b86e1ca9 Digging further...
,
Mar 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117 commit 00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117 Author: scottmg <scottmg@chromium.org> Date: Tue Mar 21 19:38:37 2017 Revert of Allow handles through for official builds too (partial revert) (patchset #3 id:40001 of https://codereview.chromium.org/2198603002/ ) Reason for revert: Potentially causing https://crbug.com/645319 . Graph to watch to see if it helps: https://goto.google.com/nezdg BUG= 645319 Original issue's description: > Allow handles through for official builds too (partial revert) > > This is a (very) partial revert of > https://codereview.chromium.org/1291553003 to fix the crash reported in > https://bugs.chromium.org/p/chromium/issues/detail?id=625172#c21. I > started trying to dig into making renderers log, but it's still a hairy > mess. > > R=wfh@chromium.org > BUG= 625172 , 358267, 579196 > TEST=out\Release\chrome.exe http://www.bbc.co.uk/news/world-asia-china-36189636 --enable-logging=stderr > log.txt 2>&1 where chrome is is_official_build=true shouldn't crash. > > Committed: https://crrev.com/5f2967d15fe53b2bd0cec15dd22fdff3b86e1ca9 > Cr-Commit-Position: refs/heads/master@{#409318} TBR=wfh@chromium.org,sky@chromium.org,thestig@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 625172 , 358267, 579196 Review-Url: https://codereview.chromium.org/2767663002 Cr-Commit-Position: refs/heads/master@{#458513} [modify] https://crrev.com/00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117/base/process/launch_win.cc [modify] https://crrev.com/00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117/content/common/sandbox_win.cc [modify] https://crrev.com/00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117/content/test/content_browser_test_test.cc
,
Mar 22 2017
An update on AppVerifier's handle verifier: libusb is making this a bit harder to diagnose because we have to suppress all invalid handle exceptions to make Chrome usable. Unfortunately, this means we likely miss some real uses of invalid handles. The issue reproed with this exception suppressed under AppVerifier.
,
Mar 23 2017
scottmg's revert landed in the 3048.0 canary. Looking at 1 day aggregation from 3047 to 3048, the numbers appear to be lower (both in user count, and in failure count). But, the numbers are really sparse so I wouldn't take this as gospel. Wait for another full day of data at least?
,
Mar 24 2017
Looking at https://uma.googleplex.com/p/chrome/timeline_v2/?sid=67a1943b706aaca98065e6d0178b6160 I'm not feeling enthusiastic that the revert helped. Anyone else seeing something different?
,
Mar 27 2017
Timeline doesn't show you the count as broken down by a particular value of that large enum. We're specifically interested in the % of users reporting INVALID_PARAMETER. The "Usage" histogram gets you this, but is not updated as often as timeline or the primary histograms, so it requires more days of data. It also blends together all versions that may report on the same day, so there's still people on 3047 and earlier contributing samples. Manually calculating this shows that the metric has in fact gone down. Since 3048.0 landed every canary has had a INVALID_PARAMETER usage rate of 0.00% or 0.01%. This is "back to normal", and down from the rough rate of 0.10% - 0.20% seen since the troubles started. Here's an example of the query I'm using: http://go/nztkd The real question is *why* does this change have any effect?
,
Mar 27 2017
I still don't quite understand the UMA, but I'll take your esteemed word for it. :) As far as cause I have no idea. In https://codereview.chromium.org/2198603002 base/process/launch_win.cc is used when --enable-logging is specified, so I think we can discard that. content/test/content_browser_test_test.cc is test-only, so that's not it. So, that leaves https://codereview.chromium.org/2198603002/diff/40001/content/common/sandbox_win.cc?context=10&column_width=80&tab_spaces=10 I have no clue how stdout/stderr got changed to a ttf handle per https://bugs.chromium.org/p/chromium/issues/detail?id=645319#c34 . I guess if PreSpawnTarget() is closing stdout or stderr in some delegates, then it's a race on reusing that handle to something non-inheritable. But I don't see where that's happening.
,
Mar 27 2017
one other idea would be to disable the handle closer (on canary) - this does cause a security regression but it would probably be okay to land for a few canaries to see the impact. Although, to be honest, the handle closer runs in child process and I can't see how closing handles in child could affect those in browser (unless this is a subtle race in Windows). This would probably be better done with the original https://codereview.chromium.org/2198603002/ back in place, to make it easier to know if this is 'fixing' things...
,
Mar 27 2017
I was wondering about the handle closer given its name, but it doesn't look like it does much other than add to a map until LowerToken in the target, is that right? A couple other moderately interesting things I noticed poking around: - The dll blacklist, but similar to to the handle closer I think it shouldn't be doing anything in the browser. - I'm relatively sure this is unused in chrome-proper, but it'd be called at the right time PreSpawnTarget() -> PreSpawnRenderer(), and suspiciously involves font files. But it should just be adding font file names policy rule. https://cs.chromium.org/chromium/src/content/shell/browser/shell_content_browser_client.cc?rcl=7bc9bd3a6fa05f39eab314b3193e2ba382e96b45&l=364 Is the handle verifier still in use? There's ScopedHandles getting created along the way too. Now that we have a solid suspect, we could also check here https://cs.chromium.org/chromium/src/sandbox/win/src/broker_services.cc?rcl=3321d7f0ab6782ed8e316e5b94b21aad2c77a603&l=352 that stdout/stderr are still inheritable (rather than when they're assigned to the policy).
,
Mar 27 2017
The handle verifier is not enabled > dev channel (in beta and stable) and yet I think people were reporting issues in stable... Yup agree about your summary of both handle closer agent and dll blacklist, they shouldn't be doing anything in the browser.
,
Mar 27 2017
#52 > I can't see how closing handles in child could affect those in browser (unless this is a subtle race in Windows). Agreed on that one, especially since each process has its own handle table and IIRC refcounts to resources associated with the handle are tracked separately. One exercise that might be interesting is attempting to induce this case in a test program given that we know from both the KD and the revert that handles are highly likely involved.
,
Mar 27 2017
My general view is that something has a UaF on a handle so that it closes the same one twice or closes a bogus value. The odd thing is it must occur after setting up the inheritable handle list (which does some checking) but before we spawn the new target which should narrow it down slightly. Although bearing in the mind that the handle table is global to the process it's possible that its due to another thread somewhere. One possible idea is to add some debug code for Canary or similar which sets the HANDLE_FLAG_PROTECT_FROM_CLOSE flag on any handle we're going to inherit . That way if something does close the handle while the policy is still in-memory then we should get an exception (and hopefully at the bad call site). Might take some playing around with scoping to ensure it doesn't blow up in other places. I suppose if it's stdout/stderr then setting this once would be sufficient as nothing should be closing those handles.
,
Mar 30 2017
I guess I should merge the revert back to 58, despite our lack of explanatory powerz.
,
Mar 30 2017
This bug requires manual review: Reverts referenced in bugdroid comments after merge request. Please contact the milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Mar 31 2017
We've now got 1.5 weeks of data from the UMA usage dashboard. The instance of this particular child process launch failure has dropped by more than half, but it's still higher than it was prior to August. In addition to lacking explanatory power, there's also another factor at play? :/
,
Mar 31 2017
I guess if we suspect it's handle misuse and racy reuse then it'd make some sense that it only reduces the "hit rate". Maybe we should be looking at the handle verifier again.
,
Apr 3 2017
Looks like the hit rate continues to fall slowly, but doesn't look like we're getting to pre-issue rates. Agreed that handle verifier might be a good idea here.
,
Apr 6 2017
Before we approve merge to M58, could you please confirm revert listed at #46 will be a safe revert to merge?
,
Apr 10 2017
This should be a safe revert to merge. It's been running on dev and canary for quite some time: D:\src>git tag --contains 00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117 59.0.3048.0 [... Through ...] 59.0.3067.2 Today's canary is 59.0.3067.0.
,
Apr 10 2017
Approving merge to M58 branch 3029 based on comment #63. Please merge ASAP. Thank you.
,
Apr 10 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7b8448f77e91f2ef81e4c36929454c321895ecd7 commit 7b8448f77e91f2ef81e4c36929454c321895ecd7 Author: Scott Graham <scottmg@chromium.org> Date: Mon Apr 10 20:25:14 2017 Revert of Allow handles through for official builds too (partial revert) (patchset #3 id:40001 of https://codereview.chromium.org/2198603002/ ) Reason for revert: Potentially causing https://crbug.com/645319 . Graph to watch to see if it helps: https://goto.google.com/nezdg BUG= 645319 Original issue's description: > Allow handles through for official builds too (partial revert) > > This is a (very) partial revert of > https://codereview.chromium.org/1291553003 to fix the crash reported in > https://bugs.chromium.org/p/chromium/issues/detail?id=625172#c21. I > started trying to dig into making renderers log, but it's still a hairy > mess. > > R=wfh@chromium.org > BUG= 625172 , 358267, 579196 > TEST=out\Release\chrome.exe http://www.bbc.co.uk/news/world-asia-china-36189636 --enable-logging=stderr > log.txt 2>&1 where chrome is is_official_build=true shouldn't crash. > > Committed: https://crrev.com/5f2967d15fe53b2bd0cec15dd22fdff3b86e1ca9 > Cr-Commit-Position: refs/heads/master@{#409318} TBR=wfh@chromium.org,sky@chromium.org,thestig@chromium.org BUG= 625172 , 358267, 579196 Review-Url: https://codereview.chromium.org/2767663002 Cr-Commit-Position: refs/heads/master@{#458513} (cherry picked from commit 00c0c98ea999d1d8edde6d1a2c9d3e4ff1449117) Review-Url: https://codereview.chromium.org/2810903002 . Cr-Commit-Position: refs/branch-heads/3029@{#653} Cr-Branched-From: 939b32ee5ba05c396eef3fd992822fcca9a2e262-refs/heads/master@{#454471} [modify] https://crrev.com/7b8448f77e91f2ef81e4c36929454c321895ecd7/base/process/launch_win.cc [modify] https://crrev.com/7b8448f77e91f2ef81e4c36929454c321895ecd7/content/common/sandbox_win.cc [modify] https://crrev.com/7b8448f77e91f2ef81e4c36929454c321895ecd7/content/test/content_browser_test_test.cc
,
Apr 14 2017
Fixed?
,
Apr 14 2017
Not quite. We still need to figure out why this happened. The revert mitigates the symptom.
,
Apr 14 2017
Could this be linked to specific accounts somehow? I've seen some people report this for ChromeOS, Linux and Windows, meaning they likely tried it on all of those and found the issue persisted. I myself have been experiencing this on both my laptop (Win10 64-bit, Ivy Bridge-generation hardware) and on a brand new NUC (Win10 64-bit, Skylake-generation hardware). Both had it happen right after a clean Win10 install, with or without any Chrome extensions enabled. At the same time, no-one else I know seems to be having this issue, including people with very similar setups, so obviously it doesn't happen to everyone. The only difference I can think of is the logged in account. I've switched to Canary now because of above fix (I can confirm it doesn't happen in Canary), but if it helps I could reinstall Stable and test if it happens if I don't add my account.
,
Apr 14 2017
This particular issue tracks Windows where CreateProcess fails with STATUS_INVALID_PARAMETER. We do not have a consistent repro for this issue. If you are encountering this on other platforms, or you are seeing something other than CreateProcess+STATUS_INVALID_PARAMETER, I would encourage you to file bugs for each of those issues.
,
May 11 2017
I decided to take another look at the key metric here, and with M60 the crash rate has dropped even further. We're nearly back down to M53 levels, but this is still slightly elevated compared to the historical rate. Would love to know what's going on here. *sigh*
,
Sep 7 2017
I just wanted to check on this, any new news? We are still seeing this widespread within our enterprise. (Originally reported here: https://bugs.chromium.org/p/chromium/issues/detail?id=698307)
,
Nov 22 2017
We haven't seen this specific issue for quite some time. Has anyone else seen this?
,
May 30 2018
This issue seems to have stopped, so resolving WontFix. If this is still an issue, feel free to comment/reopen. |
|||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||
Comment 1 by wfh@chromium.org
, Sep 9 2016