No relaunch on failed browser rendezvous. |
||||||||||
Issue descriptionVersion: 49.0.2623.110 OS: Windows 7 What steps will reproduce the problem? (1) Launch chrome. Hang it by going to chrome://uithreadhang (2) Launch chrome again. Get offered the relaunch dialog. Click yes. What is the expected output? Chrome relaunches. What do you see instead? The hung chrome gets killed but does not relaunch.
,
Apr 14 2016
(assuming that's the same dialog, not entirely sure)
,
Apr 14 2016
In this case I do see the dialog, only it doesn't relaunch.
,
Apr 14 2016
Oh, sorry, I misread.
,
Apr 15 2016
I think the process singleton is broken. NotifyOtherProcessOrCreate appears to return PROFILE_IN_USE after it terminates the other process. Instead, it should create the singleton and become the new browser process. Looks like this regressed with r171105.
,
Apr 15 2016
Are you sure it regressed with r171105? Didn't fully re-read the CL yet but IIRC the process singleton was broken back then in failed rendez-vous scenarios as well (in particular during shutdown). That CL improved the metro rendez-vous but I don't expect it would have affected failed rendez-vous. Either way, I won't have cycles to address this soon. Hoping Patrick or Pierre-Antoine can take a look as part of the rdv hang analysis. I'm happy to consult on all the situations I'm aware of under which process singleton can be incorrect today.
,
Apr 15 2016
Explicitly +grt for question above
,
Apr 15 2016
It's possible that it was broken before your change. What I see happening now is that if creation of the singleton fails (because another process has it), the other process is terminated but the new process still reports PROFILE_IN_USE. Have a look at NotifyOtherProcessOrCreate to see what I mean.
,
Apr 18 2016
Ah ok, here it is, I see why it looks like r171105 broke this but it didn't. Looking at the change in that CL to NotifyOtherProcessOrCreate() it looks like the logic was flipped from "attempt to NotifyOtherProcess() and on failure Create() -- w/ NotifyOtherProcess() resulting in a kill as ultimate failure mode" to "attempt to Create() and on failure to do that NotifyOtherProcess()" which would mean we fail to call Create() after NotifyOtherProcess()'s kill. But in practice that CL merely moved the Create() logic from the constructor to Create() -- prior to it Create() was a no-op on Windows and the logic was still Create() first, NotifyOtherProcess() second. So prior to r171105 a hung remote process would result in the constructor failing to "create" (resulting in |window_ == NULL|), then in NotifyOtherProcess() doing a kill and lastly in the mock Create() returning a failure per |return window_ != NULL| and NotifyOtherProcessOrCreate() interpreting that failure into PROFILE_IN_USE. So in practice that CL didn't change that logic at all, it merely moved the create logic from constructor to "Create" and then at first sight swapped logic in NotifyOtherProcessOrCreate(), but in practice that was to keep it the same. tl;dr; this has been broken forever but we should fix it, the workaround to this issue for now is to try to launch Chrome again (the kill in first pass should have taken effect and second pass should work...), i.e. this shouldn't be the "internet is broken bug, need to reboot".
,
Apr 18 2016
Can you validate fixing the current behavior is what we want to do? The other option of changing the dialog string indeed seems like a poorer experience. How risky do we feel touching the RDV logic is? How hard is it to add a test for the failed rdv: start chrome, go to chrome://uithreadhang, start chrome again, see if it died and got relaunched?
,
Apr 18 2016
I'm happy to help explain the RDV logic and review changes to it, it's not that bad and I welcome changes to it. Testing wise looks like there is an attempt at something @ [1] but it's disabled... Re-enabling that sounds like a worthwhile goal. [1] https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/process_singleton_browsertest.cc
,
Apr 18 2016
by "try to launch Chrome again" do you (Gab) mean "try to acquire the singleton again and proceed with browser startup"? by "is what we want to do" do you (P-A) mean "is starting Chrome for the user when a hung browser is killed the right thing?" it seems like a clear "yes" to me. we want Chrome to work for the user with as little friction as possible. is there another option i'm not considering? as for testing, there is a unittest for ProcessSingletonWin. it looks like it's missing the "was the singleton acquired following a kill?" part of the test.
,
Apr 18 2016
Re. "by "try to launch Chrome again" do you (Gab) mean "try to acquire the singleton again and proceed with browser startup"?" That's what the code should do yes, but what I meant to highlight right now is that a user workaround to this bug is to manually launch Chrome again (after the initial failed launch which should have killed the hung process and freed the singleton). That user workaround should work unless the hang is somehow during startup and the new chrome thus hangs as well...
,
Apr 18 2016
Re: by "is what we want to do" do you (P-A) mean "is starting Chrome for the user when a hung browser is killed the right thing?" +1 to relaunching being the right thing. I meant we're guilty of two offenses right now: 1) not relaunching (poor experience) 2) saying we'll do something and not doing it. I meant if fixing (1) is complicated/lengthy, it's always possible to fix (2) in the meantime.
,
Apr 18 2016
But from what I hear, we can just fix (1).
,
Apr 18 2016
Yes, fixing (1) is the right thing. Let's not try to fix (2) via a strings change.
,
Apr 18 2016
Which strings is (2) referring to? Is there a dialog on failure which I forget about?
,
Apr 18 2016
If the process is hung and there are visible windows, then the user is asked for confirmation (DisplayShouldKillMessageBox): https://code.google.com/p/chromium/codesearch#chromium/src/chrome/browser/process_singleton_win.cc&l=187
,
Apr 18 2016
Ah ok, I agree with grt then, if the user accepts to kill, we should also relaunch post-kill (just like we should do if the hung process didn't have any windows). Fixing (1) addresses both of these situations. PS: We should also add UMA all over this code now that it's a possibility thanks to Brian's work.
,
Jun 1 2016
Moving this nonessential bug to the next milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 13 2016
This issue has been moved once and is lower than Pri-1. Removing the milestone. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Aug 22 2016
Assuming pmonette@ hasn't started on this and assigning to gcomanici@. gcomanici: - the current logic is 1) attempt to create a new singleton, if it fails then 2) attempt to notify the existing one, which may lead to killing an existing hung one. - We need to add a step 3), which is attempt to relaunch
,
Aug 30 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/668156784598d74a8747b7767c1e565e03d9dc89 commit 668156784598d74a8747b7767c1e565e03d9dc89 Author: gcomanici <gcomanici@chromium.org> Date: Tue Aug 30 04:54:10 2016 Continue browser startup after killing a hung browser instance. Chrome will defer to an existing browser via the process singleton during startup. When the existing browser appears hung, Chrome will prompt the user to kill it. If the user consents, Chrome will now try to acquire the singleton and continue with startup rather than exit. BUG=603698 TEST=see bug Review-Url: https://codereview.chromium.org/2271833002 Cr-Commit-Position: refs/heads/master@{#415121} [modify] https://crrev.com/668156784598d74a8747b7767c1e565e03d9dc89/chrome/browser/chrome_browser_main.cc [modify] https://crrev.com/668156784598d74a8747b7767c1e565e03d9dc89/chrome/browser/process_singleton_win.cc [modify] https://crrev.com/668156784598d74a8747b7767c1e565e03d9dc89/chrome/browser/process_singleton_win_unittest.cc
,
Aug 30 2016
+siggi as FYI that we should see a drop in PROFILE_IN_USE with this change.
,
Jan 22 2018
,
Today
(19 hours ago)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot |
||||||||||
►
Sign in to add a comment |
||||||||||
Comment 1 by scottmg@chromium.org
, Apr 14 2016