hang monitor timeout on desktop should match android |
||
Issue descriptionSee https://bugs.chromium.org/p/chromium/issues/detail?id=148136#c19 and https://cs.chromium.org/chromium/src/content/common/content_constants_internal.cc?type=cs&sq=package:chromium&l=16. Now that flash deprecation is so far along, we should try making desktop match android's 5 second timeout. Avi: Assigning to you since you kind of volunteered on chat. Feel free to not do though. :)
,
Mar 19 2018
I'd prefer we try 5s and add a histogram while we're at it to let us reason about this in the future based off actually how often it will trigger.
,
Mar 19 2018
,
Mar 19 2018
The hang count does show up in stability metrics (see https://cs.chromium.org/search/?q=NOTIFICATION_RENDER_WIDGET_HOST_HANG) though I don't know offhand where to find those.
,
Mar 19 2018
Yeah, the histogram I had in mind is not just hang count, but also to have TimeoutMonitor histogram the durations in Restart so we can see how much more it would trigger if we further tightened it.
,
Mar 20 2018
I wouldn't put UMA into TimeoutMonitor. It's used for the hung renderer dialog, the new content rendering timeout, the close timeout, the beforeunload timeout, and the swapout timeout. Bad idea to conflate five different timers. If I had to UMA, I would tag RenderWidgetHostImpl's StartHangMonitorTimeout()/RestartHangMonitorTimeoutIfNecessary()/StopHangMonitorTimeout()/RendererIsUnresponsive()/OnRenderProcessGone(). Here are the scenarios: - Hang starts and resolves with no dialog. We start timing in StartHangMonitorTimeout and stop in StopHangMonitorTimeout. - Hang starts and resolves while dialog up with no user click. We start timing in StartHangMonitorTimeout, it escalates in RendererIsUnresponsive, we stop in StopHangMonitorTimeout. - Hang starts, dialog shows, user kills process. We start timing in StartHangMonitorTimeout, dialog shows in RendererIsUnresponsive, process killed in OnRenderProcessGone. - Hang starts, dialog shows, user restarts hang timer. We start timing in StartHangMonitorTimeout, dialog shows in RendererIsUnresponsive, timer restarted in RestartHangMonitorTimeoutIfNecessary, we start all over again. Those are four metrics. We could treat them all as separate UMAs, or we could join all the data into one big "time from hang timer start to unhang or process death". The first gives us precise data, the second gives us a simple answer to the question of how we'd be affected by adjusting dialog trigger times.
,
Mar 20 2018
I'm fine with whatever set of UMA you think makes sense. I think probably the merged one seems most useful, but you're right that it would be hard to reason about if we need to do any diagnostic work. My intuition is that the combined metric is enough, but I tend to have terrible intuition with metrics. :)
,
Mar 20 2018
As per the stability team email, this would require a Finch A/B test so that a "crash" jump could properly be attributed to this change and not cause them to lose track of other concurrent crashes.
,
Mar 21 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a82d6decfeb4794681e0c25acc4f9ed1d6a574db commit a82d6decfeb4794681e0c25acc4f9ed1d6a574db Author: Avi Drissman <avi@chromium.org> Date: Wed Mar 21 21:41:17 2018 Make RenderWidgetHostImpl's clock switchable for testing. BUG=823087 TEST=none yet Change-Id: I5c3c57fb380b2d7ec863d43b2e317f634ed40e5b Reviewed-on: https://chromium-review.googlesource.com/973899 Reviewed-by: Will Harris <wfh@chromium.org> Commit-Queue: Avi Drissman <avi@chromium.org> Cr-Commit-Position: refs/heads/master@{#544856} [modify] https://crrev.com/a82d6decfeb4794681e0c25acc4f9ed1d6a574db/content/browser/renderer_host/render_widget_host_impl.cc [modify] https://crrev.com/a82d6decfeb4794681e0c25acc4f9ed1d6a574db/content/browser/renderer_host/render_widget_host_impl.h
,
Mar 22 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/aa3bb574c4b2ef27368d30e54b25bba6532e689a commit aa3bb574c4b2ef27368d30e54b25bba6532e689a Author: Avi Drissman <avi@chromium.org> Date: Thu Mar 22 00:24:32 2018 Add UMA for durations of hung render processes. BUG=823087 Change-Id: I8f01414bbeb1472f2fe062afa5b4c66f4bf1f621 Reviewed-on: https://chromium-review.googlesource.com/972184 Reviewed-by: Ilya Sherman <isherman@chromium.org> Reviewed-by: Will Harris <wfh@chromium.org> Commit-Queue: Avi Drissman <avi@chromium.org> Cr-Commit-Position: refs/heads/master@{#544920} [modify] https://crrev.com/aa3bb574c4b2ef27368d30e54b25bba6532e689a/content/browser/renderer_host/render_widget_host_impl.cc [modify] https://crrev.com/aa3bb574c4b2ef27368d30e54b25bba6532e689a/content/browser/renderer_host/render_widget_host_impl.h [modify] https://crrev.com/aa3bb574c4b2ef27368d30e54b25bba6532e689a/content/browser/renderer_host/render_widget_host_unittest.cc [modify] https://crrev.com/aa3bb574c4b2ef27368d30e54b25bba6532e689a/tools/metrics/histograms/histograms.xml
,
Apr 7 2018
We have some very rough numbers from this UMA. Roughly 2.5% of page hangs (longer than 5s) are >30s. If we dropped the hung page timeout to 5s, we're talking about a 40× increase in hung pages. If we dropped it to 10s, it would be about an 8× increase.
,
Apr 9 2018
It's hard to reason about whether users will care about a 40x increase without understanding that relative to other numbers like as a percentage of page loads. If it's 40x of a sufficiently small number, then we can safely say that users won't be more frustrated by constant prompts. Anecdotally, I don't remember the last time I saw a hang monitor prompt, so I'm inclined to think that it's 40x of a sufficiently small number. With other prompts we look at DPM, which is decisions / thousand page loads I believe. I expect this won't have a statistically meaningful impact on the real world DPM for users (that other prompts hugely drown this out). It would be ideal to have desktop and mobile match (5s) IMO, but anything shorter than 30s seems worth a try to me. |
||
►
Sign in to add a comment |
||
Comment 1 by a...@chromium.org
, Mar 19 2018