Rietveld ridiculously slow on canary
Issue description
I generally run canary on Windows. I've noticed this for a few weeks now, and I am finally filing a bug; I couldn't tell you exactly when it started occurring. I'm currently running 55.0.2879.0 on Windows. The symptom is that Rietveld is unduly slow: I literally see parts of the page load and display, say the part on the left-hand side with 'Edit Issue, Publish+Mail Comments'..., then a second or two later the description, then a second or two later portions of the files. This isn't all the time, but more often than not. I had switched back to stable last week and did not notice this behavior. I'm a bit unsure whom to file this against.
Oct 4 2016
This sounds suspiciously like issue 637640; adding dtapuska just in case. If you can record a trace with all the scheduler options on, it might help further triage.
Oct 4 2016
The problem is it doesn't happen all the time, making a bisect impossible. Generally with canary I try to restart every day(ish). When I was running stable last week I left Chrome running for most of the week and didn't restart.
Oct 7 2016
Loading triager here. I'm not able to reproduce on a ToT build. Any chance you could attach a trace?
Oct 28 2016
I switched to stable shortly after filing this (this bug is partly to blame) and hadn't noticed the issue. A new version was pushed to stable recently, 54.0.2840.71, and after restarting into it I hit at least one really slow page load. Of course I can't repro when trying again. I'll try to remember to start tracing when I load codereview pages.
Nov 2 2016
I caught a trace of this today. 'Code review, ui/views/mus/BUILD.gn' is the tab that was slow, pid 13484 in the trace. I'm removing NeedsFeedback. If you need anything more from me just ask.
Nov 3 2016
The trace looks *exactly* like issue 637640: HTML parser tasks scheduled 1s apart, with the main thread completely free to do work. dtapuska, do you mind taking a look? The problem seems way too similar for it to be different (bad scheduling, particularly on codereview). sky: if this happens again, could you turn on the disabled-by-default scheduling trace options? That's renderer.scheduler and renderer.scheduler.debug. Also +skyostil.
Nov 3 2016
csharrison: what do you mean by 'renderer.scheduler and renderer.scheduler.debug'? I don't see those as trace options (I'm on stable, 54.0.2840.71).
Nov 3 2016
You need to go into "Manually select settings" and under the "edit categories" dropdown there is a huge list of checkboxes with the various tracing categories.
Nov 3 2016
Things were slow again and I recorded another trace with the two renderer categories added. Look for the tab 'json_file_value_serializer.cc'.
Nov 3 2016
Adding Blink>Scheduling; please take a look at the trace.
Nov 3 2016
For whatever reason I seem to be triggering this a ton today.
Nov 4 2016
+Blink>Input to get a few more eyes on this. Also +mmenke who got bit by this earlier.
Nov 4 2016
Thanks for the trace. Looks like the scheduler thinks we're in the middle of a main thread scroll gesture and is throttling expensive tasks because of that. Let me see if I can reproduce.
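To make that diagnosis concrete, here is a minimal sketch, assuming a gesture-based throttling heuristic like the one described above. This is not the actual Blink scheduler code and every name in it is hypothetical: while the scheduler believes a main-thread gesture is in flight, expensive work is deferred, and if the gesture-ended signal never arrives, the deferred queue never drains.

  // Hypothetical sketch of gesture-based task throttling; illustrative only.
  #include <functional>
  #include <queue>
  #include <utility>

  class GestureAwareScheduler {
   public:
    void DidStartMainThreadGesture() { gesture_in_progress_ = true; }
    void DidEndMainThreadGesture() {
      gesture_in_progress_ = false;
      // Run everything that was held back during the gesture.
      while (!deferred_.empty()) {
        deferred_.front()();
        deferred_.pop();
      }
    }

    void PostExpensiveTask(std::function<void()> task) {
      if (gesture_in_progress_) {
        // Throttle: hold expensive work so input handling stays responsive.
        deferred_.push(std::move(task));
      } else {
        task();
      }
    }

   private:
    bool gesture_in_progress_ = false;
    std::queue<std::function<void()>> deferred_;
  };

If DidEndMainThreadGesture() were never called, PostExpensiveTask() would keep queueing forever, which matches the staggered, second-by-second page loads reported here.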
Nov 4 2016
The code for this seems fairly fragile (and maybe hard to sufficiently cover in tests?). Don't suppose there's some way we could rearchitect the code to avoid this sort of problem?
Nov 4 2016
Does anyone have steps for triggering this? I can't seem to repro on ToT after browsing around various Rietveld issues. Also, a quick check of whether this happens without extensions (incognito) would be useful. I'll try 54.0.2840.71 next.

#15: Agreed that our heuristics here are a bit complex. We're going to try simplifying things by making some parts of this intervention more rational (e.g., forced passive event listeners), which should limit the number of special cases we need to handle. However, the fundamental problem is that there's no golden source of truth about what the user is trying to do, so we'll still need to infer things as best we can.
Nov 4 2016
I do not have consistent steps to repro. I've only noticed this on codereview pages, and I see a lot of them. Yesterday it was reproducing more readily when using 'j' and 'k' to navigate through files, but again, it isn't consistent. If I had consistent repro steps I would try incognito, but it doesn't happen all the time. The only extensions I have are those forced on by corp policies, plus Docs, Docs Offline, Sheets, and Slides.
Nov 7 2016
Still no luck reproducing, but I think I can see what's going on from the trace. Somehow we're getting an unbalanced count of pending and processed main-thread input events, so we perpetually think that an input event is in the queue and deprioritize other work as a result.

Dave, can you think of any recent changes that might have caused RendererScheduler::DidHandleInputEventOnMainThread() [1] to not always get called for input forwarded to the main thread? In particular, do we do something different for blocking vs. non-blocking events [2]?

[1] https://cs.chromium.org/chromium/src/third_party/WebKit/public/platform/scheduler/renderer/renderer_scheduler.h?l=113&gs=cpp%253Ablink%253A%253Ascheduler%253A%253Aclass-RendererScheduler%253A%253ADidHandleInputEventOnMainThread(const%2Bblink%253A%253AWebInputEvent%2B%2526)%2540chromium%252F..%252F..%252Fthird_party%252FWebKit%252Fpublic%252Fplatform%252Fscheduler%252Frenderer%252Frenderer_scheduler.h%257Cdecl&gsn=DidHandleInputEventOnMainThread&ct=xref_usages
[2] https://cs.chromium.org/chromium/src/content/renderer/input/main_thread_event_queue.cc?rcl=1478435172&l=214
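For illustration, a hedged sketch of the bookkeeping described above, with hypothetical names (the real accounting lives in Blink's scheduler/user model): a counter is incremented when input is queued for the main thread and decremented when the event is handled. Any code path that queues an event but skips the acknowledgement leaves the counter stuck above zero.

  // Hypothetical sketch of the pending-input accounting; illustrative only.
  class UserModel {
   public:
    void DidQueueInputEventOnMainThread() { ++pending_input_event_count_; }
    void DidHandleInputEventOnMainThread() {
      if (pending_input_event_count_ > 0)
        --pending_input_event_count_;
    }
    // The scheduler deprioritizes other work while this returns true; if an
    // acknowledgement is ever lost, it returns true indefinitely.
    bool IsInputPending() const { return pending_input_event_count_ > 0; }

   private:
    int pending_input_event_count_ = 0;
  };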
Nov 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/b12cd2ca201c68f2a127c1af0ec0588f091706bb

commit b12cd2ca201c68f2a127c1af0ec0588f091706bb
Author: skyostil <skyostil@chromium.org>
Date: Mon Nov 07 16:05:01 2016

    scheduler: Reset pending input event count on navigation

    When a navigation happens, we should reset the number of pending input
    events in the user model to avoid waiting for an acknowledgement which
    may never arrive. This is a speculative mitigation for crbug.com/652294.

    BUG=652294
    Review-Url: https://codereview.chromium.org/2476263002
    Cr-Commit-Position: refs/heads/master@{#430281}

[modify] https://crrev.com/b12cd2ca201c68f2a127c1af0ec0588f091706bb/third_party/WebKit/Source/platform/scheduler/renderer/user_model.cc
[modify] https://crrev.com/b12cd2ca201c68f2a127c1af0ec0588f091706bb/third_party/WebKit/Source/platform/scheduler/renderer/user_model_unittest.cc
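Read against the sketch above, the mitigation amounts to clearing the counter when a navigation happens, since acknowledgements for pre-navigation events may never arrive (again with hypothetical names; the actual change is in user_model.cc):

  // Hypothetical rendering of the fix: drop unacknowledged input events on
  // navigation rather than deprioritizing other work indefinitely.
  void UserModel::DidStartNavigation() {
    pending_input_event_count_ = 0;
  }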
Nov 28 2016
Are you folks still seeing this problem? I still haven't been able to reproduce it.
Nov 28 2016
After the bug hit stable and your fix landed I switched back to canary. I haven't seen the issue since then. I've been using canary for at least a week.
Dec 8 2016
Alright, sounds like the workaround is...working around. Let's backport to 56 then.
Dec 8 2016
[Automated comment] Commit may have occurred before M56 branch point (11/17/2016), needs manual review.
Dec 8 2016
Ah, my bad. This is already in M56.