Issue metadata
Sign in to add a comment
|
[Chrome][M61][Android7.1.2] Performance regression on drag and scroll
Reported by
seiyon.p...@gmail.com,
Oct 23 2017
|
||||||||||||||||||||
Issue descriptionSteps to reproduce the problem: 1. Open up Chrome 2. Go to http://m.shopping.daum.net/m/event/detail/3257 3. Scroll the page up & down What is the expected behavior? There is no performance regression. What went wrong? Scroll performance gets bad on 61.0.3163.98. Did this work before? Yes M60 Chrome version: 61.0.3163.98 Channel: stable OS Version: 7.1.2 Flash Version:
,
Oct 24 2017
,
Oct 26 2017
Hi seiyon.park@, could you please provide a performance trace using the instructions here: https://www.chromium.org/developers/how-tos/submitting-a-performance-bug Ideally, both on M61 where you see the issue and on M60 where you do not.
,
Oct 30 2017
,
Oct 30 2017
Thank you for providing more feedback. Adding requester "msrchandra@chromium.org" to the cc list and removing "Needs-Feedback" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Oct 30 2017
Dear Chrome team I've upload log files, but they are slightly different from what you asked in Comment #3 and hopefully they all would be helpful for your investigation. - Chrome60.png - Chrome60_normal.png - Chrome61.png - Chrome61_frame_drop.png
,
Oct 30 2017
Thanks but unfortunately screenshots aren't very helpful in this case. Could you use the instructions in #3?
,
Oct 31 2017
@bokan, here are the performance traces.
,
Nov 6 2017
Thanks for providing the traces. The main and compositor threads look like they're ticking along just fine in both traces. I do notice some slower ScrollUpdateLatency in the 61 trace, those coincides with heavy decoding on the GPU process. Over to GPU team to further triage.
,
Nov 10 2017
I think by decoding you mean GLES2DecoderImpl? That is the class that runs all the GL commands in the command buffer. Or did you mean image decoding or something? It sounds like maybe you're seeing the GPU process "more busy". The compositor worker thread is running the same number of tasks, about 183 per minute. On M60, GpuMain for 1 second looks like: Name Wall Cpu Self Cpu Self Avg Cpu Occurances MessageLoop::RunTask 205.249 ms 138.190 ms 8.738 ms 7.491 ms 1.439 ms 96 GpuCommandBufferMsg_AsyncFlush 196.511 ms 130.699 ms 5.017 ms 1.596 ms 1.376 ms 95 GpuCommandBufferStub::OnAsyncFlush 191.494 ms 129.103 ms 1.339 ms 1.259 ms 1.359 ms 95 CommandBufferService:PutChanged 190.155 ms 127.844 ms 97.125 ms 75.111 ms 1.346 ms 95 GLES2DecoderImpl::DoCreateAndConsumeTextureINTERNAL 10.054 ms 4.087 ms 10.054 ms 4.087 ms 0.011 ms 386 SwapBuffersLatency 0.000 ms 0.000 ms 0.000 ms 0.000 ms 0.000 ms 59 GLES2DecoderImpl::DoSwapBuffers 82.976 ms 48.646 ms 8.850 ms 8.482 ms 0.825 ms 59 NativeViewGLSurfaceEGL:RealSwapBuffers 68.890 ms 36.808 ms 68.890 ms 36.808 ms 0.624 ms 59 LatencyInfo.Flow 0.476 ms 0.288 ms 0.476 ms 0.288 ms 0.003 ms 99 SyncChannel::Send 4.760 ms 3.068 ms 4.760 ms 3.068 ms 0.052 ms 59 Totals 950.565 ms 618.733 ms 205.249 ms 138.190 ms 0.561 ms 1102 On M61 looks like: MessageLoop::RunTask 199.408 ms 136.395 ms 3.839 ms 3.525 ms 0.784 ms 174 Scheduler::RunNextTask 195.569 ms 132.870 ms 5.300 ms 5.246 ms 1.493 ms 89 GpuCommandBufferMsg_AsyncFlush 189.721 ms 127.129 ms 1.529 ms 1.499 ms 1.496 ms 85 GLES2DecoderImpl::RestoreState 0.367 ms 0.366 ms 0.367 ms 0.366 ms 0.061 ms 6 GpuCommandBufferMsg_RegisterTransferBuffer 0.174 ms 0.121 ms 0.021 ms 0.020 ms 0.040 ms 3 GpuCommandBufferMsg_SetGetBuffer 0.007 ms 0.008 ms 0.004 ms 0.005 ms 0.008 ms 1 GpuCommandBufferStub::OnAsyncFlush 188.192 ms 125.630 ms 1.226 ms 1.095 ms 1.478 ms 85 GpuCommandBufferStub::OnRegisterTransferBuffer 0.153 ms 0.101 ms 0.153 ms 0.101 ms 0.034 ms 3 GpuCommandBufferStub::OnSetGetBuffer 0.003 ms 0.003 ms 0.003 ms 0.003 ms 0.003 ms 1 CommandBufferService:PutChanged 186.966 ms 124.535 ms 99.622 ms 74.690 ms 1.465 ms 85 GLES2DecoderImpl::DoCreateAndConsumeTextureINTERNAL 1.931 ms 1.652 ms 1.931 ms 1.652 ms 0.010 ms 161 SwapBuffersLatency 0.000 ms 0.000 ms 0.000 ms 0.000 ms 0.000 ms 61 GLES2DecoderImpl::DoSwapBuffers 85.364 ms 48.143 ms 10.768 ms 9.919 ms 0.789 ms 61 GLES2DecoderImpl::HandleTexImage2D 0.042 ms 0.042 ms 0.042 ms 0.042 ms 0.042 ms 1 GLES2DecoderImpl::DoProduceTextureDirectCHROMIUM 0.007 ms 0.008 ms 0.007 ms 0.008 ms 0.008 ms 1 NativeViewGLSurfaceEGL:RealSwapBuffers 67.657 ms 34.427 ms 67.657 ms 34.427 ms 0.564 ms 61 LatencyInfo.Flow 0.382 ms 0.349 ms 0.382 ms 0.349 ms 0.003 ms 112 SyncChannel::Send 6.557 ms 3.448 ms 6.557 ms 3.448 ms 0.057 ms 61 Totals 1,122.500 ms 735.227 ms 199.408 ms 136.395 ms 0.700 ms 1051 61 has gpu scheduler there, and is spending more time doing work on the main thread.
,
Nov 13 2017
Yeah, I saw GLES2DecoderImpl and assumed it had something to do with image decoding. I did in fact mean "GPU proc is more busy" :) For my own benefit, you mean more work on the _GPU_ main thread, right?
,
Nov 14 2017
> For my own benefit, you mean more work on the _GPU_ main thread, right? Oh, yes sorry :)
,
Dec 4 2017
Dear Chrome team, kindly pinging.
,
Dec 6 2017
There isn't much rasterization going on and the GPU thread looks mostly idle except when waiting to be able to swap. What I notice different is input event alignment relative to VSync markers in the trace. In the chrome60 trace the InputLatency::TouchMove starts ~5ms before VSync, then gets sent early in the next frame. In the chrome61 trace the InputLatency::TouchMove starts ~1ms after VSync and looks like gets queued for an extra frame. +dtapuska@, +tdresser@, thoughts on this?
,
Dec 6 2017
+dmazzoni@: Is RenderAccessibilityImpl::SendPendingAccessibilityEvents on RendererMain something you're familiar with? It looks like this takes 3 to 4ms on this site longer than paint lifecycle. Is this something we should look into optimizing?
,
Dec 6 2017
Screenshots of the event timing difference I described in #14.
,
Dec 6 2017
Re-assigning to Input for now.
,
Dec 18 2017
Ping, Dave do you know what could have caused this timing change?
,
Dec 18 2017
seiyon.park, could you please tell us the device details you tested on?
,
Dec 18 2017
The timing changes are related to the vsync aligned gesture scroll updates. We disabled the android alignment of the touch events so they now aren't held on the Android side anymore and sent when the driver sends them. The queuing appears in the trace now (where it wouldn't have before because we never saw them until after android flushed them). I didn't see a perf regression when trying this page myself. reporter: Now that Chrome 63 has shipped are you still seeing a perf regression?
,
Jan 5 2018
Dear Chrome team, it's still reproducible on v63. As I go to "m.naver.com" and scroll the pages and compare the performance of v55 and v63, I can see the performance regression on v63. Could you prioritize this regression since it have a bad impact on user experience?
,
Jan 9 2018
Dear Chrome team, kindly pinging.
,
Jan 9 2018
Alexander can you look at this trace? The traces don't seem to be under load I wonder if this is a setTimeout delayed kind of problem. Are you able to tell from these traces?
,
Jan 9 2018
We've significantly improved scheduling-related tracing since 61. I wonder if you could grab a trace using latest Chrome Canary and share it with me? Please ensure that renderer.scheduler and disabled-by-default-renderer.scheduler categoties are enabled (and disabled-by-default-renderer.scheduler.debug if it doesn't make the trace too big).
,
Jan 9 2018
> +dmazzoni@: Is RenderAccessibilityImpl::SendPendingAccessibilityEvents on RendererMain something you're familiar with? It looks like this takes 3 to 4ms on this site longer than paint lifecycle. Is this something we should look into optimizing? Definitely room for optimization but there's not a lot of low-hanging fruit. We have several long-term refactorings in progress that should help. Just like with paint, there are lots of individual sites This is a known issue for some sites. Keep in mind that for blind users, updating the accessibility tree is kind of the equivalent of a paint. The real question is, do we know why this user has accessibility enabled? Ideally if they're not using a screen reader or other legit accessibility services like switch access, voice control, etc. then we just shouldn't be enabling accessibility and incurring that performance cost at all.
,
Jan 11 2018
Dear Chrome team. I can see a performance regression without enabling accessibility setting. Hence as dmazzoni mentioned mentioned RenderAccessibilityImpl::SendPendingAccessibilityEvents on RendererMain takes 4 ms, it is needed to take a look at RendererMain and investigate a reason why it takes 4ms without enabling accessibility setting.
,
Jan 12 2018
Dear Chrome team, kindly to ping.
,
Jan 12 2018
I'm unable to reproduce RenderAccessibilityImpl::SendPendingAccessibilityEvents showing up. Can you try reproducing this with a fresh user data directory? https://chromium.googlesource.com/chromium/src/+/master/docs/user_data_dir.md#command-line
,
Jan 12 2018
It maybe an application requesting accessibility like a password manager. Are any of those apps installed?
,
Jan 15 2018
Dear Chrome team, Regarding accessibility stuff, I will circle back to you with a test result as requested in https://bugs.chromium.org/p/chromium/issues/detail?id=777296#c28. And regardless of accessibility stuff, have you ever found out any clues about a performance regression?
,
Jan 16 2018
Dear Chrome team, I've attached traces files for you.
,
Jan 16 2018
Thanks for the traces. Can you try retaking the traces with "manually select settings" and all default categories enabled? See https://www.chromium.org/developers/how-tos/submitting-a-performance-bug.
,
Jan 19 2018
tdresser@, I retook the trace as you asked in comment 32. Could you take a look?
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
,
Jan 19 2018
Thanks for the traces. I dug around a bit, and wasn't able to identify anything that looked like M63 was slower then M58. Alexander, any thoughts on these traces?
,
Jan 19 2018
Thanks for the traces! The immediate problem is that the frequency of BeginMainFrames on the main thread dropped considerably. It seems that it takes ages (~70ms) for compositor thread to submit pending frames, due to DisplayScheduler:pending_swaps taking a very long time. Dana, could you take a look?
,
Jan 19 2018
DisplayScheduler:pending_swaps is an async trace event that is started when we SwapBuffers to the gpu, and ends when we receive a swap ack. If we're spending 70ms in there the gpu is just busy doing other work we gave to it first? When I first looked at this I found a lot more cpu time being spent in the gpu scheduler path. This also sounds like unfortunate gpu scheduling? Going to pass it back to the folks working on that.
,
Jan 19 2018
I code searched to learn what the async trace event was btw, and it's defined here: https://cs.chromium.org/chromium/src/components/viz/service/display/display_scheduler.cc?rcl=a4ae9dfc49f80a105360d5429d884171ab91bba2&l=512
,
Jan 24 2018
Dear Chrome team, kindly pinging to share a update if you have any investigation results.
,
Jan 29 2018
Dear Chrome team, kindly pinging to share a update.
,
Jan 31 2018
Dear Chrome team, kindly pinging to share a update.
,
Feb 7 2018
Dear Chrome team, could you please update me on this issue?
,
Feb 20 2018
Dear Chrome team, kindly pinging
,
Mar 20 2018
Dear Chrome team, kindly pinging
,
Apr 23 2018
Dear Chrome team, I can see the performance regression on Chrome V66, as well. This performance regression has a bad impact on user experience. Could you take a look at this? - Device : Nexus5X - OS : Android 8.0 - Chrome : 66.0.3359.106 - URL : www.naver.com
,
Apr 23 2018
,
May 9 2018
,
May 9 2018
Took a look at the attached traces and video. After reading through the comments and the traces, I don't see anything that looks immediately worse in 61+ vs 60. The video does seem to show dropped frames, but I wonder if that's due to screenrecord (I found that using screenrecord caused my device to drop frames as well). I tried to reproduce this on an N5X running Android 8.1, testing M60 (reported good in this bug) and M66 (reported bad). I actually found that we had a smoother experience on M66 than M60 - please see the attached videos. M60 - https://drive.google.com/file/d/1JpuT8wZr9DP7ypPWqWLjw7PDrpFvDEWp/view?usp=sharing M66 - https://drive.google.com/file/d/1f1QHRnbDF8Tvtw1hIae-vNeB7DXQqtCq/view?usp=sharing Locally, testing on m.naver.com, M66 seems to stay at 60fps when scrolling on the N5X. In order to fix this, we'll likely need to get a local repro, which we don't yet have. Can you please provide a copy of "chrome://version" from a device which reproduces the issue - this will let us ensure our setup matches your exactly.
,
May 9 2018
Issue 836082 has been merged into this issue.
,
May 10 2018
ericrk@ regarding the screen record, I can confirm that the screen record has nothing to do with performance regression. Comparing the video I've uploaded in Comment 52 & 53 and the video in Comment 55, I can see the different performances and if you try to keep scrolling, suddenly you will be able to reproduce it. The video in Comment 52 & 53 : Janky rendering The video in Comment 55 : No rendering problem And please check the attachment "Chrome Version Information.txt" to ensure your setup matched ours.
,
May 10 2018
I've matched my Android / Chrome versions to yours and tried to reproduce for quite some time - I'm unable to trigger the laggy behavior you're seeing. Can you please provide a bit more info: - Does this behavior typically happen quickly? Or do I have to scroll around for some time to trigger it? - Do I need to do anything other than navigate to naver.com and start scrolling? And just to confirm, you aren't running with DevTools open when you see this? DevTools can lead to slowdowns such as what you're describing.
,
May 11 2018
ericrk@ Could you please try below steps? 1. Scroll up&down, up&down a bit at slow speed 2. Suddenly speed up the scrolling 3. Again scroll up&down, up&down a bit at slow speed And I was not running with DevTools when I saw this issue.
,
May 17 2018
I'm ooo for the next couple weeks. Vikassoni@, can you please get a nexus5x and attempt to reproduce this issue using the steps in #59.
,
May 22 2018
Sure. i am looking at it.
,
May 23 2018
I tried to repo according to comment #59 on - Device : Nexus5X - OS : Android 8.1.0 - Chrome : 66.0.3359.106 i couldn't reproduce it after multiple attempts.
,
May 24 2018
vikassoni@ the frequent rate is 1 out of 10 trials. How many times did you attempt to reproduce it?
,
May 24 2018
seiyon.park, can you take before and after traces like you did in #33 with all default categories but NOT the disabled-by-default categories (right side pane)? The disabled-by-default categories affect performance substantially and bloat the trace very quickly. For example in the M63 trace, GPUStateTracer::TakeSnapshotWithCurrentFramebuffer takes up a lot of time inside SwapBuffers in the GPU process, and that causes the rest of compositing pipeline to be stalled (cc::Scheduler::pending_submit_frames).
,
May 24 2018
Hi seiyon, I tried more than 20 times but couldn't reproduce it. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by msrchandra@chromium.org
, Oct 24 2017Labels: Needs-triage-Mobile Needs-Feedback Triaged-Mobile