Issue metadata
Sign in to add a comment
|
Country code drop down takes longer to close in Amazon Shopping app.
Reported by
ngo...@etouch.net,
Jan 13 2017
|
|||||||||||||||||||||||||
Issue descriptionDevice name: ASUS_Z010D /(MMB29P), Samsung Galaxy J5/MMB29M, Pixel XL/(NDE63V), Gionee F103/(LRX21M), Samsung Galaxy J2/(LMY47X), Nexus 5X/(NRD90R) Application: Amazon Shopping Application Version: version 8.9.3.300 WebView version: 57.0.2980.0 Steps to reproduce: 1. Clear data of Amazon App >> Launch Amazon app >> Click on "New to Amazon.in? Create an account". 2. On Sign up page click on Country Code drop down >> Scroll up and down quickly and while it scrolls, close the drop down. 3. Observe. Frequency : 3/3 This is a regression issue broken in 'M-56' and below is the bisect range. Good build: 56.0.2913.0 Bad build : 56.0.2915.0 Bisect Range: https://chromium.googlesource.com/chromium/src/+log/56.0.2913.0..56.0.2915.0?pretty=fuller&n=10000 Actual result: Drop down takes longer duration to close. Expected result : Drop down should close down quickly.
,
Jan 18 2017
It is regression and Amazon is top App. Marking as RBS for review. Thanks!
,
Jan 19 2017
I am suspicious this is an async draw related thingy.
,
Jan 19 2017
trace please also retest with --sync-on-draw-hardware
,
Jan 19 2017
adding the option did not help. Attached is the trace, is there any clue?
,
Jan 19 2017
Bo is off the hook. continueing my hunt.
,
Jan 20 2017
no luck yet, this thing is a time sink.
,
Jan 20 2017
Tima can you please bisect the range, thanks!!!
,
Jan 21 2017
The offending CL is https://codereview.chromium.org/2479023003
,
Jan 21 2017
I tested with Nexus 6P and O, angler-userdebug O ORG09B 3623808 dev-keys I do not thing the specific device and Android version matter in this case, just for the record.
,
Jan 21 2017
,
Jan 21 2017
thanks Tima!!. Amineer should decide the fate of this CL.
,
Jan 21 2017
,
Jan 21 2017
lanwei@, tdresser@, this appears to be a regression in M56 in the Amazon app related to flinging. We've only seen it in this one specific case, so it's possible that Amazon has some bad code in here. Can you PTAL and let me know what you think? We're building our M56 stable candidate next Tuesday at 5 PM, so we have very little time to address this if we think this is a serious bug and should require a code fix. FWIW, that seems unlikely - boliu@ and I spoke about this and it seems we should probably remove the RB-Stable label, but wanted to consult with you first.
,
Jan 21 2017
Issue 683449 has been merged into this issue.
,
Jan 23 2017
Lan - can you take a look at this? I agree that this is P1. It's reasonably likely that this is working as intended - prioritizing scrolling during fling over responding to main thread events, but it is surprising to me how badly this behaves. +dtapuska
,
Jan 23 2017
Sure, I am looking at it now.
,
Jan 23 2017
,
Jan 24 2017
I built a webview on a version of the branch_2924 without my CL, and used it on an Android Device with version 7.1.2 on the Amazon shopping app, I still see the regression. I also checked my CL again, first it was behind a flag, and it is enabled in M57, also the code did not change the code about fling, it is just set some flags if there is an active fling.
,
Jan 24 2017
Well, I redid my experiment, and so far the result remains the same (i.e. c#9). I have compiled master, not M56 branch, however with that CL (src/ has Cr-commit-position #430640) I often see the issue and after I revert this CL, recompile and reinstall both SystemWebView.apk and the Amazon app I do not see it. The flag enable_fling_passive_listener_flag_ is false. However, with this CL I start receiving a new disposition DID_NOT_HANDLE_NON_BLOCKING_DUE_TO_FLING which always comes when I dismiss the dialog. Can it have an effect?
,
Jan 24 2017
You are encountering the intervention on. M56 doesn't have the intervention on. So there must be two causes of this bug. It sounds a lot like a scheduling issue that a timer is getting throttled after the dismiss of the dialog. We need a bisect on the m56 branch.
,
Jan 24 2017
It is a pure speculation at this point, but can it be that we erroneously take the pressing of close button on the dialog as fling in input_handler_proxy.cc ?
,
Jan 24 2017
Since I'm coming a little late today, may I ask someone else to bisect on M56 branch?
,
Jan 24 2017
There is another CL https://codereview.chromium.org/2471523002/ which depends on this one, so I am not sure if there will be a problem when you just revert the CL. DID_NOT_HANDLE_NON_BLOCKING_DUE_TO_FLING is just used when the flag is enabled, so in M56, it would not change anything.
,
Jan 24 2017
Theoretically the return value of of InputHandlerProxy::HandleTouchStart() can change independently of the value of the flag.
,
Jan 24 2017
In my opinion, since we only see the issue in a narrow use case, we can punt this to M57. Test team should be able to help with further bisecting if needed.
,
Jan 24 2017
We found out the regression can be reproduced on chrome as well, goo.gl/VQrnFg, I will run a bisect on chrome, will update soon.
,
Jan 24 2017
No problem, we can move this CL to M57, thanks.
,
Jan 24 2017
If we can bisect on Chrome, let's get an exact bisect please and evaluate for M56.
,
Jan 24 2017
For what it's worth it seems the bug disappears if I comment the block that replaces return value DID_NOT_HANDLE with DID_NOT_HANDLE_NON_BLOCKING_DUE_TO_FLING in https://codereview.chromium.org/2479023003/diff/60001/ui/events/blink/input_handler_proxy.cc (only tried on master branch synchronized with the CL position). It may matter because of https://cs.chromium.org/chromium/src/content/renderer/input/input_event_filter.cc?rcl=0&l=244
,
Jan 24 2017
After much deliberation, we'll remove the RB-Stable here, since we've only seen this on Amazon's page. That said we should work to understand why this code is getting triggered, and what's causing the regression, ASAP.
,
Jan 25 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/67a6fb85a5ac46aea25fdba28706432648ecb9dd commit 67a6fb85a5ac46aea25fdba28706432648ecb9dd Author: amineer <amineer@chromium.org> Date: Wed Jan 25 01:37:07 2017 Fix missing case statement from branch merge A previous cherry-pick was missing a necessary case statement to trigger fling behavior. Fix this on the M56 branch. NOPRESUBMIT=true NOTRY=true BUG=680889 Review-Url: https://codereview.chromium.org/2650963004 Cr-Commit-Position: refs/branch-heads/2924@{#861} Cr-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059} [modify] https://crrev.com/67a6fb85a5ac46aea25fdba28706432648ecb9dd/content/renderer/input/input_handler_manager.cc
,
Jan 25 2017
skyostil@ what should be our policy for touchevents that are passive. It seems that this Amazon website triggers a setTimeout inside its pointerdown listener and it appears that the timer can get delayed. See the following code: https://cs.chromium.org/chromium/src/content/renderer/input/input_handler_manager.cc?type=cs&sq=package:chromium&rcl=1485289510&l=244 We only let the scheduler know when we have not consumed an event or it is a non-blocking fling (the cause of this bug was that this condition wasn't there). If it is a passive event we don't tell the scheduler about it at all. On this amazon site it schedules some animation timers it appears on pointerdown and I believe those get throttled somehow.
,
Jan 25 2017
I didn't throttle input events during passive handling because the whole point of passive events is to provide smooth compositor scrolling. I wonder if we need a policy whereby a setTimeout is dispatched inside an event handler of an input event then it is not delayed but compositor input still takes priority? Is this a situation the scheduler can handle today?
,
Jan 25 2017
Issue still repro on latest M56 - 56.0.2924.77. Thanks
,
Jan 25 2017
,
Jan 25 2017
Re: c#35, yeah, we don't expect my change to fix the regression, just to get branch back into a similar state as trunk. We'll have to continue to investigate this particular bug but it won't block M56.
,
Jan 25 2017
DelayedTimer.png is from M56 and NotDelayed.png is from M55. You can see that the timer scheduled is clearly 2s in the DelayedTimer vs 100ms in the NotDelayed.png skyostil@ any input regarding this?
,
Jan 25 2017
Here is a trace with renderer scheduler enabled
,
Jan 25 2017
Here is the trace with all categories
,
Jan 25 2017
Thanks for the trace! Looks like what's happening here is that the scheduler thinks the page has both touch handlers and expensive timers and is trying to improve main thread responsiveness by throttling timers. The problem is that those handlers are actually passive, so there's really no need to take extreme measures to try and make them complete quickly. I think the fix here is to avoid reporting passive touch handlers to the scheduler, since it should really be ignoring them in the first place. That basically boils down to passing false to this interface: https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/scheduler/renderer/render_widget_signals.h?rcl=0&l=34
,
Jan 25 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/a670b173bddc43c7ef4acd39a43e5d196b1b417d commit a670b173bddc43c7ef4acd39a43e5d196b1b417d Author: Alex Mineer <amineer@chromium.org> Date: Wed Jan 25 20:42:49 2017 Fix missing case statement from branch merge A previous cherry-pick was missing a necessary case statement to trigger fling behavior. Fix this on the M56 branch. NOPRESUBMIT=true NOTRY=true BUG=680889 R=lanwei@chromium.org Review-Url: https://codereview.chromium.org/2650963004 Cr-Original-Commit-Position: refs/branch-heads/2924@{#861} Cr-Original-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059} Committed: https://chromium.googlesource.com/chromium/src/+/67a6fb85a5ac46aea25fdba28706432648ecb9dd Review-Url: https://codereview.chromium.org/2650963004 . Cr-Commit-Position: refs/branch-heads/2924@{#867} Cr-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059} [modify] https://crrev.com/a670b173bddc43c7ef4acd39a43e5d196b1b417d/third_party/WebKit/Source/platform/fonts/FontCache.h [modify] https://crrev.com/a670b173bddc43c7ef4acd39a43e5d196b1b417d/third_party/WebKit/Source/platform/fonts/android/FontCacheAndroid.cpp [modify] https://crrev.com/a670b173bddc43c7ef4acd39a43e5d196b1b417d/third_party/WebKit/Source/platform/fonts/skia/FontCacheSkia.cpp [modify] https://crrev.com/a670b173bddc43c7ef4acd39a43e5d196b1b417d/third_party/WebKit/Source/platform/fonts/win/FontCacheSkiaWin.cpp
,
Jan 25 2017
Ignore c#42, that's a CP for a different issue that inadvertently got mixed in here due to a bad commit (sorry).
,
Jan 25 2017
Moving to M57 and setting release block stable.
,
Feb 22 2017
Just a reminder that M57 Stable launch is nearing. Since this is a release blocker, please get your fix in soon so it gets some soak time in Beta. Thanks.
,
Feb 27 2017
It is existing from M56 not a new regression and M57 stable release is very close by. So moving this bug to M58. Thanks!
,
Mar 20 2017
,
Jun 29 2017
we are able to repro this issue on latest M60: 60.0.3112.51 & M61: 61.0.3144.0, as M59 gone to stable moving to M61 Thanks!
,
Aug 9 2017
we are able to repro this issue on latest M62: 62.0.3180.0 moving to M62 Thanks!
,
Sep 27 2017
,
Dec 20 2017
we are able to repro this issue on latest M64: 64.0.3282.41 & M65: 65.0.3299.3, as M63 gone to stable moving to M65 Thanks!
,
Dec 20 2017
dtapuska@, not a blocker, but do we ever plan to fix this? If so, when? If not, can we Won't fix?
,
Dec 20 2017
We do plan on reworking some of the scheduler (and this is where the bug lies for this issue; so I don't want to loose it). But it will probably be a few months out.
,
Jun 27 2018
Issue still repro on latest M69: 69.0.3473.0 could you please look into this. Thanks!
,
Sep 8
No longer on the Chrome team, e-mail me @google.com if any attention still required from me here, otherwise good luck! |
||||||||||||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||||||||||||
Comment 1 by battun@chromium.org
, Jan 13 2017Status: Available (was: Unconfirmed)