external/wpt/pointerevents/pointerevent_touch-action-span-test_touch-manual.html is leaking on Linux Trusty Leak |
||||||||
Issue descriptionFlakiness Dashboard: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=external%2Fwpt%2Fpointerevents%2Fpointerevent_touch-action-span-test_touch-manual.html&testType=webkit_layout_tests First leak observed here: https://ci.chromium.org/buildbot/chromium.webkit/WebKit%20Linux%20Trusty%20Leak/24140
,
Sep 12
It's possible, though I'd then be curious why it isn't causing leaks in every other test that processes touch events. But the logs don't seem to give a lot of clues about what exactly is being leaked. In the case of my CL, the thing that might get leaked is if the queue memory isn't deallocated on shutdown, and possibly there are still (unhandled) elements in the queue. One of the other CLs, https://chromium-review.googlesource.com/c/chromium/src/+/1221312, seems to be modifying Blink's leak detector ... You can try rolling back my change if that will help diagnose it.
,
Sep 13
I tried to run leak-detector for each commits: e7b8fba33998e4c815487ed816b35efb809e1ba1 x (590733) f53e53277f925c7ba970e1614b573d31a746acc8 x (590716) 3ae41d050d4e3c6c878dc98b80c95765ec84df26 x (590714) 9cf33431b642cc060b29d4ebee927973cc0ae433 o (590713) Suspect 3ae41d050d4e3c6c878dc98b80c95765ec84df26: [bindings] Allow passing a stack state when requesting a testing GC https://chromium-review.googlesource.com/1221312 causes this flakiness.
,
Sep 13
The CL just moves the GC that is executed with a stack to the task where it is executed without stack. The number of GCs should stay the same as the count is increased by one. The detector is executed on a gazillion of tests, right? I don't see why just one would fail.
,
Sep 13
In fact, each of those GC calls triggers 5 rounds of GC, so we have 15 GCs from the event loop...
,
Sep 13
+haraken, may be you have a hint to check. So far I cannot reproduce locally.
,
Sep 13
,
Sep 13
I think, it is easier to use fast/events/pointerevents/ instead of external/wpt/pointerevents/. I retried. I ran: for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50; do ./third_party/blink/tools/run_web_tests.py --seed 4 --no-show-results --enable-leak-detection fast/events/pointerevents/ ; done 9cf33431b642cc060b29d4ebee927973cc0ae433 PASS 3ae41d050d4e3c6c878dc98b80c95765ec84df26 LEAK If using --batch-size=1, i.e. ./third_party/blink/tools/run_web_tests.py --seed 4 --no-show-results --enable-leak-detection --batch-size=1 fast/events/pointerevents/ 3ae41d050d4e3c6c878dc98b80c95765ec84df26 PASS --batch-size is important to reproduce this.
,
Sep 13
I've narrowed down the problem to the removal of V8GCController::CollectAllGarbageForTesting from BlinkLeakDetector::PerformLeakDetection. Also increasing the number_of_gc_needed_ to 4 seems to fix the issue. This is clearly just a timing issue. Judging from the fact that all the tests are pointer related, I am guessing the root cause is some kind of timing issue with synthesized pointer events. I will add the tests to LeakExpectations and look into it in the future.
,
Sep 13
Thanks for the repro, that helped a lot! I think this has to do with the fact that Oilpan is scheduled as a followup to V8. In the initial state we had: 1. V8 GC 2. Schedule Oilpan followup 3. Schedule timer In the new one we have 1. schedule timer 2. Checks for counter: e.g. schedule timer again if counter > 1, or report results 3. V8 GC 4. Schedule Oilpan followup I think this way the timer may overtake the scheduled Oilpan GC which means that for the last round we report results even though another GC is scheduled. Iff the detector is designed to exactly require this amount of GCs then this can flake. I was not able to reproduce the flake when re-ordering things a bit. Will send out a CL to involved people.
,
Sep 13
Keishi's observation makes also sense: Increasing the counter by one will mean that we get at least the amount of Oilpan GCs as with the old behavior.
,
Sep 13
Issue 883567 has been merged into this issue.
,
Sep 13
Issue 883708 has been merged into this issue.
,
Sep 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ce9d2392108f5591ac9f8e109845706884b2e272 commit ce9d2392108f5591ac9f8e109845706884b2e272 Author: Michael Lippautz <mlippautz@chromium.org> Date: Thu Sep 13 15:27:12 2018 leak detector: Reorder GC and scheduling new timer Bug: 883442 Change-Id: Id96e5173b709e1dd9ce98f273e6cda055c414032 Reviewed-on: https://chromium-review.googlesource.com/1224550 Reviewed-by: Kentaro Hara <haraken@chromium.org> Commit-Queue: Michael Lippautz <mlippautz@chromium.org> Cr-Commit-Position: refs/heads/master@{#591018} [modify] https://crrev.com/ce9d2392108f5591ac9f8e109845706884b2e272/third_party/blink/renderer/controller/blink_leak_detector.cc
,
Sep 14
Thanks. Looks like r591018 fixed the issue. But I still don't understand the root cause I checked the amount of GC a) before r590714 = 10 Conservative 4 Precise b) after r590714 = 10 Conservative 4 Precise c) after r591018 = 15 Conservative 4 Precise I think r591018 solved the issue by doing 5 additional Conservative GCs. I don't see any difference in a) and b) in terms of amount of GCs. So I still think the root cause is some kind of timing issue. This kind of problem is usually a task being processed after leak detection.
,
Oct 3
The issue is still here. https://ci.chromium.org/buildbot/chromium.webkit/WebKit%20Linux%20Trusty%20Leak/24880 Reopening.
,
Oct 3
Sorry if that leak is unrelated.
,
Oct 3
Filed more specific leaking bug: https://bugs.chromium.org/p/chromium/issues/detail?id=891575
,
Oct 4
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by adithyas@chromium.org
, Sep 12