New issue
Advanced search Search tips

Issue 883442 link

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 2
Type: Bug



Sign in to add a comment

external/wpt/pointerevents/pointerevent_touch-action-span-test_touch-manual.html is leaking on Linux Trusty Leak

Project Member Reported by adithyas@chromium.org, Sep 12

Issue description

Cc: wjmaclean@chromium.org
wjmaclean@: could https://crrev.com/c/1188436 have caused this leak? 
It's possible, though I'd then be curious why it isn't causing leaks in every other test that processes touch events. But the logs don't seem to give a lot of clues about what exactly is being leaked. In the case of my CL, the thing that might get leaked is if the queue memory isn't deallocated on shutdown, and possibly there are still (unhandled) elements in the queue.

One of the other CLs, https://chromium-review.googlesource.com/c/chromium/src/+/1221312, seems to be modifying Blink's leak detector ...

You can try rolling back my change if that will help diagnose it.
Cc: adithyas@chromium.org
Components: Blink>Bindings
Owner: mlippautz@chromium.org
I tried to run leak-detector for each commits:

e7b8fba33998e4c815487ed816b35efb809e1ba1	 x (590733)
f53e53277f925c7ba970e1614b573d31a746acc8	 x (590716)

3ae41d050d4e3c6c878dc98b80c95765ec84df26         x (590714)
9cf33431b642cc060b29d4ebee927973cc0ae433         o (590713)

Suspect 3ae41d050d4e3c6c878dc98b80c95765ec84df26:

    [bindings] Allow passing a stack state when requesting a testing GC
    https://chromium-review.googlesource.com/1221312

causes this flakiness.

The CL just moves the GC that is executed with a stack to the task where it is executed without stack. The number of GCs should stay the same as the count is increased by one.

The detector is executed on a gazillion of tests, right? I don't see why just one would fail.
In fact, each of those GC calls triggers 5 rounds of GC, so we have 15 GCs from the event loop...
Cc: haraken@chromium.org
+haraken, may be you have a hint to check. 

So far I cannot reproduce locally.
Cc: keishi@chromium.org
I think, it is easier to use fast/events/pointerevents/ instead of external/wpt/pointerevents/.

I retried.
I ran:
  for i in 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50; do ./third_party/blink/tools/run_web_tests.py --seed 4 --no-show-results --enable-leak-detection fast/events/pointerevents/ ; done

9cf33431b642cc060b29d4ebee927973cc0ae433  PASS
3ae41d050d4e3c6c878dc98b80c95765ec84df26  LEAK

If using --batch-size=1, i.e.

  ./third_party/blink/tools/run_web_tests.py --seed 4 --no-show-results --enable-leak-detection --batch-size=1 fast/events/pointerevents/ 

3ae41d050d4e3c6c878dc98b80c95765ec84df26  PASS

--batch-size is important to reproduce this.

Owner: keishi@chromium.org
I've narrowed down the problem to the removal of V8GCController::CollectAllGarbageForTesting from BlinkLeakDetector::PerformLeakDetection. Also increasing the number_of_gc_needed_ to 4 seems to fix the issue. This is clearly just a timing issue. Judging from the fact that all the tests are pointer related, I am guessing the root cause is some kind of timing issue with synthesized pointer events. I will add the tests to LeakExpectations and look into it in the future.
Thanks for the repro, that helped a lot!

I think this has to do with the fact that Oilpan is scheduled as a followup to V8. 

In the initial state we had:
1. V8 GC
2. Schedule Oilpan followup
3. Schedule timer

In the new one we have
1. schedule timer
2. Checks for counter: e.g. schedule timer again if counter > 1, or report results
3. V8 GC
4. Schedule Oilpan followup

I think this way the timer may overtake the scheduled Oilpan GC which means that for the last round we report results even though another GC is scheduled.

Iff the detector is designed to exactly require this amount of GCs then this can flake.

I was not able to reproduce the flake when re-ordering things a bit. Will send out a CL to involved people.

Keishi's observation makes also sense: Increasing the counter by one will mean that we get at least the amount of Oilpan GCs as with the old behavior. 
 Issue 883567  has been merged into this issue.
 Issue 883708  has been merged into this issue.
Project Member

Comment 14 by bugdroid1@chromium.org, Sep 13

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ce9d2392108f5591ac9f8e109845706884b2e272

commit ce9d2392108f5591ac9f8e109845706884b2e272
Author: Michael Lippautz <mlippautz@chromium.org>
Date: Thu Sep 13 15:27:12 2018

leak detector: Reorder GC and scheduling new timer

Bug: 883442
Change-Id: Id96e5173b709e1dd9ce98f273e6cda055c414032
Reviewed-on: https://chromium-review.googlesource.com/1224550
Reviewed-by: Kentaro Hara <haraken@chromium.org>
Commit-Queue: Michael Lippautz <mlippautz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#591018}
[modify] https://crrev.com/ce9d2392108f5591ac9f8e109845706884b2e272/third_party/blink/renderer/controller/blink_leak_detector.cc

Status: Fixed (was: Assigned)
Thanks. Looks like r591018 fixed the issue.

But I still don't understand the root cause
I checked the amount of GC
a) before r590714 = 10 Conservative 4 Precise
b) after r590714 = 10 Conservative 4 Precise
c) after r591018 = 15 Conservative 4 Precise
I think r591018 solved the issue by doing 5 additional Conservative GCs. I don't see any difference in a) and b) in terms of amount of GCs. So I still think the root cause is some kind of timing issue. This kind of problem is usually a task being processed after leak detection.
Status: Assigned (was: Fixed)
The issue is still here.
https://ci.chromium.org/buildbot/chromium.webkit/WebKit%20Linux%20Trusty%20Leak/24880
Reopening.
Sorry if that leak is unrelated.
Labels: -Sheriff-Chromium

Sign in to add a comment