
Issue 810437 link

Starred by 6 users

Random timeouts in webkit_layout_tests

Project Member Reported by kbr@chromium.org, Feb 8 2018

Issue description

Individual webkit_layout_tests have been flaking on some trybots (http://crbug.com/810254 for example) but it looks like there may be random timeouts happening. Please see this tryjob:

https://ci.chromium.org/buildbot/tryserver.chromium.linux/linux_chromium_rel_ng/642780

These layout tests timed out:

* css3/filters/filter-repaint-composited-fallback.html
* external/wpt/IndexedDB/interleaved-cursors.html
* external/wpt/fetch/api/redirect/redirect-location.html
* fast/webgl/texImage-imageBitmap-from-offscreen-canvas-resize.html
* images/color-profile-layer-filter.html
* virtual/gpu/fast/canvas/canvas-filter-removed.html
* virtual/spv175/paint/invalidation/filters/filter-repaint-accelerated-on-accelerated-filter.html

(See https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/642780/layout-test-results/results.html )

I'm 99% sure these aren't caused by this revert.

Is there some inherent flakiness in the harness, or in the way the tests are run? Is it possible to make any progress on a random report like this one?


 

Comment 1 by e...@chromium.org, Feb 8 2018

Components: -Blink Blink>Image Blink>Storage
Labels: Test-Layout
Status: Available (was: Untriaged)

Comment 3 by jsb...@chromium.org, Feb 12 2018

Components: -Blink>Storage
pwnall@ reworked the IndexedDB/interleaved-cursors.html test; see issue 708175.


Comment 4 by kbr@chromium.org, Apr 5 2018

Cc: atotic@chromium.org qyears...@chromium.org
Labels: -Pri-2 OS-Windows Pri-1
Owner: dpranke@chromium.org
Status: Assigned (was: Available)
More timeouts:

https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_ng/138027
* virtual/layout_ng_experimental/fast/multicol/infinitely-tall-content-in-outer-crash.html


Upon retry without patch, two different tests flaked:
* fast/events/hr-timestamp/input-events.html
* html/tabular_data/col_width_resizing_table.html

Anecdotally, I've seen a lot of random failures of the webkit_layout_tests suite on tryjobs recently, though I haven't canvassed the tryservers and accumulated a list. I think the situation is severe enough to warrant more engineers dropping other work and looking into this. Upgrading this to P1. Dirk, can you mobilize forces and figure out how we can get to the bottom of this?

I just had two recent webkit_layout_tests failures on the win7_chromium_rel_ng trybot for a CrOS-only change.

The following tests failed or flaked:

 fast/events/wheel/wheel-scroll-latching-on-scrollbar.html
 external/wpt/html/infrastructure/urls/resolving-urls/query-encoding/utf-16le.html
 virtual/mouseevent_fractional/fast/events/wheel/wheel-scroll-latching-on-scrollbar.html
 virtual/modern-media-controls/media/controls/modern/doubletap-to-jump-forwards-too-short.html

Combined with two failures on the same builder due to "not enough capacity" timeouts, this has been very frustrating.

This is far from the first time that multiple win7 failures have delayed a CL by more than a day.

@stevenjb - yes, the win failures have been particularly bad for a week or more.

Comment 7 by kbr@chromium.org, Apr 6 2018

Blocking: 829181
#5: Please include links to your try runs when reporting failures like these.

Note: the virtual/layout_ng_experimental/fast/multicol/infinitely-tall-content-in-outer-crash.html flake seemed like a real bug in that test, and the test was disabled in  Issue 829181 . Linking these together; if that bug is resolved then feel free to turn around the blocked on/blocking relationship.

Comment 8 by kbr@chromium.org, Apr 7 2018

Cc: jochen@chromium.org junov@chromium.org kainino@chromium.org jdarpinian@chromium.org
I think there might actually be something wrong with the harness – or something broken in content_shell. See this failed layout test run:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/63996

(from definitely-unrelated CL https://chromium-review.googlesource.com/952778 )

and in particular this shard:
https://chromium-swarm.appspot.com/task?id=3cb5b33dcc342f10&refresh=10&show_raw=1

and these layout test results:
https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/63996/layout-test-results/results.html

The fast/webgl/texImage-imageBitmap-from-canvas-resize.html failure is particularly odd. The console output is truncated partway through the test run. This could mean a bunch of different things, but the most likely in my opinion is that the renderer process hung while running the test.

I wonder whether we could improve the layout test runner to try to force the renderer to crash in this case in a way that will produce a minidump which will be symbolized. Note that I've been procrastinating doing the same for the Telemetry harness in  Issue 797368 ; in that case we've narrowed things down to either a browser process hang, or a hang of the renderer process's IO thread.

Any chance we can stop running webkit_layout_tests and site_per_process_webkit_layout_tests until this is fixed? They failed in 41 of the last 200 builds on https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng?limit=200
Cc: foolip@chromium.org
I manually and pseudo-randomly sampled some failures. Looks like there's a common theme around GPU/image/canvas. 

e.g. https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&tests=virtual%2Fgpu%2Ffast%2Fcanvas%2Fcanvas-filter
(This is from waterfall, not CQ, but you can still see some flakes.)

There might be some genuine regression.

It'd be a bad idea to disable the whole layout test suite. If we can't figure out the reason, I'd suggest disabling some individual layout tests or virtual test suites (e.g. virtual/gpu), which would also be overkill but much better than disabling all layout tests.

Comment 11 by kbr@chromium.org, Apr 11 2018

Cc: xlai@chromium.org fs...@chromium.org fmalita@chromium.org bsalomon@chromium.org
Owner: junov@chromium.org
Justin, can you own the task of triaging the fast/canvas/ timeouts, perhaps with bsalomon@ and fmalita@? If there's been some destabilization of accelerated 2D canvas and/or Skia integration then we need to get to the bottom of it ASAP. Thanks.

I will go ahead and disable the flaky layout tests.
Blockedon: 831686

Comment 14 by kbr@chromium.org, Apr 11 2018

Blockedon: 831720
Blockedon: 831685
Blocking: 831496
Blocking: 831482
Blockedon: 831249
Blockedon: 831230
Blockedon: 829952

Comment 21 by bugdroid1@chromium.org, Apr 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e81b180a7ad87e07c103d2dce227cc666de9d148

commit e81b180a7ad87e07c103d2dce227cc666de9d148
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Wed Apr 11 22:57:11 2018

Build sheriff: disable a number of flaky layout tests on Linux.

TBR=junov@chromium.org

Bug:  831701 
Bug:  831686 
Bug:  831685 
Bug:  831673 
Bug:  831496 
Bug:  831482 
Bug:  831249 
Bug:  831230 
Bug: 829952
Bug:  829938 
Bug: 818426
Bug: 818324
Bug:  810437 
Change-Id: Id3f657e2c5d2d46456892069e61730689a5f733b
Reviewed-on: https://chromium-review.googlesource.com/1008411
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549964}
[modify] https://crrev.com/e81b180a7ad87e07c103d2dce227cc666de9d148/third_party/WebKit/LayoutTests/TestExpectations

#21 didn't help.
In the last 200 runs the suite failed 20 times, and each time around 20 different tests timed out.
That doesn't mean that disabling some of the tests didn't help, but perhaps we need better data to know for sure.
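
One way to get the "better data" mentioned above would be to tally, per test, how often it timed out across a window of runs, before and after a batch of tests is disabled. The snippet below is only a rough sketch of that idea: the input shape (one dict per run, mapping test name to an outcome string), the test names, and the function names are all hypothetical, and it does not read the real layout-test results files.

```python
from collections import Counter

# Hypothetical sample: one dict per run, mapping test name -> outcome.
# The names and outcome strings are made up for illustration.
SAMPLE_RUNS = [
    {"suite/test-a.html": "TIMEOUT", "suite/test-b.html": "PASS"},
    {"suite/test-a.html": "TIMEOUT", "suite/test-b.html": "TIMEOUT"},
    {"suite/test-a.html": "PASS", "suite/test-b.html": "PASS"},
    {"suite/test-a.html": "PASS", "suite/test-b.html": "PASS"},
]


def tally_timeouts(runs):
    """Count, for each test, the number of runs in which it timed out."""
    counts = Counter()
    for results in runs:
        counts.update(
            name for name, outcome in results.items() if outcome == "TIMEOUT"
        )
    return counts


def timeout_rates(runs):
    """Return {test name: fraction of runs with a timeout}, worst first."""
    total = len(runs)
    return {name: n / total for name, n in tally_timeouts(runs).most_common()}


if __name__ == "__main__":
    for name, rate in timeout_rates(SAMPLE_RUNS).items():
        print(f"{rate:.0%}  {name}")
```

Comparing these rates for the 200 runs before and after a change like #21 would show whether the same tests keep timing out or whether the timeouts simply move to other tests.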

Comment 24 by kbr@chromium.org, May 10 2018

fast/events/hr-timestamp/input-events.html is still observed to time out flakily:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac10.10%20Tests/32008


Comment 25 by bugdroid1@chromium.org, May 11 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c5a218ff691d52844a51aeb0d16d5ca85b8654ea

commit c5a218ff691d52844a51aeb0d16d5ca85b8654ea
Author: Kenneth Russell <kbr@chromium.org>
Date: Fri May 11 01:05:22 2018

Suppress layout test flakes.

Failures on all platforms:
  virtual/video-surface-layer/media/controls/modern/
    tap-to-hide-controls.html

Timeouts:
  fast/events/hr-timestamp/input-events.html

Bug:  810437 , 831720
Change-Id: I538493dc8a815e04effe5090ad432ad79fa477e8
Tbr: steimel@chromium.org
Tbr: skyostil@chromium.org
Reviewed-on: https://chromium-review.googlesource.com/1054849
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#557747}
[modify] https://crrev.com/c5a218ff691d52844a51aeb0d16d5ca85b8654ea/third_party/WebKit/LayoutTests/TestExpectations

Blocking: 831847
Blockedon: 846750
FYI, created issue 846750 to investigate hr-timestamp/input-events.html flakiness.



Comment 28 by bugdroid1@chromium.org, May 29 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4e103290aaecba63db670aead636ee8733e10e35

commit 4e103290aaecba63db670aead636ee8733e10e35
Author: Majid Valipour <majidvp@chromium.org>
Date: Tue May 29 15:22:53 2018

Fix flakiness in hr-timestamp/input-events.html test

Update the test to expect the difference between the expected
unclamped time and the observed time to be at most 200us instead of 100us.


A recent change [1] made it so that clamping can lead to up to 200us
of adjustment when we are computing a time value that is relative
to the time origin. The new logic first clamps both the time origin and the time value
and then computes their delta, while the old logic computed the delta first and
then clamped it once.

## simplified old logic
double clamped_time_in_seconds = ClampTimeResolution(monotonic_time - time_origin);

## simplified new logic

double clamped_time_in_seconds =
      ClampTimeResolution(TimeTicksInSeconds(monotonic_time)) -
      ClampTimeResolution(TimeTicksInSeconds(time_origin));


Flakiness dashboard link: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&tests=fast%2Fevents%2Fhr-timestamp

[1] https://chromium-review.googlesource.com/c/chromium/src/+/849993/4/third_party/WebKit/Source/core/timing/PerformanceBase.cpp

TEST= locally ran the test 1000 times which all passed. run-webkit-tests --repeat-each=1000  fast/events/hr-timestamp/input-events.html

Bug: 846750,  810437 
Change-Id: I69547484838c3b21bc3c15441baec287db0f5e8e
Reviewed-on: https://chromium-review.googlesource.com/1072466
Commit-Queue: Majid Valipour <majidvp@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562430}
[modify] https://crrev.com/4e103290aaecba63db670aead636ee8733e10e35/third_party/WebKit/LayoutTests/fast/events/hr-timestamp/input-events.html
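
To make the clamping change in the commit message above concrete, here is a small sketch that compares the two formulas over a grid of inputs. It is written in Python rather than Chromium's C++, works in integer microseconds, and assumes that clamping simply floors a value to a multiple of 100us. That is an assumption for illustration, not a statement about what ClampTimeResolution actually does. The function names are hypothetical stand-ins.

```python
RESOLUTION_US = 100  # assumed clamp granularity, in microseconds


def clamp(t_us):
    """Floor a time in microseconds to a multiple of RESOLUTION_US."""
    return (t_us // RESOLUTION_US) * RESOLUTION_US


def old_logic(monotonic_us, origin_us):
    """Compute the delta first, then clamp it once."""
    return clamp(monotonic_us - origin_us)


def new_logic(monotonic_us, origin_us):
    """Clamp both values first, then compute the delta."""
    return clamp(monotonic_us) - clamp(origin_us)


def error_range(logic):
    """Smallest and largest (result - unclamped delta) over a grid of inputs."""
    errors = [
        logic(t, o) - (t - o)
        for o in range(0, 200)
        for t in range(1000, 1200)
    ]
    return min(errors), max(errors)


if __name__ == "__main__":
    print("old:", error_range(old_logic))
    print("new:", error_range(new_logic))
```

Under these assumptions the old formula only ever rounds the delta down, by less than 100us. The new formula can land on either side of the unclamped delta, so the window of possible results is roughly 200us wide, which is consistent with the test's tolerance being raised from 100us to 200us.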

Owner: fs...@chromium.org

Comment 30 by fs...@chromium.org, May 29 2018

Blockedon: 847594

Comment 31 by fs...@chromium.org, May 29 2018

Blur filters on gpu seem to be particularly slow for some reason.

I've "fixed" the test (which doesn't depend on blur filters existing) and created a separate bug to track the blur slowness.

Don't know who I should assign this back to.
Cc: sugoi@chromium.org
Could this be related to the switch to SwiftShader for GPU layout tests?

Comment 33 by bugdroid1@chromium.org, May 29 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c

commit 5c777ff51727a999e4a5b5f0dbc95faa7e19e53c
Author: Fernando Serboncini <fserb@chromium.org>
Date: Tue May 29 21:03:31 2018

Fix test timeout on Canvas tests

This fixes the test, but not the underlying problem. A separate bug
has been created to track this

TBR=junov

Bug:  810437 ,847594
Change-Id: Id142b936af3378385b4fcc199ed193f5a0fb2241
Reviewed-on: https://chromium-review.googlesource.com/1077098
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Fernando Serboncini <fserb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562583}
[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/SlowTests
[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/fast/canvas/canvas-filter-removed-expected.html
[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/fast/canvas/canvas-filter-removed.html

Ping on this bug - what remains to be done?

Comment 35 by fs...@chromium.org, Jun 19 2018

I think this is done. 
The canvas ones were addressed, so were input, media and indexedDB.
I was going to close this, but then felt weird about all those "blocked on" bugs. WDYT?

Comment 36 by kbr@chromium.org, Jun 20 2018

Blockedon: -831249 -846750 -831230 -829952 -831686 -831720 -847594 -831685
Blocking: 831685 831249 846750 831230 829952 831686 831720 847594
Status: Fixed (was: Assigned)
Let's just move the "blocked-on" bugs to the "blocking" list. Thanks for investigating this.

The timeouts are not fully gone. Looking at a recent linux_chromium_rel_ng run [1],
there are more than 60 flaky tests that time out.

My suspicion is that there is a timing issue when content_shell is being reused between tests. A large part of the
virtual/outofblink-cors/http/tests/fetch/workers/thorough/
suite times out consistently on the first run and works on retry.
http/tests in general suffer from flaky timeouts; outofblink-cors might be triggering the underlying cause consistently.


[1] https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/122335/site_per_process_webkit_layout_tests%20%28with%20patch%29/layout-test-results/results.html

Comment 38 by kbr@chromium.org, Jun 26 2018

Blocking: 856398

Comment 39 by bugdroid1@chromium.org, Jul 6

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fa8b12920e8c2042e47816d004b113c155e81aee

commit fa8b12920e8c2042e47816d004b113c155e81aee
Author: Alexis Hetu <sugoi@google.com>
Date: Fri Jul 06 14:14:32 2018

Attempt to unmark some tests as slow/timeout

Now that SwiftShader is on Linux/Windows/MacOS and that a recent
performance improvement has been landed in SwiftShader, verify
which tests still require the Slow/Timeout markers and which don't.

TBR=kbr@chromium.org

Bug: chromium:24182 chromium:433711 chromium:763197 chromium:311482 chromium:243871 chromium:664857 chromium:9798 chromium:237270 chromium:241576 chromium:241869 chromium:246749 chromium:535478 chromium:363029 chromium:364225 chromium:552556 chromium:570656 chromium:584807 chromium:614910 chromium:791659 chromium:726075 chromium:808153 chromium:816045 chromium:693568 chromium:626703 chromium:703533 chromium:786641 chromium:799137 chromium:831686 chromium:831230 chromium:818324 chromium:810437 chromium:847205 chromium:848799 chromium:828962 chromium:849284 chromium:855055

Change-Id: I5d36d20bd87b234fefe4da3ea7e4af039c0188cb
Reviewed-on: https://chromium-review.googlesource.com/1102341
Reviewed-by: Alexis Hétu <sugoi@chromium.org>
Commit-Queue: Alexis Hétu <sugoi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#572962}
[modify] https://crrev.com/fa8b12920e8c2042e47816d004b113c155e81aee/third_party/WebKit/LayoutTests/SlowTests
[modify] https://crrev.com/fa8b12920e8c2042e47816d004b113c155e81aee/third_party/WebKit/LayoutTests/TestExpectations
