Random timeouts in webkit_layout_tests
Issue description

Individual webkit_layout_tests have been flaking on some trybots (http://crbug.com/810254 for example), but it looks like there may also be random timeouts happening. Please see this tryjob: https://ci.chromium.org/buildbot/tryserver.chromium.linux/linux_chromium_rel_ng/642780

I'm 99% sure these layout test timeouts aren't caused by this revert:

* css3/filters/filter-repaint-composited-fallback.html
* external/wpt/IndexedDB/interleaved-cursors.html
* external/wpt/fetch/api/redirect/redirect-location.html
* fast/webgl/texImage-imageBitmap-from-offscreen-canvas-resize.html
* images/color-profile-layer-filter.html
* virtual/gpu/fast/canvas/canvas-filter-removed.html
* virtual/spv175/paint/invalidation/filters/filter-repaint-accelerated-on-accelerated-filter.html

(See https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/642780/layout-test-results/results.html )

Is there some inherent flakiness in the harness, or in the way the tests are run? Is it possible to make any progress on a random report like this one?
,
Feb 9 2018
,
Feb 12 2018
pwnall@ reworked the IndexedDB/interleaved-cursors.html test - issue 708175
,
Apr 5 2018
More timeouts: https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_ng/138027

* virtual/layout_ng_experimental/fast/multicol/infinitely-tall-content-in-outer-crash.html

Upon retry without patch, two different tests flaked:

* fast/events/hr-timestamp/input-events.html
* html/tabular_data/col_width_resizing_table.html

Anecdotally, I've seen a lot of random failures of the webkit_layout_tests suite on tryjobs recently, though I haven't canvassed the tryservers and accumulated a list. I think the situation is severe enough to warrant more engineers dropping other work and looking into this. Upgrading this to P1. Dirk, can you mobilize forces and figure out how we can get to the bottom of this?
,
Apr 6 2018
I just had two recent webkit_layout_tests failures on the win7_chromium_rel_ng trybot for a CrOS-only change. The following tests failed or flaked:

* fast/events/wheel/wheel-scroll-latching-on-scrollbar.html
* external/wpt/html/infrastructure/urls/resolving-urls/query-encoding/utf-16le.html
* virtual/mouseevent_fractional/fast/events/wheel/wheel-scroll-latching-on-scrollbar.html
* virtual/modern-media-controls/media/controls/modern/doubletap-to-jump-forwards-too-short.html

Combined with two failures on the same builder due to "not enough capacity" timeouts, this has been very frustrating. This is far from the first time that multiple win7 failures have delayed a CL by more than a day.
,
Apr 6 2018
@stevenjb - yes, the win failures have been particularly bad for a week or more.
,
Apr 6 2018
#5: Please include links to your try runs when reporting failures like these.

Note: the virtual/layout_ng_experimental/fast/multicol/infinitely-tall-content-in-outer-crash.html flake seemed like a real bug in that test, and the test was disabled in Issue 829181. Linking these together; if that bug is resolved then feel free to turn around the blocked on/blocking relationship.
,
Apr 7 2018
I think there might actually be something wrong with the harness – or something broken in content_shell. See this failed layout test run: https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/63996 (from definitely-unrelated CL https://chromium-review.googlesource.com/952778 ) and in particular this shard: https://chromium-swarm.appspot.com/task?id=3cb5b33dcc342f10&refresh=10&show_raw=1 and these layout test results: https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/63996/layout-test-results/results.html

The fast/webgl/texImage-imageBitmap-from-canvas-resize.html failure is particularly odd. The console output is truncated partway through the test run. This could mean a bunch of different things, but the most likely in my opinion is that the renderer process hung while running the test.

I wonder whether we could improve the layout test runner to try to force the renderer to crash in this case in a way that will produce a minidump which will be symbolized. Note that I've been procrastinating doing the same for the Telemetry harness in Issue 797368; in that case we've narrowed things down to either a browser process hang, or a hang of the renderer process's IO thread.
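To make the "force a crash instead of silently killing" idea concrete, here is a minimal sketch of the general technique, not the layout test runner's actual code. It assumes a POSIX system and assumes the hung process has a crash handler that turns SIGABRT into a dump; the function name `run_with_abort_on_timeout` and the `grace_s` parameter are hypothetical.

```python
import signal
import subprocess
import sys


def run_with_abort_on_timeout(cmd, timeout_s, grace_s=5.0):
    """Run cmd; if it exceeds timeout_s, abort it rather than kill it outright.

    Returns (returncode, timed_out). SIGABRT is used on the assumption that
    the child's crash handler (if any) writes a dump when it receives it;
    SIGKILL is only a fallback, because it cannot be caught and so never
    produces a dump.
    """
    proc = subprocess.Popen(cmd)
    try:
        return proc.wait(timeout=timeout_s), False
    except subprocess.TimeoutExpired:
        proc.send_signal(signal.SIGABRT)
        try:
            # Give the (assumed) crash handler time to write its dump.
            proc.wait(timeout=grace_s)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
        return proc.returncode, True


if __name__ == "__main__":
    # Stand-in for a hung renderer: a child that just sleeps.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_abort_on_timeout(hung, timeout_s=1.0))
```

On POSIX, a child terminated by a signal reports a negative return code, so the caller can tell an abort-on-timeout apart from a normal test failure.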
,
Apr 11 2018
Any chance we can stop running webkit_layout_tests and site_per_process_webkit_layout_tests until this is fixed? They failed in 41 of the last 200 builds on https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng?limit=200
,
Apr 11 2018
I manually and pseudo-randomly sampled some failures. Looks like there's a common theme around GPU/image/canvas, e.g. https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&tests=virtual%2Fgpu%2Ffast%2Fcanvas%2Fcanvas-filter (This is from the waterfall, not the CQ, but you can still see some flakes.) There might be some genuine regression.

It'd be a bad idea to disable the whole layout test suite. If we can't figure out the reason, I'd suggest disabling some individual layout tests or virtual test suites (e.g. virtual/gpu), which would also be overkill but much better than disabling all layout tests.
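The same kind of grouping can be applied mechanically to the seven timeouts listed in the issue description. This is only a rough sketch: the keyword list is an assumption made for illustration, not an official taxonomy, and substring matching on paths is a crude proxy for "GPU/image/canvas".

```python
# Timeouts copied from the issue description.
TIMED_OUT = [
    "css3/filters/filter-repaint-composited-fallback.html",
    "external/wpt/IndexedDB/interleaved-cursors.html",
    "external/wpt/fetch/api/redirect/redirect-location.html",
    "fast/webgl/texImage-imageBitmap-from-offscreen-canvas-resize.html",
    "images/color-profile-layer-filter.html",
    "virtual/gpu/fast/canvas/canvas-filter-removed.html",
    "virtual/spv175/paint/invalidation/filters/"
    "filter-repaint-accelerated-on-accelerated-filter.html",
]

# Assumed keywords that hint at a graphics-related test (hypothetical list).
GRAPHICS_KEYWORDS = ("gpu", "canvas", "webgl", "filter", "images")


def is_graphics_related(test_path):
    """Return True if any assumed keyword appears in the test path."""
    path = test_path.lower()
    return any(keyword in path for keyword in GRAPHICS_KEYWORDS)


graphics = [t for t in TIMED_OUT if is_graphics_related(t)]
other = [t for t in TIMED_OUT if not is_graphics_related(t)]

print(f"{len(graphics)} of {len(TIMED_OUT)} timeouts look graphics-related")
for test in other:
    print("not matched:", test)
```

With this keyword list, the two unmatched tests are the IndexedDB and fetch ones, which is consistent with the GPU/image/canvas theme while also showing that it does not explain every timeout.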
,
Apr 11 2018
Justin, can you own the task of triaging the fast/canvas/ timeouts, perhaps with bsalomon@ and fmalita@? If there's been some destabilization of accelerated 2D canvas and/or Skia integration then we need to get to the bottom of it ASAP. Thanks.
,
Apr 11 2018
I will go ahead and disable the flaky layout tests.
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
,
Apr 11 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e81b180a7ad87e07c103d2dce227cc666de9d148

commit e81b180a7ad87e07c103d2dce227cc666de9d148
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Wed Apr 11 22:57:11 2018

Build sheriff: disable a number of flaky layout tests on Linux.

TBR=junov@chromium.org
Bug: 831701
Bug: 831686
Bug: 831685
Bug: 831673
Bug: 831496
Bug: 831482
Bug: 831249
Bug: 831230
Bug: 829952
Bug: 829938
Bug: 818426
Bug: 818324
Bug: 810437
Change-Id: Id3f657e2c5d2d46456892069e61730689a5f733b
Reviewed-on: https://chromium-review.googlesource.com/1008411
Reviewed-by: Kevin Marshall <kmarshall@chromium.org>
Commit-Queue: Kevin Marshall <kmarshall@chromium.org>
Cr-Commit-Position: refs/heads/master@{#549964}

[modify] https://crrev.com/e81b180a7ad87e07c103d2dce227cc666de9d148/third_party/WebKit/LayoutTests/TestExpectations
,
Apr 13 2018
#21 didn't help. In the last 200 runs the suite failed 20 times, and each time around 20 different tests timed out.
,
Apr 14 2018
That doesn't mean that disabling some of the tests didn't help, but perhaps we need better data to know for sure.
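As a rough illustration of what "better data" could look like, the two figures quoted in this thread (41 failures in 200 builds on Apr 11, 20 failures in 200 runs on Apr 13) can be compared with a two-proportion z-test. This is a sketch under strong assumptions: that both counts come from the same builder and suites, that the two 200-run windows do not overlap, and that runs are independent. None of that is confirmed here, so the result is indicative at best.

```python
import math


def two_proportion_z(fail_a, runs_a, fail_b, runs_b):
    """Compare two failure rates with a pooled two-proportion z-test.

    Returns (rate_a, rate_b, z, two_sided_p_value).
    """
    rate_a = fail_a / runs_a
    rate_b = fail_b / runs_b
    pooled = (fail_a + fail_b) / (runs_a + runs_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / runs_a + 1 / runs_b))
    z = (rate_a - rate_b) / std_err
    # Two-sided p-value under the standard normal approximation.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value


# Counts quoted in the comments above; the pairing is an assumption.
rate_before, rate_after, z, p = two_proportion_z(41, 200, 20, 200)
print(f"before: {rate_before:.1%}  after: {rate_after:.1%}  z={z:.2f}  p={p:.4f}")
```

Even if the drop is statistically distinguishable, a failure rate around 10% with about 20 different tests timing out per failure still points at a cause that is not tied to any single test.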
,
May 10 2018
fast/events/hr-timestamp/input-events.html is still observed to flakily time out: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac10.10%20Tests/32008
,
May 11 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c5a218ff691d52844a51aeb0d16d5ca85b8654ea

commit c5a218ff691d52844a51aeb0d16d5ca85b8654ea
Author: Kenneth Russell <kbr@chromium.org>
Date: Fri May 11 01:05:22 2018

Suppress layout test flakes.

Failures on all platforms:
virtual/video-surface-layer/media/controls/modern/
  tap-to-hide-controls.html

Timeouts:
fast/events/hr-timestamp/input-events.html

Bug: 810437, 831720
Change-Id: I538493dc8a815e04effe5090ad432ad79fa477e8
Tbr: steimel@chromium.org
Tbr: skyostil@chromium.org
Reviewed-on: https://chromium-review.googlesource.com/1054849
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#557747}

[modify] https://crrev.com/c5a218ff691d52844a51aeb0d16d5ca85b8654ea/third_party/WebKit/LayoutTests/TestExpectations
,
May 23 2018
,
May 25 2018
,
May 29 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/4e103290aaecba63db670aead636ee8733e10e35

commit 4e103290aaecba63db670aead636ee8733e10e35
Author: Majid Valipour <majidvp@chromium.org>
Date: Tue May 29 15:22:53 2018

Fix flakiness in hr-timestamp/input-events.html test

Update the test to expect the difference between expected unclamped time with observed time to be at most 200us instead of 100us.

A recent change [1] made it so that the clamping can lead to up to 200us of adjustment for when we are computing a time values that is relative to time origin. The new logic, first clamps both the time origin and time value and then computes their delta while the old logic computed the delta first and then clamped it once.

## simplified old logic
    double clamped_time_in_seconds =
        ClampTimeResolution(monotonic_time - time_origin);

## simplified new logic
    double clamped_time_in_seconds =
        ClampTimeResolution(TimeTicksInSeconds(monotonic_time)) -
        ClampTimeResolution(TimeTicksInSeconds(time_origin));

Flakiness dashboard link: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=webkit_layout_tests&tests=fast%2Fevents%2Fhr-timestamp

[1] https://chromium-review.googlesource.com/c/chromium/src/+/849993/4/third_party/WebKit/Source/core/timing/PerformanceBase.cpp

TEST= locally ran the test 1000 times which all passed.
    run-webkit-tests --repeat-each=1000 fast/events/hr-timestamp/input-events.html

Bug: 846750, 810437
Change-Id: I69547484838c3b21bc3c15441baec287db0f5e8e
Reviewed-on: https://chromium-review.googlesource.com/1072466
Commit-Queue: Majid Valipour <majidvp@chromium.org>
Reviewed-by: Sami Kyöstilä <skyostil@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562430}

[modify] https://crrev.com/4e103290aaecba63db670aead636ee8733e10e35/third_party/WebKit/LayoutTests/fast/events/hr-timestamp/input-events.html
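The difference between "subtract, then clamp" and "clamp each, then subtract" can be seen in a small model. This sketch assumes the clamp is a plain round-down to a 100us grid and works in integer microseconds; Chromium's real ClampTimeResolution may do more than that, so the sketch does not reproduce the 200us figure from the commit message. It only shows that the two formulas can disagree by a full clamp step. All names below are hypothetical.

```python
import random

RESOLUTION_US = 100  # assumed clamp granularity, in microseconds


def clamp(t_us):
    """Round an integer microsecond value down to the resolution grid."""
    return (t_us // RESOLUTION_US) * RESOLUTION_US


def old_logic(monotonic_us, origin_us):
    """Subtract first, then clamp the delta once."""
    return clamp(monotonic_us - origin_us)


def new_logic(monotonic_us, origin_us):
    """Clamp both timestamps, then subtract."""
    return clamp(monotonic_us) - clamp(origin_us)


def survey(samples=100_000, seed=0):
    """Sample random timestamps and record how the two formulas differ."""
    rng = random.Random(seed)
    disagreements = set()    # distinct values of new - old
    worst_vs_unclamped = 0   # max |new - true delta|
    for _ in range(samples):
        origin = rng.randrange(0, 10_000_000)
        now = origin + rng.randrange(0, 10_000_000)
        old, new = old_logic(now, origin), new_logic(now, origin)
        disagreements.add(new - old)
        worst_vs_unclamped = max(worst_vs_unclamped, abs(new - (now - origin)))
    return disagreements, worst_vs_unclamped


if __name__ == "__main__":
    print(survey())
```

Under this simplified model the new formula lands either on the old result or one grid step above it, so a test tolerance tuned against the old formula can be exceeded intermittently, depending on where the time origin falls within its grid cell.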
,
May 29 2018
,
May 29 2018
,
May 29 2018
Blur filters on GPU seem to be particularly slow for some reason. I've "fixed" the test (which doesn't depend on blur filters existing) and created a separate bug to track the blur slowness. I don't know who I should assign this back to.
,
May 29 2018
Could this be related to the switch to SwiftShader for GPU layout tests?
,
May 29 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c

commit 5c777ff51727a999e4a5b5f0dbc95faa7e19e53c
Author: Fernando Serboncini <fserb@chromium.org>
Date: Tue May 29 21:03:31 2018

Fix test timeout on Canvas tests

This fixes the test, but not the underlying problem. A separate bug has been created to track this

TBR=junov
Bug: 810437, 847594
Change-Id: Id142b936af3378385b4fcc199ed193f5a0fb2241
Reviewed-on: https://chromium-review.googlesource.com/1077098
Reviewed-by: Fernando Serboncini <fserb@chromium.org>
Commit-Queue: Fernando Serboncini <fserb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562583}

[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/SlowTests
[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/fast/canvas/canvas-filter-removed-expected.html
[modify] https://crrev.com/5c777ff51727a999e4a5b5f0dbc95faa7e19e53c/third_party/WebKit/LayoutTests/fast/canvas/canvas-filter-removed.html
,
Jun 18 2018
Ping on this bug - what remains to be done?
,
Jun 19 2018
I think this is done. The canvas ones were addressed, and so were input, media and IndexedDB. I was going to close this, but then felt weird about all those "blocked on" bugs. WDYT?
,
Jun 20 2018
The timeouts are not fully gone. Looking at a recent linux_chromium_rel_ng run [1], there are more than 60 flaky tests that time out. My suspicion is that there is a timing issue when content_shell is reused between tests.

A large part of the virtual/outofblink-cors/http/tests/fetch/workers/thorough/ suite times out consistently on the first run, and works on retry. http/tests in general suffer from flaky timeouts; outofblink-cors might be triggering the underlying cause consistently.

[1] https://test-results.appspot.com/data/layout_results/linux_chromium_rel_ng/122335/site_per_process_webkit_layout_tests%20%28with%20patch%29/layout-test-results/results.html
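The "times out on the first run, passes on retry" pattern can be counted per directory once per-attempt results are available. The sketch below assumes each test maps to a space-separated string of attempt outcomes such as "TIMEOUT PASS"; that input shape is an assumption, and the sample file names are hypothetical placeholders rather than tests taken from the linked run.

```python
from collections import Counter


def timeout_then_pass(results):
    """Return tests whose first attempt timed out and whose last one passed."""
    flaky = []
    for test, actual in results.items():
        attempts = actual.split()
        if len(attempts) > 1 and attempts[0] == "TIMEOUT" and attempts[-1] == "PASS":
            flaky.append(test)
    return sorted(flaky)


def count_by_directory(tests):
    """Count tests per containing directory."""
    return Counter(test.rsplit("/", 1)[0] for test in tests)


# Hypothetical sample data; only the directory names come from the thread.
SAMPLE = {
    "virtual/outofblink-cors/http/tests/fetch/workers/thorough/example-a.html": "TIMEOUT PASS",
    "virtual/outofblink-cors/http/tests/fetch/workers/thorough/example-b.html": "TIMEOUT PASS",
    "http/tests/example-c.html": "TIMEOUT TIMEOUT",
    "fast/canvas/example-d.html": "PASS",
}

flaky_tests = timeout_then_pass(SAMPLE)
for directory, count in count_by_directory(flaky_tests).most_common():
    print(count, directory)
```

A directory that dominates this count across many runs would support the suspicion that something about process reuse, rather than the individual tests, is responsible.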
,
Jun 26 2018
,
Jul 6
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/fa8b12920e8c2042e47816d004b113c155e81aee

commit fa8b12920e8c2042e47816d004b113c155e81aee
Author: Alexis Hetu <sugoi@google.com>
Date: Fri Jul 06 14:14:32 2018

Attempt to unmark some tests as slow/timeout

Now that SwiftShader is on Linux/Windows/MacOS and that a recent performance improvement has been landed in SwiftShader, verify which tests still require the Slow/Timeout markers and which don't.

TBR=kbr@chromium.org
Bug:chromium:24182 chromium:433711 chromium:763197 chromium:311482 chromium:243871 chromium:664857 chromium:9798 chromium:237270 chromium:241576 chromium:241869 chromium:246749 chromium:535478 chromium:363029 chromium:364225 chromium:552556 chromium:570656 chromium:584807 chromium:614910 chromium:791659 chromium:726075 chromium:808153 chromium:816045 chromium:693568 chromium:626703 chromium:703533 chromium:786641 chromium:799137 chromium:831686 chromium:831230 chromium:818324 chromium:810437 chromium:847205 chromium:848799 chromium:828962 chromium:849284 chromium:855055
Change-Id: I5d36d20bd87b234fefe4da3ea7e4af039c0188cb
Reviewed-on: https://chromium-review.googlesource.com/1102341
Reviewed-by: Alexis Hétu <sugoi@chromium.org>
Commit-Queue: Alexis Hétu <sugoi@chromium.org>
Cr-Commit-Position: refs/heads/master@{#572962}

[modify] https://crrev.com/fa8b12920e8c2042e47816d004b113c155e81aee/third_party/WebKit/LayoutTests/SlowTests
[modify] https://crrev.com/fa8b12920e8c2042e47816d004b113c155e81aee/third_party/WebKit/LayoutTests/TestExpectations
Comment 1 by e...@chromium.org
, Feb 8 2018