
Issue 904389


Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




http/tests/preload/delaying_onload_link_preload_after_discovery.html is flaky under heavy load

Project Member Reported by leszeks@chromium.org, Nov 12

Issue description

Running http/tests/preload/delaying_onload_link_preload_after_discovery.html in multiple content shells in parallel increases its flakiness from 0% to 50% (on my machine).

Example using GNU parallel:

2 concurrent tasks
$ parallel -j2 ': {}; out/Release/content_shell --run-web-tests --reset-shell-between-tests --single-process -- http://127.0.0.1:8000/preload/delaying_onload_link_preload_after_discovery.html 2>/dev/null | rg "PASS|FAIL"' ::: {1..100} | sort | uniq -c
    100 PASS Makes sure link preload preloaded resources are delaying onload after discovery

72 concurrent tasks
$ parallel -j72 ': {}; out/Release/content_shell --run-web-tests --reset-shell-between-tests --single-process -- http://127.0.0.1:8000/preload/delaying_onload_link_preload_after_discovery.html 2>/dev/null | rg "PASS|FAIL"' ::: {1..100} | sort | uniq -c
     49 FAIL Makes sure link preload preloaded resources are delaying onload after discovery assert_equals: expected 4 but got 5
     51 PASS Makes sure link preload preloaded resources are delaying onload after discovery

This flakiness also seems to be exposed by script-resource-controlled script streaming, with the same failure message. Tracing suggests that in this test, resources/slow-script.pl?delay=200 is loaded twice (presumably once for the link tag and once for the injected script tag).
 
A bit of printf-debugging suggests that the script is being reloaded because Resource::HasCacheControlNoStoreHeader() returns true (?!)

Positive case:
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (link preload, no matching preload)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kLoad
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (link preload matched)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kUse
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (not link preload, preload matched)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kUse
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (not link preload, no matching preload)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kUse
PASS Makes sure link preload preloaded resources are delaying onload after discovery

Negative case:
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (link preload, no matching preload)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kLoad
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (link preload matched)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kUse
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (not link preload, preload matched)
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kUse
URL: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 (not link preload, no matching preload)
DetermineRevalidationPolicy: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kReload because no-store
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kReload
RequestResource: http://127.0.0.1:8000/resources/slow-script.pl?delay=200 kLoad
FAIL Makes sure link preload preloaded resources are delaying onload after discovery assert_equals: expected 4 but got 5
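
One way to compare the two traces is to tally the policy reported on each RequestResource line. The following is only a minimal sketch: the helper name tally_policies and the regular expression are made up for illustration, and they assume the ad-hoc printf format quoted above rather than any real Chromium log format.

```python
import re
from collections import Counter

# Matches the ad-hoc debug lines quoted above, e.g.
# "RequestResource: http://.../slow-script.pl?delay=200 kLoad".
# This format comes from local printf-debugging, not from Chromium itself.
REQUEST_LINE = re.compile(r"^RequestResource: (\S+) (k\w+)$")


def tally_policies(trace: str) -> Counter:
    """Count how often each policy appears on RequestResource lines."""
    counts = Counter()
    for line in trace.splitlines():
        match = REQUEST_LINE.match(line.strip())
        if match:
            counts[match.group(2)] += 1
    return counts


URL = "http://127.0.0.1:8000/resources/slow-script.pl?delay=200"

# Policies copied from the positive and negative cases above.
positive = "\n".join(
    f"RequestResource: {URL} {policy}"
    for policy in ["kLoad", "kUse", "kUse", "kUse"]
)
negative = "\n".join(
    f"RequestResource: {URL} {policy}"
    for policy in ["kLoad", "kUse", "kUse", "kReload", "kLoad"]
)

print(dict(tally_policies(positive)))
print(dict(tally_policies(negative)))
```

The passing trace contains a single kLoad, while the failing trace contains a kReload followed by a second kLoad, which matches the observation that the script is loaded twice.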


Components: Blink>Loader
Project Member

Comment 3 by bugdroid1@chromium.org, Nov 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a765b3235540cd30146d7131c74fc4bf6429ffb1

commit a765b3235540cd30146d7131c74fc4bf6429ffb1
Author: Leszek Swirski <leszeks@chromium.org>
Date: Mon Nov 12 16:53:18 2018

Mark preload test as flaky

Mark the delaying_onload_link_preload_after_discovery layout test as
flaky, due to flakiness under heavy load.

Bug: 904389
Change-Id: I585fd5a02a7f94c98dbf10176f58069153c12b2c
Reviewed-on: https://chromium-review.googlesource.com/c/1331392
Reviewed-by: Hiroshige Hayashizaki <hiroshige@chromium.org>
Commit-Queue: Leszek Swirski <leszeks@chromium.org>
Cr-Commit-Position: refs/heads/master@{#607264}
[modify] https://crrev.com/a765b3235540cd30146d7131c74fc4bf6429ffb1/third_party/WebKit/LayoutTests/TestExpectations

There are four ResourceFetcher::RequestResource() calls, as listed in Comment #1:

1. Preload scanner (<link preload>?) => Adds the Resource to the preload list.
2. <link preload> is processed. => Hits the preload list. The Resource is re-added to the preload list.
3. <script> is processed. => Hits the preload list. The Resource is removed from the preload list.
4. Preload scanner (<script>?) => Flakily triggers kReload due to no-store.
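
The four steps above can be sketched as a toy model of the preload list. This is not Blink code: ToyFetcher, its request() method, and the policy_if_unmatched parameter are hypothetical names, and the rule that only link preloads register themselves in the list is a simplifying assumption taken from the step descriptions.

```python
class ToyFetcher:
    """Toy stand-in for the preload-list bookkeeping (hypothetical, not Blink)."""

    def __init__(self):
        self.preloads = {}  # url -> placeholder for the preloaded Resource
        self.trace = []     # policy chosen for each request, in order

    def request(self, url, is_link_preload, policy_if_unmatched):
        if url in self.preloads:
            # The request hits the preload list and reuses the Resource.
            resource = self.preloads.pop(url)
            if is_link_preload:
                # Step 2: a link preload puts the Resource back.
                self.preloads[url] = resource
            policy = "kUse"
        else:
            # No matching preload: the caller supplies the outcome, because
            # this toy does not model the revalidation decision itself.
            policy = policy_if_unmatched
            if is_link_preload:
                # Step 1: assumption that link preloads register themselves.
                self.preloads[url] = object()
        self.trace.append(policy)
        return policy


URL = "http://127.0.0.1:8000/resources/slow-script.pl?delay=200"
fetcher = ToyFetcher()
fetcher.request(URL, True, "kLoad")     # 1. preload scanner, link preload
fetcher.request(URL, True, "kLoad")     # 2. <link preload> processed
fetcher.request(URL, False, "kLoad")    # 3. <script> processed
fetcher.request(URL, False, "kReload")  # 4. preload scanner, failing run
print(fetcher.trace)
```

By Step 4 the preload list is empty, so the outcome of that request depends entirely on the revalidation policy rather than on preload matching.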

no-store is set in slow-script.pl to ensure that any merging of network requests comes from the preloading mechanism, not from the ordinary HTTP caches (disk/memory cache).

The root cause appears to be that the preload scanner triggers a resource request at Step 4 AFTER the main request is dispatched at Step 3.
Perhaps this is due to document.write()?

The facts that flakiness increases under heavy load and with the script streaming CL both suggest that the test flakes when the resource finish is delayed and the main thread is free to do something else (e.g. process preloading?).
Just for the record: the revalidation policy at Step 4 is flaky because it is:
- Sometimes kUse, https://codesearch.chromium.org/chromium/src/third_party/blink/renderer/platform/loader/fetch/resource_fetcher.cc?type=cs&sq=package:chromium&g=0&l=1380
- Sometimes kReload, if the response, including the no-store header, has already been received.
I think the test shouldn't rely on this request getting kUse, because I feel it should basically be kReload (due to no-store).
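
The two outcomes can be illustrated with a small sketch. It assumes, as described above, that the no-store header only becomes visible once the response has been received; ToyResource and step4_policy are hypothetical names, and the real DetermineRevalidationPolicy checks many more conditions than this.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ToyResource:
    # None until the response headers have arrived (simplifying assumption).
    cache_control: Optional[str] = None

    def has_cache_control_no_store(self) -> bool:
        return self.cache_control is not None and "no-store" in self.cache_control


def step4_policy(resource: ToyResource) -> str:
    """Hypothetical reduction of the Step 4 decision to the no-store check."""
    return "kReload" if resource.has_cache_control_no_store() else "kUse"


still_loading = ToyResource()                            # response not yet received
response_received = ToyResource(cache_control="no-store")

print(step4_policy(still_loading))
print(step4_policy(response_received))
```

The same request yields a different policy depending only on whether the response headers arrived before Step 4, which is why the result varies from run to run.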

The flakiness looks like it is due to a race between
- ResponseReceived() of the script (i.e. when the no-store header becomes visible in Blink, which can occur even before Step 2, or after Step 4) and
- the preload scanner at Step 4 (in Comment #4).
Anyway, this flakiness is not the root problem.
Cc: hirosh...@chromium.org csharrison@chromium.org kouhei@chromium.org
Owner: ----
Status: Available (was: Untriaged)
Done initial investigation, making the issue available.

+kouhei@ and csharrison@ as preload scanner experts.
Could you take a look? (See Comment #4)
Cc: yoavweiss@chromium.org
