Issue 849106

Starred by 7 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug

Blocking:
issue 463348



Wikipedia on mobile downloads CSS after the full HTML is downloaded (HTTP2)

Reported by phedens...@wikimedia.org, Jun 3 2018

Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_4) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36

Steps to reproduce the problem:
1. Access https://en.m.wikipedia.org/wiki/Barack_Obama on a mobile device using Chrome with bad connectivity
2. The page always finishes downloading the CSS only after the HTML is fully downloaded
3. Check out https://webpagetest.org/result/180603_33_9e7994d1e51c76609efda1b675849b51/1/details/#waterfall_view_step1 for an example

What is the expected behavior?
The page could have started to render earlier if the CSS could be downloaded earlier.

What went wrong?
The render-blocking CSS never finishes downloading before the HTML has finished downloading, meaning the first paint is pushed back.

Did this work before? N/A 

Does this work in other browsers? N/A

Chrome version: 66.0.3359.139  Channel: n/a
OS Version: OS X 10.13.4
Flash Version: 

This is a more generic HTTP/2 problem and I guess it is nginx's fault, but I still wanted to run it by you:

When we switched to SPDY and then HTTP/2, our users on slow internet connections got a worse experience than on HTTP/1. When we made the switch, Chrome prioritized the HTML as HIGHEST and CSS a little bit lower, but then Chrome changed so that CSS needed for rendering became HIGHEST. All good from Chrome's side. But for a user (see the WebPageTest example), the CSS never finishes loading until the HTML is finished for Wikipedia, so the browser cannot start to paint the screen until the CSS has finished downloading.

That means the more content (text) we have on a page, the slower the first paint will be, since the HTML download will be slower and push back the CSS.
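To make that scaling concrete, here is a rough back-of-the-envelope sketch. The sizes, bandwidth, and discovery offset are made-up illustration values, not measurements from Wikipedia, and the function names are hypothetical. It compares when the CSS completes if the server sends all of the HTML first with an idealized server that pauses the HTML as soon as the CSS request arrives.

```python
def css_done_after_html(html_bytes: int, css_bytes: int, bytes_per_s: float) -> float:
    """Seconds until the CSS is complete when the server sends all HTML first."""
    return (html_bytes + css_bytes) / bytes_per_s


def css_done_when_preempting(css_bytes: int, bytes_per_s: float,
                             discovered_at_s: float) -> float:
    """Seconds until the CSS is complete when the server pauses the HTML
    as soon as the CSS request arrives (idealized: no buffering delay)."""
    return discovered_at_s + css_bytes / bytes_per_s


BANDWIDTH = 50_000    # assumed 50 kB/s link
CSS_SIZE = 20_000     # assumed 20 kB stylesheet
DISCOVERED_AT = 1.0   # assumed: CSS is requested 1 s into the HTML download

for html_size in (100_000, 300_000, 600_000):
    late = css_done_after_html(html_size, CSS_SIZE, BANDWIDTH)
    early = css_done_when_preempting(CSS_SIZE, BANDWIDTH, DISCOVERED_AT)
    print(f"HTML {html_size // 1000:>3} kB: CSS done at {late:.1f} s (HTML first) "
          f"vs {early:.1f} s (CSS preempts)")
```

Under these assumptions the "HTML first" time grows linearly with the HTML size, while the preempting time does not depend on it at all.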

We use nginx on our side and, as I understand it, nginx chooses to send the HTML first and then the CSS (even though the recommendation from Chrome's side is that they are equally important).

How do you recommend we solve this: by pushing nginx, or is it something you could do on your side?
 
Labels: Needs-Triage-M66
We had an old issue where we discussed it before: https://bugs.chromium.org/p/chromium/issues/detail?id=586938

Comment 3 by tkent@chromium.org, Jun 3 2018

Components: -Blink>Network Blink>Loader Internals>Network>HTTP2

Comment 4 by tkent@chromium.org, Jun 3 2018

Blocking: 463348
Cc: pmeenan@chromium.org tbansal@chromium.org
I have no idea what Chrome could do better. Chrome requests both the main resource and critical CSS at the highest priority, and blocks rendering until the critical CSS gets loaded.  One idea could be to start rendering without waiting for style only when loading it takes really long, but I'm not sure if that would be good.

Adding a few folks in case they may have any ideas (pmeenan@ for loading perf / priorities in general, tbansal@ for perf on slow connections).
It's not so much a priority thing as it is the translation of priorities into HTTP/2 weights and dependencies.  Chrome pretty much schedules everything over http/2 in a linear order with later resources listing the earlier resources as their parent.

For the most part that works out best because you get things like js files delivered one after the other so that the compile/parse/execute can start earlier and run concurrently with the next download.  Same for css then js, etc.  It does fall apart somewhat when it comes to progressive image rendering and potentially with long html where a lot of the content is below the fold.

Nginx is just honoring the dependency chain that Chrome provides and delivering the resources in the order requested.

The only way I can think of to solve it is to build 2 parallel trees with the HTML resource on one branch and all of the page resources on another branch and weight them so that they can download in parallel.  I'm not sure the net layer has the knowledge it needs to be able to do that and it's also a bit risky.  You'd probably only want to do that for the render-blocking resources and have async scripts and low priority images wait until the html and render-blocking resources finish.

FWIW, Chrome requests the dependencies and weights but at the end of the day the server can do whatever it wants.  It would take modifying Nginx but the server could be changed to always make sure 2 streams are being actively delivered or to interleave HTML/CSS (and deliver images in parallel).

To some extent it's an artifact of moving away from the priorities that SPDY used into the dependency tree that http/2 uses.
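The difference between the linear chain and the two-branch tree described above can be sketched with a toy model of HTTP/2 (RFC 7540) prioritization: a stream only receives bandwidth once every stream it depends on has finished, and sibling streams share in proportion to their weights. This is not Chrome or nginx code; the `Stream` class, `bandwidth_shares`, and the 64/192 weights are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Stream:
    name: str
    parent: Optional[str]  # stream this one depends on; None = root
    weight: int = 16       # HTTP/2 default weight (RFC 7540)
    open: bool = True      # True while the response still has data to send


def bandwidth_shares(streams: List[Stream]) -> Dict[str, float]:
    """Return the fraction of bandwidth each sendable stream would get."""
    children: Dict[Optional[str], List[Stream]] = {}
    for s in streams:
        children.setdefault(s.parent, []).append(s)

    def has_work(s: Stream) -> bool:
        return s.open or any(has_work(c) for c in children.get(s.name, []))

    shares: Dict[str, float] = {}

    def distribute(parent: Optional[str], share: float) -> None:
        active = [c for c in children.get(parent, []) if has_work(c)]
        total = sum(c.weight for c in active)
        for c in active:
            portion = share * c.weight / total
            if c.open:
                shares[c.name] = portion      # an open stream starves its dependents
            else:
                distribute(c.name, portion)   # a finished stream hands its share down

    distribute(None, 1.0)
    return shares


# Linear chain: every later resource depends on the earlier one.
linear = [Stream("html", None), Stream("css", "html"), Stream("js", "css")]
# Two branches: HTML on one, page resources on the other (weights are arbitrary).
parallel = [Stream("html", None, weight=64),
            Stream("css", None, weight=192), Stream("js", "css")]

print(bandwidth_shares(linear))
print(bandwidth_shares(parallel))
```

In the linear case only the HTML is sendable until it completes. In the two-branch case the HTML and CSS are delivered concurrently, while the script still waits behind the CSS.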
pmeenan: It seems that when constructing the dependency tree, we are taking two resources that have the same priority and constructing the tree in a way that implies that one resource has lower priority than the other. That seems like a bug?

The only possible optimization I can think of is to load css before html, but that would work only on a subset of pages (user has visited the page before) etc., and would require considerable implementation effort.
It's more like - given equal priorities, resources are requested in the order they are discovered.  That way scripts with equal priority will arrive in the best order for the parser for example.  It also mirrors what we do for HTTP/1, just that there is no connection-level concurrency.

Comment 9 by b...@chromium.org, Jun 4 2018

Reflecting on comments #6 and #7:  Assume that resources A and B have equal priority, and C is lower priority.  One way to express this is using dependencies: C should depend on both A and B.  However, an HTTP/2 stream can only depend on at most one other stream, so the only way to express this is to make one of A and B depend on the other, and C depend on that.  This is exactly what Chrome does with resources that have equal priority.  And indeed it seems like document and stylesheet both have HIGHEST priority.

(The other way to express relations between A, B, and C would be with weights, and no dependencies, but since weight is a positive integer between 1 and 256, and the peer is expected to assign resources proportional to the weight, there is not enough resolution to express the current 6 RequestPriority buckets effectively.)
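As a minimal sketch of the single-parent constraint described above (not Chromium's actual implementation), the following orders requests by priority, breaks ties by discovery order, and makes each request depend on the one before it. The function name and the numeric priority values are assumptions for illustration; real requests also arrive incrementally rather than as one list.

```python
from typing import Dict, List, Optional, Tuple


def linear_dependencies(requests: List[Tuple[str, int]]) -> Dict[str, Optional[str]]:
    """Map each request name to the single stream it depends on.

    `requests` holds (name, priority) pairs in discovery order; a larger
    priority value means more important.
    """
    # sorted() is stable, so equal priorities keep their discovery order.
    ordered = sorted(requests, key=lambda r: -r[1])
    parents: Dict[str, Optional[str]] = {}
    previous: Optional[str] = None
    for name, _priority in ordered:
        parents[name] = previous
        previous = name
    return parents


# A and B have equal priority, C is lower; C was discovered before B.
print(linear_dependencies([("A", 5), ("C", 3), ("B", 5)]))
```

Here B ends up depending on A even though both have the same priority, which is the effect questioned earlier in the thread.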
Sort of.  I'm still not 100% clear how the group streams work in Firefox but they use phantom nodes to group requests and dependencies: http://bitsup.blogspot.com/2015/01/http2-dependency-priorities-in-firefox.html

That said, things between HTML and CSS are more difficult than, say, progressive images because the CSS isn't even discovered until the HTML starts streaming.  Interleaving the CSS streams in the H2 connection is going to strongly depend on the size of the outbound buffers on the server and how well it handles changing in-flight streams.

Comment 11 by b...@chromium.org, Jun 5 2018

So if CSS is required for first paint, but the entire HTML is not, would it not make sense to lower the priority of HTML resources from HIGHEST to HIGH?
That's a really good idea.  I'd actually recommend going even lower and put it at something like Low (blink)/Lowest (Net).  That will put it behind blocking scripts but ahead of async scripts and images.  That way any blocking resources that have already been discovered will take precedence over the remaining html.

The main question will be how well the reordering of dependencies for in-flight requests will go and if the html will get re-parented to the css correctly (on both the client and server sides).

Comment 13 by b...@chromium.org, Jun 6 2018

I have faith in both Chromium's network stack to correctly make lower priority resources "depend" (in the HTTP/2 stream dependency sense) on higher priority ones (directly or transitively), and in server implementation to mostly correctly interpret dependencies set by the client.
The statistics for first paint on mobile look like this for us:
https://grafana.wikimedia.org/dashboard/db/navigation-timing-by-platform?refresh=5m&panelId=26&fullscreen&orgId=1&var-metric=firstPaint&var-platform=mobile&var-users=anonymous 

P95 and p99 could potentially gain a lot if it's possible to lower the priority of the HTML so the CSS is downloaded earlier. Looking at different WebPageTest runs, if everything works OK, users that have a first paint of 10 seconds could cut that time almost in half. That would be wonderful!
bnc, you want to take this? or pmeenan?
and, yeah, thanks for the suggestion. This definitely looks worth experimenting with.
Labels: Triaged-ET TE-NeedsTriageHelp
Adding label TE-NeedsTriageHelp as testing the issue requires an nginx web server, which is out of TE scope.

Thanks...!!
Labels: -TE-NeedsTriageHelp -Triaged-ET -Needs-Triage-M66
Owner: pmeenan@chromium.org
Status: Assigned (was: Unconfirmed)
I'm happy to take a stab at it.  Implementation itself is trivial but it probably needs to go through a finch trial just to make sure it doesn't regress something unexpected.

IFrames come to mind but I don't think we have any metrics that would measure it and we've also been trying different ways to deprioritize iframes so maybe it's not a problem anyway.  The frame main HTML will now be prioritized lower than blocking scripts in the main document but that's probably a good thing.

Removing all of the triage labels since we know what is going on and what to do about it.

Comment 19 by b...@chromium.org, Jun 13 2018

Cc: csharrison@chromium.org
Components: -Internals>Network>HTTP2 Blink>HTML>Parser
Discussed with csharrison@ offline.  Changing the priority of all HTML requests would probably have unforeseen consequences.  A better option would be for the HTML parser to lower the priority once enough tokens have been parsed.  I'm adding Blink>HTML>Parser component for exploring this option.

It seems like the following path is already implemented, empowering the renderer to change priority of a request:
network service URLLoader::SetPriority()
ResourceSchedulerClient::ReprioritizeRequest()
ResourceScheduler::ReprioritizeRequest()
URLRequest::SetPriority()
URLRequestHttpJob::SetPriority()
HttpNetworkTransaction::SetPriority()
SpdyHttpStream::SetPriority()

The last links will be filled in by https://crrev.com/c/1098628 (in progress).  Since this takes care of the network stack side of this issue, I'm removing Internals>Network>HTTP2 component.

Comment 20 by b...@chromium.org, Jun 13 2018

Cc: b...@chromium.org
Also happy to defer to pmeenan here, as he has more experience with priorities here. My worry with always lowering HTML priority is that it puts Chrome in a position where we can accidentally block a new page from loading due to some less important background work that was not properly deprioritized.
What kind of unforeseen consequences?

At a lower priority, if nothing else is in-flight (since the document doesn't exist yet) the HTML will be the only resource in flight.  Once enough HTML has come down that the parser or preload scanner can discover scripts or styles in the head then those will be requested with a higher priority but they will block the main parser until they are fetched anyway (and the HTML is already in-flight so a decent amount of it will continue to stream even if the server pauses it to stream the higher-priority resources).

The only side-effects that come to mind are:

1 - The preload scanner won't discover resources later in the document until the earlier resources have completed and the HTML continues.  If you're on a slow enough connection that the HTML gets interrupted in the first place that is actually a good thing.

2 - Same-origin HTML in frames will get scheduled after blocking scripts and styles in the main document.

There are some subtle behavior differences if it is scheduled as LOW or LOWEST that probably make it worthwhile not going to LOWEST (only one LOWEST resource in flight on HTTP/1 connections until the critical resources have loaded) so moving down to the same level as blocking scripts probably makes sense but I REALLY don't think you want to do this scheduling in the parser.
Multi-tab loads talking to the same origin could also introduce some races as noted in #21 but mostly only if you cross the threshold into the "load one at a time".  We need to make sure the HTML priority stays high enough that it is scheduled to always be requested.

First pass until we revamp priorities in general should probably just be to knock it down a notch to be just below CSS and the same as parser-blocking scripts.  In practice that means compliant servers will only pause HTML to send CSS since the HTML will already be in-flight before parser-blocking scripts are requested.

Long-term it would be good to put them below parser-blocking scripts as well but I wouldn't recommend that right now.
pmeenan: I agree with you assuming the current architecture of priorities stays the same. I was imagining something might change in the future wrt the network service. Still, a field trial could be helpful.

I wasn't saying to do any scheduling in the parser. Just that the parser could tell the loader stack "I have enough work on my plate, feel free to deprioritize" once we have reached our speculative token limit or something.
Is it possible to somehow quickly verify whether lowering the priority of the HTML is helpful? I am wondering if the h2 server may have already pushed large chunks of HTML down to the kernel socket buffer. So by the time the CSS request comes in (after 1 RTT), the buffer may already contain most of the HTML, and it might cause HOL blocking for the CSS?
Owner: ----
Status: Available (was: Assigned)
I have a CL available that lowers the priority (flag-controlled) here: https://chromium-review.googlesource.com/c/chromium/src/+/1152989

Not landing it myself as it should probably be behind a finch trial and I am only at Google for another week or so and will lose access to finch and UMA after that (will still be contributing to Chrome as much as possible).

Marking the bug as available in case someone else wants to take the lead on landing a trial.

As far as h2 and socket buffers go, a lot of the newer servers use TCP_NOTSENT_LOWAT to minimize server-side buffering and allow for reprioritization on h2 connections: https://twistedmatrix.com/trac/ticket/9078
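For illustration, this is roughly what setting that option looks like at the socket level. It is a sketch, not what any particular server does: the helper name and the 16 KiB threshold are assumptions, and `TCP_NOTSENT_LOWAT` is only exposed on some platforms and Python builds, so the helper reports whether it could be applied. With the option set, the socket is only reported as writable once the unsent data queued in the kernel drops below the threshold, so the server keeps most pending data in user space and can still choose which stream to write next.

```python
import socket


def limit_unsent_buffer(sock: socket.socket, max_unsent_bytes: int = 16 * 1024) -> bool:
    """Try to cap the not-yet-sent data the kernel will queue for `sock`.

    Returns True if the option was applied, False if this platform or
    Python build does not expose or accept TCP_NOTSENT_LOWAT.
    """
    option = getattr(socket, "TCP_NOTSENT_LOWAT", None)
    if option is None:
        return False
    try:
        sock.setsockopt(socket.IPPROTO_TCP, option, max_unsent_bytes)
    except OSError:
        return False
    return True


with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
    print("TCP_NOTSENT_LOWAT applied:", limit_unsent_buffer(s))
```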
> A better option would be for the HTML parser to lower the priority once enough tokens have been parsed.

Would the start of <body> be a strong signal here?
Cc: y...@yoav.ws
Should be strong enough for this use case anyway.  There's a good chance that might be too late depending on how much of the stream is already in-flight.

A 2-phase lowering is probably best:

1 - Default HTML to be lower than CSS
2 - Once the body start tag is parsed lower it further to be below parser-blocking scripts.
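The two phases above could be sketched as a small state tracker. This is a hypothetical illustration, not the CL under discussion: the class names are invented, the mapping of "lower than CSS" to MEDIUM and "below parser-blocking scripts" to LOW is an assumption based on the levels mentioned in this thread, and a plain substring search for `<body` stands in for the real HTML parser.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Only the levels needed for this sketch; values are illustrative.
    LOW = 3
    MEDIUM = 4
    HIGHEST = 5


class HtmlPriorityTracker:
    """Hypothetical helper that tracks the priority of the main HTML request."""

    def __init__(self) -> None:
        self.priority = Priority.MEDIUM  # phase 1: start below CSS (HIGHEST)
        self.body_started = False
        self._tail = ""                  # kept so a tag split across chunks is still found

    def feed(self, chunk: str) -> Priority:
        """Consume the next chunk of HTML and return the current priority."""
        if not self.body_started:
            text = self._tail + chunk
            if "<body" in text.lower():
                self.body_started = True
                self.priority = Priority.LOW  # phase 2: below parser-blocking scripts
            else:
                self._tail = text[-4:]        # len("<body") - 1 characters
        return self.priority


tracker = HtmlPriorityTracker()
print(tracker.feed("<html><head></head><bo").name)  # body tag not complete yet
print(tracker.feed("dy><p>Hello</p>").name)         # body start seen across chunks
```

As noted above, whether the second step arrives in time depends on how much of the stream is already in flight when the reprioritization reaches the server.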
Cc: domfarolino@gmail.com
Status: Untriaged (was: Available)
Looks like this needs re-triage (hadn't realized #26). Temporarily moving back to Untriaged for triage.
Owner: kinuko@chromium.org
Have read through and looked into the CL-- I feel this can be Finched to see the effect. Concurrent background loading case is throttled (and can be deprioritized too), and MEDIUM priority looks okay to just experiment.

(Tentatively assigning this to me for triaging, but anyone can beat me)
Cc: kinuko@chromium.org
Owner: ksakamoto@chromium.org
Status: Assigned (was: Untriaged)
Looks like Sakamoto-san can take over the CL.
Trying out pmeenan@'s patch (#c26) locally.

On Linux desktop + network throttling (Slow 3G), I don't see the effect of the priority change.

Attaching screenshots. Devtools says the priority is Highest, but according to chrome://net-internals the main resource is requested with priority = "MEDIUM".

If I read correctly, this should have effect without Nginx-side modification, correct?

Attachments: flag-disabled.png (378 KB), flag-enabled.png (377 KB)
On Android with impaired network conditions (2G-emulated Wi-Fi), I'm still not sure whether it makes a difference.
Attachments: disabled.png (453 KB), enabled.png (453 KB)
FWIW, Dev Tools lies.  It has a default it uses for requests that gets used for navigations because they are browser-process-initiated.  It also doesn't show reprioritizations.

You're also not going to see any effect if you use Dev Tools traffic simulation because it applies the shaping between the network stack and the renderer, not on the actual network.  You'll need to use one of these tools depending on the OS you're on: https://calendar.perfplanet.com/2016/testing-with-realistic-networking-conditions/

Finally, you might still not see any impact if the server has already written all of the HTML into the output buffers by the time it learns about the CSS.
Thanks for pointing out the caveats of DevTools.

#c35 was tested not w/ DevTools throttling, but on a slow (throttled) Wi-Fi network. Attached net-internals logs are taken with the same settings (Nexus 6P, throttled Wi-Fi).

From the HTTP2 session logs, without Patch:
- The main resource is requested with weight = 256
- The stylesheet is requested as a dependent of the main resource

With patch:
- The main resource is requested with weight = 220
- The stylesheet is not dependent on any other stream

But in both cases en.m.wikipedia.org seems to start sending stylesheet response before the main resource completes.

AFAICT, Chromium side change is working as intended. Maybe we can go ahead with a finch experiment and see how it works in the wild?

Attachments: net_internals_log-prio-medium (1.1 MB), net_internals_log-prio-highest (840 KB)
Cc: jakearchibald@chromium.org
Labels: Needs-Feedback
Taking a closer look at the WebPageTest result in #0 (https://webpagetest.org/result/180603_33_9e7994d1e51c76609efda1b675849b51/1/details/#waterfall_view_step1), it looks like first byte of the stylesheet resource arrives before the main resource completes.

phedenskog@, does that match what you originally observed? Are there any recent server-side changes that might have changed the priority handling behavior?

I think the TTFB for the CSS and the CSS bar in the waterfall don't give the whole story. The problem is that the CSS is never fully downloaded before the HTML is downloaded, so the browser cannot start to render since the CSS is render blocking.

No changes on the server side. 
The TTFB is largely the only thing the browser can help influence.  If the server knows about the CSS and it is a higher priority than the HTML then it's the one responsible to pause the HTML stream until the CSS stream completes (particularly since they are both set as "exclusive").

Now being on the server side of things I'll just point out that HTTP/2 prioritization is a mess in general so don't just assume the server is doing it right and can respond quickly to changes:

1 - It needs to support prioritization, dependencies and the exclusive bit.
2 - It needs to manage the send buffers on the server so it can respond to priority changes while responses are in-flight.  Otherwise it will fill the output buffers with the HTML before it even learns about the CSS.

#2 doesn't look like it's necessarily the problem since the HTML overlaps with the CSS stream.

I'm planning on improvements to how WPT shows HTTP/2 streams so you can see how each chunk of data is delivered but the netlog can provide the same data in raw form.

As it is right now, whatever is terminating the HTTP/2 connections for wikipedia doesn't look like it is handling the dependency tree that Chrome is sending (maybe just the exclusive bit isn't honored and it is weighting the 2 streams).
I tried to create a reduced case: https://large-html-css-test.glitch.me/.

Chrome https://www.webpagetest.org/video/compare.php?tests=180911_M5_ded9cb38cddf6bab1e28e6aa6e113b64-r:3-c:0
Edge https://www.webpagetest.org/video/compare.php?tests=180911_Y6_93cffdf6c791069948ae3c13b8cfe3d6-r:1-c:0
Firefox https://www.webpagetest.org/video/compare.php?tests=180911_TX_9346c83011f09c6fd69ecf4cc015c6ac-r:3-c:0

I'm seeing the same issue across all browsers, but it seems like the request for the CSS goes out pretty early, which makes me think the server is doing a bad job of serving.

Is this the same pattern we're seeing from Wikipedia? If so, it seems like this is something we should take up with nginx (and other servers too I guess).
https://github.com/jakearchibald/h2-priority-test - I've been playing around with the Glitch site using Node's HTTP/2 implementation.

In the test, I make a request for the HTML, and when I see `<link rel="stylesheet"` I request the CSS. I've throttled the reads to around 20k/sec, although I haven't throttled the writes. I also don't know enough about Node's networking stack to know if this method of throttling is realistic.

By default, the HTML and CSS seem to share the stream 50:50. Because the CSS is smaller, this means the CSS arrives sooner than the HTML, unlike browsers.

If I make the HTML request the parent of the CSS request, the stream is shared 52:48 in HTML's favour, making the CSS arrive slower, but still before the HTML.

If I remove the parent reference, but give the CSS a weight of 256, the stream is shared 41:59 in CSS's favour.

Setting weight on the HTML doesn't seem to do anything.

Setting parent + exclusive on the CSS stream doesn't seem to do anything.
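For comparison, here is what a strictly weight-proportional split would look like for two sibling streams. This is only arithmetic, not a model of Node's implementation, and it assumes the HTML stream was left at the HTTP/2 (RFC 7540) default weight of 16, which the test description does not state.

```python
from typing import Dict


def expected_shares(weights: Dict[str, int]) -> Dict[str, float]:
    """Fraction of bandwidth per stream if siblings share strictly by weight."""
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


DEFAULT_WEIGHT = 16  # RFC 7540 default; assumed for streams with no explicit weight

for label, weights in (
    ("both default", {"html": DEFAULT_WEIGHT, "css": DEFAULT_WEIGHT}),
    ("css weight 256", {"html": DEFAULT_WEIGHT, "css": 256}),
):
    shares = expected_shares(weights)
    print(label, {name: f"{share:.0%}" for name, share in shares.items()})
```

Under that assumption, a CSS weight of 256 would give the CSS roughly 94% of the bandwidth rather than the 59% observed, which may reflect that only the reads were throttled, so data could already have been buffered before the weights took effect.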
https://github.com/jakearchibald/h2-priority-test/compare/wiki

I've created a version that requests stuff from Wikipedia & tried to use the same stream priorities Chrome does.

I'm seeing the CSS fully arrive long before the HTML ends.
I've updated https://github.com/jakearchibald/h2-priority-test so it no longer performs throttling (safer to use OS-level throttling), and supports gzip. I'm still seeing CSS arrive before the HTML in the wiki example.

Sorry for the spam.
When we tested with Safari it looked better (we used it to compare with Chrome, but I forgot to add it to the original issue). I had another go and here's the result: https://webpagetest.org/result/180912_0D_1efbc45136f4e08380e4ea30266e4abc/1/details/#waterfall_view_step1
Server-side priority support....needs some work.

I just added displaying of the individual chunk timings to the WPT waterfall so it is easier to see how they are interleaved (or exclusive).

Here is Jake's test re-run: https://www.webpagetest.org/result/180913_A7_a5253404120d920a40dcbf455defc412/3/details/#waterfall_view_step1

From some tests I have set up, it looks like h2o honors the exclusive bit: https://www.webpagetest.org/result/180913_57_3e0ad0b32396fbd329380046509ccea1/1/details/#waterfall_view_step1

But Nginx does not (at least entirely): https://www.webpagetest.org/result/180913_4A_bd51fc30f1af297fac489355c465eb65/1/details/#waterfall_view_step1

Nginx also requires some tuning to get it to be able to honor priorities at all (an area I'm actively working on fixing right now).
Cc: ksakamoto@chromium.org
Owner: ----
Status: Available (was: Assigned)
Unassigning myself for now, since it is unlikely that the priority change (#c26) alone will fix the issue, without some server-side support.
