Large CSS/JS files download speed slow on Chromium only
Reported by
pikachu...@gmail.com,
Jun 27 2018
|
||||||
Issue descriptionUserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/67.0.3396.99 Safari/537.36 Example URL: https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-3429eed7918e4940829d5cfe724e13f9110114f853111b6f9c49add67d36b0aa.css Steps to reproduce the problem: 1. Launch Chrome 2. Load https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-3429eed7918e4940829d5cfe724e13f9110114f853111b6f9c49add67d36b0aa.css 3. Feel the slowness What is the expected behavior? The transfer time should be 200ms ~ 500ms on average What went wrong? Transfer time always > 500ms, if requested directly it can take up to seconds Did this work before? N/A Chrome version: 67.0.3396.99 Channel: stable OS Version: OS X 10.12.6 Flash Version: According to Cloudflare support response: ``` I was able to reproduce the issue you described and it seems that the issue relates to how Chrome downloads the content and saves it to cache rather than a bottleneck at Cloudflare. This can be checked from Chrome's net-internals (chrome://net-internals) page describing the request and how it's handling the response which I've attached in this ticket. Instead of downloading the entire content from the server then cache the response it seems to be doing it on the fly. ``` See attached screenshots for timings & request/response headers from different browsers including - Chrome 67 - Firefox 61 - Safari 11.1.1 The normal transfer time can also be confirmed with curl ``` $ curl 'https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-3429eed7918e4940829d5cfe724e13f9110114f853111b6f9c49add67d36b0aa.css' -H 'Connection: keep-alive' -H 'Pragma: no-cache' -H 'Cache-Control: no-cache' -H 'Upgrade-Insecure-Requests: 1' -H 'DNT: 1' -H 'User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3472.3 Safari/537.36' -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8' -H 'Accept-Encoding: gzip, deflate, br' -H 'Accept-Language: en' -H 'Cookie: __cfduid=d4ca69e8aaeff5c23321cc96d9e836c0a1530068899' -w " Content Type: %{content_type}\n HTTP Code: %{http_code}\n HTTP Connect: %{http_connect}\n Number Connects: %{num_connects}\n Number Redirects: %{num_redirects}\n Redirect URL: %{redirect_url}\n Size Download: %{size_download}\n Average Speed Download: %{speed_download}\n Size Upload: %{size_upload}\n SSL Verify: %{ssl_verify_result}\n Time Handshake: %{time_appconnect}\n Time Connect: %{time_connect}\n Name Lookup Time: %{time_namelookup}\nn Pretransfer: %{time_pretransfer}\n Time Redirect: %{time_redirect}\n Time Start Transfer: %{time_starttransfer}\n ----------\n time_total: %{time_total}\n\nEffective URL: %{url_effective}\n" --output /dev/null % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed 100 223k 0 223k 0 0 1370k 0 --:--:-- --:--:-- --:--:-- 1370k Content Type: text/css HTTP Code: 200 HTTP Connect: 000 Number Connects: 1 Number Redirects: 0 Redirect URL: Size Download: 228711 Average Speed Download: 1403134.000 Size Upload: 0 SSL Verify: 0 Time Handshake: 0.081291 Time Connect: 0.017964 Name Lookup Time: 0.014509 Pretransfer: 0.081376 Time Redirect: 0.000000 Time Start Transfer: 0.092065 ---------- time_total: 0.163514 Effective URL: https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-3429eed7918e4940829d5cfe724e13f9110114f853111b6f9c49add67d36b0aa.css ``` Events in chrome://net-internals/#events can be reproduced So I don't post the event log here
,
Jun 27 2018
OK, well, that's one way to triage it :D I think there were some earlier reports where there seemed to be slowness at higher level with a navigation; something like opening a big JSON?
,
Jun 27 2018
I added you to the CC list, didn't assign it to you. I mean, feel free to assign it to yourself if you want, just figure you're on triage, this reproduces, has a big impact in at least one case, so seems like a good thing to look into during your shift.
,
Jun 27 2018
Yeah, I assigned it to me, with the bonus of removing it from my triage queue, and putting it on my work queue. I am not sure I'll actually figure anything out, but it's worth a shot.
,
Jun 28 2018
Both direct requests and sub-resource requests are both slower than other non-Chromium browsers & curl Directly requests take more time than sub-resource requests (about 10x) to transfer the content I attached timings and net internal logs for - direct request with "Disable cache" enabled - sub-resource request with "Disable cache" enabled - direct request with "Disable cache" disabled (request still sent with 304 response) - sub-resource request with "Disable cache" disabled (directly read from disk cache)
,
Jul 2
So subresource case is bad (600-900ms) or so, and the direct one is horrible (8 seconds!). What was the subresource setup as, just <script src=>, or something fancier? Still looks like some higher-level-than-network-stack stuff, though. The disk cache level logs all look pretty nice...
,
Jul 3
```
<link rel=preload as=style href="//cdn.spacious.hk/{css_path}" media=all onload="this.onload=null;this.rel='stylesheet'"/>
<noscript>
<link rel=stylesheet media=all href="//cdn.spacious.hk/{css_path}"/>
</noscript>
```
With fallback script (which should not be run on Chrome) from
https://github.com/filamentgroup/loadCSS
But JS file loaded via script tag with async & defer is also slow
```
<script src="//cdn.spacious.hk/{js_path}" debug=false crossorigin=anonymous async defer></script>
```
,
Jul 9
Let me know if there is anything else is needed (More logs / test with some dev version etc.)
,
Jul 9
Well, I can't reproduce the subresource case with just above, so if you have a handy test page that shows the problem, it would be helpful. (I can reproduce the top-level one, so I'll dig into that, but it's likely to be less important).
,
Jul 9
The top level one may well just be slowness in formatting a really long string without any whitespace for display - I'm not even sure it's worth digging into.
,
Jul 9
That does seem likely to me as well (given that my debug builds are about 3x slower at it, too!), but I figured it would be a good excuse to look around... but then on further thought, navigation and resource paths are pretty different, aren't they?
,
Jul 9
Yea, they are. If you want to dig around the navigation code to better understand this case, don't let my comment stop you - I think it's good code to know, since it's rather network-stack-adjacent.
,
Jul 9
So in the top-level case, from browser side we're just stopped by flow control at MojoAsyncResourceHandler level.... time to learn what the renderer side peer to that is. This is much easier to see if I make URLRequest::LogBlockedBy log everything and not just first spot by commenting out this: https://cs.chromium.org/chromium/src/net/url_request/url_request.cc?rcl=aa72414951c47a474b12f873e6aed8ca677a4057&l=338 And it does seem likely that rendering is a huge factor --- if I switch away from that tab to net-internals, it renders in "only" about 3 sec rather than about 21, and flow control pauses look like this: t=208808 [st= 216] DELEGATE_INFO [dt=6] --> delegate_blocked_by = "MojoAsyncResourceHandler" rather than more like this: t= 9407 [st= 6226] DELEGATE_INFO [dt=574] --> delegate_blocked_by = "MojoAsyncResourceHandler" As resource load it looks OKish --- 100ms is bigger than it should be, but it's quite possibly just because I am on a slow debug build.
,
Jul 9
Yep, ~14-20ms on the subresource (from cache) in an optimized build, maybe it could be improved but in the not obviously brokenly slow range; so I will probably need a testcase...
,
Jul 9
And it does look like the slow-top-level loading was indeed due to rendering (though it's not as slow as reported in an optimized build) ---- the debug output I've added is reasonably compelling: [1:1:0709/155320.867204:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155320.876633:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155320.876827:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 0.148 ms [1:1:0709/155320.878606:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155320.880169:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155320.881106:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155320.881322:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.087516:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 206.218 ms [1:1:0709/155321.088540:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.088846:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.143878:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 55.019 ms [1:1:0709/155321.215692:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.215906:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.261556:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 45.683 ms [1:1:0709/155321.262881:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.263131:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.299000:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 35.887 ms [1:1:0709/155321.300314:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.300630:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.334622:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 34.005 ms [1:1:0709/155321.336240:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.336583:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.375002:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 38.462 ms [1:1:0709/155321.376762:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE [1:1:0709/155321.376991:ERROR:local_frame_view.cc(718)] 0x3fff12047d30 Begin LocalFrameView::PerformLayout [1:1:0709/155321.401373:ERROR:local_frame_view.cc(786)] 0x3fff12047d30 End LocalFrameView::PerformLayout 24.426 ms [1:1:0709/155321.402991:ERROR:url_response_body_consumer.cc(160)] 0x3cd9ae427c00 OnReceivedData:65536 sent to:N7content16WebURLLoaderImpl15RequestPeerImplE => The main thread is doing lots of re-layouts, which are kinda expensive, and so it's not reading from mojo. Anyway, the renderer people may care about that (or quite possibly already know about it), but seems like the subresource case is what we in network should indeed focus on, so a testcase that demonstrates it would be very much appreciate it, as I can't reproduce it by pasting the css in comment #7 into a file (with proper URL, of course)
,
Jul 10
I test again for direct access while that tab is inactive (looking at a blank tab when it's loading) And the content download time seems more normal now (~300ms) Also it seems the rendering also affects the content download time for sub-resource requests (~200ms longer) But it's still strange that the content download time for both requests from CDN and from file cache are similar (~500ms) Is the file cache too slow? Or the other operations for displaying the webpage is slowing down both requests? I am testing with https://www.spacious.hk/en/hong-kong
,
Jul 10
I test again for direct access while that tab is inactive (looking at a blank tab when it's loading) And the content download time seems more normal now (~300ms) Also it seems the rendering also affects the content download time for sub-resource requests (~200ms longer) But it's still strange that the content download time for both requests from CDN and from file cache are similar (~500ms) Is the file cache too slow? Or the other operations for displaying the webpage is slowing down both requests? I am testing with https://www.spacious.hk/en/hong-kong
,
Jul 10
Thanks, that I can reproduce (well, at least on stable package, dev tree doing other stuff at the moment)...
And while our cache isn't great, I think the slowness is above that --- for example the netlog for this says,
among other things, this:
t=207763 [st= 75] HTTP_CACHE_READ_DATA [dt=8]
t=207771 [st= 83] URL_REQUEST_JOB_BYTES_READ
--> byte_count = 32768
t=207771 [st= 83] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 6686
t=207949 [st= 261] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=207949 [st= 261] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208095 [st= 407] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208096 [st= 408] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208251 [st= 563] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208251 [st= 563] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208376 [st= 688] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208376 [st= 688] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208618 [st= 930] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208618 [st= 930] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208683 [st= 995] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=208683 [st= 995] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=209209 [st=1521] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=209209 [st=1521] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=209536 [st=1848] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 32768
t=209536 [st=1848] URL_REQUEST_JOB_FILTERED_BYTES_READ
--> byte_count = 11415
st = here is time in milliseconds since start of request.
This means that it took about 8ms to get a chunk of data (happens to be 32K) out of the cache.
Then brotli decompressed it to a very impressive amount over a bunch of reads from upper layer...
spread over about 1.8 seconds! I can't say why the reads are so spread out w/o a hacked up
dev build, though.
,
Jul 10
So it does look to me like, to use your words "the other operations for displaying the webpage is slowing down both requests", though in this case there is too much stuff going on in the renderer for me to know for sure (especially since I am not familiar with that area). Basically it looks like while the big stylesheet is still loading it also starts loading all the various ads scripts, social API widgets, starts executing them, etc., and it slows down the render's consumption of the CSS from the browser as those things take up CPU and compete for transferring data. Also, is the CSS supposed to be (when uncompressed) about 3.2MiB? (Dev note: it looks to just be woken up by mojo::SimpleWatcher, and it consumes data from the pipe immediately... should priorities matter there?)
,
Jul 11
We care more about transfer size than uncompressed size for static assets So we ignored the 3.2MiB size Going to move some not so important async script tag to bottom of body and test again
,
Jul 13
Tried again after moving most JS (except the app essential JS) to bottom of body There is still a long delay between some `ENTRY_READ_DATA` entries under `DISK_CACHE_ENTRY` See the text file with events logged during a caching enabled, tab inactive during loading request (removed events after CSS is loaded, like AJAX requests where t > 4817) CSS URL is https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-51285caf07cfc1ab2ae3cff6fa15a20b75e3b0e12797419506fd414a1098030d.css
,
Jul 16
It might help to be recording chrome traces (view using chrome://tracing) to do more debugging. Here's one with a 1.8s css load (https://cdn.spacious.hk/assets/apps/spacious_website/render_unblocking/phase_2-51285caf07cfc1ab2ae3cff6fa15a20b75e3b0e12797419506fd414a1098030d.css) As you can see, the main renderer thread is completely busy for the entire duration of the request, mostly running javascript. To me this indicates that we're not blocked on network but by other activity on the site. If you want to try to log one of these yourself, navigate to chrome://tracing and hit record. You can probably also manually curate categories, but "Web developer" might be good enough.
,
Jul 16
=> Loader, as the render's ability to consume data from the browser, rather than browser's ability to fetch it appears to be the issue.
,
Jul 18
This is all renderer-side. Deferring to the loader team.
,
Jul 19
Submitting my trace for web page load
,
Aug 6
Is there any more info should be provided to help the investigation?
,
Aug 20
Please let me know if there is any progress or more details is needed
,
Sep 10
Again, if there is any progress or any other info/testing is required Please let me know
,
Sep 19
We found out that switch back to sync stylesheet tag make the loading speed normal again So we decided to use sync stylesheet tag for chromium based browsers only (like Chrome, Opera, Vivaldi)
,
Oct 31
Trying to add more details here We use Chrome Timeline with option "Disable Cache" Unchecked And we record a profile after finishing one page load (so that cachced content should be used) We found that there are TWO Send Request events with different priority And the first "request" has less "Receive Data" events with bigger Encoded Data size The latter "request" still has many "Receive Data" events with Encoded Data size of 64KB / 32KB So this indicates there is something strange going on even when loading cached content See screenshots and saved profile in attachments |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by mmenke@chromium.org
, Jun 27 2018