Slow network traffic on Linux
Issue description

I tested different MTU sizes, since my IPv6 goes over a Hurricane Electric tunnel. No change: Chrome network I/O is still very slow.

Additional data: I used Wireshark to capture my TCP/IPv6 traffic to YouTube. FF works just fine, opening up a TCP window of 161K. Chrome connects, and then the TCP window size immediately drops to 275 bytes and eventually oscillates between 0 and 275 and values in between. Something about my Linux desktop, Chrome, and TCP is causing the TCP window size to drop. On a Fedora Linux laptop also running Chrome 54 (same network), I get a full 30 Mbps stream from YouTube. I used a tool called iperf to test throughput between my machine and another Linux box, and I got 900+ Mbps out of my computer.

For graphics-related bugs, please copy/paste the contents of the about:gpu page at the end of this report.

Filed on behalf of a forum user; please see more info at: https://productforums.google.com/forum/#!topic/chrome-admins/7KpyhKjuQ0E;context-place=forum/chrome-admins
,
Apr 25 2017
Hi! Thanks for the reply. The high-level summary is that with later versions of Chrome on my Linux desktop, all network I/O is dog slow, around 100s of KBps, compared with other browsers. I used YouTube as an example because I could use stats-for-nerds to see the bitrate of the stream. I really don't feel it's a graphics card issue. I used Wireshark to capture packet traces, and the TCP window size for Chrome TCP sessions often gets ratcheted way down to near zero. I don't think it's an IPv6 issue, as it happens with IPv4 websites as well. Bobby
,
Apr 25 2017
Hi! I uploaded some notes and a packet trace. Summary: I used iperf to test network I/O out of my system (to rule out my Linux fc24 system) and was able to get 915 Mbps network throughput. While streaming a 4k video from YouTube, using netstat and lsof, I could see that the FF TCP window remained open in the 161K range. The Chrome TCP window closes quickly and then vacillates between 10s and 100s. Bobby
,
Apr 25 2017
Upgraded all my systems to chrome 57 and still have same problem on my desktop.
,
Apr 25 2017
Hi! Just for completeness, below is the output of about:gpu attached as pdf Bobby
,
Apr 25 2017
Hi! I disabled QUIC and all page loads are still dog slow. I am still seeing really slow network I/O streams, as shown by the Chrome task manager and YouTube stats for nerds. Bobby
,
Apr 25 2017
Do you have the same results when running from an incognito window? How about when using a fresh profile? Try launching chrome with: --user-data-dir=SOME_TEMP_PATH_HERE Can you provide a network log (https://dev.chromium.org/for-testers/providing-network-details) both from a version of Chrome that is slow, and one where it goes quickly?
,
Apr 25 2017
+jri: Any thoughts on this?
,
Apr 26 2017
Hey, in the other bug stream, I had mentioned that I had done a complete re-install after first wiping out all cache, bookmarks, extensions, etc. Still problem -- slow network I/O to every web page. Bobby
,
Apr 26 2017
Hey, I will try incognito and fresh profile using say /tmp. I'll report this tomorrow. I'll also look at the providing-network-details. Thanks, Bobby
,
Apr 26 2017
Here is a dump, from net-internals, of me re-loading the YouTube page and trying to stream a 4k video. During the page load, stats for nerds says the stream slows to 175 KBps. (It's not a video card issue, though. YouTube is just used for streaming large files as a test.) Bobby
,
Apr 26 2017
Hi! I created a new incognito window and went to YouTube and loaded a 4k video of hiking in Yosemite. I was able to get a 35,000 Kbps stream!!! Why did incognito not have the b/w limitations that my regular mode did? I think you are on to something. Bobby
,
Apr 26 2017
Hi! Another self-disclosure. My home directory is NFS mounted. It's never mattered before, so I don't know if that's relevant here. Bobby
,
Apr 26 2017
Hi! google-chrome --user-data-dir=/tmp Running google-chrome this way gives me full bandwidth to the InterWebs. I will try wiping out my home-dir google-chrome files again and see if that works. (I tried this before but who knows.) Would an NFS home directory be an issue? Bobby
,
Apr 26 2017
Hi! OK. I exited chrome, completely removed .config/google-chrome, and re-started with a clean chrome. No change. b/w in regular mode is dog slow. Is there something in newer versions of chrome that doesn't like NFS mounted homedirs? That is, my .config/google-chrome is NFS mounted. Thanks, Bobby
,
Apr 26 2017
Hi! OK, problem solved. I moved .config/google-chrome and .cache/google-chrome to my desktop's local hard drive and the speed went back to normal. Question though: why does google chrome network I/O perform so dog slow when the home directory is NFS mounted? I realize an NFS mount is not as fast as local disk, but going from a 40 Mbps stream to a 1 Kbps stream seems extremely punitive. Before everyone writes this issue off as a "stupid-user-trick", I'm wondering if there are a few performance issues or bugs in how chrome writes cache files and lets that interfere with network I/O. Thanks, Bobby
,
Apr 26 2017
@bobby...: For better or worse, a cache write is inline in returning the data to the consumer (i.e. we wait until the data has been written to cache before we send it up the stack). That cache is in the user's home directory, and writing to an NFS-based home directory may result in delays. Looking at your net-internals log, as an example, for event 115654 the dt= values for the HTTP_CACHE_WRITE_DATA events total about 16s, i.e. that's 16s worth of delay in video streaming because of the cache. So yeah, the problem being an NFS home directory makes sense to me. I'm going to go ahead and close this ticket, but feel free to let me know if you're not satisfied with this resolution.
,
Apr 27 2017
Hi! Thanks for your comments. Yes, it's clear there is some perverse interaction between NFS and caching. However, my FF cache is on NFS. And, with earlier versions of chrome, I wasn't facing this problem. Plus, I am in the process of deploying a RAID 5 storage array and have been testing it on my desktop and measuring write and read speed. As part of that, I measured I/O speed to my NFS server (e.g. home directory) and I'm getting a lot faster write speed than Chrome is seeing. So, I'm guessing there is still some code that is specific to chrome that is not working efficiently. Maybe locking, maybe sync writes when async writes would be better, etc. I certainly don't want you guys to spend too much time on it, but I think it would be nice to mark this as an improvement request. My environment has not changed in a long time (fundamental network architecture and NFS servers are constant) but something in chrome did, and it caused a sudden drastic drop in network I/O. Thanks, Bobby
,
Apr 27 2017
One suggestion here though would be to only redirect your cache to a local disk and keep the profile on NFS for backup or roaming reasons. You can achieve this by using the --disk-cache-dir flag or the DiskCacheDir enterprise policy.
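A minimal sketch of a launcher wrapper for the --disk-cache-dir approach. The binary name `google-chrome` is taken from earlier in this thread; the helper `chrome_argv` and the example cache path are hypothetical and only for illustration. The profile stays wherever it already lives (for example on NFS); only the cache is pointed at a local directory.

```python
import os
import tempfile


def chrome_argv(cache_dir, binary="google-chrome"):
    """Build an argv that moves only the disk cache to cache_dir.

    cache_dir should be on a local disk; it is created if missing.
    The profile directory is left untouched.
    """
    os.makedirs(cache_dir, exist_ok=True)
    return [binary, f"--disk-cache-dir={cache_dir}"]


# Hypothetical local cache location, used here only as an example.
example_cache = os.path.join(tempfile.gettempdir(), "chrome-cache-example")
print(" ".join(chrome_argv(example_cache)))
# To actually launch the browser, pass the list to subprocess.Popen(...).
```

The same effect can be had by typing the flag on the command line; the wrapper only makes sure the local directory exists first.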
,
Apr 27 2017
Thanks! Yes, it's only the cache. I like that better. Bobby
,
Apr 27 2017
Maksim: Pulling you in in case you want to investigate if SimpleCache has worse behavior in the NFS home directory case than blockfile cache and decide if we should do something about it (see c#18; the OP says that they didn't have this problem in earlier versions of Chrome, and don't on FF). I don't know whether Simple Cache rollout on Linux coincides with the appearance of this problem, but it's probably something worth at least a few minutes of consideration. Back to Available while you're considering; feel free to close if you don't think this is worth pursuing.
,
Apr 28 2017
Hmm, well, the logs do suggest we have pretty high latency for getting the physical ops completed[1], and it sounds like the server had a lot of effort put into it, so it's probably not across a 100ms link... It's possible that using a small # of threads doing sync I/O calls might be amplifying server latency somehow, and getting disk ops stuck in a queue, though I wouldn't expect LAN latency to be worse than spinny disks anyway. It may be worth studying how bad the queuing gets in general, come to think of it, or at least repeating some benchmark experiments with my spinny disk test device.
@rdsmith: Do you know how flexible the new scheduling stuff is about the # of threads it assigns? That seems to be taking over our fixed thread pool in master builds, so maybe that will help some, and would make me feel a bit less guilty if using more threads is a good idea, at any rate.
Cutting the number of separate ops might help, too, depending on caching policy, and some of it is happening, but there is very little that can be done.
I don't know enough about blockfile cache to draw a comparison.
[1] e.g:
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_CALL
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_OPTIMISTIC
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_BEGIN
t=160819 [st= 64] SIMPLE_CACHE_ENTRY_WRITE_CALL
--> buf_len = 5120
--> index = 0
--> offset = 0
--> truncate = true
t=161369 [st= 614] SIMPLE_CACHE_ENTRY_CREATE_END
t=161369 [st= 614] SIMPLE_CACHE_ENTRY_WRITE_BEGIN
That's 614ms from the time we scheduled an entry creation to when we got OK back from it. That opened a file and did a couple of tiny writes of headers to it.
I don't know how much of it is queuing delay and how much is execution.
(And 64ms in, we already had headers ready to write out.)
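The 614ms figure can be reproduced mechanically from the excerpt. A minimal sketch, assuming log lines shaped like the ones quoted above (`t=<ms> [st=<ms>] EVENT_NAME`); the helper name `begin_end_durations` is hypothetical. It only pairs `*_BEGIN` with `*_END` events, so, as noted, the result mixes queuing delay and execution time.

```python
import re

# Matches lines such as: t=160755 [st=  0] SIMPLE_CACHE_ENTRY_CREATE_BEGIN
EVENT_RE = re.compile(r"t=\s*(\d+)\s+\[st=\s*(\d+)\]\s+(\w+)")


def begin_end_durations(log_text):
    """Return {event_base_name: [elapsed_ms, ...]} for matched BEGIN/END pairs."""
    open_events = {}
    durations = {}
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if not match:
            continue  # parameter lines such as "--> buf_len = 5120"
        t, name = int(match.group(1)), match.group(3)
        if name.endswith("_BEGIN"):
            open_events[name[: -len("_BEGIN")]] = t
        elif name.endswith("_END"):
            base = name[: -len("_END")]
            if base in open_events:
                durations.setdefault(base, []).append(t - open_events.pop(base))
    return durations


SAMPLE = """\
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_CALL
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_OPTIMISTIC
t=160755 [st= 0] SIMPLE_CACHE_ENTRY_CREATE_BEGIN
t=160819 [st= 64] SIMPLE_CACHE_ENTRY_WRITE_CALL
--> buf_len = 5120
t=161369 [st= 614] SIMPLE_CACHE_ENTRY_CREATE_END
t=161369 [st= 614] SIMPLE_CACHE_ENTRY_WRITE_BEGIN
"""

print(begin_end_durations(SAMPLE))  # {'SIMPLE_CACHE_ENTRY_CREATE': [614]}
```

The trailing WRITE_BEGIN has no matching END in the excerpt, so it is left out of the result.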
,
Apr 28 2017
I don't know about the new scheduler; my guess is that there's some flexibility, but don't quote me on that. My complete WAG was that the cost of the Simple Cache compared to the blockfile case was in the open/close calls, of which it will make many more (one file per URL cache entry); I could imagine those require more round trip synchronization to the server than do the read/writes.
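One rough way to test the open/close guess would be to time many small files against a single file holding the same number of bytes, once on the NFS mount and once on local disk. A minimal sketch under stated assumptions: the function names, file names, and sizes are hypothetical, and this does not reproduce the real Simple Cache or blockfile layouts, only the "one open/close per entry" versus "one open/close total" contrast.

```python
import os
import tempfile
import time


def time_many_files(directory, count, size):
    """Write `count` separate files of `size` bytes (one open/close each)."""
    payload = b"x" * size
    start = time.perf_counter()
    for i in range(count):
        with open(os.path.join(directory, f"entry_{i}"), "wb") as f:
            f.write(payload)
    return time.perf_counter() - start


def time_single_file(directory, count, size):
    """Write the same total bytes into one file (a single open/close)."""
    payload = b"x" * size
    start = time.perf_counter()
    with open(os.path.join(directory, "blockfile"), "wb") as f:
        for _ in range(count):
            f.write(payload)
    return time.perf_counter() - start


def compare(directory, count=200, size=5120):
    """Return (seconds for many files, seconds for one file)."""
    return (
        time_many_files(directory, count, size),
        time_single_file(directory, count, size),
    )


# Uses a temporary directory here; point it at a directory on the NFS
# mount to measure that case instead.
with tempfile.TemporaryDirectory() as d:
    many, single = compare(d)
    print(f"many files: {many:.4f}s, single file: {single:.4f}s")
```

A large gap between the two numbers on NFS, and a small one on local disk, would be consistent with the per-file round trips being the expensive part; it would not by itself prove that this is what Chrome is hitting.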
,
Apr 28 2017
The writes don't look so good either, e.g.:
t=149810 [st= 75] SIMPLE_CACHE_ENTRY_WRITE_BEGIN
--> buf_len = 32768
--> index = 1
--> offset = 0
--> truncate = true
(aka SimpleEntryImpl::WriteDataInternal, which is a bit of bookkeeping
and PostTaskAndReply(&SimpleSynchronousEntry::WriteData,
&SimpleEntryImpl::WriteOperationComplete))
...
t=150572 [st= 837] SIMPLE_CACHE_ENTRY_WRITE_END
--> bytes_copied = 32768
(Which is WriteOperationComplete)
... of course they may be getting blocked behind other stuff.
What probably doesn't help is that the write actually does some ftruncates as well (I do vaguely plan to get rid of some, though it might not actually work completely in this case...)
,
May 30 2017
It doesn't look like this should still have the Needs-Feedback label.
,
May 30 2018
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Comment 1 by eroman@chromium.org
, Apr 25 2017