
Issue 672968

Starred by 3 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
EstimatedDays: ----
NextAction: 2019-07-09
OS: Android
Pri: 3
Type: Bug




Incomplete page shown on a slow network

Project Member Reported by mdw@chromium.org, Dec 9 2016

Issue description

Application Version (from "Chrome Settings > About Chrome"): 56.0.2924.18
Android Build Number (from "Android Settings > About Phone/Tablet"): LXI22.50-62
Device: Moto E

Steps to reproduce:

1) Put device on a slow network (e.g., GIN-2gpoor)
2) Try to load a page (e.g., http://aws1.mdw.la/fw in this case).
3) Wait for a VERY long time for the page to load.

Observed behavior: 

Eventually, the page loads, but is lacking the Grumpy Cat image. There is also JS that fires on onLoad that populates the navigation timings at the bottom of the page -- these were not present.

Expected behavior:

It's nice that something finally did display, but there was no indication that the page was not complete. I could imagine a user who was used to slow page loads being confused about why the quality was degraded (and feeling that they needed to reload to see the page properly).

On the assumption that this is an intentional intervention we're doing for loading pages on slow networks ("last gasp"?), perhaps we should consider a snackbar indicating that the page load was canceled or degraded in some way. I am not sure as it depends somewhat on why this is happening.

See internal doc for more details on experiment:
https://docs.google.com/a/google.com/document/d/19ctmz25Dd4FsCUeFJeaIhWGBTENcVo5_KD626fRE4DM/edit?usp=sharing
 
The page likely fails to load because this page issues a sync xhr while it's parsing, which means that the parser and page UI will be locked up until the xhr completes. This could take a long time on slow connections, or time out and fail, which leads to a poor experience (onload probably doesn't fire, etc).

While the browser should be more resilient to this on a slow network, this does represent a sort of worst case for the browser to have to handle. You might switch to async xhr and see if things get better. That may increase the rate at which the page load completes successfully, though it may not have an impact on how long the page takes to load.
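For illustration, a minimal sketch of what the async variant could look like. The test page's actual script is not shown in this thread, so the function name `loadDataAsync`, the callback shape, and the 30-second timeout are all assumptions rather than the page's real code.

```typescript
// Hypothetical helper: fetch a text resource without blocking the HTML parser.
// The name, callback shape, and timeout value are illustrative assumptions.
function loadDataAsync(
  url: string,
  onDone: (text: string | null) => void,
  timeoutMs: number = 30000,
): void {
  const xhr = new XMLHttpRequest();
  // Third argument `true` makes the request asynchronous, so parsing and
  // rendering continue while the response is in flight.
  xhr.open("GET", url, true);
  // The timeout property is only permitted on async requests from a document.
  xhr.timeout = timeoutMs;
  xhr.onload = () => {
    const ok = xhr.status >= 200 && xhr.status < 300;
    onDone(ok ? xhr.responseText : null);
  };
  // Network failures and timeouts are reported as null rather than thrown.
  xhr.onerror = () => onDone(null);
  xhr.ontimeout = () => onDone(null);
  xhr.send();
}
```

With the sync form, a slow response stalls the parser and the page UI; with this form only the callback is delayed, and the rest of the page can keep loading.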

See https://blogs.msdn.microsoft.com/wer/2011/08/03/why-you-should-use-xmlhttprequest-asynchronously/ for more on why sync XHR is not recommended.

We have a DevTools warning for this: "Synchronous XMLHttpRequest on the main thread is deprecated because of its detrimental effects to the end user's experience. For more help, check https://xhr.spec.whatwg.org/."
Cc: ojan@chromium.org
"While the browser should be more resilient to this on a slow network" probably means intervening and blocking the sync XHR altogether.

We talked about adding an intervention for sync XHR; however, my sense is that it's somewhat rare, so the impact would be limited.

Ojan, Chris, others, do we have UMA counters that tell us how frequently we encounter a sync XHR?
Cc: tyoshino@chromium.org
+tyoshino who may know. I actually think we may have this on chromestatus on the use counter.
On chromestatus - looks like sync xhr on about 1% of page loads.

https://www.chromestatus.com/metrics/feature/popularity#XMLHttpRequestSynchronous
I think we should close this out - this page is susceptible to loading failure on slow/flaky networks due to its use of sync XHR.

Chrome already has a DevTools warning to strongly discourage use of sync XHR.

I don't think it makes sense to add an intervention for sync XHR given that only 1% of page loads are affected.

Comment 6 by mdw@chromium.org, Dec 10 2016

OK, so my test page is kind of a hack. Does that explain why the onload event apparently never fires though?

Comment 7 by ojan@chromium.org, Dec 11 2016

Cc: k...@chromium.org
A couple things here:

1. There's no existing intervention I know of that we'd be applying here. If onload is not firing after the image loads or errors, that's a bug and we should fix it regardless of whether the bug is a side effect of using sync XHR.

I made a custom network throttle in DevTools that did 1kb/s up/down and 2000ms latency and saw something like what the original post describes. I think the sync XHR is a red herring. I see the XHR finish and the image start to load.

I don't see an onload bug though. The image is just taking forever to load. It did *look* like nothing was happening, but once it finally loaded (minutes later!), the onload did fire. My guess is that what you're seeing is that you didn't give the image enough time to load/error.
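As a rough sanity check on "minutes later", here is a back-of-the-envelope estimate. The image size is not stated in this thread, so the 150 KB figure is hypothetical, and "1kb/s" is read here as 1000 bytes per second (if it means kilobits, the transfer takes 8x longer). The model ignores connection setup, headers, and packet loss, so it is a lower bound at best.

```typescript
// Rough lower bound on fetch time under a throttle: one round of latency
// plus payload size divided by bandwidth. Ignores handshakes, headers, loss.
function estimateTransferSeconds(
  sizeBytes: number,
  bytesPerSecond: number,
  latencyMs: number,
): number {
  return latencyMs / 1000 + sizeBytes / bytesPerSecond;
}

// Assumed inputs: hypothetical 150 KB image, 1000 B/s, 2000 ms latency.
const seconds = estimateTransferSeconds(150000, 1000, 2000);
console.log(`${seconds} s (~${(seconds / 60).toFixed(1)} min)`);
```

With these assumed inputs the estimate is 152 seconds, about 2.5 minutes, which is consistent with the image appearing to hang before onload finally fires.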

mdw, are you able to reproduce it? If the root bug is that there's no user indication that the page is still getting bytes, I think we should dupe this with issue 672946. Incidentally, ktam and I are meeting on Wednesday to talk through an idea I have for a replacement to the progress bar.

2. Regarding sync XHR interventions, 1% of page loads for something as bad as sync XHR is a lot in my opinion. That's more than enough to justify investigating why/how it's being used and considering approaches to address that. I wonder if this is mostly analytics for ads. A first step here would be to split the UseCounter across top-level vs cross-origin frame. The good and surprising news is that usage has been steadily dropping from 3.5% 2 years ago.

That's work we should do, but let's keep this bug focused on onload not firing.
I'm not sure a 1kb/s throttle without loss is comparable to the environment Matt was testing in. My understanding is that Matt's testing environment incorporates random packet loss. We see cases where a request fails outright on slow networks with loss.

What's the expected behavior for the HTML parser if a sync XHR fails in the midst of parsing? My understanding is that parsing stops at that point, rather than trying to continue after the failed XHR. If that's the case, then the onload logic would not succeed on this page even if onload runs, since it tries to append content into a DOM element that appears after the sync XHR in the HTML payload.

We'll need a more detailed trace at both network and parser level to debug this further.
I looked at this a bit more - it appears the HTML parser generally tries to make progress in the face of script errors during parsing, so my original hypothesis that the parsed DOM would be incomplete due to the failed sync XHR is likely incorrect.

Re: next steps, I'd need a netlog and a trace for this page load to figure out what's going on.
As a next step, I'll try to repro on gin 2g poor when I'm next in the office. If I can't repro there then we will need a netlog+trace.
Status: Started (was: Untriaged)
First, I now agree that sync XHR is not the problem here. I was under the impression that the HTML parser would stop parsing subsequent HTML if a JS error was encountered, but I've since learned that's incorrect. Given this, even if the sync XHR fails, it should not lead to failing to display the information inserted into the DOM at onload.


Where the sync XHR can get us into trouble here is that it does block the parser from making progress until the resource being XHR'd is complete, or until it errors out. This can take a long time on slow networks with packet loss.
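To make the distinction concrete, here is a toy model of the two behaviors described above. It is not Blink's parser: the `Token` type and `parse` function are invented for illustration. A script token runs synchronously, so the loop cannot advance until it returns or throws (as with a sync XHR), but a thrown error does not stop later elements from being added.

```typescript
// Toy model only: a "document" is a flat list of element and script tokens.
type Token =
  | { kind: "element"; id: string }
  | { kind: "script"; run: () => void };

function parse(tokens: Token[]): { ids: string[]; errors: string[] } {
  const ids: string[] = [];
  const errors: string[] = [];
  for (const token of tokens) {
    if (token.kind === "element") {
      ids.push(token.id);
    } else {
      try {
        // Blocks the loop until the script returns or throws.
        token.run();
      } catch (e) {
        // The error is recorded, then parsing continues with the next token.
        errors.push(e instanceof Error ? e.message : String(e));
      }
    }
  }
  return { ids, errors };
}

const result = parse([
  { kind: "element", id: "header" },
  { kind: "script", run: () => { throw new Error("sync XHR failed"); } },
  { kind: "element", id: "timings" },
]);
console.log(result.ids, result.errors);
```

In this model the hypothetical `timings` element still ends up in the result even though the script before it failed, which is why a failed sync XHR alone should not prevent the onload handler from finding its target element.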

Matt, do you recall how long you waited before abandoning the page load?


Some debugging info:

I spent some time testing this page on a simulated poor 2g connection with packet loss.

The results were highly variable: sometimes the page loaded just fine, other times the page failed to render any content at all.

For one test, I was able to repro the behavior Matt described, with the page load hanging before onload fired. The progress bar was roughly 2/3 complete and just stopped.

After a long wait, however, the page did eventually fire the load event, and the timing information appeared at the bottom of the page.

Matt, any chance you captured a netlog or tcpdump for this page load? I'd be interested to see what was happening at the network layer here. I can try to capture a netlog or tcpdump if you don't have them available.

Comment 12 by mdw@chromium.org, Dec 12 2016

Unfortunately, I did not get a netlog for this particular case.
I don't recall how long I waited after seeing the partial page load (with no image or timing information), but it seemed to be quite a while. Also, the loading bar had disappeared by this point, so I assumed the browser considered the page "complete".

Cc: bmcquade@chromium.org
Owner: ----
Status: Available (was: Started)
I'm unlikely to dig into this further any time soon, so unassigning.

Comment 14 by k...@chromium.org, Feb 15 2018

Cc: -k...@chromium.org

Comment 15 by ojan@chromium.org, May 8 2018

Cc: -ojan@chromium.org
Labels: Pri-3
NextAction: 2019-07-09
Downgrading P2s that haven't been modified in more than 6 months, which have no component or owner.
Status: Untriaged (was: Available)
Available, but no owner or component? Please find a component, as no one will ever find this without one.
