New issue
Advanced search Search tips

Issue 632119 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 2
Type: Bug



Sign in to add a comment

[Offline Pages] BackgroundOffliner retrying failed request multiple times wiht no backoff - perhaps in same processing window

Project Member Reported by dougarnett@chromium.org, Jul 27 2016

Issue description

In trying to background load https://www.google.com, the prerenderer was failing and the request seemed to be retried multiple times in the same processing window from these debug logs:

07-27 12:46:05.746 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://stafaband.info/
07-27 12:46:21.418 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 2
07-27 12:46:21.429 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.google.com/
07-27 12:46:24.134 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 12:46:24.139 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.google.com/
07-27 12:46:26.757 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 12:46:26.785 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.google.com/
07-27 12:46:29.319 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 12:46:29.340 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.google.com/
07-27 12:46:31.872 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5

where 5 => PRERENDERING_FAILED

Perhaps either the Coordinator is not stopping the processing window on this failure or the Coordinator is actually stopping but new processing window got fired again quickly. In the latter case, we should consider some backoff for the start time.

This was on a built synced 7/27 morning with some logging added.
 
Status: Assigned (was: Untriaged)
Another sample - most likely all in one processing window:

07-27 13:43:45.475 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://uol.com.br/
07-27 13:43:59.347 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 2
07-27 13:43:59.383 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://espncricinfo.com/
07-27 13:44:19.506 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 2
07-27 13:44:19.539 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://waptrick.com/
07-27 13:44:28.012 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 2
07-27 13:44:28.028 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.irctc.com.in/
07-27 13:44:31.382 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 13:44:31.408 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.irctc.com.in/
07-27 13:44:33.538 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 13:44:33.557 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.irctc.com.in/
07-27 13:44:35.643 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 13:44:35.670 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    https://www.irctc.com.in/
07-27 13:44:37.855 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 13:44:37.883 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://indianrail.gov.in/
07-27 13:45:55.154 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5
07-27 13:45:55.177 I/chromium(25052): [INFO:request_coordinator.cc(171)] XXXXX  SendRequestToOffliner    http://indianrail.gov.in/
07-27 13:47:10.965 I/chromium(25052): [INFO:request_coordinator.cc(199)] XXXXX  OfflinerDoneCallback status: 5

Retrying the request with no backoff on failure is by design.

However, we should be removing the request on prerender failure, and there isn't code to do that yet.
Are you saying retrying in the same processing window is by design?
I don't recall that - instead, thought I fixed it prior to retry limit support - so maybe we need to discuss.
One particular case to support, if the prerender fails because its render process crashed, then we need to stop any further prerender attempts for the processing window (in case Android needing resources for another, foreground activity) - at least for low-end devices.
I'm saying that we don't have any limitations to prevent retrying a failed request again in the same processing window, and that I don't remember designing any in our current design doc.  As you point out, it might not make sense if the pre-renderer itself crashed.

However, if it just timed out, a retry in the same window is not only perfectly resonable, but this might be the best time while some resources are already in cache.
Let's discuss tomorrow.

Here is the CL where I "fixed" this previously: https://codereview.chromium.org/2071403003/
We don't need to fix this at the moment since we have turned off retries.  We should fix this before turning retries back on.  The proper fix is to check the failure type, and on anything but a timeout, not to retry the same request.
Cc: petewil@chromium.org
Owner: dougarnett@chromium.org
We do need to stop processing further requests on system resource types of failures - especially if there are no retries - so subsequent requests get a more fair shot at succeeding. Our hardest requirement is to stop further processing on svelte device if rendering process was killed.

So we at least need a better default and then more ideally, surface the prerender failure types and have some whitelist of failures for which we are confident are limited to the specific page (eg, Unsupported Scheme).
Project Member

Comment 9 by bugdroid1@chromium.org, Aug 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e91a689c5f795a9c7f7d472d75c970d237897e5c

commit e91a689c5f795a9c7f7d472d75c970d237897e5c
Author: dougarnett <dougarnett@chromium.org>
Date: Wed Aug 03 17:20:28 2016

Skips processing more SavePageLater requests in current processing window when one completes unsuccessfully.

Ideally, we would consider processing another request in the same processing
window for certain types of page-specific failures (follow-on work) but the
default needs to be more conservative in case we are seeing system resource
related failures/cancels.

BUG= 632119 

Review-Url: https://codereview.chromium.org/2196293002
Cr-Commit-Position: refs/heads/master@{#409554}

[modify] https://crrev.com/e91a689c5f795a9c7f7d472d75c970d237897e5c/components/offline_pages/background/request_coordinator.cc
[modify] https://crrev.com/e91a689c5f795a9c7f7d472d75c970d237897e5c/components/offline_pages/background/request_coordinator_unittest.cc

Status: Fixed (was: Assigned)

Sign in to add a comment