
Issue 312327 link

Starred by 35 users

Issue metadata

Status: WontFix
Owner:
Closed: Jan 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug




rel=subresource results in duplicate downloads and wrong prioritization

Project Member Reported by igrigo...@chromium.org, Oct 28 2013

Issue description

Chrome version: 30.0.1599.101, OSX.

What steps will reproduce the problem?
1. Open: http://jsbin.com/OweVOZi/1/quiet, check devtools / netlog

Inspecting netlog, two requests are being fired (see attached file for full log):

---
                           --> method = "GET"
                           --> priority = 1
                           --> url = "http://ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"
t=1382682021622 [st=18]        HTTP_STREAM_REQUEST_BOUND_TO_JOB
                               --> source_dependency = 94496 (HTTP_STREAM_JOB)
---
                            --> method = "GET"
                            --> priority = 2
                            --> url = "http://ajax.googleapis.com/ajax/libs/jquery/2.0.3/jquery.min.js"
t=1382682021700 [st= 93]        HTTP_STREAM_REQUEST_BOUND_TO_JOB
                                --> source_dependency = 94500 (HTTP_STREAM_JOB)
---

What is the expected output? What do you see instead?

a) <script> request should block on rel=subresource request.
b) Subresource request should be fetched with high(er) priority. 

Duplicate download aside (a clear bug), currently subresource is not a very useful optimization hint since it downloads everything at a priority level just above prefetch [1], but below any of the critical page resources (i.e. it appears to be on par with image). As such, using subresource for anything but images means you'll be competing with large image downloads and will likely only make the situation worse if you actually need that resource for anything critical.

Our own wiki dances around this: "The subresource should be loaded with the same semantics as if it were a resource discovered within the page- with the same HTTP header construction and the same priority as it would otherwise use." [2] Except, we don't actually do that - currently, all resources are treated in the same way.

The obvious challenge here is how to determine the type of the resource. Relying on filename extensions could work, but can also be misleading. Alternatively, could we leverage the type attribute on link? E.g.:

<link rel="subresource" href="jquery.js" type="application/javascript">
<link rel="subresource" href="style.css" type="text/css">
<link rel="subresource" href="photo.webp" type="image/webp">

Yes, in theory I could lie and fetch photo.webp as "text/css", which would give it high priority, but I can do that anyway if I simply inject a rel="stylesheet" link pointing at an image asset... The argument in the other direction is: why bother with subresource at all, if we can just inject the appropriate tags? I think one good reason is that subresource allows us to fetch an asset without any post-processing - i.e. skip parsing and execution of JS, or decoding the image, etc., which is a big plus, since I may want to defer those actions until later in the lifecycle of the app/page. 

[1] https://code.google.com/p/chromium/codesearch#chromium/src/third_party/WebKit/Source/core/fetch/ResourceFetcher.cpp&q=LinkSubresource&sq=package:chromium&type=cs&l=134
[2] http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource
 
Cc: simonjam@chromium.org japhet@chromium.org
Labels: -Cr-Internals-Network Cr-Blink-Loader
This isn't a browser-side network stack issue, but a Blink loader issue. Redirecting accordingly.

FWIW, I would expect the loader to generate two requests as we see here in the network stack. But I would expect the loader's cache to only result in a single network request sent to the browser network stack.
Okay, there are a couple of things in this bug.

First, I think this code review will fix the duplicate requests issue: https://codereview.chromium.org/25772002/

Second, I kinda like the idea of adding type to subresource. I definitely don't like the idea of using file extensions.

We definitely need some feedback to assign the correct priority. Right now, there's little point in using subresource, because it's always no better than just declaring the resource.

Separately, the type gives advanced developers a non-obvious lever for boosting certain requests. You could mark a critical image as text/css and it'll get the top priority. That's not strictly a good thing, but I'm in favor of giving advanced users something. I know our heuristics aren't perfect and this gives them an out.

On the Blink side, we should make sure that subresource means the resource is always fetched from MemoryCache, like a preload. Otherwise, if it's not cacheable, it's just a waste.
James, that patch looks to be only in the browser-side loader code, not the renderer-side. How does that de-dupe requests? Shouldn't that be done renderer-side in MemoryCache?

In terms of subresource, I agree it's kinda broken. Do we think that type is enough? I agree it's an improvement. Ilya, can you bring this proposal to the relevant standards bodies?
> On the Blink side, we should make sure that subresource means the resource is always fetched from MemoryCache, like a preload. Otherwise, if it's not cacheable, it's just a waste.

Just to make sure I understand: you're saying subresource should be fetched, placed in http cache (if correct headers are present), *and* loaded into Blink's memory cache such that we don't have to incur the cache lookup later?
@willchan: You're right. I'm confused. That patch is for prefetch. The MemoryCache should already de-dup. If it's not, maybe the underlying resource is not cacheable.

@igrigorik: Not exactly. It should be like the preload scanner. That places the resources in Blink's memory cache unconditionally. Then when the parser tries to fetch the same resource, it uses the copy in the memory cache unconditionally.

I think it'd be ideal if subresource should work the same way. Otherwise, what's the point of having a subresource if it doesn't lead to a cache hit?
@willchan: Sadly, there are no specs for any of the pre* features -- they're mentioned as "a thing" in the whatwg land on <link>, but there are no definitions on how they should work, etc. That, I think, is the root of our problem.. we need specs for subresource, prefetch, and others. Especially if we want this stuff to work across browsers (which is not the case today). I'll try to draft something to that extent, stay tuned. (But that shouldn't block this bug ;-))

@simonjam: Agreed. That said, even if the resource didn't make it into blinks memorycache (ideally, it should), there is still value in caching it at the HTTP layer (assuming resource has appropriate headers).
I don't think the de-duping should be blocked on spec work. But the use of a type attribute to allow user agents to prioritize appropriately is something I'm interested in getting wider feedback on. It might be best to split this bug into two?

As for why it's not getting de-duped at the browser HTTP cache layer, I am skeptical there's a bug here. I suspect the requests simply look different to the HTTP cache layer and so they don't get de-duped. Ilya, if you can provide a net-internals log, we can verify this.
Disregard the attachment in my original post; I think I had devtools open with the cache-disable flag. Instead, see the attached file -- it looks a bit different, but as far as I can tell we're still making two requests for the file (Version 32.0.1684.0 canary).
Labels: Cr-Internals-Network-Cache
I deleted the old netlog, and Ilya provided a repro case: http://jsbin.com/OweVOZi/1/quiet.

I looked into it, but don't know enough about HttpCache to answer this. Adding cache folks.
The HttpCache normally de-dupes these requests (as long as the caching headers allow that). However, with the proper timing in place we may end up not doing that when a request has already gone through two calls to the backend trying to locate an entry (Open + Create entry), so we simply log the event and issue the second request on pass-through mode.

The current histogram says that this happens for 0.27% of requests.

The "nice" fix is to provide an atomic call to Open/Create entry at the disk layer. That eliminates the race and reduces thread hops for the common case. Given the work in progress to land the updated disk cache (v3), I don't anticipate changing the interface right away.

If we really want to fix this soon (at the HttpCache layer), we could add more logic to count how many times we retry operations before going to pass-through.
Hmm. HttpCache aside though, could this be caught higher, say in the Blink's loader logic? That's not to say we shouldn't also address it at the HttpCache layer..
Blink should already do that. It doesn't have a complete implementation of the HTTP caching rules though. It only de-dupes in trivial cases if it's certain it can reuse the resource.* Otherwise, it errs on the safe side and falls back on the disk cache.

* This logic only applies after the load event. Prior to the document load event, it should always reuse things with the same URL.
@simonjam: why does the subresource case fall outside of the "trivial" one? The URL is identical. 
The HTTP headers would be my guess. I dunno, I haven't dug into this particular case. It would probably be good for someone to take a look. (I need to wrap up some other stuff before I leave first.)
The only difference in the headers is that prefetch adds "Purpose: prefetch". Also, internally, there is a different priority level assigned to the request.

Comment 16 by dxie@google.com, Nov 11 2013

Labels: M-33
Owner: rvargas@chromium.org
Status: Assigned
Can you take a look to see if this is a network cache issue, per comment #10? If not, can you assign this to a Blink engineer?
Owner: japhet@chromium.org
Given that both requests come from the same page, I'd say that we want to fix the Blink issue first, and then add an atomic OpenCreate to the net code (mostly to cover the case of multiple tabs racing for the same resource).
Project Member

Comment 18 by bugdroid1@chromium.org, Dec 17 2013

Labels: -M-33 MovedFrom-33 M-34
Moving all non-essential bugs to the next Milestone.
I just ran into this issue myself and wanted to note that the Chromium documentation is very clear on how this should work: 

"Once the rel=subresource request has been made, the browser may discover the same resource as it is used within the page.  The browser must not request the resource twice, and must recognize that the resource has already been requested."

from http://www.chromium.org/spdy/link-headers-and-server-hint/link-rel-subresource.

This should have nothing to do with whether or not the object is cacheable. In fact, we can't know that an object is cacheable until we receive the response anyway so the de-dup decision must be based on request headers alone.

Let me know when a fix is in the dev branch and I'm happy to help test.

Thanks,

Peter

Comment 20 by dxie@chromium.org, Mar 3 2014

Labels: -M-34 MovedFrom-34
This issue has already been moved once and is lower than Priority 1, therefore removing the milestone.
Chromium team,

Is there a fix for this issue yet? If so I'd love to be able to apply it to my Chromium build as I've been doing testing of the idea described here: http://caffeinatetheweb.com/baking-acceleration-into-the-web-itself/. I see this as a critical performance feature.

Thanks for your help!

Peter
I would love to see this fixed soon.  Our team would like to implement a feature that leverages this, but we are currently blocked and this affects our results on Chrome.
All,

I have a local fix for this bug that appears to be working. But I'd like some Chromium expert feedback on my approach if it's at all possible.

I browsed to this site: http://caffeinatetheweb.com/prefetched/lots.html?max=100, which has 200 LINK subresource hints -- 100 javascript files and 100 jpeg files. Note that I used the "type" attribute to distinguish between script and image fetches. Those 200 files are subsequently requested in the body of the html. By viewing the attached net-log you can see that only one copy of each of these is being fetched, which resolves the reported issue.

My changes were only to the LinkLoader::loadLink method. They can be found below between "PJL changes" and "End PJL changes" comments. Note that the changes below only support javascript and image type LINK rel=subresource prefetching. Sorry for not creating a standard pull request but I'm not yet up-to-speed on Chromium dev practices.

Feedback would be very much appreciated.

Thanks,

Peter


bool LinkLoader::loadLink(const LinkRelAttribute& relAttribute, const AtomicString& crossOriginMode, const String& type, const KURL& href, Document& document)
{
    if (relAttribute.isDNSPrefetch()) {
        Settings* settings = document.settings();
        // FIXME: The href attribute of the link element can be in "//hostname" form, and we shouldn't attempt
        // to complete that as URL <https://bugs.webkit.org/show_bug.cgi?id=48857>.
        if (settings && settings->dnsPrefetchingEnabled() && href.isValid() && !href.isEmpty())
            prefetchDNS(href.host());
    }

    // FIXME( crbug.com/323096 ): Should take care of import.
    if ((relAttribute.isLinkPrefetch() || relAttribute.isLinkSubresource()) && href.isValid() && document.frame()) {
        if (!m_client->shouldLoadLink())
            return false;
        
        // PJL changes to fix https://code.google.com/p/chromium/issues/detail?id=312327
        WTF_LOG(ResourceLoading, "LINK url %s. type=%s.", href.elidedString().latin1().data(), type.latin1().data());
        if (type.contains("application/javascript", false)) {
            WTF_LOG(ResourceLoading, "LINK type indicates javascript.");
            FetchRequest linkRequest(ResourceRequest(document.completeURL(href)), FetchInitiatorTypeNames::internal);
            if (!crossOriginMode.isNull())
                linkRequest.setCrossOriginAccessControl(document.securityOrigin(), crossOriginMode);
            // setResource(document.fetcher()->fetchScript(linkRequest));
            ResourcePtr<ScriptResource> fetchedScript = document.fetcher()->fetchScript(linkRequest);
        } else if (type.contains("image", false)) {
            WTF_LOG(ResourceLoading, "LINK type indicates image.");
            FetchRequest linkRequest(ResourceRequest(document.completeURL(href)), FetchInitiatorTypeNames::internal);
            if (!crossOriginMode.isNull())
                linkRequest.setCrossOriginAccessControl(document.securityOrigin(), crossOriginMode);
            ResourcePtr<ImageResource> fetchedImage = document.fetcher()->fetchImage(linkRequest);
        }
        // End PJL changes
    }

    if (const unsigned prerenderRelTypes = prerenderRelTypesFromRelAttribute(relAttribute)) {
        if (!m_prerender) {
            m_prerender = PrerenderHandle::create(document, this, href, prerenderRelTypes);
        } else if (m_prerender->url() != href) {
            m_prerender->cancel();
            m_prerender = PrerenderHandle::create(document, this, href, prerenderRelTypes);
        }
        // TODO(gavinp): Handle changes to rel types of existing prerenders.
    } else if (m_prerender) {
        m_prerender->cancel();
        m_prerender.clear();
    }
    return true;
}
Attachment: net-internals-log.json (2.0 MB)

Comment 24 by y...@yoav.ws, Jun 23 2014

Hi Peter,

Thanks for sharing your patch.
As you guessed, this is not the most appropriate forum for code reviews. You may want to check out http://dev.chromium.org/developers/contributing-code .

Otherwise, glancing over your patch, I see a few things that need fixing, where the major one IMO is that there's no real spec for this behavior change. Ilya's doc outlines the behavior (and changes "subresource" to "preload"), but my guess is that you'd need to send out an "intent to implement" thread, and put these changes behind a flag until a spec emerges. In any case, the blink-dev mailing list is probably the best place to discuss this feature change.

Based on the discussions on the webperf mailing list, it feels to me like this issue is going to be resolved via a change to the spec so I'm going to wait to submit a fix until this is resolved in Ilya's doc: https://igrigorik.github.io/resource-hints/.
Cc: -willchan@chromium.org
I am using rel="subresource" on my website, but Chromium still shows a duplicate request for it.
When is this problem likely to be solved?
Thanks for replying.
Does "preload" download the content with the same priority as "subresource"? If yes, then how is it different from the other one?
Is "subresource" not supported?
When I use preload instead of subresource for the assets, only one request is shown in the Chrome network tab. But the Initiator for that request points to the line where I emit the "stylesheet" link, not the "preload" one.

Is this a known issue? Or is preload not prioritizing the download like a normal request would?
The preload implementation is still underway. Unless you're working with a custom build, your tests wouldn't be triggering the appropriate logic.

Re, priority: no, we'll most likely use type to initialize correct priority, see: https://github.com/w3c/preload/pull/26
Thanks, Is there an ETA for this to complete?
Soon (tm). Follow https://code.google.com/p/chromium/issues/detail?id=471199 for updates.
Is this issue (link rel=subresource leading to duplication) resolved in Chrome 46?
No, the preload implementation is still WIP. Also, note that subresource is likely to be deprecated. Once available, you should use rel=preload.
Status: WontFix
rel=subresource is deprecated: https://groups.google.com/a/chromium.org/d/msg/blink-dev/Y_2eFRh9BOs/gULYapoRBwAJ, marking this as wontfix :)
