Do you mean SimpleURLLoader::DownloadToFile?
This bug is for renderer-initiated ResourceRequest::download_to_files for blobs, not URLFetcher-initiated save-to-files, which use an entirely different mechanism. SimpleURLLoader just opens and write to files in the current process, which won't work for renderers.
Or is there some DownloadToFile method in URLLoaderFactory that I'm missing?
Hi Matt, We were thinking that this would be a good starter bug for Sylvain to take on for a 20% on the Network Service. However, your description makes me think that it might be significantly more complex than we had envisioned (the OP had made us think that it would be a relatively straightforward implementation of a Mojo interface to take into account a flag in a similar way as existing code does). Can you describe the project here in some more detail? Thanks!
Unfortunately, I only know the ResourceLoader part of the API (download_to_file_resource_handler), and not how it integrates into the renderer logic. I think the right path forward here would ideally be to have the renderer send these requests to the browser, and have the browser manage the writing (And someone, not sure who, do the security checks on redirects and the like).
I think you'll need something more plugged in to src/storage for more details.
I think the thinking about this in the past would be that we'd get rid of download-to-file, replacing it with some kind of download-to-blob. Benefits of that would be that would get let the blob system deal with file management. The hard part there is that we don't have a way yet to create a blob from streaming data, but we're hoping to have some kind of API where you can just pass a mojo DataPipe to the blob system and eventually get a blob out of that.
See also issue 712693 , and +dmurph for his thoughts.
RE #8:
Marijin: Just to let you know, in order to make PDF loading working with Network Service, I need to be able to create a blob from an URL loader. I imagine instead of passing a mojo DataPipe, I will need to pass a URLLoaderClientRequest to the blob system. With the URLLoaderClient, the blob system will receive the DataPipe and also various loading events.
Have someone been working on this? I would be happy to learn more details about it (and help to make it happen if needed). Thanks!
So dmurph did some work on this in the linked bug with the old non-mojo blob system and non-mojo loading path. The tricky part here is that the blob system currently always assumes it knows how big a blob is before it can start creating the blob. In particular that means that we might need some changes in blobs memory/disk-paging logic to decide if/when a blob should be stored on disk instead of in memory.
So I imagine we'll add some kind of new method to the BlobRegistry mojo interface to which you pass something that produces the data, which then ultimately resolves either with the Blob (+uuid+size) or with an error. It's not clear yet to me if that "something" is indeed a URLLoaderClient or something like that, or if it is something more blob specific (problem with anything URLLoader related is that that is currently all tied up inside of content, while the blob system can't depend on content. Specifically its mojo interfaces can't depend on content since they are in WebKit/common and used by blink.
The implementation side would then somehow stream that data to a blob, "allocating" blob space as it goes (or failing if it realizes there is not enough blob space available anymore).
RE #10: In the URL loader case, there is no need to allocate any space at the blob system side, right? It retains the data pipe and later give it to the requester of the blob URL.
Or do you mean we would like to enforce some size restriction even if the data doesn't take up any storage in the browser process (or even the network process, if we don't pull down too much data waiting for the data pipe is drained)?
Thanks!
Hmm? Where did Blob URLs come into play? Ideally we'd be getting rid of any blob URL usage. But either way, no, we can't just retain the data pipe, as blobs can be shared, sliced, diced etc. So just like any other blob BlobMemoryManager has to allocate space for the blob either in memory (for small downloads) or on disk (for larger ones). The actual blob will then just refer that memory and/or file.
(Marijn and I had an offline chat.)
Blob URLs backed by streams don't seem to support being consumed by multiple clients or sliced. I will try to write down the current PDF loading sequence and my thoughts on how to make the blob system work for this scenario with Network Service. And then we can continue discussion from there.
The misleading part with streams is that they have nothing to do with the blob system, other than for historical reasons using "blob:" as the scheme for their URLs. But other than that they are really just an implementation detail of the extensions system...
So yeah. figuring out how to deal with streams is indeed a good thing to work on too, but unrelated to this bug afaict (and mostly unrelated to blob related work).
Something that I'm gathering out of this discussion is that this is not a simple starter bug for someone beginning a 20% on the Network Service :). mek@, is that assessment accurate?
I think this and issue 754493 are more or less duplicates (since XHR responseType=blob is afaik the only user of download_to_file currently). Anyway, my rough plan here is:
1) Add some API to the BlobRegistry mojom interface to create a blob from a datapipe (prototyped in https://chromium-review.googlesource.com/c/chromium/src/+/916731, although the implementation of that API there still needs a lot of work)
2) Update XHR code to use this API rather than download_to_file (by somehow getting the datapipe for the fetch response and passing that off to the blob system).
3) Get rid of download_to_file
4) Also somehow optimize FetchDataLoaderAsBlobHandle and related code to not always have to copy all the data to the renderer before passing it back to the browser.
So with the above CL the blob side of re-implementing the features that use download_to_file is done (there is still some cleanup/optimization work to be done on the blob side, but the functionality is there). So now that "just" needs somebody to implement the loading side of things (i.e. rewrite XHR's use of download_to_file to instead use this new blob API).
Given the very low usage (hovering under 0.0001%) and the fact that this is just used in pnacl apps, which will be deprecated, can we get rid of it now?
I agree that we should probably just get rid of it. Or at least get rid of PPAPI's usage of the API now, and maybe wait with actually getting rid of the implementation till the usecounter has reached stable, just to be extra cautious (but the use counter seems low enough that I would be really surprised if stable would show anything else).
Happy to put together a CL to either just disable usage from PPAPI, or get rid of the feature entirely.
Comment 1 by bugdroid1@chromium.org
, Dec 4 2017