Cannot download PDF files served as POST request with "Cache-Control: no-store"
Reported by
michel.b...@gmail.com,
Sep 16 2016
Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2859.0 Safari/537.36

Example URL: https://europass.cedefop.europa.eu/editors/en/cv/compose

Steps to reproduce the problem:
1. Go to https://europass.cedefop.europa.eu/editors/en/cv/compose
2. Create a CV and fill in the data
3. Press «Preview»; you'll see your CV as a PDF file
4. Try to download this file by pressing the download icon in Chrome's UI

What is the expected behavior?
Chrome downloads this PDF file.

What went wrong?
You'll get «Failed Network error». If you press «Resume», you'll get «Failed - Server problem». There is no such issue in Edge.

Did this work before? N/A

Chrome version: 55.0.2859.0 Channel: dev
OS Version: 10.0
Flash Version: Shockwave Flash 23.0 r0

Google Chrome 55.0.2859.0 (Official Build) dev-m (64-bit)
Revision 3f63c614e8c4501b1bfa3f608e32a9d12618b0a0-refs/heads/master@{#418117}
OS Windows
JavaScript V8 5.5.167
Flash 23.0.0.173
User Agent Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/55.0.2859.0 Safari/537.36
Command Line "C:\Program Files (x86)\Google\Chrome\Application\chrome.exe" --flag-switches-begin --enable-browser-task-scheduler --clear-data-reduction-proxy-data-savings --no-pings --enable-new-bookmark-apps --enable-account-consistency --enable-app-window-controls --enable-appcontainer --enable-canvas-2d-dynamic-rendering-mode-switching --enable-clear-browsing-data-counters --enable-embedded-extension-options --enable-experimental-canvas-features --enable-experimental-web-platform-features --enable-google-branded-context-menu --google-profile-info --enable-gpu-rasterization --enable-icon-ntp --enable-input-ime-api --javascript-harmony --enable-md-policy-page --enable-message-center-always-scroll-up-upon-notification-removal --new-profile-management --enable-offline-auto-reload-visible-only --enable-offline-auto-reload --enable-win32k-lockdown-mimetypes=*
--enable-push-api-background-mode --enable-quic --enable-session-crashed-bubble --enable-site-engagement-service --enable-spelling-feedback-field-trial --enable-suggestions-with-substring-match --enable-tab-audio-muting --enable-unsafe-es3-apis --enable-wasm --enable-webfonts-intervention-v2=enabled-2g --enable-webgl-draft-extensions --enable-webrtc-dtls12 --enable-zero-copy --extension-content-verification=enforce_strict --force-pnacl-subzero --force-ui-direction=ltr --gpu-rasterization-msaa-sample-count=16 --ignore-gpu-blacklist --enable-lcd-text --mark-non-secure-as=neutral --media-router=1 --mhtml-generator-option=skip-nostore-main --num-raster-threads=4 --enable-overlay-scrollbar --reduced-referrer-granularity --save-page-as-mhtml --secondary-ui-md --enable-smooth-scrolling --ssl-version-max=tls1.2 --supervised-user-safesites=disabled --top-chrome-md=material --try-supported-channel-layouts --sync-url=https://chrome-sync.sandbox.google.com/chrome-sync/alpha --use-winrt-midi-api --v8-cache-options=code --v8-cache-strategies-for-cache-storage=aggressive --v8-pac-mojo-out-of-process --enable-features=CredentialManagementAPI,FeaturePolicy,FontCacheScaling,MaterialDesignHistory,MaterialDesignSettings,MaterialDesignUserManager,MaterialDesignUserMenu,NewAudioRenderingMixingStrategy,OptimizeLoadingIPCForSmallResources,OriginTrials,ScrollAnchoring,StaleWhileRevalidate2,TranslateLanguageByULP,TranslateUI2016Q2,UsePasswordSeparatedSigninFlow,V8Ignition,ViewsSimplifiedFullscreenUI,WebRTC-H264WithOpenH264FFmpeg,WebUSB --flag-switches-end
,
Sep 20 2016
Looks like the PDF Viewer is using LOAD_ONLY_FROM_CACHE on a no-store resource, which isn't expected to work, so the issue seems to be with the PDF Viewer.
,
Sep 21 2016
I don't see the PDF Viewer directly referring to LOAD_ONLY_FROM_CACHE. So it's not obvious from my perspective what the PDF Viewer can do to improve the behavior.
,
Sep 21 2016
There's a different name for it, blink-side.
,
Sep 23 2016
Is this related to the problems reported here? https://productforums.google.com/d/msg/chrome/E-UYoPPUDNU/IfT3GyGpAwAJ
,
Oct 19 2016
We have the same problem. Is there any update on this? Can you confirm whether this is a bug in the PDF viewer? We are able to view and download the PDF files using other browsers (IE, Firefox, Safari).
,
Oct 27 2016
Any update on this?
,
Nov 2 2016
Re comment #4: the Blink-side name is WebCachePolicy::ReturnCacheDataDontLoad.
,
Nov 3 2016
,
Jul 6 2017
I just hit the same issue. When a PDF file is fetched with POST and the headers say no caching, the PDF viewer displays it inline but the download button fails with a "Network error". I assume the download refuses to re-issue the POST because it's a POST but... this feels wrong. It works in Firefox. I feel like that button should ignore the no-cache request and just download whatever was actually fetched for display inline, instead of re-issuing any network requests.
,
Jul 6 2017
,
Aug 10 2017
Here's a reduced test case that demonstrates the problem. It seems that a combination of a POST request with Cache-Control:no-cache headers prevents the PDF from being downloaded.

Steps to reproduce:
1. Download pdf-nocache.py
2. Run `python pdf-nocache.py`
3. Navigate to the URL that the Python program prints
4. Click the "Show PDF" button
5. The PDF loads
6. Click the Download button in Chrome's PDF viewer
7. The download fails with "Failed - Network Error"
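The attached pdf-nocache.py is not reproduced in this thread. For readers who want to try the steps, here is a minimal sketch of what such a script might look like, assuming Python 3 and its standard http.server module. Everything in it (the function and class names, the page markup, the blank one-page PDF, and the exact "no-cache, no-store" header value) is an assumption for illustration, not the original attachment.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def build_minimal_pdf() -> bytes:
    """Assemble a blank one-page PDF, computing the xref offsets as we go."""
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 200 200] >>",
    ]
    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for num, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset where this object starts
        out += b"%d 0 obj\n" % num + body + b"\nendobj\n"
    xref_pos = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_pos
    return bytes(out)


# Hypothetical landing page: a single form whose button issues the POST.
FORM = (b"<!doctype html>\n"
        b"<form method=\"POST\" action=\"/\">"
        b"<button>Show PDF</button></form>\n")


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        self._send(FORM, "text/html")

    def do_POST(self):
        # Drain the (empty) form body, then answer with an uncacheable PDF.
        self.rfile.read(int(self.headers.get("Content-Length", 0)))
        self._send(build_minimal_pdf(), "application/pdf", uncacheable=True)

    def _send(self, body, content_type, uncacheable=False):
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        if uncacheable:
            # Assumed header value; the thread mentions both directives.
            self.send_header("Cache-Control", "no-cache, no-store")
        self.end_headers()
        self.wfile.write(body)


def main():
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: pick a free port
    print("Open http://127.0.0.1:%d/" % server.server_address[1])
    server.serve_forever()
```

Calling main() starts the server and prints the URL to open; from there the steps above apply unchanged.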
,
Aug 10 2017
I would wager that the better approach here would be to rename the "Download" button to "Save" and change the behavior so that it writes to disk whatever is currently being displayed, instead of ever issuing another request to the network layer. Even where POST isn't involved, I find it silly that PDFs that are fetched via GET through some download portal/script that returns no-cache headers are re-downloaded when the browser already has a copy. The button isn't a refresh button and I dare say users would never expect that something *different* from what they're currently seeing on the screen could potentially be what is being saved to disk. So there is no UX reason to actually re-download the file, regardless of what cache headers say, and it improves UX in the many cases of PDFs served from scripts to not re-fetch them (which could fail or unexpectedly result in a different document). The present POST issue is really just a corner case of what I think is an incorrect design higher up the stack in this case.
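To make the "Save" proposal concrete, here is a small sketch of the idea in Python. It is only an illustration: DisplayedDocument and its fields are hypothetical names, and nothing here reflects how Chrome's PDF viewer is actually structured. The point is that saving reads only the bytes already held for display and never touches the network or the HTTP cache.

```python
import os


class DisplayedDocument:
    """Hypothetical holder for the bytes a viewer has already rendered."""

    def __init__(self, data: bytes, suggested_name: str = "document.pdf"):
        self.data = data
        self.suggested_name = suggested_name

    def save(self, directory: str) -> str:
        """Write the in-memory bytes to disk; no request is issued."""
        path = os.path.join(directory, self.suggested_name)
        with open(path, "wb") as fh:
            fh.write(self.data)
        return path
```

With this shape, the request method and the response's cache headers no longer matter at save time, because the saved file is by construction the document the user is looking at.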
,
Aug 10 2017
It's not re-issuing a POST as a GET. When the PDF plugin requests a download, the browser attempts to retrieve the resource through the cache layer, to avoid re-requesting it from the server. But when Cache-Control:no-store is used, the cache layer respects that header and does not store the response object.

When a Cache-Control:no-store PDF is served via GET, the browser *will* issue a new GET request because it is ostensibly safe to do so for an idempotent GET. But since it is not safe to re-issue a POST request, the download request is restricted to the cache; the cache entry lookup fails, which causes the download to fail. For a normal GET or POST that does not have Cache-Control:no-store, the resource is served directly from the cache to the DOWNLOAD_FILE (i.e. no additional HTTP request).

The logic for aborting the cache entry in the face of "no-store" is in HttpCache::Transaction::WriteResponseInfoToEntry: https://cs.chromium.org/chromium/src/net/http/http_cache_transaction.cc?sq=package:chromium&dr=C&l=2815

The PDF plugin sends PpapiHostMsg_PDF_SaveAs to the browser when the download button is pressed, which makes its way via Mojo to PDFWebContentsHelper::SaveUrlAs. That calls WebContents::SaveFrame, which leads to the creation of a download job here: https://cs.chromium.org/chromium/src/content/browser/web_contents/web_contents_impl.cc?gsn=SaveFrame&l=3273

I have no ideas for a solution (this is not my area), but that explains why/what is happening.
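The four cases described in this comment (GET or POST, with or without no-store) can be summarized in a toy model. This is a sketch of the described behavior only, not Chromium's code: ToyHttpCache, CacheMiss, and the (method, URL) cache key are all hypothetical simplifications.

```python
from typing import Callable, Dict, Tuple


class CacheMiss(Exception):
    """Raised when a cache-only lookup finds no stored entry."""


class ToyHttpCache:
    """Toy stand-in for an HTTP cache; not Chromium's implementation."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], bytes] = {}

    def store_response(self, method: str, url: str,
                       cache_control: str, body: bytes) -> None:
        directives = {d.strip().lower() for d in cache_control.split(",")}
        if "no-store" in directives:
            return  # honor no-store: keep nothing
        self._entries[(method, url)] = body

    def load_for_download(self, method: str, url: str,
                          refetch: Callable[[str], bytes]) -> Tuple[str, bytes]:
        key = (method, url)
        if key in self._entries:
            return ("cache", self._entries[key])  # no extra HTTP request
        if method == "GET":
            return ("network", refetch(url))  # safe to repeat a GET
        # Repeating a POST could have side effects, so fail instead.
        raise CacheMiss(f"{method} {url}")
```

In this model only the POST plus no-store combination ends in CacheMiss, which corresponds to the failed download reported in this issue.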
This is the net-internals log for trying to save the PDF served by my test script in #12:

143565: URL_REQUEST
http://127.0.0.1:50957/
Start Time: 2017-08-10 16:13:27.135

t=150466 [st=0] +REQUEST_ALIVE [dt=1]
    --> priority = "LOWEST"
    --> url = "http://127.0.0.1:50957/"
t=150466 [st=0] URL_REQUEST_DELEGATE [dt=0]
t=150466 [st=0] +URL_REQUEST_START_JOB [dt=1]
    --> load_flags = 32780 (MAYBE_USER_GESTURE | ONLY_FROM_CACHE | SKIP_CACHE_VALIDATION)
    --> method = "POST"
    --> upload_id = "1502396003997313"
    --> url = "http://127.0.0.1:50957/"
t=150466 [st=0] URL_REQUEST_DELEGATE [dt=0]
t=150466 [st=0] HTTP_CACHE_GET_BACKEND [dt=0]
t=150466 [st=0] HTTP_CACHE_OPEN_ENTRY [dt=0]
    --> net_error = -2 (ERR_FAILED)
t=150467 [st=1] -URL_REQUEST_START_JOB
    --> net_error = -400 (ERR_CACHE_MISS)
t=150467 [st=1] URL_REQUEST_DELEGATE [dt=0]
t=150467 [st=1] -REQUEST_ALIVE
    --> net_error = -400 (ERR_CACHE_MISS)
t=150467 [st=1] DOWNLOAD_STARTED
    --> source_dependency = 143568 (DOWNLOAD)

143568: DOWNLOAD
Start Time: 2017-08-10 16:13:27.136

t=150467 [st= 0] +DOWNLOAD_ITEM_ACTIVE [dt=3701]
    --> danger_type = "NOT_DANGEROUS"
    --> file_name = ""
    --> final_url = "http://127.0.0.1:50957/"
    --> has_user_gesture = false
    --> id = "200"
    --> original_url = "http://127.0.0.1:50957/"
    --> start_offset = "0"
    --> type = "NEW_DOWNLOAD"
t=150467 [st= 0] DOWNLOAD_URL_REQUEST
    --> source_dependency = 143565 (URL_REQUEST)
t=154168 [st=3701] DOWNLOAD_ITEM_INTERRUPTED
    --> bytes_so_far = "0"
    --> interrupt_reason = "NETWORK_FAILED"
t=154168 [st=3701] -DOWNLOAD_ITEM_ACTIVE
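As a quick sanity check on the log, the load_flags value 32780 can be split into its individual bits. The helper below is a plain-Python sketch (set_bits is a hypothetical name); it deliberately does not claim which bit corresponds to which flag name, since that mapping lives in Chromium's load flag definitions and is not quoted in this thread.

```python
from typing import List


def set_bits(value: int) -> List[int]:
    """Return the positions of the bits that are set in value, lowest first."""
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


flags = 32780  # load_flags value printed by URL_REQUEST_START_JOB
print(hex(flags), set_bits(flags))  # prints: 0x800c [2, 3, 15]
```

Exactly three bits are set, which is consistent with the three names the log prints next to the value (MAYBE_USER_GESTURE, ONLY_FROM_CACHE, SKIP_CACHE_VALIDATION).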
,
Aug 27 2017
I believe I am having the same problem. I have been trying to download an important PDF document from my Government's website, which is a personal payment summary for the previous financial year. Unfortunately, the document has the "no-cache" and "no-store" headers applied, and also produces the "Failed - Network error" message when trying to download the file. Having said this, I was able to get it to both display and download in Internet Explorer 11. So, definitely a problem with Google Chrome. I daresay that thousands of other people are affected by the same issue, as Centrelink and MyGov are fairly big govt. websites in Australia, which is where I am trying to download the file from. People use these sites for their social security, taxation, and other government related information and contacts. PDF documents have become the standard for downloading and printing off information. Unfortunately, despite Chrome being the most popular browser in the world, we still appear to be dependent on a browser other than Chrome on occasion to get certain things done. I am requesting that the priority of this case be set higher, and that the issue be reviewed by a Chrome developer with the relevant knowledge. I have also attached a network log file for review (please keep this confidential). Thanks.
,
Aug 27 2017
This is a public bug tracker, anything you post here can be downloaded by anyone. You may want to delete your comment if the attachment contains confidential information.
,
Aug 27 2017
Yeah, this is one of those catch-22 scenarios... damned if I do, damned if I don't. I made sure that "strip private information" was selected before logging. All I can do now is ask that anything else private remains confidential.
,
Sep 6 2017
Any updates on this, yet?
,
Oct 27 2017
We are also experiencing this issue with our product. It may cause problems for several hundred of our customers. Is there any plan to fix this bug?
,
Oct 29
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue. Sorry for the inconvenience if the bug really should have been left as Available. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Nov 28
Arthur-- if I remember correctly you've once looked into PDF downloading? Do you think you could take a look at this one?
,
Nov 28
I tried to reproduce comment 12.

Tests:
1) Google Chrome: 70.0.3538.110 (Official Build) (64-bit)
   Revision: ca97ba107095b2a88cf04f9135463301e685cbb0-refs/branch-heads/3538@{#1094}
   OS: Linux
2) Chromium: 72.0.3625.0 (Developer Build) (64-bit)
   Revision: 2788395cc17f551a8619f91df1a4a42079a32ca8-refs/heads/master@{#610864}
   OS: Linux

It worked on both. This issue is very old, it looks like someone fixed it. I am closing this bug. Feel free to open it again if you can reproduce it.
,
Nov 29
This is still reproducible on 71.0.3578.75 on Mac. The download fails with "Failed - Network error".
,
Dec 3
Loading triage: per comment 23, let me assign Arthur again. On my 71.0.3578.53 (Official Build) unknown (64-bit) / Linux, I can reproduce the issue by using pdf-nocache.py.
,
Dec 3
I had the chrome://settings option to download PDFs instead of displaying them enabled... With it turned off, I can now reproduce.
,
Dec 3
The download triggered from the PDF viewer is not part of a navigation. So I am not the right owner here. I will take a look to understand what really happens and try to find an owner.
,
Dec 3
,
Dec 3
I read comment 2 and comment 14. They make sense. The HTTP cache returns net::ERR_CACHE_MISS because the response wasn't cached. It can't reissue the request because it is a POST and doing so could have side effects on the server. I have no solution for this. I am unassigning myself. I guess the PDF viewer should store the initial response itself to make the "download" button work. Maybe the PDF viewer could try to override headers and turn the POST into a GET, but that doesn't look like a good idea. The user might get a different document. +CC thestig@ FYI.
Comment 1 by shobhity...@gmail.com
, Sep 18 2016