
Issue 603960

Starred by 2 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: All
Pri: 2
Type: Bug-Regression




Continuous build archive is escaping file names upon download

Project Member Reported by kbr@chromium.org, Apr 15 2016

Issue description

The continuous build archive is newly escaping file names upon download, making them incomprehensible. This is new behavior, but I'm not sure how new.

For example, consider this bucket:

https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Linux_x64/387622/

Download "chrome-linux.zip".

It used to download as:

Linux_x64-387622-chrome-linux.zip

but now downloads as:

Linux_x64%2F387622%2Fchrome-linux.zip

I'm not sure whether this is a bug in Google Cloud Storage or a bug in how Chromium's uploads are named.

 
Strange, I got Linux_x64-387622-chrome-linux.zip when I downloaded that file. Can you reproduce?

Comment 2 by kbr@chromium.org, Apr 18 2016

Yes, it's reproducible with Version 51.0.2704.7 dev (64-bit) on Goobuntu.

Comment 3 by kbr@chromium.org, Apr 21 2016

Also reproducible when downloading "chrome-win32.zip" from this directory:

https://commondatastorage.googleapis.com/chromium-browser-snapshots/index.html?prefix=Win/388798/

with 52.0.2712.0 (Official Build) canary (64-bit) on Mac OS X.

Labels: Infra-Troopers

Comment 5 by dnj@google.com, May 9 2016

On 50.0.2661.94, this still downloads fine. Since this broke between Chrome builds, perhaps this is actually a Chrome bug?

Comment 6 by kbr@chromium.org, May 9 2016

Components: UI>Browser>Downloads
Labels: -Restrict-View-Google
Adding label UI>Browser>Downloads and un-restricting view to get a wider audience.

This is due to crrev.com/c011ad50d54dba122868e2169019098656062ae8 , but I think there are other issues.

The request and response look like this:

                             :method: GET
                             :path: /download/storage/v1/b/chromium-browser-snapshots/o/Win%2F388798%2Fchrome-win32.zip?generation=1461260753672000&alt=media
                             :scheme: https

[...]
                         --> HTTP/1.1 200
                             status: 200
                             content-type: application/x-zip-compressed
                             content-disposition: attachment
                             content-language: en
                             content-length: 103141292
[...]

The supported way to specify a filename for a download is to use the Content-Disposition header with a 'filename' attribute. The response above doesn't have a filename attribute, and hence causes Chrome to use its legacy heuristics. The heuristic that succeeds is to use the last hierarchical component of the URL. That would be "Win%2F388798%2Fchrome-win32.zip" in this case.
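
To make the missing 'filename' attribute concrete, here is a minimal Python sketch that reads the filename out of a Content-Disposition value using the standard library's email header parsing. The helper name filename_from_content_disposition is made up for this illustration, and this is not how Chrome parses the header; it only shows that a bare "attachment" carries no filename.

```python
from email.message import Message
from typing import Optional


def filename_from_content_disposition(header_value: str) -> Optional[str]:
    """Return the 'filename' parameter of a Content-Disposition value, or None.

    Hypothetical helper for illustration only; it leans on the stdlib
    parser rather than mimicking any browser's implementation.
    """
    msg = Message()
    msg["Content-Disposition"] = header_value
    # get_filename() looks for the 'filename' parameter on Content-Disposition
    # and returns None when it is absent.
    return msg.get_filename()


# The response in this bug: no filename attribute, so nothing to use.
no_name = filename_from_content_disposition("attachment")

# What a response with an explicit filename would yield.
with_name = filename_from_content_disposition(
    'attachment; filename="Win-388798-chrome-win32.zip"'
)
```

With the header as served today, the first call yields no filename, which is why the browser falls back to guessing from the URL.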

Before crrev.com/c011ad50d54dba122868e2169019098656062ae8 , Chrome used to unescape the path separator character when using URL components (among other characters). So, prior to that change, the effective filename would've been: "Win/388798/chrome-win32.zip", except that path separators are illegal in a filename. So the sanitized filename becomes "Win-388798-chrome-win32.zip" where the offending characters have been replaced by '-'. Now that Chrome no longer unescapes path separators (which were illegal in a filename to begin with), it ends up with "Win%2F388798%2Fchrome-win32.zip".
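
The before/after behavior can be modeled roughly in Python. This is a simplified sketch, not Chrome's code: the function names are hypothetical, the sanitizer only handles path separators, and the "after" case simply leaves the URL component escaped, which is enough for this URL because %2F is the only escape in it.

```python
from urllib.parse import unquote

# Request path taken from the trace above.
REQUEST_PATH = (
    "/download/storage/v1/b/chromium-browser-snapshots/o/"
    "Win%2F388798%2Fchrome-win32.zip?generation=1461260753672000&alt=media"
)


def last_path_component(request_path: str) -> str:
    """Drop the query string and return the last '/'-separated component."""
    path = request_path.split("?", 1)[0]
    return path.rsplit("/", 1)[-1]


def sanitize(name: str) -> str:
    """Replace path separators, which are illegal in a filename, with '-'."""
    return name.replace("/", "-").replace("\\", "-")


def filename_before_change(request_path: str) -> str:
    # Old behavior (simplified): unescape first, so %2F becomes '/',
    # which the sanitizer then turns into '-'.
    return sanitize(unquote(last_path_component(request_path)))


def filename_after_change(request_path: str) -> str:
    # New behavior (simplified): escaped path separators stay escaped,
    # so the sanitizer has nothing to replace.
    return sanitize(last_path_component(request_path))


before = filename_before_change(REQUEST_PATH)
after = filename_after_change(REQUEST_PATH)
```

Here before is "Win-388798-chrome-win32.zip" and after is "Win%2F388798%2Fchrome-win32.zip", matching the two names seen in this bug.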

If "Win-388798-chrome-win32.zip" was the intended filename, rather than one that was arrived at accidentally, the server should state that via the Content-Disposition header. The heuristics for dealing with servers that don't use Content-Disposition are up to the browser implementation (the spec says so explicitly), and the handling of malformed filenames even more so. In this case, in order to arrive at the desired filename, the browser has to unescape the path separator character and then re-escape it using '-'. This is fragile.

I'd suggest finding some way to get the server to respond with the correct Content-Disposition header.

Comment 8 by dnj@google.com, May 9 2016

Thanks for the explanation.

The download link is a Google Storage URL, so this can probably be fixed by setting the Content-Disposition header on the Google Storage object:

https://cloud.google.com/storage/docs/gsutil/addlhelp/WorkingWithObjectMetadata#content-disposition

This will need to be done by whatever script is uploading the file.
Looks like these builds are being uploaded from https://build.chromium.org/p/chromium/builders/Linux%20x64 using https://chromium.googlesource.com/chromium/tools/build/+/master/scripts/slave/chromium/archive_build.py

I can look into adding that header into the upload.
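
A rough sketch of what the upload side could do, in Python since the upload script is Python. Two assumptions here: the desired download name is the object name with '/' replaced by '-' (i.e. the name people were getting before), and the header is set with "gsutil setmeta -h" as described in the linked gsutil doc. The helper names are hypothetical and nothing below is taken from archive_build.py; the command is only built, not executed.

```python
from typing import List


def content_disposition_for(object_name: str) -> str:
    """Build a Content-Disposition value with an explicit filename.

    Assumption: the intended filename is the object name with '/' -> '-'.
    """
    filename = object_name.replace("/", "-")
    return 'attachment; filename="{}"'.format(filename)


def setmeta_command(bucket: str, object_name: str) -> List[str]:
    """Return a gsutil argv that sets the header on an existing object."""
    header = "Content-Disposition:" + content_disposition_for(object_name)
    return [
        "gsutil", "setmeta",
        "-h", header,
        "gs://{}/{}".format(bucket, object_name),
    ]


cmd = setmeta_command(
    "chromium-browser-snapshots", "Linux_x64/387622/chrome-linux.zip"
)
# Passing cmd to subprocess.check_call would run it; omitted here on purpose.
```

Setting the header at upload time instead of after the fact would avoid a window where the object is served without it, but that depends on how the script invokes gsutil.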
Cc: mmoss@chromium.org
Components: -Infra Infra>Platform>Buildbot
Labels: -Infra-Troopers
Owner: bpastene@chromium.org
Status: Started (was: Untriaged)
Thanks for volunteering, Ben. Please make sure this gets addressed soon. Also CC'ing mmoss@, who understands release workflows and deals with other systems that upload builds to Google Storage and may also need to be updated.

Comment 11 by mmoss@chromium.org, Jun 21 2016

Cc: lafo...@chromium.org
 Issue 599436  has been merged into this issue.
Status: Assigned (was: Started)
Status: Started (was: Assigned)

Comment 14 by kbr@chromium.org, May 7 2017

Could this please be prioritized? It's annoying when doing manual bisects from the continuous archives.

Thanks.

Components: -Infra>Platform>Buildbot Infra>Client>Chrome
