Issue 711972

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows , Mac
Pri: 2
Type: Bug




File extension changes to *.sdx whenever I download a *.h (or *.hh) file via Facebook Messenger

Reported by 3ic...@gmail.com, Apr 16 2017

Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36

Steps to reproduce the problem:
1. Create a simple header.h file (Contents can be anything, for example I used "hi")
2. Send it to yourself on Facebook Messenger: https://www.messenger.com/
3. Click to download it

What is the expected behavior?
The header.h file should appear in my downloads folder (or displayed in the browser)

What went wrong?
I get a header.sdx instead of a header.h file. (The file contents are correct, but the file extension is wrong, forcing me to rename every C++ header file I download.)

Did this work before? N/A 

Chrome version: 57.0.2987.133  Channel: stable
OS Version: 10.0
Flash Version: 

I experience a very strange file extension change: whenever I download a .h (or .hh) file (C++ headers) from Facebook, it gets renamed to .sdx (an extension associated with Microsoft's Secure Download Manager, which is installed on my computer).

The file is sent to me as an attachment in Messenger. My own file server displays .h and downloads .hh files normally. Only in Facebook Messenger is this strange bug reproducible. (They use a window.location redirect to serve the file.) And only with Google Chrome does this happen. I of course tried it in other browsers. (e.g. Microsoft Edge)

Furthermore, hpp, cpp, pro, user, and ui file extensions all work fine. Also hhh works. But not h or hh, except when the *.h file contains binary data rather than text. Only then does it remain an h file upon download. But of course C source code is never in binary form; that would be named *.o, not *.h.
What is looking inside my downloaded header files and renaming them to sdx when they're plaintext / ASCII? Very strange.
 

Comment 1 by 3ic...@gmail.com, Apr 16 2017

component:UI>Browser>Downloads

Comment 2 by 3ic...@gmail.com, Apr 16 2017

New development: I uninstalled Secure Download Manager (Should have done that to start with) and now my .h files become .txt upon downloading... Slightly better, but still not optimal. There is no security risk associated with plaintext header files, so why do they get renamed automatically?

Inspecting what the server gives me, the difference between how .h and .hhh files are served is:
content-type:text/plain
vs
content-type:application/octet-stream

This might turn out to be a bug in Messenger. But then how come Microsoft Edge saves the .h file without renaming it to .txt? Why doesn't Chrome?
Labels: Needs-Triage-M57

Comment 4 by mmenke@chromium.org, Apr 19 2017

Components: -UI UI>Browser>Downloads
Labels: -Needs-Triage-M57 M-60 OS-Linux OS-Mac
Status: Untriaged (was: Unconfirmed)
Thanks for filing the issue.

Able to reproduce the issue on Windows 7, Mac 10.12.4 and Ubuntu 14.04 using Chrome stable version 58.0.3029.81 and canary 60.0.3075.0 with the steps mentioned in comment #0 and comment #2. Observed that the header.h file is downloaded as header.txt from www.messenger.com.

This is a non-regression issue, observed since M30. Confirming it to get more input from the dev team, hence marking it as 'Untriaged'.

Please find the attached screencast for reference.

711972.mp4 (1.2 MB)
Labels: Needs-triage-Mobile

Comment 7 by dah...@chromium.org, Apr 20 2017

Owner: xingliu@chromium.org
Status: Available (was: Untriaged)
Cc: asanka@chromium.org dtrainor@chromium.org qin...@chromium.org mmenke@chromium.org
The MIME type of the .h file in Facebook Messenger is text/plain, so net::GenerateFileName and EnsureSafeExtension will replace the extension with txt.

I think we do MIME sniffing for some security reason. Cc'ing people who might know about this. Is this expected behavior in Chrome, for better security?


The URL looks like this:
 https://cdn.fbsbx.com/v/t59.2708-21/17695926_777102685777968_2141803158614048768_n.h/base_file.h?oh=ea5f9f126bbc832d817aef466cfacc47&oe=5901E26B&dl=1


Comment 9 by asanka@chromium.org, Apr 26 2017

This isn't due to MIME sniffing. It's due to a number of heuristics that are in place to deal with legacy servers that don't set correct headers for downloads. In this particular case, the presence of the query string is taken to mean that the last "path" component of the URL doesn't convey a correct filename. Hence Chrome attempts to correct the file type based on the MIME type of the resource.

The correct server behavior would be to set a Content-Disposition header to indicate the desired filename. If the server doesn't recognize the file type, then it shouldn't set a Content-Type header.

Unnecessarily setting the Content-Type header forces the UA to treat the resource as being of that type, and this gets carried over to the on-disk filename as seen here.
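For illustration only, the server behavior recommended above could look roughly like the following Python sketch. The helper name `download_headers` is hypothetical, the filename is assumed to be plain ASCII, and nothing here reflects how the Facebook CDN is actually implemented.

```python
from email.message import Message


def download_headers(filename, content_type=None):
    """Build response headers for a download (hypothetical helper).

    The desired name is always sent in Content-Disposition. Content-Type is
    only added when the server actually knows the type of the file.
    """
    msg = Message()
    # Produces: attachment; filename="<filename>"
    msg.add_header("Content-Disposition", "attachment", filename=filename)
    if content_type:
        msg["Content-Type"] = content_type
    return dict(msg.items())


if __name__ == "__main__":
    for name, value in download_headers("header.h").items():
        print(f"{name}: {value}")
```

With an explicit filename in the header, the user agent no longer has to guess a name from the URL or the MIME type.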

I'd vote for WontFixing this bug. If we were to always treat the last component as authoritative for downloads with no Content-Disposition header, then we'd break a class of downloads where the last component just happens to be a script that's generating the content. So you'll end up with images being saved as generate.php for example.
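A rough Python sketch of the heuristic as described in this comment, not Chrome's actual code (the real logic lives in net::GenerateFileName and is more involved). The function name `suggest_filename`, the `PREFERRED_EXTENSION` table, and the example.test URLs are all assumptions made for illustration.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit

# Assumed stand-in for the platform's MIME-type-to-extension map.
PREFERRED_EXTENSION = {"text/plain": ".txt"}


def suggest_filename(url, content_type=None, disposition_filename=None):
    """Pick a download name (simplified, hypothetical model of the heuristic)."""
    # A filename given in Content-Disposition is authoritative.
    if disposition_filename:
        return disposition_filename

    parts = urlsplit(url)
    name = PurePosixPath(parts.path).name or "download"

    # A query string suggests the last path component may be a script name,
    # so the extension is corrected from the MIME type when one is known.
    preferred = PREFERRED_EXTENSION.get(content_type or "")
    if parts.query and preferred:
        return str(PurePosixPath(name).with_suffix(preferred))
    return name


if __name__ == "__main__":
    print(suggest_filename("https://example.test/f/base_file.h?dl=1", "text/plain"))
    print(suggest_filename("https://example.test/f/generate.php?id=7", "text/plain"))
```

Under these assumptions, a URL ending in base_file.h?dl=1 served as text/plain comes out as base_file.txt, while the same URL served as application/octet-stream keeps its name because the table has no preferred extension for that type.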
I see, thanks for the comment here.

The Facebook server didn't send a Content-Disposition header here.

Will double-check with other browsers and then mark it as WontFix.

One weird thing I noticed: if I query the URL with a REST client or curl, the Facebook server does send a Content-Disposition header.
Probably worth checking what's going on using a net-internals log. Is the Content-Disposition header correctly formatted? Chrome will ignore malformed C-D headers.
Status: WontFix (was: Available)
Just double-checked the headers in Chrome. The Content-Disposition header is present and well formed, but it contains only "attachment" with no filename specified. In that case Chrome honors the MIME type instead of the file extension derived from the URL.

Marked as won't fix for now.

Response headers:
HTTP/1.1 200 OK
Last-Modified: Tue, 25 Apr 2017 21:41:50 GMT
Content-Type: text/plain
Content-Disposition: attachment
Expires: Tue, 09 May 2017 22:19:49 GMT
Cache-Control: max-age=1209600, no-transform
Date: Wed, 26 Apr 2017 01:37:49 GMT
Connection: keep-alive
Content-Length: 11199
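
To make the missing filename concrete, here is a small sketch that splits a Content-Disposition value into its disposition type and its filename parameter using Python's standard email.message.Message. The helper name `disposition_parts` is hypothetical, and this is not the parser Chrome uses.

```python
from email.message import Message


def disposition_parts(header_value):
    """Return (disposition type, filename or None) for a header value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_content_disposition(), msg.get_filename()


if __name__ == "__main__":
    # The value seen in the response headers above: no filename parameter.
    print(disposition_parts("attachment"))
    # What a server would send to name the file explicitly.
    print(disposition_parts('attachment; filename="header.h"'))
```

For the value in the response above, the disposition type is "attachment" and the filename is None, so the name has to come from somewhere else.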


If it is an attachment, should we just respect the filename from the URL? "attachment" means the link is meant for download, unlike those download.php URLs.
In general I'm not sure. A PHP URL can serve the file in the response; if it doesn't provide a filename in Content-Disposition, then we will end up with a .php file.


A Content-Disposition of 'attachment' is a clear indication that the server intends the resource to be downloaded, but without a filename attribute it does not lend credence to the URL being authoritative.

IOW, the UA can conclude that the resource should be downloaded, but not what it should be named.
I think the issue is that text/plain covers a large set of file extensions (see https://www.sitepoint.com/web-foundations/mime-types-complete-list/). If we know that the .h extension already maps to text/plain, then we shouldn't rename it to .txt.

But then we need to maintain that MIME type <-> extension map, which could be constantly changing.
We use the system's map, and there's apparently something installed locally that says text/plain means SDX, presumably.
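
A hedged sketch of the idea discussed in the last few comments: keep the extension when the map already associates it with the served MIME type, otherwise fall back to the map's preferred extension. The function name `reconcile_extension` and the example maps are assumptions; Chrome consults the operating system's map rather than a dictionary like this.

```python
from pathlib import PurePosixPath


def reconcile_extension(filename, content_type, mime_map):
    """Keep or replace the extension based on a MIME-to-extensions map.

    mime_map maps a MIME type to a list of extensions, preferred one first.
    This is a hypothetical model, not Chrome's EnsureSafeExtension.
    """
    known = mime_map.get(content_type)
    if not known:
        return filename  # Nothing known about this type: leave the name alone.
    path = PurePosixPath(filename)
    if path.suffix.lower() in known:
        return filename  # Extension is already compatible with the type.
    return str(path.with_suffix(known[0]))


if __name__ == "__main__":
    # A map that knows .h is text/plain keeps the name.
    print(reconcile_extension("header.h", "text/plain", {"text/plain": [".txt", ".h", ".hh"]}))
    # A local map where some installed software claimed text/plain for .sdx.
    print(reconcile_extension("header.h", "text/plain", {"text/plain": [".sdx"]}))
```

With a map that lists only .sdx for text/plain, the sketch yields header.sdx, and with a map that lists only .txt it yields header.txt, which matches the two results the reporter saw before and after uninstalling Secure Download Manager.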
