File extension changes to *.sdx whenever I download a *.h (or *.hh) file via Facebook Messenger
Reported by 3ic...@gmail.com, Apr 16 2017
Issue description

UserAgent: Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36

Steps to reproduce the problem:
1. Create a simple header.h file (the contents can be anything; for example, I used "hi")
2. Send it to yourself on Facebook Messenger: https://www.messenger.com/
3. Click to download it

What is the expected behavior?
The header.h file should appear in my downloads folder (or be displayed in the browser).

What went wrong?
I get header.sdx instead of header.h. The file contents are intact, but the extension is wrong, forcing me to rename every C++ header file I download.

Did this work before? N/A
Chrome version: 57.0.2987.133 Channel: stable
OS Version: 10.0
Flash Version:

I experience a very strange file extension change: whenever I download a .h (or .hh) file (C++ headers) from Facebook, it gets renamed to .sdx (the extension associated with Microsoft's Secure Download Manager, which is installed on my computer). The file is sent to me as an attachment in Messenger.

My own file server displays .h and downloads .hh files normally. The bug is reproducible only in Facebook Messenger (they use a window.location redirect to serve the file) and only in Google Chrome. I did of course try other browsers (e.g. Microsoft Edge).

Furthermore, the hpp, cpp, pro, user, and ui file extensions all work fine. hhh works too, but not h or hh. The one exception is an *.h file that contains binary data rather than text: only then does it remain an .h file after download. But of course C source code is never in binary form; that would be named *.o, not *.h.

What is looking inside my downloaded header files and renaming them to sdx when they're plaintext / ASCII? Very strange.
,
Apr 16 2017
New development: I uninstalled Secure Download Manager (I should have done that to start with), and now my .h files become .txt upon downloading. Slightly better, but still not optimal. There is no security risk associated with plaintext header files, so why do they get renamed automatically?

Inspecting what the server gives me, the difference between how .h and .hhh files are served is:
.h: content-type: text/plain
.hhh: content-type: application/octet-stream

This might turn out to be a bug in Messenger. But then how come Microsoft Edge saves the .h file without renaming it to txt, while Chrome does not?
,
Apr 20 2017
Thanks for filing the issue. I was able to reproduce it on Windows 7, Mac 10.12.4, and Ubuntu 14.04 using Chrome stable 58.0.3029.81 and canary 60.0.3075.0 with the steps mentioned in comment#0 & 2. On www.messenger.com, the header.h file is downloaded as header.txt. This is a non-regression issue, observed since M30. I am confirming it to get more input from the dev team, hence marking it as 'Untriaged'. Please find the attached screencast for reference.
,
Apr 26 2017
The MIME type of the .h file in Facebook Messenger is text/plain, so net::GenerateFileName and EnsureSafeExtension replace the extension with txt. I think we do MIME sniffing for security reasons. CCing people who might know more about this. Is this expected behavior in Chrome, for better security?

The URL looks like this: https://cdn.fbsbx.com/v/t59.2708-21/17695926_777102685777968_2141803158614048768_n.h/base_file.h?oh=ea5f9f126bbc832d817aef466cfacc47&oe=5901E26B&dl=1
,
Apr 26 2017
This isn't due to MIME sniffing. It's due to a number of heuristics that are in place to deal with legacy servers that don't set correct headers for downloads. In this particular case, the presence of the query string is taken to mean that the last "path" component of the URL doesn't convey a correct filename. Hence Chrome attempts to correct the file type based on the MIME type of the resource.

The correct server behavior would be to set a Content-Disposition header to indicate the desired filename. If the server doesn't recognize the file type, then it shouldn't set a Content-Type header. Unnecessarily setting the Content-Type header forces the UA to treat the resource as being of that type, and this gets carried over to the on-disk filename as seen here.

I'd vote for WontFixing this bug. If we were to always treat the last component as authoritative for downloads with no Content-Disposition header, then we'd break a class of downloads where the last component just happens to be a script that's generating the content. So you'd end up with images being saved as generate.php, for example.
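The heuristic described in this comment can be sketched roughly as follows. This is a simplified illustration in Python based only on the description above, not Chrome's actual net::GenerateFileName logic. The function name `suggest_filename`, the `PREFERRED_EXTENSION` table, and the example.com URLs are all hypothetical; Chrome consults the operating system's MIME map rather than a hard-coded table.

```python
from email.message import Message
from pathlib import PurePosixPath
from urllib.parse import unquote, urlsplit

# Hypothetical MIME-to-extension table, for illustration only.
PREFERRED_EXTENSION = {"text/plain": "txt", "image/png": "png"}


def suggest_filename(url, content_type=None, content_disposition=None):
    """Pick a download filename from response metadata (simplified sketch)."""
    # 1. An explicit filename in Content-Disposition wins.
    if content_disposition:
        msg = Message()
        msg["Content-Disposition"] = content_disposition
        explicit = msg.get_filename()
        if explicit:
            return explicit

    # 2. Otherwise fall back to the last path component of the URL.
    parts = urlsplit(url)
    name = unquote(PurePosixPath(parts.path).name) or "download"

    # 3. A query string is taken as a hint that the last component may be
    #    a script, so the extension is corrected from the MIME type.
    mime = (content_type or "").split(";")[0].strip().lower()
    preferred = PREFERRED_EXTENSION.get(mime)
    if parts.query and preferred:
        name = str(PurePosixPath(name).with_suffix("." + preferred))
    return name


print(suggest_filename("https://cdn.example.com/files/base_file.h?dl=1",
                       "text/plain", "attachment"))
print(suggest_filename("https://cdn.example.com/generate.php?id=7", "image/png"))
```

Under these assumptions the first call yields base_file.txt, mirroring the reported rename, while the second shows the case the heuristic exists for: an image served by generate.php is saved with an image extension instead of .php.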
,
Apr 26 2017
I see, thanks for the comment. The Facebook server didn't send a Content-Disposition header in this case. I will double-check with other browsers and then mark this as Won't Fix. One weird thing I noticed: if I query the URL with a REST client or curl, the Facebook server does send a Content-Disposition header.
,
Apr 26 2017
Probably worth checking what's going on using a net-internals log. Is the Content-Disposition header correctly formatted? Chrome will ignore malformed C-D headers.
,
Apr 26 2017
I just double-checked the headers in Chrome. The header is well formed, but the Content-Disposition only says "attachment" with no filename specified. In that case Chrome honors the MIME type instead of the file extension taken from the URL. Marking as Won't Fix for now.

Response headers:
HTTP/1.1 200 OK
Last-Modified: Tue, 25 Apr 2017 21:41:50 GMT
Content-Type: text/plain
Content-Disposition: attachment
Expires: Tue, 09 May 2017 22:19:49 GMT
Cache-Control: max-age=1209600, no-transform
Date: Wed, 26 Apr 2017 01:37:49 GMT
Connection: keep-alive
Content-Length: 11199
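To make the distinction concrete, here is a small Python sketch that splits a Content-Disposition value into its disposition type and its filename parameter. It uses Python's standard email parser, which is only a stand-in for illustration and not the parser Chrome uses; the helper name `parse_content_disposition` is hypothetical.

```python
from email.message import Message


def parse_content_disposition(value):
    """Return (disposition type, filename or None) for a header value."""
    msg = Message()
    msg["Content-Disposition"] = value
    return msg.get_content_disposition(), msg.get_filename()


# The header observed in this bug: a download is requested, but no name is given.
print(parse_content_disposition("attachment"))

# What a server would send to name the file explicitly.
print(parse_content_disposition('attachment; filename="header.h"'))
```

With a bare "attachment" the filename comes back as None, so a client has to derive the name from something else, such as the URL or the Content-Type.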
,
Apr 26 2017
If it is an attachment, should we just respect the filename from the URL? "attachment" means the link is meant for download, unlike those download.php URLs.
,
Apr 26 2017
In general I'm not sure. A PHP URL can serve the file in the response; if it doesn't provide the filename in Content-Disposition, we will end up with a .php file.
,
Apr 26 2017
A Content-Disposition of 'attachment' is a clear indication that the server intends the resource to be downloaded, but without a filename attribute it does not lend credence to the URL being authoritative. IOW, the UA can conclude that the resource should be downloaded, but not what it should be named.
,
Apr 26 2017
I think the issue is that text/plain covers a large set of file extensions; see https://www.sitepoint.com/web-foundations/mime-types-complete-list/. If we know that the .h extension already maps to text/plain, then we shouldn't rename it to .txt. But then we would need to maintain that mimetype<->extension map, which could be constantly changing.
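The alternative proposed in this comment could look roughly like the following Python sketch: keep the existing extension when it is already known to belong to the served MIME type, and only rewrite it otherwise. The `EXTENSIONS_BY_MIME` table and the function name `ensure_compatible_extension` are hypothetical and exist only for illustration; maintaining such a table is exactly the cost the comment points out.

```python
from pathlib import PurePosixPath

# Hypothetical mimetype -> extensions table; the first entry is the default.
EXTENSIONS_BY_MIME = {"text/plain": ["txt", "h", "hh", "c", "log"]}


def ensure_compatible_extension(filename, mime_type):
    """Rewrite the extension only if it does not fit the MIME type."""
    known = EXTENSIONS_BY_MIME.get(mime_type)
    if not known:
        return filename  # unknown type: nothing to correct against
    path = PurePosixPath(filename)
    current = path.suffix.lstrip(".").lower()
    if current in known:
        return filename  # e.g. header.h served as text/plain stays header.h
    return str(path.with_suffix("." + known[0]))


print(ensure_compatible_extension("header.h", "text/plain"))
print(ensure_compatible_extension("generate.php", "text/plain"))
```

Under this assumed table, header.h is left alone while generate.php still becomes generate.txt, so the script-URL case keeps working.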
,
Apr 26 2017
We use the system's map, and presumably something installed locally says that text/plain means SDX.
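As a loose analogy for how a system-wide map can change the result, Python's standard mimetypes module also loads platform MIME tables (the registry on Windows, files such as /etc/mime.types elsewhere) on top of its built-in defaults. The sketch below only illustrates that idea and says nothing about how Chrome reads the map; the helper name `preferred_extension` is hypothetical, and the printed value depends on the machine.

```python
import mimetypes


def preferred_extension(mime_type):
    """Return the extension the local MIME map prefers for mime_type, or None."""
    # init() loads the platform's MIME tables in addition to the built-in defaults.
    mimetypes.init()
    return mimetypes.guess_extension(mime_type)


# The answer can differ between machines depending on what is installed.
print(preferred_extension("text/plain"))
```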