Add support for Metalink XML Download Description Format
Reported by anthonyb...@gmail.com, Sep 7 2008
Product Version: 0.2.149.29
URLs (if applicable): http://www.metalinker.org/
Other browsers tested:
Add OK or FAIL after other browsers where you have tested this issue:
Safari 3: FAIL
Firefox 3: FAIL
IE 7: FAIL

What steps will reproduce the problem?
1. Download a .metalink, such as http://www.metalinker.org/samples/OOo/OOo_2.4.1_Win32Intel_install_en-US.exe.metalink

What is the expected result?
The download agent parses the XML file and uses a resource URL to start the download. The partial file checksums can be used to detect errors during or after the download. If a server goes down during the transfer, backup URLs can be switched over to. All this is done transparently for the person at the browser with no interaction required.

What happens instead?
The .metalink file is downloaded but not processed and the files referred to in the .metalink are not downloaded.

Please provide any additional information below. Attach a screenshot if possible.
Metalink is currently used by around 35 programs, mostly download programs, download managers, FTP clients, & web browsers. It's currently used by mostly open source projects like OpenOffice.org, openSUSE, Ubuntu, Fedora, cURL, and others. For most people to take advantage of Metalink, it needs to be included in browsers. Chromium could be the first open source browser to add support. Metalinks give more definite information about a download, which should be more conducive to searching.

http://en.wikipedia.org/wiki/Metalink
http://www.metalinker.org/
Sep 8 2008,
Jan 8 2009,
Jan 30 2009,
This feature request is fairly involved. First, it requires rewriting the current download manager. A quick search for "metalink" at http://aria2.svn.sourceforge.net/viewvc/aria2/trunk/src/ returns 58 different files. (OK, there are probably too many source files in this implementation.) That is in addition to the multipart download logic that would need to be added. I wouldn't object to a contribution, so I'll leave this issue available.
May 15 2009,
Some of the download programs that support metalink support multipart/multi-source downloads, but that is not a requirement. (aria2 is the most advanced metalink client.) From what I've read, Chromium already uses libxml2 (which aria2 also uses) for XML and NSS for checksums. These are the foundations of metalink support.

I can't dispute that this feature request is fairly involved. But perhaps it could be approached in stages, starting with the easiest and simplest and moving on from there:

Stage 1: extract a single FTP or HTTP URL from the metalink XML, then download from that URL (no multipart download, just a single source).
Stage 2: checksum the whole file to see if it has been corrupted.
Stage 3: fail over to alternate URLs if a server becomes unreachable.
Stage 4: use chunk checksums to detect errors in a download, and re-get only the chunks with errors so the download can be repaired.

Without getting too insanely complicated, or supporting multipart downloads, by stage 2 you've helped many people, especially those on Windows who don't have native checksumming tools (md5sum, etc). Many of the people who download OpenOffice.org, openSUSE, Ubuntu, Fedora, Sabayon, and other projects that use metalink are on Windows. The first step in dealing with download problems is usually a manual checksum verification, which is a support nightmare when dealing with inexperienced people. Metalink aims to make downloads much easier, to recover from transmission errors, servers going down, etc, and to complete without the user needing to know anything went wrong.
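As a rough illustration of stages 1 and 2 above, here is a minimal Python sketch (not Chromium code). It assumes the XML layout and namespace of the later RFC 5854 format; older .metalink files use a different layout. The sample document, mirror URLs, and the helper names `parse_metalink` and `verify_whole_file` are made up for the example.

```python
import hashlib
import xml.etree.ElementTree as ET

# Namespace of the RFC 5854 format (an assumption: older files differ).
NS = {"ml": "urn:ietf:params:xml:ns:metalink"}

# Hypothetical sample: the payload, file name, and mirror URLs are invented.
SAMPLE_PAYLOAD = b"hello metalink"
SAMPLE = """<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.ext">
    <hash type="sha-256">{digest}</hash>
    <url priority="2">http://mirror2.example.com/example.ext</url>
    <url priority="1">ftp://mirror1.example.com/example.ext</url>
  </file>
</metalink>
""".format(digest=hashlib.sha256(SAMPLE_PAYLOAD).hexdigest())


def parse_metalink(xml_text):
    """Stage 1: list each file with its URLs (best priority first) and hashes."""
    root = ET.fromstring(xml_text)
    files = []
    for file_el in root.findall("ml:file", NS):
        # A lower priority number means a more preferred URL.
        url_els = sorted(
            file_el.findall("ml:url", NS),
            key=lambda u: int(u.get("priority", "999999")),
        )
        # Only direct <hash> children are whole-file hashes.
        hashes = {
            h.get("type"): h.text.strip()
            for h in file_el.findall("ml:hash", NS)
        }
        files.append({
            "name": file_el.get("name"),
            "urls": [u.text.strip() for u in url_els],
            "hashes": hashes,
        })
    return files


def verify_whole_file(data, hashes):
    """Stage 2: True/False for a SHA-256 match, None if no SHA-256 is listed."""
    expected = hashes.get("sha-256")
    if expected is None:
        return None
    return hashlib.sha256(data).hexdigest() == expected.lower()


entry = parse_metalink(SAMPLE)[0]
print(entry["urls"][0])  # the URL a single-source client would try first
print(verify_whole_file(SAMPLE_PAYLOAD, entry["hashes"]))
```

A single-source client would download from `entry["urls"][0]` and only then call `verify_whole_file`, so no multipart logic is needed for these two stages.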
May 16 2009,
For parsing metalink files, libmetalink (https://launchpad.net/libmetalink) is very handy. It is a C library that parses a metalink file into a C structure from which you can retrieve URLs. libmetalink uses libxml2. Simple example code is included in the archive. FYI, the mulk project (https://sourceforge.net/projects/mulk) uses libmetalink. I agree with Anthony that multi-part downloading is not a requirement; rather, it is an optional "bonus" feature for metalink.
Jun 18 2009,
Jun 24 2009,
I would like to add these notes that Bram Neijt took when looking for possible places where to hook in: http://groups.google.com/group/metalink-discussion/msg/2c63ced761bc95e6?hl=en
Dec 17 2009,
Replacing labels: Area-BrowserBackend by Area-Internals
Dec 23 2009,
The Internet Draft version of Metalink is in IETF Last Call. A review from anyone would be great, and one from browser people would be especially nice. http://tools.ietf.org/html/draft-bryan-metalink
Jun 2 2010,
RFC 5854 'The Metalink Download Description Format' is out. http://tools.ietf.org/html/rfc5854
Nov 22 2010,
This is something that's probably best handled by an extension. We'd want a download manager extension API that's rich enough to implement something like metalink or bittorrent. I think this is the current proposal: http://www.chromium.org/developers/design-documents/extensions/downloads-api If you're interested in implementing metalink support as an extension, you might want to take a look at that API and email firstname.lastname@example.org with any feedback.
Nov 22 2010,
Thanks for the info about the downloads API. I sent some feedback.
Dec 29 2011,
Nobody seems to be mentioning that, if support for Metalink were to be implemented as a feature in WebKit, it'd be a great, low-overhead alternative to .zip files for things like GMail's "download all attachments" link. Yes, checksumming and multi-source downloading are important, but we still have no proper download equivalent to HTML5 <input type="file" multiple> and, at the very least, checksumming does have a simpler proposed alternative (Content-MD5 HTTP header). There are quite a few desktop applications that, if they're to ever be comfortably implemented in the browser, need that kind of functionality and extending the drag-and-drop API won't cut it. (BIG hassle to HAVE to open a file manager window just to save something. Drag-and-drop is for when you've already got a file manager on hand to use as the source or destination.)
Dec 30 2011,
Content-MD5 is fundamentally broken, because early HTTP specification-makers drew incorrect analogies between HTTP and MIME. If you're going to do any checksumming, use RFC 3230.
Apr 6 2012,
Metalink is in Google Summer of Code this year and is looking for a mentor from Chromium to help one of our students add native metalink support. We want to use our own GSoC slot on this. The student has already written an extension with metalink support. If anyone is interested, please contact me! :)
Apr 10 2012,
Hi! I'm the primary author of the downloads extension API, and I also work on the downloads system in general. We've been thinking about implementing chunked downloading in C++ in src/content/browser/download at some point, but it isn't on any roadmap. I'm not sure if I have the cycles to be a mentor. I'll need to talk to my team. Even if I can't be the mentor, please feel free to contact me directly if you have any questions about the extension API or the C++ downloads system.
Apr 10 2012,
Hey, this is Sundaram, working on the Chrome extension for downloading metalinks. For downloading the file, I use the Chrome experimental downloads API. I have a few doubts regarding the flexibility of the API. I believe that the extensions API has the following limitations. Please correct me if I'm wrong about any of this.

1. Metalinks provide the ability to check for errors in the downloaded file. However, I would have to checksum the file using XHR and download the file using the experimental downloads API. This means the same file would have to be downloaded twice to check for errors in the data.
2. Metalinks provide information about multiple mirrors. Thus, multi-sourced downloads are theoretically possible. However, the downloads API does not have options for that.
3. If a particular piece is erroneous, the API does not allow you to download that piece alone from another mirror.

I don't expect the API to support all of this, as it extends only the browser's core functionality. Basically, what the extension does is really minimal, and we would want to expand on it to make use of all the advantages of metalinks. So at this stage, we can probably look at:

1. An NPAPI plugin.
2. Native Chromium support.

NPAPI has its share of disadvantages. Plus, it will be an add-on at the end of the day. So native Chromium support would be ideal. Thanks
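To make limitation 3 concrete, here is a hedged Python sketch of what per-piece repair could look like outside the extension API: hash each fixed-size piece, find the ones that do not match, and re-fetch only those byte ranges. The function names and the `fetch_range` callback are hypothetical; a real client would issue an HTTP request with a `Range` header to another mirror in its place.

```python
import hashlib


def piece_hashes(data, piece_length):
    """SHA-256 hex digest of each fixed-size piece (the last may be shorter)."""
    return [
        hashlib.sha256(data[start:start + piece_length]).hexdigest()
        for start in range(0, len(data), piece_length)
    ]


def damaged_ranges(data, total_size, piece_length, expected):
    """Inclusive (start, end) byte ranges of pieces whose hash does not match."""
    ranges = []
    for index, expected_hash in enumerate(expected):
        start = index * piece_length
        actual = hashlib.sha256(data[start:start + piece_length]).hexdigest()
        if actual != expected_hash:
            end = min(start + piece_length, total_size) - 1
            ranges.append((start, end))
    return ranges


def range_header(start, end):
    """Value of an HTTP Range request header for one damaged piece."""
    return "bytes={}-{}".format(start, end)


def repair(data, ranges, fetch_range):
    """Replace each damaged range with bytes from fetch_range(start, end)."""
    buf = bytearray(data)
    for start, end in ranges:
        buf[start:end + 1] = fetch_range(start, end)
    return bytes(buf)


# Demo with in-memory data standing in for a download and a second mirror.
original = bytes(i % 251 for i in range(1000))
expected = piece_hashes(original, 256)
broken = bytearray(original)
broken[990] ^= 0xFF  # flip bits in the last piece
bad = damaged_ranges(bytes(broken), len(original), 256, expected)
fixed = repair(bytes(broken), bad, lambda s, e: original[s:e + 1])
print([range_header(s, e) for s, e in bad], fixed == original)
```

Only the damaged piece is fetched again, which is the point of the partial hashes: the rest of the download is kept as-is.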
Apr 11 2012,
To checksum chunks without downloading them twice, you can write the chunks to an HTML5 FileSystem blob, then download() the blob. This means that the download's URL and referrer are useless unless you maintain your own database and provide your own manager UI that incorporates that database. This is certainly a drawback, but it still seems like a possibility.

I'm not sure what you mean by multi-sourced downloads. The onChanged event contains error information if a download fails; a handler for this event might fall back to a mirror.

So, let's talk about extending content/browser/download to handle metalinks. The issue here is that c/b/d has accrued a significant amount of technical debt. There are some long-standing bugs in c/b/d, and it's difficult to add new features cleanly. It's improved in the last year and a half or so, but it's still a long way from being lean enough that I would be comfortable with adding the significant complexity that metalinks require. (I haven't talked to my team about this, so I'll update if they feel differently.) We're planning on starting a massive refactoring effort soon to pay down this technical debt. I hope I'm being pessimistic, but I would be surprised if we finished all the refactoring tasks before summer 2013. Of course, we may reach diminishing returns and decide that c/b/d is lean enough to support additional complexity before then.

The other side of the seesaw is the importance of metalinks. I do not feel like I know how important metalinks is to the web. If every web admin and every other major browser supports metalinks and the user experience is significantly better with metalinks than without, if chrome is holding the web back, then we might be able to afford adding the complexity before/during the refactoring marathon. I understand all the technical advantages of metalinks, and I see that DTA and FlashGot support most of it, but if most users and servers aren't using metalinks, then we have a harder time justifying the additional complexity and engineer-hours. I imagine that the truth is somewhere between those extremes; I just don't know where.

If c/b/d were already in better condition and if I had a better idea of how important metalinks is to the web, then I would probably say "Patches welcome!" As it is, however, the more prudent course of action seems to me to be to try to take the extension API as far as it will go, release the metalinks extension, gauge/drum up interest, and try again for native support in a few months when c/b/d is in better condition and we can see how many users install and use the metalinks extension. (Again, I haven't talked to my team, so I'll update if they feel otherwise.)
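The mirror fallback idea mentioned above is small enough to sketch. This is plain Python rather than the extension API (where the equivalent logic would live in an onChanged handler), and `download_with_failover` and `fetch_with_urllib` are hypothetical names chosen for the example. The fetch function is passed in so the ordering logic can be exercised without a network.

```python
import urllib.request


def fetch_with_urllib(url, timeout=30):
    """One possible fetch function: read the whole body with urllib."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_with_failover(urls, fetch=fetch_with_urllib):
    """Try each mirror in order and return (url, data) from the first success.

    Network failures surface as OSError (urllib's URLError is a subclass),
    so any OSError moves on to the next mirror.
    """
    failed = []
    for url in urls:
        try:
            return url, fetch(url)
        except OSError as exc:
            failed.append("{} ({})".format(url, exc))
    raise OSError("all mirrors failed: " + "; ".join(failed))


# Demo with a fake fetch so nothing touches the network.
def fake_fetch(url):
    if "down" in url:
        raise ConnectionError("unreachable")
    return b"payload"


used, data = download_with_failover(
    ["http://down.example.com/f", "http://up.example.com/f"], fake_fetch
)
print(used, len(data))
```

This restarts the file from the next mirror. Resuming mid-file from another mirror would additionally need byte-range requests.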
Apr 11 2012,
> I do not feel like I know how important metalinks is to the web.

This is a perfect example of a chicken-and-egg problem. I also like what metalinks are able to do. For example, I manage the mirrors of a big (0.5 GB) open source program, and in some parts of the world the internet is so flaky and slow that you have no chance of getting this file with just simple HTML or FTP. Resume capabilities are important, and selecting good mirrors automatically is nice.

BUT: there is also another technology in the wild that is, in my eyes, a replacement for metalinks. It's the "WebSeed" [ws] feature for torrents. BitTorrent is much more popular, and if your client supports webseeds (many do!) you can use them equally well for the same purpose as metalinks. So, if Chrome supported everything needed to create a BitTorrent extension with webseeds, you would get everything you need. Note: for this OSS project, I also create torrents with webseeds and I have no complaints.

[ws] http://getright.com/seedtorrent.html (but there are other implementations)
Apr 11 2012,
Apr 11 2012,
It is a bit of a chicken-egg problem. The solution to the actual chicken-egg problem is, of course, the Zen/quantum state of "neither and both": chickens evolved gradually, so there wasn't a 'first' chicken or a 'first' chicken egg that was significantly different from its parent almost-chicken/almost-chicken egg. The same solution applies here: let's evolve chrome's metalink support one step at a time. Chrome's metalink support may be "complete enough" (enough like a chicken) when a "complete enough" extension is in the webstore, even if the totally complete native implementation (modern chicken) isn't justifiable until much later. The extension may spawn (the justification for) the native implementation.
Apr 12 2012,
Ben, thanks for the comments and help. I understand where you're coming from. We appreciate your tips and will work to improve the extension & take it as far as an extension can go! :)

A few questions:
1) How stable is the downloads extension API, or when will it be non-experimental?
2) Extensions that use experimental APIs can't be in the webstore, right? (It seems that, for users, this could be even more perplexing than installing an external download manager: first you'd have to enable the experimental APIs, then find the extension somewhere besides the webstore.)

As to the importance of metalinks, it is admittedly niche, but inclusion in a browser will solve problems for many people, as far as downloads go. Will every site use metalinks? Certainly not, but many download sites would like to take advantage of it if minimal features are supported in a mainstream browser. It's supported by around 40 download applications, mostly download managers but some p2p, browser, FTP, and system update utilities (all Red Hat/CentOS/etc systems use it). Support in more mainstream apps is coming, thanks to Google Summer of Code too! :) Quite a few Linux distributions use it for ISO (or more) downloads, like Ubuntu, Fedora, & openSUSE. Other projects like KDE, LibreOffice, OpenOffice, FSF, Sugar, Xfce, & XBMC use it too. So you have 7 years of highly technical early adopters.

I understand some of the metalink features add complexity, but I too would like to start and stay as simple as possible. The first step is verifying a file's hash is correct after download. This seems like a minimal change and that alone will help many people who are not familiar with hashing files or do not even have reasonable options (for average people) to even attempt on their OSes. I think the next step would be switching to another mirror if a server went down during download, and the final step could be repairing downloads with the partial file hashes. That's more complex but optional of course, and would help many people as well.

Use cases:
* downloading large files (error repair) like Linux distributions, software, and games ( https://forums.eveonline.com/default.aspx?g=posts&m=51440 )
* downloading files available on a CDN or mirror network w/ failover if some nodes are unreachable
* webmail could use metalinks for "Download All Attachments" instead of putting all files in a .ZIP archive
* downloading a whole album & creating a directory structure instead of putting already compressed files in an archive
* sites publish mirror/CDN info so caches can recognize more hits properly

As far as webseeds go (this doesn't concern Chrome), that's a feature complementary to metalink. BitTorrent is obviously insanely popular for certain things, but there are reasons why no major browser supports it. Metalink also offers a way to transparently go from a normal web download to a BitTorrent or p2p download too (without having to click on a torrent). (Some of the architects of the web have called it a "bootstrap into a peer-to-peer system" for the web: http://www.w3.org/2001/tag/2012/01/06-minutes#item02 )

And finally, this bug doesn't mention Metalink/HTTP (RFC 6249), which provides mirrors & hashes in HTTP headers:

Link: <http://www2.example.com/example.ext>; rel=duplicate
Link: <http://example.com/example.ext.asc>; rel=describedby; type="application/pgp-signature"
Digest: SHA-256=MWVkMWQxYTRiMzk5MDQ0MzI3NGU5NDEyZTk5OWY1ZGFmNzgyZTJlO
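For readers unfamiliar with Metalink/HTTP, here is a small Python sketch of how a client might read headers like the ones quoted above. It is deliberately simplistic: it assumes each Link value describes exactly one link, and it assumes the Digest value is the base64 encoding of the raw SHA-256 bytes, which is my reading of RFC 3230 and its SHA-256 registration. The truncated Digest value quoted above is not used. The function names are hypothetical.

```python
import base64
import hashlib


def parse_link(value):
    """Split one Link header value into (url, params). Not a full parser."""
    target, _, rest = value.partition(">")
    url = target.strip().lstrip("<")
    params = {}
    for part in rest.split(";"):
        if "=" in part:
            key, _, val = part.partition("=")
            params[key.strip().lower()] = val.strip().strip('"')
    return url, params


def mirrors(link_values):
    """URLs advertised as copies of the same file (rel=duplicate)."""
    return [
        url
        for url, params in map(parse_link, link_values)
        if params.get("rel") == "duplicate"
    ]


def digest_header(data):
    """Digest value for data, assuming base64 of the raw SHA-256 bytes."""
    raw = hashlib.sha256(data).digest()
    return "SHA-256=" + base64.b64encode(raw).decode("ascii")


def matches_digest(data, header_value):
    """True/False for SHA-256 values, None for algorithms not handled here."""
    algorithm, _, encoded = header_value.partition("=")
    if algorithm.strip().upper() != "SHA-256":
        return None
    raw = hashlib.sha256(data).digest()
    return base64.b64encode(raw).decode("ascii") == encoded.strip()


links = [
    "<http://www2.example.com/example.ext>; rel=duplicate",
    '<http://example.com/example.ext.asc>; rel=describedby; '
    'type="application/pgp-signature"',
]
print(mirrors(links))
print(matches_digest(b"example body", digest_header(b"example body")))
```

A client that understands these headers can verify the body it already received and keep the `rel=duplicate` URLs as fallbacks, while a client that ignores them still gets an ordinary download.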
Apr 12 2012,
Apr 12 2012,
As I understand it, the advantage to Metalink/HTTP is that, if you serve up Metalink/HTTP headers in a 200 or 3xx response, you can provide a single link which will seamlessly provide the best possible experience whether or not the user agent supports Metalink... without requiring the user to know or care what Metalink is or whether they have it.
Apr 16 2012,
Ben, thanks for the info on the API & your thoughts. Good to have that clarified. Sundaram is looking into the webRequest API & your other suggestions.

For interaction design for the extension, we can probably take some cues from what download managers have been doing all these years. Usually we keep things simple, with options to investigate more for advanced users. If the download completes we don't show anything special, although we could show some type of indicator when the checksum is correct.

You're right about Metalink/HTTP, there's reliance on a central server. It also lessens the dependency on XML, so it's optional for chunk checksums. For an organization serving small files with a few mirrors and a whole file checksum, it could be simpler. It would be cool to have support in a lower level library. As you said, it's more general & not as specific to 'downloading a file', so it can be used for any resource or for site-level failover like you say.

No, there is no imgur-like site for metalinks (but some file hosting sites use it). That would be great! I'll share that idea & see what we can come up with. There wouldn't need to be voting on mirrors, since the clients will automatically cycle through & not rely on the ones that failed.

Stephan, you can use either the XML or HTTP variety of Metalink to "provide a single link which will seamlessly provide the best possible experience whether or not the user agent supports Metalink... without requiring the user to know or care what Metalink is or whether they have it." I think that's one of the strong points of Metalink!
Apr 16 2012,
@27: I'd been trying to figure out how to do fallbacks for non-Metalink user agents with the XML variety off and on (in among other work) for a week but I'd concluded it must have been done using Metalink/HTTP because the only places I could even find mentions of it were old discussions about how one might make it possible. Could you point me to the relevant documentation?
Apr 19 2012,
@28: Yes, it requires server side stuff. :) Either the client request contains

Accept: application/metalink4+xml

for transparent content negotiation (RFC 2295)... and/or the server advertises that a metalink is available (using the HTTP Link header field, part of Metalink/HTTP as you mentioned):

Link: <http://example.com/example.ext.meta4>; rel=describedby; type="application/metalink4+xml"

Here is a video of it in action: http://youtu.be/A4-03-Dn4R8
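The two server-side options above can be combined in one decision. The following Python sketch is only an illustration of that logic, not code from any real mirror redirector: `accepts_metalink` and `plan_response` are hypothetical names, the Accept parsing ignores q-values, and the returned dict is a made-up stand-in for whatever response object a real server framework uses.

```python
METALINK_TYPE = "application/metalink4+xml"


def accepts_metalink(accept_header):
    """True if the Accept header lists the metalink type (q-values ignored)."""
    for item in accept_header.split(","):
        media_type = item.split(";")[0].strip().lower()
        if media_type == METALINK_TYPE:
            return True
    return False


def plan_response(accept_header, file_url, metalink_url):
    """Decide what to serve for a request to the plain download URL."""
    if accepts_metalink(accept_header):
        # Metalink-aware client: hand over the metalink itself.
        return {
            "serve": metalink_url,
            "headers": {
                "Content-Type": METALINK_TYPE,
                # Tell caches that the answer depends on the Accept header.
                "Vary": "Accept",
            },
        }
    # Any other client: serve the file and advertise the metalink.
    link = '<{}>; rel=describedby; type="{}"'.format(metalink_url, METALINK_TYPE)
    return {"serve": file_url, "headers": {"Link": link}}


plan = plan_response(
    "text/html, */*;q=0.8",
    "http://example.com/example.ext",
    "http://example.com/example.ext.meta4",
)
print(plan["serve"])
print(plan["headers"]["Link"])
```

Either way the user follows the same link. Only the client's capabilities decide whether it ends up with the metalink or the direct download.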
Apr 19 2012,
Note that REST purists will surely tell you to use Link and not content negotiation, because the metalink is not semantically an alternative representation of the same resource.
Apr 19 2012,
@29: Thanks. That's what I'm looking for. Where should I have been looking and which search keywords should I have been trying in order to find that information myself?

@30: I'm generally very big on ReST, but I'm practical enough to recognize that, if the client is Accept:-ing application/metalink4+xml, it understands the implicit rel=describedby and there's always the chance that some client will support only that approach. I'll probably use a mix of all three approaches (Metalink/HTTP, Accept, Link) to ensure widest compatibility when I use Metalink.
Apr 28 2012,
Metalink Downloader, a Chrome extension, is ready for use! http://code.google.com/p/metalink-chrome-extension/

This is just the 0.1 release, but please try it out & file issues & comments there. There's much more work to do, but this is a great start and I think the best we can do with an experimental extension that can't be in the web store. Enable the experimental extension APIs as described on the download page to try it out.

@31: The use of the Link header for transparent metalinks probably isn't spelled out explicitly enough in RFC 6249; we'll have to fix that. TCN is deprecated, but some of the mirror redirectors liked it because it meant one less request when they were already doing 40 million a day.
Jun 21 2012,
Mar 10 2013,
May 14 2013,
May 17 2013,
I decided to mark this as WontFix, rather than a low priority available bug. The litmus test for this is: Even if someone provided a perfect patch for this feature, would we want to take it? In this case, I don't think we would - each new feature adds some amount of code bloat, build and test time cycles, and long-term maintenance. We decided not to support torrent files for a similar reason. The Chrome download extension API should work for this and it's available in dev channel and beta in Chrome 28. If this is not sufficient, it would be great to get feedback on why that is. Alternately, users should be able to specify "Always Open Files of this Type" in the download context menu, associate a default program on their OS with the metalink file type, and have a fairly seamless download experience.
May 17 2013,
@#36: I don't know about the download extension API (my concern is as a web developer, not a Chrome extension developer), but "Always Open Files of this Type" is useless for Metalink/HTTP since Chrome will just blindly use the HTTP direct download fallback without even realizing that a metalink file full of mirrors is being offered.
Feb 27 2016,
Metalink support is more vital now than ever with the internet expanding to new areas. Raise your hand if you've tried to download a Linux distribution on an internet connection in a developing country... If not, I'd like to tell you that getting the OS is often more difficult than using it. This support could also help with media distribution, and a lot more. So, eight years after the first bug report, I implore you to rethink the "wontfix" status. Metalink could improve the internet experience for billions of people.