sendBeacon/ping should be reported as XHR resource type
Issue description

Version: 50.0.2661.94 (64-bit)

M49 introduced a new "ping" ResourceType for requests initiated via navigator.sendBeacon and <a ping>. See crbug.com/512406 . I'd like to propose that we instead reclassify and report both of these as "xmlhttprequest":

- Giving these requests a separate bucket is a spurious distinction. At their core, they are XHR requests, sans success callback.
- The current distinction is being misused by extensions to block these requests, which is bad on multiple levels:
1) It breaks app functionality. For example, a video app wants to save the last video position as the user navigates away from the page. Because the video app does not want to hold the navigation from proceeding, it uses sendBeacon - e.g. navigator.sendBeacon("/savePosition", {"video": "ABC", "time": 423}). If an extension blocks such requests, it breaks the app.
2) It forces a bad user experience. Applications that want to make use of sendBeacon/ping are forced to detect whether such features are disabled and fall back to old-and-costly approaches: sync XHRs, spinning in a busy loop while waiting for the XHR to complete, and so on. Disabling these APIs does not "block" these requests; it only hurts the user experience, both functionally and in terms of performance.

In short, the current "ping" type is harmful for user experience. If an extension wants to block certain types of requests, it should do so based on other properties of the request (e.g. URL, payload, etc.), not its resource type.

What is the expected output? The "ping" resource type should be merged under "xmlhttprequest" in: https://code.google.com/p/chromium/codesearch#chromium/src/extensions/browser/api/web_request/web_request_api_helpers.cc&q=extensions/browser/api/web_request/web_request_api_helpers.cc&sq=package:chromium&l=54
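To make the video-app scenario above concrete, here is a minimal TypeScript sketch of the "try sendBeacon, otherwise fall back" pattern. The names `savePosition`, `BeaconCapable` and `FallbackSender` are hypothetical and only for illustration, and the navigator-like object is passed in so the logic can be exercised outside a browser. The payload is serialized to a JSON string here, on the assumption that the endpoint expects JSON; sendBeacon accepts strings, Blobs, FormData and similar body types rather than plain objects.

```typescript
// Minimal shape of the one navigator method this sketch relies on.
interface BeaconCapable {
  sendBeacon?: (url: string, data?: string) => boolean;
}

// Hypothetical fallback, e.g. a synchronous XHR in a real page.
type FallbackSender = (url: string, body: string) => void;

function savePosition(
  nav: BeaconCapable,
  fallback: FallbackSender,
  video: string,
  time: number,
): "beacon" | "fallback" {
  const body = JSON.stringify({ video, time });
  // sendBeacon returns false when the browser refuses to queue the request,
  // and the method may be missing entirely if it was removed or is unsupported.
  if (typeof nav.sendBeacon === "function" && nav.sendBeacon("/savePosition", body)) {
    return "beacon";
  }
  fallback("/savePosition", body);
  return "fallback";
}
```

In a page this might be called as `savePosition(navigator, syncXhrPost, "ABC", 423)`, where `syncXhrPost` stands for whatever costly fallback the app already has.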
,
May 12 2016
I added them because some researchers wanted to identify such requests. You should complain to extension authors if they break your website. If the issue is really as severe as you claim, then it'd be technically possible to require a (warningless) permission in order to block ping requests. The only purpose of this would be educating extension devs, at the cost of letting requests go through, so this is not really a serious option. Due to a bug, the ping resource type is not yet visible in the webRequest docs, so I think that the number of extensions that misuse it is limited, and I really suggest complaining to the extension authors instead of here. (Also, due to another bug it was not possible to directly filter on ping requests, which was fixed in 50 or 51 IIRC. Extensions had to explicitly check the details in the callback if they wanted to block ping requests.)
,
May 12 2016
Which researchers, and what questions are they trying to answer? I stand by what I said earlier: it's a spurious distinction and one that harms user experience.
> If the issue is really as severe as you claim, then it'd be technically possible to require a (warningless) permission in order to block ping requests.
No, this misses the point. That's a meaningless setting because anything you can do with sendBeacon/ping can also be done with XHR or another mechanism; in practice you don't "block" anything, you just redirect these requests to other mechanisms and force a worse-off user experience. If an extension wants to filter requests, it should look at the URL, request headers, payload, etc. Which is to say, exactly as they do today for all other non-"ping" resource types. Blocking a resource just because it doesn't have a callback is counter-productive; the fact that a resource doesn't have a callback is not a meaningful signal for what it communicates. Re bugs and docs: actually, that's good news. It means we can remove this with minimum pain for everyone, before this becomes an even more widespread problem.
,
May 12 2016
I have notified the researchers of this issue, in case they want to chip in. I've looked around and can confirm that there is some FUD about sendBeacon (e.g. uBlock has an option to disable ping/sendBeacon and links to several sources that put these APIs in a bad light [1]). Given the evidence of misuse, and the fact that any privacy-harming characteristic can easily be implemented with other performance-killing APIs, I don't oppose removing the "ping" label from sendBeacon requests. <a ping> is specified and advertised as an unreliable mechanism specifically designed for tracking [2], so it may be useful to keep the "ping" label for <a ping> (so that extensions can (selectively) disable <a ping>).
As for the implementation:
- Blink should then differentiate between the two request types, so that sendBeacon falls in the "other" bucket again. If I recall correctly, image requests at unload were also tagged as ping. These should still be labeled as ping (since they are probably tracking), which may encourage developers to switch to navigator.sendBeacon as a positive side effect.
- At the content layer, a new item must be added to an enumeration, which as a result leads to a lot of changes for UMA (like https://crbug.com/410382#c39 ).
Though I wonder whether removing sendBeacon from ping really helps that much. Extension devs could listen for the "other" type, check whether it is a POST request without frameId/tabId, and from that infer that it may be a beacon request and block it. Devlin, what do you think? Should we change the type of sendBeacon requests? PS. The implementation bug that I mentioned before is bug 591988 (fixed in 50).
[1] https://github.com/gorhill/uBlock/wiki/Disable-hyperlink-auditing-beacon
[2] https://html.spec.whatwg.org/multipage/semantics.html#hyperlink-auditing
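The heuristic described in the comment above (type "other", POST, no frameId/tabId) could be sketched in TypeScript as follows. `RequestDetails` is a hypothetical subset of the details object that webRequest listeners receive, and the sketch assumes that a request detached from any tab or frame is reported with `tabId` and `frameId` equal to -1. It is a guess-based filter, not a reliable beacon detector.

```typescript
// Hypothetical subset of the webRequest details object used by this sketch.
interface RequestDetails {
  type: string;    // resource type label, e.g. "other", "xmlhttprequest"
  method: string;  // HTTP method
  tabId: number;   // assumed to be -1 when the request belongs to no tab
  frameId: number; // assumed to be -1 when the request belongs to no frame
}

// Returns true when a request merely *looks like* it could be a beacon.
// Other detached POST requests would match the same signature.
function mightBeBeacon(details: RequestDetails): boolean {
  return (
    details.type === "other" &&
    details.method === "POST" &&
    details.tabId === -1 &&
    details.frameId === -1
  );
}
```

Because the signature only describes "a detached POST", an extension that blocks everything for which `mightBeBeacon` returns true would likely catch unrelated requests as well.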
,
May 12 2016
My position from #1 really hasn't changed much - I think changing our API reactively to extension behavior is an unpleasant situation. I'm more amenable to changing sendBeacon, since I can see arguments for it reasonably falling into either category, and certainly won't block that change. That said, has anyone spoken with the [Ad|U]Block[Plus] folks about this? I *do* strongly think that there should be a reasonable to somehow differentiate between pings/beacons/xhrs/etc, even if they are initially lumped into the same category. As such, are these blockers blocking pings because a) They mistakenly thought they were scary and don't really care, b) They just had a lump catch-all to block anything that wasn't x, y, or z, and so adding 'ping' made it fall into that bunch, or c) They really intentionally block pings and beacons. If c), then if we change the API (and there is still some differentiating factor between XHR and sendBeacon), then, mistaken or not, won't they continue blocking those requests?
,
May 12 2016
second paragraph: "...reasonable *way* to somehow..."
,
May 12 2016
Historically, we supported targeting <a ping> individually in Adblock Plus filter lists. However, since that feature was disabled by default in Firefox (and still is) and Chrome wasn't even around back then, we removed that feature. About half a year ago, though, we reintroduced the $ping filter option with new semantics so that it matches <a ping> along with sendBeacon(): https://adblockplus.org/development-builds/reintroducing-the-ping-filter-option We did so because that matches Chrome's (new) semantics, but also because people actually use sendBeacon() now, and because we believe it to be used for the same purpose as <a ping>. But to be honest it's not heavily used so far. However, I disagree that it's related to XHR, at least not any more than XHR is related to images in the DOM. I wouldn't mind having different types for <a ping> and sendBeacon(). Mozilla/Gecko does that, and merging those on our end is quite trivial. However, making Chrome's request types even more ambiguous would be a bad idea, in particular if it breaks backwards compatibility again, IMO.
,
May 13 2016
Some extensions disabling sendBeacon is very unfortunate. If differentiating resource types for sendBeacon and ping can help (and can work for, say, adblockplus, as commented in #7) I think we can work on this. Yoshino-san: do you think this can be tracked by you (reg: impl side, as far as we can agree on the idea)?
,
May 13 2016
"Mozilla/Gecko does that and merging those on our end is quite trivial." Does this mean that the blocking would occur based on other heuristics (e.g. URL, header, payload) as recommended by Ilya? (See comment #3.) As Ilya explained, plain and simple blocking of <a ping> and sendBeacon is pointless and, as a consequence, harmful to the user experience.
,
May 13 2016
hi, my name is Sergio Maffeis and i'm an academic at Imperial College. i'm not a browser developer so i realise your priorities may be different, but i would like to contribute a slightly different point of view. my colleague Chris Novakovic and i started working on an extension to categorize the various web requests issued by a page, as part of a tool to support research in web security and privacy. we found that beacons were not associated with the iframe issuing them, and were not correctly labelled as such, but got the "other" label. it is important for all requests to be labelled by their type and context, as a request can be malicious even just based on the context where it was issued. for example, a form submitted via xhr while a user is browsing a page can be good, but the same form submitted by a beacon as the user leaves the page may be a sign of something fishy going on. so we asked for Rob's help, as these issues were related to a group of bug reports already filed by others on the Chromium bug tracker:
https://code.google.com/p/chromium/issues/detail?id=522124
https://code.google.com/p/chromium/issues/detail?id=522129
https://code.google.com/p/chromium/issues/detail?id=512406
to cut a long story short, i mostly concur with the author of comment 5, and i'd like to add/restate:
- labelling a request precisely according to its type (xhr vs ping vs beacon) is an improvement on an existing API and in itself does not break app functionality or force bad user experience. extensions are already able to break app functionality or force bad user experience in as many ways as they like (for example blocking/editing any xhr request, deleting the dom etc.)
- intentionally hiding the fact that a request is a ping or a beacon has at least 2 drawbacks:
1. it forces whoever wants to capture that information anyway to find hacky/ugly/unstable ways to do so, which can backfire and degrade user experience further
2. it makes it look like Chromium intentionally tries to hide sneaky requests that could be used to violate user security/privacy (beacons and pings can be used by malicious web pages as well as legitimate web pages)
as researchers, we love the openness of Chromium, and we hope it stays the browser where users can have control over their data and the pages they visit. this also depends on having a powerful extension API. the burden of not screwing up should be on extension developers, not a diktat from above.
,
May 13 2016
For any competent blocker extension, it is very important to know the context in which a network request was made. Network requests fired by navigator.sendBeacon() have no such context information. For example, if a user installs an extension whose purpose is to block 3rd-party network requests, then for network requests fired without context, an extension developer has two choices:
- Assume no-context network requests are all 3rd-party: this does not make much sense, since it means blocking ALL network requests which have no context (instead of just the beacon ones, as is the case here). This will likely break the normal operation of the browser, not just maybe one feature on maybe one site.
- Assume no-context network requests are all 1st-party: this does not make much sense either, since it means that the extension can no longer do its job properly (for example github.com uses sendBeacon() to send data to google-analytics.com -- see attachment).
Currently there are probably many users who think their extensions protect their privacy on Chromium, while that is not the case because these extensions do not plug the sendBeacon() hole. sendBeacon() is also meant to be used when a page unloads, which means that by the time the related network request is fired, the context no longer exists. So even if the context information were provided to the webRequest API, the context would no longer exist, and we are back to the 1st-/3rd-party dilemma described above. Network requests sent as a result of calling sendBeacon() are NOT comparable to XHR/Image: XHR/Image come with context information, unless they are really fired by the browser itself.
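The 1st-/3rd-party dilemma above can be illustrated with a small TypeScript sketch. `classifyParty` and its `whenUnknown` parameter are hypothetical names; the point is only that a request without context forces the extension to pick a blanket default. The hostname comparison is a deliberate simplification: a real blocker would compare registrable domains using the Public Suffix List.

```typescript
type Party = "first" | "third";

// Classifies a request relative to the page that issued it.
// `contextHost` is undefined when the request carries no context,
// which is the situation described for sendBeacon() requests.
function classifyParty(
  requestHost: string,
  contextHost: string | undefined,
  whenUnknown: Party,
): Party {
  if (contextHost === undefined) {
    // No context: the answer is whatever blanket policy was chosen.
    return whenUnknown;
  }
  // Simplification: same host or a subdomain of it counts as first-party.
  const sameSite =
    requestHost === contextHost || requestHost.endsWith("." + contextHost);
  return sameSite ? "first" : "third";
}
```

With context, a request from github.com to www.google-analytics.com is classified as third-party; without context, the same request is classified purely by the chosen default, which is exactly the problem being described.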
,
May 13 2016
Thanks everyone for the feedback. There are a number of different issues here, so I'll try my best to untangle them...

1) Providing a "disable ~ping/sendBeacon" toggle to the user (say, as part of a content blocker) is, at best, misguided, and at its worst actually harmful to the user. As we noted before, there is nothing in ping/sendBeacon that cannot be accomplished via other means. As a result, all you've done is made a promise you cannot keep, or worse, have misled the user into a false sense of security.

2) Realistically, if you want to enforce some set of content filtering (or monitoring) rules with any degree of accuracy, you have to consider all requests, regardless of their type. For the sake of an example, let's say that a request to foo.com/bar ought to be considered as "bad". Would you only block it if the initiator is "ping", knowing that the same request can be trivially sent via XHR/img/whatever? If yes, then you've failed because this is trivial to detect and work around. If not, and you've put smarter signature detection in place (e.g. based on URL, payload, request headers), then ResourceType doesn't add any new information since we've already established that it's an unreliable signal.

3) Indiscriminate blocking of APIs like sendBeacon is actively harmful to the user and the ecosystem. The mere fact that extensions put such rules into place prevents well-intentioned developers from using such APIs to improve user experience - e.g. to record application data without blocking navigations; to allow the browser to coalesce requests of this type to improve energy efficiency, and so on. In effect, if the goal of such extensions and research is to improve the user experience (performance, privacy), then indiscriminate blocking is actively harmful to that goal: it entrenches the status quo (which relies on perf-costly hacks), and it blocks forward progress in the ecosystem for good applications and players that want to improve the experience for their users.

---

In summary... Could knowing the detailed initiator be a useful signal in some cases? Yes, it can be. However, this signal alone is a *bad* signature, due to the reasons outlined above, and we already have examples of popular extensions using it in just such a way. As such, I believe that the negative impact of exposing this information far outweighs the positive use cases that it enables. Further, I believe that if you actively ignore this bit of information, and use a smarter signature which actually examines where the request is going to, and its payload, you'll end up delivering a better service to your users, while simultaneously allowing developers to make use of these new APIs to deliver better experiences to their users. In short, everyone wins.

---

Moving forward: Rob, Devlin: I believe "xmlhttprequest" is the right classification for these requests; "other" falls into the exact same trap as above.
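As a rough illustration of the "smarter signature" argued for in point 2, the following TypeScript sketch decides purely from the request URL and never looks at the resource type. `BlockRule` and `shouldBlock` are hypothetical names, and the rule format (host plus path prefix) is an assumption made for the example; real filter lists are considerably richer.

```typescript
// Hypothetical rule format: block a host (and its subdomains) under a path prefix.
interface BlockRule {
  host: string;
  pathPrefix: string;
}

// Decides from the destination alone, so the verdict is the same whether the
// request was issued via sendBeacon, XHR, an image, or anything else.
function shouldBlock(url: string, rules: BlockRule[]): boolean {
  const { hostname, pathname } = new URL(url);
  return rules.some(
    (rule) =>
      (hostname === rule.host || hostname.endsWith("." + rule.host)) &&
      pathname.startsWith(rule.pathPrefix),
  );
}
```

Using the example from the comment, a rule `{ host: "foo.com", pathPrefix: "/bar" }` matches a request to foo.com/bar regardless of which API initiated it.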
,
May 13 2016
I suspect some of the conflict here is that xmlhttprequest doesn't accurately reflect what ping/beacon actually are. Alternate proposal: can we stop reporting both xmlhttprequest and ping types and lump *both* of those into the other category? Extensions making the user experience better trump extensions that are used for measurement in my opinion. Corollary is that extensions making the user experience worse for the sake of better measurement is the wrong tradeoff.
,
May 13 2016
> Alternate proposal: can we stop reporting both xmlhttprequest and ping types and lump *both* of those into the other category? This would break any extension that currently looks for xmlhttprequest type. > Extensions making the user experience better trump extensions that are used for measurement in my opinion. Corollary is that extensions making the user experience worse for the sake of better measurement is the wrong tradeoff. Agreed - but that's not really the issue here. There is no possible API that could not be used to make the user experience worse in some way. Extensions could just as easily block all xhrs, or all requests of any type (and you're left with quite a useless browser) - but I disagree that that's a reason to remove every API. :)
,
May 13 2016
I'd love to see some numbers and analysis for how "xmlhttprequest" is used in practice. That said, my hunch is that it's not subject to the same problem as "ping" because extension authors recognize that indiscriminate blocking of all such requests is not a productive outcome for their users and hence they use smarter signatures to trigger their filter or monitoring logic... just as they should (but, critically, don't) for "ping". FWIW, the reason I keep suggesting "xmlhttprequest" is because it's actually highly likely that future versions of ~sendBeacon are simply a request flag (maybe several, e.g. delay tolerant [1]) on fetch(). [1] https://github.com/whatwg/fetch/issues/184
,
May 13 2016
> Rob, Devlin: I believe "xmlhttprequest" is the right classification for these requests, "other" falls into the exact same trap as above. "other" contains many other request types, so being in "other" does not mean that extensions are more likely to block them than in "xmlhttprequest". > Alternate proposal: can we stop reporting both xmlhttprequest and ping types and lump *both* of those into the other category? As Devlin said, this would break extensions... I understand that you're not happy with sendBeacon + "ping", but backwards-compatibility is important and breaking it should not be done light-heartedly. > I'd love to see some numbers and analysis for how "xmlhttprequest" is used in practice. That said, my hunch is that it's not subject to the same problem as "ping" because extension authors recognize that indiscriminate blocking of all such requests is not a productive outcome for their users and hence they use smarter signatures to trigger their filter or monitoring logic... just as they should (but, critically, don't) for "ping". <a ping> by design serves no purpose other than tracking link clicks (even if JavaScript is blocked). So extensions that promote privacy do certainly have a point in blocking <a ping>. > FWIW, the reason I keep suggesting "xmlhttprequest" is because it's actually highly likely that future versions of ~sendBeacon are simply a request flag (maybe several, e.g. delay tolerant [1]) on fetch(). Thanks for this elaboration, this puts your statements in context. I'm still not convinced that sendBeacon should be classified as "xmlhttprequest", because beacons are always fire-and-forget (and persist after tab closure), and XHR is not. Even before I offered evidence that "ping" requests are blocked, you already assumed that offering the ability to discern beacon requests is harmful. This suggests that you think that extensions should have a reason to treat beacon request differently. 
That thought -from a spec editor no less- may be an indication that sendBeacon may be used for things that aren't always matching the desires of users (why would they block beacons otherwise?). If beacons are really that despised, then extensions can block them even without support from the webRequest API, by injecting a content script that removes the navigator.sendBeacon API. This is worse for performance than blocking through the webRequest API. See for example the WebRTC API. This is an awesome API, but some users don't want it and use extensions that use several hacks to try and disable the API. There is no reason to not believe that this could happen with sendBeacon. It is obvious that there is a non-zero privacy risk in offering websites the ability to continue sending requests even when all associated pages and scripts are terminated (why is this not mentioned in the Privacy and Security considerations in the spec?), so it's not that weird that there are extensions (read: users) who try to block them. --- My stance is that it's not a problem that extensions can discern beacon requests, and that moving sendBeacon to "other" satisfies the spirit of this feature request. A weaker alternative is that beacons are classified as a separate resourceType "beacon". It's weaker because extension authors can still block it if they believe that sendBeacon is undesirable. It's better than the current situation because <a ping> is clearly only for tracking links, while sendBeacon can also be used for non-tracking purposes.
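The content-script approach mentioned in the comment above could look roughly like the following TypeScript sketch. `disableSendBeacon` is a hypothetical helper, not an extension API. It shadows the method with an own property instead of deleting it from the prototype, which is one of several possible ways to do this. In a real extension the code would also have to run in the page's own JavaScript context, since content scripts run in an isolated world; that injection step is omitted here.

```typescript
// Hides sendBeacon on a navigator-like object by shadowing it with an
// own property whose value is undefined. Pages that feature-detect with
// `typeof navigator.sendBeacon === "function"` would then take their
// fallback path (for example, a synchronous XHR).
function disableSendBeacon(nav: object): void {
  Object.defineProperty(nav, "sendBeacon", {
    value: undefined,
    configurable: true,
    writable: true,
  });
}
```

Note that this does not stop the page from sending the same data another way, which is the performance concern raised in the thread.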
,
May 13 2016
What do image requests issued from inside the unload handler get categorized as?
,
May 14 2016
> "other" contains many other request types, so being in "other" does not mean that extensions are more likely to block them than in "xmlhttprequest". I suspect that, exactly as you pointed out in #4, they'll simply shift from "ping" to the approach you described. Whereas, with "xmlhttprequest" they'll have to use smarter signatures which are based on actual content of the request, not which API is initiating it. > Thanks for this elaboration, this puts your statements in context. I'm still not convinced that sendBeacon should be classified as "xmlhttprequest", because beacons are always fire-and-forget (and persist after tab closure), and XHR is not. Right, not always.. but "always" is not the point. The point is that sendBeacon can be trivially polyfilled via XHR -- you won't get the performance benefits, but you'll get the same functionality. > Even before I offered evidence that "ping" requests are blocked, you already assumed that offering the ability to discern beacon requests is harmful. I was aware of this. > This suggests that you think that extensions should have a reason to treat beacon request differently. That thought -from a spec editor no less- may be an indication that sendBeacon may be used for things that aren't always matching the desires of users (why would they block beacons otherwise?). I don't follow this at all. My point about indiscriminate blocking is precisely the opposite. > If beacons are really that despised, then extensions can block them even without support from the webRequest API, by injecting a content script that removes the navigator.sendBeacon API. I think you're wrongly conflating sendBeacon with a very narrow collection of use cases. And yes, you're right, technically you don't need webRequest API to implement request blocking.. but that's not an argument for enabling harmful functionality within webRequest. 
> It is obvious that there is a non-zero privacy risk in offering websites the ability to continue sending requests even when all associated pages and scripts are terminated (why is this not mentioned in the Privacy and Security considerations in the spec?), so it's not that weird that there are extensions (read: users) who try to block them. There is non-zero privacy risk with any request. Further, the page does see all sendBeacon requests before it's unloaded, the only difference is that if such request is queued right before the page begins unloading, the request is not automatically terminated. All of this, and more, is covered in the spec: http://w3c.github.io/beacon/#privacy --- > My stance is that it's not a problem that extensions can discern beacon requests, and that moving sendBeacon to "other" satisfies the spirit of this feature request. It does satisfy the spirit but it doesn't address the issues I outlined in #12.. as you outlined yourself in #4 :-)
,
May 14 2016
I would like to point out that because <a ping> is effectively made unreliable, we end up with otherwise unnecessary redirections which hurt the user experience. I wish there was a way to fulfill the needs of privacy-conscious users without hurting other users in the process. Thoughts?
,
May 14 2016
#17 > What do image requests issued from inside the unload handler get categorized as? "ping", with method "GET". #18 > > "other" contains many other request types, so being in "other" does not mean that extensions are more likely to block them than in "xmlhttprequest". > > I suspect that, exactly as you pointed out in #4, they'll simply shift from "ping" to the approach you described. Whereas, with "xmlhttprequest" they'll have to use smarter signatures which are based on actual content of the request, not which API is initiating it. If extension devs want a way to single out beacon requests, they can do that, whether easily (through a specific resource type) or through hacks (by heuristics). The heuristic from comment #4 is imperfect, and blocking all matching requests may have unexpected consequences. E.g. CSP violation reports also match that description. Even with "xmlhttprequest", beacon requests can still be detected (e.g. due to bug 522124 , bug 522129, which are consequences of the requests being detached). > > I'm still not convinced that sendBeacon should be classified as "xmlhttprequest", because beacons are always fire-and-forget (and persist after tab closure), and XHR is not. > > Right, not always.. but "always" is not the point. The point is that sendBeacon can be trivially polyfilled via XHR -- you won't get the performance benefits, but you'll get the same functionality. The thing that I put between braces cannot be polyfilled: XHRs are killed when the document unloads. Whether that's significant is open to debate. > > This suggests that you think that extensions should have a reason to treat beacon request differently. That thought -from a spec editor no less- may be an indication that sendBeacon may be used for things that aren't always matching the desires of users (why would they block beacons otherwise?). > > I don't follow this at all. My point about indiscriminate blocking is precisely the opposite. 
If beacons are equivalent to XHR, then this concern (indiscriminate blocking) wouldn't exist since no sane extension would break legitimate features of websites for no reason (how many websites would still work if AJAX was gone?). > > ... risk in offering websites the ability to continue sending requests even when all associated pages and scripts are terminated .. > > There is non-zero privacy risk with any request. Further, the page does see all sendBeacon requests before it's unloaded, the only difference is that if such request is queued right before the page begins unloading, the request is not automatically terminated. All of this, and more, is covered in the spec: http://w3c.github.io/beacon/#privacy I'm referring to this part of the spec: "The user agent may delay transmission of provided data ... wait until network interface is active". Depending on the duration of the delay, a server may obtain information that is not offered through other APIs (long) after the user has left a website (e.g. the user's IP address).
,
May 14 2016
> The thing that I put between braces cannot be polyfilled: XHRs are killed when the document unloads. Whether that's significant is open to debate. Right, that's the performance aspect I was referring to. Believe it or not, many popular libraries block the onclick and wait until the XHR comes back; worse, some of them literally spin in a busy loop while doing so. Those are the behaviors we want to fix. > If beacons are equivalent to XHR, then this concern (indiscriminate blocking) wouldn't exist since no sane extension would break legitimate features of websites for no reason (how many websites would still work if AJAX was gone?). "Sane extensions" do just such a thing - e.g. uBlock. As a side effect, this form of blocking breaks existing sites that attempt to leverage sendBeacon for use cases that uBlock has no business blocking; it discourages adoption of this API due to the risk of never seeing their requests -- see #12, where I already covered all this, and more. Taking a step back: do you accept that sendBeacon can be used for cases that content blockers should allow through? If not, then we're talking past each other. If you do, then we agree that indiscriminate blocking is bad. > I'm referring to this part of the spec: "The user agent may delay transmission of provided data ... wait until network interface is active". Depending on the duration of the delay, a server may obtain information that is not offered through other APIs (long) after the user has left a website (e.g. the user's IP address). I think you're looking at an old version of the spec. Relevant bits from the latest spec: > "The user agent may delay transmission of provided data to optimize network and energy efficiency - e.g. deliver immediately if the network is active, or wait until network interface is active. However, the user agent should not delay transmission indefinitely and ensure that pending transmissions are periodically flushed even if there is no other network activity." > ... 
> "Similarly, from the privacy perspective, the resulting requests are initiated immediately when the API is called, or upon a page visibility change, which restricts the exposed information (e.g. user's IP address) to existing lifecycle events accessible to the developers." See here: http://w3c.github.io/beacon/. If you have suggestions for better wording, or if you think we did not cover all the aspects, please feel free to open an issue on the spec. --- > If extension devs want a way to single out beacon requests, they can do that, whether easily (through a specific resource type) or through hacks (by heuristics). The heuristic from comment #4 is imperfect, and blocking all matching requests may have unexpected consequences. E.g. CSP violation reports also match that description. Even with "xmlhttprequest", beacon requests can still be detected (e.g. due to bug 522124 , bug 522129, which are consequences of the requests being detached). Yep, good points, and thanks for taking on those bugs! I do think that "xmlhttprequest" is still a better choice because it runs fewer risks of bad behaviors we outlined above - e.g. blocking CSP reports does not functionally break the app, whereas blocking sendBeacon can.
,
May 14 2016
> it discourages adoption of this API due to the risk of never seeing their requests (...)
> do you accept that sendBeacon can be used for cases that content blockers should allow through?
Yes, this is without doubt.
> See here: http://w3c.github.io/beacon/. If you have suggestions for better wording, or if you think we did not cover all the aspects, please feel free to open an issue on the spec.
Here you go: https://github.com/w3c/beacon/issues/31
> I do think that "xmlhttprequest" is still a better choice because it runs fewer risks of bad behaviors we outlined above - e.g. blocking CSP reports does not functionally break the app, whereas blocking sendBeacon can.
Extensions shouldn't block any of these. CSP reports indicate potential security issues or a misconfigured CSP on websites, so blocking them could be as bad as blocking important data in sendBeacon. CSP reports were the first example that I found with a matching signature; there are probably other requests in the "other" category. Extensions should not block requests if they don't understand what they're blocking. Side note: Chrome is not the only browser out there. Firefox offers detailed request details (https://bugzil.la/1209983), so if Firefox add-ons also apply a blanket block on beacons, then you're still not going to achieve the desired level of adoption. Rather than trying to withhold APIs from extensions, it may be much more fruitful to try and educate extension developers that sendBeacon is not bad (and if they have reasonable arguments against the API, such as the ones I tried to find in my last few comments, then you should address them), so that the need/desire for blocking beacons dissipates.
,
May 14 2016
> Here you go: https://github.com/w3c/beacon/issues/31 Ack. Followed up on GH. > Rather than trying to withhold APIs from extensions, it may be much more fruitful to try and educate extension developers that sendBeacon is not bad. We have a bad API that's encouraging bad and harmful implementation patterns: that's a problem with the API, not the developers. We should fix it in Chrome and we can have separate conversations with other browsers. --- Let's reach some agreement on the label for sendBeacon. The two options are 'other' and 'xmlhttprequest'. What are your objections to 'xmlhttprequest'? To me, the sendBeacon request is equivalent to an XHR because the use cases that it seeks to replace *are* most frequently currently powered by XHR. The fire-on-unload is only one example. Another is apps that want to replace their while-the-page-is-active XHRs with sendBeacon such that: they are removed from the critical path; can be scheduled more efficiently; can be coalesced by the browser. Further, as I said earlier, I fully expect to see flags on fetch() that will expose this same type of functionality to any fetch-initiated request.. at which point, they'll end up in the 'xmlhttprequest' bucket.
,
May 15 2016
> We have a bad API that's encouraging bad and harmful implementation patterns: that's a problem with the API, not the developers. I disagree with this view. The webRequest API doesn't force developers to block specific requests or anything. The API provides information, it's up to the extension devs to do something useful with it. You're trying to change the webRequest API in favor of sendBeacon without considering legitimate use cases of users/extensions. If users/extensions have reasons to block sendBeacon (which is not the only use of the webRequest API), then let them be. If their reason is misguided, go educate them. > Let's reach some agreement on the label for sendBeacon. The two options are 'other' and 'xmlhttprequest'. What are your objections to 'xmlhttprequest'? I have no strong objections against "xmlhttprequest". I just mentioned "other" as an alternative because it was the original catch-all type before the change to "ping". My main thing against "xmlhttprequest" was the thought that sendBeacon continues long after load, but it appears that the intention of the specification is that this delay is minimal, so that's not really an issue any more. Another minor thing is that responses to beacon requests are ignored. Note that hiding beacons from webRequest doesn't prevent beacons from being blocked. Let's take uBlock again as an example: - Without webRequest support, uBlock can be used to filter WebSockets requests: https://github.com/gorhill/uBO-WebSocket - Extensions that want to block sendBeacon can do so without webRequest support, as shown in #16. - So there is no point in hiding beacon requests from the webRequest API if your main concern is that beacons can be blocked. Given this fact, what do you have against giving beacons a unique label "beacon"? Even if fetch gets some beacon-like performance improvements, fetch won't continue requests after the page is unloaded.
,
May 16 2016
> We have a bad API that's encouraging bad and harmful implementation patterns: that's a problem with the API, not the developers. We should fix it in Chrome and we can have separate conversations with other browsers.
The fact that an API can be used to do harm doesn't mean the API is bad, or that it is 'encouraging bad and harmful implementation patterns'. Any API (extension or web) can be used to do Bad Stuff that degrades user experience. To some extent, it is up to a) the developer to do the right thing and b) users to not use software that results in worse experiences.
As Rob has pointed out, there are plenty of ways to block beacon requests even without the webRequest API (another trivial way would be injecting a content script at document start that does 'navigator.sendBeacon = function() {};'). We can't (and, fundamentally, shouldn't) prevent extensions from blocking sendBeacon if they choose to. The fact that some extensions do so is the fault of the extensions, not the API.
,
May 17 2016
> (#24) I have no strong objections against "xmlhttprequest". Great. Glad we're in agreement on the technical bits. > (#25) The fact that an API can be used to do harm doesn't mean the API is bad, or that it is 'encouraging bad and harmful implementation patterns'. Any API (extension or web) can be used to do Bad Stuff that degrades user experience. To some extent, it is up to a) the developer to do the right thing and b) users to not use software that results in worse experiences. I guess we'll have to respectfully disagree on this "soft" issue. In my books, the shape of an API makes a meaningful difference, insofar as it encourages or discourages certain types of mistakes and best practices. A "good API" should lead the developer into a pit of success [1]: "a well-designed system makes it easy to do the right things and annoying (but not impossible) to do the wrong things." In this particular instance, for better or worse, there is a bunch of FUD around sendBeacon (perhaps we should have called it sendBacon :-)), that is resulting in knee-jerk indiscriminate blocking.. without due considerations for use cases that this breaks, the intent behind the API, or understanding of how it's actually implemented. I'm not saying we ought to completely obscure such requests - we can't and that's perfectly fine. However, I do believe that doing so should require a bit more active thinking (e.g. perhaps by reading this very thread), which is what this proposal is all about. [1] http://blog.codinghorror.com/falling-into-the-pit-of-success/ --- Ojan, Elliott, Kenji, Kinuko: curious to hear your thoughts. If I'm in the minority on the above points.. we can close this and move on.
,
May 17 2016
I'm with you Ilya. > (#25) To some extent, it is up to a) the developer to do the right thing and b) users to not use software that results in worse experiences. Except that we have 2 developer stakeholders here. It takes both of them to do the right thing in order to delight their users. We can either leave things as-is and live with the fact that our users are served with a sub-par experience (and a false sense of privacy) or make the change that will incentivize the right outcomes.
,
May 17 2016
Unlike uBlock, Adblock Plus is not blocking all "ping" requests by default. We merely allow filter authors to distinguish those kinds of requests from others, to more accurately match requests to be blocked. It's generally used in combination with a URL pattern. For example, if a website sends malicious ping requests to example.com, while other kinds of requests sent to example.com are legit, a filter author might add a rule like ||example.com$ping. Yes, this might still give an inaccurate sense of security, as for the most part websites could achieve the same with other kinds of requests. However, instead of changing the request type they could also just change the request URL. But the more precise the information we have about a request, the more reliably we can avoid false positives/negatives. And as some mentioned before, removing support for distinguishing "ping" requests from the webRequest API will most likely just result in extensions patching sendBeacon() via content scripts, which is much less efficient.
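To illustrate the `||example.com$ping` rule mentioned above, here is a minimal sketch of how a filter that combines a URL pattern with a request type might be evaluated. The `parseFilter` and `matches` helpers are hypothetical and only handle the `||host$type` shape; this is not Adblock Plus's actual matching code, and real filter syntax has many more options.

```javascript
// Hypothetical, simplified matcher for filters of the form "||host$type".
// Assumption: the part after "||" is treated as a bare hostname.
function parseFilter(text) {
  const [pattern, type = null] = text.split("$");
  if (!pattern.startsWith("||")) {
    throw new Error("only ||host filters are handled in this sketch");
  }
  return { host: pattern.slice(2), type };
}

function matches(filter, request) {
  const host = new URL(request.url).hostname;
  // "||example.com" is taken to cover example.com and its subdomains.
  const hostMatches = host === filter.host || host.endsWith("." + filter.host);
  // A filter without a "$type" option applies to every request type.
  const typeMatches = filter.type === null || filter.type === request.type;
  return hostMatches && typeMatches;
}

const filter = parseFilter("||example.com$ping");
console.log(matches(filter, { url: "https://example.com/log", type: "ping" }));
console.log(matches(filter, { url: "https://example.com/app.js", type: "script" }));
```

With a type option available, the rule can leave other requests to the same host alone; without it, the filter author would have to narrow the URL pattern instead.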
,
May 17 2016
I'm the other researcher who, along with Sergio (#10), asked Rob to implement this change for us. Rather than clog up this discussion by rehashing points that have already been made, I'll just say that I agree with Rob's overall position and most of the points he's put forward. #26: I strongly object to HTTP requests triggered by <a ping> and navigator.sendBeacon() being categorised under the "xmlhttprequest" type. They weren't triggered by a call to the XMLHttpRequest API, and therefore shouldn't be reported as such to extension code. It might be that all three are processed in similar ways in Chromium's code, but this fact is of no consequence to users of the webRequest API, who ought to be given higher-level information about the cause of a request rather than information about the minutiae of the underlying code that processes it. I equally object to them being categorised under the "other" type, on the basis that the distinction can be made at a lower level and would be artificially suppressed (the fact that users of the webRequest API weren't being made aware of this distinction is what motivated us to ask Rob to make the change, since it compromised the quality of the information we were able to record about a browsing session). If categorising both under the "ping" type is unacceptable, I'd prefer for them to be reported under separate types (e.g., "ping" and "beacon").
,
May 17 2016
> (28) Yes, this might still give an inaccurate sense of security, as for the most part websites could achieve the same with other kind of requests. However, instead of changing the request type they could also just change the request URL. But as more precise information about the request we have as more reliably we can avoid false positives/negatives. As Rob outlined above, even with the proposed change, there are other heuristics you can use to build a fingerprint for such requests. The only difference is how much thinking you have to do to build such a thing. Or, said differently: you actually have to think before you simply block all such requests, which is what this thread is all about. > (29) And as some mentioned before, removing support for distinguishing "ping" requests from the webRequest API, will most likely just result in extensions patching sendBeacon() by content scripts which is much less efficient. Right, we can't stop people from doing bad things. We can, however, discourage them from doing so by making it harder to do. From the sounds of it, that's not the approach you're considering, which I'm glad to hear! > (29) I strongly object to HTTP requests triggered by <a ping> and navigator.sendBeacon() being categorised under the "xmlhttprequest" type. Chris, I understand and I'm sympathetic to your points. That said, you have to weigh this against the fact that said API is: encouraging indiscriminate blocking (mostly, due to unnecessary FUD), preventing good developers from making use of sendBeacon, and hurting our ability to move the platform forward in terms of user experience and performance. Also, I don't know what your research is trying to capture, but similar to my earlier points about false sense of security... Hopefully, you're not falling into the same trap and using an intelligent signature. > If categorising both under the "ping" type is unacceptable, I'd prefer for them to be reported under separate types (e.g., "ping" and "beacon"). 
That's a no-op. It doesn't address the core issues at hand.
,
May 17 2016
> (27) and a false sense of privacy How is there a "false sense of privacy"? I provided a real, actual case above: - Install an extension which supposedly blocks connections to google-analytics.com (Ghostery, Disconnect, ABP+EasyPrivacy, AdBlock+EasyPrivacy, etc.) - Go to github.com. - See connections to google-analytics.com being made through the use of navigator.sendBeacon() -- *despite* the extension being configured to block connections to google-analytics.com. Connections to google-analytics.com occur even though a user reasonably assumes they are not occurring -- *that* is a false sense of privacy. See attachment, it shows exactly the issue with navigator.sendBeacon() (I used Ghostery for this example, but this was occurring with all such extensions last time I checked). In practice, selectively blocking network requests triggered as a consequence of using navigator.sendBeacon() does not work very well in the current state of the webRequest API: If a blocker was to whitelist github.com -- i.e. disable itself for the site -- there is no way for that extension to know the origin (tab/frame) of a network request fired from sendBeacon() on that page, and hence it can't competently filter these network requests. The choice left is to not filter anything from navigator.sendBeacon(), or filter everything from navigator.sendBeacon(). For uBO I personally chose the 2nd option because my observation so far is that navigator.sendBeacon() is mostly used for tracking purposes. As with other sorts of issues, I will look into any real, actual cases where blocking network requests from sendBeacon() breaks something, and try to provide the solution which serves users' best interests.
,
May 17 2016
I forgot to attach pic.
,
May 17 2016
rhill@ aren't you confirming my exact point? By the sounds of it, these extensions are relying on type, instead of having a smarter signature.. which, in this case is the URL (and potentially, payload). The initiator doesn't matter here -- img, xhr, sendBeacon, whatever. Re, frameId's: those are separate issues and it sounds like Rob is already looking into addressing them.
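As a rough illustration of a signature keyed on the URL rather than the resource type, consider the sketch below. The `blockedHosts` set and the `tracker.example` host are hypothetical, and `details` only mimics the `url` and `type` fields that a webRequest listener receives.

```javascript
// Sketch: a blocking decision based on the destination, not the initiator.
// Assumption: `blockedHosts` is a hypothetical, extension-defined list.
const blockedHosts = new Set(["tracker.example"]);

function onBeforeRequest(details) {
  const host = new URL(details.url).hostname;
  // details.type ("image", "xmlhttprequest", "ping", ...) is deliberately ignored.
  return { cancel: blockedHosts.has(host) };
}

for (const type of ["image", "xmlhttprequest", "ping"]) {
  console.log(type, onBeforeRequest({ url: "https://tracker.example/c", type }));
}
```

In an extension, a function of this shape would typically be registered as a blocking `chrome.webRequest.onBeforeRequest` listener; the sketch leaves that wiring out so it can run on its own.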
,
May 17 2016
#30: > Chris, I understand and I'm sympathetic to your points. That said, you have to weigh this against the fact that said API is: encouraging indiscriminate blocking (mostly, due to unnecessary FUD), preventing good developers from making use of sendBeacon, and hurting our ability to move the platform forward in terms of user experience and performance. Sorry, but I don't accept that the exposure of a "ping" type in the webRequest API encourages indiscriminate blocking of <a ping> and navigator.sendBeacon() requests any more than the exposure of an "image" type in the webRequest API encourages indiscriminate blocking of images. I also don't see how FUD surrounding the adoption of web standards is alleviated by making less information available to extensions about the use of those standards --- if anything, by doing so, you implicitly substantiate the FUD. > Also, I don't know what your research is trying to capture, but similar to my earlier points about false sense of security... Hopefully, you're not falling into the same trap and using an intelligent signature. There's no need to use an "intelligent signature" --- which, however intelligent you make it, is still prone to error --- if the underlying API knows the ground truth and provides it to you. That's precisely why we asked Rob to make this change for us. Removing it conceals the ground truth, forcing us to make wild guesses and inaccurate assumptions about the data we're being given. I don't see how that helps anyone, be it those whose motives you might happen to agree with (such as ours) or those whose motives you don't (such as the ad-blocking extension developers).
,
May 17 2016
> (33) The initiator doesn't matter here It does matter. If one disables the blocker for a specific site, then sendBeacon()-related network requests on a page from that site should be allowed. Because of the missing tabId information in the webRequest API, it's not possible to selectively filter these network requests. Another example, aside from the whitelisting example above, is the case of filters (EasyList/EasyPrivacy/etc) which should cause some network requests to be blocked, *except* when visiting specific sites: it's not possible to apply these filters to sendBeacon()-related network requests, because there is no information about where they originate. When the tabId/frameId information is available, this will open the door to filter properly. I point out EasyList/EasyPrivacy here but I am pretty sure other extensions such as Ghostery/Disconnect also need the origin to implement their filtering; there are always exceptions required when it comes to blocking.
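The dependency on tabId can be sketched as follows. The `tabPages` map (tab ID to the hostname of the page in that tab) and the `exceptions` set are hypothetical bookkeeping that an extension would have to maintain itself; the only webRequest detail relied on is that `tabId` is -1 when a request is not associated with a tab.

```javascript
// Sketch: deciding whether a per-site exception applies to a request.
// Assumption: `tabPages` and `exceptions` are hypothetical structures
// kept up to date by the extension.
const UNKNOWN_TAB = -1; // webRequest reports -1 when a request has no tab

function exceptionApplies(details, tabPages, exceptions) {
  if (details.tabId === UNKNOWN_TAB || !tabPages.has(details.tabId)) {
    return null; // origin unknown: the exception cannot be evaluated
  }
  return exceptions.has(tabPages.get(details.tabId));
}

const tabPages = new Map([[7, "github.com"], [8, "news.example"]]);
const exceptions = new Set(["github.com"]); // sites the user has whitelisted

console.log(exceptionApplies({ tabId: 7 }, tabPages, exceptions));
console.log(exceptionApplies({ tabId: 8 }, tabPages, exceptions));
console.log(exceptionApplies({ tabId: -1 }, tabPages, exceptions));
```

The `null` branch is the situation described above: with no tab attached to the request, the extension is left choosing between allowing everything and blocking everything.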
,
May 18 2016
> (34) I also don't see how FUD surrounding the adoption of web standards is alleviated by making less information available to extensions about the use of those standards.. It's not FUD surrounding adoption but the use cases and how the feature is implemented -- the discussions in this thread are case in point. That said, you're right, what we're discussing here doesn't "alleviate" the FUD. Instead, it intentionally makes FUD-driven patterns (examples of which we already have out in the wild) slightly harder to implement; it makes you examine the FUD, instead of simply falling into its trap. Nothing more, nothing less. Further, I think we're actually more in agreement than you think... Take a look at my "in summary" section in #12: there *is* value in exposing the detailed initiator (e.g. you find it more convenient, which is perfectly valid), but I do believe that the overall impact of doing so is net negative for the ecosystem due to the other considerations I outlined in that reply. > (35) If one disable the blocker for a specific site, then sendBeacon()-related network requests on a page from that site should be allowed. Because of the missing tabId information in the webRequest API, it's not possible to filter selectively these network requests. Agreed, and we should fix that! That's why we have crbug.com/522124 and crbug.com/522129. > (35) Another example aside the whitelisting example above, is the case of filters (EasyList/EasyPrivacy/etc) which should cause some network requests to be blocked, *except* when visiting specific sites: it's not possible to apply these filters to sendBeacon()-related network requests, because there is no information about where they originate. This is also due to the same tabId bugs above. > When the tabId/frameId information is available, this will open the door to filter properly. 
I point out EasyList/EasyPrivacy here but I am pretty sure other extensions such as Ghostery/Disconnect also need the origin to implement their filtering, there are always exceptions required when it comes to blocking. Right, I'm glad we're getting to the root of the problem here. We're on the same page (err, tabId? :-)) and I'm 100% with you: we should fix the tabId issues. Further, I can see how making the switch I'm proposing before addressing those issues can be problematic for the scenarios you've outlined... We can block landing the change I'm proposing here until we resolve #522124 and #522129. Does that sound reasonable?
,
May 18 2016
#36: > Further, I think we're actually more in agreement than you think... Take a look at my "in summary" section in #12: there *is* value in exposing the detailed initiator (e.g. you find it more convenient, which is perfectly valid), but I do believe that the overall impact of doing so is net negative for the ecosystem due to the other considerations I outlined in that reply. I'm afraid the only thing we agree on is the positive case for exposing the "ping" type in the webRequest API. Your objections in #12 all relate to what you see as potentially harmful use of that API; the solution to this, as Rob indicated in #24, is to educate people who use the API in that way about why this usage is wrong. I have no stake in leveraging the "ping" type to block requests triggered by <a ping> and navigator.sendBeacon(): neither I nor Sergio have any desire to write an ad-blocker, and that wasn't our motivation for asking Rob to implement this change. Our interest lies in recording accurate information about the provenance of HTTP requests made by a browser. Your proposed change would obstruct that effort, which is why we oppose it. People are going to do things you find unpalatable whether you make it easy for them or not. Imposing arbitrary technical restrictions in response to a perceived sociological problem, forcing people whose use-cases you agree with to implement inefficient and inaccurate workarounds in an ultimately futile attempt to hinder people whose use-cases you disagree with, only has the effect of frustrating the former. The latter will continue to work around whatever barriers you impose unless you convince them that they shouldn't do what they're doing.
,
May 18 2016
"Our interest lies in recording accurate information about the provenance of HTTP requests made by a browser." Apologies if I missed a link but where can we learn more about the purpose of your research? What is the value you get out of knowing that a request came from sendBeacon? For one, it doesn't tell you what the request was about (the whole point of this discussion: it's not necessarily a "ping"). It's hard to balance your needs with others and confirm that there is no alternative without further details. "forcing people whose use-cases you agree with to implement inefficient and inaccurate workarounds in an ultimately futile attempt to hinder people whose use-cases you disagree with" The disagreement has nothing to do with the use cases but how they are achieved and impacting users and developers.
,
May 18 2016
I agree with rhill that it is actually a larger problem that requests sent by sendBeacon() don't have a valid tabId and frameId. I understand that this probably is because the request might get sent when the tab doesn't exist anymore. But it should be possible to leave the metadata of the corresponding tab and frame around until any pending beacon requests have been sent. It seems that this is what Mozilla is doing; at least in Adblock Plus for Firefox, where we use the Gecko API, we get that context information for beacon requests. I still think that it's useful to be able to identify ping/beacon requests by their type. However, if we get proper context information (about its tab and frame) it might be less crucial.
,
May 18 2016
#38: > Apologies if I missed a link but where can we learn more about the purpose of your research? The research is work in progress, so there isn't a public web page describing it at the moment. Sergio summarised our intentions in #10. > What is the value you get out of knowing that a request came from sendBeacon? We're interested in discovering which websites use <a ping> and navigator.sendBeacon(), and who they're communicating with. We also asked Rob to look into fixing bugs #522124 and #522129 as well as #512406 to make this possible. > For one, it doesn't tell you what the request was about (the whole point of this discussion: it's not necessarily a "ping"). I would've preferred two separate types ("ping" and perhaps "beacon") to be reported by webRequest, but according to #30 that's out of the question. In light of that, the next best thing is for them both to be categorised as "ping", given their similar intentions.
,
May 18 2016
> (39) I agree with rhill, that it is actually a larger problem that request sent by sendBeacon() don't have a valid tabId and frameId. I understand that this probably is because the request might get sent when the tab doesn't exist anymore. As we already established earlier in this thread, this is not true, see #24 and background discussion in https://github.com/w3c/beacon/issues/31. We should fix the associated tabId bugs in Chrome, at which point you'll have the same behavior as you described in Firefox. > (39) I still think that it's useful to be able to identify ping/beacon requests by their type. However, if we get proper context information (about it's tab and frame) it might be less crucial. Great, glad to hear. I'm adding those bugs as blockers for this one. --- > (40) We're interested in discovering which websites use <a ping> and navigator.sendBeacon(), and who they're communicating with. How are you collecting this data? Do you ask users to install your extension and report back the results? Also, have you considered using other public datasets (e.g. HTTP Archive [1]) to gather this data? [1] https://discuss.httparchive.org/t/quickstart-guide-to-exploring-the-http-archive/682
,
May 20 2016
#41: > Also, have you considered using other public datasets (e.g. HTTP Archive [1]) to gather this data? Without wanting to get drawn into a discussion about our research (which isn't the focus of this issue, nor should it be), no public datasets, including HTTP Archive, contain the information we're looking for.
,
May 26 2016
> (42) Without wanting to get drawn into a discussion about our research (which isn't the focus of this issue, nor should it be), no public datasets, including HTTP Archive, contain the information we're looking for. I see, ok. At the same time, you can obtain this same data by patching sendBeacon() via a content script -- correct? I'm not suggesting this as an endorsed method, but weighing this particular use case (which, presumably has a relatively small population and an expiry date for the experiment) against all the points raised in #12. --- A quick recap of where we are: (1) There was confusion about when sendBeacon requests are allowed to be initiated. We addressed this in #24. (2) We identified a collection of Chrome implementation bugs: 572930 (fixed), 522124 (in progress), 522129. Due to a combination of (1) and (2) some content filtering extensions had no better choice than to rely on ResourceType. However, once issues in (2) are resolved, they'll have tabID's and origin attribution to correctly (and, more importantly, much more accurately!) apply their rules (see #31-33, #35) - yay. As such, my proposed next steps are: a) This issue is blocked until all the blocked-on issues are resolved. b) We make the suggested change to "xmlhttprequest" once (a) is resolved.
,
Jun 11 2016
#43: > I see, ok. At the same time, you can obtain this same data by patching sendBeacon() via a content script -- correct? Instrumenting API functions by patching them in content scripts isn't foolproof: a script running on a page is capable of detecting when functions like navigator.sendBeacon() have been redefined by a content script, and can decide to change their behaviour on that basis (or even attempt to bypass the instrumentation by getting a reference to the original navigator.sendBeacon). For our particular use case, that would severely compromise the accuracy of any data we collect (and is an acknowledged problem in academic web measurement studies). > I'm not suggesting this as an endorsed method, but weighing this particular use case (which, presumably has a relatively small population and an expiry date for the experiment) against all the points raised in #12. This isn't just about us and our use case, though (and it originally had nothing to do with us anyway: the request to assign a more meaningful resource type to beacons, #512406, was initiated by someone else): beacons aren't XMLHttpRequests, and shouldn't be labelled as such. You're proposing that an API lie to its users on the basis that they might do things you personally find objectionable with the information it provides. I don't consider that a justifiable reason to withhold the information at all.
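One common way for page script to notice such a redefinition is to inspect how the function stringifies. The sketch below is illustrative only: `looksNative` and `fakeNavigator` are hypothetical names, and a built-in function stands in for `navigator.sendBeacon` so the example runs outside a browser.

```javascript
// Sketch: how page script might notice that a function was replaced.
// Built-in functions stringify with a "[native code]" body; ordinary
// JavaScript replacements show their own source instead.
function looksNative(fn) {
  const source = Function.prototype.toString.call(fn);
  return /\{\s*\[native code\]\s*\}\s*$/.test(source);
}

// Assumption: Math.max is only a placeholder for the real built-in.
const fakeNavigator = { sendBeacon: Math.max };
console.log(looksNative(fakeNavigator.sendBeacon)); // checks the built-in

// The "patch": an ordinary function installed over the original.
fakeNavigator.sendBeacon = function () { return true; };
console.log(looksNative(fakeNavigator.sendBeacon)); // checks the replacement
```

The check is not reliable in either direction (bound functions, for instance, also stringify as native code), which is part of why instrumentation by patching turns into a cat-and-mouse exercise.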
,
Dec 16 2016
,
Jun 1 2017
sendBeacon is under big refactoring. Hope that we can fix this easily and cleanly once it's done.
,
Jun 1 2017
,
Jun 2 2017
I disagree with this being a bug in the first place. If everything gets lumped together in "fetch" (which maps to the "xmlhttprequest" type in the webRequest API), then the webRequest events should be extended with more information to assist extensions with classifying requests.
,
May 8 2018
Comment 1 by rdevlin....@chromium.org
, May 12 2016