chrome.webRequest: resources reported with missing frameIds
Reported by
amiag...@gmail.com,
Apr 30 2018
Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.139 Safari/537.36

Steps to reproduce the problem:
1. Create an extension that listens for all requests using chrome.webRequest.onBeforeRequest.
2. Keep a mapping of tab IDs to frame IDs to frame URLs.
3. Using your mapping, try to associate a frame URL with every request.

What is the expected behavior?
You are able to supply context (document/frame URLs for every request).

What went wrong?
Some requests arrive with a frame ID that did not go through chrome.webRequest.onBeforeRequest (and therefore you do not know about).

Did this work before? N/A
Does this work in other browsers? N/A

Chrome version: 66.0.3359.139 Channel: n/a
OS Version:
Flash Version:

You can see the problem by loading the attached demo extension and watching for MISSING FRAME DATA messages printed in the background page. This problem seems to happen with and without cache disabled in Dev Tools.

This bug is related to https://github.com/EFForg/privacybadger/issues/1997. I am trying to find a workaround for the attribution problem by comparing initiator URLs to frame URLs, but I found that some requests arrive with unknown-to-me frame IDs, which means I can't verify whether I assigned the correct parent document to those resources.
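The bookkeeping described in the steps to reproduce can be sketched roughly as follows. This is not the attached demo extension, only a minimal reconstruction under assumptions: the names `framesByTab`, `recordFrame`, `lookupFrameUrl`, and `onBeforeRequest` are hypothetical, and the listener is registered only when the `chrome.webRequest` API is present (the extension would also need the `webRequest` permission and host permissions).

```javascript
// Hypothetical bookkeeping: tabId -> (frameId -> frame URL).
const framesByTab = new Map();

// Record a frame's URL when the request for the frame document itself is seen.
// For main_frame and sub_frame requests, details.frameId is the ID of the
// frame being loaded, not the ID of the outer frame.
function recordFrame(details) {
  if (details.type !== "main_frame" && details.type !== "sub_frame") {
    return;
  }
  if (details.type === "main_frame" || !framesByTab.has(details.tabId)) {
    // A new top-level document replaces everything known about the tab.
    framesByTab.set(details.tabId, new Map());
  }
  framesByTab.get(details.tabId).set(details.frameId, details.url);
}

// Return the recorded URL of the frame that issued the request, or undefined.
function lookupFrameUrl(details) {
  const frames = framesByTab.get(details.tabId);
  return frames ? frames.get(details.frameId) : undefined;
}

function onBeforeRequest(details) {
  if (details.tabId < 0) {
    return; // request is not associated with a tab
  }
  recordFrame(details);
  if (lookupFrameUrl(details) === undefined) {
    console.log(
      "MISSING FRAME DATA FOR tabId=" + details.tabId +
      " frameId=" + details.frameId,
      details
    );
  }
}

// Register only inside an extension context.
if (typeof chrome !== "undefined" && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(onBeforeRequest, {
    urls: ["<all_urls>"],
  });
}
```

With a map like this, any request whose `frameId` was never seen as a `main_frame` or `sub_frame` request ends up in the "MISSING FRAME DATA" branch, which is the symptom reported here.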
,
May 1 2018
,
May 3 2018
Tested the issue on the reported Chrome version 66.0.3359.139 on Ubuntu 14.04 with the steps below:
1) Launched the reported Chrome version and installed the extension provided in comment#0.
2) On the chrome://extensions page, clicked the background page link for the installed extension.
3) Developer Tools opened, but no data was observed in the Console.
@Reporter: Please find the attached screencast for your reference and let us know if we missed anything in verifying the issue. If possible, could you please provide a screencast of the issue to help us understand it better? Any further input will be most helpful. Thanks!
,
May 3 2018
You have to visit a page where lots of frames get loaded, such as nytimes.com; sorry if that wasn't clear.
,
May 3 2018
Thank you for providing more feedback. Adding the requester to the cc list. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
May 11 2018
,
May 14 2018
Able to reproduce this issue on the reported version 66.0.3359.139 and on the latest canary 68.0.3429.0 using Mac 10.13.3, Windows 10 and Ubuntu 17.10, i.e. observing "MISSING FRAME DATA FOR tabId=16 frameId=0, docUrls=Object, details= ..." in the background page console. This issue has been seen since M-60. Hence considering this issue as Non-Regression and marking as Untriaged. Thanks!
,
May 18 2018
,
Jun 7 2018
Please let me know if I can help with anything.
,
Jun 22 2018
Could somebody from the extensions team who knows the chrome.webRequest API please take a look at this? Privacy Badger is affected more than other extensions by bugs in the webRequest API as Privacy Badger makes blocking decisions based on request attribution (instead of manually composed lists of URL patterns). If Privacy Badger can't correctly attribute a request to the top-level document URL that originated it, Privacy Badger will block and/or allow resources it shouldn't have.
,
Jun 22 2018
Have you taken into account that not all requests will have a frame id by design? Not every web request comes from a render frame. E.g. requests made by the browser and those made on behalf of service workers. Can you describe the issue you are facing in more detail?
,
Jun 22 2018
If we stick to nytimes.com, what I see is requests that belong to (advertising-related) frames reported with valid (>0) frame IDs, but these IDs reference frames that never went through my webRequest listener. You can see this for yourself by loading the demo extension attached to this issue and visiting nytimes.com.
,
Jun 26 2018
So again this seems WAI to me. Consider a page with the following HTML:

<html>
<body>
<iframe srcdoc="<img src='https://www.w3schools.com/html/pic_trulli.jpg'/>"></iframe>
</body>
</html>

If you run your extension on this page, it will print "MISSING FRAME DATA". Basically, it's possible for a frame to generate network requests without there being any request for the frame itself. So I am not sure what the issue is, amiagkov@?
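One way an extension might cope with such frames is to fall back to the parent frame when the requesting frame was never seen. The sketch below is an assumption, not something proposed in this thread: `resolveFrameUrl` and the `frames` map (frameId to URL for a single tab) are hypothetical names, and it relies only on the `frameId` and `parentFrameId` fields that webRequest supplies on the details object (`parentFrameId` is -1 when there is no parent frame).

```javascript
// frames: hypothetical Map of frameId -> frame URL for a single tab,
// filled in from main_frame / sub_frame requests seen earlier.
function resolveFrameUrl(frames, details) {
  // Normal case: the frame document itself went through the listener.
  if (frames.has(details.frameId)) {
    return frames.get(details.frameId);
  }
  // The frame made no request of its own (for example an iframe with
  // srcdoc), so fall back to the parent frame when it is known.
  if (details.parentFrameId >= 0 && frames.has(details.parentFrameId)) {
    return frames.get(details.parentFrameId);
  }
  // Neither the frame nor its parent is known.
  return undefined;
}

// Usage: the image inside the srcdoc iframe from the example page.
const frames = new Map([[0, "https://example.com/"]]);
const imageRequest = {
  frameId: 5,
  parentFrameId: 0,
  type: "image",
  url: "https://www.w3schools.com/html/pic_trulli.jpg",
};
console.log(resolveFrameUrl(frames, imageRequest));
```

The fallback only reaches one level up: if the parent is itself an inline frame that was never recorded, the lookup still returns `undefined`.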
,
Jun 26 2018
I'm just trying to figure out how to accurately assign every request to the precise tab URL it came from. This is harder than it seems; going off of tab ID is not enough as the document may have changed in the meantime. This (correctly attributing requests to top-level documents) is clearly an issue that affects many Chrome extensions. If you visit a resource-rich page and then navigate away from the page while it's still loading, your privacy/ad blocking extension is likely to mis-report resources belonging to the previous site on the site you just navigated to. I could reproduce this problem with Ghostery, uBlock Origin, etc. In Privacy Badger's case, this common problem is not just a visual nit. Since Privacy Badger learns from browsing, incorrect attribution can lead to incorrect blocking decisions.
,
Jun 26 2018
Thanks for pointing out that request frame IDs can point to inline frames. This wasn't obvious to me and may be worth noting in the docs.
,
Jun 26 2018
Can I rely on the initiator property of the request details object? https://developer.chrome.com/extensions/webRequest states: >The origin where the request was initiated. This does not change through redirects. If this is an opaque origin, the string 'null' will be used. Are these always tab (top-level document) URLs? What are "opaque origins"?
,
Jun 26 2018
>> I'm just trying to figure out how to accurately assign every request to the precise tab URL it came from. This is harder than it seems; going off of tab ID is not enough as the document may have changed in the meantime.

And it's also really hard within Chromium currently. The last I looked into it, some refactoring was needed to support this.

>> This (correctly attributing requests to top-level documents) is clearly an issue that affects many Chrome extensions. If you visit a resource-rich page and then navigate away from the page while it's still loading, your privacy/ad blocking extension is likely to mis-report resources belonging to the previous site on the site you just navigated to. I could reproduce this problem with Ghostery, uBlock Origin, etc.

Yeah, I think most such extensions rely on some combination of the webRequest + webNavigation APIs to support their use cases. But agreed, it would be better if we could just send the top-level document URL.

>> Thanks for pointing out that request frame IDs can point to inline frames. This wasn't obvious to me and may be worth noting in the docs.

I think it's implied. The web request API will only notify users of actual network requests.

>> Are these always tab (top-level document) URLs? What are "opaque origins"?

No, this is the origin of the requesting frame. If an iframe to xyz.com makes a request to abc.com, then the initiator for the request would be xyz.com.

>> Can I rely on the initiator property of the request details object?

Not all requests are made by a render frame. So this may be 'null'. There may be other cases I am missing here.

Will close this for now. Feel free to open a feature request for top level document url and we can track it there.
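As a minimal sketch of the initiator comparison discussed above, the check below treats `initiator` as the origin of the requesting frame and refuses to confirm a match when it is absent or the string 'null'. The function name `initiatorMatchesFrame` is hypothetical, and the frame URL is assumed to come from the extension's own bookkeeping.

```javascript
// Return true only when the request's initiator origin can be confirmed
// to equal the origin of the frame URL recorded by the extension.
function initiatorMatchesFrame(initiator, frameUrl) {
  // Absent initiator or an opaque origin ('null'): nothing to compare.
  if (!initiator || initiator === "null") {
    return false;
  }
  let frameOrigin;
  try {
    frameOrigin = new URL(frameUrl).origin;
  } catch (e) {
    return false; // frameUrl is missing or not a parseable URL
  }
  return frameOrigin === initiator;
}

// Usage: an iframe on xyz.com requesting a resource from abc.com
// reports xyz.com as the initiator, not abc.com.
console.log(initiatorMatchesFrame("https://xyz.com", "https://xyz.com/frame.html"));
console.log(initiatorMatchesFrame("null", "https://xyz.com/frame.html"));
```

Because the initiator is the origin of the requesting frame rather than of the top-level document, a match here confirms the frame attribution only; it says nothing by itself about which tab URL the frame belongs to.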
,
Jun 26 2018
>I think it's implied. The web request API will only notify users of actual network requests. Well, yes, but when one looks at documentation and sees frameId on the details object, one might expect to be able to construct a hierarchy of frames using this information, but that's a gotcha, you can't, as frames might be inline in the HTML. I don't mean to argue but based on my experience this is unexpected and not user-friendly. I can see how an extension may want to keep track of frames for other purposes, not just in an attempt to compensate for not being provided tab URLs. But yes, this is getting off track, thank you for your help, I'll open a new issue.
,
Jun 26 2018
Filed issue 856766.
,
Jul 2
The NextAction date has arrived: 2018-07-02
Comment 1 by amiag...@gmail.com
, Apr 30 2018