Making sure that snap_page script can serialize document's iframes |
|||||
Issue descriptionchrishtr@ found that telemetry/bin/snap_page may not support serializing the content of iframe currently. We should verify and fix it if that's the case.
,
Sep 7 2017
This is for cross-domain iframes in particular.
,
Sep 7 2017
Web security does not allow script to modify such frames, so the web security model needs to be bypassed and the script injected into each. This is why snap-it has a two-component model of snapping - one to collect info from a particular frame, and another to collect all the frames together into a final output.
,
Sep 7 2017
chrishtr@/pdr@: do you have pointer to how snap-it's Chrome extension package all the serialized form of the top level document & the iframes together? Telemetry does support injecting script in iframe, but I am not sure what's the good way to package the iframes' snapshot together with the host's snapshot.
,
Sep 7 2017
See this file: https://github.com/progers/snap-it/blob/master/popup.js completeProcess is passed a list of (JSON?) representing each frame's content. There are 'holes' in it to represent things like iframes ("frameHoles"). These are pieced into the final output in fillRemainingHolesAndMinimizeStyles. See also this presentation https://docs.google.com/presentation/d/1V9_0cqyANqOcTloJA0UtiIRjqWJ0g11otRo7FypPqis/edit#slide=id.g165452952a_0_37 (google-internal).
,
Sep 7 2017
So there needs to be one place where popup.js runs (can be in the main frame), and a way to inject scripts into all iframes, to serialize them, and then pipe their results back to the main frame's callback.
,
Sep 14 2017
I look into this a bit more. Turn out that we today don't have any good method to list out all the iframe context in Telemetry. The closest we have to this is https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py?rcl=49fbcfa16b5d148a595cc50036d5e8354739f13f&l=60 that enable access to iframes & return the number of iframe contexts. Yet, there is no way to access those iframe without knowing their context id. Once we know it, tab.EvaluvateJavascript(..) can take in a context_id variable which allows inject script into an iframe. +Pavel: do you know if there is devtool API for listing out all the iframe context id?
,
Sep 14 2017
Chris is saying this is for the oopifs. Oopifs are separate targets in the target tree, you can attach to them as you attach to the page and do everything you do to a regular page. Use "Target" domain to list all the oopifs in the page and attach to them for debugging.
,
Sep 14 2017
I wasn't talking about oopifs actually. I was talking about cross-domain iframes within the same render process. Script in the parent of a cross-domain iframe cannot inject script into the child iframe, it needs help from outside of Blink to inject it.
,
Sep 14 2017
Ah, that's easier. In this case, when you do Runtime.enable, you start receiving Runtime.executionContextCreated events with frame ids to correlate those. See https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#event-executionContextCreated. Side note: you might be able to get markup for the frames over DOM domain, you need a document handle, see DOM.getOuterHTML for it.
,
Sep 14 2017
Though Pavel brought a great point about OOPIF. When OOPIF is the default, we would need two different mechanisms for injecting script into cross-domain iframes?
,
Sep 14 2017
Yes, new mechanisms will probably be needed for OOPIF, but what he suggested in comment 8 sounds like it will work.
,
Sep 14 2017
@chrishtr: curious about your use case. We've recently added a bunch of capabilities for automation into the protocol for Puppeteer project, maybe you could use some of those to make your tests more straightforward.
,
Sep 14 2017
The use case is to snapshot exact pixel contents of a webpage into a serialized form (similar to SnappySnippet).
,
Sep 15 2017
I am taking this one.
,
Sep 20 2017
I have WIP CL in https://codereview.chromium.org/3017573002 . This is failing on sites like cnn & wired because sending the serialized doms of the iframes from Telemetry to the tab breaks the devtool connection & crash it. stdout: [18598:57859:0920/141852.530754:ERROR:http_connection.cc(37)] Too large read data is pending: capacity=1048576, max_buffer_size=1048576, read=1048576 I am thinking about trying to run the subdoms combination step in an headless environment. If anyone has better idea to address this, please let me know.
,
Sep 20 2017
A few thoughts in no order: 1. Have telemetry script upload the serialized dom to GCS and then just pass sufficient key info to the JS, and have JS fetch from GCS. Slow and clutters up GCS but it seems like it could work? 2. Log size of data, see how much it's exceeding the http conn limit by, look at increasing http conn limit in some manner for use in exceptional cases like this. 3. Perhaps gzip and base64-encode, then decode/gunzip in JS.
,
Sep 20 2017
Thanks for the suggestion in #17, Walter. I ended up stringify the json blob, then chunk it into strings of size 100k characters max & send them back to the browser. Work fine for a few content heavy sites (CNN, politico, theverge, amazon, ads-blocker.com/testing/).
,
Sep 22 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/f3ede803cda7e7ba7874d05990addc9d8340a3bf commit f3ede803cda7e7ba7874d05990addc9d8340a3bf Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org> Date: Fri Sep 22 05:05:28 2017 Roll src/third_party/catapult/ 703485470..b233ea0e0 (3 commits) https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/703485470a88..b233ea0e0e23 $ git log 703485470..b233ea0e0 --date=short --no-merges --format='%ad %ae %s' 2017-09-21 yukiy Add a comment for 'ensure_diskcache' 2017-09-21 nednguyen Refactor EnableAllContexts to return all activated contexts 2017-09-21 crouleau Allow pages to set custom options. Created with: roll-dep src/third_party/catapult BUG= 763119 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=sullivan@chromium.org Change-Id: I176cd29240e17b18af40bd927d5148517115ad49 Reviewed-on: https://chromium-review.googlesource.com/678119 Reviewed-by: <catapult-deps-roller@chromium.org> Commit-Queue: <catapult-deps-roller@chromium.org> Cr-Commit-Position: refs/heads/master@{#503663} [modify] https://crrev.com/f3ede803cda7e7ba7874d05990addc9d8340a3bf/DEPS
,
Sep 26 2017
I can confirm that the CL enables us capture more content on the page. Before: https://screenshot.googleplex.com/8RtPu3vsjuo.png (Googler only - sorry) After: https://screenshot.googleplex.com/BAThQHpFfXN.png (Googler only - sorry)
,
Sep 26 2017
,
Sep 26 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/15edc187d1318ee8ac8d556b690a680978cc0b57 commit 15edc187d1318ee8ac8d556b690a680978cc0b57 Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org> Date: Tue Sep 26 13:51:53 2017 Roll src/third_party/catapult/ 639e972bf..0b563bed3 (1 commit) https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/639e972bf15c..0b563bed301e $ git log 639e972bf..0b563bed3 --date=short --no-merges --format='%ad %ae %s' 2017-09-26 nednguyen Roll snap-it to the latest version Created with: roll-dep src/third_party/catapult BUG= 763119 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=sullivan@chromium.org Change-Id: I96c44060e96843f8352d87872b0be5aea0602f46 Reviewed-on: https://chromium-review.googlesource.com/684554 Reviewed-by: <catapult-deps-roller@chromium.org> Commit-Queue: <catapult-deps-roller@chromium.org> Cr-Commit-Position: refs/heads/master@{#504354} [modify] https://crrev.com/15edc187d1318ee8ac8d556b690a680978cc0b57/DEPS
,
Sep 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/catapult/+/4022e0e9056de39cfe65e79cf201a3cb62b354a3 commit 4022e0e9056de39cfe65e79cf201a3cb62b354a3 Author: Nghia Nguyen <nednguyen@google.com> Date: Thu Sep 28 23:14:05 2017 Make sure snap_page combined iframe serialized dom The original CL is in https://codereview.chromium.org/3017573002/ and now moved here due to transition of code review server from gerrit to rietveld. This patch also include refactoring the code for transmitting large JSON object to a separate method & unittest it. BUG= chromium:763119 Change-Id: If5c00a45ecd7f91e8cbac60340a4f2124872da45 Reviewed-on: https://chromium-review.googlesource.com/690855 Reviewed-by: Walter Korman <wkorman@chromium.org> Commit-Queue: Ned Nguyen <nednguyen@google.com> [modify] https://crrev.com/4022e0e9056de39cfe65e79cf201a3cb62b354a3/telemetry/telemetry/internal/snap_page_util.py [modify] https://crrev.com/4022e0e9056de39cfe65e79cf201a3cb62b354a3/telemetry/telemetry/internal/snap_page_util_unittest.py
,
Sep 28 2017
,
Sep 29 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9b6e6685559448b0e587b58f840eb5cc1262973d commit 9b6e6685559448b0e587b58f840eb5cc1262973d Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org> Date: Fri Sep 29 04:36:29 2017 Roll src/third_party/catapult/ f8b2cb4ef..4022e0e90 (1 commit) https://chromium.googlesource.com/catapult.git/+log/f8b2cb4efd8c..4022e0e9056d $ git log f8b2cb4ef..4022e0e90 --date=short --no-merges --format='%ad %ae %s' 2017-09-28 nednguyen Make sure snap_page combined iframe serialized dom Created with: roll-dep src/third_party/catapult BUG= 763119 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel TBR=sullivan@chromium.org Change-Id: Ie996c1b46a32dcdcea15c4575ba139aa6e6567e9 Reviewed-on: https://chromium-review.googlesource.com/691159 Reviewed-by: <catapult-deps-roller@chromium.org> Commit-Queue: <catapult-deps-roller@chromium.org> Cr-Commit-Position: refs/heads/master@{#505291} [modify] https://crrev.com/9b6e6685559448b0e587b58f840eb5cc1262973d/DEPS |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by nedngu...@google.com
, Sep 7 2017