New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 763119 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocking:
issue 762230



Sign in to add a comment

Making sure that snap_page script can serialize document's iframes

Project Member Reported by nedngu...@google.com, Sep 7 2017

Issue description

chrishtr@ found that telemetry/bin/snap_page may not support serializing the content of iframe currently. 

We should verify and fix it if that's the case.

 
Cc: chrishtr@chromium.org
This is for cross-domain iframes in particular.
Web security does not allow script to modify such frames, so the web security
model needs to be bypassed and the script injected into each. This is why
snap-it has a two-component model of snapping - one to collect info from a particular
frame, and another to collect all the frames together into a final output.
chrishtr@/pdr@: do you have pointer to how snap-it's Chrome extension package all the serialized form of the top level document & the iframes together? 

Telemetry does support injecting script in iframe, but I am not sure what's the good way to package the iframes' snapshot together with the host's snapshot.
See this file:

https://github.com/progers/snap-it/blob/master/popup.js

completeProcess is passed a list of (JSON?) representing each frame's content.
There are 'holes' in it to represent things like iframes ("frameHoles").

These are pieced into the final output in fillRemainingHolesAndMinimizeStyles.

See also this presentation https://docs.google.com/presentation/d/1V9_0cqyANqOcTloJA0UtiIRjqWJ0g11otRo7FypPqis/edit#slide=id.g165452952a_0_37 (google-internal).
So there needs to be one place where popup.js runs (can be in the main frame), and
a way to inject scripts into all iframes, to serialize them, and then pipe their results
back to the main frame's callback.
Cc: pfeldman@chromium.org
Status: Available (was: Untriaged)
I look into this a bit more. Turn out that we today don't have any good method to list out all the iframe context in Telemetry. The closest we have to this is https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_runtime.py?rcl=49fbcfa16b5d148a595cc50036d5e8354739f13f&l=60 that enable access to iframes & return the number of iframe contexts.

Yet, there is no way to access those iframe without knowing their context id. Once we know it, tab.EvaluvateJavascript(..) can take in a context_id variable which allows inject script into an iframe.

+Pavel: do you know if there is devtool API for listing out all the iframe context id?
Chris is saying this is for the oopifs. Oopifs are separate targets in the target tree, you can attach to them as you attach to the page and do everything you do to a regular page. Use "Target" domain to list all the oopifs in the page and attach to them for debugging.
I wasn't talking about oopifs actually. I was talking about cross-domain iframes
within the same render process. Script in the parent of a cross-domain iframe
cannot inject script into the child iframe, it needs help from outside of Blink
to inject it.
Ah, that's easier. In this case, when you do Runtime.enable, you start receiving Runtime.executionContextCreated events with frame ids to correlate those. See https://chromedevtools.github.io/devtools-protocol/tot/Runtime/#event-executionContextCreated.

Side note: you might be able to get markup for the frames over DOM domain, you need a document handle, see DOM.getOuterHTML for it.
Though Pavel brought a great point about OOPIF. When OOPIF is the default, we would need two different mechanisms for injecting script into cross-domain iframes? 
Yes, new mechanisms will probably be needed for OOPIF, but what he suggested in
comment 8 sounds like it will work.
@chrishtr: curious about your use case. We've recently added a bunch of capabilities for automation into the protocol for Puppeteer project, maybe you could use some of those to make your tests more straightforward.
The use case is to snapshot exact pixel contents of a webpage into a serialized
form (similar to SnappySnippet).
Owner: nedngu...@google.com
Status: Assigned (was: Available)
I am taking this one.
I have WIP CL in https://codereview.chromium.org/3017573002 . This is failing on sites like cnn & wired because sending the serialized doms of the iframes from Telemetry to the tab breaks the devtool connection & crash it.

stdout: 
[18598:57859:0920/141852.530754:ERROR:http_connection.cc(37)] Too large read data is pending: capacity=1048576, max_buffer_size=1048576, read=1048576

I am thinking about trying to run the subdoms combination step in an headless environment. If anyone has better idea to address this, please let me know.
A few thoughts in no order:

1. Have telemetry script upload the serialized dom to GCS and then just pass sufficient key info to the JS, and have JS fetch from GCS. Slow and clutters up GCS but it seems like it could work?

2. Log size of data, see how much it's exceeding the http conn limit by, look at increasing http conn limit in some manner for use in exceptional cases like this.

3. Perhaps gzip and base64-encode, then decode/gunzip in JS.
Thanks for the suggestion in #17, Walter. I ended up stringify the json blob, then chunk it into strings of size 100k characters max & send them back to the browser. Work fine for a few content heavy sites (CNN, politico, theverge, amazon, ads-blocker.com/testing/).


Project Member

Comment 19 by bugdroid1@chromium.org, Sep 22 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f3ede803cda7e7ba7874d05990addc9d8340a3bf

commit f3ede803cda7e7ba7874d05990addc9d8340a3bf
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Fri Sep 22 05:05:28 2017

Roll src/third_party/catapult/ 703485470..b233ea0e0 (3 commits)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/703485470a88..b233ea0e0e23

$ git log 703485470..b233ea0e0 --date=short --no-merges --format='%ad %ae %s'
2017-09-21 yukiy Add a comment for 'ensure_diskcache'
2017-09-21 nednguyen Refactor EnableAllContexts to return all activated contexts
2017-09-21 crouleau Allow pages to set custom options.

Created with:
  roll-dep src/third_party/catapult
BUG= 763119 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I176cd29240e17b18af40bd927d5148517115ad49
Reviewed-on: https://chromium-review.googlesource.com/678119
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#503663}
[modify] https://crrev.com/f3ede803cda7e7ba7874d05990addc9d8340a3bf/DEPS

I can confirm that the CL enables us capture more content on the page.

Before: https://screenshot.googleplex.com/8RtPu3vsjuo.png (Googler only - sorry)

After: https://screenshot.googleplex.com/BAThQHpFfXN.png (Googler only - sorry)
Labels: -Pri-3 Pri-2
Project Member

Comment 22 by bugdroid1@chromium.org, Sep 26 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/15edc187d1318ee8ac8d556b690a680978cc0b57

commit 15edc187d1318ee8ac8d556b690a680978cc0b57
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Tue Sep 26 13:51:53 2017

Roll src/third_party/catapult/ 639e972bf..0b563bed3 (1 commit)

https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/639e972bf15c..0b563bed301e

$ git log 639e972bf..0b563bed3 --date=short --no-merges --format='%ad %ae %s'
2017-09-26 nednguyen Roll snap-it to the latest version

Created with:
  roll-dep src/third_party/catapult
BUG= 763119 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: I96c44060e96843f8352d87872b0be5aea0602f46
Reviewed-on: https://chromium-review.googlesource.com/684554
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#504354}
[modify] https://crrev.com/15edc187d1318ee8ac8d556b690a680978cc0b57/DEPS

Project Member

Comment 23 by bugdroid1@chromium.org, Sep 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/catapult/+/4022e0e9056de39cfe65e79cf201a3cb62b354a3

commit 4022e0e9056de39cfe65e79cf201a3cb62b354a3
Author: Nghia Nguyen <nednguyen@google.com>
Date: Thu Sep 28 23:14:05 2017

Make sure snap_page combined iframe serialized dom

The original CL is in https://codereview.chromium.org/3017573002/ and now moved
here due to transition of code review server from gerrit to rietveld.

This patch also include refactoring the code for transmitting large JSON object
to a separate method & unittest it.

BUG= chromium:763119 

Change-Id: If5c00a45ecd7f91e8cbac60340a4f2124872da45
Reviewed-on: https://chromium-review.googlesource.com/690855
Reviewed-by: Walter Korman <wkorman@chromium.org>
Commit-Queue: Ned Nguyen <nednguyen@google.com>

[modify] https://crrev.com/4022e0e9056de39cfe65e79cf201a3cb62b354a3/telemetry/telemetry/internal/snap_page_util.py
[modify] https://crrev.com/4022e0e9056de39cfe65e79cf201a3cb62b354a3/telemetry/telemetry/internal/snap_page_util_unittest.py

Status: Fixed (was: Assigned)
Project Member

Comment 25 by bugdroid1@chromium.org, Sep 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9b6e6685559448b0e587b58f840eb5cc1262973d

commit 9b6e6685559448b0e587b58f840eb5cc1262973d
Author: catapult-deps-roller@chromium.org <catapult-deps-roller@chromium.org>
Date: Fri Sep 29 04:36:29 2017

Roll src/third_party/catapult/ f8b2cb4ef..4022e0e90 (1 commit)

https://chromium.googlesource.com/catapult.git/+log/f8b2cb4efd8c..4022e0e9056d

$ git log f8b2cb4ef..4022e0e90 --date=short --no-merges --format='%ad %ae %s'
2017-09-28 nednguyen Make sure snap_page combined iframe serialized dom

Created with:
  roll-dep src/third_party/catapult
BUG= 763119 


Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls


CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=sullivan@chromium.org

Change-Id: Ie996c1b46a32dcdcea15c4575ba139aa6e6567e9
Reviewed-on: https://chromium-review.googlesource.com/691159
Reviewed-by: <catapult-deps-roller@chromium.org>
Commit-Queue: <catapult-deps-roller@chromium.org>
Cr-Commit-Position: refs/heads/master@{#505291}
[modify] https://crrev.com/9b6e6685559448b0e587b58f840eb5cc1262973d/DEPS

Sign in to add a comment