New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 728081 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Jul 24
Cc:
Components:
EstimatedDays: ----
NextAction: 2018-07-23
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 624697



Sign in to add a comment

browse:news:toi failing while navigating to article

Project Member Reported by perezju@chromium.org, May 31 2017

Issue description

browse:news:toi recently started failing consistently on Android bots.

Traceback (most recent call last):
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 90, in _RunStoryAndProcessErrorIfNeeded
    state.RunStory(results)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/shared_page_state.py", line 296, in RunStory
    self._current_page.Run(self)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/__init__.py", line 112, in Run
    self.RunPageInteractions(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/system_health_story.py", line 138, in RunPageInteractions
    self._DidLoadDocument(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 84, in _DidLoadDocument
    self._ScrollMainPage(action_runner)
  File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 98, in _ScrollMainPage
    action_runner.tab.WaitForDocumentReadyStateToBeComplete()
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 74, in WaitForDocumentReadyStateToBeComplete
    'document.readyState == "complete"', timeout=timeout)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 191, in WaitForJavaScriptCondition
    return self._inspector_backend.WaitForJavaScriptCondition(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
    return func(*args, **kwargs)
  File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 293, in WaitForJavaScriptCondition
    e.message + '\n' + debug_message)
TimeoutException: Timed out while waiting 90s for IsJavaScriptExpressionTrue.
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_Perf%2F119%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_mobile_on_Android%2F0%2Fstdout

Screenshot doesn't show anything out of the ordinary:
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/profiler-file-id_12-2017-05-30_18-41-4847493.png

I'll proceed to disable the story and try to see if a bisect can tell us more.
 
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, May 31 2017


=== BISECT JOB RESULTS ===
Bisect failed for unknown reasons

Please contact the team (see below) and report the error.


Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : system_health.memory_mobile
  Metric       : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi


Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8978085089039045136

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=6411051794432000


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!
Project Member

Comment 5 by bugdroid1@chromium.org, May 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5

commit c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5
Author: Juan A. Navarro Perez <perezju@chromium.org>
Date: Wed May 31 11:53:43 2017

[tools/perf] Disable browse:news:toi story

Story started failing consistently on most Android bots.

Bug:  728081 
Change-Id: If456c63b818ccc4fcb90529f6d636c9d0c79f2ba
Reviewed-on: https://chromium-review.googlesource.com/519383
Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org>
Cr-Commit-Position: refs/heads/master@{#475881}
[modify] https://crrev.com/c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5/tools/perf/page_sets/system_health/browsing_stories.py

Blockedon: 728110
Devices on the N5X bisect bot are hosed (issue 728110).

Will retry on another bot.
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, May 31 2017

Cc: hirosh...@chromium.org
Owner: hirosh...@chromium.org

=== Auto-CCing suspected CL author hiroshige@chromium.org ===

Hi hiroshige@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Test failure found with culprit

Suspected Commit
  Author : hiroshige
  Commit : 67e066b84ec7953ec06ecf252f466c2c2c063f33
  Date   : Tue May 30 15:09:31 2017
  Subject: Reland of remove EventSender from ImageLoader (patchset #1 id:1 of https://codereview.chromium.org/2903773002/ )

Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : system_health.memory_mobile
  Metric       : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi

Revision             Exit Code      N
chromium@475518      0 +- N/A       10      good
chromium@475527      0 +- N/A       10      good
chromium@475528      1 +- N/A       10      bad       <--
chromium@475529      1 +- N/A       10      bad
chromium@475530      1 +- N/A       10      bad
chromium@475532      1 +- N/A       10      bad
chromium@475536      1 +- N/A       10      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8978083953568670000

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=6411051794432000


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!
Project Member

Comment 9 by 42576172...@developer.gserviceaccount.com, May 31 2017


=== BISECT JOB RESULTS ===
Test failure found with culprit

Suspected Commit
  Author : hiroshige
  Commit : 67e066b84ec7953ec06ecf252f466c2c2c063f33
  Date   : Tue May 30 15:09:31 2017
  Subject: Reland of remove EventSender from ImageLoader (patchset #1 id:1 of https://codereview.chromium.org/2903773002/ )

Bisect Details
  Configuration: android_nexus6_perf_bisect
  Benchmark    : system_health.memory_mobile
  Metric       : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi

Revision             Exit Code      N
chromium@475518      0 +- N/A       10      good
chromium@475527      0 +- N/A       10      good
chromium@475528      1 +- N/A       10      bad       <--
chromium@475529      1 +- N/A       10      bad
chromium@475530      1 +- N/A       10      bad
chromium@475532      1 +- N/A       10      bad
chromium@475536      1 +- N/A       10      bad

Please refer to the following doc on diagnosing memory regressions:
  https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8978078946716869616

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=4997151916032000


| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection.  Thank you!
Issue 728234 has been merged into this issue.
Cc: benhenry@chromium.org
Note: This has been duped by a bisect for the following significant memory regressions:

https://chromeperf.appspot.com/group_report?bug_id=728081

hiroshige, unless you are already working on a fix for the issues introduced by your CL, we'll have to revert soon.

+benhenry FYI
Blocking: 624697
Cc: bashi@chromium.org
As for memory regressions:

My CL seems to cause both regressions and improvements in system_health.memory_mobile.
Improvements in the same revision range:
https://chromeperf.appspot.com/group_report?rev=1496265664

Slight timing changes of document load events tend to cause both regressions and improvements on memory benchmarks (e.g.  Issue 663823 ), and I suspect the test is sensitive to a slight timing change that is not a real memory regression (depending the order of document load, GC, measurements, etc.?), but I haven't figured out why.

+bashi@, what do you think?


As for browse:news:toi, document seems to never become complete, so there might be a functional regression in document/image load event handling.

I'll handle the browse:news:toi regression, independently from the memory regression, because the root causes of them look different.
Cc: jasontiller@chromium.org
 Issue 728307  has been merged into this issue.
Blockedon: -728110
Issue 728232 has been merged into this issue.
I filed issue 729008 to focus on the memory regressions and, as suggested, let's keep this about the probable functional regression in document/image load event handling.
Issue 728306 has been merged into this issue.
Cc: perezju@chromium.org
 Issue 729764  has been merged into this issue.
Cc: miu@google.com
 Issue 729241  has been merged into this issue.
Any updates?
Status: Started (was: Assigned)
Started investigation.

Reproduced locally on Linux with command line:
./tools/perf/run_benchmark --also-run-disabled-tests --browser=release --story-filter=browse:news:toi system_health.memory_mobile

According to DevTools, the browser repeatedly requests http://static.toiimg.com/photo/24487812.cms and gets 404 (from disk cache), which prevents the Document load completion and causing timeout.

Components: Blink>Loader
Labels: Performance-Memory
<img> with onerror: "this.src='http://static.toiimg.com/photo/24487812.cms'" is created by the following JavaScript:
http://m.timesofindia.com/merg_js.cms?minify=1&donotshowurl=1&template=ipl_livescorewidget_js,navigation_test_js_new2,storage_js,utils_js_new2,breaking_news_new2_js,search_js_new2,imglazyload_js,gatrack_js,geo_js_new2,home_js,toi_slikevp_js,jquery_mobile_events_min_js,login_js_new2,newsletter_js,mitr_trending&version=492&v=492&msid=home&tn=default

Therefore, the <img> causes repeated requests to 24487812.cms (request -> 404 response -> onerror handler -> request -> ...).

After my CL, this delays document load event forever, because ImageLoader delays the document load event from setting this.src until onerror handler.
Before my CL, the repeated requests still occur, but don't delay the load event forever, because onerror handler can't delay the load event when it is called from Document::ImplicitClose().

Test case:
<body onload="alert('onload')"><img onerror="this.src='fox'" src="fox"></body>

Is onload called?
Chromium (before r475528): Yes
Chromium (on or after r475528): No
Firefox (53.0): No

So I think:
1. r475528 does cause behavior change, but this change is probably OK, because it aligns with Firefox and with the spec (the things that occur in img's load/error events seem to further delay the document load event, according to the spec; I'm not so sure though).
2. the browse:news:toi test anyway contains repeated requests to 24487812.cms, which looks inappropriate for test stability and performance. Does the current live website still behave in the same way? If No, then probably we can fix the test by just updating the archived data for the test.

Cc: nedngu...@google.com ssid@chromium.org
Owner: ----
Status: Available (was: Started)
Thanks for the investigation!

+nednguyen, +ssid could either of you look into re-recording the story?
Owner: ashleymarie@chromium.org
Status: Assigned (was: Available)
Ashley: can you do the re-recording of this story?
(I gonna show you how in person)
Definitely! Let's walk through how to do it tomorrow morning

Comment 27 by m...@chromium.org, Jun 19 2017

Cc: -miu@google.com
Project Member

Comment 29 by bugdroid1@chromium.org, Jun 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8a0a93fdbca72e4455bc3dd876e28bf88473433b

commit 8a0a93fdbca72e4455bc3dd876e28bf88473433b
Author: ashleymarie <ashleymarie@chromium.org>
Date: Wed Jun 28 13:19:26 2017

Revert of Rerecorded browse:news:toi (patchset #2 id:20001 of https://codereview.chromium.org/2959743003/ )

Reason for revert:
Looks like this recording didn't fix the bug
It's running and failing here: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5X%20Perf/builds/227

Original issue's description:
> Rerecorded browse:news:toi
>
> BUG= chromium:728081 
>
> Review-Url: https://codereview.chromium.org/2959743003
> Cr-Commit-Position: refs/heads/master@{#482663}
> Committed: https://chromium.googlesource.com/chromium/src/+/7c12598ed00f30cc125b9af98a7826edcbf47a39

TBR=nednguyen@google.com
# Skipping CQ checks because original CL landed less than 1 days ago.
NOPRESUBMIT=true
NOTREECHECKS=true
NOTRY=true
BUG= chromium:728081 

Review-Url: https://codereview.chromium.org/2961013003
Cr-Commit-Position: refs/heads/master@{#482963}

[modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/benchmarks/system_health_smoke_test.py
[modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/page_sets/data/system_health_mobile.json
[delete] https://crrev.com/fe7f0f48f41642e5c0427e6384db0b00f9dd50af/tools/perf/page_sets/data/system_health_mobile_064.wpr.sha1
[modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/page_sets/system_health/expectations.py

Cc: altimin@chromium.org
 Issue 737239  has been merged into this issue.
Hmm.

> #22
> the browser repeatedly requests http://static.toiimg.com/photo/24487812.cms and gets 404 (from disk cache)

Loading of https://static.toiimg.com/photo/24487812.cms
is triggered by onerror of <img> for
https://static.toiimg.com/thumb/59269412.cms?resizemode=1&width=180&height=135

I found both of the URLs exist on the live website (HTTP 200 with valid image responses), so perhaps the recorder missed those URLs.
How the recorder works? Does it depend on behavior of Chromium of a specific version and/or on a specific device?

Project Member

Comment 32 by bugdroid1@chromium.org, Nov 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e454e3d6f7c0c5dc24a7ba924a47da339297a7f8

commit e454e3d6f7c0c5dc24a7ba924a47da339297a7f8
Author: Mythri Alle <mythria@chromium.org>
Date: Mon Nov 06 11:38:04 2017

Disable toi on v8.runtimestats.browsing_mobile

This was already disabled on v8.browsing_mobile, so I think
it should be disabled on v8.runtimestats as well.

Also updated the bug id for cnn as well.

Bug:  chromium:728081 
Change-Id: Ifc152a4a5db870b78ec8d0f9709decafabf11d6a
Reviewed-on: https://chromium-review.googlesource.com/753385
Reviewed-by: Annie Sullivan <sullivan@chromium.org>
Commit-Queue: Mythri Alle <mythria@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514110}
[modify] https://crrev.com/e454e3d6f7c0c5dc24a7ba924a47da339297a7f8/tools/perf/benchmarks/v8_browsing.py

NextAction: 2018-07-23
Owner: xunji...@chromium.org
Helen, it looks like this is a case where we'd expect WPR to record a resource but, for whatever reason, it's not doing so. Would you be willing to look into the problem?
I think it's maybe a lot easier to just rerecord the page
Re#34: Ned, ashley re-record the page in https://codereview.chromium.org/2959743003. That didn't fix the issue. Why do you say it's easier to just rerecord?
Oh, this is this an old bug. I can rerecord using wprgo.

Project Member

Comment 37 by bugdroid1@chromium.org, Jul 19

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9c4d6685c70ac682be20c91ab5e41baddcb82148

commit 9c4d6685c70ac682be20c91ab5e41baddcb82148
Author: Helen Li <xunjieli@chromium.org>
Date: Thu Jul 19 23:42:16 2018

Re-enable system_health.memory_mobile/browse:news:toi with a new recording

Re-recorded browse:news:toi using:
./tools/perf/record_wpr --also-run-disabled-tests --story-filter=browse:news:toi system_health_memory_mobile

Confirmed the benchmark passes using:
./tools/perf/run_benchmark --also-run-disabled-tests --story-filter=browse:news:toi system_health.memory_mobile

Uploaded the recording using:
upload_to_google_storage.py --bucket chrome-partner-telemetry \
    system_health_mobile_064.wprgo

Bug:  728081 
Change-Id: Ie0660440cf679358133d0a82ec5be4034b4e93c1
Reviewed-on: https://chromium-review.googlesource.com/1144220
Reviewed-by: Annie Sullivan <sullivan@chromium.org>
Commit-Queue: Helen Li <xunjieli@chromium.org>
Cr-Commit-Position: refs/heads/master@{#576693}
[modify] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/benchmarks/system_health_smoke_test.py
[modify] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/page_sets/data/system_health_mobile.json
[add] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/page_sets/data/system_health_mobile_064.wprgo.sha1

Status: Fixed (was: Assigned)
The NextAction date has arrived: 2018-07-23
Status: Started (was: Fixed)
Helen, we also need to reenable the story & see if it passes to declare fix. I will reenable the story
Project Member

Comment 41 by bugdroid1@chromium.org, Jul 23

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1

commit cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1
Author: Ned Nguyen <nednguyen@google.com>
Date: Mon Jul 23 16:33:29 2018

Reenable browse:news:toi on Mobile

TBR=xunjieli@chromium.org

Bug:728081
Change-Id: I18ac62be4f68fb8501ab6c7064fc273b4ed58ff4

NOTRY=true # webkit_layout_tests flake  crbug.com/866520 

Change-Id: I18ac62be4f68fb8501ab6c7064fc273b4ed58ff4
Reviewed-on: https://chromium-review.googlesource.com/1146495
Commit-Queue: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Helen Li <xunjieli@chromium.org>
Cr-Commit-Position: refs/heads/master@{#577185}
[modify] https://crrev.com/cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1/tools/perf/expectations.config

Status: Fixed (was: Started)

Sign in to add a comment