Issue metadata
Sign in to add a comment
|
browse:news:toi failing while navigating to article |
||||||||||||||||||||||
Issue description
browse:news:toi recently started failing consistently on Android bots.
Traceback (most recent call last):
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/story_runner.py", line 90, in _RunStoryAndProcessErrorIfNeeded
state.RunStory(results)
File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
return func(*args, **kwargs)
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/shared_page_state.py", line 296, in RunStory
self._current_page.Run(self)
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/page/__init__.py", line 112, in Run
self.RunPageInteractions(action_runner)
File "/b/swarming/w/ir/tools/perf/page_sets/system_health/system_health_story.py", line 138, in RunPageInteractions
self._DidLoadDocument(action_runner)
File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 84, in _DidLoadDocument
self._ScrollMainPage(action_runner)
File "/b/swarming/w/ir/tools/perf/page_sets/system_health/browsing_stories.py", line 98, in _ScrollMainPage
action_runner.tab.WaitForDocumentReadyStateToBeComplete()
File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
return func(*args, **kwargs)
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 74, in WaitForDocumentReadyStateToBeComplete
'document.readyState == "complete"', timeout=timeout)
File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
return func(*args, **kwargs)
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/browser/web_contents.py", line 191, in WaitForJavaScriptCondition
return self._inspector_backend.WaitForJavaScriptCondition(*args, **kwargs)
File "/b/swarming/w/ir/third_party/catapult/common/py_trace_event/py_trace_event/trace_event_impl/decorators.py", line 75, in traced_function
return func(*args, **kwargs)
File "/b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py", line 293, in WaitForJavaScriptCondition
e.message + '\n' + debug_message)
TimeoutException: Timed out while waiting 90s for IsJavaScriptExpressionTrue.
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5X_Perf%2F119%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_mobile_on_Android%2F0%2Fstdout
Screenshot doesn't show anything out of the ordinary:
https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/profiler-file-id_12-2017-05-30_18-41-4847493.png
I'll proceed to disable the story and try to see if a bisect can tell us more.
,
May 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8978085089039045136
,
May 31 2017
=== BISECT JOB RESULTS === Bisect failed for unknown reasons Please contact the team (see below) and report the error. Bisect Details Configuration: android_nexus5X_perf_bisect Benchmark : system_health.memory_mobile Metric : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile Debug Info https://chromeperf.appspot.com/buildbucket_job_status/8978085089039045136 Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=6411051794432000 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Speed>Bisection. Thank you!
,
May 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8978083953568670000
,
May 31 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5 commit c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5 Author: Juan A. Navarro Perez <perezju@chromium.org> Date: Wed May 31 11:53:43 2017 [tools/perf] Disable browse:news:toi story Story started failing consistently on most Android bots. Bug: 728081 Change-Id: If456c63b818ccc4fcb90529f6d636c9d0c79f2ba Reviewed-on: https://chromium-review.googlesource.com/519383 Reviewed-by: Juan Antonio Navarro Pérez <perezju@chromium.org> Reviewed-by: Ned Nguyen <nednguyen@google.com> Commit-Queue: Juan Antonio Navarro Pérez <perezju@chromium.org> Cr-Commit-Position: refs/heads/master@{#475881} [modify] https://crrev.com/c8dbcf2d2bc2ea9ca225f24d78b39df7e5801fd5/tools/perf/page_sets/system_health/browsing_stories.py
,
May 31 2017
Devices on the N5X bisect bot are hosed (issue 728110). Will retry on another bot.
,
May 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8978078946716869616
,
May 31 2017
=== Auto-CCing suspected CL author hiroshige@chromium.org === Hi hiroshige@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Test failure found with culprit Suspected Commit Author : hiroshige Commit : 67e066b84ec7953ec06ecf252f466c2c2c063f33 Date : Tue May 30 15:09:31 2017 Subject: Reland of remove EventSender from ImageLoader (patchset #1 id:1 of https://codereview.chromium.org/2903773002/ ) Bisect Details Configuration: android_nexus5X_perf_bisect Benchmark : system_health.memory_mobile Metric : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi Revision Exit Code N chromium@475518 0 +- N/A 10 good chromium@475527 0 +- N/A 10 good chromium@475528 1 +- N/A 10 bad <-- chromium@475529 1 +- N/A 10 bad chromium@475530 1 +- N/A 10 bad chromium@475532 1 +- N/A 10 bad chromium@475536 1 +- N/A 10 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile Debug Info https://chromeperf.appspot.com/buildbucket_job_status/8978083953568670000 Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=6411051794432000 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Speed>Bisection. Thank you!
,
May 31 2017
=== BISECT JOB RESULTS === Test failure found with culprit Suspected Commit Author : hiroshige Commit : 67e066b84ec7953ec06ecf252f466c2c2c063f33 Date : Tue May 30 15:09:31 2017 Subject: Reland of remove EventSender from ImageLoader (patchset #1 id:1 of https://codereview.chromium.org/2903773002/ ) Bisect Details Configuration: android_nexus6_perf_bisect Benchmark : system_health.memory_mobile Metric : memory:chrome:gpu_process:reported_by_chrome:effective_size_avg/browse_news/browse_news_toi Revision Exit Code N chromium@475518 0 +- N/A 10 good chromium@475527 0 +- N/A 10 good chromium@475528 1 +- N/A 10 bad <-- chromium@475529 1 +- N/A 10 bad chromium@475530 1 +- N/A 10 bad chromium@475532 1 +- N/A 10 bad chromium@475536 1 +- N/A 10 bad Please refer to the following doc on diagnosing memory regressions: https://chromium.googlesource.com/chromium/src/+/master/docs/memory-infra/memory_benchmarks.md To Run This Test src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.toi system_health.memory_mobile Debug Info https://chromeperf.appspot.com/buildbucket_job_status/8978078946716869616 Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=4997151916032000 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Speed>Bisection. Thank you!
,
Jun 1 2017
Issue 728234 has been merged into this issue.
,
Jun 1 2017
Note: This has been duped by a bisect for the following significant memory regressions: https://chromeperf.appspot.com/group_report?bug_id=728081 hiroshige, unless you are already working on a fix for the issues introduced by your CL, we'll have to revert soon. +benhenry FYI
,
Jun 1 2017
As for memory regressions: My CL seems to cause both regressions and improvements in system_health.memory_mobile. Improvements in the same revision range: https://chromeperf.appspot.com/group_report?rev=1496265664 Slight timing changes of document load events tend to cause both regressions and improvements on memory benchmarks (e.g. Issue 663823 ), and I suspect the test is sensitive to a slight timing change that is not a real memory regression (depending the order of document load, GC, measurements, etc.?), but I haven't figured out why. +bashi@, what do you think?
,
Jun 1 2017
As for browse:news:toi, document seems to never become complete, so there might be a functional regression in document/image load event handling. I'll handle the browse:news:toi regression, independently from the memory regression, because the root causes of them look different.
,
Jun 1 2017
,
Jun 2 2017
,
Jun 2 2017
Issue 728232 has been merged into this issue.
,
Jun 2 2017
I filed issue 729008 to focus on the memory regressions and, as suggested, let's keep this about the probable functional regression in document/image load event handling.
,
Jun 2 2017
Issue 728306 has been merged into this issue.
,
Jun 7 2017
,
Jun 7 2017
,
Jun 8 2017
Any updates?
,
Jun 13 2017
Started investigation. Reproduced locally on Linux with command line: ./tools/perf/run_benchmark --also-run-disabled-tests --browser=release --story-filter=browse:news:toi system_health.memory_mobile According to DevTools, the browser repeatedly requests http://static.toiimg.com/photo/24487812.cms and gets 404 (from disk cache), which prevents the Document load completion and causing timeout.
,
Jun 13 2017
<img> with onerror: "this.src='http://static.toiimg.com/photo/24487812.cms'" is created by the following JavaScript: http://m.timesofindia.com/merg_js.cms?minify=1&donotshowurl=1&template=ipl_livescorewidget_js,navigation_test_js_new2,storage_js,utils_js_new2,breaking_news_new2_js,search_js_new2,imglazyload_js,gatrack_js,geo_js_new2,home_js,toi_slikevp_js,jquery_mobile_events_min_js,login_js_new2,newsletter_js,mitr_trending&version=492&v=492&msid=home&tn=default Therefore, the <img> causes repeated requests to 24487812.cms (request -> 404 response -> onerror handler -> request -> ...). After my CL, this delays document load event forever, because ImageLoader delays the document load event from setting this.src until onerror handler. Before my CL, the repeated requests still occur, but don't delay the load event forever, because onerror handler can't delay the load event when it is called from Document::ImplicitClose(). Test case: <body onload="alert('onload')"><img onerror="this.src='fox'" src="fox"></body> Is onload called? Chromium (before r475528): Yes Chromium (on or after r475528): No Firefox (53.0): No So I think: 1. r475528 does cause behavior change, but this change is probably OK, because it aligns with Firefox and with the spec (the things that occur in img's load/error events seem to further delay the document load event, according to the spec; I'm not so sure though). 2. the browse:news:toi test anyway contains repeated requests to 24487812.cms, which looks inappropriate for test stability and performance. Does the current live website still behave in the same way? If No, then probably we can fix the test by just updating the archived data for the test.
,
Jun 14 2017
Thanks for the investigation! +nednguyen, +ssid could either of you look into re-recording the story?
,
Jun 14 2017
Ashley: can you do the re-recording of this story? (I gonna show you how in person)
,
Jun 14 2017
Definitely! Let's walk through how to do it tomorrow morning
,
Jun 19 2017
,
Jun 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7c12598ed00f30cc125b9af98a7826edcbf47a39 commit 7c12598ed00f30cc125b9af98a7826edcbf47a39 Author: ashleymarie <ashleymarie@chromium.org> Date: Tue Jun 27 17:25:10 2017 Rerecorded browse:news:toi BUG= chromium:728081 Review-Url: https://codereview.chromium.org/2959743003 Cr-Commit-Position: refs/heads/master@{#482663} [modify] https://crrev.com/7c12598ed00f30cc125b9af98a7826edcbf47a39/tools/perf/benchmarks/system_health_smoke_test.py [modify] https://crrev.com/7c12598ed00f30cc125b9af98a7826edcbf47a39/tools/perf/page_sets/data/system_health_mobile.json [add] https://crrev.com/7c12598ed00f30cc125b9af98a7826edcbf47a39/tools/perf/page_sets/data/system_health_mobile_064.wpr.sha1 [modify] https://crrev.com/7c12598ed00f30cc125b9af98a7826edcbf47a39/tools/perf/page_sets/system_health/expectations.py
,
Jun 28 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8a0a93fdbca72e4455bc3dd876e28bf88473433b commit 8a0a93fdbca72e4455bc3dd876e28bf88473433b Author: ashleymarie <ashleymarie@chromium.org> Date: Wed Jun 28 13:19:26 2017 Revert of Rerecorded browse:news:toi (patchset #2 id:20001 of https://codereview.chromium.org/2959743003/ ) Reason for revert: Looks like this recording didn't fix the bug It's running and failing here: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5X%20Perf/builds/227 Original issue's description: > Rerecorded browse:news:toi > > BUG= chromium:728081 > > Review-Url: https://codereview.chromium.org/2959743003 > Cr-Commit-Position: refs/heads/master@{#482663} > Committed: https://chromium.googlesource.com/chromium/src/+/7c12598ed00f30cc125b9af98a7826edcbf47a39 TBR=nednguyen@google.com # Skipping CQ checks because original CL landed less than 1 days ago. NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true BUG= chromium:728081 Review-Url: https://codereview.chromium.org/2961013003 Cr-Commit-Position: refs/heads/master@{#482963} [modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/benchmarks/system_health_smoke_test.py [modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/page_sets/data/system_health_mobile.json [delete] https://crrev.com/fe7f0f48f41642e5c0427e6384db0b00f9dd50af/tools/perf/page_sets/data/system_health_mobile_064.wpr.sha1 [modify] https://crrev.com/8a0a93fdbca72e4455bc3dd876e28bf88473433b/tools/perf/page_sets/system_health/expectations.py
,
Jun 28 2017
,
Jul 11 2017
Hmm. > #22 > the browser repeatedly requests http://static.toiimg.com/photo/24487812.cms and gets 404 (from disk cache) Loading of https://static.toiimg.com/photo/24487812.cms is triggered by onerror of <img> for https://static.toiimg.com/thumb/59269412.cms?resizemode=1&width=180&height=135 I found both of the URLs exist on the live website (HTTP 200 with valid image responses), so perhaps the recorder missed those URLs. How the recorder works? Does it depend on behavior of Chromium of a specific version and/or on a specific device?
,
Nov 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/e454e3d6f7c0c5dc24a7ba924a47da339297a7f8 commit e454e3d6f7c0c5dc24a7ba924a47da339297a7f8 Author: Mythri Alle <mythria@chromium.org> Date: Mon Nov 06 11:38:04 2017 Disable toi on v8.runtimestats.browsing_mobile This was already disabled on v8.browsing_mobile, so I think it should be disabled on v8.runtimestats as well. Also updated the bug id for cnn as well. Bug: chromium:728081 Change-Id: Ifc152a4a5db870b78ec8d0f9709decafabf11d6a Reviewed-on: https://chromium-review.googlesource.com/753385 Reviewed-by: Annie Sullivan <sullivan@chromium.org> Commit-Queue: Mythri Alle <mythria@chromium.org> Cr-Commit-Position: refs/heads/master@{#514110} [modify] https://crrev.com/e454e3d6f7c0c5dc24a7ba924a47da339297a7f8/tools/perf/benchmarks/v8_browsing.py
,
Jul 19
Helen, it looks like this is a case where we'd expect WPR to record a resource but, for whatever reason, it's not doing so. Would you be willing to look into the problem?
,
Jul 19
I think it's maybe a lot easier to just rerecord the page
,
Jul 19
Re#34: Ned, ashley re-record the page in https://codereview.chromium.org/2959743003. That didn't fix the issue. Why do you say it's easier to just rerecord?
,
Jul 19
Oh, this is this an old bug. I can rerecord using wprgo.
,
Jul 19
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9c4d6685c70ac682be20c91ab5e41baddcb82148 commit 9c4d6685c70ac682be20c91ab5e41baddcb82148 Author: Helen Li <xunjieli@chromium.org> Date: Thu Jul 19 23:42:16 2018 Re-enable system_health.memory_mobile/browse:news:toi with a new recording Re-recorded browse:news:toi using: ./tools/perf/record_wpr --also-run-disabled-tests --story-filter=browse:news:toi system_health_memory_mobile Confirmed the benchmark passes using: ./tools/perf/run_benchmark --also-run-disabled-tests --story-filter=browse:news:toi system_health.memory_mobile Uploaded the recording using: upload_to_google_storage.py --bucket chrome-partner-telemetry \ system_health_mobile_064.wprgo Bug: 728081 Change-Id: Ie0660440cf679358133d0a82ec5be4034b4e93c1 Reviewed-on: https://chromium-review.googlesource.com/1144220 Reviewed-by: Annie Sullivan <sullivan@chromium.org> Commit-Queue: Helen Li <xunjieli@chromium.org> Cr-Commit-Position: refs/heads/master@{#576693} [modify] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/benchmarks/system_health_smoke_test.py [modify] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/page_sets/data/system_health_mobile.json [add] https://crrev.com/9c4d6685c70ac682be20c91ab5e41baddcb82148/tools/perf/page_sets/data/system_health_mobile_064.wprgo.sha1
,
Jul 19
,
Jul 23
The NextAction date has arrived: 2018-07-23
,
Jul 23
Helen, we also need to reenable the story & see if it passes to declare fix. I will reenable the story
,
Jul 23
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1 commit cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1 Author: Ned Nguyen <nednguyen@google.com> Date: Mon Jul 23 16:33:29 2018 Reenable browse:news:toi on Mobile TBR=xunjieli@chromium.org Bug:728081 Change-Id: I18ac62be4f68fb8501ab6c7064fc273b4ed58ff4 NOTRY=true # webkit_layout_tests flake crbug.com/866520 Change-Id: I18ac62be4f68fb8501ab6c7064fc273b4ed58ff4 Reviewed-on: https://chromium-review.googlesource.com/1146495 Commit-Queue: Ned Nguyen <nednguyen@google.com> Reviewed-by: Ned Nguyen <nednguyen@google.com> Reviewed-by: Helen Li <xunjieli@chromium.org> Cr-Commit-Position: refs/heads/master@{#577185} [modify] https://crrev.com/cde2cd23ac0f256d45b5dd77f22cbe29cbc228d1/tools/perf/expectations.config
,
Jul 24
|
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by perezju@chromium.org
, May 31 2017