system_health.common_desktop's browse:media:flickr_infinite_scroll generating a too-big trace on Windows
Issue description
system_health.common_desktop failing on 5 builders

Builders failed on:
- Win 10 High-DPI Perf: https://build.chromium.org/p/chromium.perf/builders/Win%2010%20High-DPI%20Perf
- Win 7 ATI GPU Perf: https://build.chromium.org/p/chromium.perf/builders/Win%207%20ATI%20GPU%20Perf
- Win 7 Intel GPU Perf: https://build.chromium.org/p/chromium.perf/builders/Win%207%20Intel%20GPU%20Perf
- Win 7 Nvidia GPU Perf: https://build.chromium.org/p/chromium.perf/builders/Win%207%20Nvidia%20GPU%20Perf
- Win 7 x64 Perf: https://build.chromium.org/p/chromium.perf/builders/Win%207%20x64%20Perf
,
Jun 2 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8977876431121517248
,
Jun 2 2017
I saw a MemoryError in this log: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F663%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.common_desktop_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout I bet this is related to me enabling the CPU time metric and the "toplevel" trace category yesterday, which would definitely cause the size of the traces to grow. It seems like all of the failures are on our Windows bots, where Python is limited to 2GB.
,
Jun 2 2017
(Python is limited to 2GB because we use the 32-bit version, and 32-bit Windows processes are limited to 2GB of user address space by default.)
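For anyone who wants to confirm which Python a bot is running, a minimal sketch follows. It only uses the standard library; the helper names `pointer_bits` and `is_32bit_python` are made up for illustration and are not part of Telemetry.

```python
import struct
import sys


def pointer_bits() -> int:
    """Return the pointer width of the running interpreter, in bits."""
    # struct.calcsize("P") is the size of a C void* in bytes.
    return struct.calcsize("P") * 8


def is_32bit_python() -> bool:
    """True when this interpreter was built as a 32-bit process."""
    return pointer_bits() == 32


if __name__ == "__main__":
    print("Python %s, %d-bit" % (sys.version.split()[0], pointer_bits()))
    if is_32bit_python():
        print("32-bit interpreter: very large traces may raise MemoryError.")
```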
,
Jun 2 2017
Yeah, according to https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_7_ATI_GPU_Perf%2F808%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.common_desktop_on_ATI_GPU_on_Windows_on_Windows-2008ServerR2-SP1%2F0%2Fstdout, the size of the Chrome trace itself is >2GB, so we're going to have problems here.
,
Jun 2 2017
One thing that's interesting: the Mac run of the same story (browse:media:flickr_infinite_scroll) seems to produce a *significantly* smaller trace (150MB on Mac, as opposed to 1.2GB on Windows). That story, which causes the rest of the Windows system health stories to not even run, is among the largest Mac traces, but isn't _the_ largest. misc:multitab, for example, is right around the same size, as is browse:news:nytimes. I think that something on Windows is causing the trace size to be significantly larger than it should be.
,
Jun 2 2017
Revert in progress here: https://chromium-review.googlesource.com/c/522887
,
Jun 2 2017
Unfortunately, the only trace that was collected on Windows before the failure was browse_news_cnn, which is disabled on Mac. Comparing it to Linux, though, it's actually pretty comparable: the Windows trace is something like 110MB, and the Linux trace is about 90MB. This makes me think that there's not some universal problem where Windows always issues way more trace events than other platforms. However, looking at flickr_infinite_scroll, it looks like there may be some problem specific to Windows. The Linux trace size is 400MB, whereas the Windows trace size is more like 2GB.
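The comparison above can be sketched as a small script. The sizes are the approximate figures quoted in this comment; the 2x threshold and the helper names (`windows_ratio`, `flag_outliers`) are assumptions chosen for illustration, not anything the perf infrastructure defines.

```python
# Approximate trace sizes in MB, as quoted in this thread.
TRACE_SIZES_MB = {
    "browse_news_cnn": {"win": 110, "linux": 90},
    "flickr_infinite_scroll": {"win": 2000, "linux": 400},
}


def windows_ratio(sizes):
    """Return how many times larger the Windows trace is than the Linux one."""
    return sizes["win"] / sizes["linux"]


def flag_outliers(trace_sizes, max_ratio=2.0):
    """Return stories whose Windows trace exceeds max_ratio times the Linux trace.

    max_ratio is a hypothetical cutoff, not a value used by any real tool.
    """
    return [
        story
        for story, sizes in trace_sizes.items()
        if windows_ratio(sizes) > max_ratio
    ]


if __name__ == "__main__":
    for story, sizes in TRACE_SIZES_MB.items():
        print("%s: Windows/Linux = %.2f" % (story, windows_ratio(sizes)))
    print("Outliers:", flag_outliers(TRACE_SIZES_MB))
```

With these numbers only flickr_infinite_scroll is flagged, which matches the reading that the problem is specific to that story rather than to Windows tracing in general.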
,
Jun 2 2017
Ack! I don't think this is actually my CL's fault: I think this is due to Ned's migration of the v8.infinite_scroll stories to the system health pageset. His CL (https://codereview.chromium.org/2890283002) is in the list of changelists for the first failing CL. I'm going to go ahead and disable this story on Windows.
,
Jun 2 2017
Going to go ahead and assign this to Ned.
,
Jun 2 2017
Aaaand there's already been someone that did all of this investigation :-(
,
Jun 2 2017
(Just to save someone the trouble of clicking the dupe link: this story was disabled on Windows by Stephen 24h ago and is passing in the last run: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FWin_10_High-DPI_Perf%2F664%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.common_desktop_on_Intel_GPU_on_Windows_on_Windows-10-10240%2F0%2Fstdout)
,
Jun 3 2017
=== BISECT JOB RESULTS ===
Bisect failed for unknown reasons
Please contact the team (see below) and report the error.

Bisect Details
  Configuration: winx64_high_dpi_perf_bisect
  Benchmark    : system_health.common_desktop
  Metric       : after_load:power_avg/after_load:power_avg

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests system_health.common_desktop

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8977876431121517248

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5825151280611328

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection. Thank you!
,
Jun 3 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8977797067289172512
,
Jun 4 2017
=== BISECT JOB RESULTS ===
Bisect failed for unknown reasons
Please contact the team (see below) and report the error.

Bisect Details
  Configuration: winx64_high_dpi_perf_bisect
  Benchmark    : system_health.common_desktop
  Metric       : after_load:power_avg/after_load:power_avg

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release_x64 --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests system_health.common_desktop

Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8977797067289172512

Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=5825151280611328

| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection. Thank you!
Comment 1 by simonhatch@chromium.org
, Jun 2 2017
I see this in log:

Traceback (most recent call last):
  RunBenchmark at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:384
    expectations=expectations)
  Run at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:253
    _RunStoryAndProcessErrorIfNeeded(story, results, state, test)
  _RunStoryAndProcessErrorIfNeeded at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\internal\story_runner.py:90
    state.RunStory(results)
  traced_function at c:\b\s\w\ir\third_party\catapult\common\py_trace_event\py_trace_event\trace_event_impl\decorators.py:75
    return func(*args, **kwargs)
  RunStory at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\shared_page_state.py:296
    self._current_page.Run(self)
  Run at c:\b\s\w\ir\third_party\catapult\telemetry\telemetry\page\__init__.py:112
    self.RunPageInteractions(action_runner)
  RunPageInteractions at c:\b\s\w\ir\tools\perf\page_sets\system_health\system_health_story.py:142
    self._DidLoadDocument(action_runner)
  _DidLoadDocument at c:\b\s\w\ir\tools\perf\page_sets\system_health\browsing_stories.py:807
    self._Scroll(action_runner, self.SCROLL_DISTANCE, self.SCROLL_STEP)
  _Scroll at c:\b\s\w\ir\tools\perf\page_sets\system_health\browsing_stories.py:828
    raise Exception('Scrolling stuck at %d' % remaining)
Exception: Scrolling stuck at 18560

Locals:
  action_runner : <telemetry.internal.actions.action_runner.ActionRunner object at 0x040508D0>
  distance : 25000
  new_remaining : 18560
  remaining : 18560
  retry_count : 3
  step_size : 1000
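To make the exception and its locals easier to read, here is a rough sketch of the kind of retry loop that would produce them. This is not the code in browsing_stories.py: `FakePage`, `scroll` and `max_retries` are hypothetical, and only the local names (`distance`, `step_size`, `remaining`, `new_remaining`, `retry_count`) and the error message are taken from the log above.

```python
class FakePage:
    """Hypothetical stand-in for a page that stops scrolling at `limit` pixels."""

    def __init__(self, limit):
        self.limit = limit
        self.position = 0

    def scroll_by(self, pixels):
        # The page never moves past its limit, mimicking content that
        # stops loading during an infinite scroll.
        self.position = min(self.position + pixels, self.limit)


def scroll(page, distance, step_size, max_retries=3):
    """Scroll `distance` pixels in steps, failing after repeated no-progress steps."""
    remaining = distance - page.position
    retry_count = 0
    while remaining > 0:
        page.scroll_by(min(step_size, remaining))
        new_remaining = distance - page.position
        if new_remaining == remaining:
            # No progress on this step: count it and give up after max_retries.
            retry_count += 1
            if retry_count == max_retries:
                raise Exception('Scrolling stuck at %d' % remaining)
        else:
            retry_count = 0
            remaining = new_remaining


if __name__ == "__main__":
    try:
        scroll(FakePage(limit=6440), distance=25000, step_size=1000)
    except Exception as e:
        print(e)
```

Under these assumptions, a page that stops at 6440 pixels leaves 18560 of the 25000 pixels unscrolled after three retries, which is consistent with the values in the Locals dump.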