v8.runtimestats.browsing_mobile failing on multiple bots
Issue description
,
Feb 21 2017
So when I look at this bot: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29?numbuilds=200
Run https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/5074 does not contain the test.
Run https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/5075 does, but I don't see any CL that would have added or re-enabled it... Ned, have you ever seen something like this before?
Similar thing on these two runs:
Doesn't have: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20Perf%20%282%29/builds/5078
Has: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus6%20Perf%20%282%29/builds/5079
,
Feb 21 2017
+V8 folks: does anyone know when this benchmark got re-enabled?
,
Feb 22 2017
This benchmark was added by this CL: https://codereview.chromium.org/2639213002/. This is the new v8 browse benchmark that collects runtime statistics, which landed on Friday. The commit position is in between the two builds. When I checked it locally, it worked fine on the Nexus 5 device. I have this CL: https://codereview.chromium.org/2709173002/ to disable these benchmarks.
,
Feb 22 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/8e74b82d093c3612228d2812eadb56e1f1854a96
commit 8e74b82d093c3612228d2812eadb56e1f1854a96
Author: mythria <mythria@chromium.org>
Date: Wed Feb 22 10:49:12 2017
[perfbot health] Disable v8.runtimestats.browsing_mobile benchmarks.
Disable v8.runtimestats.browsing_mobile, v8.runtimestats.browsing_mobile_turbo benchmarks.
BUG= chromium:694658
TBR=nednguyen@google.com
Review-Url: https://codereview.chromium.org/2709173002
Cr-Commit-Position: refs/heads/master@{#451985}
[modify] https://crrev.com/8e74b82d093c3612228d2812eadb56e1f1854a96/tools/perf/benchmarks/v8_browsing.py
,
Feb 22 2017
I disabled the benchmarks on reference. These benchmarks haven't failed in the last three builds. I will watch them closely and if they continue to fail, I will disable the benchmarks and investigate further.
,
Mar 29 2017
Friendly sheriff ping.
,
Mar 31 2017
On the Nexus5 perf 2 bot, the benchmark usually fails with the following error: PersistentDataError: No data for test v8.runtimestats.browsing_mobile found. This seems to be related to telemetry or some problem with the device. The last four builds were successful on the Nexus 5 bot and the last two builds were successful on the Nexus 6 bot. I will monitor it for a few more builds and update the bug.
,
Mar 31 2017
That is an infra failure, not a test failure. It means the device was offline when they ran and those runs should be purple. These two runs were red runs, and Chrome crashed during them:
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_Perf__2_%2F5425%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_Perf__2_%2F5424%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout
These failures go back too far to find a good bisect range. Having a few runs passing isn't a good measure of this being resolved. It's flaky, not a consistent failure. We might have to get people more familiar with debugging crashes involved.
,
Mar 31 2017
Issue 697918 has been merged into this issue.
,
Mar 31 2017
This benchmark started failing on the Nexus 5X bot. The last 4 test runs have failures... https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus5X%20Perf%20(2)
,
Mar 31 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8983579991717953792
,
Apr 1 2017
=== BISECT JOB RESULTS ===
Bisect was unable to run to completion
Error: INFRA_FAILURE
The bisect was able to narrow the range, you can try running with:
  good_revision: dd3783e3fbc682b2140333c56f446c1570cbe4b8
  bad_revision : 38f2a30295a7b589b5746c3c7d1a6a5ef6cc3438
If failures persist contact the team (see below) and report the error.
Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : v8.runtimestats.browsing_mobile
  Metric       : v8-gc-total_std/browse_media/browse_media_youtube
Revision          Exit Code    N
chromium@460687   0 +- N/A     20    good
chromium@460703   0 +- N/A     20    good
chromium@460719   1 +- N/A     20    bad
To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests v8.runtimestats.browsing_mobile
Debug Info
  https://chromeperf.appspot.com/buildbucket_job_status/8983579991717953792
Is this bisect wrong?
  https://chromeperf.appspot.com/bad_bisect?try_job_id=4992652312838144
| O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq
|  X  | for more information addressing perf regression bugs. For feedback,
| / \ | file a bug with component Speed>Bisection. Thank you!
,
Apr 3 2017
On the Nexus 5X bot, all four failures were on the YouTube page. That page also failed on the system_health.common_mobile benchmark. I think this CL: https://codereview.chromium.org/2785333002 fixed the failures on the Nexus 5X bot.
The Nexus 5 bot is more stable than earlier, though there are still some failures. I was looking at this failure: https://build.chromium.org/p/chromium.perf/builders/Android%20Nexus5%20Perf%20%282%29/builds/5341 which fails when taking a screenshot:
CRITICAL:root:(TimeoutThread-1-for-MainThread) Exception on TakeScreenshot(03849495003bfd06, host_path=/tmp/tmprhw7uH.png, timeout=30, retries=3).
https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5_Perf__2_%2F5341%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout
Is it because something crashed earlier? I am not quite familiar with this. Also cc'ing telemetry people in case they can add more information.
,
Apr 3 2017
,
Apr 3 2017
The log in #14 is failing due to the device dying:
Locals:
  args : ['shell', '( p=org.chromium.chrome;if [[ "$(ps)" = *$p* ]]; then am force-stop $p; fi );echo %$?']
  check_error : False
  cls : <class 'devil.android.sdk.adb_wrapper.AdbWrapper'>
  cpu_affinity : None
  device_serial : '03849495003bfd06'
  output : "error: device '03849495003bfd06' not found\n"
  retries : 2
  status : 255
  timeout : 30
INFO:root:Try printing formatted exception: None None None
These are infra and not test failures. Ned, we should probably think on how to get this "purple" rather than "red".
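As a rough illustration of the red-versus-purple distinction (a sketch only, not something telemetry or the bot recipes are known to do), a step log could be scanned for the two infra signatures quoted in this thread. The function name, the status labels, and the pattern list are all hypothetical.

```python
import re

# Hypothetical infra signatures, copied from the log excerpts quoted in this
# thread. A real list would need to be maintained by the infra owners.
INFRA_PATTERNS = [
    re.compile(r"error: device '[0-9a-fA-F]+' not found"),
    re.compile(r"PersistentDataError: No data for test \S+ found"),
]


def classify_step_log(log_text, exit_code):
    """Return 'success', 'infra_failure' (purple) or 'test_failure' (red)."""
    if exit_code == 0:
        return "success"
    for pattern in INFRA_PATTERNS:
        if pattern.search(log_text):
            return "infra_failure"
    # Anything else that exits non-zero is treated as a genuine test failure.
    return "test_failure"
```

Matching on log text is fragile, and it cannot tell whether the test itself caused the device to drop off, so this would at best be a first-pass heuristic.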
,
Apr 3 2017
Dirk, Maruel: do you have any opinion about Juan's proposal in #16? My crazy idea is that we should probably have a "post verification" step in the swarming infrastructure. If swarming detects that the bot is dead, it's likely that the test run was compromised and it should get the "purple" status.
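A minimal sketch of what such a "post verification" could look like, assuming the check has access to the text printed by `adb devices` after the step finishes. Both function names and the status strings are hypothetical, and this is not how Swarming is known to work.

```python
def is_device_online(adb_devices_output, serial):
    """Return True if `serial` is listed with the state 'device' (online).

    `adb devices` prints a header line followed by one
    '<serial><TAB><state>' line per attached device.
    """
    for line in adb_devices_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == serial:
            return fields[1] == "device"
    # Serial not listed at all: treat the device as gone.
    return False


def step_status(test_exit_code, adb_devices_output, serial):
    """Combine the test exit code with a post-run device health check."""
    if not is_device_online(adb_devices_output, serial):
        # The run is considered compromised regardless of the exit code.
        return "purple"
    return "green" if test_exit_code == 0 else "red"
```

A device that drops offline mid-run and recovers before the check would still be reported as red or green, so the check only catches devices that stay dead.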
,
Apr 3 2017
https://uberchromegw.corp.google.com/i/chromium.perf/builders/Android%20Nexus5X%20Perf%20%282%29/builds/3465 This build is not using Swarming. Migrate this to Swarming first then I can help.
,
Apr 4 2017
+jbudorick I think it's hard to classify whether a device dies as an infra problem or a test problem a priori. It kinda depends on *why* the device died, doesn't it? That's a separate question from whether that means the results of that particular step should be considered invalid.
,
Apr 4 2017
If a test does something that causes a device to crash, is it an infra failure or a test failure? I think you could make reasonable arguments for both sides.
Looking at the test run from #14, note that:
- the device didn't die, per se; it came back online, and we were able to successfully communicate with it afterwards
- the device crashes in three consecutive v8 tests, but not in any of the other tests it ran
(I obviously agree w/ Dirk's assertion in #19 that it's difficult to classify these things ahead of time.)
,
Apr 4 2017
Good find, John. It seems extremely unlikely to me that a user program can cause a device to crash. If it is, we should find a way to reproduce it & file an Android bug.
,
Apr 4 2017
Without getting too into the weeds, I'm not sure I'd agree with "extremely unlikely." I also don't think it'd matter if we came up with a repro case for a KK MR1 crash. More recent versions of the OS might be more interesting and less likely to get immediately WontFixed.
,
Apr 4 2017
I wouldn't tend to agree with "extremely unlikely" either; I've seen plenty of cases on different platforms where one of our tests has caused a machine to crash or otherwise become useless.
,
Apr 10 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8982667382454105200
,
Apr 19 2017
The v8.runtimestats benchmark was stable for some time on Nexus 5 but started failing recently on the browse:news:globo page. This page times out when waiting for a JavaScript condition. The log is here: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus5_Perf__2_%2F5467%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout This page works fine on the system_health.common_mobile benchmark. This page also seems to be working fine on the Nexus 6. Any ideas what could have gone wrong?
,
Apr 19 2017
Also, would it be better if I split this into two bugs, one for the Nexus 5 bot and the other for the Nexus 6 bot? The behaviour seems to be different on these two bots, so I suspect the underlying problems are different. On Nexus 6, both v8.runtimestats and system_health.common are failing, which may point to issues on the device or with the page set. On Nexus 5, by contrast, v8.runtimestats has failed on the browse:news:globo page consistently over the last 5 runs. Before that it seemed stable.
,
Apr 19 2017
Yes, I filed issue 713036 for the problem with browse:news:globo, which I'll disable soon. What is the specific problem happening on the Nexus 6 now?
,
Apr 19 2017
Thanks Juan. On nexus 6 both v8.runtimestats and system_health failed on browse:chrome:newtab recently. This is the stack trace:
File "/b/c/b/Android_Nexus6_Perf__2_/src/third_party/catapult/telemetry/third_party/websocket-client/websocket/_socket.py", line 83, in recv
raise WebSocketTimeoutException(message)
TimeoutException:
********************************************************************************
(/b/c/b/Android_Nexus6_Perf__2_/src/third_party/catapult/telemetry/telemetry/internal/backends/chrome_inspector/inspector_backend.py:477 _ConvertExceptionFromInspectorWebsocket) The app is probably crashed
Here is the log: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_Perf__2_%2F5601%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout
Earlier around builds 5559-5572, they were failing on multiple pages (facebook, hackernews, twitter, youtube, washingtonpost). I haven't checked all the builds but most of them time out waiting for something to load. For ex: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Nexus6_Perf__2_%2F5571%2F%2B%2Frecipes%2Fsteps%2Fv8.runtimestats.browsing_mobile%2F0%2Fstdout.
I am not sure if something fixed it after build 5572 or if they are just flaky.
,
Apr 19 2017
browse:chrome:newtab was recently disabled in issue 712590, so that should be fine. We've recently improved a lot of other stories that were having flakiness issues (they tend to also show up on system health benchmarks), so hopefully they should be more stable now. If you don't see new errors happening in the next few builds, I would suggest closing this bug. We can open new bugs later if other stories start failing/flaking.
,
Apr 28 2017
runtimestats benchmark is now more stable on both these bots. Marking this as fixed.
Comment 1 by rnep...@chromium.org
, Feb 21 2017