testEnsurePCv1Cold and testEnsurePCv1WarmFromScratch failures on ChromeOS
Issue description
Both telemetry informational PFQ bots have been failing since 2/16:
https://build.chromium.org/p/chromiumos.chromium/builders/amd64-generic-telemetry/builds/10833
https://build.chromium.org/p/chromiumos.chromium/builders/x86-generic-telemetry/builds/11748
From the log, I suspect this is the CL to blame: https://chromium.googlesource.com/chromium/src/+/820dcd54f703f06bb811f77d54797f645d7a3480
Both telemetry informational bots started to fail with the build containing this CL, and the error seems related to the metric changes:
02/21 06:11:36.179 INFO |run_chromeos_tests:0052| FAIL: add_1_and_2 (browser_tests.simple_numeric_test.SimpleTest)
02/21 06:11:36.180 INFO |run_chromeos_tests:0052| ----------------------------------------------------------------------
02/21 06:11:36.180 INFO |run_chromeos_tests:0052| Traceback (most recent call last):
02/21 06:11:36.180 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/testing/serially_executed_browser_test_case.py", line 158, in <lambda>
02/21 06:11:36.181 INFO |run_chromeos_tests:0052| return lambda self: based_method(self, *args)
02/21 06:11:36.181 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/examples/browser_tests/simple_numeric_test.py", line 57, in AdderTest
02/21 06:11:36.181 INFO |run_chromeos_tests:0052| self.assertEqual(a + b, partial_sum)
02/21 06:11:36.182 INFO |run_chromeos_tests:0052| AssertionError: 3 != 5
02/21 06:11:36.182 INFO |run_chromeos_tests:0052|
02/21 06:11:36.182 INFO |run_chromeos_tests:0052| ======================================================================
02/21 06:11:36.182 INFO |run_chromeos_tests:0052| FAIL: add_7_and_3 (browser_tests.simple_numeric_test.SimpleTest)
02/21 06:11:36.183 INFO |run_chromeos_tests:0052| ----------------------------------------------------------------------
02/21 06:11:36.183 INFO |run_chromeos_tests:0052| Traceback (most recent call last):
02/21 06:11:36.183 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/testing/serially_executed_browser_test_case.py", line 158, in <lambda>
02/21 06:11:36.184 INFO |run_chromeos_tests:0052| return lambda self: based_method(self, *args)
02/21 06:11:36.184 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/examples/browser_tests/simple_numeric_test.py", line 57, in AdderTest
02/21 06:11:36.184 INFO |run_chromeos_tests:0052| self.assertEqual(a + b, partial_sum)
02/21 06:11:36.184 INFO |run_chromeos_tests:0052| AssertionError: 10 != 5
,
Feb 22 2017
,
Feb 22 2017
I think we want our gardeners to be able to handle failures such as this.
Clicking through the catapult roller CL, you can see that two catapult CLs are potentially at fault: https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/88e9135e3e0a..d885da830d7a
For the VM failure, click the link for telemetry_UnitTests, which takes you to pantheon, then click telemetry_UnitTests.user -> debug -> telemetry_UnitTests.user.DEBUG. This opens a log file; search for 'failed unexpectedly'. The failing tests are:
telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1Cold
telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmFromScratch
Juan's CL, https://codereview.chromium.org/2704493002, is titled 'Fix bug in markers for cache temperature', so it is probably the culprit. If need be, we can also disable these tests for ChromeOS; I believe there's a YAQS entry on how to do that.
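For gardeners who would rather not search the DEBUG log by hand, here is a minimal sketch of the 'failed unexpectedly' search in Python. The function name and the regular expression are my own assumptions, not part of telemetry or autotest; the sample lines follow the run_chromeos_tests log format seen in this bug.

```python
import re

# Assumed line shape (taken from the logs in this bug), e.g.:
#   ... [556/1082] some.module.Class.testName failed unexpectedly 8.7635s:
FAILED_RE = re.compile(r"\[\d+/\d+\]\s+(\S+)\s+failed unexpectedly")


def find_unexpected_failures(log_text):
    """Return names of tests reported as 'failed unexpectedly', in log order."""
    failures = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        # Skip duplicates so a test reported twice is listed once.
        if match and match.group(1) not in failures:
            failures.append(match.group(1))
    return failures


SAMPLE_LOG = """\
02/27 03:19:35.412 INFO |run_chromeos_tests:0052| [556/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1Cold failed unexpectedly 8.7635s:
02/27 03:19:44.585 INFO |run_chromeos_tests:0052| [557/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmAfterPCv1ColdRun passed 9.1585s
02/27 03:19:59.078 INFO |run_chromeos_tests:0052| [558/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmFromScratch failed unexpectedly 14.4923s:
"""

if __name__ == "__main__":
    for name in find_unexpected_failures(SAMPLE_LOG):
        print(name)
```

To use it on a real run, read the downloaded telemetry_UnitTests.user.DEBUG file into a string and pass that in place of SAMPLE_LOG.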
,
Feb 22 2017
I agree that this is something that gardeners should know how to handle. achuith@ - Could you add a YAQS entry with the above? When I searched for 'catapult' or 'telemetry_UnitTests' I didn't find anything that was especially helpful. Thanks!
,
Feb 27 2017
How are these tests run? I'm trying to figure out how to reproduce locally.
,
Feb 27 2017
Here are the stack traces in case it helps:
02/27 03:19:35.412 INFO |run_chromeos_tests:0052| [556/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1Cold failed unexpectedly 8.7635s:
02/27 03:19:35.412 INFO |run_chromeos_tests:0052| Traceback (most recent call last):
02/27 03:19:35.412 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/testing/browser_test_case.py", line 41, in WrappedMethod
02/27 03:19:35.413 INFO |run_chromeos_tests:0052| method(self)
02/27 03:19:35.413 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/page/cache_temperature_unittest.py", line 63, in testEnsurePCv1Cold
02/27 03:19:35.413 INFO |run_chromeos_tests:0052| self.assertIn('telemetry.internal.ensure_diskcache.start', markers)
02/27 03:19:35.413 INFO |run_chromeos_tests:0052| AssertionError: 'telemetry.internal.ensure_diskcache.start' not found in set([u'68cdb6de-5b1f-4213-83ca-add2e06b7aae', u'telemetry.internal.ensure_diskcache.end'])
02/27 03:19:44.585 INFO |run_chromeos_tests:0052| [556/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmAfterPCv1ColdRun queued
02/27 03:19:44.585 INFO |run_chromeos_tests:0052| [557/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmAfterPCv1ColdRun passed 9.1585s
02/27 03:19:59.078 INFO |run_chromeos_tests:0052| [557/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmFromScratch queued
02/27 03:19:59.078 INFO |run_chromeos_tests:0052| [558/1082] telemetry.page.cache_temperature_unittest.CacheTempeartureTests.testEnsurePCv1WarmFromScratch failed unexpectedly 14.4923s:
02/27 03:19:59.079 INFO |run_chromeos_tests:0052| Traceback (most recent call last):
02/27 03:19:59.079 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/testing/browser_test_case.py", line 41, in WrappedMethod
02/27 03:19:59.079 INFO |run_chromeos_tests:0052| method(self)
02/27 03:19:59.080 INFO |run_chromeos_tests:0052| File "/usr/local/telemetry/src/third_party/catapult/telemetry/telemetry/page/cache_temperature_unittest.py", line 92, in testEnsurePCv1WarmFromScratch
02/27 03:19:59.080 INFO |run_chromeos_tests:0052| self.assertIn('telemetry.internal.warmCache.start', markers)
02/27 03:19:59.080 INFO |run_chromeos_tests:0052| AssertionError: 'telemetry.internal.warmCache.start' not found in set([u'c7d0d7da-bff1-420e-8b23-64274307ce7d', u'telemetry.internal.warmCache.end'])
,
Feb 27 2017
This looks more like a problem with tracing on ChromeOS (the test is working fine on other platforms). These new tests (added in my CL) are correctly complaining that some markers expected to appear in the trace are not there. I could disable the tests on ChromeOS, but the underlying issue will still be there.
,
Feb 27 2017
Juan - are you still interested in trying to repro this problem on ChromeOS? It sounds like there isn't a lot that can be done besides disabling this test?
,
Feb 27 2017
The error suggests that some benchmarks/metrics may not be working as expected on ChromeOS (and the problem might be subtle, giving wrong numbers rather than failing with a hard error; see issue 692929 for the problem my CL fixed). Telemetry calls console.time and console.timeEnd in a few places to inject markers into the trace. The failing tests show that some of these markers are inserted but do not show up in the trace as expected. You might want to assign someone from the ChromeOS team to investigate (I'm happy to provide guidance). Alternatively, if you have other means to ensure the metrics you care about are working fine, then it's OK to just keep the failing tests disabled and close this issue.
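To make the symptom concrete, here is a rough sketch of the kind of check involved: given the set of marker names found in a trace, report any .start/.end marker whose counterpart is absent. This is not telemetry's code; the helper name is hypothetical, and the sample set is copied from the testEnsurePCv1Cold assertion message in this bug.

```python
def missing_marker_halves(markers):
    """Return, sorted, the .start/.end marker names whose counterpart is absent."""
    present = set(markers)
    missing = set()
    for name in present:
        if name.endswith(".start"):
            partner = name[:-len(".start")] + ".end"
        elif name.endswith(".end"):
            partner = name[:-len(".end")] + ".start"
        else:
            continue  # e.g. UUID-style markers with no start/end suffix
        if partner not in present:
            missing.add(partner)
    return sorted(missing)


# Marker set reported by the failing testEnsurePCv1Cold run on ChromeOS.
observed = {
    "68cdb6de-5b1f-4213-83ca-add2e06b7aae",
    "telemetry.internal.ensure_diskcache.end",
}

if __name__ == "__main__":
    print(missing_marker_halves(observed))
    # prints ['telemetry.internal.ensure_diskcache.start']
```

On the ChromeOS traces from the failing runs, only the .end half is present, which is the condition the new tests flag.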
,
Feb 27 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/966bd35d17bb18e4c2a953d9816773566c88bb40
commit 966bd35d17bb18e4c2a953d9816773566c88bb40
Author: catapult-deps-roller <catapult-deps-roller@chromium.org>
Date: Mon Feb 27 16:20:14 2017
Roll src/third_party/catapult/ a75c463e8..645770e40 (1 commit).
https://chromium.googlesource.com/external/github.com/catapult-project/catapult.git/+log/a75c463e8416..645770e4019b
$ git log a75c463e8..645770e40 --date=short --no-merges --format='%ad %ae %s'
2017-02-27 achuith Disable testEnsurePCv1Cold and testEnsurePCv1WarmFromScratch
Created with:
roll-dep src/third_party/catapult
BUG=694722
Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md
If the roll is causing failures, see:
http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=catapult-sheriff@chromium.org
Review-Url: https://codereview.chromium.org/2720693005
Cr-Commit-Position: refs/heads/master@{#453229}
[modify] https://crrev.com/966bd35d17bb18e4c2a953d9816773566c88bb40/DEPS
,
Feb 27 2017
,
Feb 27 2017
,
Feb 28 2017
Achuith, the remaining failures that I can see at the moment are all due to ValueError("No JSON object could be decoded") reported in Issue 696553.
,
Feb 28 2017
Yup, I disabled these two tests here: https://codereview.chromium.org/2722563002/
,
Feb 28 2017
,
Apr 27 2017
Is there more to be done on this issue?
,
Jan 24 2018
,
Jan 24 2018
,
Jan 16
,
Jan 16
Comment 1 by steve...@chromium.org
, Feb 21 2017