Chrome OS perf (amd & x86 generic telemetry) failing in VMTest consistently |
||||||
Issue descriptionSee https://uberchromegw.corp.google.com/i/chromiumos.chromium/builders/x86-generic-telemetry?numbuilds=200 and https://uberchromegw.corp.google.com/i/chromiumos.chromium/builders/amd64-generic-telemetry?numbuilds=200 It seems that the VMTest has been failing consistently for more than a month. dshi@, could you take a look at it and help find the right owner? Thanks!
,
Mar 15 2016
They run this test: https://cs.corp.google.com/chromeos_public/src/third_party/autotest/files/client/site_tests/telemetry_UnitTests/telemetry_UnitTests.py with all 3 control files (control.user, control.perf and control.guest). Last I looked, running these tests would cause the VM to halt, and a gardener needs to look into why this is happening. These bots run the telemetry unit tests, which are an indicator of the health of the cros performance tests. The cros performance tests are used by various teams for comparing performance of boards, build toolchains, graphics performance, etc. We shouldn't get rid of these bots.
,
Mar 15 2016
I looked into the log file, and I think it resembles Issue 528035 . achuith@, do you have any thought on the failure? Selected sample log: /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf [ FAILED ] /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf/telemetry_UnitTests.perf [ FAILED ] /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-2-telemetry_UnitTests [ FAILED ] /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-2-telemetry_UnitTests/telemetry_UnitTests.user [ FAILED ] /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-3-telemetry_UnitTests_guest [ FAILED ] /tmp/cbuildbot4Ipbzy/test_harness/all/SimpleTestVerify/1_autotest_tests/results-3-telemetry_UnitTests_guest/telemetry_UnitTests.guest [ FAILED ] ------------------------------------------------------------------------------- Total PASS: 0/6 (0%) Sample traceback (most recent call last): Traceback (most recent call last): File "/usr/local/autotest/common_lib/log.py", line 70, in decorated_func fn(*args, **dargs) File "/usr/local/autotest/bin/base_sysinfo.py", line 374, in log_before_each_iteration board = utils.get_board_with_frequency_and_memory() File "/usr/local/autotest/bin/site_utils.py", line 864, in get_board_with_frequency_and_memory frequency = int(round(get_cpu_max_frequency() * 1e-8)) * 0.1 File "/usr/local/autotest/bin/site_utils.py", line 788, in get_cpu_max_frequency paths = _get_cpufreq_paths('cpuinfo_max_freq') File "/usr/local/autotest/bin/site_utils.py", line 925, in _get_cpufreq_paths paths = utils.run(cmd, verbose=False).stdout.splitlines() File "/usr/local/autotest/common_lib/base_utils.py", line 893, in run "Command returned non-zero exit status") * Command: ls /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq Exit status: 2 Duration: 0.000597953796387 stderr: ls: cannot access /sys/devices/system/cpu/cpu*/cpufreq/cpuinfo_max_freq: No such file or directory ========================== Saving memory snapshot to /b/cbuild/shared_external/chroot/tmp/cbuildbot4Ipbzy/test_harness/failed/SimpleTestVerify/1_autotest_tests/chromiumos_qemu_mem.bin.rJW_Fg... freezing VM... saving memory, piping through cat... flushing disk buffers... asking KVM to quit... done. ====================================================================== FAIL: SimpleTestVerify (crostestutils.au_test_harness.au_test.AUTest) Test that only verifies the target image. ---------------------------------------------------------------------- ---------------------------------------------------------------------- Ran 1 test in 1013.556s FAILED (failures=1) 2016/03/15 09:15:56 - cros_build_lib.py - ERROR : <class 'chromite.lib.parallel.BackgroundFailure'>: Test harness failed. See logs for details. Traceback (most recent call last): File "/b/cbuild/shared_external/chromite/lib/parallel.py", line 440, in _Run self._task(*self._task_args, **self._task_kwargs) File "/b/cbuild/shared_external/src/platform/crostestutils/au_test_harness/cros_au_test_harness.py", line 77, in run raise parallel.BackgroundFailure(msg) BackgroundFailure: Test harness failed. See logs for details.
,
Mar 15 2016
https://chromium-review.googlesource.com/#/c/332638/1 This fixes the failure with not being able to run the test, but the VM still reboots when the test runs, and that will need to be investigated.
,
Mar 15 2016
+davidjames@ could you help bring some thought on this issue?
,
Mar 15 2016
https://codereview.chromium.org/1805733002/
,
Mar 15 2016
Re #6: Thanks for taking over this issue!
,
Mar 15 2016
Thanks for finding http://crbug.com/528035 - that was key to solving this problem.
,
Mar 16 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/26dc38548ec0632ea0a2eb8841291fd3d7d76ed6 commit 26dc38548ec0632ea0a2eb8841291fd3d7d76ed6 Author: Achuith Bhandarkar <achuith@chromium.org> Date: Tue Mar 15 20:05:07 2016 Update telemetry path to catapult. BUG= chromium:595055 TEST=this is the test. Change-Id: I611d262d400291126a2d30accc8c8de2c622a06c Reviewed-on: https://chromium-review.googlesource.com/332638 Commit-Ready: Achuith Bhandarkar <achuith@chromium.org> Tested-by: Achuith Bhandarkar <achuith@chromium.org> Tested-by: Xiaoqian Dai <xdai@chromium.org> Reviewed-by: Achuith Bhandarkar <achuith@chromium.org> Reviewed-by: Xiaoqian Dai <xdai@chromium.org> [modify] https://crrev.com/26dc38548ec0632ea0a2eb8841291fd3d7d76ed6/client/site_tests/telemetry_UnitTests/telemetry_UnitTests.py
,
Mar 18 2016
So after the fix, the VMTest is still failing. But the error message changed. This is the first build with the fix in #9: https://uberchromegw.corp.google.com/i/chromiumos.chromium/builders/x86-generic-telemetry/builds/8822 Selected log: ------------------------------------------------------------------------------------------------------------------------------------------------- /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf [ FAILED ] /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf FAIL: The unit tests of /usr/local/telemetry/src/tools/perf failed. /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf/telemetry_UnitTests.perf [ FAILED ] /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-1-telemetry_UnitTests_perf/telemetry_UnitTests.perf FAIL: The unit tests of /usr/local/telemetry/src/tools/perf failed. /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-2-telemetry_UnitTests [ FAILED ] /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-2-telemetry_UnitTests FAIL: Autotest client terminated unexpectedly: DUT rebooted during the test run. /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-2-telemetry_UnitTests/telemetry_UnitTests.user [ FAILED ] /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-3-telemetry_UnitTests_guest [ PASSED ] /tmp/cbuildbotCDyLmz/test_harness/all/SimpleTestVerify/1_autotest_tests/results-3-telemetry_UnitTests_guest/telemetry_UnitTests.guest [ PASSED ] ------------------------------------------------------------------------------------------------------------------------------------------------- Total PASS: 2/6 (33%) ====================================================================== FAIL: SimpleTestVerify (crostestutils.au_test_harness.au_test.AUTest) Test that only verifies the target image. ---------------------------------------------------------------------- FAILED (failures=1) 2016/03/15 23:04:40 - cros_build_lib.py - ERROR : <class 'chromite.lib.parallel.BackgroundFailure'>: Test harness failed. See logs for details. Traceback (most recent call last): File "/b/cbuild/shared_external/chromite/lib/parallel.py", line 440, in _Run self._task(*self._task_args, **self._task_kwargs) File "/b/cbuild/shared_external/src/platform/crostestutils/au_test_harness/cros_au_test_harness.py", line 77, in run raise parallel.BackgroundFailure(msg) BackgroundFailure: Test harness failed. See logs for details. VMTest Error Traceback (most recent call last): File "bin/ctest", line 293, in <module> main() File "bin/ctest", line 284, in main opts.suite) File "bin/ctest", line 210, in RunAUTestHarness error_code_ok=True, kill_timeout=10 * 60) File "/b/cbuild/shared_external/chromite/lib/cros_build_lib.py", line 589, in RunCommand (cmd_result.output, cmd_result.error) = proc.communicate(input) File "/usr/lib/python2.7/subprocess.py", line 751, in communicate self.wait() File "/usr/lib/python2.7/subprocess.py", line 1291, in wait pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0) File "/usr/lib/python2.7/subprocess.py", line 478, in _eintr_retry_call return func(*args) File "/b/cbuild/shared_external/chromite/lib/cros_build_lib.py", line 344, in _KillChildProcess raise TerminateRunCommandError('Received signal %i' % signum, cmd_result) chromite.lib.cros_build_lib.TerminateRunCommandError: return code: 3840; command: /b/cbuild/shared_external/src/platform/crostestutils/au_test_harness/cros_au_test_harness.py '--base_image=/b/cbuild/shared_external/src/build/images/x86-generic/latest-cbuildbot/chromiumos_test_image.bin' '--target_image=/b/cbuild/shared_external/src/build/images/x86-generic/latest-cbuildbot/chromiumos_test_image.bin' '--board=x86-generic' '--type=vm' '--remote=0.0.0.0' --verbose '--jobs=16' '--ssh_private_key=/b/cbuild/shared_external/src/build/images/x86-generic/latest-cbuildbot/id_rsa' '--verify_suite_name=telemetry_unit' '--test_prefix=SimpleTestVerify' '--test_results_root=/b/cbuild/shared_external/chroot/tmp/cbuildbotSosBB7/test_harness' --no_graphics Received signal 15 22:08:35: ERROR: !!!VMTests failed!!! @@@STEP_WARNINGS@@@ 22:08:39: WARNING: Traceback (most recent call last): File "/b/cbuild/shared_external/chromite/cbuildbot/stages/generic_stages.py", line 467, in Run self.PerformStage() File "/b/cbuild/shared_external/chromite/cbuildbot/stages/test_stages.py", line 215, in PerformStage self._RunTest(test_type, test_results_dir) File "/b/cbuild/shared_external/chromite/cbuildbot/stages/test_stages.py", line 204, in _RunTest ssh_private_key=ssh_private_key) File "/b/cbuild/shared_external/chromite/cbuildbot/commands.py", line 675, in RunTestSuite kill_timeout=10 * 60) File "/b/cbuild/shared_external/chromite/lib/cros_build_lib.py", line 589, in RunCommand (cmd_result.output, cmd_result.error) = proc.communicate(input) File "/usr/lib/python2.7/subprocess.py", line 751, in communicate self.wait() File "/usr/lib/python2.7/subprocess.py", line 1291, in wait pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0) File "/usr/lib/python2.7/subprocess.py", line 478, in _eintr_retry_call return func(*args) File "/b/cbuild/shared_external/chromite/lib/timeout_util.py", line 53, in kill_us raise TimeoutError(error_message % {'time': max_run_time}) TimeoutError: Timeout occurred- waited 3600 seconds.
,
Mar 22 2016
I've opened a separate bug for the timeout issue: https://bugs.chromium.org/p/chromium/issues/detail?id=596945 The x86-generic builder is now green, but it seems to always succeed on the retry: https://uberchromegw.corp.google.com/i/chromiumos.chromium/builders/x86-generic-telemetry/builds/8868 I've landed a few CLs disabling failing tests, fixing the test harness to run the test, and the crash during the test. I'm going to close this bug. 596945 is assigned to the shadow gardener this week.
,
Apr 11 2016
,
May 23 2016
Bulk verified |
||||||
►
Sign in to add a comment |
||||||
Comment 1 by achuith@chromium.org
, Mar 15 2016Owner: x...@chromium.org