New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 915841 link

Starred by 2 users

Issue metadata

Status: Closed
Owner:
Closed: Today
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug



Sign in to add a comment

tast.mustpass-android failure on reef-paladin

Project Member Reported by dverkamp@chromium.org, Dec 17

Issue description

This CQ run seems to have failed on reef-paladin due to tast.mustpass-android: https://ci.chromium.org/p/chromeos/builders/luci.chromeos.general/CQ/b8926950010744113040

I'm totally lost trying to figure out where the logs for this test end up, but none of the CLs in the run seem relevant to Android at a first glance.

Please reassign to a more appropriate owner - I am not sure where this test even lives in the tree to find the right assignee.

Some possibly relevant output from the SkylabHWTest step:

2018-12-16 14:52:34,473 INFO | ################# SUITE REPORTING #################
2018-12-16 14:52:34,473 INFO | Suite Job bvt-tast-cq COMPLETED (FAILURE)
2018-12-16 14:52:34,473 INFO | Start outputing test results:
2018-12-16 14:52:34,473 INFO | tast.mustpass-chrome    COMPLETED (SUCCESS)
2018-12-16 14:52:34,473 INFO | tast.mustpass-android   COMPLETED (FAILURE)
2018-12-16 14:52:34,474 INFO | tast.mustpass-system    COMPLETED (SUCCESS)
2018-12-16 14:52:34,474 INFO | Links to tests:
2018-12-16 14:52:34,474 INFO | Suite Job bvt-tast-cq https://chromeos-swarming.appspot.com/user/task/41d1fcfde3d8d811
2018-12-16 14:52:34,474 INFO | tast.mustpass-chrome  https://chromeos-swarming.appspot.com/user/task/41d1fd4873935510
2018-12-16 14:52:34,474 INFO | tast.mustpass-android  https://chromeos-swarming.appspot.com/user/task/41d1fd3cc000e610
2018-12-16 14:52:34,474 INFO | tast.mustpass-system  https://chromeos-swarming.appspot.com/user/task/41d1fd2f69b0ea10
2018-12-16 14:52:34,474 INFO | Links for buildbot:
2018-12-16 14:52:34,614 INFO | Will return from run_suite_skylab.py with status: ERROR
######## END Output for buildbot annotations ######## 
 
Cc: rspangler@chromium.org akes...@chromium.org derat@chromium.org ihf@chromium.org
Components: -Tests>Tast -Infra>Client>ChromeOS>CI Infra>Client>ChromeOS>Test
Owner: dverkamp@chromium.org
http://go/tast-infra has some more details, but in short, these tast.mustpass-* tests are just thin Autotest server tests that run Tast tests. And yes, the way that these results are displayed needs some help (issue 905360). :-)

The actual results are at http://stainless/browse/chromeos-autotest-results/swarming-41d1fd3cc000e611/ (found by logging in through the Luci error message at the bottom of the SkyLab logs page linked off the failed builder stage).

It looks like there was an Autotest issue that made everything blow up before the tast process was executed:

12/16 14:43:01.459 INFO |            repair:0349| Verifying this condition: Python on the host is installed and working
12/16 14:43:01.962 INFO |        server_job:0217| 	GOOD	----	verify.python	timestamp=1545000181	localtime=Dec 16 14:43:01	
12/16 14:43:01.963 INFO |            repair:0349| Verifying this condition: Legacy host verification checks
12/16 14:43:02.885 INFO |      base_classes:0356| Checking for >= 0.7 GB of space under /usr/local/autotest on machine chromeos6-row3-rack12-host15
12/16 14:43:03.857 INFO |      base_classes:0377| Found 23.064 GB >= 0.7 GB of space under /usr/local/autotest on machine chromeos6-row3-rack12-host15
12/16 14:43:04.362 INFO |      base_classes:0392| Checking for >= 100000 i-nodes under /mnt/stateful_partition on machine chromeos6-row3-rack12-host15
12/16 14:43:04.945 INFO |      base_classes:0401| Found 1571086 >= 100000 i-nodes under /mnt/stateful_partition on machine chromeos6-row3-rack12-host15
12/16 14:43:04.945 INFO |      base_classes:0356| Checking for >= 0.7 GB of space under /mnt/stateful_partition on machine chromeos6-row3-rack12-host15
12/16 14:43:06.021 INFO |      base_classes:0377| Found 23.064 GB >= 0.7 GB of space under /mnt/stateful_partition on machine chromeos6-row3-rack12-host15
12/16 14:43:06.494 INFO |      base_classes:0356| Checking for >= 0.1 GB of space under /mnt/stateful_partition/encrypted on machine chromeos6-row3-rack12-host15
12/16 14:43:07.567 INFO |      base_classes:0377| Found 7.726 GB >= 0.1 GB of space under /mnt/stateful_partition/encrypted on machine chromeos6-row3-rack12-host15
12/16 14:43:09.062 ERROR|             utils:0287| [stderr] [1216/144308.408653:INFO:update_engine_client.cc(535)] Querying Update Engine status...
12/16 14:48:09.103 ERROR|            repair:0354| Failed: Legacy host verification checks
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 351, in _verify_host
    self.verify(host)
  File "/usr/local/autotest/server/hosts/repair_utils.py", line 62, in verify
    host.verify_software()
  File "/usr/local/autotest/server/hosts/cros_host.py", line 1168, in verify_software
    self.verify_cros_version_label()
  File "/usr/local/autotest/server/hosts/cros_host.py", line 972, in verify_cros_version_label
    host__hostname=self.hostname)
  File "/usr/local/autotest/server/frontend.py", line 534, in get_labels
    labels = self.run('get_labels', **dargs)
  File "/usr/local/autotest/server/cros/dynamic_suite/frontend_wrappers.py", line 131, in run
    self, call, **dargs)
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 249, in GenericRetry
    return _run()
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 176, in _Wrapper
    self._retry_delay.Sleep(attempt)
  File "/usr/local/autotest/site-packages/chromite/lib/retry_util.py", line 72, in Sleep
    time.sleep(total)
  File "/usr/local/autotest/site-packages/chromite/lib/timeout_util.py", line 88, in kill_us
    raise TimeoutError(error_message % {'time': max_run_time})
TimeoutError: Timeout occurred- waited 300.0 seconds.
12/16 14:48:09.105 INFO |        server_job:0217| 	FAIL	----	verify.cros	timestamp=1545000489	localtime=Dec 16 14:48:09	Timeout occurred- waited 300.0 seconds.
12/16 14:48:09.105 INFO |            repair:0349| Verifying this condition: /dev/kvm should exist if device supports Linux VMs
12/16 14:48:09.583 INFO |        server_job:0217| 	GOOD	----	verify.ec_reset	timestamp=1545000489	localtime=Dec 16 14:48:09	
12/16 14:48:09.584 INFO |            repair:0110| Skipping this operation: All host verification checks pass
12/16 14:48:09.585 ERROR|             reset:0037| Reset failed due to Exception.
Traceback (most recent call last):
  File "/usr/local/autotest/server/control_segments/reset", line 33, in reset
    target.verify()
  File "/usr/local/autotest/server/hosts/cros_host.py", line 1190, in verify
    self._repair_strategy.verify(self)
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 756, in verify
    self._verify_root._verify_host(host, silent)
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 348, in _verify_host
    self._verify_dependencies(host, silent)
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 226, in _verify_dependencies
    self._verify_list(host, self._dependency_list, silent)
  File "/usr/local/autotest/client/common_lib/hosts/repair.py", line 216, in _verify_list
    raise AutoservVerifyDependencyError(self, failures)
AutoservVerifyDependencyError: Timeout occurred- waited 300.0 seconds.
12/16 14:48:09.586 INFO |        server_job:0217| END FAIL	----	reset	timestamp=1545000489	localtime=Dec 16 14:48:09	
12/16 14:48:10.598 ERROR|         traceback:0013| Traceback (most recent call last):
12/16 14:48:10.599 ERROR|         traceback:0013|   File "/usr/local/autotest/server/autoserv", line 558, in run_autoserv
12/16 14:48:10.599 ERROR|         traceback:0013|     job.reset(job_labels)
12/16 14:48:10.599 ERROR|         traceback:0013|   File "/usr/local/autotest/server/server_job.py", line 510, in reset
12/16 14:48:10.599 ERROR|         traceback:0013|     self._execute_code(RESET_CONTROL_FILE, namespace, protect=False)
12/16 14:48:10.600 ERROR|         traceback:0013|   File "/usr/local/autotest/server/server_job.py", line 1340, in _execute_code
12/16 14:48:10.600 ERROR|         traceback:0013|     execfile(code_file, namespace, namespace)
12/16 14:48:10.600 ERROR|         traceback:0013|   File "/usr/local/autotest/server/control_segments/reset", line 48, in <module>
12/16 14:48:10.600 ERROR|         traceback:0013|     job.parallel_simple(reset, machines)
12/16 14:48:10.601 ERROR|         traceback:0013|   File "/usr/local/autotest/server/server_job.py", line 619, in parallel_simple
12/16 14:48:10.601 ERROR|         traceback:0013|     log=log, timeout=timeout, return_results=return_results)
12/16 14:48:10.601 ERROR|         traceback:0013|   File "/usr/local/autotest/server/subcommand.py", line 98, in parallel_simple
12/16 14:48:10.601 ERROR|         traceback:0013|     function(arg)
12/16 14:48:10.602 ERROR|         traceback:0013|   File "/usr/local/autotest/server/control_segments/reset", line 41, in reset
12/16 14:48:10.602 ERROR|         traceback:0013|     raise Exception('')
12/16 14:48:10.602 ERROR|         traceback:0013| Exception
12/16 14:48:10.611 INFO |            client:0570| Attempting refresh to obtain initial access_token
12/16 14:48:10.677 INFO |            client:0873| Refreshing access_token
12/16 14:48:11.144 ERROR|          autoserv:0810| Uncaught SystemExit with code 1
Traceback (most recent call last):
  File "/usr/local/autotest/server/autoserv", line 806, in main
    use_ssp)
  File "/usr/local/autotest/server/autoserv", line 616, in run_autoserv
    sys.exit(exit_code)
SystemExit: 1

Sorry, this is probably in the sheriff's purview, so back to you. Happy to answer any Tast questions, but it looks like things even didn't get that far.
Labels: M-73
get_labels() sounds like a failure to look up the database for the installed cros version label.
Status: Assigned (was: Untriaged)
This issue has an owner, a component and a priority, but is still listed as untriaged or unconfirmed. By definition, this bug is triaged. Changing status to "assigned". Please reach out to me if you disagree with how I've done this.

Comment 4 by dverkamp@chromium.org, Today (10 hours ago)

Status: Closed (was: Assigned)
I haven't had a chance to look into this any further, but I haven't seen it reoccur, so I'm going to mark it up as a test flake and close this bug.

Sign in to add a comment