[USB_Detect - Peach-pit] platform_ExternalUsbPeripherals.detect.reboot_login test fails with 'Client job was aborted'
Issue description

DUT: Peach-pit
Host: chromeos15-row13a-rack1-host11
Failure reason: client job was aborted

Sometimes other platform_ExternalUsbPeripherals.* tests fail with no failure reason.
https://stainless.corp.google.com/search?view=list&first_date=2018-06-27&last_date=2018-07-10&suite=usb_detect&test=platform_ExternalUsbPeripherals*&build=%5ER69*&board=%5Epeach_pit%24&status=FAIL&status=ERROR&status=ABORT&exclude_cts=false&exclude_not_run=true&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true

Failures screenshot: https://screenshot.googleplex.com/CsieW21aPBe

From debug logs:
==========================================
07/10 08:03:49.631 DEBUG| server_job:1370| Client state file /usr/local/autotest/results/215794987-chromeos-test/platform_ExternalUsbPeripherals.detect.reboot_login/control.autoserv.state not found
07/10 08:03:49.666 DEBUG| base_job:0399| Persistent state client.* deleted
07/10 08:03:49.677 DEBUG| autotest:1122| Autotest job finishes.
07/10 08:03:49.677 DEBUG| test:0410| Test failed due to client job was aborted. Exception log follows the after_iteration_hooks.
07/10 08:03:49.677 DEBUG| test:0415| Starting after_iteration_hooks for platform_ExternalUsbPeripherals.detect.reboot_login
07/10 08:03:49.678 DEBUG| test:0420| after_iteration_hooks completed
07/10 08:03:49.678 WARNI| test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 631, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 831, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 495, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 362, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 400, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/platform_ExternalUsbPeripherals/platform_ExternalUsbPeripherals.py", line 353, in run_once
    self.action_login()
  File "/usr/local/autotest/server/site_tests/platform_ExternalUsbPeripherals/platform_ExternalUsbPeripherals.py", line 64, in action_login
    exit_without_logout=True)
  File "/usr/local/autotest/server/autotest.py", line 638, in run_test
    *args, **dargs)
  File "/usr/local/autotest/server/autotest.py", line 626, in run_timed_test
    client_disconnect_timeout=client_disconnect_timeout)
  File "/usr/local/autotest/server/autotest.py", line 479, in run
    client_disconnect_timeout, use_packaging=use_packaging)
  File "/usr/local/autotest/server/autotest.py", line 562, in _do_run
    client_disconnect_timeout=client_disconnect_timeout)
  File "/usr/local/autotest/server/autotest.py", line 1054, in execute_control
    logger, client_disconnect_timeout)
  File "/usr/local/autotest/server/autotest.py", line 999, in execute_section
    raise err
AutotestRunError: client job was aborted
==========================================
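For anyone triaging similar runs, here is a rough sketch of pulling the failure reason out of debug log lines like the ones above. It is not part of autotest: the `MM/DD HH:MM:SS.mmm LEVEL| module:line| message` layout is only inferred from this excerpt, and `parse_log_line` and `find_failure_reason` are hypothetical helper names.

```python
import re

# Assumed layout, inferred from the excerpt in this bug (not an official spec):
#   "MM/DD HH:MM:SS.mmm LEVEL| module:line| message"
LOG_LINE = re.compile(
    r"(?P<date>\d{2}/\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\|\s*(?P<module>\w+):(?P<lineno>\d+)\|\s*(?P<message>.*)"
)


def parse_log_line(line):
    """Split one debug log line into fields; return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["lineno"] = int(record["lineno"])
    return record


def find_failure_reason(lines):
    """Return the text after 'Test failed due to ', up to the first period."""
    prefix = "Test failed due to "
    for line in lines:
        record = parse_log_line(line)
        if record and record["message"].startswith(prefix):
            return record["message"][len(prefix):].split(".")[0]
    return None


# Two lines copied from the debug log in this report.
SAMPLE = [
    "07/10 08:03:49.677 DEBUG| autotest:1122| Autotest job finishes.",
    "07/10 08:03:49.677 DEBUG| test:0410| Test failed due to client job was "
    "aborted. Exception log follows the after_iteration_hooks.",
]

print(find_failure_reason(SAMPLE))
```

Traceback lines do not match the pattern and are simply skipped, so the same helper can be run over a whole log file.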
,
Jul 10
Actually the hostname is chromeos15-row13a-rack1-host11. Plug this one into the command above. I am puzzled that no other tests are failing like that. And device space looks OK now:

$ ssh root@chromeos15-row13a-rack1-host11
localhost ~ # df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/root                1.2G  939M  283M  77% /
devtmpfs                1000M     0 1000M   0% /dev
tmp                     1001M  136K 1001M   1% /tmp
run                     1001M  432K 1000M   1% /run
shmfs                   1001M  2.3M  998M   1% /dev/shm
/dev/mmcblk0p1            11G  1.3G  8.4G  14% /mnt/stateful_partition
/dev/mmcblk0p8            12M   28K   12M   1% /usr/share/oem
/dev/mapper/encstateful  3.1G   30M  3.0G   1% /mnt/stateful_partition/encrypted
media                   1001M     0 1001M   0% /media
none                    1001M     0 1001M   0% /sys/fs/cgroup
imageloader             1001M     0 1001M   0% /run/imageloader

Still, re-imaging should help.
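If we want to check device space the same way on other DUTs without eyeballing the table, something like the sketch below could work. It only parses `df -h` text that has already been captured; `parse_df`, `over_threshold`, and the 90% default are assumptions made for illustration, not existing lab tooling.

```python
def parse_df(output):
    """Parse `df -h` text into a list of dicts, one per filesystem row."""
    rows = []
    for line in output.strip().splitlines()[1:]:  # skip the header row
        parts = line.split()
        if len(parts) < 6:
            continue  # not a complete df row
        filesystem, size, used, avail, use_pct = parts[:5]
        rows.append({
            "filesystem": filesystem,
            "size": size,
            "used": used,
            "avail": avail,
            "use_pct": int(use_pct.rstrip("%")),
            "mount": " ".join(parts[5:]),
        })
    return rows


def over_threshold(rows, threshold_pct=90):
    """Return mount points whose usage is at or above threshold_pct.

    The 90% default is an arbitrary value chosen for this sketch.
    """
    return [row["mount"] for row in rows if row["use_pct"] >= threshold_pct]


# A few rows copied from the df output in this comment.
SAMPLE = """\
Filesystem Size Used Avail Use% Mounted on
/dev/root 1.2G 939M 283M 77% /
/dev/mmcblk0p1 11G 1.3G 8.4G 14% /mnt/stateful_partition
/dev/mapper/encstateful 3.1G 30M 3.0G 1% /mnt/stateful_partition/encrypted
"""

rows = parse_df(SAMPLE)
print(over_threshold(rows))      # default threshold
print(over_threshold(rows, 75))  # stricter threshold
```

With the rows above, nothing reaches the 90% default and only the root filesystem (77%) is at or above 75%, which matches the reading that space is not the problem here.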
,
Jul 10
If not, we have to get another peach_pit DUT.
,
Jul 10
Re-imaged DUT with command from #1. Will wait for the tests to run and update the status later.
,
Jul 13
The test is still failing, and the latest file system info is about the same as above.

localhost ~ # df -h
Filesystem               Size  Used Avail Use% Mounted on
/dev/root                1.2G  962M  259M  79% /
devtmpfs                1000M     0 1000M   0% /dev
tmp                     1001M  120K 1001M   1% /tmp
run                     1001M  428K 1000M   1% /run
shmfs                   1001M  4.3M  996M   1% /dev/shm
/dev/mmcblk0p1            11G  1.4G  8.4G  14% /mnt/stateful_partition
/dev/mmcblk0p8            12M   28K   12M   1% /usr/share/oem
/dev/mapper/encstateful  3.0G   47M  3.0G   2% /mnt/stateful_partition/encrypted
media                   1001M     0 1001M   0% /media
none                    1001M     0 1001M   0% /sys/fs/cgroup
imageloader             1001M     0 1001M   0% /run/imageloader

@kalin, should we send the request for another peach_pit DUT as mentioned?
,
Jul 13
Not yet. Let's figure out why ONLY this test is affected. If needed, let's remove this test from the suite.
,
Jul 13
Oh, I see: this is the ONLY test that REBOOTs as its first step. For all the other tests the first step is LOGIN. Yes, let's replace the DUT.
,
Jul 16
Recovered DUT with USB stick on Friday 07/13, but forgot to unlock the device. Will update status once we have results.
,
Jul 17
Test is still failing. Screenshot: https://screenshot.googleplex.com/ZGMUKknZogQ
,
Jul 18
Rebooted Servo and ran test twice locally on the same host and the test is passing now. Will update autotest results later.
,
Jul 18
Hah, that's a new 'good-to-know-about-servo'. Hopefully no further action will be needed.
,
Jul 20
No luck, test is still failing. Screenshot: https://screenshot.googleplex.com/NSgxUqLKJY7
,
Jul 20
Will try replacing the Servo SD card.
,
Jul 23
This issue is still going on (though the test passes sometimes). Did you replace the SD card?
,
Jul 23
Not yet. Will do it this afternoon.
,
Jul 23
Replaced SD card on Servo. Now the Servo is not pingable at all. Tried changing network cables, rebooting a few times, etc.
,
Jul 25
Test is still failing; filed a report at go/acs-device-failure.
,
Jul 25
OK, then let's replace the DUT. Be sure we do not send the old device away before the new one is in the lab cell. Thanks.
,
Aug 3
This bug has an owner, thus, it's been triaged. Changing status to "assigned".
,
Sep 28
Did we eventually replace the DUT? The test is still mostly failing. There were a few good runs recently. https://stainless.corp.google.com/search?view=matrix&row=build&col=test&days=28&suite=usb_detect&board=peach_pit&exclude_cts=false&exclude_not_run=true&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true
Comment 1 by ka...@chromium.org
, Jul 10