New issue
Advanced search Search tips
Starred by 2 users
Status: Fixed
Owner:
Closed: Dec 5
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment
HWTest bvt-arc keeps timing out on a few boards
Project Member Reported by drinkcat@chromium.org, Nov 27 Back to list
https://luci-milo.appspot.com/buildbot/chromeos/relm-release/1709
https://luci-milo.appspot.com/buildbot/chromeos/relm-release/1710

End up with:
HWTest [bvt-arc] HWTest [bvt-arc]
[Test-Logs]: Suite job: ABORT

Looking at test result, I see that some jobs complete (most of them towards the end of the 3 hours timeout window, though), and then the rest times out:
11/25 20:54:33.507 DEBUG|             suite:1186| Scheduled 23 tests, writing the total to keyval.
11/25 20:54:33.508 DEBUG|     dynamic_suite:0626| Waiting on suite.
11/25 21:41:19.725 INFO |        server_job:0213| END GOOD	158688602-chromeos-test/chromeos6-row1-rack16-host7/cheets_StartAndroid.stress	cheets_StartAndroid.stress	timestamp=1511674732	localtime=Nov 25 21:38:52	
11/25 21:49:16.711 INFO |        server_job:0213| END GOOD	158688572-chromeos-test/chromeos6-row2-rack15-host2/cheets_CTS_N.CtsAccelerationTestCases	cheets_CTS_N.CtsAccelerationTestCases	
11/25 21:49:16.958 INFO |        server_job:0213| END GOOD	158688587-chromeos-test/chromeos6-row1-rack16-host5/cheets_CTS_N.CtsAppUsageHostTestCases	cheets_CTS_N.CtsAppUsageHostTestCases	timestamp=1511675130	localtime=Nov 25 21:45:30	
11/25 21:49:17.223 INFO |        server_job:0213| END GOOD	158688596-chromeos-test/chromeos6-row1-rack16-host7/cheets_CTS_N.CtsAccountManagerTestCases	cheets_CTS_N.CtsAccountManagerTestCases	timestamp=1511675201	localtime=Nov 25 21:46:41	
11/25 21:49:17.490 INFO |        server_job:0213| END GOOD	158688598-chromeos-test/chromeos6-row2-rack15-host5/cheets_CTS_N.CtsDramTestCases	cheets_CTS_N.CtsDramTestCases	timestamp=1511675216	11/25 23:30:20.938 INFO |        server_job:0213| END GOOD	158688577-chromeos-test/chromeos6-row1-rack15-host17/cheets_CTS_N.CtsAppUsageHostTestCases	cheets_CTS_N.CtsAppUsageHostTestCases	timestamp=1511681075	localtime=Nov 25 23:24:35	
11/25 23:33:02.887 INFO |        server_job:0213| END GOOD	158688582-chromeos-test/chromeos6-row1-rack15-host17/cheets_GTS.GtsNetTestCases	cheets_GTS.GtsNetTestCases	timestamp=1511681433	localtime=Nov 25 23:30:33	
11/25 23:40:48.523 INFO |        server_job:0213| END GOOD	158688575-chromeos-test/chromeos6-row2-rack15-host5/cheets_GTS.GtsAdminTestCases	cheets_GTS.GtsAdminTestCases	timestamp=1511681874	localtime=Nov 25 23:37:54	
11/25 23:40:48.726 INFO |        server_job:0213| END GOOD	158688585-chromeos-test/chromeos6-row1-rack15-host17/cheets_CTS_N.CtsDramTestCases	cheets_CTS_N.CtsDramTestCases	timestamp=1511681824	localtime=Nov 25 23:37:04	
11/25 23:47:33.757 INFO |        server_job:0213| END GOOD	158688580-chromeos-test/chromeos6-row2-rack15-host2/cheets_GTS.GtsPlacementTestCases	cheets_GTS.GtsPlacementTestCases	timestamp=1511682140	localtime=Nov 25 23:42:20	
11/25 23:47:34.001 INFO |        server_job:0213| END GOOD	158688591-chromeos-test/chromeos6-row1-rack15-host17/cheets_CTS_N.CtsAccountManagerTestCases	cheets_CTS_N.CtsAccountManagerTestCases	timestamp=1511682202	localtime=Nov 25 23:43:22	
11/25 23:47:34.244 INFO |        server_job:0213| END GOOD	158688593-chromeos-test/chromeos6-row2-rack15-host5/cheets_CTS_N.CtsAccelerationTestCases	cheets_CTS_N.CtsAccelerationTestCases	timestamp=1511682270	localtime=Nov 25 23:44:30	
11/25 23:49:23.116 INFO |        server_job:0213| END GOOD	158688605-chromeos-test/chromeos6-row2-rack15-host2/cheets_ContainerSmokeTest	cheets_ContainerSmokeTest	timestamp=1511682387	localtime=Nov 25 23:46:27	
11/25 23:49:23.336 INFO |        server_job:0213| END GOOD	158688609-chromeos-test/chromeos6-row1-rack15-host17/cheets_LoginScreen	cheets_LoginScreen	timestamp=1511682326	localtime=Nov 25 23:45:26	
11/25 23:52:15.388 INFO |        server_job:0213| END GOOD	158688611-chromeos-test/chromeos6-row2-rack15-host5/cheets_NotificationTest	cheets_NotificationTest	timestamp=1511682542	localtime=Nov 25 23:49:02	
11/25 23:52:15.591 INFO |        server_job:0213| END GOOD	158688614-chromeos-test/chromeos6-row1-rack15-host17/cheets_DownloadsFilesystem	cheets_DownloadsFilesystem	timestamp=1511682523	localtime=Nov 25 23:48:43	
11/25 23:52:15.798 INFO |        server_job:0213| END GOOD	158688616-chromeos-test/chromeos6-row2-rack15-host2/security_NetworkListeners	security_NetworkListeners	timestamp=1511682564	localtime=Nov 25 23:49:24	
11/25 23:55:15.019 INFO |        server_job:0213| END GOOD	158688621-chromeos-test/chromeos6-row1-rack15-host17/cheets_SELinuxTest	cheets_SELinuxTest	timestamp=1511682759	localtime=Nov 25 23:52:39	
11/25 23:57:35.483 INFO |        server_job:0213| END GOOD	158688623-chromeos-test/chromeos6-row2-rack15-host5/cheets_KeyboardTest	cheets_KeyboardTest	timestamp=1511682835	localtime=Nov 25 23:53:55	
11/25 23:57:35.718 INFO |        server_job:0213| END GOOD	158688626-chromeos-test/chromeos6-row2-rack15-host2/cheets_SettingsBridge	cheets_SettingsBridge	timestamp=1511682843	localtime=Nov 25 23:54:03	
11/25 23:57:35.718 DEBUG|             suite:1379| Adding job keyval for cheets_SettingsBridge=158688626-chromeos-test
11/26 00:00:19.467 INFO |        server_job:0213| START	----	relm-release/R64-10162.0.0/bvt-arc/telemetry_LoginTest.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.468 INFO |        server_job:0213| 	ABORT	----	relm-release/R64-10162.0.0/bvt-arc/telemetry_LoginTest.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.469 INFO |        server_job:0213| END ABORT	----	relm-release/R64-10162.0.0/bvt-arc/telemetry_LoginTest.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.685 INFO |        server_job:0213| START	----	relm-release/R64-10162.0.0/bvt-arc/graphics_Idle.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.685 INFO |        server_job:0213| 	ABORT	----	relm-release/R64-10162.0.0/bvt-arc/graphics_Idle.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.686 INFO |        server_job:0213| END ABORT	----	relm-release/R64-10162.0.0/bvt-arc/graphics_Idle.arc	timestamp=1511683219	localtime=Nov 26 00:00:19	
11/26 00:00:19.912 INFO |        server_job:0213| START	158688631-chromeos-test/chromeos6-row1-rack15-host17/desktopui_ExitOnSupervisedUserCrash.arc	desktopui_ExitOnSupervisedUserCrash.arc	timestamp=1511682903	localtime=Nov 25 23:55:03	
11/26 00:00:19.913 INFO |        server_job:0213| 	GOOD	158688631-chromeos-test/chromeos6-row1-rack15-host17/desktopui_ExitOnSupervisedUserCrash.arc	desktopui_ExitOnSupervisedUserCrash.arc	timestamp=1511682930	localtime=Nov 25 23:55:30	completed successfully
11/26 00:00:19.914 INFO |        server_job:0213| END GOOD	158688631-chromeos-test/chromeos6-row1-rack15-host17/desktopui_ExitOnSupervisedUserCrash.arc	desktopui_ExitOnSupervisedUserCrash.arc	timestamp=1511682930	localtime=Nov 25 23:55:30	
11/26 00:00:19.914 DEBUG|             suite:1379| Adding job keyval for desktopui_ExitOnSupervisedUserCrash.arc=158688631-chromeos-test

Maybe we need more than 5 DUTs in pool:bvt?

 ./site_utils/dut_status.py -b relm -p bvt
hostname                       S   last checked         URL
chromeos6-row1-rack16-host7    OK  2017-11-26 23:21:01  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack16-host7/2172239-repair/
chromeos6-row1-rack16-host5    OK  2017-11-26 23:32:18  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack16-host5/2172277-repair/
chromeos6-row2-rack15-host5    OK  2017-11-26 23:30:41  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row2-rack15-host5/2172269-repair/
chromeos6-row2-rack15-host2    OK  2017-11-26 23:20:51  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row2-rack15-host2/2172237-repair/
chromeos6-row1-rack15-host17   OK  2017-11-26 23:57:39  http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos6-row1-rack15-host17/2172400-provision/
 
Cc: dhadd...@chromium.org
From https://viceroy.corp.google.com/chromeos/suite_details?job_id=158831044 we can see  the 5 DUTs are fully occupied. 

As a temporary fix I add bvt pool from 5 DUTs to 8 DUTs.


But why RESET always fail after autoupdateE2E test which causes a time-consuming repair? cc @dhaddock.
Owner: dhadd...@chromium.org
Oh wow :( I did not know about this RESET check. 

I've checked a few different suites on different boards and you are right it always fails. Even when the suite passes, it still does the unnecessary repair.

I'll look into this more
I think I know what the problem is: the AU test does not clobber stateful, and provision does.

During setup of provision/AU (auto_updater.PreSetupCrOSUpdate()), we create a file /var/tmp/provision_failed 

Then in the RESET job we check that the file is removed. This code assumes that all AU tests will wipe stateful

Project Member Comment 4 by bugdroid1@chromium.org, Nov 28
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/ea64fd26322aaa5b4776ac24272a33e9a682c9a4

commit ea64fd26322aaa5b4776ac24272a33e9a682c9a4
Author: David Haddock <dhaddock@chromium.org>
Date: Tue Nov 28 23:44:23 2017

[chromite] Stop running unnecessary repair after successful AU test.

After every run of autoupdate_EndToEndTest the lab does a RESET job to
make sure the DUT is in a good state. This RESET job fails the check:"Last AU
Passed" every single time whether the AU was succesful or not. It then kicks
off a repair job that takes between 15-30minutes!!!!!!!!

The shared provision code flow creates a file /var/tmp/provision_failed at the
start of the test. This file is cleared when we wipe the stateful
partition. The RESET job assumes that all AU tests will clear the
stateful partition. This is not true for autoupdateEndToEndTest.

This CL will stop creation of /var/tmp/provision_failed in the
au_e2etest.

BUG= chromium:788628 
TEST=autoupdate_e2etest pass and /var/tmp/provision_failed not there

Change-Id: Ieea0b942312a92ac5849d6f4d8114c8e79652344
Reviewed-on: https://chromium-review.googlesource.com/791911
Commit-Ready: David Haddock <dhaddock@chromium.org>
Tested-by: David Haddock <dhaddock@chromium.org>
Reviewed-by: Xixuan Wu <xixuan@chromium.org>

[modify] https://crrev.com/ea64fd26322aaa5b4776ac24272a33e9a682c9a4/lib/auto_updater.py

Cc: edcourtney@chromium.org yusukes@chromium.org pho...@chromium.org
 Issue 789420  has been merged into this issue.
Status: Assigned
Summary: HWTest bvt-arc keeps timing out on a few boards (was: relm-release: HWTest bvt-arc keeps timing out)
Waiting on devserver push for the change in #4 to kick in 
I don't think quawks paladin timed out. I think it just failed one of the tests:
http://cautotest-prod.corp.google.com/afe/#tab_id=view_job&object_id=159323506
Status: Fixed
Confirmed that the AU repair issue is fixed. Closing now. Reopen if we still see timeouts 
Sign in to add a comment