CQ: several builders failed HWTest with "Android did not boot!" errors
Issue description

Seems to have been an intermittent ssh failure in the provisioning step for this build:
https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1329

HWTest log:
Triggered task: cyan-paladin/R57-9169.0.0-rc3-arc-bvt-cq
Waiting for results from the following shards: 0
Waiting for results from the following shards: 0
Waiting for results from the following shards: 0
Waiting for results from the following shards: 0
Waiting for results from the following shards: 0
chromeos-server22-273: 33a264d525da6110 1
Autotest instance: cautotest
01-10-2017 [10:10:47] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95510916
@@@STEP_LINK@Link to suite@http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95510916@@@
The suite job has another 2:29:45.064020 till timeout.
The suite job has another 1:59:38.377287 till timeout.
Suite job [ PASSED ]
video_ChromeRTCHWEncodeUsed.arc [ INFO ]
video_ChromeRTCHWEncodeUsed.arc TEST_NA: Skipping: test not supported on this board.
provision [ FAILED ]
provision FAIL: Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row12-rack11-host3: SSHConnectionError: ssh: connect to host chromeos4-row12-rack11-host3 port 22: Connection timed out
cheets_CTS.com.android.cts.dram [ FAILED ]
cheets_CTS.com.android.cts.dram FAIL: cheets_CTSHelper client test did not pass.
cheets_GTS.google.admin [ FAILED ]
cheets_GTS.google.admin FAIL: cheets_CTSHelper client test did not pass.
cheets_ContainerSmokeTest [ FAILED ]
cheets_ContainerSmokeTest FAIL: Android did not boot!
cheets_NotificationTest [ FAILED ]

https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1332 (log looks the same as 1329)

https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_minnie-paladin/builds/1329
chromeos-server31-209: 33a09074e14b7b10 1
Autotest instance: cautotest
01-10-2017 [01:39:11] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95446250
@@@STEP_LINK@Link to suite@http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95446250@@@
The suite job has another 2:29:46.625109 till timeout.
Suite job [ PASSED ]
cheets_CTS.6.0_r13.x86.com.android.cts.dram [ INFO ]
cheets_CTS.6.0_r13.x86.com.android.cts.dram TEST_NA: Skipping: test not supported on this board.
cheets_CTS.6.0_r13.x86.android.core.tests.libcore.package.harmony_java_math [ INFO ]
cheets_CTS.6.0_r13.x86.android.core.tests.libcore.package.harmony_java_math TEST_NA: Skipping: test not supported on this board.
video_ChromeHWDecodeUsed.vp9.arc [ INFO ]
video_ChromeHWDecodeUsed.vp9.arc TEST_NA: Skipping: test not supported on this board.
video_ChromeCameraMJpegHWDecodeUsed.arc [ INFO ]
video_ChromeCameraMJpegHWDecodeUsed.arc TEST_NA: Skipping: test not supported on this board.
cheets_CTS.android.core.tests.libcore.package.harmony_java_math [ FAILED ]
cheets_CTS.android.core.tests.libcore.package.harmony_java_math FAIL: cheets_CTSHelper client test did not pass.
cheets_CTS.com.android.cts.dram [ FAILED ]
cheets_CTS.com.android.cts.dram FAIL: cheets_CTSHelper client test did not pass.
cheets_GTS.google.admin [ FAILED ]
cheets_GTS.google.admin FAIL: cheets_CTSHelper client test did not pass.
cheets_ContainerSmokeTest [ FAILED ]
cheets_ContainerSmokeTest FAIL: Android did not boot

https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_minnie-paladin/builds/1332 (log looks the same as 1329)

Not sure if networking connectivity also explains some of the intervening failures on cyan-paladin, which have a different signature:
https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1330
chromeos-server31-298: 33a118e994fdcd10 1
Autotest instance: cautotest
01-10-2017 [04:08:15] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95460355
@@@STEP_LINK@Link to suite@http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=95460355@@@
The suite job has another 2:29:47.216332 till timeout.
Suite job [ PASSED ]
video_ChromeRTCHWEncodeUsed.arc [ INFO ]
video_ChromeRTCHWEncodeUsed.arc TEST_NA: Skipping: test not supported on this board.
video_ChromeHWDecodeUsed [ PASSED ]
telemetry_LoginTest [ PASSED ]
video_VideoSanity [ PASSED ]
video_ChromeHWDecodeUsed [ PASSED ]
video_ChromeRTCHWDecodeUsed [ PASSED ]
video_VideoSanity [ PASSED ]
security_NetworkListeners [ PASSED ]
cheets_SettingsBridge [ FAILED ]
cheets_SettingsBridge FAIL: adb is not ready in 60 seconds.
graphics_Idle [ PASSED ]
video_ChromeRTCHWDecodeUsed [ PASSED ]
video_ChromeHWDecodeUsed [ PASSED ]
desktopui_ExitOnSupervisedUserCrash [ PASSED ]
video_VideoSanity [ PASSED ]
cheets_ContainerSmokeTest [ FAILED ]
cheets_ContainerSmokeTest FAIL: adb is not ready in 60 seconds.
cheets_ContainerSmokeTest retry_count: 1
cheets_CTS.com.android.cts.dram [ FAILED ]
cheets_CTS.com.android.cts.dram FAIL: Error: Failed to set up adb connection
cheets_CTS.com.android.cts.dram retry_count: 1
cheets_CTS.android.core.tests.libcore.package.harmony_java_math [ FAILED ]
cheets_CTS.android.core.tests.libcore.package.harmony_java_math FAIL: Error: Failed to set up adb connection
cheets_CTS.android.core.tests.libcore.package.harmony_java_math retry_count: 1
cheets_GTS.google.admin [ FAILED ]
cheets_GTS.google.admin FAIL: Error: Failed to set up adb connection
cheets_GTS.google.admin retry_count: 1
cheets_CTS.android.core.tests.libcore.package.harmony_java_math [ FAILED ]
cheets_CTS.android.core.tests.libcore.package.harmony_java_math FAIL: Error: Failed to set up adb connection
cheets_CTS.android.core.tests.libcore.package.harmony_java_math retry_count: 1
cheets_CTS.com.android.cts.dram [ FAILED ]
cheets_CTS.com.android.cts.dram FAIL: Error: Failed to set up adb connection
cheets_CTS.com.android.cts.dram retry_count: 1
cheets_NotificationTest [ FAILED ]
cheets_NotificationTest FAIL: adb is not ready in 60 seconds.
cheets_NotificationTest retry_count: 1

But for that build, veyron_minnie succeeded:
https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_minnie-paladin/builds/1330
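
Since the builds above differ mainly in their failure signature ("Android did not boot!" versus "adb is not ready in 60 seconds"), it can help to pull out just the reason text when comparing logs. A minimal sketch in C, assuming the HWTest log is read one entry per line; `fail_reason` is a hypothetical helper and not part of any Chrome OS tool:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the reason text that follows
 * " FAIL: " in one log line, or NULL if the line carries no failure reason.
 * Status lines such as "name [ FAILED ]" do not match, because the marker
 * requires the trailing colon.
 */
const char *fail_reason(const char *line)
{
    static const char marker[] = " FAIL: ";
    const char *hit = strstr(line, marker);

    /* sizeof(marker) counts the terminating NUL, hence the -1. */
    return hit ? hit + sizeof(marker) - 1 : NULL;
}
```

Grouping builds by the returned string would separate 1329 and 1332 from 1330 without reading each log by hand.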
,
Jan 11 2017
+ARC constables

I first suspected the IP address change referred to in #1, but the CQ failures with "Android did not boot!" do not appear to always coincide with the breaking changes: https://luci-milo.appspot.com/buildbot/chromeos/veyron_minnie-paladin/
,
Jan 11 2017
,
Jan 11 2017
The CQ has failed dozens of times in a row, and nearly half of those failures involve this issue.
,
Jan 11 2017
https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1332 is caused by a crash in session_manager. I believe the culprit is https://chromium-review.googlesource.com/#/c/425879/2/libcontainer/libcontainer.c

Thread 0 (crashed)
 0  libc-2.23.so!strlen + 0x26
    rax = 0x0000000000000000   rdx = 0x0000000000000000
    rcx = 0x0000000000000000   rbx = 0x00000000ffffffea
    rsi = 0x000061ec3be348d0   rdi = 0x0000000000000000
    rbp = 0x00007ffd85a59bb0   rsp = 0x00007ffd85a59b98
     r8 = 0x007372656e696174    r9 = 0x000061ec3be34a00
    r10 = 0x0000000000000010   r11 = 0x00007c27e6c188e0
    r12 = 0x0000000000000000   r13 = 0x000061ec3bdb4318
    r14 = 0x000061ec3be47130   r15 = 0x000061ec3be348d0
    rip = 0x00007c27e6b22006
    Found by: given as instruction pointer in context
 1  libc-2.23.so!__strdup [strdup.c : 41 + 0x5]
    rbx = 0x00000000ffffffea   rbp = 0x00007ffd85a59bb0
    rsp = 0x00007ffd85a59ba0   r12 = 0x0000000000000000
    r13 = 0x000061ec3bdb4318   r14 = 0x000061ec3be47130
    r15 = 0x000061ec3be348d0   rip = 0x00007c27e6b21d0f
    Found by: call frame info
 2  libcontainer.so!container_start [libcontainer.c : 1069 + 0xf]
    rbx = 0x00000000ffffffea   rbp = 0x00007ffd85a5a040
    rsp = 0x00007ffd85a59bc0   r12 = 0x000061ec3be47130
    r13 = 0x000061ec3bdb4318   r14 = 0x000061ec3be47130
    r15 = 0x000061ec3be348d0   rip = 0x00007c27e7819374
    Found by: call frame info
 3  session_manager!login_manager::ContainerManagerImpl::StartContainer(base::Callback<void (int, bool), (base::internal::CopyMode)1> const&) [container_manager_impl.cc : 167 + 0x8]
    rbx = 0x00007ffd85a5a078   rbp = 0x00007ffd85a5ad00
    rsp = 0x00007ffd85a5a050   r12 = 0x000061ec3be47130
    r13 = 0x000061ec3bdb4318   r14 = 0x00007c27e715c238
    r15 = 0x000061ec3bdb4330   rip = 0x000061ec3b5cb7be
    Found by: call frame info
 4  session_manager!login_manager::SessionManagerImpl::StartArcInstanceInternal(bool*, char const**, std::string*) [session_manager_impl.cc : 1117 + 0xc]
    rbx = 0x00007ffd85a5ad90   rbp = 0x00007ffd85a5af80
    rsp = 0x00007ffd85a5ad10   r12 = 0x00007ffd85a5aff8
    r13 = 0x000061ec3bde2d90   r14 = 0x000061ec3bdb4318
    r15 = 0x00007ffd85a5ada0   rip = 0x000061ec3b5de979
    Found by: call frame info
,
Jan 11 2017
Nice find! Lowering the priority since the CQ is now secured by Verified-1. Just in case, I'll keep looking into the "adb is not ready in 60 seconds" failure case.
,
Jan 11 2017
Yup, I suspect there is another culprit (and one suspect is mine). I'm taking a look at https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1331 now, which does not include vapier@'s change.
,
Jan 11 2017
+abhishekbh

https://uberchromegw.corp.google.com/i/chromeos/builders/cyan-paladin/builds/1331 is failing on ARC check-in because network is down:

01-10 23:34:17.580 371 716 E CheckinTask: Checkin failed: https://android.clients.google.com/checkin (request #0): java.net.UnknownHostException: Unable to resolve host "android.clients.google.com": No address associated with hostname

So it's highly likely this is caused by container IP changes.
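
The UnknownHostException above is a name-resolution failure rather than a refused or timed-out connection. A quick way to confirm that distinction is to ask the resolver directly. The following is only a sketch in C, assuming a POSIX environment with `getaddrinfo`; `host_resolves` is a hypothetical helper, and nothing here reflects how CheckinTask itself performs the lookup:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/*
 * Hypothetical helper: return 1 if the resolver yields at least one address
 * for hostname, 0 otherwise. No connection is attempted, so a result of 0
 * points at DNS or resolver configuration rather than at the remote server.
 */
int host_resolves(const char *hostname)
{
    struct addrinfo hints;
    struct addrinfo *result = NULL;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM; /* avoid one entry per socket type */

    if (getaddrinfo(hostname, NULL, &hints, &result) != 0)
        return 0;

    freeaddrinfo(result);
    return 1;
}
```

Calling `host_resolves("android.clients.google.com")` from inside the affected network namespace would be expected to return 0 while the container has no working DNS, which is consistent with the container IP change theory but does not prove it.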
,
Jan 18 2017
since it was only failing in the CQ, and nya@ triaged this to find the bad CL (one of mine), there's nothing left to do here -- i fixed the CL to handle the NULL case
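
For context on the NULL case: the backtrace shows `strlen` faulting with rdi = 0 after being called from `__strdup`, which `container_start` called at libcontainer.c line 1069. Passing NULL to `strdup` is undefined behavior, so an optional string has to be checked before it is copied. The sketch below shows one way to do that in C. It is not the actual change made in the CL, and `dup_optional_string` is a hypothetical name:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: copy src into a newly allocated buffer stored in *dst.
 * A NULL src is treated as "value not set" and yields *dst == NULL.
 * Returns 0 on success and -1 if the allocation fails.
 */
int dup_optional_string(char **dst, const char *src)
{
    size_t len;
    char *copy;

    if (src == NULL) {
        *dst = NULL; /* nothing to copy; strlen/strdup must not see NULL */
        return 0;
    }

    len = strlen(src) + 1; /* include the terminating NUL */
    copy = malloc(len);
    if (copy == NULL)
        return -1;

    memcpy(copy, src, len);
    *dst = copy;
    return 0;
}
```

The caller owns the returned buffer and releases it with `free`, which is also safe to call on the NULL produced for an unset value.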
,
Mar 4 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
,
Jun 21 2018
Comment 1 by abhishekbh@google.com
, Jan 10 2017