veyron_speedy-paladin: provision failed with crashes |
|||||||
Issue descriptionhttps://luci-milo.appspot.com/buildbot/chromeos/veyron_speedy-paladin/6944 https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/149871424-chromeos-test START provision_AutoUpdate provision_AutoUpdate timestamp=1508326802 localtime=Oct 18 04:40:02 START ---- ---- timestamp=1508326828 localtime=Oct 18 04:40:28 GOOD ---- sysinfo.before timestamp=1508326828 localtime=Oct 18 04:40:28 END GOOD ---- ---- timestamp=1508326828 localtime=Oct 18 04:40:28 FAIL provision_AutoUpdate provision_AutoUpdate timestamp=1508328820 localtime=Oct 18 05:13:40 Unhandled DevServerException: CrOS auto-update failed for host chromeos4-row4-rack12-host19: 0) SSHConnectionError: ssh: connect to host chromeos4-row4-rack12-host19 port 22: Connection timed out , 1) SSHConnectionError: ssh: connect to host 100.115.195.176 port 22: Connection timed out , Traceback (most recent call last): File "/usr/local/autotest/client/common_lib/test.py", line 806, in _call_test_function return func(*args, **dargs) File "/usr/local/autotest/client/common_lib/test.py", line 470, in execute dargs) File "/usr/local/autotest/client/common_lib/test.py", line 347, in _call_run_once_with_retry postprocess_profiled_run, args, dargs) File "/usr/local/autotest/client/common_lib/test.py", line 380, in _call_run_once self.run_once(*args, **dargs) File "/usr/local/autotest/server/site_tests/provision_AutoUpdate/provision_AutoUpdate.py", line 121, in run_once with_cheets=with_cheets) File "/usr/local/autotest/server/afe_utils.py", line 124, in machine_install_and_update_labels *args, **dargs) File "/usr/local/autotest/server/hosts/cros_host.py", line 815, in machine_install_by_devserver force_original=force_original) File "/usr/local/autotest/client/common_lib/cros/dev_server.py", line 2308, in auto_update error_msg % (host_name, real_error)) DevServerException: CrOS auto-update failed for host chromeos4-row4-rack12-host19: 0) SSHConnectionError: ssh: connect to host chromeos4-row4-rack12-host19 port 22: Connection timed out , 1) SSHConnectionError: ssh: connect to host 100.115.195.176 port 22: Connection timed out , END FAIL provision_AutoUpdate provision_AutoUpdate timestamp=1508328820 localtime=Oct 18 05:13:40 END FAIL ---- provision timestamp=1508328820 localtime=Oct 18 05:13:40 INFO ---- ---- timestamp=1508328820 job_abort_reason= localtime=Oct 18 05:13:40 INFO ---- ---- timestamp=1508328835 localtime=Oct 18 05:13:55 Start crashcollection record INFO ---- New Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /usr/local/autotest/results/hosts/chromeos4-row4-rack12-host19/4592076-provision/20171810043926/crashinfo.chromeos4-row4-rack12-host19/sslh_fork.20171018.051305.2688.dmp INFO ---- Orphaned Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /var/spool/crash/lsb-release INFO ---- Orphaned Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /var/spool/crash/sslh_fork.20171018.051305.2688.meta INFO ---- Orphaned Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /var/spool/crash/sslh_fork.20171018.051305.2688.core INFO ---- Orphaned Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /var/spool/crash/os-release INFO ---- Orphaned Crash Dump timestamp=1508328835 localtime=Oct 18 05:13:55 /var/spool/crash/sslh_fork.20171018.051305.2688.dmp INFO ---- ---- timestamp=1508328835 localtime=Oct 18 05:13:55 End crashcollection record
,
Oct 18 2017
In /var/log/messages: 2017-10-18T11:52:55.317275+00:00 ERR kernel: [ 29.568626] brcmfmac: brcmf_sdio_drivestrengthinit: No SDIO Drive strength init done for chip 4354 rev 1 pmurev 24 2017-10-18T11:52:55.434901+00:00 WARNING crash_reporter[2071]: Could not load the device policy file. 2017-10-18T11:52:55.435785+00:00 WARNING crash_reporter[2071]: [user] Received crash notification for tzdatacheck[2023] sig 6, user 656360 (ignoring - PID filtered out) 2017-10-18T11:52:55.436428+00:00 WARNING crash_reporter[2071]: [ARC] Received crash notification for tzdatacheck[2023] sig 6, user 656360 (developer build - not testing - always dumping) 2017-10-18T11:52:55.438495+00:00 INFO crash_reporter[2071]: State of crashed process [2023]: S (sleeping) 2017-10-18T11:52:55.443922+00:00 ERR crash_reporter[2071]: Could not get active user sessions, using default. 2017-10-18T11:52:55.447293+00:00 INFO kernel: [ 29.698779] tpm_i2c_infineon 1-0020: command 0x1f (size 146) returned code 0x0 2017-10-18T11:52:55.483494+00:00 ERR session_manager[1319]: [ERROR:dbus_util.cc(13)] CreateError(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NotStarted, Message=ARC is not started yet. 2017-10-18T11:52:55.484047+00:00 ERR crash_reporter[2071]: Failed to call method: org.chromium.SessionManagerInterface.GetArcStartTimeTicks: object_path= /org/chromium/SessionManager: org.chromium.SessionManagerInterface.NotStarted: ARC is not started yet. 2017-10-18T11:52:55.484088+00:00 ERR crash_reporter[2071]: CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NotStarted, Message=ARC is not started yet. 2017-10-18T11:52:55.484123+00:00 ERR crash_reporter[2071]: Failed to get ARC uptime: ARC is not started yet. 2017-10-18T11:52:55.484189+00:00 INFO crash_reporter[2071]: Stored minidump to /home/chronos/crash/tzdatacheck.20171018.045255.2023.dmp 2017-10-18T05:13:25.173394-07:00 WARNING crash_reporter[2508]: Could not load the device policy file. 2017-10-18T05:13:25.174248-07:00 WARNING crash_reporter[2508]: [user] Received crash notification for tzdatacheck[2500] sig 6, user 656360 (ignoring - PID filtered out) 2017-10-18T05:13:25.176859-07:00 WARNING crash_reporter[2508]: [ARC] Received crash notification for tzdatacheck[2500] sig 6, user 656360 (developer build - not testing - always dumping) 2017-10-18T05:13:25.179803-07:00 INFO crash_reporter[2508]: State of crashed process [2500]: S (sleeping) 2017-10-18T05:13:25.186065-07:00 ERR crash_reporter[2508]: Could not get active user sessions, using default. 2017-10-18T05:13:25.204823-07:00 INFO session_manager[1222]: [INFO:server_backed_state_key_generator.cc(126)] Stable device secret missing! 2017-10-18T05:13:25.241088-07:00 ERR session_manager[1222]: [ERROR:dbus_util.cc(13)] CreateError(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NotStarted, Message=ARC is not started yet. 2017-10-18T05:13:25.242553-07:00 ERR crash_reporter[2508]: Failed to call method: org.chromium.SessionManagerInterface.GetArcStartTimeTicks: object_path= /org/chromium/SessionManager: org.chromium.SessionManagerInterface.NotStarted: ARC is not started yet. 2017-10-18T05:13:25.242603-07:00 ERR crash_reporter[2508]: CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NotStarted, Message=ARC is not started yet. 2017-10-18T05:13:25.242637-07:00 ERR crash_reporter[2508]: Failed to get ARC uptime: ARC is not started yet. 2017-10-18T05:13:25.242691-07:00 INFO crash_reporter[2508]: Stored minidump to /home/chronos/crash/tzdatacheck.20171018.051325.2500.dmp 2017-10-18T05:13:25.291846-07:00 INFO crash_reporter[2508]: Leaving core file at /home/chronos/crash/tzdatacheck.20171018.051325.2500.core due to developer image 2017-10-18T05:13:25.301772-07:00 DEBUG kernel: [ 15.901490] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs 2017-10-18T05:13:25.510597-07:00 INFO crash_reporter[2672]: Enabling kernel crash handling 2017-10-18T05:13:25.511244-07:00 WARNING crash_reporter[2672]: Could not load the device policy file. 2017-10-18T05:13:27.181835-07:00 INFO kernel: [ 17.786781] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready 2017-10-18T05:13:27.181862-07:00 WARNING kernel: [ 17.787991] smsc75xx 3-1:1.0 eth1: device_set_wakeup_enable error -22 2017-10-18T05:13:27.191828-07:00 INFO kernel: [ 17.791627] smsc75xx 3-1:1.0 eth1: link up, 1000Mbps, full-duplex, lpa 0xC5E1
,
Oct 18 2017
I think tzdatacheck crashes if android is not up, and this is just hiding some other problem.
,
Oct 18 2017
The tzdatacheck crash is benign (tracked in https://buganizer.corp.google.com/issues/67637900 and currently in the process of being fixed).
,
Oct 19 2017
blocking bug is fixed.
,
Oct 19 2017
Saw another failure example: https://luci-milo.appspot.com/buildbot/chromeos/veyron_speedy-paladin/6961 same root cause?
,
Oct 19 2017
,
Oct 19 2017
,
Nov 8 2017
Another failure this morning. Unclear if the same root cause. -> sheriff to investigate.
,
Nov 8 2017
The provision and repair logs:
2017-11-08 07:51:51 OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row4-rack11-host16/4779556-repair/
2017-11-08 07:18:35 -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos4-row4-rack11-host16/4779211-provision/
,
Nov 8 2017
Can't fine tzdatacheck in the log. I think it is not the same root cause. Look like it just finish autoupdate then reboot and ssh can't see it. The next veyron_speedy-paladin is green so I think this is just a flake
,
Jan 22 2018
,
Jan 23 2018
|
|||||||
►
Sign in to add a comment |
|||||||
Comment 1 by xiaochu@chromium.org
, Oct 18 2017