Cfm client side issue: Acer Chromebase rebooting during meetings |
||||||||||||||||
Issue descriptionChromeOS version: 63.0.3239.140 & 62 Cfm device model: Acer Chromebase 24 (Buddy) Case#: 14850855 Description:CfM Acer Chromebase rebooting during meetings Chromebases are rebooting during meetings. No issues in standby mode, but it will reboot itself 10-15 minutes into a meeting. Issue persists after wiping and USB recovery. Steps to reproduce: 1. Start the device 2. Access a meeting 3. After a few minutes (5, 10-15) device reboots Current Behavior / Reproduction: Chromebases are rebooting during meetings.(After a few minutes) Expected Behavior: CfM devices should not be rebooting during the meeting session. Device info. DQZ0HEK00173102E373000 : https://drive.google.com/open?id=1jAg6ZJVEZjrGuWGftCRnwaouqYc5B2oE DQZ0HEK00173102E4F3000 : https://drive.google.com/open?id=1HYA9VKmRnAPoNgKHvsWpWpJGL9uJiMyI Drive link to logs: DQZ0HEK00173102E373000 https://drive.google.com/open?id=1ZV4NABn2i3MscM8N0w3tNNoIhcFDZPkv (while still on v63) 9:10 AM (GMT+1) --eventlog-- 199 | 2018-01-31 08:47:21 | Wake Source | Power Button | 0 200 | 2018-01-31 08:47:21 | Wake Source | GPIO | 27 201 | 2018-01-31 09:03:40 | System boot | 157 202 | 2018-01-31 09:03:40 | SUS Power Fail 203 | 2018-01-31 09:03:40 | System Reset 204 | 2018-01-31 09:03:40 | ACPI Wake | S5 205 | 2018-01-31 09:03:40 | Wake Source | Power Button | 0 --message log-- 2018-01-31T08:48:05.027180+01:00 ERR session_manager[3057]: [ERROR:dbus_util.cc(13)] CreateError(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.NotAvailable, Message=ARC not supported. 2018-01-31T08:48:05.051740+01:00 INFO session_manager[3057]: [INFO:server_backed_state_key_generator.cc(126)] Stable device secret missing! 2018-01-31T08:48:05.052682+01:00 ERR chrome[3081]: [3081:3081:0131/084805.052639:INFO:remote_commands_service.cc(36)] Fetching remote commands. 2018-01-31T08:48:05.053245+01:00 ERR chrome[3081]: [3081:3081:0131/084805.053219:INFO:system_log_uploader.cc(198)] Creating system log uploader. 2018-01-31T08:48:05.054059+01:00 ERR chrome[3081]: [3081:3081:0131/084805.054023:INFO:system_log_uploader.cc(340)] Scheduling next system log upload 0 s from now. 2018-01-31T08:48:05.056734+01:00 ERR chrome[3081]: [3081:3081:0131/084805.056698:INFO:system_log_uploader.cc(320)] System log upload is disabled, rescheduling. 2018-01-31T08:48:05.057172+01:00 ERR chrome[3081]: [3081:3081:0131/084805.057133:INFO:system_log_uploader.cc(340)] Scheduling next system log upload 43200 s from now. 2018-01-31T08:48:05.088126+01:00 INFO kernel: [ 43.345991] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0 2018-01-31T08:48:05.118109+01:00 INFO kernel: [ 43.375923] tpm_tis tpm_tis: command 0xba (size 18) returned code 0xc 2018-01-31T08:48:05.335712+01:00 ERR chrome[3081]: [3081:3081:0131/084805.335668:INFO:remote_commands_service.cc(161)] Remote commands fetched. 2018-01-31T08:48:05.356713+01:00 ERR chrome[3081]: [3081:3081:0131/084805.356667:INFO:remote_commands_invalidator.cc(32)] Initialize RemoteCommandsInvalidator. 2018-01-31T08:48:05.357053+01:00 ERR chrome[3081]: [3081:3081:0131/084805.357028:INFO:remote_commands_invalidator.cc(57)] Starting RemoteCommandsInvalidator. 2018-01-31T08:48:05.357352+01:00 ERR chrome[3081]: [3081:3081:0131/084805.357327:INFO:remote_commands_invalidator.cc(123)] RemoteCommandsInvalidator ReloadPolicyData. 2018-01-31T08:48:05.357632+01:00 ERR chrome[3081]: [3081:3081:0131/084805.357615:INFO:remote_commands_invalidator.cc(147)] Register RemoteCommandsInvalidator. 2018-01-31T08:48:05.357889+01:00 ERR chrome[3081]: [3081:3081:0131/084805.357871:INFO:remote_commands_invalidator.cc(83)] RemoteCommandsInvalidator state changed: 3 2018-01-31T08:48:05.358287+01:00 ERR chrome[3081]: [3081:3081:0131/084805.358262:INFO:remote_commands_invalidator.cc(83)] RemoteCommandsInvalidator state changed: 3 2018-01-31T08:48:05.358724+01:00 ERR chrome[3081]: [3081:3081:0131/084805.358698:INFO:remote_commands_invalidator.cc(94)] RemoteCommandsInvalidator received invalidation. 2018-01-31T08:48:05.359052+01:00 ERR chrome[3081]: [3081:3081:0131/084805.359032:INFO:remote_commands_invalidator.cc(110)] Invalidations acknowledged. 2018-01-31T08:48:05.359317+01:00 ERR chrome[3081]: [3081:3081:0131/084805.359298:INFO:remote_commands_service.cc(36)] Fetching remote commands. 2018-01-31T08:48:05.537766+01:00 INFO session_manager[3057]: [INFO:policy_store.cc(69)] Persisted policy to disk. 2018-01-31T08:48:05.632966+01:00 INFO session_manager[3057]: [INFO:child_exit_handler.cc(77)] Handling 3365 exit. 2018-01-31T08:48:05.634689+01:00 ERR chrome[3081]: [3081:3081:0131/084805.634651:INFO:remote_commands_invalidator.cc(123)] RemoteCommandsInvalidator ReloadPolicyData. 2018-01-31T08:48:05.680685+01:00 ERR chrome[3081]: [3081:3081:0131/084805.680643:INFO:remote_commands_service.cc(161)] Remote commands fetched. 2018-01-31T08:48:07.442283+01:00 ERR chrome[3081]: [3081:3081:0131/084807.442238:INFO:remote_commands_invalidator.cc(94)] RemoteCommandsInvalidator received invalidation. 2018-01-31T08:48:07.442644+01:00 ERR chrome[3081]: [3081:3081:0131/084807.442623:INFO:remote_commands_invalidator.cc(110)] Invalidations acknowledged. 2018-01-31T08:48:07.442904+01:00 ERR chrome[3081]: [3081:3081:0131/084807.442887:INFO:remote_commands_service.cc(36)] Fetching remote commands. 2018-01-31T08:48:07.631210+01:00 ERR chrome[3081]: [3081:3081:0131/084807.631168:INFO:remote_commands_service.cc(161)] Remote commands fetched. 2018-01-31T08:49:05.056012+01:00 ERR chrome[3081]: [3081:3081:0131/084905.055941:INFO:status_uploader.cc(222)] Starting status upload: have_device_status = 1 2018-01-31T08:49:05.564304+01:00 ERR chrome[3081]: [3081:3081:0131/084905.564231:INFO:status_uploader.cc(235)] Status upload successful 2018-01-31T08:51:10.259161+01:00 INFO kernel: [ 228.594978] input: Logitech Unifying Device. Wireless PID:101a as /devices/pci0000:00/0000:00:14.0/usb1/1-5/1-5:1.2/0003:046D:C52B.0004/001D:046D:101A.0006/input/input14 2018-01-31T08:51:10.260320+01:00 INFO kernel: [ 228.596176] hid-generic 001D:046D:101A.0006: input,hidraw3: <UNKNOWN> HID v1.11 Mouse [Logitech Unifying Device. Wireless PID:101a] on usb-0000:00:14.0-5:1 2018-01-31T08:51:10.278353+01:00 NOTICE root[3644]: /opt/google/input/device_added 2018-01-31T08:51:10.310851+01:00 NOTICE root[3651]: /opt/google/input/device_added /dev/input/event14 2018-01-31T08:51:13.798137+01:00 INFO kernel: [ 232.134854] tpm_tis tpm_tis: command 0x15 (size 14) returned code 0x0 2018-01-31T08:51:13.802490+01:00 INFO cryptohomed[2314]: RecursiveCopy: /home/.shadow/skeleton/log 2018-01-31T08:51:13.804104+01:00 INFO cryptohomed[2314]: RecursiveCopy: /home/.shadow/skeleton/.ssh 2018-01-31T08:51:13.804562+01:00 INFO cryptohomed[2314]: RecursiveCopy: /home/.shadow/skeleton/.pki 2018-01-31T08:51:13.804968+01:00 INFO cryptohomed[2314]: RecursiveCopy: /home/.shadow/skeleton/.pki/nssdb 2018-01-31T08:51:13.835802+01:00 INFO session_manager[3057]: [INFO:session_manager_service.cc(347)] Received RestartJob from :1.58 2018-01-31T08:51:13.836493+01:00 INFO session_manager[3057]: [INFO:policy_key.cc(51)] No policy key on disk at /home/root/996fc2617a18b399136c32698351b83eeeca6560/session_manager/policy/key 2018-01-31T08:51:13.841808+01:00 WARNING session_manager[3057]: [WARNING:device_policy_service.cc(448)] Could not verify that owner key belongs to this user. 2018-01-31T08:51:13.841839+01:00 ERR session_manager[3057]: [ERROR:dbus_util.cc(13)] CreateError(...): Domain=dbus, Code=org.chromium.SessionManagerInterface.PubkeySetIllegal, Message=Could not verify that owner key belongs to this user. 2018-01-31T08:51:13.844256+01:00 INFO session_manager[3057]: [INFO:session_manager_impl.cc(611)] Starting user session 2018-01-31T08:51:13.844452+01:00 INFO session_manager[3057]: [INFO:browser_job.cc(156)] Terminating process group: Restarting browser on-demand. 2018-01-31T08:51:13.845215+01:00 INFO session_manager[3057]: [INFO:system_utils_impl.cc(93)] Sending 9 to -3081 as 1000 2018-01-31T08:51:13.987551+01:00 ERR session_manager[3672]: [ERROR:process.cc(329)] Exec of /usr/bin/vm_launcher failed:: No such file or directory 2018-01-31T08:51:13.988565+01:00 INFO session_manager[3057]: [INFO:child_exit_handler.cc(77)] Handling 3081 exit. 2018-01-31T08:51:13.988586+01:00 ERR session_manager[3057]: [ERROR:child_exit_handler.cc(85)] Exited with signal 9 2018-01-31T08:51:13.988610+01:00 INFO session_manager[3057]: [INFO:session_manager_service.cc(289)] Exiting process is chrome. 2018-01-31T08:51:13.988623+01:00 INFO session_manager[3057]: [INFO:browser_job.cc(156)] Terminating process group: Ensuring browser processes are gone. 2018-01-31T08:51:13.988637+01:00 INFO session_manager[3057]: [INFO:system_utils_impl.cc(93)] Sending 9 to -3081 as 1000 2018-01-31T08:51:14.150833+01:00 ERR session_manager[3673]: [ERROR:process.cc(329)] Exec of /usr/bin/vm_launcher failed:: No such file or directory 2018-01-31T08:51:14.151536+01:00 ERR session_manager[3674]: [ERROR:process.cc(329)] Exec of /usr/bin/vm_launcher failed:: No such file or directory 2018-01-31T08:51:14.151973+01:00 INFO session_manager[3057]: [INFO:browser_job.cc(147)] Running child /opt/google/chrome/chrome --enable-logging --enable-native-gpu-memory-buffers --gpu-sandbox-failures-fatal=yes --ppapi-flash-path=/opt/google/chrome/pepper/libpepflashplayer.so --ppapi-flash-version=28.0.0.137 --ui-prioritize-in-gpu-process --use-cras --use-gl=egl --user-data-dir=/home/chronos --default-wallpaper-large=/usr/share/chromeos-assets/wallpaper/oem_large.jpg --default-wallpaper-small=/usr/share/chromeos-assets/wallpaper/oem_small.jpg --guest-wallpaper-large=/usr/share/chromeos-assets/wallpaper/guest_large.jpg --guest-wallpaper-small=/usr/share/chromeos-assets/wallpaper/guest_small.jpg --has-chromeos-keyboard --login-profile=user --bwsi --homepage=chrome://newtab/ --incognito --log-level=1 --login-user=$guest --login-user=$guest --login-profile=996fc2617a18b399136c32698351b83eeeca6560 --vmodule=tablet_power_button_controller=1,*chromeos/login/*=1,auto_enrollment_controller=1,*plugin*=2,*zygote*=1,*/ui/ozone/*=1,*/ui/display/manager/chromeos/*=1,*night_light*=1,power_button_observer=2,webui_login_view=2,lock_state_controller=2,webui_screen_locker=2,screen_locker=2 2018-01-31T08:51:14.821481+01:00 ERR chrome[3675]: [3675:3675:0131/085114.821426:INFO:remote_commands_service.cc(36)] Fetching remote commands. 2018-01-31T08:51:14.822983+01:00 ERR chrome[3675]: [3675:3675:0131/085114.822952:INFO:system_log_uploader.cc(198)] Creating system log uploader. 2018-01-31T08:51:14.823613+01:00 ERR chrome[3675]: [3675:3675:0131/085114.823578:INFO:system_log_uploader.cc(340)] Scheduling next system log upload 0 s from now. 2018-01-31T08:51:14.860110+01:00 INFO kernel: [ 233.197318] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0 2018-01-31T08:51:14.925420+01:00 ERR chrome[3675]: [3675:3675:0131/085114.925380:INFO:system_log_uploader.cc(320)] System log upload is disabled, rescheduling. 2018-01-31T08:51:14.928185+01:00 ERR chrome[3675]: [3675:3675:0131/085114.928137:INFO:system_log_uploader.cc(340)] Scheduling next system log upload 43200 s from now. 2018-01-31T08:51:15.723714+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723674:INFO:remote_commands_invalidator.cc(32)] Initialize RemoteCommandsInvalidator. 2018-01-31T08:51:15.723756+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723741:INFO:remote_commands_invalidator.cc(57)] Starting RemoteCommandsInvalidator. 2018-01-31T08:51:15.723779+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723765:INFO:remote_commands_invalidator.cc(123)] RemoteCommandsInvalidator ReloadPolicyData. 2018-01-31T08:51:15.723801+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723789:INFO:remote_commands_invalidator.cc(147)] Register RemoteCommandsInvalidator. 2018-01-31T08:51:15.723819+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723808:INFO:remote_commands_invalidator.cc(83)] RemoteCommandsInvalidator state changed: 3 2018-01-31T08:51:15.723856+01:00 ERR chrome[3675]: [3675:3675:0131/085115.723844:INFO:remote_commands_invalidator.cc(83)] RemoteCommandsInvalidator state changed: 3 2018-01-31T08:51:20.041229+01:00 ERR chrome[3675]: [3675:3675:0131/085120.041149:INFO:remote_commands_service.cc(161)] Remote commands fetched. 2018-01-31T08:51:20.260143+01:00 INFO session_manager[3057]: [INFO:policy_store.cc(69)] Persisted policy to disk. 2018-01-31T08:51:20.341409+01:00 INFO session_manager[3057]: [INFO:child_exit_handler.cc(77)] Handling 3988 exit. 2018-01-31T08:51:20.345149+01:00 ERR chrome[3675]: [3675:3675:0131/085120.345054:INFO:remote_commands_invalidator.cc(123)] RemoteCommandsInvalidator ReloadPolicyData. 2018-01-31T08:52:14.825558+01:00 ERR chrome[3675]: [3675:3675:0131/085214.825488:INFO:status_uploader.cc(222)] Starting status upload: have_device_status = 1 2018-01-31T08:52:15.312400+01:00 ERR chrome[3675]: [3675:3675:0131/085215.312328:INFO:status_uploader.cc(235)] Status upload successful 2018-01-31T08:52:35.035366+01:00 INFO periodic_scheduler[4074]: crash_sender: running /sbin/crash_sender 2018-01-31T08:52:35.106194+01:00 INFO periodic_scheduler[4103]: crash_sender: job completed 2018-01-31T08:55:36.482537+01:00 INFO minijail0[4244]: libminijail[4244]: not running init loop, exiting immediately 2018-01-31T08:55:36.508518+01:00 INFO p2p-server[4245]: p2p-server starting [../../../../../../../../../mnt/host/source/src/platform2/p2p/server/main.cc:59] 2018-01-31T08:57:25.681349+01:00 ERR cras_server[2545]: Fail to create conversion matrix due to invalid channel layout 2018-01-31T09:00:21.275116+01:00 ERR kernel: [ 779.841212] go2001 0000:03:00.0: go2001_dec_g_fmt_cap:1579 Frame info not available yet 2018-01-31T09:03:42.412296+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuset 2018-01-31T09:03:42.412373+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpu 2018-01-31T09:03:42.412377+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuacct 2018-01-31T09:03:42.412379+01:00 NOTICE kernel: [ 0.000000] Linux version 3.14.0 (chrome-bot@cros-beefy258-c2) (gcc version 4.9.x 20150123 (prerelease) (4.9.2_cos_gg_4.9.2-r166-0c5a656a1322e137fa4a251f2ccc6c4022918c0a_4.9.2-r166) ) #1 SMP PREEMPT Mon Jan 8 23:11:39 PST 2018 2018-01-31T09:03:42.412381+01:00 INFO kernel: [ 0.000000] Command line: cros_secure console= loglevel=7 init=/sbin/init cros_secure oops=panic panic=-1 root=/dev/dm-0 rootwait ro dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 dm="1 vroot none ro 1,0 3584000 verity payload=PARTUUID=4754016c-7857-1a4a-bdb5-54beb1a17368/PARTNROFF=1 hashtree=PARTUUID=4754016c-7857-1a4a-bdb5-54beb1a17368/PARTNROFF=1 hashstart=3584000 alg=sha1 root_hexdigest=99fa9847ae9f75049e8784e9481d8e3819bd504f salt=5a16d467e781b4f16a0b88dddfa8b898fa0fceffe9dc7a2081f8f3c0ca89f309" noinitrd vt.global_cursor_default=0 kern_guid=4754016c-7857-1a4a-bdb5-54beb1a17368 add_efi_memmap boot=local noresume noswap i915.modeset=1 tpm_tis.force=1 tpm_tis.interrupts=0 nmi_watchdog=panic,lapic 2018-01-31T09:03:42.412382+01:00 INFO kernel: [ 0.000000] e820: BIOS-provided physical RAM map: ------------ DQZ0HEK00173102E373000 https://drive.google.com/open?id=11A2ws5kuZwFSv71Z0W6tV1c14kel8v0F (after USB recovery) DQZ0HEK00173102E4F3000 https://drive.google.com/open?id=1srJdugivgsWc1UONPQjDvk9Nev8bM9A_ |======Troubleshooting and Workaround======| - Troubleshooting steps taken: - Customer has tried wiping the devices multiple times - Customer performed USB recovery using v62 on device DQZ0HEK00173102E373000, and issue persists
,
Feb 7 2018
derat@ Is there any potential problem if the customer does not touch the power button during the meeting?
,
Feb 7 2018
#2: Sorry, I don't understand the question. I didn't see any reboots in the logs that I looked at. It's expected that the device shuts down if the user is pressing the power button. I'll need more information about when the issue occurred and/or the correct logs in order to diagnose this further.
,
Feb 7 2018
,
Feb 8 2018
We're seeing this behavior with some devices here on our domain. Do not think this is an isolated incident.
,
Feb 8 2018
To clarify; we're seeing random reboots of our Chromebase units within out fleet. Can confirm users are not holding down the power button.
,
Feb 8 2018
I'm not entirely sure how to provide these logs, but in the past few days I had an Acer chromebase 24 suddenly get Error 002 - cannot connect to hangouts/meet. This occurs every few mins in a meeting and even at the calendar view. I pulled out another chromebase 24 yesterday and for about 24 hours it was fine, but it too is now exhibiting the same Error 002 issue. The power button is not being held during meetings or at the calendar view. More than happy to help figure out this issue.
,
Feb 9 2018
I'd like to add a secondary observation. Between the hours of 9AM CST-11:45AM CST (7AM-9:45AM PST), these same chromebases exhibit no issues. It appears that once people on the west coast "wake up", or in other words, once the entire US is using Meet/Hangouts, these chromebases begin exhibiting issues. The network that these chromebases are installed has no increase in traffic.
,
Feb 10 2018
You can save the logs by visiting chrome://net-internals/#chromeos and hitting the 'Store debug logs' button which should leave them in your Downloads directory
,
Feb 12 2018
Is that only for desktop mode? These devices are in the hangout mode, where the chrome browser isn't shown.
,
Feb 12 2018
egomez@pinterest.com, You can restart Cfm and enter ChromeOS mode by pressing Ctrl + Alt + S during boot, then get ChromeOS debug logs from chrome://net-internals/#chromeos. Please note the date/time of issue when you share logs with us. You can also contact Google Cloud support if you would like additional help in getting logs.
,
Feb 12 2018
By rebooting the CFM, it looks like I lose all logs prior to the reboot and/or debug logs only appear when on this page? If so, that will make this borderline impossible to debug/capture as the logs will be wiped during the issue. This issue just happened three times in an hour meeting and is highly disruptive. Out of 5 meetings, only the early morning one worked without Error 002
,
Feb 13 2018
No, just a reboot doesn't delete logs.
,
Feb 15 2018
We have got a fresh logs from the customer with a timestamp. https://drive.google.com/open?id=1rLfVqkQCPg7Gzzmj7zvfy2habd-8PhPA Device rebooted 2018-02-14 at 10:51 -10:54 2018-02-14T10:51:09.010837+01:00 ERR chrome[1061]: [1061:1061:0214/105109.010619:INFO:remote_commands_service.cc(36)] Fetching remote commands. 2018-02-14T10:51:09.622003+01:00 ERR chrome[1061]: [1061:1061:0214/105109.621916:INFO:remote_commands_service.cc(161)] Remote commands fetched. ^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@^@2018-02-14T10:54:50.454 206+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuset 2018-02-14T10:54:50.454278+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpu 2018-02-14T10:54:50.454282+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuacct I can also see another reboot on 2018-02-05 at 14:01 - 14:04 2018-02-05T14:01:02.460556+01:00 INFO kernel: [ 132.516655] haswell-pcm-audio haswell-pcm-audio: FW loaded, mailbox readback FW info: type 01, - version: 00.00, build 77, source commit id: 876ac6906f31a43b6772b 23c7c983ce9dcb18a19 2018-02-05T14:01:09.173955+01:00 ERR cras_server[2414]: Fail to create conversion matrix due to invalid channel layout 2018-02-05T14:01:10.191557+01:00 INFO kernel: [ 140.250548] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0 2018-02-05T14:01:33.705549+01:00 ERR kernel: [ 163.774883] go2001 0000:03:00.0: go2001_dec_g_fmt_cap:1579 Frame info not available yet 2018-02-05T14:04:20.461814+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuset 2018-02-05T14:04:20.461883+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpu 2018-02-05T14:04:20.461888+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuacct
,
Feb 16 2018
we have got another debug logs from customer: https://drive.google.com/open?id=1iGL8Mulvf32UDQiMkt4O15VfqkR-p5hh time stamps: 10:22 AM and 10:27 AM 2018-02-16T10:20:43.662155+01:00 INFO kernel: [ 173.031028] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0 2018-02-16T10:21:25.053278+01:00 ERR cras_server[2293]: Fail to create conversion matrix due to invalid channel layout 2018-02-16T10:22:32.639107+01:00 ERR kernel: [ 282018-02-16T10:23:36.524155+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuset -- 2018-02-16T10:26:25.172104+01:00 INFO kernel: [ 170.903799] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0 2018-02-16T10:27:26.488836+01:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpus
,
Feb 16 2018
Thanks for the timestamps! eventlog.txt: ... 223 | 2018-02-16 10:17:49 | Kernel Event | Clean Shutdown 224 | 2018-02-16 10:17:50 | System boot | 183 225 | 2018-02-16 10:17:50 | System Reset 226 | 2018-02-16 10:17:50 | Wake Source | GPIO | 27 227 | 2018-02-16 10:23:34 | System boot | 184 228 | 2018-02-16 10:23:34 | SUS Power Fail 229 | 2018-02-16 10:23:34 | System Reset 230 | 2018-02-16 10:23:34 | ACPI Wake | S5 231 | 2018-02-16 10:23:34 | Wake Source | Power Button | 0 232 | 2018-02-16 10:23:34 | Wake Source | GPIO | 27 233 | 2018-02-16 10:27:24 | System boot | 185 234 | 2018-02-16 10:27:24 | SUS Power Fail 235 | 2018-02-16 10:27:24 | System Reset 236 | 2018-02-16 10:27:24 | ACPI Wake | S5 237 | 2018-02-16 10:27:24 | Wake Source | Power Button | 0 238 | 2018-02-16 10:27:24 | Wake Source | GPIO | 27 239 | 2018-02-16 10:28:21 | Kernel Event | Clean Shutdown 240 | 2018-02-16 10:28:21 | ACPI Enter | S5 241 | 2018-02-16 10:28:26 | System boot | 186 242 | 2018-02-16 10:28:26 | SUS Power Fail 243 | 2018-02-16 10:28:26 | System Reset 244 | 2018-02-16 10:28:26 | ACPI Wake | S5 245 | 2018-02-16 10:28:26 | Wake Source | Power Button | 0 246 | 2018-02-16 10:28:26 | Wake Source | GPIO | 27 EOF powerd.20180216-101753 ends abruptly after 10:22:22. powerd.20180216-102337 ends after 10:24:44 with some junk suggesting an unclean shutdown. powerd.20180216-102727 ends at 10:28:19 with a clean shutdown caused by the power button being pressed. In the messages file, I see the same two abrupt boots quoted in #16. There's 40-60 seconds of time between the last message and the cgroup message logged after the system booted again. There are a few more messages in net.log with timestamps shortly after the 10:21:25 message from cras_server: 1116:2018-02-16T10:21:33.395348+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 5 destroyed. 1117:2018-02-16T10:21:33.404934+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 4 destroyed. 1118:2018-02-16T10:21:33.405134+01:00 INFO shill[2050]: [INFO:wifi_service.cc(783)] Rep ep updated for 3. [SSID=MCB], bssid: 00:27:e3:a5:bb:4d, sig: -83, sec: 802_1x, freq: 5660 1119:2018-02-16T10:21:33.405691+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 6 destroyed. 1120:2018-02-16T10:21:33.406144+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 3 destroyed. 1121:2018-02-16T10:21:33.406570+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 7 destroyed. 1122:2018-02-16T10:21:33.407274+01:00 INFO shill[2050]: [INFO:service.cc(296)] Service 8 destroyed. 1123:2018-02-16T10:21:33.407867+01:00 INFO shill[2050]: [INFO:service.cc(312)] Suppressed autoconnect to service 2 (no carrier) I don't see any clues in the logs about what went wrong, though. I think that they rule out any decisions from userspace to reboot the system, and may also suggest that the problem is somewhere below the kernel.
,
Feb 16 2018
I definitely think the issue is low level and is a regression from some update, as my boxes continually crash even when not connected to a hangout/team meeting.
,
Feb 27 2018
Hi all, are there any updates I can share with the customer or any additional information we can provide? Thanks
,
Feb 27 2018
I hardwired my acer today and my crashing issues instantly went away, both at the calendar view and in the meetings. I'm now assuming something is wrong with the wireless driver/controller.
,
Feb 28 2018
Thanks for the debug and status update.
Just want to confirm by 'hardwired' you connected built-in LAN (RJ45 connector) via network cable or did you use a different method (USB -> ethernet dongle)?
Plausible roots could be as you mentioned something with wifi (iwlwifi)
Subsystem: Intel Corporation Dual Band Wireless-AC 7260
Or perhaps just that using a hardwired connection inhibits some deeper low power state entry which was causing the power loss/reboot.
Could try plugging usbkey into one of the chromebox ports which should inhibit deeper LP state and see if that makes wifi more stable.
Also does anyone have more details about date/channel when this was first seen?
When I look through commits for R62 (mentioned in reporting)
https://crosland.corp.google.com/log/9901.1.0..9901.77.0
Nothing jumps out to me ATM.
,
Feb 28 2018
Correct - built-in LAN/RJ45. As far as dates, I believe I first started seeing this issue around December 14th, or at the very least this was the first time I started discussing it internally. I don't have a USB dongle for network, but will report back if I see any issues come back. So far I have been in six successful meetings without any crashes.
,
Feb 28 2018
Oh actually I see you just wanted me to plugin a USB key in general. I actually have a USB microphone plugged in at all times (and yes I did try removing it for a while to see if that was the issue) which requires power and that didn't help, so I don't think it's the presence of USB itself.
,
Mar 7 2018
Hi! Is there any update on this?
,
Mar 8 2018
For the customer originally reported this issue: - I see customer moved Cfm devices to Beta and confirmed the issue persists, or even became worse. Let's thank them for cooperation and ask them to put back to Stable (+ USB recovery to M64 unfortunately, since changing channels doesn't downgrade from ChromeOS M65) - Is it possible to try wired network and see if the issue persists? We can confirm whether the original report and #c20 are the same issue. My guess is that multiple issues are discussed here, because the original report mentions "No issue outside of meetings" but #c18 states otherwise. - It would be good to have more recent samples. Can we ask them to submit feedback after they see the issue, if it takes time to manually gather ChromeOS debug logs?
,
Mar 8 2018
,
Mar 23 2018
Ryuta, are you able to work w/ customer to work through comments in #c25?
,
Mar 27 2018
,
Apr 6 2018
In logfiles I see following:
2018-01-31T02:10:37.811993+01:00 ERR kernel: [ 35.778039] ata1.00: exception Emask 0x0 SAct 0x1 SErr 0x0 action 0x6 frozen
2018-01-31T02:10:37.812016+01:00 ERR kernel: [ 35.778064] ata1.00: failed command: READ FPDMA QUEUED
2018-01-31T02:10:37.812018+01:00 ERR kernel: [ 35.778085] ata1.00: cmd 60/e0:00:78:9c:1e/00:00:00:00:00/40 tag 0 ncq 114688 in
2018-01-31T02:10:37.812021+01:00 ERR kernel: [ 35.778085] res 40/00:00:e0:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
2018-01-31T02:10:37.812023+01:00 ERR kernel: [ 35.778115] ata1.00: status: { DRDY }
2018-01-31T02:10:37.812024+01:00 INFO kernel: [ 35.778135] ata1: hard resetting link
2018-01-31T02:10:38.076950+01:00 INFO kernel: [ 36.043123] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
2018-01-31T02:10:38.076972+01:00 INFO kernel: [ 36.043834] ata1.00: configured for UDMA/133
2018-01-31T02:10:38.076975+01:00 INFO kernel: [ 36.043890] ata1: EH complete
2018-01-31T02:10:38.133002+01:00 INFO kernel: [ 36.099171] tpm_tis tpm_tis: command 0x65 (size 22) returned code 0x0
2018-01-31T02:10:38.163420+01:00 INFO kernel: [ 36.129837] tpm_tis tpm_tis: command 0xa (size 10) returned code 0x0
2018-01-31T02:10:38.178738+01:00 INFO crash_reporter[2571]: Enabling kernel crash handling
Also they have LiteOn SSD
So how about merging to crbug.com/803716
,
May 2 2018
Re: #25 No more Lite-on SSD-related errors for original customer, but issue is still present. 25.1: It's happening when USB-recovered to stable 62,63,64,65 25.2: Checking if it's happening on wired network. 25.3: Fresh logfiles: https://drive.google.com/open?id=1FcbMnD8xXnnP0I-gE9M7V3JDL5afo-pW
,
May 2 2018
25.2: working fine on wired connection, but often reboots during the meeting on wifi.
,
May 2 2018
The fresh logs look similar to comment #c1 in that its a user-requested shutdown. From powerd.PREVIOUS: [0413/041419:INFO:main.cc(293)] vcsid 0.0.2-r2590-555a2945fe2a4402ad5e17a108565f643de399d9 [0413/041419:INFO:main.cc(299)] System uptime: 3s ... [0413/041701:INFO:daemon.cc(530)] Power button down ... [0413/041702:INFO:daemon.cc(1606)] Shutting down, reason: user-request ... [0413/041703:INFO:daemon.cc(530)] Power button up So after being up for ~3min device thinks someone has pressed power button (for ~2sec) to shutdown. Now there is something interesting in syslog, 2018-04-13T11:09:55.421159+00:00 INFO p2p-server[5703]: p2p-server starting [../../../../../../../../../mnt/host/source/src/platform2/p2p/server/main.cc:59] 2018-04-13T11:10:15.724315+00:00 WARNING kernel: [ 456.835222] perf samples too long (2569 > 2500), lowering kernel.perf_event_max_sample_rate to 50000 2018-04-13T04:14:18.576305-07:00 INFO kernel: [ 0.000000] Initializing cgroup subsys cpuset So between 2018-04-13T11:10:15 & 2018-04-13T04:14:18 I can't explain why there's not at least a log line like: 2018-04-13T04:17:03.736133-07:00 NOTICE pre-shutdown[4256]: Shutting down for halt: user-request Unfortunately there's nothing in logs that I can see that exposes the issue beyond what I'd expect to see in syslog Perhaps the powerd shutdown we're seeing is the user returning to the hung device and holding/tapping power button while trying to make sense of the unresponsiveness. Not long enough to trigger cold reboot but in a manner that shuts down before the display can come up. If thats the case and there's anyway to avoid it we might find something in 'console-ramoops' to help identify what went wrong.
,
May 3 2018
#32: It seems unlikely (but not impossible) that the Chrome was hanging but still able to process power button events and ask powerd to shut the system down. I think that further diagnosis requires logs to be provided along with information about precisely what was observed just before they were collected. This bug is about devices spontaneously rebooting, but the logs from #30 show a device shutting down (not rebooting) in response to the power button being held for two seconds.
,
Jun 28 2018
Latest logs we have https://drive.google.com/open?id=1FcbMnD8xXnnP0I-gE9M7V3JDL5afo-pW and feedback https://listnr.corp.google.com/product/208/report/85309333821?dateRange=All I especially asked, if it's rebooting or somehow else, so Yes, its just rebooting. It drops out of the meeting, screen turns black and then you will see the kiosk mode startup after a few seconds. I asked to disconnect all peripherals (to avoid accidental reboot caused by long-press of "Hang UP" button on remote or Jabra) and disable automated reboots, but it still happening. And it never happens on wired connection.
,
Jun 28 2018
Latest logs we have https://drive.google.com/open?id=1FcbMnD8xXnnP0I-gE9M7V3JDL5afo-pW and feedback https://listnr.corp.google.com/product/208/report/85309333821?dateRange=All I especially asked, if it's rebooting or somehow else, so Yes, its just rebooting. It drops out of the meeting, screen turns black and then you will see the kiosk mode startup after a few seconds. I asked to disconnect all peripherals (to avoid accidental reboot caused by long-press of "Hang UP" button on remote or Jabra) and disable automated reboots, but it still happening. And it never happens on wired connection.
,
Jul 2
#c34/c35: Issue description does sounds like a Chrome crash. However that feedback and log is from April, and is not even from Cfm. It's an anonymous feedback from ChromeOS. Can you confirm you attached the right link?
,
Jul 20
,
Jul 20
What are the next steps here? Igor: did you follow-up with the customers? what was their update?
,
Jul 20
Igor, can we get this bug into a good state where it can be handed to eng (or closed if this is cleaned up) ?
,
Jul 23
Per comment #25 "put back to Stable (+ USB recovery to M64 unfortunately, since changing channels doesn't downgrade from ChromeOS M65)" -- Don't forget the now (seemingly) forgotten M65 EOL devices have also started to repeatedly hang, freeze, and/or reboot -- (here) even in GUEST mode. Don't know about M64 though, since don't know when M65 upgrade happened. https://bugs.chromium.org/p/chromium/issues/detail?id=821607
,
Jul 23
#40: I don't see anything suggesting that the issue that you linked to has anything to do with this one. Please file a new issue instead of commenting on closed or unrelated ones.
,
Aug 1
@41: "I don't see anything suggesting that the issue that you linked to has anything to do with this one. Please file a new issue instead of commenting on closed or unrelated ones." #33: "This bug is about devices spontaneously rebooting" Bug #821607 is about devices hanging, especially M65 devices as title says, but as other comments show (#114, my #167, maybe #6, #7, #20), some devices (also?) reboot (the 2 here after the hang) -- and despite #116 and #161 there was no separate bug found for such yet, though my comments still belong there and may be useful here as these both hang AND reboot, and importantly, even in GUEST mode, sometimes with just ONE website opened right after a previous reboot.
,
Dec 22
|
||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||
Comment 1 by derat@chromium.org
, Feb 6 2018Labels: Needs-Feedback