New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 760236 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Last visit > 30 days ago
Closed: Apr 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting crashes on M-62 and M-63

Project Member Reported by harpreet@chromium.org, Aug 29 2017

Issue description

ChromeOS build: R62-9888.0.0
Test: enterprise_CFM_USBPeripheralHotplugDetect.c920


Failure Reason:

Following peripherals do not match: Speaker. No of Crashes: 25. Crashes: New Warning or Crash Detected before plugging in usb peripherals., New Warning or Crash detected after device enrolled into CFM.



Logs are not yet available in google cloud but I did manage to get the debug logs from tko link via sponge/

https://paste.googleplex.com/5743448221024256


There are bunch of crashes during the test run along with peripheral mismatch.

dsunkara@ - can you please extract the crash data from the device/logs and post it in this bug. Thanks
 
Cc: ar...@chromium.org
Cc: dsunk...@chromium.org
Labels: Proj-Hotrod Proj-Bluestreak
Owner: ----
These crashes are generated when device is rebooted. 
1. recover DUT with 9888.0.0
2. reboot.
3. check /var/spool/crash folder for these crash files. 
4. clear tpm and reboot.
5. More crash files added to /var/spool/crash folder. 

All crash files collected could be found below: 

https://pantheon.corp.google.com/storage/browser/chromiumos-test-logs/bugfiles/cr/760236/

dsunkara@ - are there still failures for device mismatch? if not, lets change the title of this bug to make it more focused on number of crashes.

Summary: enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting lot of crashes on M-62 (was: enterprise_CFM_USBPeripheralHotplugDetect.c920 - Failing for peripheral mismatch and lot of crashes)
Hi Harpreet, This test is not failing anymore because of device mismatch. There are still crashes though on M-62.

Comment 6 by mzhuo@chromium.org, Sep 16 2017

enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930  test fails on guado on 9/15 with ChromeOS R62-9901.16.0. 
https://wmatrix.googleplex.com/testrun_log/unfiltered?test_ids=555424043

log:

stderr:
/bin/bash: minidump_stackwalk: command not found, 'Command returned non-zero exit status')
09/15 02:09:41.624 INFO | site_crashcollect:0111| Generating stack trace using devserver for /usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020450.5790.dmp
09/15 02:09:41.624 DEBUG|        dev_server:2558| Get the least loaded <class 'autotest_lib.client.common_lib.cros.dev_server.CrashServer'>
09/15 02:09:41.624 DEBUG|        dev_server:0678| Pick one healthy devserver from ['http://100.108.1.139:8082', 'http://100.108.1.152:8082', 'http://100.108.1.59:8082', 'http://100.109.175.143:8082', 'http://100.109.175.145:8082', 'http://172.24.190.211:8082', 'http://172.25.61.81:8082', 'http://172.25.65.207:8082', 'http://172.25.66.216:8082']
09/15 02:09:41.624 DEBUG|        dev_server:0682| Check health for http://100.108.1.139:8082
09/15 02:09:41.996 DEBUG|        dev_server:0687| Pick http://100.108.1.139:8082
09/15 02:09:42.094 WARNI|             retry:0228| <class 'urllib2.URLError'>(<urlopen error [Errno 111] Connection refused>)
09/15 02:09:42.095 WARNI|             retry:0183| Retrying in 1.709657 seconds...
09/15 02:09:43.810 WARNI|             retry:0228| <class 'urllib2.URLError'>(<urlopen error [Errno 111] Connection refused>)
09/15 02:09:43.810 ERROR|        dev_server:0434| Devserver call failed: "http://172.25.65.207:8082/check_health?", timeout: 2.0 seconds, Error: <class 'urllib2.URLError'>: <urlopen error [Errno 111] Connection refused>
09/15 02:09:44.061 DEBUG|        dev_server:0300| RPC call get_load has timed out on devserver 172.24.190.211.
09/15 02:09:44.062 ERROR|        dev_server:0434| Devserver call failed: "http://172.24.190.211:8082/check_health?", timeout: 2.0 seconds, Error: retry exception (label="get_load"), timeout = 2s
09/15 02:09:44.077 DEBUG|        dev_server:0300| RPC call get_load has timed out on devserver 172.25.61.81.
09/15 02:09:44.078 ERROR|        dev_server:0434| Devserver call failed: "http://172.25.61.81:8082/check_health?", timeout: 2.0 seconds, Error: retry exception (label="get_load"), timeout = 2s
09/15 02:09:44.092 INFO |    connectionpool:0207| Starting new HTTP connection (1): 100.108.1.59
09/15 02:10:07.480 DEBUG|    connectionpool:0387| "POST /symbolicate_dump?archive_url=gs://chromeos-image-archive/guado-release/R62-9901.16.0 HTTP/1.1" 200 71729
09/15 02:10:07.537 INFO | site_crashcollect:0124| Generated stack trace for dump /usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020450.5790.dmp
09/15 02:10:07.546 INFO |        server_job:0214| INFO	----	----	timestamp=1505466607	localtime=Sep 15 02:10:07	Start crashcollection record
09/15 02:10:07.546 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/crashinfo.chromeos1-row4-rack5-host3/chrome.20170915.020547.7270.dmp
09/15 02:10:07.546 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020351.3910.dmp
09/15 02:10:07.546 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020309.3755.dmp
09/15 02:10:07.547 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020304.664.dmp
09/15 02:10:07.547 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020412.4684.dmp
09/15 02:10:07.547 INFO |        server_job:0214| INFO	----	New Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/usr/local/autotest/results/142198393-chromeos-test/sysinfo/var/spool/crash/chrome.20170915.020450.5790.dmp
09/15 02:10:07.547 INFO |        server_job:0214| INFO	----	Orphaned Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/var/spool/crash/chrome.20170915.020547.7270.dmp
09/15 02:10:07.547 INFO |        server_job:0214| INFO	----	Orphaned Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/var/spool/crash/chrome.20170915.020547.7270.meta
09/15 02:10:07.548 INFO |        server_job:0214| INFO	----	Orphaned Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/var/spool/crash/chrome.20170915.020547.7270.core
09/15 02:10:07.548 INFO |        server_job:0214| INFO	----	Orphaned Crash Dump	timestamp=1505466607	localtime=Sep 15 02:10:07	/var/spool/crash/chrome.20170915.020547.7270.log
09/15 02:10:07.548 INFO |        server_job:0214| INFO	----	----	timestamp=1505466607	localtime=Sep 15 02:10:07	End crashcollection record
09/15 02:10:07.548 WARNI| site_crashcollect:0405| Crash detection failed with: [Errno 2] No such file or directory: '/var/spool/crash/chrome.20170915.020547.7270.meta'
09/15 02:10:07.549 WARNI| site_crashcollect:0405| Crash detection failed with: [Errno 2] No such file or directory: '/var/spool/crash/chrome.20170915.020547.7270.meta'
09/15 02:10:07.549 WARNI| site_crashcollect:0405| Crash detection failed with: [Errno 2] No such file or directory: '/var/spool/crash/chrome.20170915.020547.7270.meta'
09/15 02:10:07.549 WARNI| site_crashcollect:0405| Crash detection failed with: [Errno 2] No such file or directory: '/var/spool/crash/chrome.20170915.020547.7270.meta'
Regarding #6. that test failure is mainly because of flakiness on skipping oobe after enrollment. Created a separate bug for it crbug.com/766366

This bug is for increase in number of crashes.

kalin@ - who is the right owner for this to look at why there is such an increase in crashes in M62?
Number of crashes seems to have subsided to ~ 2-3 during the test run down from ~30 at its peak in earlier M62 builds.

https://chromeperf.appspot.com/report?sid=ffcaedf3f246ab9738d5cbe76df56b44a093db6563d9ef19c52cacd210d7f8f3

2-3 crashes are still too many compared to the past when there use to be 0 crashes for most run's. Still might be worth looking into.

Comment 9 by lndmrk@chromium.org, Oct 18 2017

Owner: harpreet@chromium.org
Status: Assigned (was: Untriaged)

Comment 10 by ka...@chromium.org, Oct 18 2017

Can you reproduce this with a device on a recovery image?
If so, lest get these crashes uploaded to crash server, and see if there are already bugs assigned.

What is the chrome crash failing thread stacktrace?
Cc: harpreet@chromium.org
Owner: dsunk...@chromium.org
Dinesh, please give it a try on a recovery image.

For lab runs, chrome crash dumps should also be stored as part of test logs. See if you can find some and upload those to fireplace.
Cc: egemih@chromium.org
Labels: M-63
Hi Harpreet,
I had a look at the crashes reported in the lab and are mostly related to "service-failure"

sig=ffe03e3a-imageloader
exec_name=service-failure
ver=10009.0.0
payload=/var/spool/crash/service_failure.20171011.095817.0.log
payload_size=5574
done=1

We have also reported similar crashes from manual test here@ https://bugs.chromium.org/p/chromium/issues/detail?id=773760


Comment 13 by mzhuo@chromium.org, Nov 18 2017

enterprise_CFM_USBPeripheralHotplugDetect.c920  report crashes like below in M63. 
https://wmatrix.googleplex.com/failures/unfiltered?suites=hotrod&tests=enterprise_CFM_USBPeripheralHotplugDetect.c920&days_back=7&hide_missing=True

11/17 07:50:11.964 INFO |    crash_detector:0049| CRASH DETECTED in /var/spool/crash: service_failure.20171117.074823.0.meta 
11/17 07:50:11.965 INFO |    crash_detector:0049| CRASH DETECTED in /var/spool/crash: service_failure.20171117.074743.0.meta


Comment 14 by mzhuo@chromium.org, Nov 18 2017

Summary: enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting lot of crashes on M-62 and M-63 (was: enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting lot of crashes on M-62)
Cc: katierh@chromium.org tovep@chromium.org lfg@chromium.org
Summary: enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting crashes on M-62 and M-63 (was: enterprise_CFM_USBPeripheralHotplugDetect.c920 / c930 - Detecting lot of crashes on M-62 and M-63)
Hi katie,
Can you please help find an owner for this issue?
Owner: egemih@chromium.org

Comment 17 by mzhuo@chromium.org, Nov 21 2017

Same type of failure happens to 9901.77.1.

Comment 18 by mzhuo@chromium.org, Nov 21 2017

Same type of failure happens to 9901.77.1 on Bluestreak setup. 

Comment 19 by mzhuo@chromium.org, Dec 20 2017

Labels: M-64
Same type of failure happens to  R64-10176.22.0 on almost all CFMs platforms.
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.4/debug/test_that.INFO  -A 3
12/20 14:10:37.419 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:10:37.478 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:10:37.479 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Dec 11 17:25 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121011.0.meta', '-rw-r--r-- 1 root root 6770 Dec 20 12:10 service_failure.20171220.121011.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:10:37.479 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.22/debug/test_that.INFO  -A 3
12/20 14:10:35.462 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:10:35.522 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:10:35.522 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Dec 11 16:49 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121004.0.meta', '-rw-r--r-- 1 root root 6528 Dec 20 12:10 service_failure.20171220.121004.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:10:35.522 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.3/debug/test_that.INFO  -A 3
12/20 14:10:41.623 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:10:41.697 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:10:41.698 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Oct 20 00:25 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121012.0.meta', '-rw-r--r-- 1 root root 6792 Dec 20 12:10 service_failure.20171220.121012.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:10:41.698 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.13/debug/test_that.INFO  -A 3
12/20 14:10:49.664 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:10:49.728 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:10:49.729 INFO | test_runner_utils:0199| autoserv| ---['total 28', 'drwxr-xr-x 5 root root 4096 Dec 19 14:21 ..', '-rw-r--r-- 1 root root  199 Dec 20 12:09 os-release', '-rw-r--r-- 1 root root  768 Dec 20 12:09 lsb-release', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121020.0.meta', '-rw-r--r-- 1 root root 6940 Dec 20 12:10 service_failure.20171220.121020.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:10:49.729 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.11/debug/test_that.INFO  -A 3
12/20 14:11:03.482 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:11:03.540 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:11:03.541 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Nov  1 15:40 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121035.0.meta', '-rw-r--r-- 1 root root 6805 Dec 20 12:10 service_failure.20171220.121035.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:11:03.541 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.12/debug/test_that.INFO  -A 3
12/20 14:10:48.446 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:10:48.502 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:10:48.503 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Nov 21 15:15 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121015.0.meta', '-rw-r--r-- 1 root root 6794 Dec 20 12:10 service_failure.20171220.121015.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:10:48.503 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.26/debug/test_that.INFO  -A 3
12/20 14:11:24.922 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:11:24.992 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:11:24.993 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Nov 22 15:43 ..', '-rw-r--r-- 1 root root 6838 Dec 20 12:10 service_failure.20171220.121031.0.log', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121031.0.meta', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']
12/20 14:11:24.993 INFO | test_runner_utils:0199| autoserv| 
mzhuo@minzhuo2:~/chromiumos/chroot/tmp/R64-10176.22.0$ grep "Start_verification:check_process_crash" 100.107.146.24/debug/test_that.INFO  -A 3
12/20 14:11:02.072 INFO | test_runner_utils:0199| autoserv| Start_verification:check_process_crash
12/20 14:11:02.141 INFO | test_runner_utils:0199| autoserv| ---ls -latr /var/spool/crash
12/20 14:11:02.142 INFO | test_runner_utils:0199| autoserv| ---['total 20', 'drwxr-xr-x 5 root root 4096 Dec 12 15:14 ..', '-rw-r--r-- 1 root root  147 Dec 20 12:10 service_failure.20171220.121029.0.meta', '-rw-r--r-- 1 root root 7211 Dec 20 12:10 service_failure.20171220.121029.0.log', 'drwxr-xr-t 2 root root 4096 Dec 20 12:10 .']

Same type of crash happens to Guado in our lab testing at high frequency too with R63, R64 and R65. With this crash no noticeable impact is observed. However, during longevity run either we bypass crashing detection, or script is aborted due to this crash. 

Comment 22 by tovep@chromium.org, Jan 11 2018

Labels: -M-62 -M-63

Comment 23 by tovep@chromium.org, Jan 12 2018

Labels: hotrod-platform-triaged

Comment 24 by tovep@chromium.org, Jan 16 2018

Labels: -M-64 M-65
Cc: -harpreet@chromium.org
Status: Started (was: Assigned)
Is this user facing at all? Or, are the tests in this suite realistic? (i.e. resemble anything we might see in the wild) If we want to hold CfM devices to higher standards similar to the ones in this suite, we should:
1. enable most/all debug flags in kernel and cros 
2. run these tests on those "special" devices, so it's easier to pinpoint where errors originate from and potentially be able to issue fixes. 

Thoughts?

Comment 27 by tovep@chromium.org, Mar 14 2018

Cc: -tovep@chromium.org

Comment 28 by kerl@google.com, Apr 22 2018

Status: WontFix (was: Started)
I'm closing this due to age and IIUC, no symptoms except a crash file.

Sign in to add a comment