Issue 702806

Issue metadata

Status: Verified
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug


[auron_paine, cyan] Kernel crash files generated later than usual.

Project Member Reported by ka...@chromium.org, Mar 17 2017

Issue description

Started with R59-9374.0.0

Results at https://wmatrix.googleplex.com/unfiltered?platforms=auron_paine,cyan&suites=usb_detect&days_back=7&releases=tot&hide_missing=True&tests=logging_GenerateCrash*

In the logs of the failing runs, I see that connectivity to the DUT is lost right after 'sync':
 
03/17 08:07:38.269 DEBUG|          ssh_host:0284| Running (ssh) 'echo BUG > /sys/kernel/debug/provoke-crash/DIRECT'
03/17 08:07:54.092 DEBUG|logging_GenerateCr:0067| Crash invoked!
03/17 08:07:54.093 DEBUG|        base_utils:0185| Running 'ping chromeos1-row1-rack3-host5 -w1 -c1'
03/17 08:07:54.177 DEBUG|        base_utils:0280| [stdout] PING chromeos1-row1-rack3-host5.cros.corp.google.com (172.27.212.73) 56(84) bytes of data.
03/17 08:07:54.177 DEBUG|        base_utils:0280| [stdout] 64 bytes from chromeos-rack8a-host1.mtv.corp.google.com (172.27.212.73): icmp_seq=1 ttl=58 time=37.2 ms
03/17 08:07:54.177 DEBUG|        base_utils:0280| [stdout] 
03/17 08:07:54.177 DEBUG|        base_utils:0280| [stdout] --- chromeos1-row1-rack3-host5.cros.corp.google.com ping statistics ---
03/17 08:07:54.177 DEBUG|        base_utils:0280| [stdout] 1 packets transmitted, 1 received, 0% packet loss, time 0ms
03/17 08:07:54.178 DEBUG|        base_utils:0280| [stdout] rtt min/avg/max/mdev = 37.254/37.254/37.254/0.000 ms
03/17 08:07:54.178 DEBUG|          ssh_host:0284| Running (ssh) 'sync'

---------------------- Not observed in passing case --------------------------
03/17 08:07:54.185 INFO |      abstract_ssh:0795| Master ssh connection to chromeos1-row1-rack3-host5 is down.
03/17 08:07:54.186 DEBUG|      abstract_ssh:0756| Nuking master_ssh_job.
03/17 08:07:54.186 DEBUG|      abstract_ssh:0762| Cleaning master_ssh_tempdir.
03/17 08:07:54.186 INFO |      abstract_ssh:0809| Starting master ssh connection '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_G25Pv3ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos1-row1-rack3-host5'
03/17 08:07:54.186 DEBUG|        base_utils:0185| Running '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_G25Pv3ssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos1-row1-rack3-host5'
------------------------------------------------------------------------------

03/17 08:07:59.274 DEBUG|          ssh_host:0284| Running (ssh) 'ls /var/spool/crash'
03/17 08:07:59.451 INFO |logging_GenerateCr:0035| Crash files diff: set([])
03/17 08:07:59.451 DEBUG|              test:0389| Test failed due to set(['kcrash', 'meta']) files not generated.. Exception log follows the after_iteration_hooks.


I also see that the kernel crash file does get created, but only a short (or not so short) time after we 'ls' for it (0.5 - 4 seconds):

For auron_paine:
2017-03-17T09:50:40.501491-07:00 NOTICE ag[2597]: autotest server[stack::_call_run_once|run_once|check_missing_crash_files] -> ssh_run(ls /var/spool/crash)
2017-03-17T09:50:40.574055-07:00 INFO sshd[2594]: Received disconnect from 127.0.0.1 port 57172:11: disconnected by user
2017-03-17T09:50:40.574102-07:00 INFO sshd[2594]: Disconnected from 127.0.0.1 port 57172
2017-03-17T09:50:40.634513-07:00 INFO crash_reporter[2609]: Enabling kernel crash handling
2017-03-17T09:50:40.635795-07:00 INFO crash_reporter[2609]: Received prior crash notification from kernel (signature kernel-lkdtm_do_action-806CC39A) (developer build - always dumping)
2017-03-17T09:50:40.636208-07:00 INFO crash_reporter[2609]: Stored kcrash to /var/spool/crash/kernel.20170317.095040.0.kcrash
2017-03-17T09:50:40.636242-07:00 WARNING crash_reporter[2609]: Last shutdown was not clean
2017-03-17T09:50:40.636306-07:00 WARNING crash_reporter[2609]: Could not load the device policy file.




For cyan:
2017-03-17T00:25:20.192251-07:00 NOTICE ag[2464]: autotest server[stack::_call_run_once|run_once|check_missing_crash_files] -> ssh_run(ls /var/spool/crash)
2017-03-17T00:25:20.261154-07:00 INFO sshd[2457]: Received disconnect from 127.0.0.1 port 45476:11: disconnected by user
2017-03-17T00:25:20.261238-07:00 INFO sshd[2457]: Disconnected from 127.0.0.1 port 45476
2017-03-17T00:25:20.687015-07:00 INFO sshd[2553]: Accepted publickey for root from 127.0.0.1 port 45478 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-03-17T00:25:20.844551-07:00 NOTICE ag[2560]: autotest server[stack::wrapper|_install|path_exists] -> ssh_run(test -e "/tmp/sysinfo/autoserv-ORwaly")
2017-03-17T00:25:20.914019-07:00 INFO sshd[2553]: Received disconnect from 127.0.0.1 port 45478:11: disconnected by user
2017-03-17T00:25:20.914142-07:00 INFO sshd[2553]: Disconnected from 127.0.0.1 port 45478
2017-03-17T00:25:21.251151-07:00 INFO sshd[2600]: Accepted publickey for root from 127.0.0.1 port 45479 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-03-17T00:25:21.407684-07:00 NOTICE ag[2611]: autotest server[stack::wait_up|is_up|ssh_ping] -> ssh_run(true)
2017-03-17T00:25:21.473617-07:00 INFO sshd[2600]: Received disconnect from 127.0.0.1 port 45479:11: disconnected by user
2017-03-17T00:25:21.473709-07:00 INFO sshd[2600]: Disconnected from 127.0.0.1 port 45479
2017-03-17T00:25:21.905394-07:00 INFO sshd[2615]: Accepted publickey for root from 127.0.0.1 port 45481 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-03-17T00:25:22.061216-07:00 NOTICE ag[2634]: autotest server[stack::install|_install|_install] -> ssh_run(mkdir -p /tmp/sysinfo/autoserv-ORwaly)
2017-03-17T00:25:22.129928-07:00 INFO sshd[2615]: Received disconnect from 127.0.0.1 port 45481:11: disconnected by user
2017-03-17T00:25:22.130021-07:00 INFO sshd[2615]: Disconnected from 127.0.0.1 port 45481
2017-03-17T00:25:22.910683-07:00 INFO sshd[2641]: Accepted publickey for root from 127.0.0.1 port 45482 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-03-17T00:25:23.066785-07:00 NOTICE ag[2646]: autotest server[stack::install|_install|_install] -> ssh_run(rm -rf /tmp/sysinfo/autoserv-ORwaly/results/*)
2017-03-17T00:25:23.135422-07:00 INFO sshd[2641]: Received disconnect from 127.0.0.1 port 45482:11: disconnected by user
2017-03-17T00:25:23.135515-07:00 INFO sshd[2641]: Disconnected from 127.0.0.1 port 45482
2017-03-17T00:25:24.401556-07:00 INFO sshd[2667]: Accepted publickey for root from 127.0.0.1 port 45483 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-03-17T00:25:24.558006-07:00 NOTICE ag[2672]: autotest server[stack::_install|_install|_install_using_packaging] -> ssh_run(cd /tmp/sysinfo/autoserv-ORwaly && ls | grep -v "^packages$" | xargs rm -rf && rm -rf .[!.]*)
2017-03-17T00:25:24.626512-07:00 INFO sshd[2667]: Received disconnect from 127.0.0.1 port 45483:11: disconnected by user
2017-03-17T00:25:24.626597-07:00 INFO sshd[2667]: Disconnected from 127.0.0.1 port 45483
2017-03-17T00:25:24.897754-07:00 INFO crash_reporter[2710]: Enabling kernel crash handling
2017-03-17T00:25:24.901231-07:00 INFO crash_reporter[2710]: Received prior crash notification from kernel (signature kernel-lkdtm_do_action-806CC39A) (developer build - always dumping)
2017-03-17T00:25:24.902603-07:00 INFO crash_reporter[2710]: Stored kcrash to /var/spool/crash/kernel.20170317.002524.0.kcrash
2017-03-17T00:25:24.902875-07:00 WARNING crash_reporter[2710]: Last shutdown was not clean
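
The gap between the 'ls' check and the kcrash being stored can be read straight off the syslog timestamps. The following is a minimal sketch, not part of the autotest code; the function name `crash_file_delay` is hypothetical, and it assumes every line starts with an ISO 8601 timestamp as in the excerpts above. On those two excerpts it gives roughly 0.13 s for auron_paine and 4.7 s for cyan.

```python
from datetime import datetime


def crash_file_delay(log_lines):
    """Seconds between the 'ls /var/spool/crash' check and 'Stored kcrash'.

    Hypothetical helper. Returns None if either line is missing.
    Requires Python 3.7+ for datetime.fromisoformat.
    """
    ls_time = None
    for line in log_lines:
        # The timestamp is the first space-separated field of each line.
        stamp, _, rest = line.partition(" ")
        if "ssh_run(ls /var/spool/crash)" in rest:
            ls_time = datetime.fromisoformat(stamp)
        elif "Stored kcrash to" in rest and ls_time is not None:
            return (datetime.fromisoformat(stamp) - ls_time).total_seconds()
    return None


# Two lines copied from the auron_paine excerpt above.
auron_lines = [
    "2017-03-17T09:50:40.501491-07:00 NOTICE ag[2597]: autotest server[stack::_call_run_once|run_once|check_missing_crash_files] -> ssh_run(ls /var/spool/crash)",
    "2017-03-17T09:50:40.636208-07:00 INFO crash_reporter[2609]: Stored kcrash to /var/spool/crash/kernel.20170317.095040.0.kcrash",
]
print(crash_file_delay(auron_lines))  # 0.134717
```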





(I also see several kernel (WARNING) crashes on the cyan DUT at boot time after the BUG crash is invoked.)



Cyan sample logs - https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/107218234-chromeos-test/chromeos1-row1-rack3-host5/logging_GenerateCrashFiles.KERNEL/

Auron_paine sample logs - https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/107228418-chromeos-test/chromeos1-row1-rack3-host4/logging_GenerateCrashFiles.KERNEL/


 

Comment 1 by ka...@chromium.org, Mar 17 2017

Owner: sontis@chromium.org
Summary: [auron_paine, cyan] Kernel crash files generated later than usual. (was: [auron_paine, cyan] Kernel crash files not generated in autotests.)
sontis@, please increase the sleep time after 'sync' and before checking for crash files.
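
As an alternative to a longer fixed sleep, the check could poll until the files appear. This is only a sketch and not what the test does: the name `wait_for_crash_files`, its parameters, and the `list_files` callable are all hypothetical. The expected suffixes 'kcrash' and 'meta' are taken from the failure message in the description.

```python
import time


def wait_for_crash_files(list_files, expected_suffixes, timeout=10.0,
                         interval=0.5, sleep=time.sleep, clock=time.monotonic):
    """Poll until a file with each expected suffix exists or the timeout expires.

    Hypothetical helper, not part of autotest. `list_files` is any callable
    returning the current file names in the crash directory, for example a
    wrapper around running 'ls /var/spool/crash' on the DUT.
    Returns the set of suffixes still missing (empty set on success).
    """
    deadline = clock() + timeout
    while True:
        names = list_files()
        missing = {suffix for suffix in expected_suffixes
                   if not any(name.endswith("." + suffix) for name in names)}
        if not missing or clock() >= deadline:
            return missing
        sleep(interval)
```

A caller would fail the test only if the returned set is non-empty, so a DUT that writes its crash files quickly is not held up for the full timeout.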

Comment 2 by ka...@chromium.org, Mar 17 2017

Description: Show this description

Comment 3 by bugdroid1@chromium.org, Mar 18 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/36b5257fe409e1018e89ea39cc27b9f13a383f64

commit 36b5257fe409e1018e89ea39cc27b9f13a383f64
Author: Sridhar Sonti <sontis@chromium.org>
Date: Sat Mar 18 05:00:39 2017

Increased SHORT_WAIT time to 10 seconds.

BUG= chromium:702806 
TEST=None

Change-Id: I2de89bec85b9f9503813893b683bd9ac702bd2d6
Reviewed-on: https://chromium-review.googlesource.com/456923
Commit-Ready: Sridhar Sonti <sontis@chromium.org>
Tested-by: Sridhar Sonti <sontis@chromium.org>
Reviewed-by: Ruchi Jahagirdar <rjahagir@chromium.org>
Reviewed-by: Kalin Stoyanov <kalin@chromium.org>

[modify] https://crrev.com/36b5257fe409e1018e89ea39cc27b9f13a383f64/server/site_tests/logging_GenerateCrashFiles/logging_GenerateCrashFiles.py

Comment 4 by ka...@chromium.org, Mar 20 2017

Status: Verified (was: Untriaged)
Test is passing now. Thanks Sridhar.

Comment 5 by bugdroid1@chromium.org, Apr 27 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/80e53e57c5e5d00ce20d10d84d685413b4fa11fb

commit 80e53e57c5e5d00ce20d10d84d685413b4fa11fb
Author: Sunny Sachanandani <sunnyps@chromium.org>
Date: Fri Apr 27 01:11:50 2018

Remove debugging code for swap chain creation failure

Bug was due to creating swap chain with zero size, and has been fixed.

R=zmo
BUG= 702806 

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: Id32bcff3cd4584cb5e0a96ac461597c4dfea2630
Reviewed-on: https://chromium-review.googlesource.com/1031515
Reviewed-by: Zhenyao Mo <zmo@chromium.org>
Commit-Queue: Sunny Sachanandani <sunnyps@chromium.org>
Cr-Commit-Position: refs/heads/master@{#554248}
[modify] https://crrev.com/80e53e57c5e5d00ce20d10d84d685413b4fa11fb/gpu/ipc/service/direct_composition_child_surface_win.cc
[modify] https://crrev.com/80e53e57c5e5d00ce20d10d84d685413b4fa11fb/gpu/ipc/service/direct_composition_surface_win.cc
