New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 916811 link

Starred by 2 users

Issue metadata

Status: Started
Owner:
Last visit > 30 days ago
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

[Kernel_usb:Nocturne] Tests are failing with error messages "client failed to resume from sleep after 200 seconds" on M73

Project Member Reported by sontis@chromium.org, Dec 20

Issue description

Dut: Nocturne.
Host: chromeos15-row14b-rack10-host8

https://screenshot.googleplex.com/8J2c2RPRypu

logs are present at https://stainless.corp.google.com/browse/chromeos-autotest-results/268714635-chromeos-test/

From debug logs:
=========================================================================
2/19 09:37:50.408 DEBUG|             utils:0287| [stdout] 2 packets transmitted, 0 received, 100% packet loss, time 999ms
12/19 09:37:50.408 DEBUG|             utils:0287| [stdout] 
12/19 09:37:50.408 INFO |        server_job:0217| 	END FAIL	----	suspend	timestamp=1545241070	localtime=Dec 19 09:37:50	DUT is not responding after 90 seconds
  Traceback (most recent call last):
    File "/usr/local/autotest/server/server_job.py", line 964, in run_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 217, in suspend
      "DUT is not responding after %d seconds" % (time_counter))
  AutoservSuspendError: DUT is not responding after 90 seconds
12/19 09:38:17.347 ERROR|             utils:0287| [stderr] ssh: connect to host chromeos15-row14b-rack10-host8 port 22: Connection timed out
12/19 09:38:17.348 DEBUG|          ssh_host:0234| retry 2: restarting master connection
12/19 09:38:17.348 DEBUG|      abstract_ssh:0821| Restarting master ssh connection
12/19 09:38:17.348 INFO |     ssh_multiplex:0096| Starting master ssh connection '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_vaxUayssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos15-row14b-rack10-host8'
12/19 09:38:17.349 DEBUG|             utils:0219| Running '/usr/bin/ssh -a -x -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_vaxUayssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos15-row14b-rack10-host8'
12/19 09:38:47.202 ERROR|             utils:2770| Timed out waiting for condition: Wait for a socket file to exist
12/19 09:38:47.202 INFO |     ssh_multiplex:0113| Timed out waiting for master-ssh connection to be established.
12/19 09:39:50.403 ERROR|             utils:0287| [stderr] ssh: connect to host chromeos15-row14b-rack10-host8 port 22: Connection timed out
12/19 09:39:51.405 DEBUG|      abstract_ssh:0684| Host chromeos15-row14b-rack10-host8 is still down after waiting 221 seconds
12/19 09:39:51.405 DEBUG|              test:0381| Test failed due to client failed to resume from sleep after 200 seconds. Exception log follows the after_iteration_hooks.
12/19 09:39:51.405 DEBUG|              test:0386| Starting after_iteration_hooks for platform_ExternalUsbPeripherals.detect.crash_check_short
12/19 09:39:51.406 DEBUG|              test:0391| after_iteration_hooks completed
12/19 09:39:51.406 WARNI|              test:0606| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 600, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 800, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 464, in execute
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 371, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/platform_ExternalUsbPeripherals/platform_ExternalUsbPeripherals.py", line 338, in run_once
    self.action_resume(boot_id)
  File "/usr/local/autotest/server/site_tests/platform_ExternalUsbPeripherals/platform_ExternalUsbPeripherals.py", line 123, in action_resume
    self.host.test_wait_for_resume(boot_id, _LONG_TIMEOUT)
  File "/usr/local/autotest/server/hosts/cros_host.py", line 1383, in test_wait_for_resume
    resume_timeout)
TestFail: client failed to resume from sleep after 200 seconds
12/19 09:39:51.406 DEBUG|              test:0611| Running cleanup for test.
=======================================================================

Note: This issue is specific to M73 builds. Tests are passing on M71&M72



AIs:
-- Try to reproduce manually on M73 build.


 
Cc: ravisadineni@chromium.org
Since R73-11332.0.0, when the 'Enable dark resume' change got in
https://chromium-review.googlesource.com/c/chromiumos/overlays/board-overlays/+/1355436/ for issue 820668

Basically the test is calling host.suspend(), which is same as  powerd_dbus_suspend. The test log indicates the actual command at SUSPEND step:

12/19 09:36:05.509 INFO |platform_ExternalU:0335| STEP 1.6. SUSPEND
12/19 09:36:05.516 DEBUG|          ssh_host:0310| Running (ssh) 'if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi' from '_call_run_once|run_once|action_suspend|get_boot_id|run|run_very_slowly'
12/19 09:36:05.815 DEBUG|             utils:0287| [stdout] ec8071ba-460a-40bb-b02a-02e345349165
12/19 09:36:05.864 DEBUG|             utils:0219| Running 'ping chromeos15-row14b-rack10-host8 -w1 -c1'
12/19 09:36:05.864 INFO |        server_job:0217| 	START	----	suspend	timestamp=1545240965	localtime=Dec 19 09:36:05	
12/19 09:36:05.867 INFO |        server_job:0217| 		GOOD	----	suspend.start for 30 seconds	timestamp=1545240965	localtime=Dec 19 09:36:05	
12/19 09:36:05.871 DEBUG|          ssh_host:0310| Running (ssh) '( echo 0 > /sys/class/rtc/rtc0/wakealarm && echo +30 > /sys/class/rtc/rtc0/wakealarm && powerd_dbus_suspend --delay=0 ) </dev/null >/dev/null 2>&1 & echo -n $!' from 'log_op|run_op|suspend|run_background|run|run_very_slowly'

i.e.

echo 0 > /sys/class/rtc/rtc0/wakealarm && echo +30 > /sys/class/rtc/rtc0/wakealarm && powerd_dbus_suspend --delay=0 
Tested on M73_11429.0.0
Able to wake up the device using keyboard, trackpad and power button.

Issue is with "echo 0 > /sys/class/rtc/rtc0/wakealarm && echo +30 > /sys/class/rtc/rtc0/wakealarm && powerd_dbus_suspend --delay=0" command

M72 device is waking up automatically after 30 seconds.
M73 device is failing to wake up automatically after 30 seconds.


Debug logs are present at https://pantheon.corp.google.com/storage/browser/chromiumos-test-logs/bugfiles/cros/916811/?debugUI=CLOUD

Cc: -pgangishetty@chromium.org tbroch@chromium.org
This shld fix it https://crrev.com/c/1385683.

Owner: ravisadineni@chromium.org
Status: Started (was: Untriaged)
Project Member

Comment 6 by bugdroid1@chromium.org, Jan 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crosutils/+/e7ed1a6967f12dd8d2e857bea094a92478cf0584

commit e7ed1a6967f12dd8d2e857bea094a92478cf0584
Author: RaviChandra Sadineni <ravisadineni@google.com>
Date: Tue Jan 08 03:40:48 2019

Disable dark resume at the board level on test images.

Several tests depend on RTC to wakeup from suspend. But on devices
with dark resume enabled, powerd will suspend the device immediately
on wake from RTC. This can interfere with the tests. Thus
200disableDarkResume is expected to disable dark resume on test images
so that a wake from RTC will result in a full resume. When dark resume
is enabled on specifice device, the pref file gets written into
board_specific folder which takes precedence.
Thus override board_specific file instead. Tests that have to test dark
resume will override the /var/lib/disable_dark_resume and restart
powerd.

BUG=chromium:916811
TEST=None

Change-Id: I3cb74ab1dc9850df744fe229810f6ae583b65df0
Reviewed-on: https://chromium-review.googlesource.com/1385683
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Ravi Chandra Sadineni <ravisadineni@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>

[modify] https://crrev.com/e7ed1a6967f12dd8d2e857bea094a92478cf0584/mod_for_test_scripts/200disableDarkResume

Comment 7 by tbroch@chromium.org, Jan 18 (5 days ago)

Cc: pgangishetty@chromium.org derat@chromium.org
 Issue 920414  has been merged into this issue.

Sign in to add a comment