New issue
Advanced search Search tips

Issue 667610 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocking:
issue 555782



Sign in to add a comment

Tegra K1 systems reporting incoherent (negative) resume time in S2R

Project Member Reported by coconutruben@chromium.org, Nov 22 2016

Issue description

When running power_Resume on Tegra K1 boards (nyan family) we see consistently incoherent time measurements reported about resume, namely that firmware bring-up time was negative. (examples below).

There seems to be a significant difference reported in bring-up time between hwclock and kernel timestamps.

The problem is that the math assumes that total bring up time for the board has to be bigger than ec + kernel, bringing us to:

board_up = total_up - kernel_up - firmware_up

But in these cases, kernel_up is actually about 4s, while total_up is about a second. Total up is calculated based on the HWCLOCKFILE and the rtc.py utility setting an alarm, while kernel_up is calculated by having powerd search for log lines that mean resume-start and resume-end, so something there is not working as expected

Boards affected:
nyan-blaze
nyan-big
nyan-kitty

Sample results from all above and veyron-jerry (for ARM reference):
nyan-blaze:
seconds_dev_0-0010_resume{perf}=0.034416
seconds_dev_1-0015_resume{perf}=0.107711
seconds_dev_1-2.1_resume{perf}=0.056603
seconds_dev_1-2_suspend{perf}=0.088179
seconds_dev_70030000.hda_resume{perf}=0.059169
seconds_dev_70030000.hda_suspend{perf}=0.139023
seconds_dev_70090000.usb_resume{perf}=0.053462
seconds_dev_700b0000.sdhci_suspend{perf}=0.116184
seconds_dev_700b0600.sdhci_resume{perf}=0.941404
seconds_dev_usb1_resume{perf}=0.185136
seconds_system_resume_firmware_cpu{perf}=0
seconds_system_resume_firmware_ec{perf}=-3.65303699809
seconds_system_resume_firmware{perf}=-3.65303699809
seconds_system_resume_kernel_cpu{perf}=3.696576
seconds_system_resume_kernel_dev{perf}=0.965772
seconds_system_resume_kernel{perf}=4.660412
seconds_system_resume{perf}=1.00737500191
seconds_system_suspend{perf}=0.188161

nyan-big:
seconds_dev_0-0010_resume{perf}=0.034555
seconds_dev_1-0015_resume{perf}=0.114335
seconds_dev_1-004a_resume{perf}=0.033372
seconds_dev_1-2_resume{perf}=0.08651
seconds_dev_70030000.hda_resume{perf}=0.059809
seconds_dev_70030000.hda_suspend{perf}=0.143084
seconds_dev_70090000.usb_resume{perf}=0.063344
seconds_dev_700b0000.sdhci_suspend{perf}=0.115156
seconds_dev_700b0600.sdhci_resume{perf}=0.998274
seconds_dev_usb1_resume{perf}=0.126649
seconds_system_resume_firmware_cpu{perf}=0
seconds_system_resume_firmware_ec{perf}=-3.00709506979
seconds_system_resume_firmware{perf}=-3.00709506979
seconds_system_resume_kernel_cpu{perf}=3.046227
seconds_system_resume_kernel_dev{perf}=1.02407
seconds_system_resume_kernel{perf}=4.068104
seconds_system_resume{perf}=1.06100893021
seconds_system_suspend{perf}=0.198106

nyan-kitty:
seconds_dev_1-2_resume{perf}=0.086702
seconds_dev_1-3.2_resume{perf}=0.115603
seconds_dev_1-3.2_suspend{perf}=0.04788
seconds_dev_1-3_resume{perf}=0.088506
seconds_dev_70030000.hda_resume{perf}=0.058824
seconds_dev_70030000.hda_suspend{perf}=0.141733
seconds_dev_70090000.usb_resume{perf}=0.063288
seconds_dev_700b0000.sdhci_suspend{perf}=0.110109
seconds_dev_700b0600.sdhci_resume{perf}=0.99832
seconds_dev_usb1_resume{perf}=0.067829
seconds_system_resume_firmware_cpu{perf}=0
seconds_system_resume_firmware_ec{perf}=-3.64791600887
seconds_system_resume_firmware{perf}=-3.64791600887
seconds_system_resume_kernel_cpu{perf}=3.693894
seconds_system_resume_kernel_dev{perf}=1.024169
seconds_system_resume_kernel{perf}=4.715887
seconds_system_resume{perf}=1.06797099113
seconds_system_suspend{perf}=0.207672

veyron-jerry:
seconds_dev_1-1_resume{perf}=0.836824
seconds_dev_2-0010_resume{perf}=0.02903
seconds_dev_2-1_resume{perf}=0.0555
seconds_dev_3-1_resume{perf}=0.055298
seconds_dev_4-0015_resume{perf}=0.112429
seconds_dev_ff500000.usb_suspend{perf}=0.018475
seconds_dev_mmc0-0001_suspend{perf}=0.038248
seconds_dev_mmc2-0001_suspend{perf}=0.106872
seconds_dev_usb1_resume{perf}=0.209786
seconds_dev_usb3_resume{perf}=0.017674
seconds_system_resume_firmware_cpu{perf}=0
seconds_system_resume_firmware_ec{perf}=0.0836409621277
seconds_system_resume_firmware{perf}=0.0836409621277
seconds_system_resume_kernel_cpu{perf}=0.004322
seconds_system_resume_kernel_dev{perf}=1.072691
seconds_system_resume_kernel{perf}=1.078796
seconds_system_resume{perf}=1.16243696213
seconds_system_suspend{perf}=0.200017
 
Components: OS>Kernel OS>Kernel>Power
Labels: Arch-ARM
There seems to be an issue with how uptime, and the kernel timestamps are kept in sync.
In the beginning of the resume sequence, around the time we ask for the filesystem to sync, we see this:

2016-11-22T15:17:35.578265-08:00 INFO kernel: [  112.560164] Loaded LP0 firmware with request_firmware.
2016-11-22T15:17:39.962666-08:00 INFO kernel: [  112.566107] PM: Syncing filesystems ... done.
2016-11-22T15:17:39.962694-08:00 DEBUG kernel: [  112.590676] PM: Preparing system for mem sleep

So the kernel timestamp jumps ~4 seconds forward, while the uptime counter remains constant.

Then later, when the device wakes, the kernel timestamp barely increases, while the uptime counter leaps forward (as expected)

2016-11-22T15:17:39.969598-08:00 INFO kernel: [  112.788738] PMC wake enable = 0x40c028000040900
2016-11-22T15:17:39.969600-08:00 INFO kernel: [  112.789638] Entering suspend state LP0
2016-11-22T15:17:39.969606-08:00 INFO kernel: [  115.983301] PMC wake status = 0x40000
2016-11-22T15:17:39.969609-08:00 INFO kernel: [  115.983313] Resume caused by PMC WAKE18, 7000d000.i2c

On other devices (lulu, veyron_jerry, below log-lines (adjusted) pasted in for reference) we see a similar jump in the beginning, but not a similar jump at the end.

Questions:
(1) is the time jump in the beginning of the suspend-sequence expected/meaningful?
(1-2) what's causing it
(2) what's causing the 2nd jump in the end on Tegra K1 systems.


veyron_jerry:
(a)
2016-11-22T15:47:04.004955-08:00 INFO kernel: [3126087.289558] PM: Syncing filesystems ... done.
2016-11-22T15:47:04.005003-08:00 DEBUG kernel: [3126087.299715] PM: Preparing system for mem sleep
2016-11-22T15:47:10.069409-08:00 NOTICE kernel: [3126087.308266] Freezing user space processes ... 
2016-11-22T15:47:10.069474-08:00 NOTICE kernel: [3126087.310310] Freezing remaining freezable tasks 

(b)
2016-11-22T15:47:10.072798-08:00 NOTICE kernel: [3126087.499723] CPU2: shutdown
2016-11-22T15:47:10.072802-08:00 NOTICE kernel: [3126087.501728] CPU3: shutdown
2016-11-22T15:47:10.072807-08:00 INFO kernel: [3126087.503337] Enabling non-boot CPUs ...
2016-11-22T15:47:10.072814-08:00 NOTICE kernel: [3126087.504547] CPU1: Booted secondary processor
2016-11-22T15:47:10.072818-08:00 INFO kernel: [3126087.504829] CPU1 is up

lulu:
(a)
2016-11-22T23:39:32.836894+00:00 NOTICE powerd_suspend[8710]: Finalizing suspend
2016-11-22T23:39:32.841866+00:00 INFO kernel: [37496.878553] PM: suspend entry 2016-11-22 23:39:32.841181396 UTC
2016-11-22T23:39:38.316446+00:00 INFO kernel: [37496.878571] PM: Syncing filesystems ... done.
2016-11-22T23:39:38.317134+00:00 DEBUG kernel: [37496.886211] PM: Preparing system for mem sleep

(b)
2016-11-22T23:39:38.318272+00:00 NOTICE kernel: [37497.069245] Disabling non-boot CPUs ...
2016-11-22T23:39:38.318274+00:00 INFO kernel: [37497.070663] smpboot: CPU 1 is now offline
2016-11-22T23:39:38.318276+00:00 INFO kernel: [37497.072705] Disabling VMX on cpu 0
2016-11-22T23:39:38.318279+00:00 INFO kernel: [37497.072716] TSC at resume: 380552836


Project Member

Comment 3 by bugdroid1@chromium.org, Jan 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/7fd15a46c739534c63c1ccba7e2baa2e3a02ec37

commit 7fd15a46c739534c63c1ccba7e2baa2e3a02ec37
Author: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Date: Thu Dec 08 22:01:51 2016

power_suspend: differentiating between end_suspend_time and start_resume_time

On Tegra K1 devices, the uptime counter does not pause when we suspend.
This leads to false values in our suspend tests, because we assume that
end_suspend_time == start_resume time. Change adds an explicit getter
inside send_metrics_on_resume for end_suspend_time.

Additionally, I updated the list of messages to more accurately reflect
the last, and first message in the suspend/resume path for each system
we use.

BUG= chromium:667610 
TEST=manual testing on nyan_big, and lulu

Change-Id: Ie0a7862a83eb15f0724fb92b1f34463ce9fdfd0d
Reviewed-on: https://chromium-review.googlesource.com/418221
Commit-Ready: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Tested-by: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>

[modify] https://crrev.com/7fd15a46c739534c63c1ccba7e2baa2e3a02ec37/power_manager/tools/send_metrics_on_resume

Project Member

Comment 4 by bugdroid1@chromium.org, Jan 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/c4e257d87e1861c345b02baef217bc0f503628e6

commit c4e257d87e1861c345b02baef217bc0f503628e6
Author: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Date: Thu Jan 19 22:15:32 2017

power: remove trailing \ that made send_metrics_on_resume script abort

get_end_suspend_time had a trailing \ that made the script fail,
and it's now removed

BUG= chromium:667610 
TEST=manual testing on samus

Change-Id: I44b50c01cf2f85419742c8aa8cc6c66d6d8291b2
Reviewed-on: https://chromium-review.googlesource.com/430202
Commit-Ready: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Tested-by: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>

[modify] https://crrev.com/c4e257d87e1861c345b02baef217bc0f503628e6/power_manager/tools/send_metrics_on_resume

Project Member

Comment 5 by bugdroid1@chromium.org, Jan 27 2017

Labels: merge-merged-release-R57-9202.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/89073104af43d4d89952df1944f155566d0c8d43

commit 89073104af43d4d89952df1944f155566d0c8d43
Author: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Date: Thu Jan 19 22:15:32 2017

power: remove trailing \ that made send_metrics_on_resume script abort

get_end_suspend_time had a trailing \ that made the script fail,
and it's now removed

BUG= chromium:667610 
TEST=manual testing on samus

Change-Id: I44b50c01cf2f85419742c8aa8cc6c66d6d8291b2
Reviewed-on: https://chromium-review.googlesource.com/430202
Commit-Ready: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Tested-by: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>
(cherry picked from commit c4e257d87e1861c345b02baef217bc0f503628e6)
Reviewed-on: https://chromium-review.googlesource.com/434243
Commit-Queue: Ruben Rodriguez Buchillon <coconutruben@chromium.org>

[modify] https://crrev.com/89073104af43d4d89952df1944f155566d0c8d43/power_manager/tools/send_metrics_on_resume

Project Member

Comment 6 by bugdroid1@chromium.org, Jan 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/89073104af43d4d89952df1944f155566d0c8d43

commit 89073104af43d4d89952df1944f155566d0c8d43
Author: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Date: Thu Jan 19 22:15:32 2017

power: remove trailing \ that made send_metrics_on_resume script abort

get_end_suspend_time had a trailing \ that made the script fail,
and it's now removed

BUG= chromium:667610 
TEST=manual testing on samus

Change-Id: I44b50c01cf2f85419742c8aa8cc6c66d6d8291b2
Reviewed-on: https://chromium-review.googlesource.com/430202
Commit-Ready: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Tested-by: Ruben Rodriguez Buchillon <coconutruben@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>
Reviewed-by: Todd Broch <tbroch@chromium.org>
(cherry picked from commit c4e257d87e1861c345b02baef217bc0f503628e6)
Reviewed-on: https://chromium-review.googlesource.com/434243
Commit-Queue: Ruben Rodriguez Buchillon <coconutruben@chromium.org>

[modify] https://crrev.com/89073104af43d4d89952df1944f155566d0c8d43/power_manager/tools/send_metrics_on_resume

Status: Assigned (was: Untriaged)
Ruben is this fixed?
It should be. I'll verify tomorrow morning and mark fixed 
Status: Verified (was: Assigned)
check on wmatrix test runs, to see what prior to the fix, nyan* devices would report a negative time. Now they report all positive times.

Sign in to add a comment