New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 753793 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

system_health.memory_mobile failing on AndroidOne since Jul 27

Project Member Reported by u...@chromium.org, Aug 9 2017

Issue description

Last good revision: 489896 (bc3d1a8c8dbb) - https://build.chromium.org/p/chromium.perf/builders/Android%20One%20Perf/builds/326


Subsequent builds fail with:

(INFO) 2017-08-08 18:31:40,497 exception_formatter.PrintFormattedException:19  Try printing formatted exception: <class 'devil.android.device_errors.CommandTimeoutError'> Timed out waiting for 'verify_charging', wait of 1.0 secs required but only 0.6 secs left <traceback object at 0x7f74dd31fc20>
Traceback (most recent call last):
  _run_exitfuncs at /usr/lib/python2.7/atexit.py:24
    func(*targs, **kargs)
  _wrapped_function at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/util/atexit_with_log.py:11
    function(*args, **kwargs)
  _ReenableChargingIfNeeded at /b/swarming/w/ir/third_party/catapult/telemetry/telemetry/internal/platform/power_monitor/android_power_monitor_controller.py:12
    battery.SetCharging(True)
  timeout_retry_wrapper at /b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py:57
    retry_if_func=retry_if_func)
  Run at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py:159
    error_log_func=error_log_func)
  JoinAll at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py:186
    self._JoinAll(watcher, timeout)
  _JoinAll at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py:158
    thread.ReraiseIfException()
  run at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/reraiser_thread.py:81
    self._ret = self._func(*self._args, **self._kwargs)
  <lambda> at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py:152
    child_thread = reraiser_thread.ReraiserThread(lambda: func(*args, **kwargs),
  impl at /b/swarming/w/ir/third_party/catapult/devil/devil/android/decorators.py:47
    return f(*args, **kwargs)
  SetCharging at /b/swarming/w/ir/third_party/catapult/devil/devil/android/battery_utils.py:578
    self._HardwareSetCharging(enabled)
  _HardwareSetCharging at /b/swarming/w/ir/third_party/catapult/devil/devil/android/battery_utils.py:618
    timeout_retry.WaitFor(verify_charging, wait_period=1)
  WaitFor at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py:113
    msg='Timed out waiting for %r' % condition_name)
  GetRemainingTime at /b/swarming/w/ir/third_party/catapult/devil/devil/utils/timeout_retry.py:56
    raise reraiser_thread.TimeoutError(msg)
CommandTimeoutError: Timed out waiting for 'verify_charging', wait of 1.0 secs required but only 0.6 secs left

https://build.chromium.org/p/chromium.perf/builders/Android%20One%20Perf?numbuilds=200
 
Cc: rnep...@chromium.org
Components: Speed>Benchmarks
Owner: charliea@chromium.org
Hmhh, we should get rid of that android_power_monitor_controller code path. 
Also looks like the benchmark has been running (mostly) fine as of:
https://build.chromium.org/p/chromium.perf/builders/Android%20One%20Perf/builds/379 ?

Status: Assigned (was: Untriaged)
I'm currently investigating this. It seems like there are a couple of interesting questions here:

1) Why are we reenabling charging when this benchmark never _disables_ charging? It seems like the answer here is that, on Android, we always reenable charging, even if we never disabled it. Some simple bookkeeping or changing where we register the atexit handler could fix this.
2) Why are we timing out something that waits for a total of one second when we still have .6 seconds left? I haven't yet figured out the answer to this, but it seems like some bug in timeout_retry.py or how battery_utils interacts with that code.
For 2), I reached out to jbudorick@ offline. It seems like battery_utils.py isn't correctly using the interface to timeout_retry: I think what the battery_utils code is _trying_ to do is say "wait for a maximum of 1 second for this thing to be true", but it's instead being interpreted as "check if this condition is true until there's no longer any more time to do so and by the way, it takes 1 second to check, so if there's not that much time left, just give up".
I retract #2. After digging into the WaitFor code with jbudorick@ a little bit, it looks like we actually waited for a total of 29.4s, not 0.4s. This makes it seem a lot more like there happened to be a problem with adb / the device during this test than my previous belief that there was just a problem with the code.
That error means that the phone, from the perspective of devil/battery_utils, is not charging/is taking too long to properly charge. This can indicate a problem with the battery. A1s being lower end devices, I think a physical inspection of the device to make sure there isn't any battery bulging going on might be a good idea. 

Another possibility is that a previous test disabled charging, and the phone possibly got stuck in a state where it thinks its not charging and doesn't update the battery data. The _ReenableCharingIfNeeded() was created specifically to fight this. Although, I didn't think that these code paths should be exercised in anything other than old-school power metrics.
Cc: -nednguyen@chromium.org nedngu...@google.com

Comment 8 by zh...@chromium.org, Oct 16 2017

This test has been ok in early Sep. Now it is failing again. Will close this one and open a new bug.

Comment 9 by zh...@chromium.org, Oct 16 2017

Status: Fixed (was: Assigned)

Sign in to add a comment