Issue 884913

Issue metadata

Status: Fixed
Owner:
Closed: Oct 25
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 1
Type: Feature

Blocked on:
issue 804082
issue 884226
issue 884925
issue 885337
issue 885381



Add instrumentation to GPU tests on macOS to display screen lock detection

Project Member Reported by kbr@chromium.org, Sep 17

Issue description

In Issue 884226 and many other recent issues, a common failure mode for the Chrome team's GPU bots is that the screen lock on one of the laptops has been activated because the machine previously hibernated and later woke up.

kainino@ pointed out that a potential way to detect this is described in https://stackoverflow.com/a/11511419 .

We should modify the GPU tests' base harness to dump this information on macOS and see if we can detect this failure mode. (Alternatively, we can manually provoke this state on one of our Mac laptops and see whether we can detect it.) If we can, then we can add this to Swarming, and auto-quarantine machines in this state.

 
Owner: bsheedy@chromium.org
Status: Assigned (was: Available)
I can probably take this, as I have some spare cycles atm. A couple of questions though:

1. Is the base harness you're referring to run_gpu_integration_test.py? (https://cs.chromium.org/chromium/src/content/test/gpu/run_gpu_integration_test.py)

2. I assume we only care about whether the screen is locked or not (the value of CGSSessionScreenIsLocked), and not whether the screen is on or not (kCGSSessionOnConsoleKey)?
Cc: d...@chromium.org
Fantastic, thanks Brian for picking this up.

The first thing we should do is dump all of the information (CGSSessionScreenIsLocked, kCGSSessionOnConsoleKey, and anything else relevant), and see if we can correlate it with the failure mode we're seeing on the laptops in the golo.

These machines are designed to auto-login, and this process fails if the machine has hibernated and then been woken up (I think). Can you sync up with dba@ to try to figure out how to reproduce the failure mode on demand?

Similarly to the split between color_profile_manager.py and color_profile_manager_mac.py:

https://cs.chromium.org/chromium/src/content/test/gpu/gpu_tests/color_profile_manager.py?q=color_profile_manager.py&sq=package:chromium&g=0&l=1

https://cs.chromium.org/chromium/src/content/test/gpu/gpu_tests/color_profile_manager_mac.py?sq=package:chromium&g=0

we should add a separate Python file and function which does nothing on non-Mac platforms and then has the Mac-specific imports inside it. We should make sure that the logging which it performs can be seen on the bots. Yes, we should probably add the dumping of this information to run_gpu_integration_test.py before it invokes the browser_test_runner, and once it's working locally, send a tryjob or so through mac_chromium_rel_ng and see if we can get the output in the various GPU tests' stdout files.
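
A rough sketch of the shape described above, for illustration only: a function that does nothing on non-Mac platforms and keeps the Mac-specific import inside it. The function name and the `platform` parameter are hypothetical, and it assumes the pyobjc Quartz bindings are importable on the Mac bots. `Quartz.CGSessionCopyCurrentDictionary()` is the call used in the Stack Overflow answer linked in the issue description.

```python
import sys


def dump_mac_session_info(platform=sys.platform):
    """Print the current login session's keys on macOS; no-op elsewhere.

    Returns True if something was printed, False otherwise.
    The name and signature are hypothetical, not taken from the Chromium tree.
    """
    if not platform.startswith('darwin'):
        # Non-Mac platforms: do nothing and never touch the Mac-only import.
        return False

    # Mac-specific import kept inside the function so that importing this
    # module on Linux, Windows, or Android hosts does not fail.
    import Quartz  # assumed to be provided by pyobjc

    session = Quartz.CGSessionCopyCurrentDictionary()
    if not session:
        print('WARNING: no session dictionary available from Quartz.')
        return False

    for key, value in session.items():
        print('%s = %r' % (key, value))
    return True
```

Calling `dump_mac_session_info()` early in the harness, before the test runner is invoked, would place the output at the top of each test's stdout, assuming the bots capture stdout as usual.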

Blockedon: 884925
Hm, it looks like color_profile_manager_mac.py uses Quartz already, but it's not available as a vpython wheel? https://chromium.googlesource.com/infra/infra/+/master/infra/tools/dockerbuild/wheels.md
I'm pretty sure that's supplied by the pyobjc wheel.

Blockedon: 804082
See long discussion in Issue 804082.

https://chromium-review.googlesource.com/c/chromium/src/+/1229366 seems to work for logging the required information. Unfortunately, output from logging.info didn't show up in the bot's stdout, so I used print instead.

You can see an earlier version working on the bots at https://chromium-swarm.appspot.com/task?id=4002e82ab3587d10&refresh=10&show_raw=1 (I just had it exit immediately after printing, so no test was actually run).

Not sure if there's anything else relevant we want to log, as the stackoverflow post doesn't indicate anything else, and I'm quite unfamiliar with Mac atm.
Re: Locking on demand. You can VNC to the host and lock the screen that way (Apple -> Lock Screen), or use AppleScript to lock the screen. The code below seems to work. SSH into the bot and add the following to a file:

tell application "System Events"
  keystroke "q" using {command down, control down}
end tell

then exec it with "osascript <filename>" where <filename> is whatever you named the script.

The only way to unlock the screen from that point is to VNC to the host and log back in.
Good work Brian, and thanks Bryce for the tip on how to lock the screen programmatically.

Brian, could you try running this script from an ssh session on a Mac laptop, then locking the screen as Bryce suggested, and then doing it again?

If you can reliably detect when the screen's locked, then let's add code to run_gpu_integration_test to immediately return and fail with a loud, visible message if it is. (Looking at your code, it would be fine to just include this in run_gpu_integration_test.py as something like "FailIfScreenLockedOnMac", and not add run_gpu_integration_test_mac.py.)

Then we can file a P1 RFE against Swarming to include such logic so that these bots auto-quarantine if they get in that state.

Thanks for moving this forward!

Sorry for the delay - had to draft a neighbor since I don't have a Macbook. Luckily, it looks like the stackoverflow solution does properly detect the lock screen.

When run without the lockscreen, CGSSessionScreenIsLocked = None and kCGSSessionOnConsoleKey = True. With the lockscreen, CGSSessionScreenIsLocked = True and kCGSSessionOnConsoleKey = True. So, it looks like we can just check the value of CGSSessionScreenIsLocked and fail based off that.

I'll update my patch to fail instead of log, then file an infra bug to add the swarming logic.
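
For illustration, the check described above could look roughly like the following. This is a minimal sketch, not the code that landed: the function names are hypothetical, and the session dictionary is passed in as a plain mapping so the logic can be exercised without a Mac. It relies only on the observed values reported above (CGSSessionScreenIsLocked is absent/None when unlocked and True when locked).

```python
import sys


def is_screen_locked(session):
    """Return True if the session mapping reports a locked screen.

    `session` is assumed to be the mapping returned by
    Quartz.CGSessionCopyCurrentDictionary(), or None if unavailable.
    """
    if not session:
        # No session information: we cannot tell, so do not report a lock.
        return False
    # The key is missing (None) when unlocked and True when locked.
    return bool(session.get('CGSSessionScreenIsLocked'))


def fail_if_screen_locked(session):
    """Exit with a loud message if the lockscreen is detected (hypothetical helper)."""
    if is_screen_locked(session):
        # print rather than logging, since logging.info output did not
        # appear in the bot's stdout.
        print('ERROR: Mac lockscreen detected, aborting.')
        sys.exit(1)
```

With the two observations above, `is_screen_locked({'kCGSSessionOnConsoleKey': True})` is False, while adding `'CGSSessionScreenIsLocked': True` to the mapping makes it True.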
Blockedon: 885337
Project Member

Comment 12 by bugdroid1@chromium.org, Sep 18

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3c29cf72f42c3f4d0d0140357c392e46d9726946

commit 3c29cf72f42c3f4d0d0140357c392e46d9726946
Author: bsheedy <bsheedy@chromium.org>
Date: Tue Sep 18 23:01:27 2018

Fail GPU tests if Mac lockscreen detected

Adds MacOS-only logic to run_gpu_integration_test.py
that immediately aborts if the lockscreen is detected.
The device should be automatically unlocked prior to
the test starting, so a failure to do so needs to be
easily identified for investigation.

Bug:  884913 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I684a73ace61fd34484af815c7d55517613de16b3
Reviewed-on: https://chromium-review.googlesource.com/1229366
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Commit-Queue: Brian Sheedy <bsheedy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#592244}
[modify] https://crrev.com/3c29cf72f42c3f4d0d0140357c392e46d9726946/content/test/gpu/run_gpu_integration_test.py

Blockedon: 885381
Status: Fixed (was: Assigned)
