
Issue 660582

Starred by 6 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 2
Type: Bug




A few tests fail on Mac 10.10 and 10.11 with swarming

Project Member Reported by jam@chromium.org, Oct 28 2016

Issue description

Per bug 653677, the Mac 10.10 and 10.11 bots were swarmed so their cycle times can go from 3 hours to 20 minutes. A few tests now fail; I'll add an entry here for each one.
 

Comment 9 by jam@chromium.org, Oct 28 2016

Cc: erikc...@chromium.org
If the failures are deterministic, I'd be okay with disabling the broken ones and fixing them. If the failures are non-deterministic or there are too many, I think we'll have to roll back the change to swarming and fix the issues first. Guess we'll have to wait and see.

Comment 18 by jam@chromium.org, Oct 28 2016

@erikchen: I waited for 3 runs of 10.10 and 10.11 to be comfortable that it's not just random failures.
Glanced through the failing tests. Looks like they use methods that are probably unsafe to run in a sharded environment [miniaturize, maximize, etc.]

Comment 20 by jam@chromium.org, Oct 29 2016

Not sure what you mean? When they were running on the main waterfall without swarming, they were also running in parallel.

"Sharding" is unfortunately overloaded: it can mean running multiple tests at the same time on the same machine, or splitting tests up across multiple machines. In both cases, though, browser_tests runs many tests at the same time.
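
Editor's sketch of the two meanings of "sharding" described above. This is a minimal illustration that assumes round-robin assignment by position in the test list; the function names TestsForShard and SplitAcrossJobs are hypothetical and are not taken from Chromium's test launcher.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Meaning 1: split the full test list across machines.
// Returns the tests that shard `shard_index` (0-based) of `total_shards`
// would run, assuming round-robin assignment. Requires total_shards > 0.
std::vector<std::string> TestsForShard(const std::vector<std::string>& all_tests,
                                       std::size_t shard_index,
                                       std::size_t total_shards) {
  std::vector<std::string> selected;
  for (std::size_t i = 0; i < all_tests.size(); ++i) {
    if (i % total_shards == shard_index)
      selected.push_back(all_tests[i]);
  }
  return selected;
}

// Meaning 2: split one machine's tests among jobs that run at the same time
// on that machine. Requires parallel_jobs > 0.
std::vector<std::vector<std::string>> SplitAcrossJobs(
    const std::vector<std::string>& shard_tests,
    std::size_t parallel_jobs) {
  std::vector<std::vector<std::string>> jobs(parallel_jobs);
  for (std::size_t i = 0; i < shard_tests.size(); ++i)
    jobs[i % parallel_jobs].push_back(shard_tests[i]);
  return jobs;
}
```

Under either meaning, tests within one machine's job set still run concurrently, which is the point made in the comment above.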
Project Member

Comment 21 by bugdroid1@chromium.org, Oct 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9

commit ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9
Author: jam <jam@chromium.org>
Date: Sat Oct 29 00:30:39 2016

Disable failing tests on Mac 10.10 and 10.11 after swarming.

BUG=660582
TBR=erikchen@chromium.org

Review-Url: https://codereview.chromium.org/2460063002
Cr-Commit-Position: refs/heads/master@{#428558}

[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/apps/app_window_browsertest.cc
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/extensions/api/tabs/tabs_test.cc
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/applescript/window_applescript_test.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/apps/native_app_window_cocoa_browsertest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/browser_window_cocoa_unittest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/screen_capture_notification_ui_cocoa_unittest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/spinner_view_unittest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/browser/ui/cocoa/sprite_view_unittest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/chrome/test/ppapi/ppapi_browsertest.cc
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/extensions/browser/api/app_window/app_window_apitest.cc
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/ui/views/widget/native_widget_mac_unittest.mm
[modify] https://crrev.com/ed6d8326951dd78e6ccc8b75e112d6b70cecd2c9/ui/views/widget/widget_unittest.cc
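
Editor's sketch of one common way to disable a gtest-based test on a single platform. The contents of the commit above are not shown in this thread, so this is an assumption about the general convention rather than a copy of the change: gtest skips tests whose names start with DISABLED_, and a MAYBE_ macro selects that prefix per platform. `__APPLE__` stands in for the project's own platform macro, and kIsMac, kRegisteredName, and IsDisabledTestName are hypothetical names used only for illustration.

```cpp
#include <string>

// Pick the registered test name per platform. In a real test file the macro
// would be used as the test name, e.g.
//   IN_PROC_BROWSER_TEST_F(AppWindowAPITest, MAYBE_TestMinimize) { ... }
#if defined(__APPLE__)
#define MAYBE_TestMinimize DISABLED_TestMinimize
constexpr bool kIsMac = true;
#else
#define MAYBE_TestMinimize TestMinimize
constexpr bool kIsMac = false;
#endif

// Two-step stringification so the MAYBE_ macro is expanded before quoting.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// The name the test would be registered under after macro expansion.
const char kRegisteredName[] = STRINGIFY(MAYBE_TestMinimize);

// Mirrors the naming rule: a test is treated as disabled when its name
// begins with "DISABLED_".
bool IsDisabledTestName(const std::string& name) {
  return name.compare(0, 9, "DISABLED_") == 0;
}
```

A disabled test still compiles, so it keeps building while the underlying failure is investigated.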

We used to run two sets of tests in parallel:
Using sharding settings from environment. This is shard 0/1
Using 2 parallel jobs.
https://build.chromium.org/p/chromium.mac/builders/Mac10.10%20Tests/builds/8407/steps/browser_tests/logs/stdio

Now we shard across 10 machines, and run 4 sets of tests in parallel:
https://build.chromium.org/p/chromium.mac/builders/Mac10.10%20Tests/builds/8413
"""
Using 4 parallel jobs.
"""

Many of the actual failures look very suspicious:
"""
[ RUN      ] AppWindowAPITest.TestMinimize
2016-10-28 15:44:30.839 browser_tests[6367:115701] NSWindow warning: adding an unknown subview: <FullSizeContentView: 0x7fd95bc72180>
2016-10-28 15:44:30.840 browser_tests[6367:115701] Call stack:
(
    "+callStackSymbols disabled for performance reasons"
)
[15:44:30.959] vtDecompressionDuctCreate signalled err=-8973 (err) (Could not select and open decoder instance) at /SourceCache/CoreMedia_frameworks/CoreMedia-1562.240/Sources/VideoToolbox/VTDecompressionSession.c line 1181
<<<< VTVideoEncoderSelection >>>> VTSelectAndCreateVideoEncoderInstanceInternal: no video encoder found for 'avc1'

[15:44:31.044] VTSelectAndCreateVideoEncoderInstanceInternal signalled err=-12908 (err) (Video encoder not available) at /SourceCache/CoreMedia_frameworks/CoreMedia-1562.240/Sources/VideoToolbox/VTVideoEncoderSelection.c line 1245
[15:44:31.044] VTCompressionSessionCreate signalled err=-12908 (err) (Could not select and open encoder instance) at /SourceCache/CoreMedia_frameworks/CoreMedia-1562.240/Sources/VideoToolbox/VTCompressionSession.c line 946
BrowserTestBase received signal: Terminated: 15. Backtrace:
0   browser_tests                       0x000000010aafd693 _ZN4base5debug10StackTraceC1Ev + 19
1   browser_tests                       0x000000010b1b6dac _ZN7content12_GLOBAL__N_1L27DumpStackTraceSignalHandlerEi + 204
2   libsystem_platform.dylib            0x00007fff9a02df1a _sigtramp + 26
3   ???                                 0x00007fff5722cae0 0x0 + 140734655285984
4   browser_tests                       0x000000010ab3ca56 _ZN4base12_GLOBAL__N_117oom_killer_callocEP14_malloc_zone_tmm + 22
5   libsystem_malloc.dylib              0x00007fff93dcdb90 malloc_zone_calloc + 78
6   libsystem_malloc.dylib              0x00007fff93dce546 calloc + 49
7   libobjc.A.dylib                     0x00007fff91d98c12 class_createInstance + 133
8   CoreFoundation                      0x00007fff9242a10f __CFAllocateObject2 + 15
9   CoreFoundation                      0x00007fff92432b5e +[__NSArrayM __new:::::] + 62
10  CoreFoundation                      0x00007fff92432a92 -[__NSPlaceholderArray initWithCapacity:] + 114
11  CoreFoundation                      0x00007fff9243bb73 CFArrayCreateMutable + 131
12  CoreFoundation                      0x00007fff9254e4c4 __CFRunLoopDoTimers + 180
...
"""
Signal 15 [SIGTERM]...definitely requires more investigation.
Hi, do you know why NewlibPackagedAppTest.SuccessfulLoad still fails on Mac 10.11 after it has been disabled? https://build.chromium.org/p/chromium.mac/builders/Mac10.11%20Tests/builds/2993
Cc: bbudge@chromium.org
Issue 660625 has been merged into this issue.

Comment 25 by jam@chromium.org, Oct 29 2016

I accidentally disabled the wrong test; the correct disable is in the CQ.
Project Member

Comment 26 by bugdroid1@chromium.org, Oct 29 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/86c3f0bdc61593569348a403972f5aee1e868d2d

commit 86c3f0bdc61593569348a403972f5aee1e868d2d
Author: jam <jam@chromium.org>
Date: Sat Oct 29 04:23:25 2016

Fix disabling of failing test from r428558.

BUG=660582
TBR=erikchen@chromium.org

Review-Url: https://codereview.chromium.org/2464593002
Cr-Commit-Position: refs/heads/master@{#428598}

[modify] https://crrev.com/86c3f0bdc61593569348a403972f5aee1e868d2d/chrome/test/ppapi/ppapi_browsertest.cc

Comment 27 by jam@chromium.org, Oct 31 2016

Regarding comment 22, two things to note:
- The browser_tests harness retries failed tests serially.
- The other OS versions run tests with the same sharding level.
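
Editor's sketch of the retry policy mentioned in the first point. It models only the policy (run everything once, then retry failures one at a time); the first pass is written as a plain loop where a real launcher would run jobs concurrently. RunWithSerialRetry, TestBody, and NamedTest are hypothetical names, not part of the Chromium test launcher.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

using TestBody = std::function<bool()>;  // Returns true when the test passes.
using NamedTest = std::pair<std::string, TestBody>;

// Runs every test once, then retries only the failures, one at a time.
// Returns the names of tests that still fail after the serial retry.
std::vector<std::string> RunWithSerialRetry(const std::vector<NamedTest>& tests) {
  std::vector<const NamedTest*> failed;
  // First pass: stands in for the concurrent run.
  for (const NamedTest& test : tests) {
    if (!test.second())
      failed.push_back(&test);
  }
  // Second pass: each failed test is retried with nothing else running.
  std::vector<std::string> still_failing;
  for (const NamedTest* test : failed) {
    if (!test->second())
      still_failing.push_back(test->first);
  }
  return still_failing;
}
```

Under this policy, a test that fails only because of interference from a concurrently running test would be expected to pass on retry, so a test that is still reported as failing is less likely to be a pure parallelism problem.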
Cc: roc...@chromium.org
Owner: sa...@chromium.org
Status: Assigned (was: Untriaged)
Huh. Regarding the comment on issue 660625:

Looks like something is being called incorrectly within the NaCl/Mojo stuff. This looks familiar and I thought I had already fixed another bug like this.

ServiceManager should never see a string like "4EB03F36E830BA450FC7A9A8F8AA6240" for any reason, and it means someone is passing a token where they should be passing a service name. Probably in a content::ChildConnection or something.

Tentatively over to sammc.
Cc: rouslan@chromium.org
Issue 661550 has been merged into this issue.
Cc: jam@chromium.org
jam: We've got about 50 failing interactive_ui_tests, and they fail in about 50% of all runs. The failures started Oct 28th, and the tests are getting signal 15. See the issue I just duped into this one.

I think we need to roll back the swarming change and investigate the failures more.

Comment 31 by jam@chromium.org, Nov 2 2016

Owner: ----
Status: Available (was: Assigned)
(Removing Sam as the owner since there are many bugs here.)

Erik: a simple workaround could be to disable swarming for just interactive_ui_tests on Mac 10.11. That way the runtime is still fast. I can take care of that.
Cc: shrike@chromium.org
+shrike.

Okay. sgtm.

Swarming is awesome and we want it for all our tests. After jam's change, we'll have ~10 disabled tests, and interactive_ui_tests will still be non-swarming. Let's make sure that fixing those tests and getting interactive_ui_tests on swarming doesn't get dropped on the floor. 
Project Member

Comment 33 by bugdroid1@chromium.org, Nov 3 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/209f3909e2ef23104a04cb8d26299d5a86b0d6ce

commit 209f3909e2ef23104a04cb8d26299d5a86b0d6ce
Author: jam <jam@chromium.org>
Date: Thu Nov 03 01:13:28 2016

Don't swarm interactive_ui_tests on Mac 10.10 since they're flaky when swarmed.

BUG=660582

Review-Url: https://codereview.chromium.org/2477543002
Cr-Commit-Position: refs/heads/master@{#429496}

[modify] https://crrev.com/209f3909e2ef23104a04cb8d26299d5a86b0d6ce/testing/buildbot/chromium.mac.json

Project Member

Comment 34 by bugdroid1@chromium.org, Nov 23 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/12a7f40d299651beb746bf316b0c6c93fa8083d5

commit 12a7f40d299651beb746bf316b0c6c93fa8083d5
Author: sammc <sammc@chromium.org>
Date: Wed Nov 23 01:09:52 2016

Mojo EDK: Correctly handle EMSGSIZE on Mac.

As the comment copy-pasted from the IPC::ChannelPosix states, on Mac
EMSGSIZE may be returned when sending a message if there is insufficient
buffer space for transmitted FDs. Currently, the channel implementation
treats this as a fatal error; this CL changes it to a recoverable error.

This also re-enables NewlibPackagedAppTest.SuccessfulLoad, which was
failing on some bots due to this problem.

BUG=660582

Review-Url: https://codereview.chromium.org/2512463003
Cr-Commit-Position: refs/heads/master@{#434058}

[modify] https://crrev.com/12a7f40d299651beb746bf316b0c6c93fa8083d5/chrome/test/ppapi/ppapi_browsertest.cc
[modify] https://crrev.com/12a7f40d299651beb746bf316b0c6c93fa8083d5/mojo/edk/system/channel_posix.cc
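
Editor's sketch of the idea in the commit message above: after a failed send, EMSGSIZE on Mac is treated as a recoverable "try again later" condition instead of a fatal one. This is not the code from channel_posix.cc; SendErrorAction, ClassifySendError, and kIsMac are hypothetical names, and `__APPLE__` stands in for the project's own platform macro.

```cpp
#include <cerrno>

enum class SendErrorAction { kRetryLater, kFatal };

#if defined(__APPLE__)
constexpr bool kIsMac = true;
#else
constexpr bool kIsMac = false;
#endif

// Decides what to do with the errno value left by a failed send on a
// non-blocking socket. `is_mac` is a parameter so both branches can be
// exercised on any platform.
SendErrorAction ClassifySendError(int error_code, bool is_mac = kIsMac) {
  // The socket buffer is full; the caller should wait until it is writable.
  if (error_code == EAGAIN)
    return SendErrorAction::kRetryLater;
  if (error_code == EWOULDBLOCK)
    return SendErrorAction::kRetryLater;
  // Per the commit message, Mac may report EMSGSIZE when there is
  // insufficient buffer space for transmitted FDs, so retry instead of
  // tearing down the channel.
  if (is_mac && error_code == EMSGSIZE)
    return SendErrorAction::kRetryLater;
  return SendErrorAction::kFatal;
}
```

A caller would typically keep the unsent message queued on kRetryLater and close the channel on kFatal.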

Components: Internals>Services>ServiceManager
Bulk applying component Internals>Services>ServiceManager to issues referencing the text ServiceManager.  This may not be 100% accurate, so please feel free to pull the component as needed.
Cc: -roc...@chromium.org rockot@google.com
