New issue
Advanced search Search tips

Issue 813553 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

notification_helper_unittests failing on all chromium.clang windows bots

Project Member Reported by h...@chromium.org, Feb 19 2018

Issue description

Looks like it's timing out maybe? Lots of bots are purple due to this.
Here's one: https://ci.chromium.org/buildbot/chromium.clang/ToTWin/1002
It also affects the MSVC bots.

Is this a new test that doesn't get run by many bots, or why isn't the main waterfall all purple?
 

Comment 1 by h...@chromium.org, Feb 19 2018

The official builders here are green: https://uberchromegw.corp.google.com/i/official.desktop.continuous/console
But they don't run this test.

So it could be an official vs non-official build thing.

Comment 2 by h...@chromium.org, Feb 19 2018

Owner: chengx@chromium.org
Status: Assigned (was: Available)
Ah no, the test doesn't work on Win7, so fails when swarming to such hosts, see https://chromium-review.googlesource.com/c/chromium/src/+/915016
Project Member

Comment 4 by bugdroid1@chromium.org, Feb 19 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fb02d21de0c77d9fafd717dedc439ea23f7e94b9

commit fb02d21de0c77d9fafd717dedc439ea23f7e94b9
Author: Hans Wennborg <hans@chromium.org>
Date: Mon Feb 19 15:04:43 2018

Revert "Run notification_helper_unittests on Chromium buildbots"

This reverts commit b0dc3696a70a9ba2835bbbd77d0a7874359235db.

Reason for revert:
The best breaks all Windows buildbots on the chromium.clang waterfall,
and I guess others might be broken too.

Either these need to be added to the blacklist for this test, or
ideally there would be some mechanism to express this test needs to
run on Win10 hosts.

Original change's description:
> Run notification_helper_unittests on Chromium buildbots
> 
> Since this unit test requires WinRT which only exists on Windows 8 or above,
> we don't run it on Windows 7 bots.
> 
> Bug:  734095 
> Change-Id: I6889efb1e9425fe3cc4ae37ae6b0f0d54e7c4f5a
> Reviewed-on: https://chromium-review.googlesource.com/915016
> Commit-Queue: Xi Cheng <chengx@chromium.org>
> Reviewed-by: Dirk Pranke <dpranke@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#536906}

TBR=dpranke@chromium.org,chengx@chromium.org

# Not skipping CQ checks because original CL landed > 1 day ago.

Bug:  734095 ,  813553 
Change-Id: I9ae4611aedb58d1ecf58ba25c2b4f387ae40a59b
Reviewed-on: https://chromium-review.googlesource.com/923740
Reviewed-by: Hans Wennborg <hans@chromium.org>
Cr-Commit-Position: refs/heads/master@{#537654}
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/chromium.clang.json
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/chromium.win.json
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/gn_isolate_map.pyl
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/test_suite_exceptions.pyl
[modify] https://crrev.com/fb02d21de0c77d9fafd717dedc439ea23f7e94b9/testing/buildbot/test_suites.pyl

Comment 5 by chengx@chromium.org, Feb 20 2018

Status: Fixed (was: Assigned)
hans@, thank you for fixing this. 

Mark resolved as the CL has been reverted.
Status: Assigned (was: Fixed)
Reopening this bug so we can use it to track fixing the test. The failure on the Win 7 bots was:

Task ran but no result was found: shard 0 test output was missing
some shards did not complete: 0
step returned non-zero exit code: -1073741515

-1073741515 is 0xC0000135 which means DLL cannot be found. Local experimentation shows that notification_helper_unittests.exe (and notification_helper.exe, more or less) is importing this set of DLLs:

api-ms-win-core-winrt-error-l1-1-0.dll
ADVAPI32.dll
KERNEL32.dll
ole32.dll
SHELL32.dll
USER32.dll
WS2_32.dll
VERSION.dll
SHLWAPI.dll
PSAPI.DLL
OLEAUT32.dll
PROPSYS.dll
WINMM.dll
WTSAPI32.dll
USERENV.dll
GDI32.dll
dbghelp.dll
USP10.dll
DWrite.dll
IPHLPAPI.DLL
CRYPT32.dll
WINHTTP.dll
Secur32.dll
urlmon.dll
SETUPAPI.dll
CFGMGR32.dll
POWRPROF.dll

Presumably it is api-ms-win-core-winrt-error-l1-1-0.dll that is causing the problem. That DLL is being imported purely to supply RoOriginateError.

There are a few possible fixes for this:
1) Use the blacklists so that this test doesn't run on Windows 7. This seems inelegant since it requires matching the list of Windows 7 bots and it also fails inelegantly if somebody runs the test binary or notification_helper.exe itself on Windows 7.
2) Supply an RoOriginError function in our code, like this:
    STDAPI_(BOOL)
    RoOriginateError(
      _In_ HRESULT error,
      _In_opt_ HSTRING message
    ) {
      return FALSE;
    }
Supplying a stub function is reasonable because all this function does is "Reports an error and an informative string to an attached debugger."
3) Delay load api-ms-win-core-winrt-error-l1-1-0.dll

Probably the delay load solution is the cleanest.

It should be possible to use swarming to test developer-built binaries on the problematic test machines in order to reproduce the problem before fixing it and then confirm the fix.

P.S. I've reopened  bug 803617  so make sure that I fix the infra code so that it prints these Windows error code in easier to recognize hex.

Comment 7 by chengx@chromium.org, Feb 20 2018

Cc: brucedaw...@chromium.org
Yup, I will try all the possible solutions. Thank Bruce for tracking down the root cause!
Project Member

Comment 8 by bugdroid1@chromium.org, Feb 23 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/34ca404e17644da5b9afb452366188a1114c93f7

commit 34ca404e17644da5b9afb452366188a1114c93f7
Author: Xi Cheng <chengx@chromium.org>
Date: Fri Feb 23 00:10:12 2018

[Reland] Run notification_helper_unittests on Chromium buildbots

This is a reland of  issue 915016 . The difference between these two CLs
is that this CL removes this unit test from more buildbots which run on
Windows 7 machines. Since this unit test is Windows 8+ only, running it
on those buildbots caused exceptions.

Bug:  734095 ,  813553 
Change-Id: I3b7718a270f08052c81355c36c684156caeb5a1c
Reviewed-on: https://chromium-review.googlesource.com/927107
Commit-Queue: Xi Cheng <chengx@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#538630}
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/chromium.clang.json
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/chromium.win.json
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/gn_isolate_map.pyl
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/test_suite_exceptions.pyl
[modify] https://crrev.com/34ca404e17644da5b9afb452366188a1114c93f7/testing/buildbot/test_suites.pyl

Comment 9 by chengx@chromium.org, Feb 23 2018

Now we are using the first approach in comment 6: use the blacklists so that this test doesn't run on Windows 7.

More experiments to come.
Labels: -Pri-1 Pri-3
This can be considered as fixed, but we are experimenting with different methods. So lower the priority. 
Cc: grt@chromium.org
I am trying to get rid of the blacklist used in the CL as in comment 8, and make this notification_helper_unittests return early in main() of run_all_unittests.cc for all Win7 bots.

The swarming task page shows the status is completed(success) as in https://chromium-swarm.appspot.com/task?id=3c178f4768bede10&refresh=10&show_raw=1

However, the trybot was unhappy. It was in purple (exception) with output message that:
"shard 0 test output missing.
Task ran but no result was found: shard 0 test output was missing.
some shards did not complete: 0"

So I am not sure if this early return approach is desired. 

I searched our codebase, and found that chrome/installer/util/run_all_unittests.cc early returns when the COM initialization fails. I think this is fine because we want it to indicate failure/exception, which is different from my case where we want the early return to indicate "fine, or at least non-failure/exception".

Thoughts please?

Comment 12 by grt@chromium.org, Mar 8 2018

Rather than early-exit, try making the test harness skip all tests by way of a filter:

#include "testing/gtest/include/gtest/gtest.h"

int main(...) {
  if (no tests should run) {
    // Skip all tests but let Google Test run anyway to generate its output.
    testing::GTEST_FLAG(filter) = "-*";
  }
  ...
}
Re grt@:

Unfortunately, that doesn't work. The filter flag doesn't seem be picked up by the test environment, so the tests run anyway.
I put "testing::GTEST_FLAG(filter) = "-*";" inside TestSuite::Initialize(), and finally its value got picked up. A local run of this unit test gave the following output, which I doubt is desired. 

Note: Google Test filter = -*
[==========] Running 0 tests from 0 test cases.
[==========] 0 tests from 0 test cases ran. (2 ms total)
[  PASSED  ] 0 tests.
[70216:55008:0308/170539.563:1321800703:ERROR:unit_test_launcher.cc(338)] no test result for ComServerModuleTest.EventSignalTest
[8/8] ComServerModuleTest.EventSignalTest (UNKNOWN)
2 tests had unknown result:
    ComServerModuleTest.EventSignalTest (../../notification_helper/com_server_module_unittest.cc:57)
    NotificationHelperTest.NotificationHelperServerTest (../../notification_helper/notification_helper_process_unittest.cc:141)
Tests took 0 seconds.

Comment 15 by grt@chromium.org, Mar 9 2018

It may take some fiddling to make it work. Can you reproduce these results locally? Have a look at the bot output to find the exact command line used there. It probably has things like --test-launcher-bot-mode on it. These make the harness run tests in parallel in subprocs. You may need to either put the filter on the command line that is used to launch the sub procs, or maybe you'll need to move the version check and flag setting into the sub procs.
Project Member

Comment 16 by bugdroid1@chromium.org, Mar 9 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6c23b352fd156e67e6223e103b69c3877922751a

commit 6c23b352fd156e67e6223e103b69c3877922751a
Author: Nico Weber <thakis@chromium.org>
Date: Fri Mar 09 14:22:12 2018

Disable notification_helper_unittests on libc++ bot too.

See bug, comment 8.

Bug:  813553 
Change-Id: Ie584c5f30ea23e0dfd70093600dd6d9f5d6c4dc7
Reviewed-on: https://chromium-review.googlesource.com/957122
Reviewed-by: Hans Wennborg <hans@chromium.org>
Commit-Queue: Nico Weber <thakis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#542099}
[modify] https://crrev.com/6c23b352fd156e67e6223e103b69c3877922751a/testing/buildbot/chromium.clang.json
[modify] https://crrev.com/6c23b352fd156e67e6223e103b69c3877922751a/testing/buildbot/test_suite_exceptions.pyl

Re #15 (grt@), yes, I could repro it locally, and the command line used has --test-launcher-bot-mode. Supplying the "--gtest_filter" string to the sub procs may require quite a lot of changes, as currently the {argc, argv} pair in main() fall through both base::LaunchUnitTestsSerially() and base::TestSuite ctor. 

Given that we have so many blacklists used in our test harness, I suspect we'd better fix it systematically. That means, this could be a feature request.

Comment 18 by grt@chromium.org, Mar 12 2018

Ah, well. It was worth a shot. Is it argc,argv that's passed down to subprocs, or is it the CommandLine? If the latter, perhaps you can add --gtest_filter to the launcher proc's CommandLine. Maybe there's a simpler way. You could try to figure out what the test could minimally emit that would satisfy whatever it is that's reporting "Task ran but no result was found".
Quick reply to #18: It is argc,argv that's passed down, not the CommandLine which is easy to handle. I will dig deeper and update whatever I can get later.
Update: Currently the logic w.r.t --gtest_filter is also handled in test_suite_exceptions.pyl, same as the blacklist logic.

For blacklist, we put all bots we want to get rid of in 'remove_from'.
For --gtest_filter, we put all affected bots under 'modifications'. A search of "--gtest_filter" in this file gives all the examples. 

Therefore, the {argc,argv} value pair is initialized from this file, and passed all the way down. I am not sure if we want to create another method (i.e., change the values in the middle of the way) for notification_helper_unittests in the current stage.
Status: Fixed (was: Assigned)
Mark fixed for now based on comment 20.
Project Member

Comment 22 by bugdroid1@chromium.org, May 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/252643a98d15ec914caf2f9973ecbea339cbc045

commit 252643a98d15ec914caf2f9973ecbea339cbc045
Author: Hans Wennborg <hans@chromium.org>
Date: Tue May 22 14:41:12 2018

Add bug reference for notification_helper_unittests exclusions

R=thakis@chromium.org

Bug:  813553 
Change-Id: I456f97a740634b4e6ca2c77e7eb8bd7c9b107131
Reviewed-on: https://chromium-review.googlesource.com/1069067
Commit-Queue: Hans Wennborg <hans@chromium.org>
Reviewed-by: Nico Weber <thakis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#560575}
[modify] https://crrev.com/252643a98d15ec914caf2f9973ecbea339cbc045/testing/buildbot/test_suite_exceptions.pyl

Sign in to add a comment