New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 717648 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Long OOO (go/where-is-mgiuca)
Closed: May 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Sign in to add a comment

"HotwordInstallerBrowserTest.AbortInstallOnShutdown" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, May 2 2017

Issue description

"HotwordInstallerBrowserTest.AbortInstallOnShutdown" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 13 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyPQsSBUZsYWtlIjJIb3R3b3JkSW5zdGFsbGVyQnJvd3NlclRlc3QuQWJvcnRJbnN0YWxsT25TaHV0ZG93bgw.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
Cc: kcaratt...@chromium.org mgiuca@chromium.org
I'm not really able to tell exactly what CL caused this test to be flaky, any ideas?
Trace:

[ RUN      ] HotwordInstallerBrowserTest.AbortInstallOnShutdown
[5000:3696:0502/101435.613:ERROR:service_manager_connection_impl.cc(285)] Can't create service network. No handler found.
[5808:5912:0502/101435.769:INFO:media_foundation_video_encode_accelerator_win.cc(329)] Windows versions earlier than 8 are not supported.
[5000:3984:0502/101436.002:FATAL:thread_restrictions.cc(38)] Check failed: false. Function marked as IO-only was called from a thread that disallows IO!  If this thread really should be allowed to make IO calls, adjust the call to base::ThreadRestrictions::SetIOAllowed() in this thread's startup.
Backtrace:
	base::debug::StackTrace::StackTrace [0x0265FBA7+55]
	base::debug::StackTrace::StackTrace [0x0262653A+10]
	base::ThreadRestrictions::AssertIOAllowed [0x025D2A89+153]
	base::DeleteFileW [0x025D744F+31]
	base::ScopedTempDir::~ScopedTempDir [0x025D5B13+35]
	TestingProfile::~TestingProfile [0x025CF4F2+450]
	extensions::HotwordInstallerBrowserTest_AbortInstallOnShutdown_Test::RunTestOnMainThread [0x01296808+83]
	content::BrowserTestBase::ProxyRunTestOnMainThreadLoop [0x02749C69+233]
	ChromeBrowserMainParts::PreMainMessageLoopRunImpl [0x02EA0D11+3314]
	ChromeBrowserMainParts::PreMainMessageLoopRun [0x02E9FFE7+167]
	content::BrowserMainLoop::PreMainMessageLoopRun [0x018F49F0+112]
	content::StartupTaskRunner::RunAllTasksNow [0x01B535A2+35]
	content::BrowserMainLoop::CreateStartupTasks [0x018F2A1E+394]
	content::BrowserMainRunnerImpl::Initialize [0x018F5BC0+704]
	content::BrowserMain [0x018F1665+117]
	content::RunNamedProcessTypeMain [0x025C163C+206]
	content::ContentMainRunnerImpl::Run [0x025C153D+413]
	service_manager::Main [0x0352B494+490]
	content::ContentMain [0x025C0BEF+39]
	content::BrowserTestBase::SetUp [0x0274A192+866]
	InProcessBrowserTest::SetUp [0x02693DF6+310]
	testing::internal::HandleExceptionsInMethodIfSupported<testing::TestCase,void> [0x02C8C59D+32]
	testing::Test::Run [0x02C936EF+51]
	testing::TestCase::Run [0x02C937C5+133]
	testing::internal::UnitTestImpl::RunAllTests [0x02C93B51+433]
	testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl,bool> [0x02C8C5E1+32]
	testing::UnitTest::Run [0x02C9396E+133]
	base::TestSuite::Run [0x026A3DE6+95]
	ChromeTestSuiteRunner::RunTestSuite [0x04DF4058+40]
	content::LaunchTests [0x02742F10+583]
	LaunchChromeTests [0x04DF4014+62]
	main [0x04DF3E29+63]
	__scrt_common_main_seh [0x04DA39AB+249] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:253)
	BaseThreadInitThunk [0x7641337A+18]
	RtlInitializeExceptionChain [0x77B692B2+99]
	RtlInitializeExceptionChain [0x77B69285+54]
Cc: jam@chromium.org
Status: Available (was: Untriaged)
I don't know the root cause but I've found the proximate cause: r467438 (jam@; 2017-04-26).

TestingProfile contains a ScopedTempDir as a member. TestingProfile runs on the UI thread; ScopedTempDir performs IO in its destructor (which isn't allowed on the UI thread after the above patch).

However, jam@ added this line to the TestingProfile destructor:

    base::ThreadRestrictions::ScopedAllowIO allow_io;
    ignore_result(temp_dir_.Delete());

By default, ~ScopedTempDir illegally performs IO by calling ScopedTempDir::Delete. The above line of code pre-emptively calls ScopedTempDir::Delete in a context where IO is allowed.

The bug is ignore_result(): if Delete() succeeds, it clears the path so ~ScopedTempDir will do nothing. If it fails, it has no effect, so ~ScopedTempDir will try to Delete() again, and this time, it crashes. TestingProfile's destructor will crash if Delete fails.

Two options:
1. Change ignore_result to CHECK(temp_dir_.Delete()). This doesn't fix the crash but makes it explicit: Delete() is expected to succeed.
2. Change this to:

    if (!temp_dir_.Delete()) {
      LOG(WARNING) << "...";
      // Make sure |temp_dir_| no longer owns the directory
      // (so it does not try to delete it in a context where IO
      // is not allowed).
      ignore_result(temp_dir_.Take));
    }

This reverts back to the status quo (if the temp dir can't be deleted for some reason, it will log a warning but otherwise pass the test).

I think #1 is preferable, but then that leaves us with the same issue: why is HotwordInstallerBrowserTest flakily failing to delete the temporary directory? I don't have a good answer for that. (And I am totally unfamiliar with the hotword code; I am just an OWNER as a backup.)

jam: Which of the above options do you prefer? If #1, I think we should just land the CHECK, and make HotwordInstallerBrowserTest work around this so we can stop the flake.
Cc: -mgiuca@chromium.org
Labels: OS-Windows
Owner: mgiuca@chromium.org
Status: Started (was: Available)
I've put up two competing CLs.

Solution 1: https://codereview.chromium.org/2854323002 (CHECK).
Solution 2: https://codereview.chromium.org/2855123002 (ignore).

These can be manually tested by adding "ret = false" to ScopedTempDir::Delete:

--- a/base/files/scoped_temp_dir.cc
+++ b/base/files/scoped_temp_dir.cc
@@ -62,6 +62,7 @@ bool ScopedTempDir::Delete() {
     return false;
 
   bool ret = base::DeleteFile(path_, true);
+  ret = false;
   if (ret) {
     // We only clear the path if deleted the directory.
     path_.clear();

then running any browser test that uses TestingProfile (e.g., out/Default/browser_tests --gtest_filter='HotwordInstallerBrowserTest.AbortInstallOnShutdown').

Without either CL applied, it will crash in base::DeleteFile. Solution 1 will always crash on the new CHECK. Solution 2 will pass but log a warning.
Cc: devlin@chromium.org
As for the hotword test, I think I know what's going on here.

The test starts installing an extension, then *deliberately* does not wait for it to finish, before shutting down the browser test infrastructure. It is a test that it's safe to shutdown the system while an extension is still being installed.

I think by design that means it's going to sometimes fail to delete the profile directory on Windows, because another thread will still be installing an extension (i.e., have files open) while the UI thread is shutting down. This could be fixed "properly" by having the UI thread gracefully communicate to the IO thread that it is shutting down, but that is a huge refactor that should be done by the extensions team (not me!). +devlin in case you want to think about that possibility.

If this really is Windows-only, I think we can just blacklist the test on Windows. (This code shouldn't even be used outside of Chrome OS anyway.)

CL to disable on Windows: https://codereview.chromium.org/2860463003 (if we go with Solution 1).
Thanks for looking into this! The failure does appear to be Windows only so I think I will try landing your CL to disable on Windows for now. If we do go with solution #2, we can revert it back.
Project Member

Comment 7 by chromium...@appspot.gserviceaccount.com, May 3 2017

Detected 9 new flakes for test/step "HotwordInstallerBrowserTest.AbortInstallOnShutdown". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyPQsSBUZsYWtlIjJIb3R3b3JkSW5zdGFsbGVyQnJvd3NlclRlc3QuQWJvcnRJbnN0YWxsT25TaHV0ZG93bgw. This message was posted automatically by the chromium-try-flakes app.
Project Member

Comment 8 by bugdroid1@chromium.org, May 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a8ace33174ea8cd5ee2cf41804e0e0733be52dd3

commit a8ace33174ea8cd5ee2cf41804e0e0733be52dd3
Author: mgiuca <mgiuca@chromium.org>
Date: Thu May 04 13:47:07 2017

Disable HotwordInstallerBrowserTest.AbortInstallOnShutdown on Windows.

Flaky on Windows because file handles may be open during shutdown (which
causes TestingProfile to crash on shutdown).

BUG= 717648 

Review-Url: https://codereview.chromium.org/2860463003
Cr-Commit-Position: refs/heads/master@{#469322}

[modify] https://crrev.com/a8ace33174ea8cd5ee2cf41804e0e0733be52dd3/chrome/browser/search/hotword_installer_browsertest.cc

Labels: -Via-TryFlakes
Labels: -Sheriff-Chromium
Project Member

Comment 11 by chromium...@appspot.gserviceaccount.com, May 5 2017

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "HotwordInstallerBrowserTest.AbortInstallOnShutdown". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyPQsSBUZsYWtlIjJIb3R3b3JkSW5zdGFsbGVyQnJvd3NlclRlc3QuQWJvcnRJbnN0YWxsT25TaHV0ZG93bgw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
Removing the Sheriff-Chromium label again. All the failures were before the test was disabled on Windows.
Status: Fixed (was: Started)

Sign in to add a comment