
Issue 864205

Starred by 4 users

Issue metadata

Status: Archived
Owner:
Closed: Jan 9
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug

Blocked on:
issue 864915




chromedriver_py_tests flaky again on win7_chromium_rel_ng

Project Member Reported by kbr@chromium.org, Jul 16

Issue description

chromedriver_py_tests has become flaky again on win7_chromium_rel_ng. Here are links to just a few failing builds:

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39144
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39133
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39131
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39116
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39111
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39109
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/39102

(Some of these might be legitimate failures, but the first one, at least, is definitely a flake.)

Several different tests have been observed to be flaky, for example:

__main__.MobileEmulationCapabilityTest.testSendKeysToElement
__main__.SessionHandlingTest.testGetSessions
__main__.SessionHandlingTest.testQuitASessionMoreThanOnce

Similar flakiness, caused by tab crashes, was previously seen in Issue 858963.

johnchen@ could you triage this as you triaged the earlier failure? CC'ing sheriffs and marking P1 as this is affecting the commit queue.

 
Blockedon: 858963
Components: Tests>WebDriver
 Issue 864206  has been merged into this issue.
 Issue 864207  has been merged into this issue.
The error from ChromeDriver indicates that occasionally Chrome failed to start. Are there any recent issues with Chrome startup on Windows?
Not sure. Are there any failures in other test suites like browser_tests which may give hints?

Labels: OS-Windows
 Issue 864692  has been merged into this issue.
 Issue 864696  has been merged into this issue.
Detected 6 new flakes for test/step "__main__.SessionHandlingTest.testQuitASessionMoreThanOnce". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyRAsSBUZsYWtlIjlfX21haW5fXy5TZXNzaW9uSGFuZGxpbmdUZXN0LnRlc3RRdWl0QVNlc3Npb25Nb3JlVGhhbk9uY2UM. This message was posted automatically by the chromium-try-flakes app.
 Issue 864750  has been merged into this issue.
 Issue 864760  has been merged into this issue.
Too many flakes on win7_chromium_rel_ng...

Can we disable or suppress it?
I'm working on a changelist that should help us determine what is going on here. We think that Chrome is crashing, but we can't reproduce this failure locally.
Owner: crouleau@chromium.org
Status: Assigned (was: Untriaged)
Thanks for the update!

Setting crouleau@ as owner for better tracking.
 Issue 864814  has been merged into this issue.
 Issue 864829  has been merged into this issue.
Note that the specific test case that fails is different for each failure. My changelist is here: https://chromium-review.googlesource.com/c/chromium/src/+/1141163 

Below is the chromedriver log from a flaky test from my changelist (search for "chrome not reachable" in here https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8940701586687600832/+/steps/chromedriver_py_tests__with_patch_/0/stdout ). I don't know why "Chrome Automation Extension" is one of the sessions. I thought that we didn't use that anymore.

[1531881541.431][INFO]: Launching chrome: "e:\b\s\w\ir\out\Release\chrome.exe" --disable-background-networking --disable-client-side-phishing-detection --disable-default-apps --disable-hang-monitor --disable-popup-blocking --disable-prompt-on-repost --disable-sync --disable-web-resources --enable-automation --enable-logging --force-fieldtrials=SiteIsolationExtensions/Control --ignore-certificate-errors --load-extension="C:\Users\CHROME~2\AppData\Local\Temp\scoped_dir5132_29986\internal" --log-level=0 --metrics-recording-only --no-first-run --password-store=basic --remote-debugging-port=0 --test-type=webdriver --use-mock-keychain --user-data-dir="C:\Users\CHROME~2\AppData\Local\Temp\scoped_dir5132_5919" data:,
[1531881541.612][DEBUG]: DevTools request: http://localhost:59489/json/version
[1531881542.012][DEBUG]: DevTools response: {
   "Browser": "Chrome/69.0.3495.0",
   "Protocol-Version": "1.3",
   "User-Agent": "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3495.0 Safari/537.36",
   "V8-Version": "6.9.423",
   "WebKit-Version": "537.36 (@59db2de675cb0bd93e784b691265082a5da52db0)",
   "webSocketDebuggerUrl": "ws://localhost:59489/devtools/browser/5e801aa4-696a-4c89-9741-bbefa55d4dca"
}
[1531881542.016][DEBUG]: DevTools request: http://localhost:59489/json
[1531881542.024][DEBUG]: DevTools response: [ {
   "description": "",
   "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:59489/devtools/page/0F3F4CD53EB0712DEE5D92792CCFD705",
   "id": "0F3F4CD53EB0712DEE5D92792CCFD705",
   "title": "Chrome Automation Extension",
   "type": "background_page",
   "url": "chrome-extension://aapnijgdinlhnhlmodcfapnahmbfebeb/_generated_background_page.html",
   "webSocketDebuggerUrl": "ws://localhost:59489/devtools/page/0F3F4CD53EB0712DEE5D92792CCFD705"
}, {
   "description": "",
   "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:59489/devtools/page/37163917C2E75399E4C3FDABFB703143",
   "id": "37163917C2E75399E4C3FDABFB703143",
   "title": "",
   "type": "page",
   "url": "data:,",
   "webSocketDebuggerUrl": "ws://localhost:59489/devtools/page/37163917C2E75399E4C3FDABFB703143"
} ]
[1531881542.025][DEBUG]: DevTools request: http://localhost:59489/json
[1531881542.027][DEBUG]: DevTools response: [ {
   "description": "",
   "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:59489/devtools/page/0F3F4CD53EB0712DEE5D92792CCFD705",
   "id": "0F3F4CD53EB0712DEE5D92792CCFD705",
   "title": "Chrome Automation Extension",
   "type": "background_page",
   "url": "chrome-extension://aapnijgdinlhnhlmodcfapnahmbfebeb/_generated_background_page.html",
   "webSocketDebuggerUrl": "ws://localhost:59489/devtools/page/0F3F4CD53EB0712DEE5D92792CCFD705"
}, {
   "description": "",
   "devtoolsFrontendUrl": "/devtools/inspector.html?ws=localhost:59489/devtools/page/37163917C2E75399E4C3FDABFB703143",
   "id": "37163917C2E75399E4C3FDABFB703143",
   "title": "",
   "type": "page",
   "url": "data:,",
   "webSocketDebuggerUrl": "ws://localhost:59489/devtools/page/37163917C2E75399E4C3FDABFB703143"
} ]
[1531881542.028][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1531881542.028][DEBUG]: WebSocket::Connect code=ERR_IO_PENDING
[1531881544.032][WARNING]: Timed out connecting to Chrome, retrying...
[1531881544.033][INFO]: resolved localhost to ["::1","127.0.0.1"]
[1531881544.033][DEBUG]: WebSocket::Connect code=ERR_IO_PENDING
[1531881546.091][DEBUG]: WebSocket::OnSocketConnect code=ERR_CONNECTION_REFUSED
[1531881546.091][DEBUG]: failed to connect to localhost (error -102)
[1531881546.092][DEBUG]: DevTools request: http://localhost:59489/json
[1531881548.137][DEBUG]: DevTools request failed
[1531881548.142][SEVERE]: Unable to terminate process: Access is denied. (0x5)
[1531881548.142][INFO]: RESPONSE InitSession session not created exception
from chrome not reachable
  (Session info: chrome=69.0.3495.0)
[1531881548.142][DEBUG]: Log type 'driver' lost 2 entries on destruction
[1531881548.142][DEBUG]: Log type 'browser' lost 0 entries on destruction
[1531881548.153][INFO]: COMMAND InitSession {
   "desiredCapabilities": {
      "chromeOptions": {

      },
      "goog:testName": "__main__.ChromeDriverTest.testShadowDomText",
      "loggingPrefs": {

      }
   }
}
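
To make the timing in the log above easier to read, here is a minimal Python sketch that parses ChromeDriver log lines of the form "[timestamp][LEVEL]: message" and measures how long after launch the connection was refused. The function names are hypothetical, and the sample contains only a handful of lines copied from the log above, with the long Chrome command line left out.

```python
import re

# Matches ChromeDriver log lines such as "[1531881541.431][INFO]: message".
LOG_LINE = re.compile(r"^\[(\d+\.\d+)\]\[(\w+)\]: (.*)$")


def parse_log(text):
    """Return a (timestamp, level, message) tuple for every timestamped line."""
    entries = []
    for line in text.splitlines():
        match = LOG_LINE.match(line)
        if match:
            entries.append((float(match.group(1)), match.group(2), match.group(3)))
    return entries


def seconds_between(entries, start_marker, end_marker):
    """Seconds from the first line containing start_marker to the first
    line at or after it that contains end_marker."""
    start = next(t for t, _, msg in entries if start_marker in msg)
    end = next(t for t, _, msg in entries if end_marker in msg and t >= start)
    return end - start


# A few lines taken from the log above (command line omitted for brevity).
SAMPLE = """\
[1531881541.431][INFO]: Launching chrome: (command line omitted)
[1531881541.612][DEBUG]: DevTools request: http://localhost:59489/json/version
[1531881542.028][DEBUG]: WebSocket::Connect code=ERR_IO_PENDING
[1531881544.032][WARNING]: Timed out connecting to Chrome, retrying...
[1531881546.091][DEBUG]: WebSocket::OnSocketConnect code=ERR_CONNECTION_REFUSED
[1531881548.137][DEBUG]: DevTools request failed
"""

entries = parse_log(SAMPLE)
elapsed = seconds_between(entries, "Launching chrome", "ERR_CONNECTION_REFUSED")
print(f"Connection refused {elapsed:.2f} s after launch")
```

On this sample the gap is about 4.7 seconds, which matches the "few seconds after startup" pattern seen in the failing runs.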
The additional information from the changelist by crouleau@ indicates that the flakiness was caused by Chrome occasionally crashing a few seconds after it is started by ChromeDriver.

* In builds https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/40718 and https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/40913, ChromeDriver was able to make an initial connection to Chrome. But a few seconds later, additional connection attempts resulted in an ERR_CONNECTION_REFUSED error, indicating that Chrome was no longer listening on the TCP port, most likely because it had crashed.

* In build https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/40826, a few seconds after ChromeDriver started Chrome, it noticed that the Chrome process no longer existed. Calling base::GetTerminationStatus returned TERMINATION_STATUS_PROCESS_CRASHED.
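
The "connection refused means nothing is listening" reasoning in the first bullet can be checked by hand with a small probe. This is only a sketch using Python's standard socket module; the helper name is hypothetical and this is not how ChromeDriver itself performs the check.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds.

    A refused connection (nothing listening), a timeout, or any other
    socket-level failure is reported as False.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (the port is the one from the log earlier in this thread; in
# practice it changes on every launch because of --remote-debugging-port=0):
#   is_port_open("localhost", 59489)
```

If this returns True right after launch and False a few seconds later, the process that owned the port has most likely exited.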

Estimating from the recent history of win7_chromium_rel_ng builds, each startup of Chrome has about 0.5% chance of crashing. Since each run of chromedriver_py_tests starts up Chrome about 130 times, there is still a significant chance for each run of chromedriver_py_tests to encounter at least one error.
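
To put a number on that estimate: if each Chrome startup is assumed to crash independently with probability 0.5% (independence is an assumption, not something measured here), the chance that a run with about 130 startups hits at least one crash is 1 - 0.995^130. A short sketch of the arithmetic:

```python
def chance_of_at_least_one_crash(per_startup_crash_rate, startups_per_run):
    """Probability that at least one of the startups crashes, assuming
    each startup fails independently with the same probability."""
    return 1.0 - (1.0 - per_startup_crash_rate) ** startups_per_run


# Figures from the estimate above: ~0.5% per startup, ~130 startups per run.
p = chance_of_at_least_one_crash(0.005, 130)
print(f"Chance of at least one crash per run: {p:.1%}")
```

Under these assumptions, roughly half of all runs (about 48%) would see at least one Chrome crash, which is consistent with how often the suite was failing.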

However, it is not clear why Chrome crashed. The flakiness appears to have started around build 38973 (https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/38973), which suggests that the flakiness was introduced around commit position 575335 or so. However, looking at the commits around that position didn't reveal a clear culprit. I am also unable to repro this issue on my own Windows box.

Is there any good way to debug crashes during a test on a swarming bot, such as collecting crash dumps and debug symbols?
I don't know where the minidump file is located, but I guess the easiest first step would be to find the minidump file and write it to the isolated outputs dir.

Here's a bug I filed based on a comment kbr@ made about improving ChromeDriver tests debugging: https://bugs.chromium.org/p/chromium/issues/detail?id=864818
Cc: mar...@chromium.org
I don't know about collecting crash dumps, but it is possible to debug a failing swarm bot -- See https://groups.google.com/a/google.com/d/msgid/chrome-team/CANAQWOVpKFAH6Rode8W4CVW3ShaWDm7QxOyMPRbV%2B_hxwpgfLQ%40mail.gmail.com?utm_medium=email&utm_source=footer (sorry, Google-internal ML)

+maruel for more info if needed.
Owner: johnchen@chromium.org
Reassigning to John who has more cycles for this right now (and more expertise besides :) )
Note that one recent try job which was failing chromedriver_py_tests instead failed browser_tests:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/40864

and this one produced stack traces:

[4512:3148:0717/191432.566:FATAL:ref_counted.h(83)] Check failed: CalledOnValidSequence(). 

Backtrace:
	base::debug::StackTrace::StackTrace [0x03ABEB70+32]
	base::debug::StackTrace::StackTrace [0x03ABE32D+13]
	logging::LogMessage::~LogMessage [0x03AD7FA3+83]
	base::subtle::RefCountedBase::Release [0x01089301+145]
	syncer::BlockingModelTypeStoreImpl::~BlockingModelTypeStoreImpl [0x06D9953B+59]
	syncer::BlockingModelTypeStoreImpl::`scalar deleting destructor' [0x06D9A15B+11]
	std::unique_ptr<CPDF_Parser::TrailerData,std::default_delete<CPDF_Parser::TrailerData> >::~unique_ptr<CPDF_Parser::TrailerData,std::default_delete<CPDF_Parser::TrailerData> > [0x02624B46+22]
	base::internal::BindState<void (__cdecl*)(base::OnceCallback<void __cdecl(std::unique_ptr<policy::URLBlacklist,std::default_delete<policy::URLBlacklist> >)>,std::unique_ptr<policy::URLBlacklist,std::default_delete<policy::URLBlacklist> > *),base::OnceCall [0x0264983B+27]
	base::internal::CallbackBase::~CallbackBase [0x03AB91C9+25]
	base::internal::PostTaskAndReplyImpl::PostTaskAndReply [0x04B2E334+1508]
	base::internal::PostTaskAndReplyImpl::PostTaskAndReply [0x04B2E513+1987]
	base::internal::CallbackBase::~CallbackBase [0x03AB91C9+25]
	base::internal::IncomingTaskQueue::TriageQueue::Clear [0x04B1A5B7+375]
	base::MessageLoop::DeletePendingTasks [0x03AE1365+101]
	base::MessageLoop::~MessageLoop [0x03AE3097+247]
	base::MessageLoopForUI::`scalar deleting destructor' [0x039D841B+11]
	content::ContentMainRunnerImpl::Shutdown [0x03A74D9E+222]
	service_manager::Main [0x0417BE8C+1484]
	content::ContentMain [0x03A74293+51]
	content::BrowserTestBase::SetUp [0x03BC5356+1942]
	InProcessBrowserTest::SetUp [0x03B88006+566]
	chrome_browser_interstitials::SecurityInterstitialIDNTest::VerifyIDNDecoded [0x0119EEEC+1436]
	testing::Test::Run [0x01CB86D0+112]
	testing::TestInfo::Run [0x01CB8F32+210]
	testing::TestCase::Run [0x01CB93E4+244]
	testing::internal::UnitTestImpl::RunAllTests [0x01CBF585+629]
	testing::UnitTest::Run [0x01CBF209+153]
	base::TestSuite::Run [0x03B99C34+100]
	ChromeTestSuiteRunner::RunTestSuite [0x07AEB59C+44]
	content::LaunchTests [0x03BDBEA2+418]
	LaunchChromeTests [0x07AEB943+259]
	main [0x07AEB52F+111]
	__scrt_common_main_seh [0x07B0941C+250] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
	BaseThreadInitThunk [0x76F0337A+18]
	RtlInitializeExceptionChain [0x779C92B2+99]
	RtlInitializeExceptionChain [0x779C9285+54]

[105/1037] IOThreadBrowserTestWithHangingPacRequest.Shutdown (CRASHED)


Not sure whether it's the same issue and/or has already been reverted, but worth investigating.

 Issue #864848  appears to be similar to this one, except it contains a chromedriver_py_tests failure on Linux, with a Chrome crash at the following call stack (the error message "Check failed: CalledOnValidSequence()" is identical to the error message in comment #23, though most of the call stack is different):

[2144:2144:0717/154814.867678:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence(). 
#0 0x55a58c196f8c base::debug::StackTrace::StackTrace()
#1 0x55a58c0e505b logging::LogMessage::~LogMessage()
#2 0x55a58974e617 base::subtle::RefCountedBase::AddRef()
#3 0x55a58ccb2f59 network::ResourceResponseInfo::ResourceResponseInfo()
#4 0x55a58d315cac network::(anonymous namespace)::SimpleURLLoaderImpl::OnReceiveResponse()
#5 0x55a58983d98f network::mojom::URLLoaderClientStubDispatch::Accept()
#6 0x55a58c7fa4e2 mojo::InterfaceEndpointClient::HandleValidatedMessage()
#7 0x55a58c7fa066 mojo::FilterChain::Accept()
#8 0x55a58c7fb9e2 mojo::InterfaceEndpointClient::HandleIncomingMessage()
#9 0x55a58c802dad mojo::internal::MultiplexRouter::ProcessIncomingMessage()
#10 0x55a58c802160 mojo::internal::MultiplexRouter::Accept()
#11 0x55a58c7fa066 mojo::FilterChain::Accept()
#12 0x55a58c7f793b mojo::Connector::ReadSingleMessage()
#13 0x55a58c7f8434 mojo::Connector::ReadAllAvailableMessages()
#14 0x55a58c7f8296 mojo::Connector::OnHandleReadyInternal()
#15 0x55a589ee7724 mojo::SimpleWatcher::DiscardReadyState()
#16 0x55a58c7eb853 mojo::SimpleWatcher::OnHandleReady()
#17 0x55a58c7ebdce _ZN4base8internal7InvokerINS0_9BindStateIMN4mojo13SimpleWatcherEFvijRKNS3_18HandleSignalsStateEEJNS_7WeakPtrIS4_EEijS5_EEEFvvEE7RunImplIRKS9_RKNSt3__15tupleIJSB_ijS5_EEEJLm0ELm1ELm2ELm3EEEEvOT_OT0_NSI_16integer_sequenceImJXspT1_EEEE
#18 0x55a58c0edffd base::debug::TaskAnnotator::RunTask()
#19 0x55a58c0ec676 base::MessageLoop::RunTask()
#20 0x55a58c0eca8a base::MessageLoop::DeferOrRunPendingTask()
#21 0x55a58c0ecdbc base::MessageLoop::DoWork()
#22 0x55a58c0f252f base::(anonymous namespace)::WorkSourceDispatch()
#23 0x7fec27f46e04 g_main_context_dispatch
#24 0x7fec27f47048 <unknown>
#25 0x7fec27f470ec g_main_context_iteration
#26 0x55a58c0f22e2 base::MessagePumpGlib::Run()
#27 0x55a58c0ebfe1 base::MessageLoop::Run()
#28 0x55a58c118686 base::RunLoop::Run()
#29 0x55a58bd349e8 ChromeBrowserMainParts::MainMessageLoopRun()
#30 0x55a58a1f5a97 content::BrowserMainLoop::RunMainMessageLoopParts()
#31 0x55a58a1f8c13 content::BrowserMainRunnerImpl::Run()
#32 0x55a58a1f1829 content::BrowserMain()
#33 0x55a58bcd85d4 content::ContentMainRunnerImpl::Run()
#34 0x55a58bd0ecd9 service_manager::Main()
#35 0x55a58bcd6631 content::ContentMain()
#36 0x55a5893d11b3 ChromeMain
#37 0x7fec24182f45 __libc_start_main
#38 0x55a5893d102a _start
Detected 6 new flakes for test/step "__main__.SessionHandlingTest.testGetSessions". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixfX21haW5fXy5TZXNzaW9uSGFuZGxpbmdUZXN0LnRlc3RHZXRTZXNzaW9ucww. This message was posted automatically by the chromium-try-flakes app.
Project Member

Comment 26 by bugdroid1@chromium.org, Jul 18

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1ca17c55c4e23840893be64e18b80518ea26da85

commit 1ca17c55c4e23840893be64e18b80518ea26da85
Author: Caleb Rouleau <crouleau@chromium.org>
Date: Wed Jul 18 21:47:22 2018

[ChromeDriver] Check for Chrome crashes after 1 second rather than 60 seconds

Bug:  864205 
Change-Id: I5d9ad26b1f14ae28ca1b80399539dbb3c3a978ee
Reviewed-on: https://chromium-review.googlesource.com/1141163
Reviewed-by: John Chen <johnchen@chromium.org>
Commit-Queue: Caleb Rouleau <crouleau@chromium.org>
Cr-Commit-Position: refs/heads/master@{#576227}
[modify] https://crrev.com/1ca17c55c4e23840893be64e18b80518ea26da85/chrome/test/chromedriver/chrome_launcher.cc
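
The idea named in the commit title above, noticing a crashed browser after about a second instead of waiting out a long timeout, can be illustrated with a small polling loop. This is a hypothetical Python sketch built on the standard subprocess module; it is not the code in chrome_launcher.cc, and the function name and parameters are assumptions made for illustration.

```python
import subprocess
import time


def wait_for_early_exit(process, grace_period=1.0, poll_interval=0.05):
    """Poll a subprocess.Popen object for up to grace_period seconds.

    Returns the exit code if the process ended within that window,
    or None if it was still running when the window closed.
    """
    deadline = time.monotonic() + grace_period
    while time.monotonic() < deadline:
        exit_code = process.poll()
        if exit_code is not None:
            return exit_code
        time.sleep(poll_interval)
    # One last check so an exit right at the deadline is not missed.
    return process.poll()
```

A launcher could call this right after starting the browser and report a startup crash immediately when it returns a non-None value, rather than continuing to retry connections against a process that no longer exists.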

magchen@, sunnyps@ and I have been debugging a problem with a CL (the root cause was something else) and were looking into this:

Failing try job:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win7_chromium_rel_ng/41655

Failing shard:
https://chromium-swarm.appspot.com/task?id=3ec7955f221dfb10&refresh=10&show_raw=1

We downloaded the isolate as follows:

C:\src\debugging-hung-tests\output>python ..\chromium\src\tools\swarming_client\isolateserver.py download -I isolateserver.appspot.com -s 4c4c1c6d1212d3bb8b8aa12e10778b288ad001b7 -t output

cd output

out\Release\chrome.exe --user-data-dir=C:\Users\magchen\temp\t1 --enable-logging -- http://google.com

The browser reliably crashed on startup with the following stack trace. Maybe this is related? I'm not convinced, because the chromedriver_py_tests failures are intermittent, but this one happened every time.

[47536:21020:0718/135107.095:FATAL:resource_bundle.cc(123)] Check failed: false. Unable to load image with id 277, scale=2
Backtrace:
        base::debug::StackTrace::StackTrace [0x52439730+32]
        base::debug::StackTrace::StackTrace [0x52438F2D+13]
        logging::LogMessage::~LogMessage [0x524533F3+83]
        ui::ResourceBundle::ResourceBundleImageSource::GetImageForScale [0x524E256A+446]
        gfx::internal::ImageSkiaStorage::FindRepresentation [0x5255F459+597]
        gfx::ImageSkia::GetRepresentation [0x52560184+118]
        gfx::ImageSkiaOperations::CreateIconWithBadge [0x5256BACD+3111]
        gfx::internal::ImageSkiaStorage::FindRepresentation [0x5255F459+597]
        gfx::ImageSkia::GetRepresentation [0x52560184+118]
        gfx::Canvas::DrawImageInt [0x5256957B+27]
        gfx::Canvas::DrawImageInt [0x52569545+73]
        TabIcon::OnPaint [0x54041C83+425]
        views::View::Paint [0x52400658+968]
        views::View::RecursivePaintHelper [0x5240225C+172]
        views::View::PaintChildren [0x524020DE+92]
        Tab::PaintChildren [0x53E6895B+165]
        views::View::Paint [0x52400680+1008]
        TabStrip::PaintChildren [0x53B316AF+1105]
        views::View::Paint [0x52400680+1008]
        views::View::RecursivePaintHelper [0x5240225C+172]
        views::View::PaintChildren [0x524020DE+92]
        TopContainerView::PaintChildren [0x53B412CE+166]
        views::View::Paint [0x52400680+1008]
        views::View::RecursivePaintHelper [0x5240225C+172]
        views::View::PaintChildren [0x524020DE+92]
        BrowserView::PaintChildren [0x536371A8+32]
        views::View::Paint [0x52400680+1008]
        views::View::RecursivePaintHelper [0x5240225C+172]
        views::View::PaintChildren [0x524020DE+92]
        views::View::Paint [0x52400680+1008]
        views::View::RecursivePaintHelper [0x5240225C+172]
        views::View::PaintChildren [0x524020DE+92]
        views::View::Paint [0x52400680+1008]
        views::View::PaintFromPaintRoot [0x52402A97+63]
        ui::Layer::PaintContentsToDisplayList [0x5240E096+238]
        cc::PictureLayer::Update [0x52A51CC8+344]
        cc::LayerTreeHost::PaintContent [0x52A4B3C9+137]
        cc::LayerTreeHost::DoUpdateLayers [0x52A4AA40+2832]
        cc::LayerTreeHost::UpdateLayers [0x52A49E79+121]
        cc::SingleThreadProxy::BeginMainFrame [0x52FC27C6+342]
        base::internal::Invoker<base::internal::BindState<void (__thiscall cc::SingleThreadProxy::*)(viz::BeginFrameArgs const &),base::WeakPtr<cc::SingleThreadProxy>,viz::BeginFrameArgs>,void __cdecl(void)>::RunOnce [0x52FC30CD+157]
        base::debug::TaskAnnotator::RunTask [0x52B19D52+306]
        base::MessageLoop::RunTask [0x5245D093+467]
        base::MessageLoop::DeferOrRunPendingTask [0x5245D3ED+157]
        base::MessageLoop::DoWork [0x5245D667+599]
        base::MessagePumpForUI::DoRunLoop [0x5245F598+120]
        base::MessagePumpWin::Run [0x5245F121+65]
        base::MessageLoop::Run [0x5245CBE7+119]
        base::RunLoop::Run [0x5247F59C+204]
        ChromeBrowserMainParts::MainMessageLoopRun [0x530170BC+190]
        content::BrowserMainLoop::RunMainMessageLoopParts [0x51D3C199+59]
        content::BrowserMainRunnerImpl::Run [0x51D3E624+142]
        content::BrowserMain [0x51D390D5+157]
        content::RunBrowserProcessMain [0x523EFF78+84]
        content::ContentMainRunnerImpl::Run [0x523F06D6+680]
        content::ContentServiceManagerMainDelegate::RunEmbedderProcess [0x523EFCFF+19]
        service_manager::Main [0x523FA638+1384]
        content::ContentMain [0x523EFEFB+51]
        ChromeMain [0x515E119C+288]
        MainDllLoader::Launch [0x01355790+560]
        wWinMain [0x01351543+1347]
        __scrt_common_main_seh [0x0142D12A+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)

Detected 8 new flakes for test/step "__main__.SessionHandlingTest.testQuitASessionMoreThanOnce". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyRAsSBUZsYWtlIjlfX21haW5fXy5TZXNzaW9uSGFuZGxpbmdUZXN0LnRlc3RRdWl0QVNlc3Npb25Nb3JlVGhhbk9uY2UM. This message was posted automatically by the chromium-try-flakes app.
Following the suggestion in comment #21, I was able to reserve a swarming bot and repro this bug. I confirmed that the failures are caused by Chrome crashing. The following stack was displayed in the console at the time of the crash:

[5040:7888:0718/153618.444:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence().
Backtrace:
        base::debug::StackTrace::StackTrace [0x65D84A10+32]
        base::debug::StackTrace::StackTrace [0x65D8420D+13]
        logging::LogMessage::~LogMessage [0x65D9E6D3+83]
        base::subtle::RefCountedBase::AddRef [0x65187BBF+229]
        network::ResourceResponseInfo::ResourceResponseInfo [0x66017DDE+150]
        network::SimpleURLLoader::Create [0x660A7DC2+3906]
        network::mojom::URLLoaderClientStubDispatch::Accept [0x6522DB30+1820]
        network::mojom::URLLoaderClientStub<mojo::RawPtrImplRefTraits<network::mojom::URLLoaderClient> >::Accept [0x65283DE9+19]
        mojo::InterfaceEndpointClient::HandleValidatedMessage [0x65F2F5B9+541]
        mojo::FilterChain::Accept [0x66535F4F+131]
        mojo::InterfaceEndpointClient::HandleIncomingMessage [0x65F3046A+106]
        mojo::internal::MultiplexRouter::ProcessIncomingMessage [0x65F33A1C+698]
        mojo::internal::MultiplexRouter::Accept [0x65F335A3+295]
        mojo::FilterChain::Accept [0x66535F4F+131]
        mojo::Connector::ReadSingleMessage [0x65F2DC3A+364]
        mojo::Connector::ReadAllAvailableMessages [0x65F2E36C+88]
        mojo::Connector::OnHandleReadyInternal [0x65F2E23A+126]
        base::internal::Invoker<base::internal::BindState<void (__thiscall ui::mojom::WindowTreeClient_OnDragOver_ProxyToResponder::*)(unsigned int),std::unique_ptr<ui::mojom::WindowTreeClient_OnDragOver_ProxyToResponder,std::default_delete<ui::mojom::WindowTreeC [0x6518D71F+15]
        favicon::FaviconService::FaviconResultsCallbackRunner [0x654AE034+24]
        base::internal::Invoker<base::internal::BindState<void (__cdecl*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_strin [0x654AE04F+19]
        mojo::SimpleWatcher::OnHandleReady [0x65ECAC78+224]
        base::internal::Invoker<base::internal::BindState<void (__thiscall mojo::SimpleWatcher::*)(int,unsigned int,mojo::HandleSignalsState const &),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void __cdecl(void)>::Run [0x65ECAEF4+58]
        base::debug::TaskAnnotator::RunTask [0x66A9B212+306]
        base::internal::IncomingTaskQueue::RunTask [0x66428AC9+105]
        base::MessageLoop::RunTask [0x65DA8107+519]
        base::MessageLoop::DeferOrRunPendingTask [0x65DA845D+157]
        base::MessageLoop::DoWork [0x65DA86D7+599]
        base::MessagePumpForUI::DoRunLoop [0x65DAA598+120]
        base::MessagePumpWin::Run [0x65DAA121+65]
        base::MessageLoop::Run [0x65DA7C27+119]
        base::RunLoop::Run [0x65DCA58C+204]
        ChromeBrowserMainParts::MainMessageLoopRun [0x6695E29C+190]
        content::BrowserMainLoop::RunMainMessageLoopParts [0x656962D9+59]
        content::BrowserMainRunnerImpl::Run [0x65698764+142]
        content::BrowserMain [0x65693215+157]
        content::RunBrowserProcessMain [0x65D4CBA8+84]
        content::ContentMainRunnerImpl::Run [0x65D4D306+680]
        content::ContentServiceManagerMainDelegate::RunEmbedderProcess [0x65D4C92F+19]
        service_manager::Main [0x65D57234+1384]
        content::ContentMain [0x65D4CB2B+51]
        ChromeMain [0x64F2119C+288]
        MainDllLoader::Launch [0x012E5790+560]
        wWinMain [0x012E1543+1347]
        __scrt_common_main_seh [0x013BD12A+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
        BaseThreadInitThunk [0x7704338A+18]
        RtlInitializeExceptionChain [0x77849F72+99]
        RtlInitializeExceptionChain [0x77849F45+54]

(Note that on Linux, this stack trace is saved in ChromeDriver log, and easily available after the crash. On Windows, however, the stack trace is displayed in the console window, and thus much harder to get. Something to investigate and fix later.)
Components: UI>Browser>TabStrip
Who's working on favicons and would be able to find the culprit CL causing this crash?

Cc: pkasting@chromium.org
pkasting (already implicitly cc'd) might have ideas.
The stack in comment 27 reports a failure to load image 277 at scale factor 2.  That's IDR_DEFAULT_FAVICON, which should be part of the resource bundle at scale factor 2 (it is in codesearch), which suggests to me some sort of problem with the build on that machine.

Comment 29's stack looks different -- the FaviconService isn't used by the tabstrip.  Omnibox uses it for favicons in results.  But it looks to me like the stack there isn't the fault of the omnibox code but maybe something in mojo.
chromedriver_py_tests failed three times on a CL, preventing it from landing: 
https://chromium-review.googlesource.com/c/chromium/src/+/1141962/2 

Is it possible to disable it on Windows until the issue is fixed?


Blockedon: -858963 864915
I've collected two Chrome crash dumps from Windows, and have attached them. The corresponding Chrome binary and symbols can be found at https://isolateserver.appspot.com/browse?namespace=default-gzip&hash=001b02d98750e9857d5168b969a33dbcc0a00a90

The stack trace is:

[3524:5576:0719/094148.198:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence().
Backtrace:
        base::debug::StackTrace::StackTrace [0x676BCA80+32]
        base::debug::StackTrace::StackTrace [0x676BC27D+13]
        logging::LogMessage::~LogMessage [0x676D6743+83]
        base::subtle::RefCountedBase::AddRef [0x66AEB7BF+229]
        network::ResourceResponseInfo::ResourceResponseInfo [0x6794FFBE+150]
        network::SimpleURLLoader::Create [0x679DD97A+3906]
        network::mojom::URLLoaderClientStubDispatch::Accept [0x66B91A02+1806]
        network::mojom::URLLoaderClientStub<mojo::RawPtrImplRefTraits<network::mojom::URLLoaderClient> >::Accept [0x66BE7BD9+19]
        mojo::InterfaceEndpointClient::HandleValidatedMessage [0x67867C31+541]
        mojo::FilterChain::Accept [0x67E6758F+131]
        mojo::InterfaceEndpointClient::HandleIncomingMessage [0x67868AE2+106]
        mojo::internal::MultiplexRouter::ProcessIncomingMessage [0x6786C094+698]
        mojo::internal::MultiplexRouter::Accept [0x6786BC1B+295]
        mojo::FilterChain::Accept [0x67E6758F+131]
        mojo::Connector::ReadSingleMessage [0x678662B2+364]
        mojo::Connector::ReadAllAvailableMessages [0x678669E4+88]
        mojo::Connector::OnHandleReadyInternal [0x678668B2+126]
        base::internal::Invoker<base::internal::BindState<void (__thiscall WebRtcRemoteEventLogManager::*)(enum network::mojom::ConnectionType),base::internal::UnretainedWrapper<WebRtcRemoteEventLogManager> >,void __cdecl(enum network::mojom::ConnectionType)>::Ru [0x66AF131F+15]
        favicon::FaviconService::FaviconResultsCallbackRunner [0x66DEED50+24]
        base::internal::Invoker<base::internal::BindState<void (__cdecl*)(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,std::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::allocator<std::basic_strin [0x66DEED6B+19]
        mojo::SimpleWatcher::OnHandleReady [0x67803548+224]
        base::internal::Invoker<base::internal::BindState<void (__thiscall mojo::SimpleWatcher::*)(int,unsigned int,mojo::HandleSignalsState const &),base::WeakPtr<mojo::SimpleWatcher>,int,unsigned int,mojo::HandleSignalsState>,void __cdecl(void)>::Run [0x678037C4+58]
        base::debug::TaskAnnotator::RunTask [0x67D5A212+306]
        base::MessageLoop::RunTask [0x676E03E3+467]
        base::MessageLoop::DeferOrRunPendingTask [0x676E073D+157]
        base::MessageLoop::DoWork [0x676E09B7+599]
        base::MessagePumpForUI::DoRunLoop [0x676E28E8+120]
        base::MessagePumpWin::Run [0x676E2471+65]
        base::MessageLoop::Run [0x676DFF37+119]
        base::RunLoop::Run [0x677028EC+204]
        ChromeBrowserMainParts::MainMessageLoopRun [0x68285E3C+190]
        content::BrowserMainLoop::RunMainMessageLoopParts [0x66FD32F9+59]
        content::BrowserMainRunnerImpl::Run [0x66FD5784+142]
        content::BrowserMain [0x66FD0235+157]
        content::RunBrowserProcessMain [0x6768537C+84]
        content::ContentMainRunnerImpl::Run [0x67685ADA+680]
        content::ContentServiceManagerMainDelegate::RunEmbedderProcess [0x67685103+19]
        service_manager::Main [0x6768FA20+1384]
        content::ContentMain [0x676852FF+51]
        ChromeMain [0x6688119C+288]
        MainDllLoader::Launch [0x01395790+560]
        wWinMain [0x01391543+1347]
        __scrt_common_main_seh [0x0146D12A+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
        BaseThreadInitThunk [0x7596336A+18]
        RtlInitializeExceptionChain [0x77539902+99]
        RtlInitializeExceptionChain [0x775398D5+54]

The stack trace is somewhat different from the stack traces in earlier comments, except that they all contain network and mojo methods at the top, and all are triggered by the "Check failed: CalledOnValidSequence()" error.
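
One way to compare crashes across comments is to pull out just the FATAL lines and group them by message. The sketch below is a hypothetical helper, not an existing Chromium tool; its regular expression is written against the FATAL lines quoted in this thread and assumes the "[pid:tid:MMDD/HHMMSS.fraction:FATAL:file(line)] message" layout they share.

```python
import re
from collections import Counter

# Example: [3524:5576:0719/094148.198:FATAL:ref_counted.h(61)] Check failed: ...
FATAL_LINE = re.compile(
    r"^\[\d+:\d+:\d{4}/\d{6}\.\d+:FATAL:([^()\]]+)\((\d+)\)\]\s*(.*)$"
)


def count_fatal_messages(log_text):
    """Count how often each FATAL message appears in the given log text."""
    counts = Counter()
    for line in log_text.splitlines():
        match = FATAL_LINE.match(line.strip())
        if match:
            counts[match.group(3).strip()] += 1
    return counts


# The FATAL lines quoted in the comments of this issue.
FATAL_LINES = """\
[4512:3148:0717/191432.566:FATAL:ref_counted.h(83)] Check failed: CalledOnValidSequence().
[2144:2144:0717/154814.867678:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence().
[47536:21020:0718/135107.095:FATAL:resource_bundle.cc(123)] Check failed: false. Unable to load image with id 277, scale=2
[5040:7888:0718/153618.444:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence().
[3524:5576:0719/094148.198:FATAL:ref_counted.h(61)] Check failed: CalledOnValidSequence().
"""

counts = count_fatal_messages(FATAL_LINES)
for message, count in counts.most_common():
    print(count, message)
```

Grouped this way, four of the five quoted crashes share the CalledOnValidSequence() check, and the resource bundle failure stands apart.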

As mentioned in  issue 864848 , this might have the same root cause as issue 864915.
Attachments:
a8b5a2f8-3e5d-4036-97c2-b3f6291ba2fb.dmp (1.6 MB)
1bfec7a5-0a0a-4583-b9fc-ce1221464fc3.dmp (1.6 MB)
 Issue 864848  has been merged into this issue.
 Issue 865234  has been merged into this issue.
 Issue 865584  has been merged into this issue.
 Issue 865771  has been merged into this issue.
 Issue 865772  has been merged into this issue.
 Issue 865773  has been merged into this issue.
Status: Fixed (was: Assigned)
This *should* be fixed now that crrev.com/576636, the change for crbug.com/864915, is submitted.
Project Member

Comment 42 by bugdroid1@chromium.org, Jul 20

Labels: merge-merged-3440
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/020005247508092e88be9246f281f822b5751430

commit 020005247508092e88be9246f281f822b5751430
Author: Caleb Rouleau <crouleau@chromium.org>
Date: Fri Jul 20 18:01:12 2018

[ChromeDriver] Check for Chrome crashes after 1 second rather than 60 seconds

TBR=crouleau@chromium.org

Bug:  865982 ,  864205 
Change-Id: I5d9ad26b1f14ae28ca1b80399539dbb3c3a978ee
Reviewed-on: https://chromium-review.googlesource.com/1141163
Reviewed-by: John Chen <johnchen@chromium.org>
Commit-Queue: Caleb Rouleau <crouleau@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#576227}(cherry picked from commit 1ca17c55c4e23840893be64e18b80518ea26da85)
Reviewed-on: https://chromium-review.googlesource.com/1145561
Cr-Commit-Position: refs/branch-heads/3440@{#730}
Cr-Branched-From: 010ddcfda246975d194964ccf20038ebbdec6084-refs/heads/master@{#561733}
[modify] https://crrev.com/020005247508092e88be9246f281f822b5751430/chrome/test/chromedriver/chrome_launcher.cc

 Issue 865842  has been merged into this issue.
 Issue 876948  has been merged into this issue.
Labels: -Sheriff-Chromium
Status: Available (was: Fixed)
We got another report that "__main__.SessionHandlingTest.testGetSessions" is flaky. https://bugs.chromium.org/p/chromium/issues/detail?id=876948

I merged that into this issue because the bug says it was previously tracked here. It looks like bug 865842 was also merged into this one.

Could you take a look at this new flakiness when you have a chance?

Labels: Hotlist-DesktopUIToolingRequired Hotlist-DesktopUIChecked
***UI Mass Triage ***
Status: Archived (was: Available)
General flakiness of chromedriver_py_tests has been addressed by r598796 and r616565. Flakiness of "__main__.SessionHandlingTest.testGetSessions" is being tracked by issue 899919.
