Win10 Tests x64 (dbg) bot is flaky
Issue description
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29
Almost, coin toss game... :-<
Succeed: 14
2599, 2600, 2601, 2603, 2605, 2607, 2608, 2611, 2613, 2614, 2618, 2619, 2621
Failed: 13
2602, 2604, 2606, 2609, 2610, 2612, 2615, 2616, 2617, 2620, 2622, 2623
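The "coin toss" impression can be checked directly from the build numbers quoted in the description. The sketch below is only a tally of those numbers as listed; the helper names `summarize` and `missing_builds` are made up for illustration. Note that the lists as quoted contain 13 and 12 entries, one fewer each than the stated totals.

```python
# Build numbers copied from the issue description.
succeeded = [2599, 2600, 2601, 2603, 2605, 2607, 2608,
             2611, 2613, 2614, 2618, 2619, 2621]
failed = [2602, 2604, 2606, 2609, 2610, 2612,
          2615, 2616, 2617, 2620, 2622, 2623]


def summarize(succeeded, failed):
    """Return counts and the pass rate for two lists of build numbers."""
    total = len(succeeded) + len(failed)
    return {
        "succeeded": len(succeeded),
        "failed": len(failed),
        "total": total,
        "pass_rate": len(succeeded) / total,
    }


def missing_builds(succeeded, failed):
    """Return build numbers inside the covered range that appear in neither list."""
    seen = set(succeeded) | set(failed)
    return sorted(set(range(min(seen), max(seen) + 1)) - seen)


summary = summarize(succeeded, failed)
gaps = missing_builds(succeeded, failed)
print(summary, gaps)
```

With the numbers as listed, the pass rate comes out at 52% over 25 consecutive builds, which matches the description's characterization.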
,
Aug 21
I checked and this bot has been flaky since run 1.
,
Aug 21
After going through a bunch of the logs, I'm seeing three main failure states for tests on this bot, which lead me to believe that it currently cannot cleanly tear down the renderer process:

1) Can't run the renderer service:
[ RUN ] WebViewFocusInteractiveTest.Focus_FocusTakeFocus
[1812:4912:0821/040022.562:WARNING:discovery_network_list_win.cc(195)] Failed to open Wlan client handle: 1062
[1812:1068:0821/040022.703:WARNING:chrome_browser_main_win.cc(641)] Command line too long for RegisterApplicationRestart: --brave-new-test-launcher --cfi-diag=0 --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=WebViewFocusInteractiveTest.Focus_FocusTakeFocus --single_process --snapshot-output-dir="C:\b\s\w\iot0kopo" --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itafwyt3\scoped_dir4136_13365\results4136_31523\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iot0kopo\output.json" --user-data-dir="C:\b\s\w\itafwyt3\scoped_dir4136_13365\d4136_2408" --disable-offline-auto-reload --no-first-run --no-default-browser-check --enable-logging=stderr --disable-default-apps --wm-window-animations-disabled --disable-component-update --test-type=browser --force-color-profile=srgb --disable-zero-browsers-open-for-tests --ipc-connection-timeout=45 --allow-file-access-from-files --dom-automation --log-gpu-control-list-decisions --disable-backgrounding-occluded-windows --disable-gl-drawing-for-tests --override-use-software-gl-for-tests --force-color-profile=srgb --disable-compositor-ukm-for-tests --enable-features=NetworkService,TestFeatureForBrowserTest1 --disable-features=NetworkPrediction,SpeculativePreconnect,TestFeatureForBrowserTest2 --disable-gpu-process-for-dx12-vulkan-info-collection --flag-switches-begin --flag-switches-end --restore-last-session about:blank
[1812:4244:0821/040023.609:ERROR:service_manager_context.cc(270)] Attempting to run unsupported native service: C:\b\s\w\ir\out\Debug_x64\chrome_renderer.service.exe
[1812:4244:0821/040023.625:ERROR:service_manager_context.cc(270)] Attempting to run unsupported native service: C:\b\s\w\ir\out\Debug_x64\chrome_renderer.service.exe
[1812:1068:0821/040023.812:WARNING:gaia_auth_fetcher.cc(931)] Could not reach Google Accounts servers: errno -11
[1812:4244:0821/040024.086:ERROR:service_manager_context.cc(270)] Attempting to run unsupported native service: C:\b\s\w\ir\out\Debug_x64\chrome_renderer.service.exe
[1812:4244:0821/040024.105:ERROR:service_manager_context.cc(270)] Attempting to run unsupported native service: C:\b\s\w\ir\out\Debug_x64\chrome_renderer.service.exe
[1812:4244:0821/040024.307:ERROR:service_manager_context.cc(270)] Attempting to run unsupported native service: C:\b\s\w\ir\out\Debug_x64\chrome_renderer.service.exe
[1812:8872:0821/040024.322:ERROR:process_win.cc(155)] Unable to terminate process: Access is denied. (0x5)
[1812:1068:0821/040024.751:WARNING:gaia_auth_fetcher.cc(931)] Could not reach Google Accounts servers: errno -11
[1812:1068:0821/040027.484:WARNING:gaia_auth_fetcher.cc(931)] Could not reach Google Accounts servers: errno -11
[1812:1068:0821/040034.922:WARNING:gaia_auth_fetcher.cc(931)] Could not reach Google Accounts servers: errno -11
[1812:1068:0821/040100.359:WARNING:gaia_auth_fetcher.cc(931)] Could not reach Google Accounts servers: errno -11
[1/254] WebViewFocusInteractiveTest.Focus_FocusTakeFocus (TIMED OUT)

2) Can't shut down the previous instance:
[1/2914] MojoTest.Init (11048 ms)
Still waiting for the following processes to finish:
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_11341" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.FocusAction --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_28028\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_14950" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.IncrementDecrementActions --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_4098\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_10239" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.CanvasGetImage --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_10401\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_5168" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.CanvasGetImageScale --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_14902\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
Still waiting for the following processes to finish:
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_11341" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.FocusAction --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_28028\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_14950" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.IncrementDecrementActions --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_4098\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_10239" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.CanvasGetImage --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_10401\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
".\content_browsertests.exe" --brave-new-test-launcher --cfi-diag=0 --data-path="C:\b\s\w\itfq3qik\scoped_dir6036_5815\d6036_5168" --disable-features=WebRTC-H264WithOpenH264FFmpeg --disable-gpu-process-for-dx12-vulkan-info-collection --gtest_also_run_disabled_tests --gtest_filter=AccessibilityActionBrowserTest.CanvasGetImageScale --single_process --test-launcher-bot-mode --test-launcher-output="C:\b\s\w\itfq3qik\scoped_dir6036_5815\results6036_14902\test_results.xml" --test-launcher-summary-output="C:\b\s\w\iomdxxvw\output.json" --use-fake-device-for-media-stream
[ RUN ] AccessibilityActionBrowserTest.IncrementDecrementActions
DevTools listening on ws://127.0.0.1:59118/devtools/browser/ba78bcc7-3ce2-4f87-a8c6-69a9b5a4695b
[2/2914] AccessibilityActionBrowserTest.IncrementDecrementActions (TIMED OUT)

3) Abnormal Renderer Teardown
[ RUN ] HeadlessProtocolBrowserTest.VirtualTimePendingScript
[1864:3932:0821/040135.314:25128218:ERROR:registration_protocol_win.cc(84)] TransactNamedPipe: The pipe has been ended. (0x6D)
[1864:3932:0821/040135.317:25128234:WARNING:resource_bundle.cc(358)] locale_file_path.empty() for locale
[1864:3932:0821/040135.355:25128265:ERROR:gpu_process_transport_factory.cc(642)] Switching to software compositing.
[1864:3932:0821/040135.355:25128265:ERROR:gpu_process_transport_factory.cc(984)] Lost UI shared context.
[0821/040135.533:ERROR:registration_protocol_win.cc(56)] CreateFile: The system cannot find the file specified. (0x2)
../../headless/test/headless_browser_test.cc(251): error: Failed
Abnormal renderer termination
Stack trace:
Backtrace:
StackTraceGetter::CurrentStackTrace [0x00007FF7735E5820+80]
testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop [0x00007FF7735FF05A+90]
testing::internal::AssertHelper::operator= [0x00007FF7735FEB5A+90]
headless::HeadlessAsyncDevTooledBrowserTest::RenderProcessExited [0x00007FF7731654B0+192]
headless::HeadlessWebContentsImpl::RenderProcessExited [0x00007FF86B6FCF25+421]
content::RenderProcessHostImpl::ProcessDied [0x00007FF863CD63D6+1222]
content::RenderProcessHostImpl::OnChannelError [0x00007FF863CD8243+179]
IPC::ChannelProxy::Context::OnDispatchError [0x00007FF8779614ED+45]
??$Invoke@P8Context@ChannelProxy@IPC@@EAAXXZAEBV?$scoped_refptr@VContext@ChannelProxy@IPC@@@@$$V@?$FunctorTraits@P8Context@ChannelProxy@IPC@@EAAXXZX@internal@base@@SAXP8Context@ChannelProxy@IPC@@EAAXXZAEBV?$scoped_refptr@VContext@ChannelProxy@IPC@@@@@Z [0x00007FF8779684A3+67]
base::internal::InvokeHelper<0,void>::MakeItSo<void (__cdecl IPC::ChannelProxy::Context::*const & __ptr64)(void) __ptr64,scoped_refptr<IPC::ChannelProxy::Context> const & __ptr64> [0x00007FF877968416+86]
base::internal::Invoker<base::internal::BindState<void (__cdecl IPC::ChannelProxy::Context::*)(void) __ptr64,scoped_refptr<IPC::ChannelProxy::Context> >,void __cdecl(void)>::RunImpl<void (__cdecl IPC::ChannelProxy::Context::*const & __ptr64)(void) __ptr64 [0x00007FF8779683B9+73]
base::internal::Invoker<base::internal::BindState<void (__cdecl IPC::ChannelProxy::Context::*)(void) __ptr64,scoped_refptr<IPC::ChannelProxy::Context> >,void __cdecl(void)>::Run [0x00007FF8779682DC+60]
base::OnceCallback<void __cdecl(void)>::Run [0x00007FF87215D851+97]
base::debug::TaskAnnotator::RunTask [0x00007FF8721C5EC3+915]
base::MessageLoop::RunTask [0x00007FF872262ED7+951]
base::MessageLoop::DeferOrRunPendingTask [0x00007FF872263723+83]
base::MessageLoop::DoWork [0x00007FF872263C14+484]
base::MessagePumpForUI::DoRunLoop [0x00007FF87227045D+77]
base::MessagePumpWin::Run [0x00007FF87226F3CC+140]
base::MessageLoop::Run [0x00007FF87226276C+524]
base::RunLoop::Run [0x00007FF87233F3FA+506]
headless::HeadlessBrowserTest::RunAsynchronousTest [0x00007FF773164DA0+432]
headless::HeadlessAsyncDevTooledBrowserTest::RunTest [0x00007FF773165767+583]
headless::HeadlessProtocolBrowserTest_VirtualTimePendingScript_Test::RunTestOnMainThread [0x00007FF77316F097+87]
content::BrowserTestBase::ProxyRunTestOnMainThreadLoop [0x00007FF7746EDADB+811]
??$Invoke@P8BrowserTestBase@content@@EAAXXZPEAV12@$$V@?$FunctorTraits@P8BrowserTestBase@content@@EAAXXZX@internal@base@@SAXP8BrowserTestBase@content@@EAAXXZ$$QEAPEAV34@@Z [0x00007FF7746F05FA+26]
base::internal::InvokeHelper<0,void>::MakeItSo<void (__cdecl content::BrowserTestBase::*const & __ptr64)(void) __ptr64,content::BrowserTestBase * __ptr64> [0x00007FF7746F0574+52]
base::internal::Invoker<base::internal::BindState<void (__cdecl content::BrowserTestBase::*)(void) __ptr64,base::internal::UnretainedWrapper<content::BrowserTestBase> >,void __cdecl(void)>::RunImpl<void (__cdecl content::BrowserTestBase::*const & __ptr64) [0x00007FF7746F0528+88]
base::internal::Invoker<base::internal::BindState<void (__cdecl content::BrowserTestBase::*)(void) __ptr64,base::internal::UnretainedWrapper<content::BrowserTestBase> >,void __cdecl(void)>::Run [0x00007FF7746F043C+60]
base::OnceCallback<void __cdecl(void)>::Run [0x00007FF87215D851+97]
base::debug::TaskAnnotator::RunTask [0x00007FF8721C5EC3+915]
base::MessageLoop::RunTask [0x00007FF872262ED7+951]
base::MessageLoop::DeferOrRunPendingTask [0x00007FF872263723+83]
base::MessageLoop::DoWork [0x00007FF872263C14+484]
base::MessagePumpForUI::DoRunLoop [0x00007FF87227045D+77]
base::MessagePumpWin::Run [0x00007FF87226F3CC+140]
base::MessageLoop::Run [0x00007FF87226276C+524]
base::RunLoop::Run [0x00007FF87233F3FA+506]
content::BrowserMainLoop::MainMessageLoopRun [0x00007FF862D7C9C0+512]
content::BrowserMainLoop::RunMainMessageLoopParts [0x00007FF862D7C655+517]
content::BrowserMainRunnerImpl::Run [0x00007FF862D8D74F+335]
headless::HeadlessContentMainDelegate::RunProcess [0x00007FF86B709AF7+503]
[1864:6236:0821/040135.725:25128640:WARNING:discardable_shared_memory_manager.cc(431)] Some MojoDiscardableSharedMemoryManagerImpls are still alive. They will be leaked.
[ FAILED ] HeadlessProtocolBrowserTest.VirtualTimePendingScript, where TypeParam = and GetParam() = (499 ms)
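The three failure states above each leave a distinctive message in the test log, so a log can be bucketed with simple substring checks. The following is a minimal sketch, assuming the log is available as plain text; the signature strings are taken from the excerpts in this comment, while the bucket names and the `classify` helper are hypothetical.

```python
# Signature strings copied from the log excerpts above; the keys are
# illustrative labels, not names used by any Chromium tool.
FAILURE_SIGNATURES = {
    "cannot run renderer service": "Attempting to run unsupported native service",
    "cannot shut down previous instance": "Still waiting for the following processes to finish",
    "abnormal renderer teardown": "Abnormal renderer termination",
}


def classify(log_text):
    """Return the labels of every failure state whose signature appears in log_text."""
    return [label for label, needle in FAILURE_SIGNATURES.items()
            if needle in log_text]


sample = (
    "[ RUN ] HeadlessProtocolBrowserTest.VirtualTimePendingScript\n"
    "../../headless/test/headless_browser_test.cc(251): error: Failed\n"
    "Abnormal renderer termination\n"
)
print(classify(sample))
```

A log that matches none of the signatures returns an empty list, which would be a hint that a fourth failure mode exists.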
,
Aug 21
I'm not sure if a bot restart would clear this out.
,
Aug 21
Win10 seems too flaky. Do we have a meta bug? martiniss@: assigning this to you since, IIRC, you work on deflaking Win10.
,
Aug 21
I haven't worked on deflaking Win10 bots in a long time.
,
Aug 22
Please take some action on the bot, e.g.
- add a new machine
- replace it with a new machine
- change the configuration
- upgrade to the newest version of Windows 10
etc.
The "Win10 Tests x64 (dbg)" bot has been flaky since July, and no action has been taken on the bot side. Build sheriffs have been spending a lot of time on this bot's flakiness.
,
Aug 22
I'll replace it with a new machine.
,
Aug 22
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/config/+/5abbd2c02d77dd5ecaac06e6e87a8ea54fbb8de4

commit 5abbd2c02d77dd5ecaac06e6e87a8ea54fbb8de4
Author: Stephen Martinis <martiniss@google.com>
Date: Wed Aug 22 21:25:20 2018
,
Aug 22
The bot has been replaced with a GCE machine. It should be on the latest version of Windows 10 that we support. Let me know if things are still flaky and I can look a bit more.
,
Aug 28
This is linked from a "components_browsertests failing on chromium.win/Win10 Tests x64 (dbg)" failure in sheriff-o-matic. Things are still quite flaky, judging by the bunch of "Distiller" errors in https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29/2754. There's not much to go on in errors like:
"""
[ RUN ] DomDistillerJsTest.RunJsTests
DevTools listening on ws://127.0.0.1:64163/devtools/browser/8f59a081-390a-4523-a774-6959f003abf7
"""
components_browsertests was green the run before, but red in a run a few cycles earlier (https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29/2750), with similar errors.
,
Aug 28
Looking
,
Aug 28
Actually, I think this is the same thing as bug 840369. There's a webui test suite that's flaky, but that's for a different reason. I have a CL out to fix that, which should land. Most of the work is probably going to be done in the other bug. I'm going to mark a bunch of test suites on this bot as experimental. That should help sheriffs know what is actually a new failure, and what is a known bug that's being worked on.
,
Aug 28
How many tests in those suites are flaky? Could we disable tests instead of dumping the suite to experimental?
,
Aug 28
And are we seeing similar failures elsewhere, or just on Win10 x64 dbg?
,
Aug 28
I'm planning on only marking sync_integration_tests as experimental. That's being worked on in bug 840369. There are roughly 50 tests that fail, so I don't think we can just mark the individually failed tests as flaky.
,
Aug 28
Actually, https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=sync_integration_tests&builder=chromium.win%3AWin10%20Tests%20x64%20(dbg) seems to show a fairly consistent set of tests which all flake together....
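One way to make "a consistent set of tests which all flake together" concrete is to group tests by the exact set of runs in which they failed: tests that share a failure pattern probably share a cause. This is a minimal sketch with made-up run numbers and test names; it does not read the flakiness dashboard, and `group_by_failure_pattern` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_failure_pattern(failures_by_run):
    """Group tests by the exact set of runs in which they failed.

    failures_by_run maps a run identifier to the collection of tests that
    failed in that run. The result maps each distinct frozenset of runs to
    the sorted list of tests that failed in exactly those runs.
    """
    runs_by_test = defaultdict(set)
    for run, tests in failures_by_run.items():
        for test in tests:
            runs_by_test[test].add(run)

    groups = defaultdict(list)
    for test, runs in runs_by_test.items():
        groups[frozenset(runs)].append(test)
    return {pattern: sorted(tests) for pattern, tests in groups.items()}


# Hypothetical data: run numbers and test names are invented for illustration.
example = {
    1: ["SuiteA.Test1", "SuiteA.Test2"],
    2: ["SuiteB.Test9"],
    3: ["SuiteA.Test1", "SuiteA.Test2"],
}
groups = group_by_failure_pattern(example)
print(groups)
```

A large group that fails in the same runs points toward a shared setup or environment problem rather than dozens of independently flaky tests.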
,
Aug 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/75ab55d5d1d4b2c436121bc06a914f4ec0aaf320

commit 75ab55d5d1d4b2c436121bc06a914f4ec0aaf320
Author: Stephen Martinis <martiniss@chromium.org>
Date: Thu Aug 30 02:27:12 2018

sync_integration_tests: Set experimental on "Win10 Tests x64 (dbg)"

This test suite has been flaky for a while, and is being worked on slowly. Mark it as experimental so sheriffs aren't alerted to it.

Bug: 840369, 876224
Change-Id: I20e049e3176e8432da51736a84a9a4697b98c214
Reviewed-on: https://chromium-review.googlesource.com/1195044
Commit-Queue: Stephen Martinis <martiniss@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Ben Pastene <bpastene@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587422}

[modify] https://crrev.com/75ab55d5d1d4b2c436121bc06a914f4ec0aaf320/testing/buildbot/chromium.win.json
[modify] https://crrev.com/75ab55d5d1d4b2c436121bc06a914f4ec0aaf320/testing/buildbot/test_suite_exceptions.pyl
,
Aug 30
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0309a55ce9cddfc7fbea4a5e61d9c04b6b43a992

commit 0309a55ce9cddfc7fbea4a5e61d9c04b6b43a992
Author: rbpotter <rbpotter@chromium.org>
Date: Thu Aug 30 18:41:50 2018

Remove webui_polymer2_browser_tests from Win 10 Tests x64 (dbg) bot

These tests are timing out periodically. Match the removal of browser_tests, since the behavior is expected to be the same.

Bug: 876224
Change-Id: I76183e78fde8f10a1feae67db51ca4336a4c652e
Reviewed-on: https://chromium-review.googlesource.com/1197322
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: Rebekah Potter <rbpotter@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587675}

[modify] https://crrev.com/0309a55ce9cddfc7fbea4a5e61d9c04b6b43a992/testing/buildbot/chromium.win.json
[modify] https://crrev.com/0309a55ce9cddfc7fbea4a5e61d9c04b6b43a992/testing/buildbot/test_suite_exceptions.pyl
,
Aug 31
Sheriffs, can you disable the failing tests in the 'extensions_browsertests' and 'headless_browsertests' test suites?
,
Aug 31
headless_browsertests has 102 failing tests, and extensions_browsertests has 20. Should I disable whole suites (similarly to https://crrev.com/c/1197322)?
,
Sep 10
Opened issue 882512 for headless_browsertests; would prefer not to disable entire test suites if we can help it...
,
Sep 10
Filed issue 882545 for extension_browsertests/content_browsertests failures (seems related to devtools "listening" timing out -- disabling ad-hoc tests will only cover up the problem and not get rid of all flakes IMO while risking leaving disabled tests behind once root issue is fixed)
,
Sep 10
Sorry s/content_browsertests/components_browsertests/
,
Sep 11
Looking into issue 882512: I'm starting to wonder if the system is so overloaded that random processes are killed... It looks like the GPU and other pipes log errors before we receive the abnormal renderer termination:
[ RUN ] DevToolsAttachAndDetachNotifications.RunAsyncTest
[7224:7392:0911/064643.004:16996921:ERROR:registration_protocol_win.cc(84)] TransactNamedPipe: The pipe has been ended. (0x6D)
[7224:7392:0911/064643.004:16996921:WARNING:resource_bundle.cc(358)] locale_file_path.empty() for locale
[7224:7392:0911/064643.035:16996953:ERROR:gpu_process_transport_factory.cc(638)] Switching to software compositing.
[7224:7392:0911/064643.035:16996953:ERROR:gpu_process_transport_factory.cc(980)] Lost UI shared context.
[0911/064643.226:ERROR:registration_protocol_win.cc(56)] CreateFile: The system cannot find the file specified. (0x2)
../../headless/test/headless_browser_test.cc(249): error: Failed
Abnormal renderer termination
,
Oct 8
Back to the sheriff queue FYI.
,
Oct 9
I need help with this, so I'm escalating to Troopers, hoping to pick up some expert Windows help. The current symptoms:
• every test in the chrome_cleaner_unittests module flakily fails with sandbox connection errors (Sandboxed process exited before signaling it was initialized, exit code: 3221225477): See b/871924
• components_browsertests fails a set of tests consistently (all flake together or don't flake together) - fdoray reports in https://bugs.chromium.org/p/chromium/issues/detail?id=882545&desc=2#c18 that this could be due to the incorrect use of TestNavigationObserver.
• interactive_ui_tests, content_shell_crash_test, network_service_interactive_ui_tests, webui_polymer2_interactive_ui_tests and telemetry_unittests have all failed enough that they've been marked as experimental on this bot.
This seems like a lot of wrong - and while the components_browsertests failures might have a local cause, I want to get a more knowledgeable opinion about whether there's a common underlying issue on the bot before I disable 20+ tests, including all chrome_cleaner unittests.
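The sandbox exit code quoted above is easier to recognize in hexadecimal: 3221225477 is 0xC0000005, the NTSTATUS value STATUS_ACCESS_VIOLATION, so the sandboxed process is crashing rather than exiting on its own. A small sketch of the conversion follows; the helper name `to_ntstatus_hex` and the one-entry lookup table are illustrative only.

```python
# Only the code relevant to this bug is listed; this is not a complete table.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
}


def to_ntstatus_hex(exit_code):
    """Format a Windows process exit code as an 8-digit hex NTSTATUS string.

    Masking to 32 bits also handles tools that report the same value as a
    negative signed integer.
    """
    return f"0x{exit_code & 0xFFFFFFFF:08X}"


def describe(exit_code):
    """Return the hex form plus a symbolic name when one is known."""
    value = exit_code & 0xFFFFFFFF
    return f"{to_ntstatus_hex(exit_code)} ({KNOWN_NTSTATUS.get(value, 'unknown')})"


print(describe(3221225477))
```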
,
Oct 9
RVG for b/ link. Also I think that link is wrong, it looks like something about Cluster Lifecycle, which doesn't seem relevant to this. Did you mean crbug.com/871924?
,
Oct 9
Yeah, looks like that's what you meant. Don't need to RVG this then.
,
Oct 9
Yep, that's what I meant. >_<
,
Oct 9
To be clear, this build configuration (Windows 10, 64 bit, debug) was spun up this year, and we hadn't really run these test suites on it before. So, it's semi expected (maybe?) to have several test suites failing on this. It's not a case of a bunch of test suites that used to pass all of a sudden started failing. I'm trooper today; I don't have much Windows experience in general, and that would be useful here... I can think about it a bit more though.
,
Oct 9
I fixed the dependent bugs list; the entries listed under "Blocked on" are the individual bugs filed about test suites.
,
Oct 9
I don't think troopers can do anything here to fix anything, really. In particular, the hardware these tests are running on is the same hardware that runs our regular Windows 10 tests. So I don't think there's anything really wrong with the actual hardware, since it isn't failing on other tests nearly as often.

I think the path forward with this bug (and the bugs this bug is blocked on) is to diagnose why these suites are failing. This will probably involve a Chrome dev who has experience with the Chromium codebase debugging the test suites. I would guess that there are a few root issues causing all of the failures we have.

I'm not sure who the best owner for this is. For now I'll leave this available with the Sheriff-Chromium label, which I think will mean that sheriffs will eventually look at this during their shift.

Troopers (me included) are happy to help with infrastructure issues and troubleshooting. This includes tasks like figuring out how exactly the binaries being run are compiled, getting devs access to test machines, and answering questions about what exactly is happening with this particular builder.
,
Oct 22
I have a Windows machine and some experience, so I'll look at this for the duration of my shift.
,
Oct 23
End of shift update:
I started looking at headless_browsertests because it seemed the flakiest. I wasn't able to reproduce the failure locally. However, I noticed the failure always starts with HeadlessBrowserTest.TraceUsingBrowserDevToolsTarget. I might speculatively disable that test and see what happens. I tried forcing it to fail locally to see the error output and these two lines were unique to the real failure:
headless_devtools_client_impl.cc(195) Unhandled protocol message: {"method":"Inspector.targetCrashed","params":{}}
registration_protocol_win.cc(56) CreateFile: The system cannot find the file specified. (0x2)
I'm guessing the latter is related to the former, since it's inside SendToCrashHandlerServer, so maybe the renderer crashed and the browser is trying to report it?
Anyway, the first line looks pretty specific to the test, so I took a look through the other suites blocked on this bug and I'm not convinced they share root causes, but I need to dig into them more.
Overnight sheriffs, feel free to take this. I'll look at it more tomorrow.
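Finding the lines that are "unique to the real failure" amounts to stripping the process/thread/timestamp prefix from each log line and taking a set difference between the bot log and the local log. The sketch below assumes the bracketed prefix format seen in the excerpts on this bug; it also ignores source line numbers, since the excerpts show them shifting between revisions (for example headless_browser_test.cc(251) versus (249)). The `normalize` and `unique_to` helpers are hypothetical, and the local log line is invented.

```python
import re

# Matches "[...:file.cc(123)] message" and captures the file name and message.
LOG_LINE = re.compile(r"^\[[^\]]*:([A-Za-z0-9_.]+)\((\d+)\)\]\s*(.*)$")


def normalize(line):
    """Return (source_file, message) for a prefixed log line, else None."""
    match = LOG_LINE.match(line.strip())
    if not match:
        return None
    return (match.group(1), match.group(3))


def unique_to(real_lines, local_lines):
    """Return normalized entries present in real_lines but not in local_lines."""
    real = {normalize(line) for line in real_lines} - {None}
    local = {normalize(line) for line in local_lines} - {None}
    return sorted(real - local)


real_log = [
    "[7224:7392:0911/064643.004:16996921:ERROR:registration_protocol_win.cc(84)] TransactNamedPipe: The pipe has been ended. (0x6D)",
    "[0911/064643.226:ERROR:registration_protocol_win.cc(56)] CreateFile: The system cannot find the file specified. (0x2)",
]
# Hypothetical local log: same first message, different process and timestamp.
local_log = [
    "[100:200:1023/101500.000:1:ERROR:registration_protocol_win.cc(84)] TransactNamedPipe: The pipe has been ended. (0x6D)",
]
print(unique_to(real_log, local_log))
```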
,
Oct 23
I mean HeadlessBrowserTest.WebGLSupported, the next test. TraceUsingBrowserDevToolsTarget is passing.
,
Oct 24
Don't we run our release bots with DCHECKs enabled? I'm not sure there's much to be gained from fixing debug test timeouts. We're already getting the correctness coverage from release. I'm inclined to make a blanket policy of "if it looks like it's timing out because debug is slower than release then just disable it in debug".
,
Oct 24
Shift update: interactive_ui_tests is now consistently failing due to timeout for the last 18 runs in this configuration. Everything has been on fire all day so I haven't had a chance to dive into this more deeply. I also second the call for a windows / bot ninja to take a look.
,
Oct 26
Sheriff update: Agree that someone with greater bot and Windows skills should look at this. There are various theories here, including that the bot is weighed down. I don't know where to dig, and consequently haven't dug into this today.
,
Oct 26
fwict, https://bugs.chromium.org/p/chromium/issues/detail?id=651906 is probably the root cause of most if not all these flakes.
,
Oct 26
Based on that last comment, assigning to wfh@ to dig into the InterceptionAgent::PatchDll issue
,
Nov 9
No reason to keep this in the sheriff bug queue, no?
,
Nov 26
+dpranke@ Can we take this bot out of sheriff rotation while we wait for a fix? A few suites have been marked experimental and this bug itself is no longer in sheriff queue, but the test failures continue to pop up in the sheriff-o-matic and folks will sink an afternoon into it before finding this bug and realizing they're following a well worn path.
,
Nov 26
switching to @chromium address
,
Dec 14
Still annoying the sheriffs.
,
Dec 14
I can land an experimental fix based on the comments in issue 651906, but without a local repro I won't be sure it's fixed.
,
Dec 19
Under the other issue I landed a speculative fix. Is it possible to check whether it has improved things or not? I opened https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29?limit=200 and there is still some red, but I don't know whether that is normal or not.
,
Dec 31
Sheriff still seeing these flakes today, example of failing build: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20Tests%20x64%20%28dbg%29/5200 https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8925631502827771664/+/steps/headless_browsertests_on_Windows-10-15063/0/logs/HeadlessProtocolCompositorBrowserTest.RendererRedirectKeepsFragment/0
,
Jan 8
Ditto of #71: flakes still occurring. For an example of how badly/frequently these fail on this bot, see https://test-results.appspot.com/dashboards/flakiness_dashboard.html#testType=headless_browsertests&builder=chromium.win%3AWin10%20Tests%20x64%20(dbg)
,
Jan 14
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ec52747a5abc25a862843edca761104b3c319764

commit ec52747a5abc25a862843edca761104b3c319764
Author: Gabriel Charette <gab@chromium.org>
Date: Mon Jan 14 16:16:49 2019

[ui_controls] Unflake Send*NotifyWhenDone() on Windows

ui_controls::Send*NotifyWhenDone() can be flaky when invoked after ui_controls::Send*() as the former can decide to notify based on observing a yet-to-be-processed event from the latter (or even a yet-to-be-processed event emitted by unrelated code) and thus notify too early, resuming and testing conditions that have yet to be met.

Solution: defer the notification if the system queue has pending events of the same type awaiting dispatch.

Note: mouse move can be repeated indefinitely during a drag, as such we consider a mouse move complete when it hits the target regardless of remaining mouse move messages in the queue.

@ BUG OWNERS : This might unflake many currently disabled tests. I've CC'ed interactive_ui_tests + Windows bugs, please try to re-enable your test after this CL if you think it might be related.

Bug: 892228 , 640996, 897801,893078,876224,875443,873110,852786,850343,848049,846695,840369,798492,756338,751031,665296,651906,499858,468660,419468,238347,131612,106489,97777,92467
Change-Id: I548856a3948ff71a145435799b4ba3e689561f14
Reviewed-on: https://chromium-review.googlesource.com/c/1392178
Reviewed-by: Sadrul Chowdhury <sadrul@chromium.org>
Reviewed-by: Greg Thompson <grt@chromium.org>
Reviewed-by: Peter Kasting <pkasting@chromium.org>
Commit-Queue: Gabriel Charette <gab@chromium.org>
Cr-Commit-Position: refs/heads/master@{#622470}

[modify] https://crrev.com/ec52747a5abc25a862843edca761104b3c319764/chrome/browser/ui/views/bookmarks/bookmark_bar_view_test.cc
[modify] https://crrev.com/ec52747a5abc25a862843edca761104b3c319764/ui/base/test/ui_controls_internal_win.cc
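The idea in the commit message, deferring the "done" notification while events of the same type are still queued, can be illustrated with a toy model. This is not the code from ui_controls_internal_win.cc and does not touch the Windows message queue; the class `ToyEventQueue` and its methods are invented for illustration, and the mouse-move exception described in the commit is not modelled.

```python
from collections import deque


class ToyEventQueue:
    """Toy stand-in for a system event queue with deferred notifications."""

    def __init__(self):
        self._pending = deque()  # event types awaiting dispatch, in order
        self._waiters = []       # (event_type, callback) pairs

    def send(self, event_type):
        """Queue an event without asking for a notification."""
        self._pending.append(event_type)

    def send_notify_when_done(self, event_type, callback):
        """Queue an event and register a callback for when it has been handled."""
        self._pending.append(event_type)
        self._waiters.append((event_type, callback))

    def dispatch_one(self):
        """Dispatch the oldest event and notify waiters only when it is safe.

        A waiter is notified only if no event of the same type is still
        pending; otherwise the observed event may be an earlier one.
        """
        if not self._pending:
            return
        dispatched = self._pending.popleft()
        still_waiting = []
        for event_type, callback in self._waiters:
            if event_type == dispatched and dispatched not in self._pending:
                callback()
            else:
                still_waiting.append((event_type, callback))
        self._waiters = still_waiting


fired = []
queue = ToyEventQueue()
queue.send("key_press")                                   # earlier, unrelated send
queue.send_notify_when_done("key_press", lambda: fired.append("done"))
queue.dispatch_one()   # handles the earlier event; notification is deferred
after_first = list(fired)
queue.dispatch_one()   # handles the awaited event; notification fires
after_second = list(fired)
```

Without the "same type still pending" check, the callback would fire after the first dispatch, which is the too-early notification the commit describes.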
Comment 1 by yosin@chromium.org
, Aug 21