testDumpMemorySuccess flaky. GPU issue?
Issue description
From: https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/49551/steps/telemetry_unittests%20on%20Windows-7-SP1/logs/stdio

[263/1023] telemetry.internal.backends.chrome_inspector.tracing_backend_unittest.TracingBackendTest.testDumpMemorySuccess failed unexpectedly 5.4570s:
Successfully shut down browser cooperatively
Chrome build location for win_AMD64 not found. Browser will be run without Flash.
Requested remote debugging port: 0
Chrome log file will be saved in e:\b\swarm_slave\work\isolated\isolated_tmpfk5aoy\tmpxtgwen\chrome.log
Starting Chrome ['../../out\\Debug\\chrome.exe', '--no-sandbox', '--enable-memory-benchmarking', '--enable-net-benchmarking', '--metrics-recording-only', '--no-default-browser-check', '--no-first-run', '--enable-gpu-benchmarking', '--disable-background-networking', '--no-proxy-server', '--disable-component-extensions-with-background-pages', '--disable-default-apps', '--enable-logging', '--v=1', '--remote-debugging-port=0', '--enable-crash-reporter-for-testing', '--window-size=1280,1024', '--user-data-dir=e:\\b\\swarm_slave\\work\\isolated\\isolated_tmpfk5aoy\\tmpfjbqj9', 'about:blank']
[snip]
[5576:4916:0610/103620:FATAL:gpu_info_collector.cc(104)] Check failed: gl::GetGLImplementation() != gl::kGLImplementationNone (0 vs. 0)
Backtrace:
base::debug::StackTrace::StackTrace [0x10064957+23]
logging::LogMessage::~LogMessage [0x100B36BB+59]
gpu::gles2::BufferManager::MarkContextLost [0x0B3CFF68+2890514]
gpu::gles2::BufferManager::MarkContextLost [0x0B3D3458+2904066]
content::GpuChildThread::OnCollectGraphicsInfo [0x1078F178+216]
??$DispatchToMethodImpl@PAVGpuChildThread@content@@P812@AEXXZ$$V$$Z$S@base@@YAXABQAVGpuChildThread@content@@P812@AEXXZABV?$tuple@$$V@std@@U?$IndexSequence@$S@0@@Z [0x10784C20+32]
??$DispatchToMethod@PAVGpuChildThread@content@@P812@AEXXZ$$V@base@@YAXABQAVGpuChildThread@content@@P812@AEXXZABV?$tuple@$$V@std@@@Z [0x107847DC+44]
??$DispatchToMethod@VGpuChildThread@content@@P812@AEXXZXV?$tuple@$$V@std@@@IPC@@YAXPAVGpuChildThread@content@@P812@AEXXZPAXABV?$tuple@$$V@std@@@Z [0x107849B6+38]
??$Dispatch@VGpuChildThread@content@@V12@XP812@AEXXZ@?$MessageT@UGpuMsg_CollectGraphicsInfo_Meta@@V?$tuple@$$V@std@@X@IPC@@SA_NPBVMessage@1@PAVGpuChildThread@content@@1PAXP834@AEXXZ@Z [0x10783B63+227]
content::GpuChildThread::OnControlMessageReceived [0x1078F78B+491]
content::ChildThreadImpl::OnMessageReceived [0x108A336B+1259]
content::GpuChildThread::OnMessageReceived [0x10791088+24]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B900DD3+168836]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B8FB7C3+146804]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B8FB530+146145]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B901688+171065]
base::Callback<void __cdecl(void),1>::Run [0x1003C32E+30]
base::debug::TaskAnnotator::RunTask [0x1006DF34+324]
base::MessageLoop::RunTask [0x100DA9A0+640]
base::MessageLoop::DeferOrRunPendingTask [0x100D887D+45]
base::MessageLoop::DoWork [0x100D8E64+196]
base::MessagePumpForGpu::DoRunLoop [0x100E1F42+98]
base::MessagePumpWin::Run [0x100E34DB+123]
base::MessageLoop::RunHandler [0x100DA6E1+193]
base::RunLoop::Run [0x10180834+52]
base::MessageLoop::Run [0x100DA5DC+188]
content::GpuMain [0x1079C3C3+2691]
content::RunNamedProcessTypeMain [0x131B4B67+135]
content::ContentMainRunnerImpl::Run [0x131B4A28+488]
content::ContentMain [0x131B2A14+100]
ChromeMain [0x04EF5622+114]
MainDllLoader::Launch [0x0043F5C4+916]
wWinMain [0x0043B42D+653]
invoke_main [0x006D3DDE+30] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:118)
__scrt_common_main_seh [0x006D3C2A+346] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:255)
__scrt_common_main [0x006D3ABD+13] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:300)
wWinMainCRTStartup [0x006D3DF8+8] (f:\dd\vctools\crt\vcstartup\src\startup\exe_wwinmain.cpp:17)
BaseThreadInitThunk [0x7701337A+18]
RtlInitializeExceptionChain [0x777D9882+99]
RtlInitializeExceptionChain [0x777D9855+54]
[snip]
[1372:2900:0610/103621:VERBOSE1:tracing_controller_impl.cc(1005)] Memory-infra dump failed because of NACK from child 5576
[1372:1908:0610/103621:VERBOSE1:node_controller.cc(445)] Dropped peer E106F77245BBC37.9191E085EF5E21B6
[1372:1908:0610/103621:VERBOSE1:node.cc(410)] Observing lost connection from node 13C75C8D216A6809.B0EF3EC4F1FC7A8B to node E106F77245BBC37.9191E085EF5E21B6
========== END BROWSER LOG ==========
Traceback (most recent call last):
  File "e:\b\swarm_slave\work\isolated\isolated_runekl8te\third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\tracing_backend_unittest.py", line 25, in WrappedTest
    test(self)
  File "e:\b\swarm_slave\work\isolated\isolated_runekl8te\third_party\catapult\telemetry\telemetry\internal\backends\chrome_inspector\tracing_backend_unittest.py", line 92, in testDumpMemorySuccess
    self.assertIsNotNone(dump_id)
AssertionError: unexpectedly None

It looks like maybe a race or something makes the OnCollectGraphicsInfo fail / lose context, and it looks like we're getting into a bad situation (kGLImplementationNone?) which raises asserts.
,
Jun 10 2016
j.isorce@ has been working on the GPU info collection code recently and may be able to postulate a cause of the crash.
,
Jun 10 2016
GPU device 0: VENDOR = 0x15ad, DEVICE = 0x405
It is using the VMware software renderer. The assertion is from CollectGraphicsInfoGL, which should not be reached on Windows at all. This is weird.
,
Jun 10 2016
Issue 616483 has been merged into this issue.
,
Jun 10 2016
Looking higher in the log from the last failure reported in Issue 616483: https://chromium-swarm.appspot.com/user/task/2f5431fbd1518e10

[4908:4748:0610/104011:ERROR:angle_platform_impl.cc(33)] ANGLE Display::initialize error 4: Renderer does not support PS 3.0.aborting!
[4908:4748:0610/104011:ERROR:gl_surface_egl.cc(598)] eglInitialize D3D9 failed with error EGL_NOT_INITIALIZED
[4908:4748:0610/104011:ERROR:gl_initializer_win.cc(28)] GLSurfaceEGL::InitializeOneOff failed.
[4908:4748:0610/104011:VERBOSE1:gpu_main.cc(345)] gl::init::InitializeGLOneOff failed
...
[4908:4748:0610/104011:ERROR:gpu_child_thread.cc(376)] Exiting GPU process due to errors during initialization
[4908:4748:0610/104011:FATAL:gpu_info_collector.cc(104)] Check failed: gl::GetGLImplementation() != gl::kGLImplementationNone (0 vs. 0)
Backtrace:
base::debug::StackTrace::StackTrace [0x10064957+23]
logging::LogMessage::~LogMessage [0x100B36BB+59]
gpu::gles2::BufferManager::MarkContextLost [0x0B19FF68+2890514]
gpu::gles2::BufferManager::MarkContextLost [0x0B1A3458+2904066]
content::GpuChildThread::OnCollectGraphicsInfo [0x1078F178+216]
??$DispatchToMethodImpl@PAVGpuChildThread@content@@P812@AEXXZ$$V$$Z$S@base@@YAXABQAVGpuChildThread@content@@P812@AEXXZABV?$tuple@$$V@std@@U?$IndexSequence@$S@0@@Z [0x10784C20+32]
??$DispatchToMethod@PAVGpuChildThread@content@@P812@AEXXZ$$V@base@@YAXABQAVGpuChildThread@content@@P812@AEXXZABV?$tuple@$$V@std@@@Z [0x107847DC+44]
??$DispatchToMethod@VGpuChildThread@content@@P812@AEXXZXV?$tuple@$$V@std@@@IPC@@YAXPAVGpuChildThread@content@@P812@AEXXZPAXABV?$tuple@$$V@std@@@Z [0x107849B6+38]
??$Dispatch@VGpuChildThread@content@@V12@XP812@AEXXZ@?$MessageT@UGpuMsg_CollectGraphicsInfo_Meta@@V?$tuple@$$V@std@@X@IPC@@SA_NPBVMessage@1@PAVGpuChildThread@content@@1PAXP834@AEXXZ@Z [0x10783B63+227]
content::GpuChildThread::OnControlMessageReceived [0x1078F78B+491]
content::ChildThreadImpl::OnMessageReceived [0x108A336B+1259]
content::GpuChildThread::OnMessageReceived [0x10791088+24]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B840DD3+168836]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B83B7C3+146804]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B83B530+146145]
IPC::MessageAttachmentSet::ReplacePlaceholderWithAttachment [0x0B841688+171065]
base::Callback<void __cdecl(void),1>::Run [0x1003C32E+30]
base::debug::TaskAnnotator::RunTask [0x1006DF34+324]
base::MessageLoop::RunTask [0x100DA9A0+640]
base::MessageLoop::DeferOrRunPendingTask [0x100D887D+45]
base::MessageLoop::DoWork [0x100D8E64+196]
base::MessagePumpForGpu::DoRunLoop [0x100E1F42+98]
base::MessagePumpWin::Run [0x100E34DB+123]
base::MessageLoop::RunHandler [0x100DA6E1+193]
base::RunLoop::Run [0x10180834+52]
base::MessageLoop::Run [0x100DA5DC+188]
content::GpuMain [0x1079C3C3+2691]
content::RunNamedProcessTypeMain [0x131B4B67+135]
content::ContentMainRunnerImpl::Run [0x131B4A28+488]
content::ContentMain [0x131B2A14+100]
ChromeMain [0x04CC5622+114]
MainDllLoader::Launch [0x0043F5C4+916]
wWinMain [0x0043B42D+653]
invoke_main [0x006D3DDE+30] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:118)
__scrt_common_main_seh [0x006D3C2A+346] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:255)
__scrt_common_main [0x006D3ABD+13] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:300)
wWinMainCRTStartup [0x006D3DF8+8] (f:\dd\vctools\crt\vcstartup\src\startup\exe_wwinmain.cpp:17)
BaseThreadInitThunk [0x7701337A+18]
RtlInitializeExceptionChain [0x777D9882+99]
RtlInitializeExceptionChain [0x777D9855+54]

Are fallbacks taking effect that are causing unexpected code paths to be taken?
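The triage in this comment (reading upward from the FATAL line to the earlier ERROR lines of the same process) could be automated roughly as follows. This is only a sketch: the bracketed prefix format is inferred from the log lines quoted in this thread, and `important_lines` is a hypothetical helper, not part of telemetry or Chromium.

```python
import re

# Prefix format inferred from the quoted log lines (an assumption):
#   [pid:tid:MMDD/HHMMSS:SEVERITY:source_file(line)] message
LOG_LINE = re.compile(
    r"\[(?P<pid>\d+):(?P<tid>\d+):(?P<stamp>\d{4}/\d{6}):"
    r"(?P<severity>[A-Z0-9]+):(?P<source>[^\]]+)\]\s*(?P<message>.*)"
)


def important_lines(log_text, pid=None):
    """Return (severity, source, message) for ERROR/FATAL lines, in log order.

    If pid is given, only lines emitted by that process are kept.
    """
    wanted = {"ERROR", "FATAL"}
    result = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # backtrace frames and other unprefixed lines
        if match.group("severity") not in wanted:
            continue
        if pid is not None and int(match.group("pid")) != pid:
            continue
        result.append(
            (match.group("severity"), match.group("source"), match.group("message"))
        )
    return result
```

Filtering by the pid of the crashing process (4908 in the log above) keeps the GL initialization errors and the final CHECK failure together, which is what makes the ordering of events visible.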
,
Jun 10 2016
Jmadill: I remember we have a list of fallbacks that you moved from ANGLE to chromium. Can you shed some light on this?
,
Jun 10 2016
In addition to the previous comment, some random ideas:
1: Should we bump the driver_version field of entry 68 in gpu/config/software_rendering_list_json.cc? Though why would it start failing only now? Since these tests run in a virtual machine, do they really need to start the GPU process?
2: Is it possible that the bot or the VM has been touched, for example by upgrading the client driver?
3: Comment #21 from 2015 reports a similar problem on win_os at https://bugs.chromium.org/p/chromium/issues/detail?id=514274 (though the ticket itself is against Linux). So it seems that the problem disappears and reappears.
,
Jun 11 2016
If the GPU process fails to initialize the GPU then it'll clear the GL bindings and GpuChildThread::OnInitialize will do base::MessageLoop::current()->QuitWhenIdle(). I think there's a race condition there where if other messages (e.g. collect graphics info) are sent quickly then it'll process them before it becomes idle. Maybe it should do _exit(0); immediately in that case, so it won't try to handle other IPC messages. Another option is to get rid of the dead on arrival state and die immediately after creating a GPU context fails. That should work now that the browser process can detect when processes die before creating an IPC channel.
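The race described above can be illustrated with a toy model. This is a sketch in Python, not Chromium code: `ToyMessageLoop` and `simulate` are hypothetical names, and the loop only imitates the one property that matters here, namely that a quit-when-idle request does not stop tasks that are already queued.

```python
from collections import deque


class ToyMessageLoop:
    """Toy stand-in for a message loop; not Chromium's base::MessageLoop."""

    def __init__(self):
        self.tasks = deque()
        self.quit_when_idle_requested = False
        self.handled = []

    def post(self, name, handler=None):
        self.tasks.append((name, handler))

    def quit_when_idle(self):
        # Only takes effect once the queue is empty, so tasks that are
        # already pending still run.
        self.quit_when_idle_requested = True

    def exit_now(self):
        # Models an immediate process exit: pending tasks are never run.
        self.tasks.clear()

    def run(self):
        while self.tasks:
            name, handler = self.tasks.popleft()
            self.handled.append(name)
            if handler is not None:
                handler()
        return self.handled


def simulate(exit_immediately):
    """Run the startup sequence with GL initialization assumed to have failed."""
    loop = ToyMessageLoop()

    def on_initialize():
        if exit_immediately:
            loop.exit_now()
        else:
            loop.quit_when_idle()

    loop.post("OnInitialize", on_initialize)
    # The browser sends this right after launching the GPU process.
    loop.post("OnCollectGraphicsInfo")
    return loop.run()
```

In this model, `simulate(exit_immediately=False)` still handles `OnCollectGraphicsInfo` after the failed initialization, which corresponds to the path that hits the CHECK, while `simulate(exit_immediately=True)` stops after `OnInitialize`.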
,
Jun 13 2016
jbauman, I think you are exactly right. I did a quick fix here: https://codereview.chromium.org/2061953002 . In the long term, though, I think what you suggested would be better.
,
Jun 13 2016
,
Jun 14 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e02f94b59fa148e354db57a0f5d38f6bb644aa55

commit e02f94b59fa148e354db57a0f5d38f6bb644aa55
Author: j.isorce <j.isorce@samsung.com>
Date: Tue Jun 14 08:31:03 2016

Do not call CollectContextGraphicsInfo if GL has failed to initialize

If the gpu process is not yet launch when calling GpuDataManagerImplPrivate::RequestCompleteGpuInfoIfNeeded() it will cause to start the gpu process and send the message GpuMsg_CollectGraphicsInfo right away. (possible causes: browser_bridge.js, SystemInfoHandler::GetInfo, GPUFeatureChecker)

On gpu side this will cause to call GpuMain, GpuChildThread's constructor, OnInitialize and OnCollectGraphicsInfo sequentially.

If dead_on_arrival_ is true then GpuChildThread::OnInitialize calls base::MessageLoop::current()->QuitWhenIdle() which might let handle the pending GpuMsg_CollectGraphicsInfo message.

BUG= 619106
R=jbauman@chromium.org, kbr@chromium.org, piman@chromium.org, zmo@chromium.org
Review-Url: https://codereview.chromium.org/2061953002
Cr-Commit-Position: refs/heads/master@{#399669}

[modify] https://crrev.com/e02f94b59fa148e354db57a0f5d38f6bb644aa55/content/gpu/gpu_child_thread.cc
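A minimal sketch of the kind of guard the commit title describes, written as a toy Python model rather than the actual C++ in content/gpu/gpu_child_thread.cc. `ToyGpuChildThread` and its method names are hypothetical and only echo the identifiers in the commit message; the real change may be structured differently.

```python
class ToyGpuChildThread:
    """Hypothetical model of the GPU child thread; not Chromium code."""

    def __init__(self, gl_initialized):
        self.gl_initialized = gl_initialized
        # Assumption: a failed GL initialization is what marks the
        # process as dead on arrival.
        self.dead_on_arrival = not gl_initialized

    def collect_context_graphics_info(self):
        # Stands in for the CHECK that fired in gpu_info_collector.cc.
        if not self.gl_initialized:
            raise RuntimeError("Check failed: GL implementation is none")
        return {"collected": True}

    def on_collect_graphics_info(self):
        # Guard: the process is about to quit, so skip the collection
        # instead of touching GL state that was never set up.
        if self.dead_on_arrival:
            return None
        return self.collect_context_graphics_info()
```

With the guard in place, a message that arrives between the failed initialization and the loop actually quitting is ignored rather than crashing the process.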
,
Jun 28 2016
Can we mark this issue as fixed? Just adding a last note about what jbauman suggested in comment #8: "Another option is to get rid of the dead on arrival state and die immediately after creating a GPU context fails. That should work now that the browser process can detect when processes die before creating an IPC channel."
,
Jun 28 2016
Thanks for working on this, Julien. chromium-try-flakes isn't reporting any recent flakes in this test: http://chromium-try-flakes.appspot.com/search?q=telemetry_unittests%20(with%20patch) Closing as fixed. Thanks.
Comment 1 by kbr@chromium.org
, Jun 10 2016