Issue metadata

Status: Fixed
Owner:
Closed: Dec 5
Cc:
Components:
OS: Windows
Pri: 1
Type: Defect

Issue 60: Vulkan Device destruction asserts that all fences have completed, but this isn't always the case.

Reported by cwallez@chromium.org, Dec 4

Project Member

Issue description

The Vulkan backend is causing a flaky failure on Win10 FYI Exp (Release):

https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20FYI%20Exp%20Release%20%28NVIDIA%29

Here's one of the failures:

Assertion failure at ../../third_party/dawn/src/dawn_native/vulkan/DeviceVk.cpp:151 (~Device): mFencesInFlight.empty()
Received fatal exception EXCEPTION_BREAKPOINT
Still waiting for the following processes to finish:
	"c:\b\s\w\ir\out\Release\dawn_end2end_tests.exe" --gtest_flagfile="c:\b\s\w\itqwveny\scoped_dir1792_21741\08337e91-7c58-4769-b0a8-3a449322577f.tmp" --single-process-tests --test-launcher-output="c:\b\s\w\itqwveny\1792_13399\test_results.xml" --test-launcher-retry-limit=0 --test-launcher-summary-output="c:\b\s\w\iowbdrii\output.json" --use-gpu-in-tests
Backtrace:
	HandleAssertionFailure [0x745B2435+149]
	dawn_native::vulkan::Device::~Device [0x745A431B+159]
	dawn_native::vulkan::Device::`scalar deleting destructor' [0x745A53F7+11]
	dawn_native::DeviceBase::Release [0x7458B788+58]
	DawnTest::~DawnTest [0x00F1109A+92]
	BasicTests_BufferSetSubData_Test::`scalar deleting destructor' [0x00F18BCB+11]
	testing::Test::DeleteSelf_ [0x00F51B5D+13]
	testing::TestInfo::Run [0x00F51A5D+295]
	testing::TestCase::Run [0x00F51EAC+244]
	testing::internal::UnitTestImpl::RunAllTests [0x00F587D5+629]
	testing::UnitTest::Run [0x00F5845B+153]
	base::TestSuite::Run [0x00F676C2+100]
	main [0x00F43BA8+260]
	base::internal::Invoker<base::internal::BindState<void (__cdecl*)(void *),void *>,void __cdecl(void)>::Run [0x00F43BDC+12]
	base::OnceCallback<int __cdecl(void)>::Run [0x00F697BD+43]
	std::unique_ptr<logging::ScopedLogAssertHandler,std::default_delete<logging::ScopedLogAssertHandler> >::reset [0x00F689C1+299]
	base::LaunchUnitTestsWithOptions [0x00F68E2B+166]
	main [0x00F43B59+181]
	__scrt_common_main_seh [0x0104AD1C+250] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
	BaseThreadInitThunk [0x769E8674+36]
	RtlGetAppContainerNamedObjectPath [0x77D75D87+311]
	RtlGetAppContainerNamedObjectPath [0x77D75D57+263]

The Vulkan backend waits for the queue to go idle and then assumes that every fence has passed. The assert firing above shows that this assumption does not always hold. We should see whether we can instead either wait for idle, set CompletedSignal to lastSignaled+1, Tick, and then destroy all fences, or wait on each fence individually and then destroy it.
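
To make the first option concrete, here is a minimal sketch of serial-based teardown. It is a self-contained model, not Dawn's code: MockFence, MockDevice, Submit, Tick and DestroyAfterIdle are hypothetical names, the queue wait is assumed rather than shown, and the completed serial is simply set to the last submitted serial, which may differ from the exact bookkeeping the backend would use.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <utility>

// Hypothetical stand-ins for illustration; these are not Dawn's real types.
using Serial = std::uint64_t;

struct MockFence {
    bool reportsSignaled = false;  // what a status query on this fence would say
    bool destroyed = false;
};

class MockDevice {
  public:
    // Track a new submission: its fence is paired with the serial it signals.
    MockFence* Submit() {
        ++mLastSubmittedSerial;
        mFenceStorage.emplace_back();  // deque keeps earlier pointers valid
        MockFence* fence = &mFenceStorage.back();
        mFencesInFlight.push_back({fence, mLastSubmittedSerial});
        return fence;
    }

    // Regular bookkeeping: only retires fences that report signaled.
    void Tick() {
        while (!mFencesInFlight.empty() &&
               mFencesInFlight.front().first->reportsSignaled) {
            mCompletedSerial = mFencesInFlight.front().second;
            Retire();
        }
    }

    // Teardown: once the queue is idle every submitted serial is done, so
    // retire all fences by serial without consulting their reported status.
    void DestroyAfterIdle() {
        // Assumption: a real backend would block on the queue here.
        mCompletedSerial = mLastSubmittedSerial;
        while (!mFencesInFlight.empty() &&
               mFencesInFlight.front().second <= mCompletedSerial) {
            Retire();
        }
        assert(mFencesInFlight.empty());
    }

    std::size_t FencesInFlight() const { return mFencesInFlight.size(); }
    Serial CompletedSerial() const { return mCompletedSerial; }

  private:
    // Destroy the oldest in-flight fence and stop tracking it.
    void Retire() {
        mFencesInFlight.front().first->destroyed = true;
        mFencesInFlight.pop_front();
    }

    Serial mLastSubmittedSerial = 0;
    Serial mCompletedSerial = 0;
    std::deque<MockFence> mFenceStorage;
    std::deque<std::pair<MockFence*, Serial>> mFencesInFlight;
};
```

In this model, a fence whose status query still reports "not ready" after the queue is idle no longer trips the assert, because teardown relies on serials rather than on the per-fence status.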

Austin, can you take a look, since you touched this code recently?
 

Comment 1 by cwallez@chromium.org, Dec 4

Project Member
Cc: sugoi@chromium.org ynovikov@chromium.org
+CC wranglers

Comment 2 by bugdroid1@chromium.org, Dec 4

Project Member
The following revision refers to this bug:
  https://dawn.googlesource.com/dawn/+/66b024e499a11e2b98630d01c34543bd71f5c3c2

commit 66b024e499a11e2b98630d01c34543bd71f5c3c2
Author: Austin Eng <enga@chromium.org>
Date: Tue Dec 04 23:55:01 2018

Vulkan: Explicitly wait for all fences to complete on Device destruction

This ensures that all fences are complete. Flaky failures on Windows
showed some fence statuses were NOT_READY despite having been checked
after calling vkQueueWaitIdle.

Bug: dawn:60
Change-Id: Id4fa18c8842daf75faa9df6fcba8afdca43623c9
Reviewed-on: https://dawn-review.googlesource.com/c/2920
Reviewed-by: Corentin Wallez <cwallez@chromium.org>
Commit-Queue: Austin Eng <enga@chromium.org>

[modify] https://crrev.com/66b024e499a11e2b98630d01c34543bd71f5c3c2/src/dawn_native/vulkan/DeviceVk.cpp
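
For readers unfamiliar with the approach the commit message describes, the following is a rough sketch of "explicitly wait for every fence, then destroy it". It is not the code from the commit: FakeFence, WaitForFence and WaitAndDestroyAll are hypothetical names, and the timeout behaviour is simulated with a counter instead of a real Vulkan wait.

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a fence whose completion lags behind the queue.
enum class WaitResult { Success, Timeout };

struct FakeFence {
    int timeoutsBeforeSuccess = 0;  // how many waits time out before it signals
    bool signaled = false;
    bool destroyed = false;
};

// Stand-in for a blocking wait with a timeout on a single fence.
WaitResult WaitForFence(FakeFence& fence) {
    if (fence.timeoutsBeforeSuccess > 0) {
        --fence.timeoutsBeforeSuccess;
        return WaitResult::Timeout;
    }
    fence.signaled = true;
    return WaitResult::Success;
}

// Wait on every outstanding fence until it really completes, then destroy it.
// Returns the total number of wait calls issued. A real backend would clear
// its in-flight list afterwards; the vector is left intact here for inspection.
int WaitAndDestroyAll(std::vector<FakeFence>& fencesInFlight) {
    int waits = 0;
    for (FakeFence& fence : fencesInFlight) {
        WaitResult result;
        do {
            // Retry on timeout instead of asserting on one status query.
            result = WaitForFence(fence);
            ++waits;
        } while (result == WaitResult::Timeout);
        fence.destroyed = true;
    }
    return waits;
}
```

The design choice being illustrated is that teardown never assumes a fence is complete; it only destroys a fence after a wait on that specific fence has succeeded.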

Comment 3 by cwallez@chromium.org, Dec 5

Project Member
Status: Fixed (was: Accepted)
