win/lld switch breaks gpu tests on fyi waterfall
Issue description

The lld switch broke these bots:
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20FYI%20Release%20%28NVIDIA%29/1063
https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win10%20FYI%20dEQP%20Release%20%28NVIDIA%29/3528

These bots still use the builder/tester split; with luci it's not clear to me how to get to the corresponding builder. I'll try to find out.

They're on the fyi waterfall but corresponding try bots are apparently auto-added for gpu changes (but not, say, linker changes) and block the cq (I'll file a separate bug for that).

Some failures:

NVIDIA bot: https://chromium-swarm.appspot.com/task?id=3d6a06183fea4910&refresh=10&show_raw=1
lay::initialize error 12289: Failed to create dummy OpenGL window.
[7432:1660:0511/145650.312:11864078:ERROR:gl_surface_egl.cc(862)] eglInitialize OpenGLNull failed with error EGL_NOT_INITIALIZED
[7432:1660:0511/145650.312:11864078:ERROR:gl_initializer_win.cc(232)] GLSurfaceEGL::InitializeOneOff failed.
[7432:1660:0511/145650.312:11864078:FATAL:run_all_tests.cc(22)] Check failed: gl::init::InitializeGLOneOff().
Backtrace: base::debug::StackTrace::StackTrace [0x00951A30+32] base::debug::StackTrace::StackTrace [0x00899D4D+13] logging::LogMessage::~LogMessage [0x008A1B93+83] main [0x00848533+379] base::internal::Invoker<base::internal::BindState<int (__cdecl*)(base::TestSuite *),base::internal::UnretainedWrapper<base::TestSuite> >,int __cdecl(void)>::Run [0x008485B0+12] base::OnceCallback<int __cdecl(void)>::Run [0x008C7C12+44] std::unique_ptr<logging::ScopedLogAssertHandler,std::default_delete<logging::ScopedLogAssertHandler> >::reset [0x008C6EB1+297] base::LaunchUnitTestsSerially [0x008C7314+157] main [0x00848468+176] __scrt_common_main_seh [0x00B622DE+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283) BaseThreadInitThunk [0x77248674+36] RtlGetAppContainerNamedObjectPath [0x77374B47+311] RtlGetAppContainerNamedObjectPath [0x77374B17+263] [6400:6396:0511/151733.350:FATAL:canvas_resource_provider.cc(323)] Check failed: SharedGpuContext::IsGpuCompositingEnabled(). Backtrace: base::debug::StackTrace::StackTrace [0x69D77A50+32] base::debug::StackTrace::StackTrace [0x69D772FD+13] logging::LogMessage::~LogMessage [0x69D8C263+83] blink::CanvasResourceProvider::Create [0x6AEA224F+559] blink::WebGLRenderingContextBase::TexImageHelperHTMLVideoElement [0x6C010E29+1099] blink::WebGLRenderingContextBase::texImage2D [0x6C0111C3+85] blink::V8WebGL2RenderingContext::texImage2DMethodCallback [0x6C365178+5760] v8::internal::FunctionCallbackArguments::Call [0x69397FC1+625] v8::internal::SharedFunctionInfo::get_api_func_data [0x69396AD9+2601] v8::internal::BuiltinArguments::BuiltinArguments [0x6939577C+492] v8::internal::Builtin_HandleApiCall [0x693953B1+161] Received fatal exception EXCEPTION_BREAKPOINT Backtrace: base::debug::BreakDebugger [0x6A26A92C+12] 
?Run@?$Invoker@U?$BindState@P6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@base@@1@Z$$V@internal@base@@$$A6AXPBDHV?$BasicStringPiece@V?$basic_string@DU?$char_traits@D@std@@V?$allocator@D@2@@std@@@3@1@Z@internal@ [0x6A12C24D+31] logging::LogMessage::~LogMessage [0x69D8C67D+1133] blink::CanvasResourceProvider::Create [0x6AEA224F+559] blink::WebGLRenderingContextBase::TexImageHelperHTMLVideoElement [0x6C010E29+1099] blink::WebGLRenderingContextBase::texImage2D [0x6C0111C3+85] blink::V8WebGL2RenderingContext::texImage2DMethodCallback [0x6C365178+5760] v8::internal::FunctionCallbackArguments::Call [0x69397FC1+625] v8::internal::SharedFunctionInfo::get_api_func_data [0x69396AD9+2601] v8::internal::BuiltinArguments::BuiltinArguments [0x6939577C+492] v8::internal::Builtin_HandleApiCall [0x693953B1+161] (No symbol) [0x57A6BDCA] (No symbol) [0x0801C314] (No symbol) [0x0801C314] (No symbol) [0x0800EA7D] (No symbol) [0x0801C314] (No symbol) [0x0801C314] (No symbol) [0x0800EA7D] (No symbol) [0x08014FDC] (No symbol) [0x080085F1] v8::internal::Execution::New [0x696655C3+931] v8::internal::Execution::Call [0x696650F7+247] v8::internal::Execution::Call [0x69665021+33] v8::Function::Call [0x69307DA4+500] blink::V8ScriptRunner::CallFunction [0x6A868D3E+592] blink::V8EventListener::CallListenerFunction [0x6BB24017+371] blink::V8AbstractEventListener::InvokeEventHandler [0x6B1BBEA1+307] blink::V8AbstractEventListener::HandleEvent [0x6B1BBD0D+179] blink::V8AbstractEventListener::handleEvent [0x6B1BBC38+188] blink::EventTarget::FireEventListeners [0x6A8B0C06+1544] blink::EventTarget::FireEventListeners [0x6A8B043C+418] blink::Node::HandleLocalEvents [0x6A7E0DC5+269] blink::EventDispatcher::Dispatch [0x6A759A07+741] blink::Event::DispatchEvent [0x6A1CCC45+11] blink::EventDispatcher::DispatchEvent [0x6A7592FE+150] blink::Node::DispatchEventInternal [0x6A7E0E02+12] blink::MediaElementEventQueue::TimerFired [0x6B9E5E6D+721] 
blink::TaskRunnerTimer<blink::AXObjectCacheImpl>::Fired [0x69BACF07+15] blink::TimerBase::RunInternal [0x6A1B42CD+389] base::OnceCallback<void __cdecl(void)>::Run [0x68A26721+43] WTF::ThreadCheckingCallbackWrapper<base::OnceCallback<void __cdecl(void)>,void __cdecl(void)>::Run [0x69BA2913+95] base::debug::TaskAnnotator::RunTask [0x69D78274+308] blink::scheduler::internal::ThreadControllerImpl::DoWork [0x69BE3B5F+417] base::internal::Invoker<base::internal::BindState<void (__thiscall media::AudioRendererImpl::*)(enum media::BufferingState),base::WeakPtr<media::AudioRendererImpl>,enum media::BufferingState>,void __cdecl(void)>::Run [0x68DB8EB5+59] ??$ForwardRepeating@$$V@?$CancelableCallbackImpl@V?$RepeatingCallback@$$A6AXXZ@base@@@internal@base@@AAEXXZ [0x68D81F50+16] base::debug::TaskAnnotator::RunTask [0x69D78274+308] base::internal::IncomingTaskQueue::RunTask [0x6A26E869+105] base::MessageLoop::RunTask [0x69D94967+519] base::MessageLoop::DeferOrRunPendingTask [0x69D94CBD+157] base::MessageLoop::DoDelayedWork [0x69D95162+562] base::MessagePumpDefault::Run [0x6A270E4A+74] base::MessageLoop::Run [0x69D94484+116] base::RunLoop::Run [0x69DAF3FC+204] content::RendererMain [0x6A2289A1+1021] content::RunOtherNamedProcessTypeMain [0x69D6AE05+109] content::ContentMainRunnerImpl::Run [0x69D6B66E+430] service_manager::Main [0x69D708C2+1198] content::ContentMain [0x69D6AD6F+51] ChromeMain [0x68A2111E+286] MainDllLoader::Launch [0x00CA54BC+560] wWinMain [0x00CA1543+1347] __scrt_common_main_seh [0x00D860D2+246] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283) [1816:7192:0511/164958.933:4898281:ERROR:angle_platform_impl.cc(54)] initialize(470): ANGLE Display::initialize error 12289: Failed to create dummy OpenGL window. [1816:7192:0511/164958.933:4898281:ERROR:gl_surface_egl.cc(862)] eglInitialize OpenGL failed with error EGL_NOT_INITIALIZED [1816:7192:0511/164958.933:4898281:ERROR:gl_initializer_win.cc(232)] GLSurfaceEGL::InitializeOneOff failed. 
[1816:7192:0511/164958.933:4898281:FATAL:rendering_helper.cc(109)] Could not initialize GL
Backtrace: base::debug::StackTrace::StackTrace [0x00E66790+32] base::debug::StackTrace::StackTrace [0x00E0A75D+13] logging::LogMessage::~LogMessage [0x00E174E3+83] media::RenderingHelper::InitializeOneOff [0x00D9154B+231] base::internal::Invoker<base::internal::BindState<void (__cdecl*)(bool,base::WaitableEvent *),bool,base::WaitableEvent *>,void __cdecl(void)>::Run [0x00D995F7+17] base::debug::TaskAnnotator::RunTask [0x00EFB864+308] base::internal::IncomingTaskQueue::RunTask [0x00EF6559+105] base::MessageLoop::RunTask [0x00E73197+519] base::MessageLoop::DeferOrRunPendingTask [0x00E734ED+157] base::MessageLoop::DoWork [0x00E7371A+506] base::MessagePumpForUI::DoRunLoop [0x00E74568+120] base::MessagePumpWin::Run [0x00E740CE+110] base::MessageLoop::Run [0x00E72CB4+116] base::RunLoop::Run [0x00E20A8C+204] base::Thread::Run [0x00E2ACC4+164] base::Thread::ThreadMain [0x00E2AF57+631] base::PlatformThread::SetCurrentThreadPriority [0x00E29E95+533] BaseThreadInitThunk [0x75668674+36] RtlGetAppContainerNamedObjectPath [0x77034B47+311] RtlGetAppContainerNamedObjectPath [0x77034B17+263]

deqp bot: https://chromium-swarm.appspot.com/task?id=3d69c613d962a410&refresh=10&show_raw=1
[ RUN ] dEQP_GLES31.Default/info_render_target
dEQP-GLES31.info.render_target
Writing test log into TestResults.qpa
Exception running test: Got EGL_NOT_INITIALIZED: initialize(m_eglDisplay, &major, &minor) at egluGLContextFactory.cpp:331
../../third_party/angle/src/tests/deqp_support/angle_deqp_gtest.cpp(295): error: Value of: result
Actual: false
Expected: true
Stack trace:
Backtrace: StackTraceGetter::CurrentStackTrace [0x009BD778+40] testing::internal::UnitTestImpl::CurrentOsStackTraceExceptTop [0x009C55F7+69] testing::internal::AssertHelper::operator= [0x009C525C+48]

So for some reason gl initialization fails when linking with lld.
jmadill says "You can build these tests with Chrome and by running with the --use-angle=gl flag".
,
May 12 2018
I wonder whether what's going on here has something to do with what was going on here: https://chromium-review.googlesource.com/c/chromium/src/+/826544
,
May 12 2018
I downloaded an archive of a bad build from http://commondatastorage.googleapis.com/chromium-gpu-fyi-archive/chromium.gpu.fyi/GPU FYI Win Builder/full-build-win32_7aec5f843746c07cab0be327fa1ee84cd4b66eee.zip (from the "extract build" output on the build linked in comment 0) and a good build from http://commondatastorage.googleapis.com/chromium-gpu-fyi-archive/chromium.gpu.fyi/GPU FYI Win Builder/full-build-win32_9c79c3546c7996d284ddf16b45845ab51b213d00.zip. I then wanted to compare dumpbin /dependents output based on pcc's comment, but Chrome wants to scan the zip files before letting me look at them, and on my laptop it takes a very long time for Chrome to scan these 1.2 GB zip files. So now I'll let the laptop compute for a while instead :-/
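The comparison of dumpbin /dependents output between the good and the bad build could be scripted roughly as follows. This is only a sketch: it assumes the usual dumpbin layout (a line containing "dependencies", then one indented DLL name per line, then a "Summary" block), and the function names and the sample text are made up for illustration.

```python
def parse_dependents(dumpbin_text):
    """Return the lower-cased DLL names found in `dumpbin /dependents` output.

    Assumes the list starts after a line containing "dependencies" and
    ends at the "Summary" line (an assumption about the output layout).
    """
    deps = set()
    in_list = False
    for line in dumpbin_text.splitlines():
        stripped = line.strip().lower()
        if "dependencies" in stripped:
            in_list = True
            continue
        if in_list and stripped.startswith("summary"):
            break
        if in_list and stripped.endswith(".dll"):
            deps.add(stripped)
    return deps


def diff_dependents(good_text, bad_text):
    """Return (only_in_good, only_in_bad) as sorted lists of DLL names."""
    good = parse_dependents(good_text)
    bad = parse_dependents(bad_text)
    return sorted(good - bad), sorted(bad - good)


# Made-up sample output, not taken from the real builds.
GOOD_SAMPLE = """
  Image has the following dependencies:

    KERNEL32.dll
    libGLESv2.dll

  Summary
"""
BAD_SAMPLE = """
  Image has the following dependencies:

    KERNEL32.dll

  Summary
"""

if __name__ == "__main__":
    print(diff_dependents(GOOD_SAMPLE, BAD_SAMPLE))
```

An empty result on both sides would match the "nothing obvious" outcome reported in the next comment.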
,
May 12 2018
Nothing obvious from dumpbin /dependents output (I ended up downloading the zips with IE O_o), but from spelunking through the code a bit this looks pretty suspicious:

https://cs.chromium.org/chromium/src/ui/gl/angle_platform_impl.cc?rcl=0b14f4b844da36a953498066fa590b0e2bcba813&l=29

// Place the function pointers for ANGLEGetDisplayPlatform and
// ANGLEResetDisplayPlatform in read-only memory after being resolved to prevent
// them from being tampered with. See crbug.com/771365 for details.
PROTECTED_MEMORY_SECTION base::ProtectedMemory<GetDisplayPlatformFunc> g_angle_get_platform;
PROTECTED_MEMORY_SECTION base::ProtectedMemory<ResetDisplayPlatformFunc> g_angle_reset_platform;

https://cs.chromium.org/chromium/src/base/memory/protected_memory.h?type=cs&l=90

// Define a read-write prot section. The $a, $mem, and $z 'sub-sections' are
// merged alphabetically so $a and $z are used to define the start and end of
// the protected memory section, and $mem holds protected variables.
// (Note: Sections in Portable Executables are equivalent to segments in other
// executable formats, so this section is mapped into its own pages.)
#pragma section("prot$a", read, write)
#pragma section("prot$mem", read, write)
#pragma section("prot$z", read, write)
// We want the protected memory section to be read-only, not read-write so we
// instruct the linker to set the section read-only at link time. We do this
// at link time instead of compile time, because defining the prot section
// read-only would cause mis-compiles due to optimizations assuming that the
// section contents are constant.
#pragma comment(linker, "/SECTION:prot,R")
__declspec(allocate("prot$a")) __declspec(selectany) char __start_protected_memory;
__declspec(allocate("prot$z")) __declspec(selectany) char __stop_protected_memory;
#define PROTECTED_MEMORY_SECTION __declspec(allocate("prot$mem"))

lld probably doesn't implement all of that, or not correctly.
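To make the ordering assumption in the quoted header comment concrete, here is a toy model of the merge it describes: contributions named section$suffix are grouped by the part before the "$" and ordered alphabetically by suffix. This is a simplified illustration of what the comment says, not how link.exe or lld is implemented, and merge_subsections is a hypothetical helper.

```python
from collections import defaultdict


def merge_subsections(contributions):
    """Group (section_name, symbol) pairs the way the header comment describes.

    The part before "$" names the output section; the part after it is
    sorted alphabetically. sorted() is stable, so symbols that share a
    suffix keep their input order (a simplification of real linkers).
    """
    grouped = defaultdict(list)
    for name, symbol in contributions:
        base, _, suffix = name.partition("$")
        grouped[base].append((suffix, symbol))
    return {
        base: [symbol for _, symbol in sorted(items, key=lambda item: item[0])]
        for base, items in grouped.items()
    }


# Contributions in an arbitrary order, as object files might supply them.
CONTRIBUTIONS = [
    ("prot$z", "__stop_protected_memory"),
    ("prot$mem", "g_angle_get_platform"),
    ("prot$a", "__start_protected_memory"),
    ("prot$mem", "g_angle_reset_platform"),
]

if __name__ == "__main__":
    for symbol in merge_subsections(CONTRIBUTIONS)["prot"]:
        print(symbol)
```

ProtectedMemory depends on the start and stop markers bracketing the protected variables, so a linker that ordered these contributions differently would break it.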
,
May 12 2018
That code was added in https://chromium-review.googlesource.com/c/chromium/src/+/765009 by vstyrklevich. Vlad, did you check if that does the right thing with lld?
,
May 12 2018
On second look, it looks rather harmless, but it's linker-y and in the vicinity of what goes wrong...
,
May 13 2018
That change is from November and I don't remember if I tested it with lld then. That CL is potentially sensitive to linker changes: it relies on sections being merged in a particular order. The location where it's failing is suspicious given the ProtectedMemory change. I'm trying to download a broken build to take a look at it. To test it locally you could try a build with lines 89-112 of base/memory/protected_memory.h commented out to disable ProtectedMemory Windows support.
,
May 13 2018
Looking at gpu_unittests.exe from full-build-win32_7aec5f843746c07cab0be327fa1ee84cd4b66eee.zip, __start_protected_memory and __stop_protected_memory seem to have been merged in the correct order and the section is correctly set read-only. Also, the build has DCHECK() enabled, so ProtectedMemory would have errored out if the VirtualProtect() call or the other sanity checks had failed. The "Failed to create dummy OpenGL window" message comes from the ANGLE code and indicates a failing CreateWindowEx() return code. Given that ANGLE doesn't use base::ProtectedMemory, that's a strike against that hypothesis.
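One way to check section permissions like this without extra tools is to read the section table of the PE image directly. The sketch below uses only the standard struct module and the documented PE/COFF header layout; section_permissions and fake_image are hypothetical helpers, and the demo image is synthetic rather than the real gpu_unittests.exe.

```python
import struct

# Section characteristic flags from the PE/COFF format.
IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def section_permissions(image):
    """Map section name -> permission string ("R", "RW", ...) for PE bytes."""
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)  # e_lfanew
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("not a PE image")
    (num_sections,) = struct.unpack_from("<H", image, pe_offset + 6)
    (opt_size,) = struct.unpack_from("<H", image, pe_offset + 20)
    table = pe_offset + 24 + opt_size  # signature + COFF header + optional header
    perms = {}
    for index in range(num_sections):
        entry = table + 40 * index  # each section header is 40 bytes
        name = image[entry:entry + 8].rstrip(b"\0").decode("ascii", "replace")
        (flags,) = struct.unpack_from("<I", image, entry + 36)
        perms[name] = "".join(
            letter
            for bit, letter in (
                (IMAGE_SCN_MEM_READ, "R"),
                (IMAGE_SCN_MEM_WRITE, "W"),
                (IMAGE_SCN_MEM_EXECUTE, "X"),
            )
            if flags & bit
        )
    return perms


def fake_image(sections):
    """Build a minimal synthetic PE header (demo only, not a loadable file)."""
    dos = bytearray(0x40)
    struct.pack_into("<I", dos, 0x3C, 0x40)
    coff = struct.pack("<HHIIIHH", 0x14C, len(sections), 0, 0, 0, 0, 0)
    table = b"".join(
        struct.pack("<8sIIIIIIHHI", name, 0, 0, 0, 0, 0, 0, 0, 0, flags)
        for name, flags in sections
    )
    return bytes(dos) + b"PE\0\0" + coff + table


if __name__ == "__main__":
    demo = fake_image([
        (b"prot", IMAGE_SCN_MEM_READ),
        (b".data", IMAGE_SCN_MEM_READ | IMAGE_SCN_MEM_WRITE),
    ])
    print(section_permissions(demo))
```

For a real binary, the bytes would come from open(path, "rb").read(); a prot section carrying only the read flag would agree with the /SECTION:prot,R directive.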
,
May 13 2018
C:\Users\thakis\Downloads\full-build-win32_7aec5f843746c07cab0be327fa1ee84cd4b66eee\full-build-win32>python scan.py
.\angle_end2end_tests.exe
.\angle_gles1_conformance_tests.exe
.\angle_util.dll
.\libEGL.dll
.\libGLESv1_CM.dll
.\libGLESv2.dll
.\swiftshader_unittests.exe
.\swiftshader\libGLESv2.dll
.\win_clang_nacl_win64\libEGL.dll
.\win_clang_nacl_win64\libGLESv2.dll
.\win_clang_nacl_win64\swiftshader\libGLESv2.dll
C:\Users\thakis\Downloads\full-build-win32_7aec5f843746c07cab0be327fa1ee84cd4b66eee\full-build-win32>type scan.py
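import os
import subprocess

# Walk the build directory and print every .exe/.dll that depends on libGLES*.
for root, dirnames, filenames in os.walk('.'):
    for filename in filenames:
        if not (filename.endswith(".exe") or filename.endswith(".dll")):
            continue
        filename = os.path.join(root, filename)
        # dumpbin ships with Visual Studio and must be on PATH.
        o = subprocess.check_output(["dumpbin", "/dependents", filename])
        # check_output returns bytes on Python 3, so compare against bytes.
        if b'libgles' in o.lower():
            print(filename)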
,
May 13 2018
This repros locally at least with the bad build downloaded from the url mentioned in comment 3:
C:\Users\thakis\Downloads\full-build-win32_7aec5f843746c07cab0be327fa1ee84cd4b66eee\full-build-win32>command_buffer_perftests.exe --use-angle=gl-null --gtest_filter=DecoderPerfTest.TextureDraw
IMPORTANT DEBUGGING NOTE: batches of tests are run inside their
own process. For debugging a test inside a debugger, use the
--gtest_filter=<your_test_name> flag along with
--single-process-tests.
Using sharding settings from environment. This is shard 0/1
Using 1 parallel jobs.
[2672:11480:0513/165743.982:2184674521:ERROR:angle_platform_impl.cc(54)] initialize(470): ANGLE Display::initialize error 12289: WGL_NV_DX_interop2 is required but not present.
[2672:11480:0513/165743.983:2184674521:ERROR:gl_surface_egl.cc(862)] eglInitialize OpenGLNull failed with error EGL_NOT_INITIALIZED
[2672:11480:0513/165743.983:2184674521:ERROR:gl_initializer_win.cc(232)] GLSurfaceEGL::InitializeOneOff failed.
[2672:11480:0513/165743.984:2184674521:FATAL:run_all_tests.cc(22)] Check failed: gl::init::InitializeGLOneOff().
Backtrace:
base::debug::StackTrace::StackTrace [0x01071A30+32]
base::debug::StackTrace::StackTrace [0x00FB9D4D+13]
logging::LogMessage::~LogMessage [0x00FC1B93+83]
main [0x00F68533+379]
base::internal::Invoker<base::internal::BindState<int (__cdecl*)(base::TestSuite *),base::internal::UnretainedWrapper<base::TestSuite> >,int __cdecl(void)>::Run [0x00F685B0+12]
base::OnceCallback<int __cdecl(void)>::Run [0x00FE7C12+44]
std::unique_ptr<logging::ScopedLogAssertHandler,std::default_delete<logging::ScopedLogAssertHandler> >::reset [0x00FE6EB1+297]
base::LaunchUnitTestsSerially [0x00FE7314+157]
main [0x00F68468+176]
__scrt_common_main_seh [0x012822DE+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
BaseThreadInitThunk [0x7555343D+18]
RtlInitializeExceptionChain [0x77B99832+99]
RtlInitializeExceptionChain [0x77B99805+54]
Failed to get out-of-band test success data, dumping full stdio below:
[1/1] DecoderPerfTest.TextureDraw (0 ms)
1 test failed:
DecoderPerfTest.TextureDraw (../../gpu/command_buffer/tests/decoder_perftest.cc:569)
Tests took 4 seconds.
,
May 13 2018
Actually that's different, it says "WGL_NV_DX_interop2 is required but not present" (and seems to happen with the good build as well), while the bot says "Failed to create dummy OpenGL window."
,
May 13 2018
,
May 14 2018
Nope that's not it.
,
May 14 2018
Luckily this repro'd on hans's windows box, so we pair-debugged it a bit. We added a custom window proc that forwards to DefWindowProc and logged window messages. With link.exe, doing just that (https://paste.googleplex.com/6048858521993216) made the test fail if the wrapping window proc wasn't static. If it was static, then the messages with a passing link.exe build were 0x24 0x81 0x83 0x1, while with lld they were 0x24 0x81 0x82. So for lld, DefWindowProc returned FALSE for WM_NCCREATE for some reason. Hans then noticed that we pass DefWindowProc but call CreateWindowExA and build with -DUNICODE, so DefWindowProc was DefWindowProcW. Explicitly using DefWindowProcA makes everything work with both linkers. So this is a code bug, and Hans is making a CL for that. Still curious why things happened to work with link but not lld. (Hans also tried removing the /DEF: flags, which made no difference. We considered turning off ICF, but by that time hans had seen the DefWindowProcA thing already.)
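The two logged message sequences are easier to compare once the numeric IDs are mapped to their winuser.h names. The sketch below does that and reports where the sequences first differ. The helper names are hypothetical, and the lld sequence is taken to be 0x24 0x81 0x82.

```python
# Window message IDs as defined in winuser.h.
WM_NAMES = {
    0x0001: "WM_CREATE",
    0x0024: "WM_GETMINMAXINFO",
    0x0081: "WM_NCCREATE",
    0x0082: "WM_NCDESTROY",
    0x0083: "WM_NCCALCSIZE",
}

# Sequences reported in the comment above; the second lld value is read as 0x81.
LINK_MESSAGES = [0x24, 0x81, 0x83, 0x1]
LLD_MESSAGES = [0x24, 0x81, 0x82]


def decode(messages):
    """Translate message IDs to names, falling back to hex for unknown IDs."""
    return [WM_NAMES.get(message, hex(message)) for message in messages]


def first_divergence(left, right):
    """Return the first index where the sequences differ, or None if equal."""
    for index, (a, b) in enumerate(zip(left, right)):
        if a != b:
            return index
    if len(left) != len(right):
        return min(len(left), len(right))
    return None


if __name__ == "__main__":
    print("link.exe:", decode(LINK_MESSAGES))
    print("lld:     ", decode(LLD_MESSAGES))
    print("first difference at index", first_divergence(LINK_MESSAGES, LLD_MESSAGES))
```

Read this way, the link.exe build proceeds from WM_NCCREATE to WM_NCCALCSIZE and WM_CREATE, while the lld build gets WM_NCDESTROY right after WM_NCCREATE, which fits window creation being aborted once WM_NCCREATE handling returns FALSE.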
,
May 14 2018
Thanks for filing the issue. The --use-angle=gl or --use-angle=gl-null flags should repro the problem, but it looks like you got it figured out.
,
May 14 2018
,
May 14 2018
,
May 14 2018
The following revision refers to this bug:
https://chromium.googlesource.com/angle/angle/+/5d2ccc534d26cc1801dd82479d647cce84ba25e2

commit 5d2ccc534d26cc1801dd82479d647cce84ba25e2
Author: Hans Wennborg <hans@chromium.org>
Date: Mon May 14 14:06:56 2018

Use DefWindowProcA for window created with CreateWindowExA

Bug: chromium:842408
Change-Id: I8793e3bb9ed4661e49eceb55c7253d7ada06488a
Reviewed-on: https://chromium-review.googlesource.com/1057231
Reviewed-by: Nico Weber <thakis@chromium.org>
Reviewed-by: Jamie Madill <jmadill@chromium.org>
Commit-Queue: Jamie Madill <jmadill@chromium.org>

[modify] https://crrev.com/5d2ccc534d26cc1801dd82479d647cce84ba25e2/src/libANGLE/renderer/gl/wgl/DisplayWGL.cpp
,
May 14 2018
jmadill: once angle with the fix has rolled into chromium, which trybot do i need to add to the lld switch to make sure things are happy now?
,
May 14 2018
Use win_optional_gpu_tests_rel: https://ci.chromium.org/buildbot/tryserver.chromium.win/win_optional_gpu_tests_rel/23638 Look for the webgl_conformance_gl_passthrough_tests and gles2_conform_gl_test targets.
,
May 14 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/f7f609d28c4d78e68712a710c66e9e4d45eef1fe

commit f7f609d28c4d78e68712a710c66e9e4d45eef1fe
Author: angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Mon May 14 15:35:33 2018

Roll src/third_party/angle/ 66aafcb46..5d2ccc534 (1 commit)

https://chromium.googlesource.com/angle/angle.git/+log/66aafcb4641c..5d2ccc534d26

$ git log 66aafcb46..5d2ccc534 --date=short --no-merges --format='%ad %ae %s'
2018-05-14 hans Use DefWindowProcA for window created with CreateWindowExA

Created with:
roll-dep src/third_party/angle

BUG= chromium:842408

The AutoRoll server is located here: https://angle-chromium-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

CQ_INCLUDE_TRYBOTS=luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
TBR=ynovikov@chromium.org

Change-Id: I01da66d6a726675f2362ecc4206c1ae11578d247
Reviewed-on: https://chromium-review.googlesource.com/1057357
Commit-Queue: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Reviewed-by: angle-chromium-autoroll <angle-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#558320}

[modify] https://crrev.com/f7f609d28c4d78e68712a710c66e9e4d45eef1fe/DEPS
,
May 14 2018
,
May 14 2018
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/win_optional_gpu_tests_rel/1729
Comment 1 by thakis@chromium.org
, May 12 2018