Issue metadata
Sign in to add a comment
|
Chrome: Crash Report - ThreadWatcherList::ParseCommandLineCrashOnHangThreads |
||||||||||||||||||||
Issue descriptionProduct name: Chrome Magic Signature: ThreadWatcherList::ParseCommandLineCrashOnHangThreads Current link: https://crash.corp.google.com/browse?q=product.name%3D'Chrome'%20AND%20product.version%3D'56.0.2924.0'%20AND%20custom_data.ChromeCrashProto.ptype%3D'browser'%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D'ThreadWatcherList%3A%3AParseCommandLineCrashOnHangThreads'&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#reports Search properties: product.name: Chrome product.version: 56.0.2924.0 custom_data.chromecrashproto.ptype: browser Stack trace: ============ Thread 0 CRASHED [EXCEPTION_ILLEGAL_INSTRUCTION @ 0x0fb2f720 ] MAGIC SIGNATURE THREAD Stack Quality100%Show frame trust levels 0x0fb2f720 (chrome.dll -math.h:642 ) ceilf 0x113e4a48 (chrome.dll -thread_watcher.cc:523 ) ThreadWatcherList::ParseCommandLineCrashOnHangThreads(std::basic_string<char,std::char_traits<char>,std::allocator<char> > const &,unsigned int,unsigned int,std::map<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,ThreadWatcherList::CrashDataThresholds,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::pair<std::basic_string<char,std::char_traits<char>,std::allocator<char> > const ,ThreadWatcherList::CrashDataThresholds> > > *) 0x113e47f8 (chrome.dll -thread_watcher.cc:494 ) ThreadWatcherList::ParseCommandLine(base::CommandLine const &,unsigned int *,std::map<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,ThreadWatcherList::CrashDataThresholds,std::less<std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<std::pair<std::basic_string<char,std::char_traits<char>,std::allocator<char> > const ,ThreadWatcherList::CrashDataThresholds> > > *) 0x113e43d3 (chrome.dll -thread_watcher.cc:367 ) ThreadWatcherList::StartWatchingAll(base::CommandLine const &) 0x1065c202 (chrome.dll -chrome_browser_main.cc:1804 ) ChromeBrowserMainParts::PreMainMessageLoopRunImpl() 0x1065afe7 (chrome.dll -chrome_browser_main.cc:1249 ) ChromeBrowserMainParts::PreMainMessageLoopRun() 0x102e83c6 (chrome.dll -browser_main_loop.cc:967 ) content::BrowserMainLoop::PreMainMessageLoopRun() 0x0fc1749e (chrome.dll -callback.h:64 ) base::internal::RunMixin<base::Callback<bool ,1,1> >::Run() 0x102e7f66 (chrome.dll -browser_main_loop.cc:852 ) content::BrowserMainLoop::CreateStartupTasks() 0x102eae14 (chrome.dll -browser_main_runner.cc:126 ) content::BrowserMainRunnerImpl::Initialize(content::MainFunctionParams const &) 0x102e6668 (chrome.dll -browser_main.cc:42 ) content::BrowserMain(content::MainFunctionParams const &) 0x10623c34 (chrome.dll -content_main_runner.cc:774 ) content::ContentMainRunnerImpl::Run() 0x0fb25d02 (chrome.dll -chrome_main.cc:108 ) ChromeMain 0x00d758e8 (chrome.exe -main_dll_loader_win.cc:174 ) MainDllLoader::Launch(HINSTANCE__ *,base::TimeTicks) 0x00d71c38 (chrome.exe -chrome_exe_main_win.cc:249 ) wWinMain 0x00ddbfc7 (chrome.exe -exe_common.inl:253 ) __scrt_common_main_seh 0x779aee6b (kernel32.dll + 0x0004ee6b ) BaseThreadInitThunk 0x77ba3ab2 (ntdll.dll + 0x00063ab2 ) __RtlUserThreadStart 0x77ba3a85 (ntdll.dll + 0x00063a85 ) _RtlUserThreadStart Link to the list of the builds: ============================== https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20custom_data.ChromeCrashProto.magic_signature_1.name%3D%27ThreadWatcherList%3A%3AParseCommandLineCrashOnHangThreads%27%20AND%20product.name%3D%27Chrome%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D
,
Nov 18 2016
https://codereview.chromium.org/2445763002 r432913 is not responsible for this crash since it was committed and reverted in the 56.0.2922.0..56.0.2924.0 range https://codereview.chromium.org/2517443002 r432989 The crash report has an illegal instruction exception. Did we change something that could have allowed this instruction in our binary? Disassembly: --- c:\b\depot_tools\win_toolchain\vs_files\d5dc33b15d1b2c086f2f6632e2fd15882f80dbd3\win_sdk\include\10.0.10586.0\ucrt\math.h 5D9CF71A push ebp 5D9CF71B mov ebp,esp 5D9CF71D sub esp,0Ch 5D9CF720 vmovss xmm0,dword ptr [_X] <------ Illegal instruction error 5D9CF725 vcvtss2sd xmm0,xmm0,xmm0 5D9CF729 vmovsd qword ptr [ebp-0Ch],xmm0
,
Nov 18 2016
math.h:642 looks pretty innocent
_Check_return_ __inline float __CRTDECL ceilf(_In_ float _X)
{
return (float)ceil(_X);
}
Disk corruption?
,
Nov 18 2016
Chrome 56.0.2924.1 Looks like this: 0:000> u 1010769d chrome!ceilf: 1010769d 55 push ebp 1010769e 8bec mov ebp,esp 101076a0 83e4c0 and esp,0FFFFFFC0h 101076a3 83ec40 sub esp,40h 101076a6 f30f104508 movss xmm0,dword ptr [ebp+8] 101076ab 83ec08 sub esp,8 101076ae 0f5ac0 cvtps2pd xmm0,xmm0 101076b1 f20f11442440 movsd mmword ptr [esp+40h],xmm0 The crash looks like this: 0:000> u 6159f71a chrome_614a0000!ceilf 6159f71a 55 push ebp 6159f71b 8bec mov ebp,esp 6159f71d 83ec0c sub esp,0Ch 6159f720 c5fa104508 vmovss xmm0,dword ptr [ebp+8] 6159f725 c5fa5ac0 vcvtss2sd xmm0,xmm0,xmm0 6159f729 c5fb1145f4 vmovsd qword ptr [ebp-0Ch],xmm0 6159f72e dd45f4 fld qword ptr [ebp-0Ch] 6159f731 51 push ecx Strange change of instructions for the same code.
,
Nov 18 2016
Dropping Internals > TaskScheduler since it's not really a task scheduler bug. Adding brucedawson@ who may have seen something like this before.
,
Nov 18 2016
I just talked to robliao@/fdoray@ about this. The problem is almost certainly an ODR-type violation caused by a failure to inline an inline function that is called by translation units compiled normally and compiled with /AVX (or some variant). Two or more versions of _ceilf are generated (some using AVX instructions, some not) and the linker is free to choose whichever ones it wants. This means that non-AVX code paths end up using AVX, and crashes happen. As far as I can tell it is almost impossible to safely build part of a binary with /AVX. This issue was hit elsewhere in Chrome some weeks ago and the VC++ team helped explain the issue.
,
Nov 18 2016
This looks like a duplicate (albeit perhaps a more serious variant) of crbug.com/654213 . For context, these links were helpful in understanding the issue, including a Connect bug which Microsoft commented on. 1. Reproduction example: https://bitbucket.org/chromiumembedded/cef/issues/attachments/1999/chromiumembedded/cef/1474399239.3/1999/vs2015u3-ltcg-bug-1.zip 2. Quick analysis: https://bitbucket.org/chromiumembedded/cef/issues/1999#comment-30763602 3. Microsoft Connect entry: https://connect.microsoft.com/VisualStudio/feedback/details/3103806 This all assumes that vmovss is an AVX or AVX2 instruction which I should probably double check...
,
Nov 18 2016
Yup. vmovss is an AVX instruction. VMOVSS xmm1, m32 (Opcode: VEX.LIG.F3.0F.WIG 10 /r) CPUID Feature Flag: AVX
,
Nov 18 2016
vmovss is an AVX instruction. (Anything with a v-prefix is.) We've wrapped all SkNx types in anonymous namespaces, and force-inlined all methods. They should be safe now both by law (the namespace) and doubly so in practice (force-inlining). Is the problem here that math.h does not mark ceilf, etc as static?
,
Nov 18 2016
Issue 666722 has been merged into this issue.
,
Nov 18 2016
The problem is that if ceilf is ever called by a translation unit that is compiled with /AVX then we are potentially doomed. If the compiler decides not to inline ceilf then non-inline code will be generated and at link time the linker gets to choose between all non-inlined versions of ceilf. So, it is not enough to use anonymous namespaces or forceinline in skia code. This has to be applied to *all* code that is included in /AVX translation units. Uggh.
,
Nov 18 2016
Unsurprisngly... 0:000> !cpuid CP F/M/S Manufacturer MHza 0 6,23,10 GenuineIntel 2000 1 6,23,10 GenuineIntel 2000 From what I can gather from the web, this is a Pentium(R) Dual-Core CPU T4300 @ 2.10GHz Penryn Core. This predates AVX.
,
Nov 18 2016
This problem goes away if the functions are static. They're no longer the same ceilf, and the linker has no choices to make. Every function that's defined in a header needs to be marked static, or you risk ODR violation. Here it's just easy because the AVX instructions are so different, but it's easy to construct if the code differs depending on a preprocessor define or some other compile-time bit. Do you happen to have the definition of ceilf from this math.h handy? Is it marked static?
,
Nov 18 2016
Oh, right, it's up there in comment 3. No static. That's the bug.
,
Nov 18 2016
Here is the VC++ definition of ceilf:
_Check_return_ __inline float __CRTDECL ceilf(_In_ float _X)
{
return (float)ceil(_X);
}
Defining all header-file functions as static would indeed avoid this, but is that really standard practice? Absent a requirement to do this I think Microsoft would (and in fact has) argued that the mistake is ours.
Practically speaking we will need to come up with a fix other than marking ceilf as static, but if there is a C++ standard requirement or recommendation that inline functions be static then that could affect future VC++ versions.
,
Nov 18 2016
Yes, as far as I know, all functions that GCC and Clang define in headers are static. I presume it's for this reason, but we might want to get someone who can speak with more authority there. Here's what I'm thinking, first in Skia, and then if that goes OK, similarly in Chromium: https://skia-review.googlesource.com/c/5005/
,
Nov 18 2016
robliao@, Would it be possible to revert the change ASAP and merge it to 2924? we have some other hiccups with qualifying a good M56 dev candidate and we are planning to trigger one from 2922 after the revert. Appreciate your help!
,
Nov 18 2016
,
Nov 18 2016
Typo: we are planning to trigger one from 2924 after the revert.
,
Nov 18 2016
I don't think reverting any CL mentioned on this thread will help anything.
,
Nov 18 2016
I agree with mtklein@ that reverting is unlikely to help. The change listed in comment #16 is step one towards resolving the crashes, although note that it is probably part one of two.
,
Nov 18 2016
The following revision refers to this bug: https://skia.googlesource.com/skia.git/+/c89e2438ae24c862087f8ca1cb076052edbb27ea commit c89e2438ae24c862087f8ca1cb076052edbb27ea Author: Mike Klein <mtklein@chromium.org> Date: Fri Nov 18 17:55:21 2016 Turn off /arch:AVX[2] on Windows builds. This canaries a similar change Chrome may need. BUG= chromium:666707 GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5005 Change-Id: Ibf7f9941968d905d865b9be1e63ebbf768870175 Reviewed-on: https://skia-review.googlesource.com/5005 Reviewed-by: Mike Klein <mtklein@chromium.org> Commit-Queue: Mike Klein <mtklein@chromium.org> [modify] https://crrev.com/c89e2438ae24c862087f8ca1cb076052edbb27ea/BUILD.gn [modify] https://crrev.com/c89e2438ae24c862087f8ca1cb076052edbb27ea/gn/BUILD.gn
,
Nov 18 2016
Just so we're all clear, the change landed in #22 doesn't affect Chrome. That GN build is completely separate from Chrome's. I just wanted to land this first to test the effect on Skia-in-isolation before we change Skia-in-Chrome. If that seems sane on our tree, we will want to make an analogous change to src/skia/BUILD.gn. I imagine we want to remove all /arch:AVX and /arch:AVX2 flags in Chrome, not just Skia's right? Are there others beyond libvpx? Do you know if they can function without /arch:AVX?
,
Nov 18 2016
Agreed. with analysis on revert. The root cause problem here is the use of an AVX instruction on a non AVX capable CPU. mtklein: What's the typical timeframe for Skia to know this change is okay?
,
Nov 18 2016
skia, libvpx, and possibly pdfium (although that is just a skia\BUILD.gn file in pdfium and is probably not relevant).
,
Nov 18 2016
Probably an hour or so, but I can start the Chromium CQ in parallel. Right, I think the pdfium you're seeing is for its standalone build, not as built by Chrome.
,
Nov 18 2016
Sounds good. I wonder if there will be a non-trivial performance impact to removing the AVX flags (not that we can do anything about it in one day). Reassigning to mtklein in the meantime since he's doing the legwork already.
,
Nov 18 2016
Yeah, very likely there's a performance impact here, but worst case we can go back to maxing out at SSE4.1 on Windows and just use AVX+ on Mac, Linux, etc. It's sort of moot to worry about speed if it's gonna crash. clang-cl may or may not have this problem (depending on the exact libraries it uses), so anything we undo here we can revisit if we start seriously using Clang on Windows.
,
Nov 18 2016
The Skia tree's looking pretty happy (https://status.skia.org/?commitLabel=author&commit_label=author&filter=search&search_value=Win) so I'm going to let https://codereview.chromium.org/2518473003 land. Bruce has gamely volunteered to handle the equivalent changes for libvpx.
,
Nov 18 2016
Issue 666708 has been merged into this issue.
,
Nov 18 2016
A change to disable /arch:AVX and /arch:AVX2 in libvpx\BUILD.gn is going through the try bots - crrev.com/2516703002. A quick glance suggests that most of the code uses intrinsics so we may not lose much performance. On the other hand, I think that switching between AVX and SSE without vzeroupper/vzeroall can harm performance, so it's hard to know what will happen. https://software.intel.com/en-us/articles/intel-avx-state-transitions-migrating-sse-code-to-avx
,
Nov 18 2016
Yeah, the SSE <-> AVX transition is what had me suspecting we'll see a perf hit from this in Skia. We've seen this throttle performance to a degree that we had to entirely redesign the system we're building to stay in AVX all the time, even if it otherwise would be possible to call some shared SSE code instead. To be fair, it's a whole lot worse if you _don't_ vzerouppper... Apparently this is not so bad since Skylake, but I don't have a machine to confirm that. I think I'm gonna have to link to crrev.com/2516703002 if this ever comes up again... that describes the problem as I understand it perfectly.
,
Nov 18 2016
The really fun thing here is, I don't think ceilf() ever has to be called from the AVX code in order for it to break non-AVX code. All we need is the AVX compiled source to include the math.h with the non-static ceilf()... at that point it's the same as if the AVX source file defined ceilf() itself. Whether it's otherwise used by code in that file is moot, right?
,
Nov 18 2016
With the VC++ toolchain that concern (ceilf being emitted even if not called) was real, but with VC++ 2013 and beyond as long as you specify "/Zc:inline" (which we do) then unreferenced COMDATs (functions, for our purposes here) are discarded from .obj files.
,
Nov 18 2016
Ah, good. I read "Removes unreferenced functions or data that are COMDATs or only have internal linkage." and thought, "okay, these are not internal linkage... I wonder what a COMDAT is".
,
Nov 18 2016
Ah, so, we may need to figure out clang-cl now instead of later. https://codereview.chromium.org/2518473003 clang-cl wants /arch:AVX or -mavx to use intrinsics. Any idea which math.h it uses? MSVC's? Its own?
,
Nov 18 2016
Adding thakis@ for clang-cl perspective. Temporary fix would be to pass /arch:AVX when is_clang == true. We may get no crashes, or we may get crashes when clang-cl binaries are run on non-AVX machines. Either way it's a huge improvement. I am fairly certain that clang-cl uses Microsoft's header files, including math.h. But the lack of LTCG and other differences means that the bug appearing in VC++ does not imply that it will appear in clang-cl builds.
,
Nov 18 2016
I just used /showIncludes on a clang-cl build to confirm that math.h comes from Microsoft in clang-cl builds: including file: c:\src\depot_tools\win_toolchain\vs_files\d5dc33b15d1b2c086f2f6632e2fd15882f80dbd3\win_sdk\bin\..\..\win_sdk\Include\10.0.10586.0\ucrt\math.h
,
Nov 18 2016
Yeah, so whether or not the bug bites us, it's still there with clang-cl. I'll update my CL to turn AVX/AVX2 back on when is_clang, but I'm not looking forward to the conversation we're going to have a few months. Seems like the only way to work around this problem with clang-cl is to use a different libc or to turn off AVX entirely.
,
Nov 18 2016
Agreed.
,
Nov 18 2016
clang-cl uses msvc's headers for most stuff; it has a handful of built-in headers (https://github.com/llvm-mirror/clang/tree/master/lib/Headers – many of them do some stuff and then forward to the ms headers though). Can someone summarize what you want to do? What exactly is the clang-cl question here? Passing different flags for cl and clang-cl is in general a bad idea. re inline functions: This is different in C and C++. If your header should be used from c and c++ files I think you do need to make your functions inline static; I believe C++ guarantees that inline functions are identical in all TUs. I'm not sure how this is supposed to interact with different /arch settings if a .h is built in different TUs. Maybe this means inline functions must not take advantage of /arch flags?
,
Nov 18 2016
From reading the bug, it sounds like skia has some files that use avx intrinsics. You build those with /arch:AVX, and then the compiler figures it can use that for some function from math.h, and then the linker picks the AVX'd function from math.h when it merges inline functions. clang-cl requires /arch:avx to be able to use avx intrinsics. Hm. Not sure what the best fix is, but to get things not crashy again, I'd suggest not building the avx parts of skia for now (I gather they're new, else we would've had this crash for longer), and then figure out what to do.
,
Nov 18 2016
That's exactly the problem we're seeing here. The same non-static inline ceilf() from urcrt\math.h is compiled differently in two translation units, one with VEX-encoded vmovss and one with non-VEX-encoded movss. We link those together and get one of the two arbitrarily. Things crash on older machines if we happened to pick the VEX encoded one. I think the distinction here is, C++ doesn't guarantee that inline functions are identical in all TUs... it requires it. It's very easy to violate that requirement by compiling an inline function with two different /arch settings. It's not hard, though, to do the same with the preprocessor, e.g. checking #ifdef NDEBUG.
,
Nov 18 2016
Well, if you have different definitions of an inline in two TUs that's an ODR violation. What's interesting here is that the different definition arises because of a compiler flag, and because the function is in a system header...
,
Nov 18 2016
Yeah, exactly. Ordinarily these things are my fault and I can fix them directly with more static or an anonymous namespace... in this case, I claim the bug is that ceilf() is defined inline in ucrt\math.h but not marked static. I don't know how to fix that.
,
Nov 18 2016
Yep, ODR violation is the basic problem. The question is *whose fault is it*? Is it our fault for compiling ceilf with two different /arch: settings? Or Microsoft's fault for not marking ceilf as static? Given the definition of ceilf the /arch:AVX flag seems almost useless. Some of Microsoft's thoughts on this are here: https://connect.microsoft.com/VisualStudio/feedback/details/3103806
,
Nov 18 2016
If the clang-cl concern about was wrt /Zc:inline: Are you sure clang-cl doesn't discard unreferenced comdats? I wouldn't thought we do (and if not, that seems like a bug?)
,
Nov 18 2016
https://connect.microsoft.com/VisualStudio/feedback/details/3103806 doesn't mention _ceilf and that this happens in system headers, so it seems somewhat tangential. To me it feels like a MS toolchain bug if compiling to distinct TUs (with distinct contents) with different /arch settings and linking them together breaks your program. But given that system headers currently contain non-static inline functions, I guess it's not currently safe to use different /arch flags on different TUs if those TUs include system headers.
,
Nov 18 2016
The basic clang-cl concern is, unlike MSVC, we have no way to express "we'll use AVX intrinsics, but you compiler please stay away from AVX" as a sort of halfway fix. Under that strategy, ceilf() would be compiled always using the same settings, and no ODR violation happens, but we can still use AVX in files that are specifically guarded by namespaces and runtime CPU checks. With clang-cl, we either pass -mavx, and both we and the compiler can use AVX, or not, and neither can.
,
Nov 18 2016
namespace {
#include <math.h>
}
:-/
,
Nov 18 2016
That'll break anything that's _not_ inline, won't it?
,
Nov 18 2016
Ah, yeah. #define __inline static __inline ?
,
Nov 18 2016
Yikes. Nothing here is sounding particularly robust. I think I'd better mention the alternative we're all thinking but no one is nuts enough to really suggest, which is to rewrite all the Windows libc headers. #define __inline static __inline is kind of a dynamic version of that that works when you remember to do it... I think what I'd like to do for now is disable our Skia's AVX parts on Windows with some stubs (CL incoming shortly) to buy me some time to breathe and ponder.
,
Nov 18 2016
I think the Real Fix (tm) is to report that to Microsoft and have them add static to their inlines in their headers. Then again, `find ~/src/depot_tools/win_toolchain/vs_files/d5dc33b15d1b2c086f2f6632e2fd15882f80dbd3/ -name '*.h' | xargs grep inline | grep -v static` prints 5837 lines that are in need of auditing. I agree that none of the suggestions are very robust. Another non-robust thing from the msconnect link Bruce linked to: It sounds like the linker picks the inline function it finds first, so making sure a sse-built TU including <math.h> is in front of the avx-built ones on the link command might also work. As said upbug, it sounds like the most robust way to do this is to not include system headers in the files that want to use avx.
,
Nov 18 2016
Agreed on all points. I'll start looking at avoiding system headers Monday.
,
Nov 18 2016
,
Nov 18 2016
Users experienced this crash on the following builds: Win Canary 56.0.2924.0 - 491.54 CPM, 1973 reports, 1124 clients (signature ThreadWatcherList::ParseCommandLineCrashOnHangThreads) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 18 2016
Users experienced this crash on the following builds: Win Canary 56.0.2924.0 - 57.05 CPM, 229 reports, 142 clients (signature gfx::ScaleToEnclosingRect) Win Canary 56.0.2924.0 - 12.46 CPM, 50 reports, 35 clients (signature `anonymous namespace'::ScaleOffset) Win Canary 56.0.2924.0 - 117.84 CPM, 473 reports, 280 clients (signature gfx::ToRoundedInt) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 18 2016
Users experienced this crash on the following builds: Win Canary 56.0.2924.0 - 21.92 CPM, 88 reports, 79 clients (signature skia::`anonymous namespace'::ResizeFilter::ComputeFilters) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 18 2016
The following revision refers to this bug: https://skia.googlesource.com/skia.git/+/17b6e487b71b7ed541d3239f194e121fbe8efd1d commit 17b6e487b71b7ed541d3239f194e121fbe8efd1d Author: Mike Klein <mtklein@chromium.org> Date: Fri Nov 18 22:11:41 2016 Revert "Turn off /arch:AVX[2] on Windows builds." This reverts commit c89e2438ae24c862087f8ca1cb076052edbb27ea. Reason for revert: I'm going to stub this code out in Chrome instead for now. Chrome's not going to land something that looks like this, so I'd rather undo it than leave Skia in this odd state. Original change's description: > Turn off /arch:AVX[2] on Windows builds. > > This canaries a similar change Chrome may need. > > BUG= chromium:666707 > > GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5005 > > Change-Id: Ibf7f9941968d905d865b9be1e63ebbf768870175 > Reviewed-on: https://skia-review.googlesource.com/5005 > Reviewed-by: Mike Klein <mtklein@chromium.org> > Commit-Queue: Mike Klein <mtklein@chromium.org> > TBR=mtklein@chromium.org,brucedawson@chromium.org,reviews@skia.org NOPRESUBMIT=true NOTREECHECKS=true NOTRY=true Change-Id: I9562f7877749c61f6ba4d48d6c4b557f09876128 Reviewed-on: https://skia-review.googlesource.com/5069 Commit-Queue: Mike Klein <mtklein@chromium.org> Reviewed-by: Mike Klein <mtklein@chromium.org> [modify] https://crrev.com/17b6e487b71b7ed541d3239f194e121fbe8efd1d/BUILD.gn [modify] https://crrev.com/17b6e487b71b7ed541d3239f194e121fbe8efd1d/gn/BUILD.gn
,
Nov 19 2016
Yeah, subtarget flags are kind of a huge problem for the C++ compilation model. Going forward, I think we'll want to add and use attributes to enable subtarget features without enabling them in headers. The usage model would look something like:
// foo.cpp
#include "foo.h"
#include "bar.h"
#include "immintrin.h"
#pragma enable_awesome_avx512_features
void doit() { ... }
// foo.h
#include "bar.h"
#include "immintrin.h"
#pragma push enable_awesome_avx512_features
inline void dothat() { ... }
inline void dothose() { ... }
#pragma pop enable_awesome_avx512_features
I don't know if the VC folks have any plans to add these kinds of features, but this is definitely something we want to do in clang. It's already sort of available with __attribute__((target)).
,
Nov 19 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2ea76e2274404af7e53cc4bb5d212b22856ee455 commit 2ea76e2274404af7e53cc4bb5d212b22856ee455 Author: mtklein <mtklein@chromium.org> Date: Sat Nov 19 07:08:59 2016 Stub out Skia AVX and HSW opts. This is hopefully temporary while we mull better options. BUG= 666707 Review-Url: https://codereview.chromium.org/2512093003 Cr-Commit-Position: refs/heads/master@{#433407} [modify] https://crrev.com/2ea76e2274404af7e53cc4bb5d212b22856ee455/skia/BUILD.gn [add] https://crrev.com/2ea76e2274404af7e53cc4bb5d212b22856ee455/skia/ext/SkOpts_avx_stub.cc [add] https://crrev.com/2ea76e2274404af7e53cc4bb5d212b22856ee455/skia/ext/SkOpts_hsw_stub.cc
,
Nov 19 2016
Just to confirm that this bug seems to be fixed (once again ref: Issue 654213 ). Update Version 57.0.2925.0 canary (64-bit) on Win 67 64 Bit with Intel DX58SO2 MB + I7 990-x (see CPU information in attachments.
,
Nov 19 2016
Users experienced this crash on the following builds: Win Canary 57.0.2925.0 - 21.29 CPM, 98 reports, 86 clients (signature skia::`anonymous namespace'::ResizeFilter::ComputeFilters) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 19 2016
Users experienced this crash on the following builds: Win Canary 57.0.2925.0 - 380.53 CPM, 1752 reports, 1050 clients (signature ThreadWatcherList::ParseCommandLineCrashOnHangThreads) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 19 2016
Users experienced this crash on the following builds: Win Canary 57.0.2925.0 - 92.96 CPM, 428 reports, 267 clients (signature gfx::ToRoundedInt) Win Canary 57.0.2925.0 - 7.82 CPM, 36 reports, 28 clients (signature `anonymous namespace'::ScaleOffset) Win Canary 57.0.2925.0 - 46.70 CPM, 215 reports, 113 clients (signature gfx::ScaleToEnclosingRect) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Nov 20 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8aa53c9b50de7f662697e64c1fe590d59ce6edcc commit 8aa53c9b50de7f662697e64c1fe590d59ce6edcc Author: skia-deps-roller <skia-deps-roller@chromium.org> Date: Sun Nov 20 16:11:06 2016 Roll src/third_party/skia/ eaef61537..792d0f13d (18 commits). https://skia.googlesource.com/skia.git/+log/eaef615377bc..792d0f13d6cb $ git log eaef61537..792d0f13d --date=short --no-merges --format='%ad %ae %s' 2016-11-20 egdaniel Revert "switched skslc from std::string to SkString" 2016-11-20 egdaniel Revert "fixed iOS build failure" 2016-11-18 krasin Avoid unnecessary cast on a garbage data. 2016-11-18 mar.kazmierczak Fix typo in GrGLCaps 2016-11-18 reed android does not need XFERMODE_PUBLIC flag 2016-11-18 mtklein Revert "Turn off /arch:AVX[2] on Windows builds." 2016-11-18 mtklein mirror tiling 2016-11-18 bsalomon Make GrSwizzle::GrSwizzle() constexpr 2016-11-18 bsalomon Remove unnecessary attribute and varying type modifiers 2016-11-18 mtklein Build fiddle and public_headers_warnings_check only when skia_enable_tools. 2016-11-18 brianosman VS SLN script: Automatically determine which folder to use/copy 2016-11-18 mtklein Turn off /arch:AVX[2] on Windows builds. 2016-11-18 mtklein update G3 build after crrev.com/2500113004 2016-11-18 bsalomon Make GrBufferAccess a nested class of GrProcessor 2016-11-18 liyuqian Add test for QuickFDot6Div 2016-11-18 ethannicholas fixed iOS build failure 2016-11-17 ethannicholas switched skslc from std::string to SkString 2016-11-18 robertphillips Add handling for instantiate failure up the call stack BUG= 666707 , 665681 , 665500 , 665621 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel TBR=egdaniel@google.com Review-Url: https://codereview.chromium.org/2516183002 Cr-Commit-Position: refs/heads/master@{#433454} [modify] https://crrev.com/8aa53c9b50de7f662697e64c1fe590d59ce6edcc/DEPS
,
Nov 21 2016
You can ignore #67. Nothing interesting there except me putting things back to Full Dangerous for local Skia testing, to help me track down the problem outside Chrome. The "Stub out ..." CL from #62 should be making its way out in 57.0.2926.0. So we should expect to see crashes on builds cut before that.
,
Nov 21 2016
,
Nov 21 2016
Issue 666934 has been merged into this issue.
,
Nov 21 2016
Can we merge the fixes into M56 (build 2924)? It looks like the crash is no longer happening in 57.0.2926.0, which goes along with #68. This is currently blocking the Dev release.
,
Nov 22 2016
,
Nov 22 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bc63dc10b5972528108c2a5eafde9f2919e1a087 commit bc63dc10b5972528108c2a5eafde9f2919e1a087 Author: Bruce Dawson <brucedawson@chromium.org> Date: Tue Nov 22 00:45:00 2016 Stub out Skia AVX and HSW opts. This is hopefully temporary while we mull better options. BUG= 666707 Review-Url: https://codereview.chromium.org/2512093003 Cr-Commit-Position: refs/heads/master@{#433407} (cherry picked from commit 2ea76e2274404af7e53cc4bb5d212b22856ee455) Review URL: https://codereview.chromium.org/2519083004 . Cr-Commit-Position: refs/branch-heads/2924@{#47} Cr-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059} [modify] https://crrev.com/bc63dc10b5972528108c2a5eafde9f2919e1a087/skia/BUILD.gn [add] https://crrev.com/bc63dc10b5972528108c2a5eafde9f2919e1a087/skia/ext/SkOpts_avx_stub.cc [add] https://crrev.com/bc63dc10b5972528108c2a5eafde9f2919e1a087/skia/ext/SkOpts_hsw_stub.cc
,
Nov 22 2016
The CL is merged to M56, removing the blocker label.We will verify once the build is deployed to Dev channel.
,
Nov 28 2016
The following revision refers to this bug: https://skia.googlesource.com/skia.git/+/e9f78b41c65a78400c22a1b79007b1b9e187a745 commit e9f78b41c65a78400c22a1b79007b1b9e187a745 Author: Mike Klein <mtklein@chromium.org> Date: Tue Nov 22 13:57:45 2016 Guard against buggy ucrt\math.h. BUG= 666707 GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=5089 CQ_INCLUDE_TRYBOTS=master.client.skia:Test-Ubuntu-GCC-GCE-CPU-AVX2-x86_64-Release-SKNX_NO_SIMD-Trybot Change-Id: I3bebfdf635d541d92fb84236f0f6fae2da39d691 Reviewed-on: https://skia-review.googlesource.com/5089 Reviewed-by: Bruce Dawson <brucedawson@google.com> Reviewed-by: Herb Derby <herb@google.com> Commit-Queue: Mike Klein <mtklein@chromium.org> [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/include/private/SkFixed.h [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/include/private/SkFloatBits.h [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/include/private/SkFloatingPoint.h [add] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/include/private/SkSafe_math.h [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/src/core/SkNx.h [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/src/opts/SkOpts_avx.cpp [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/src/opts/SkOpts_hsw.cpp [modify] https://crrev.com/e9f78b41c65a78400c22a1b79007b1b9e187a745/src/pathops/SkPathOpsTypes.h
,
Nov 29 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b4ee4bffa0c144adfb16968ca3645bc624e938db commit b4ee4bffa0c144adfb16968ca3645bc624e938db Author: skia-deps-roller <skia-deps-roller@chromium.org> Date: Tue Nov 29 04:21:16 2016 Roll src/third_party/skia/ d5de01364..99ab92b59 (22 commits). https://skia.googlesource.com/skia.git/+log/d5de01364378..99ab92b5958a $ git log d5de01364..99ab92b59 --date=short --no-merges --format='%ad %ae %s' 2016-11-28 raftias Moved A2B0 profile parsing before XYZ 2016-11-28 mtklein Consistent naming. 2016-11-28 reed use raster-pipeline in readPixels 2016-11-28 ethannicholas added support for layout(offset=...) to skslc 2016-11-28 benjaminwagner Merge changes from internal cl/140385880. 2016-11-28 mtklein simplify 2016-11-28 liyuqian Revert "Add the missing shift to the dy" 2016-11-28 mtklein Convert blitter over to new style from_srgb, to_srgb. 2016-11-28 lsalzman use __BYTE_ORDER__ macro to detect endianness when available 2016-11-28 brianosman Narrow the SkImageGenerator interface 2016-11-28 bsalomon Remove old driver bug workaround for glTexStorage. 2016-11-28 borenet Roll recipe DEPS 2016-11-28 ethannicholas unified ASTLayout/Layout and ASTModifiers/Modifiers 2016-11-28 ethannicholas removed textureProj() and legacy texture functions from sksl 2016-11-28 reed simplify SkConfig8888 logic: just fall-through if memcpy case isn't supported 2016-11-28 mtklein Split srgb out of accum stages. 2016-11-28 raftias Fuzzer fix for overflow in some Lut8 profiles. 2016-11-28 mtklein Fix unpremul stage. 2016-11-28 liyuqian Add the missing shift to the dy 2016-11-28 brianosman GrTextureProducer cleanup, phase two: Producer, Adjuster, Maker 2016-11-22 mtklein Guard against buggy ucrt\math.h. 2016-11-22 ethannicholas baked in a few more precision modifiers BUG= 668784 , 668907 , 666707 Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel TBR=brianosman@google.com Review-Url: https://codereview.chromium.org/2532083003 Cr-Commit-Position: refs/heads/master@{#434889} [modify] https://crrev.com/b4ee4bffa0c144adfb16968ca3645bc624e938db/DEPS
,
Nov 29 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0dcd8a3ef5f7cdcdd556aeeed1db09e86c3e5ae5 commit 0dcd8a3ef5f7cdcdd556aeeed1db09e86c3e5ae5 Author: mtklein <mtklein@chromium.org> Date: Tue Nov 29 14:24:25 2016 Revert of Stub out Skia AVX and HSW opts. (patchset #1 id:1 of https://codereview.chromium.org/2512093003/ ) Reason for revert: Now that we're guarding against ucrt\math.h in Skia and that's rolled into Chromium, it's as good a time as ever to try turning back on Skia's AVX+ code at head. (I intend to just leave it stubbed out like this in M56.) Original issue's description: > Stub out Skia AVX and HSW opts. > > This is hopefully temporary while we mull better options. > > BUG= 666707 > > Committed: https://crrev.com/2ea76e2274404af7e53cc4bb5d212b22856ee455 > Cr-Commit-Position: refs/heads/master@{#433407} TBR=brucedawson@chromium.org,thakis@chromium.org # Not skipping CQ checks because original CL landed more than 1 days ago. BUG= 666707 Review-Url: https://codereview.chromium.org/2539453003 Cr-Commit-Position: refs/heads/master@{#434983} [modify] https://crrev.com/0dcd8a3ef5f7cdcdd556aeeed1db09e86c3e5ae5/skia/BUILD.gn [delete] https://crrev.com/cea883202f2e33a6b48b643b3d57221a2ef6187e/skia/ext/SkOpts_avx_stub.cc [delete] https://crrev.com/cea883202f2e33a6b48b643b3d57221a2ef6187e/skia/ext/SkOpts_hsw_stub.cc
,
Nov 30 2016
Looks like 57.0.2937.0 has the stub revert from #77.
,
Dec 1 2016
It's probably too early to tell anything about 57.0.2937.0 which only has 120 crashes ATM. It looks like this issue is well understood so I'm removing the sheriff label. FWIW the original crash signature last appeared in 57.0.2925.0 and 56.0.2924.0. I have been looking at illegal instruction crash counts by version and CPU [1]. There was only one in 57.0.2926.0 (against 1526 crashes total). [1] https://crash.corp.google.com/dremel_query_ui?q=SELECT%20product.version%2C%20cpu.Info%2C%20COUNT(*)%0AFROM%20crash.prod.latest%0AWHERE%20product.name%3D%27Chrome%27%20AND%20custom_data.ChromeCrashProto.ptype%3D%27browser%27%20AND%20crash.Reason%3D%27EXCEPTION_ILLEGAL_INSTRUCTION%27%0AAND%20product.version%20%3C%20%276%27%20AND%20product.version%20%3E%20%2756%27%0AGROUP%20BY%201%2C%202%20ORDER%20BY%201%20DESC%2C%202
,
Dec 1 2016
FWIW, template functions behave like inline functions, and you can have the same problem with (say) std::min(). Requiring those to be static is probably somewhere between unreasonably and non-standards-conformant :-(
,
Dec 1 2016
Interesting. I'd have thought std::min and std::max would be static too. It's hard for me to imagine a good reason that anything, function or data, defined in a header is non-static, especially with ICF. In any case, we typically don't use std::min or std::max, especially in this code. There we're more likely to be using intrinsics like _mm_min_ps, _mm256_max_ps, etc. I think I'll just give up on MSVC for good if those intrinsics ever start giving us ODR problems...
,
Dec 1 2016
But std::min() would be a problem for non-MSVC too, right?
,
Dec 1 2016
Yes, anything's a problem if it's defined in a header and not at least one of static, force-inlined, or in an anonymous namespace. I may not have been clear about what I was going on about with intrinsics. Clang's x86 intrinsics are all static and force-inlined, doubly safe. GCC's are extern and force-inlined, which I think is safe but worrying. Don't know about MSVC's.
,
Dec 2 2016
Feels fixed right? I'm not seeing anything worrying in crash for 2937+.
,
Dec 5 2016
I filed https://connect.microsoft.com/VisualStudio/feedback/details/3114220 to request that the functions in math.h be marked as static. Regarding comment #81, I would agree that std::min and std::max also need the treatment, and probably many other functions as well. I would not expect intrinsics to be a problem because they aren't "really" functions.
,
Jan 12 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/9f158899c5bec9e547cbcc49aff6f087620f6129 commit 9f158899c5bec9e547cbcc49aff6f087620f6129 Author: skia-deps-roller <skia-deps-roller@chromium.org> Date: Thu Jan 12 16:45:36 2017 Roll src/third_party/skia/ 0c2997b6d..d32303e72 (2 commits). https://skia.googlesource.com/skia.git/+log/0c2997b6d85e..d32303e7276b $ git log 0c2997b6d..d32303e72 --date=short --no-merges --format='%ad %ae %s' 2017-01-10 mtklein disable runtime detected AVX2 raster pipelines 2017-01-12 djsollen Remove defunct include/images directory from GN. BUG=679147, 654213 ,664864, 666707 ,etc. Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md If the roll is causing failures, see: http://www.chromium.org/developers/tree-sheriffs/sheriff-details-chromium#TOC-Failures-due-to-DEPS-rolls CQ_INCLUDE_TRYBOTS=master.tryserver.blink:linux_trusty_blink_rel TBR=kjlubick@google.com Review-Url: https://codereview.chromium.org/2633483002 Cr-Commit-Position: refs/heads/master@{#443259} [modify] https://crrev.com/9f158899c5bec9e547cbcc49aff6f087620f6129/DEPS
,
Jan 31 2017
Microsoft has officially said that they will *not* be marking inline functions as static. Their closing comment was: "The Universal CRT team has discussed this and will not be declaring the functions in math.h as static inline. The C standard is clear that all library functions should have external linkage. In order to solve this scenario, we recommend using a separate binary for /arch:AVX compiled code." https://connect.microsoft.com/VisualStudio/feedback/details/3114220 FWIW
,
Jan 31 2017
Bummer. |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by ajha@chromium.org
, Nov 18 2016Components: Internals>TaskScheduler
Labels: ReleaseBlock-Dev Stability-Sheriff-Desktop
Owner: robliao@chromium.org
Status: Assigned (was: Untriaged)