CrElementsToggleTest.All triggers Blink DCHECK failure (and then times out on Windows 10) |
||||||||||||||
Issue descriptionHello, I'm not sure if this is infra, or should be a sheriff issue. Feel free to triage and reassign. All my CLs are blocked on a consistently failing CQ test on the win10 trybot: https://build.chromium.org/p/tryserver.chromium.win/builders/win10_chromium_x64_rel_ng (Example CL that can't land due to this consistent test failure: https://chromium-review.googlesource.com/c/chromium/src/+/691555) Thanks much!
,
Oct 2 2017
(Sheriff Here) Test looks like it's been timing out on win 10 for a while now: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=CrElementsToggleTest.All I'll disable, and passing over to webui team for triage.
,
Oct 2 2017
The failure looks like an internal Blink error, pasting below in case the links stop working, [2656:2864:1001/015702.231:FATAL:TransformationMatrix.h(514)] Check failed: (reinterpret_cast<uintptr_t>(this) & (__alignof(TransformationMatrix) - 1)) == 0UL (8 vs. 0) Backtrace: base::debug::StackTrace::StackTrace [0x00007FF62338A535+69] base::debug::StackTrace::StackTrace [0x00007FF623341983+19] logging::LogMessage::~LogMessage [0x00007FF6232D91CA+234] blink::TransformationMatrix::CheckAlignment [0x00007FF62593E98C+92] blink::LayoutBoxModelObject::ComputedCSSPaddingRight [0x00007FF625A95193+403] blink::ComputedStyleCSSValueMapping::Get [0x00007FF625A99524+12644] blink::CSSComputedStyleDeclaration::GetPropertyCSSValue [0x00007FF625A6D072+242] blink::V8CSSStyleDeclaration::namedPropertyGetterCustom [0x00007FF62736C05A+74] blink::V8CSSStyleDeclaration::namedPropertyGetterCallback [0x00007FF62587A0FF+79] v8::internal::PropertyCallbackArguments::Call [0x00007FF622D48605+485] v8::internal::JSObject::GetPropertyWithFailedAccessCheck [0x00007FF622E08361+1585] v8::internal::Object::GetProperty [0x00007FF622E06877+87] v8::internal::LoadIC::Load [0x00007FF622D4D04A+410] v8::internal::KeyedStoreIC::UpdateStoreElement [0x00007FF622D55F47+10791] v8::internal::Runtime_LoadIC_Miss [0x00007FF622D4F449+585] (No symbol) [0x000002A6CD204948]
,
Oct 2 2017
,
Oct 2 2017
,
Oct 2 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b7a2616cce819b0d9eefc6af199918da865fef3d commit b7a2616cce819b0d9eefc6af199918da865fef3d Author: Devlin Cronin <rdevlin.cronin@chromium.org> Date: Mon Oct 02 20:00:52 2017 Disable test CrElementsToggleTest.All on windows Looks like it's been failing on the win10 bot for some time now. Bug: 770574 TBR: dpapad@chromium.org Change-Id: I6d9cc7137de216535b3a2c0bc45bdad2068a4ce2 Reviewed-on: https://chromium-review.googlesource.com/695704 Reviewed-by: Devlin <rdevlin.cronin@chromium.org> Commit-Queue: Devlin <rdevlin.cronin@chromium.org> Cr-Commit-Position: refs/heads/master@{#505744} [modify] https://crrev.com/b7a2616cce819b0d9eefc6af199918da865fef3d/chrome/test/data/webui/cr_elements/cr_elements_browsertest.js
,
Oct 2 2017
@rtoy: Is this not related to Blink? The error is thrown from within Blink's blink::TransformationMatrix, see [1]. [1] https://cs.chromium.org/chromium/src/third_party/WebKit/Source/platform/transforms/TransformationMatrix.h?type=cs&q=blink::TransformationMatrix::CheckAlignment&sq=package:chromium&l=508
,
Oct 2 2017
Re-adding the Blink component, please see question at comment #7.
,
Oct 2 2017
Yes, sorry about that. I'll have to look around for a better component than just blink.
,
Oct 2 2017
Thanks. I am also removing myself as on OWNER for now, since it does not seem there is anything to do on my side for this issue.
,
Oct 2 2017
Setting component to Platform, as suggested in Source/platform/OWNERS +eae who helped find the right component and further triage.
,
Oct 2 2017
Assigning to yutak as the reviewer for a number of recent changes by tkent to checks and macros in the file lately, including one that caused a win-only issue.
,
Oct 3 2017
Hmm actually this has nothing to do with macro rewrites, but this seems like an issue with WTF_ALIGN, so I'm the right owner :) Recent introduction of MSVC 2017 may have changed something. Let me check...
,
Oct 3 2017
+brucedawson
I now suspect this is due to MSVC 2017's regression.
The assertion hits in a function "ComputedTransform" defined in
core/css/ComputedStyleCSSValueMapping.cpp, at line 1573:
> 1564 TransformationMatrix transform;
> ...
> 1573 list->Append(*ValueForMatrixTransform(transform, style));
The |transform| argument is passed by value, so a new temporary object of type
TransformationMatrix is created here. TransformationMatrix is defined as:
> class PLATFORM_EXPORT TransformationMatrix {
> ...
> #if defined(TRANSFORMATION_MATRIX_USE_X86_64_SSE2)
> typedef WTF_ALIGNED(double, Matrix4[4][4], 16); // THIS IS USED THIS TIME
> #else
> typedef double Matrix4[4][4];
> #endif
> ...
> Matrix4 matrix_;
> };
So, it is expected to be 16-byte aligned, but the temporary object in question
is allocated on 8-byte boundary. The alignment is violated, so boom. Seems like
MSVC 2017 fails to allocate this temporary object on 16-byte boundary. This is
likely to be a bug of MSVC 2017.
HOWEVER,
This usage of WTF_ALIGNED, used in typedef, not directly applied to a variable
declaration), smells a bit. This macro expands to:
> typedef __declspec(align(16)) double Matrix4[4][4];
__declspec(align(n)) is almost compatible to C++11 standard "alignas", but
C++11 standard "alignas" can't be used in a typedef like this example: alignas
is not part of a type. It should be used in a variable declaration. This subtle
difference may perhaps be the source of MSVC's regression.
Going forward, I think we should also fix this kind of typedef usage, so we can
eventually migrate to standard alignas in the future. And moving the
__declspec(align(n)) to the variable declaration MAY also fix this assertion
failure. I don't have time to check my hypothesis today; will try tomorrow.
,
Oct 3 2017
I see that the test was disabled yesterday evening: commit b7a2616cce819b0d9eefc6af199918da865fef3d Author: Devlin Cronin <rdevlin.cronin@chromium.org> Date: Mon Oct 02 20:00:52 2017 Disable test CrElementsToggleTest.All on windows Looks like it's been failing on the win10 bot for some time now. Bug: 770574 TBR: dpapad@chromium.org Change-Id: I6d9cc7137de216535b3a2c0bc45bdad2068a4ce2 However when I look at https://build.chromium.org/p/tryserver.chromium.win/builders/win10_chromium_x64_rel_ng?numbuilds=200 I see that most builds over the last 24 hours succeeded. I'm having trouble reconciling "consistent test failure" with the mostly green bot. That said, I don't think MSVC (2015 or 2017) supports 16-byte alignment of function parameters. My guess is that it supports high degrees of alignment for local variables and for globals, and for offsets within larger structs. However I don't believe it supports high degrees of alignment for function parameters or for allocate objects (that is the allocator's job). I thought that MSVC gave a warning when objects that require high degrees of alignment were passed as function parameters. I will investigate to confirm this, but I think the code needs to be changed.
,
Oct 3 2017
@15 re test flakiness consistency. The test was disabled at 20:00 UTC, which is 13:00 PST. Bot runs are in PST (at least, they are if you're in PST - not sure if it's always PST or localized). So be sure to look at runs starting before Oct 2, 13:00 PST (around #2697). Bot runs before that show the failure pretty frequently. E.g., start looking at runs at or before 2696 (https://build.chromium.org/p/tryserver.chromium.win/builders/win10_chromium_x64_rel_ng/builds/2696). Many are successful despite failing the test because we have flakiness detection that won't count a run as failed if it fails with and without the patch applied. In this case, the description still includes the information: Build successful failed browser_tests (with patch) on windows-10-14393 failed renderer_side_navigation_browser_tests (with patch) on windows-10-14393 failed browser_tests (without patch) on windows-10-14393 failed renderer_side_navigation_browser_tests (without patch) on windows-10-14393
,
Oct 3 2017
I've found contradictory results. I wrote test code and I can see that VC++ 2017 ensures alignment of function parameters up to 32 bytes - I can see it masking rbp to enforce this:
and rbp,0FFFFFFFFFFFFFFE0h
On the other hand, the VC++ 2017 documentation says that this is not guaranteed:
https://docs.microsoft.com/en-us/cpp/cpp/align-cpp
"Note that ordinary allocators—for example, malloc, C++ operator new, and the Win32 allocators—return memory that is usually not sufficiently aligned for __declspec(align(#)) structures or arrays of structures. To guarantee that the destination of a copy or data transformation operation is correctly aligned, use _aligned_malloc, or write your own allocator.
You cannot specify alignment for function parameters. When data that has an alignment attribute is passed by value on the stack, its alignment is controlled by the calling convention. If data alignment is important in the called function, copy the parameter into correctly aligned memory before use."
Passing by const reference and then copying to a local variable should work. Note that the documentation also recommends switching to the C++ standard alignas however this does not change the behavior - the same restrictions apply.
https://docs.microsoft.com/en-us/cpp/cpp/align-cpp
So...
I guess there is something about our build options that is making the compiler not give the alignment guarantees that I am seeing in my tests. Since not aligning is consistent with the documentation it appears that we are inadvertently depending on behavior that was never guaranteed. I'm looking at the call stacks but I'm having a tough time seeing where the matrix is being passed and thus where it goes awry.
Possibly relevant: VS 2017 now generates SSE load/store instructions that don't require alignment so it is quite possible/likely that CheckAlignment() is not needed anymore.
I'll keep looking.
,
Oct 3 2017
@brucedawson: I don't know if this helps, but if you need to interactively use an actual cr-toggle UI element to replicate what the CrElementsToggleTest.All test does, you can turn on enable-md-extensions from chrome://flags and then go to chrome://extensions, then use any of the toggle controls in that page.
,
Oct 3 2017
Thanks - that should be helpful. Still building my 64-bit debuggable browser with dchecks enabled...
,
Oct 3 2017
I cannot reproduce this. In fact, I can't see any sign (despite poring over the C++ source and the assembly generated from a build that matches the one that was failing) that blink::LayoutBoxModelObject::ComputedCSSPaddingRight ever calls blink::TransformationMatrix::CheckAlignment, either directly or indirectly. Can anybody explain how that could happen? CheckAlignment is only called when a TransformationMatrix object is created (with one exception, I filed crbug.com/771329 for that) and I don't see any TransformationMatrix objects being created in that part of the code. I'm still investigating - I would like to be able to locally repro - but I think that the solution is to disable the alignment check for VS 2017 builds. I see no indication that it is actually needed given the new code-gen rules (VS 2017 essentially makes it impossible to depend on 16-byte alignment).
,
Oct 3 2017
Here's the explanation of why alignment doesn't matter with VS 2017: https://developercommunity.visualstudio.com/content/problem/19160/regression-from-vs-2015-in-ssseavx-instructions-ge.html Even if movaps is explicitly requested the compiler will still generate movups, which doesn't care about alignment.
,
Oct 4 2017
I ran a couple of try-bot tests with the disabling of CrElementsToggleTest.All reverted (that is, with that test re-enabled), in crrev.com/c/699045. The first try job failed, as expected. Then I used the preprocessor to remove the checks in CheckAlignment when building with VS 2017 and above. This is safe to do because VS 2017 generates code that doesn't require alignment. The second try jobs passed without crashing or otherwise failing, thus proving the the alignment checks are not needed with VS 2017. I'm going to investigate a little bit more but I'm inclined to land my change and declare victory. I don't want to spend a lot of time enforcing a rule that is no longer necessary (possible modest performance differences for those few times when the matrix is misaligned, but that's it).
,
Oct 5 2017
I ran the test locally under the debugger with this syntax:
out\release_64\browser_tests.exe --single_process --single-process --gtest_filter=CrElementsToggleTest.All
and captured the crash. The call stack at the leaf end was quite different from what the try bots were reporting for some reason:
02 browser_tests!blink::TransformationMatrix::CheckAlignment
03 browser_tests!blink::TransformationMatrix::{ctor} - this was inlined
04 browser_tests!blink::ComputedTransform
- same stack below here
05 browser_tests!blink::ComputedStyleCSSValueMapping::Get
06 browser_tests!blink::CSSComputedStyleDeclaration::GetPropertyCSSValue
07 browser_tests!blink::V8CSSStyleDeclaration::namedPropertyGetterCustom
08 browser_tests!blink::V8CSSStyleDeclaration::namedPropertyGetterCallback
09 browser_tests!v8::internal::PropertyCallbackArguments::Call
0a browser_tests!v8::internal::`anonymous namespace'::GetPropertyWithInterceptorInternal
0b browser_tests!v8::internal::JSObject::GetPropertyWithInterceptor
0c browser_tests!v8::internal::Object::GetProperty
0d browser_tests!v8::internal::LoadIC::Load
0e browser_tests!v8::internal::__RT_impl_Runtime_LoadIC_Miss
0f browser_tests!v8::internal::Runtime_LoadIC_Miss
The problem is in third_party\webkit\source\core\css\computedstylecssvaluemapping.cpp, line 1573, which goes:
list->Append(*ValueForMatrixTransform(transform, style));
In particular, ValueForMatrixTransform takes a TransformationMatrix by value and VC++ has never promised to support this.
Changing ValueForMatrixTransform like this should resolve this particular issue:
static CSSFunctionValue* ValueForMatrixTransform(const TransformationMatrix& transform_param,
const ComputedStyle& style) {
TransformationMatrix transform = transform_param;
I'm testing that now.
That seems worth doing. However we may end up chasing thousands of violations, in which case landing crrev.com/c/699045 (with an exclusion so that it is certain to not apply to clang) would be advisable.
,
Oct 5 2017
Sorry I was off yesterday due to sick. brucedawson: Yes, that's the issue I found in comment #14 (I actually said the argument was passed by value!). Sorry that I couldn't give you some hints earlier. If MSVC doesn't guarantee the alignment of passed-by-value arguments, then yeah, we have to make them pass-by-constref. I actually don't know how many TransformationMatrix instances violate this. Let me check that.
,
Oct 5 2017
BTW, I was able to have windbg hooked at the crash position. If you need some help, let me know.
,
Oct 5 2017
Provided __declspec(align()) and alignas() works identically, I suspect MSVC's behavior could be a violation of C++ spec. C++ spec actually prohibits "alignas" specifier specified for a function parameter, but it does NOT prohibit passing a value to a function that contains alignment specification inside its type.
,
Oct 5 2017
OK, the solution mentioned in #23 (making the param const ref) actually worked on my local builds, at least for this specific instance. I'm going to write a patch that fixes this and reverts the disabled test.
,
Oct 5 2017
This patch should do the trick (and it passed the try bots). I'll tidy up the commit message and send it for review: https://chromium-review.googlesource.com/701921
,
Oct 5 2017
(I'm not sure why I missed your explanation in comment #14 - oops)
,
Oct 5 2017
It's possible that alignas would work. I misread the documentation and it looks like maybe it does support function parameter alignment: https://docs.microsoft.com/en-us/cpp/cpp/alignof-and-alignas-cpp
,
Oct 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/619577502a3b8e9aff680d59fdaa6fee3d2b2278 commit 619577502a3b8e9aff680d59fdaa6fee3d2b2278 Author: Bruce Dawson <brucedawson@chromium.org> Date: Fri Oct 06 18:44:19 2017 Avoid passing TransformationMatrix by value ValueForMatrixTransform takes a TransformationMatrix by value which VC++ does not promise to support. It *usually* gives the request alignment, but not always. This change makes ValueForMatrixTransform take its parameter by const reference and then copies it to a local. This change also reverts "Disable test CrElementsToggleTest.All on windows" (b7a2616cce819b0d9eefc6af199918da865fef3d) because that test now passes. Note that VS 2017 doesn't actually *require* 16-byte alignment so another option to solve this would have been to NOP CheckAlignment on VS 2017 builds. This seems better. Bug: 770574 Cq-Include-Trybots: master.tryserver.chromium.win:win10_chromium_x64_rel_ng Change-Id: Ide78fbfcf30dc01c750d82ad216d0863d66d4e59 Reviewed-on: https://chromium-review.googlesource.com/701921 Reviewed-by: Yuta Kitamura <yutak@chromium.org> Commit-Queue: Bruce Dawson <brucedawson@chromium.org> Cr-Commit-Position: refs/heads/master@{#507129} [modify] https://crrev.com/619577502a3b8e9aff680d59fdaa6fee3d2b2278/chrome/test/data/webui/cr_elements/cr_elements_browsertest.js [modify] https://crrev.com/619577502a3b8e9aff680d59fdaa6fee3d2b2278/third_party/WebKit/Source/core/css/ComputedStyleCSSValueMapping.cpp
,
Oct 6 2017
I'm not convinced this is the only place where we are passing TransformationMatrix by value, but it appears to be the only place where the compiler is failing to give the requested alignment, so I'm tagging this as fixed. If we hit subsequent instances we can reapply this pattern. And this is the last blocking bug for VS 2017 so I'm gonna close it as Fixed also! |
||||||||||||||
►
Sign in to add a comment |
||||||||||||||
Comment 1 by dpranke@chromium.org
, Oct 2 2017Labels: -Infra-Troopers Sheriff-Chromium