Windows SyzyAsan layout test bot falls out of gpu compositing (which crashes)
Issue description
Detailed report: https://clusterfuzz.com/testcase?key=5597014296625152
Fuzzer: inferno_layout_test_unmodified
Job Type: windows_syzyasan_content_shell
Platform Id: windows
Crash Type: Null-dereference
Crash Address: 0x000001d3
Crash State:
  viz::TestLayerTreeFrameSink::RequestCopyOfOutput
  content::CopyRequestSwapPromise::WillSwap
  cc::LayerTreeImpl::FinishSwapPromises
Memory Tool: SYZYASAN
Regressed: https://clusterfuzz.com/revisions?job=windows_syzyasan_content_shell&range=512946:513015
Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5597014296625152
Issue filed automatically. See https://github.com/google/clusterfuzz-tools for more information.
,
Nov 2 2017
Automatically assigning owner based on suspected regression changelist https://chromium.googlesource.com/chromium/src/+/ab9ef4d2c5eb7a84463bfe503854d6450d0be72f (Introduce CompositingModeWatcher interface for global coordination.). If this is incorrect, please remove the owner and apply the Test-Predator-Wrong-CLs label.
,
Nov 2 2017
Doesn't reproduce on linux with
content_shell --run-layout-test third_party/WebKit/LayoutTests/d/clusterfuzz-testcase-5597014296625152.HTM
#READY
DevTools listening on ws://127.0.0.1:34699/devtools/browser/bdaf71d4-95d0-40f5-a511-9e64499e3ac1
[124759:124759:1102/112540.699483:ERROR:gpu_info.cc(103)] No active GPU found, returning primary GPU.
CONSOLE ERROR: line 56: Uncaught ReferenceError: runTest is not defined
Content-Type: text/plain
layer at (0,0) size 800x600
LayoutView at (0,0) size 800x600
layer at (0,0) size 800x600
LayoutBlockFlow {HTML} at (0,0) size 800x600
LayoutBlockFlow {BODY} at (8,8) size 784x576
LayoutBlockFlow {P} at (0,0) size 784x20
LayoutText {#text} at (0,0) size 389x19
text run at (0,0) width 389: "Tests the Timeline API instrumentation of a paint image event"
#EOF
#EOF
#EOF
,
Nov 2 2017
I think it's a null TestCompositorFrameSink in the CopyRequestSwapPromise.
Used in CopyRequestSwapPromise::WillSwap(): https://cs.chromium.org/chromium/src/content/test/layouttest_support.cc?rcl=3269dc6d217f328b525de23f4e6e0cef67030cf0&l=308
It's set in CopyRequestSwapPromise::OnCommit() from the output of FindLayerTreeFrameSink(): https://cs.chromium.org/chromium/src/content/test/layouttest_support.cc?rcl=8422017277dd337f3697776b681c69d0cdd7bbf3&l=424
Which will return null if layer_tree_frame_sinks_.find(routing_id) doesn't have a frame sink.
layer_tree_frame_sinks_[routing_id] = layer_tree_frame_sink.get() is set when the CreateLayerTreeFrameSink() call runs. CreateLayerTreeFrameSink() never returns null.
But if (is_gpu_compositing_disabled_) {} is true, then we'd early out of RenderThreadImpl::RequestNewLayerTreeFrameSink() with a software compositing frame sink, and leave nothing in the set for the routing id.
So it seems to me that this machine fails to make a context or has gpu compositing blacklisted or something. Even though layout tests are being run on it and would pass with gpu compositing regardless.
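To make the failure path above concrete, here is a minimal sketch of a map-backed lookup that returns null when the creation path never registered an entry. All names here (FakeFrameSink, FrameSinkRegistry, RequestNewFrameSink) are hypothetical stand-ins for illustration, not the real Chromium classes, and the early-out is modeled only as far as this comment describes it.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-in for the test frame sink type (not the real class).
struct FakeFrameSink {
  int routing_id;
};

// Hypothetical registry modeled on the layer_tree_frame_sinks_ map
// described above: routing id -> raw frame sink pointer.
class FrameSinkRegistry {
 public:
  // Called only on the path that creates a test frame sink.
  void Register(int routing_id, FakeFrameSink* sink) {
    assert(sink != nullptr);
    sinks_[routing_id] = sink;
  }

  // Returns nullptr when nothing was registered for routing_id.
  FakeFrameSink* Find(int routing_id) const {
    auto it = sinks_.find(routing_id);
    return it == sinks_.end() ? nullptr : it->second;
  }

 private:
  std::map<int, FakeFrameSink*> sinks_;
};

// Models the request path: with gpu compositing disabled it returns early,
// so no entry is ever added for that routing id.
void RequestNewFrameSink(FrameSinkRegistry& registry,
                         FakeFrameSink* sink,
                         bool gpu_compositing_disabled) {
  if (gpu_compositing_disabled)
    return;  // Software path in this sketch: nothing is registered.
  registry.Register(sink->routing_id, sink);
}
```

Any later Find() for a routing id that went down the early-out path yields nullptr, which matches the null pointer suspected in the swap promise.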
,
Nov 2 2017
+enne fyi about layout test shenanigans. If we ever tried to use the browser's display compositor we'll have to deal with this all over again somehow, as it appears to not want to use gpu compositing. Before my patch it would make a context and check |software_rendering|, which I guess is not true on this machine. So I don't know why the display compositor fails to make a context - maybe because it's making a view context?
,
Nov 2 2017
The layout test doesn't run correctly unless it's put in third_party/WebKit/LayoutTests/http/tests/inspector/<somedir>/<someotherdir>; this isn't made clear on the clusterfuzz page.
$ xvfb-run -s "-extension glx" out_desktop/Release/content_shell --run-layout-test third_party/WebKit/LayoutTests/http/tests/inspector/a/b/clusterfuzz-testcase-5597014296625152.HTM
Tried that to make GL fail but test works.
$ out_desktop/Release/content_shell --run-layout-test --disable-gpu-compositing third_party/WebKit/LayoutTests/http/tests/inspector/clusterfuzz-testcase-5597014296625152.HTM
Tried that to use software compositing, but test works.
I don't think this test is even doing a readback for me wth. Put a print in RenderWidgetCompositor::CompositeAndReadbackAsync and don't see it hit.
,
Nov 2 2017
Well ok so if I use the actual layout test launcher then it will try to do the copy request so I guess the command line it prints out is a lie idk.
Also the test file has to be renamed to <something>.html for the actual launcher to find it.
After jumping through these hoops, I ran:
third_party/WebKit/Tools/Scripts/run-webkit-tests --build-directory=out_desktop third_party/WebKit/LayoutTests/http/tests/inspector/a/b/c.html
And the test just times out.
third_party/WebKit/Tools/Scripts/run-webkit-tests --build-directory=out_desktop third_party/WebKit/LayoutTests/http/tests/inspector/a/b/c.html --additional-driver-flag=--disable-gpu-compositing
Also times out. Ok.
So I tried a test that does work and does do a readback, which is compositing/masks/mask-of-clipped-layer.html with --disable-gpu-compositing.
This does crash, on the dcheck
if (is_gpu_compositing_disabled_) {
  DCHECK(!layout_test_mode());
  ...
}
Which is of course bad.. cuz that makes a software based non-layout-test frame sink.
If I remove that I get a crash on this dcheck
void OnCommit() override {
  layer_tree_frame_sink_from_commit_ =
      find_layer_tree_frame_sink_callback_.Run();
  DCHECK(layer_tree_frame_sink_from_commit_);
}
And if I remove that one.. then it crashes in a weird stack but through:
#17 0x0000010b3c62 in ~tuple buildtools/third_party/libc++/trunk/include/tuple:474:0
#18 0x0000010b3c62 in base::internal::BindState<void (bluetooth::mojom::FakeCentral_GetLastWrittenValue_ProxyToResponder::*)(bool, base::Optional<std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > > const&), base::internal::PassedWrapper<std::__1::unique_ptr<bluetooth::mojom::FakeCentral_GetLastWrittenValue_ProxyToResponder, std::__1::default_delete<bluetooth::mojom::FakeCentral_GetLastWrittenValue_ProxyToResponder> > > >::~BindState() base/bind_internal.h:469:0
#19 0x7f6dfa83d13b in cc::LayerTreeImpl::FinishSwapPromises(viz::CompositorFrameMetadata*) cc/trees/layer_tree_impl.cc:1565:19
#20 0x7f6dfa7ea827 in cc::LayerTreeHostImpl::DrawLayers(cc::LayerTreeHostImpl::FrameData*) cc/trees/layer_tree_host_impl.cc:1800:18
And it has a null |layer_tree_frame_sink_from_commit_|:
[1:1:1102/121421.944361:ERROR:layouttest_support.cc(308)] (nil)
So I'ma go with that's what is going on.
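The sequence traced above (pointer captured in OnCommit(), then used at swap time) can be sketched in isolation. This is a simplified model under stated assumptions: StubSink and CopyRequestPromiseSketch are hypothetical names, std::function stands in for the real callback type, and the null check in WillSwap() exists only so the null case can be observed without crashing. It is not a fix proposed in this thread.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a frame sink; it just counts copy requests.
struct StubSink {
  int copy_requests = 0;
  void RequestCopyOfOutput() { ++copy_requests; }
};

// Simplified model of a swap promise that resolves its frame sink at
// commit time and uses it at swap time.
class CopyRequestPromiseSketch {
 public:
  explicit CopyRequestPromiseSketch(std::function<StubSink*()> find_sink)
      : find_sink_(std::move(find_sink)) {
    assert(find_sink_ != nullptr);
  }

  // Stores whatever the lookup returns, including nullptr.
  void OnCommit() { sink_from_commit_ = find_sink_(); }

  // Returns false instead of dereferencing a null sink. Without this
  // check, the call below would be the null dereference.
  bool WillSwap() {
    if (!sink_from_commit_)
      return false;
    sink_from_commit_->RequestCopyOfOutput();
    return true;
  }

 private:
  std::function<StubSink*()> find_sink_;
  StubSink* sink_from_commit_ = nullptr;
};
```

With a lookup that finds a sink, WillSwap() issues one copy request; with a lookup that returns null, it reports failure, which is the case the logged (nil) pointer corresponds to.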
,
Nov 2 2017
2 options here:
1 - If layout_test_mode() and gpu compositing disabled, fail to make a frame sink.
2 - If layout_test_mode() and gpu compositing disabled, make the gpu based test frame sink anyways.
,
Nov 2 2017
3 - Don't run layout tests on a machine where gpu compositing gets blacklisted.
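Options 1 and 2 differ only in what happens when layout test mode and disabled gpu compositing coincide; option 3 is an environment change with no code counterpart. A rough sketch of that one decision follows. SinkKind, FallbackPolicy, and ChooseSink are hypothetical names, and the thrown exception is an assumption standing in for "fail to make a frame sink".

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical labels for the kinds of frame sink discussed in this thread.
enum class SinkKind { kRegularGpu, kGpuTest, kSoftware };

// Hypothetical switch between option 1 and option 2.
enum class FallbackPolicy { kFailLoudly, kForceGpuTestSink };

// Picks a frame sink kind. Throws to model option 1's explicit failure.
SinkKind ChooseSink(bool layout_test_mode,
                    bool gpu_compositing_disabled,
                    FallbackPolicy policy) {
  if (!layout_test_mode) {
    return gpu_compositing_disabled ? SinkKind::kSoftware
                                    : SinkKind::kRegularGpu;
  }
  if (!gpu_compositing_disabled)
    return SinkKind::kGpuTest;

  // Layout tests fell out of gpu compositing.
  if (policy == FallbackPolicy::kFailLoudly)
    throw std::runtime_error("layout tests require gpu compositing");
  return SinkKind::kGpuTest;  // Option 2: make the gpu test sink anyway.
}
```

Keeping the choice in one function makes the trade-off visible: option 1 surfaces a misconfigured machine immediately, while option 2 keeps going and depends on the gpu test frame sink still working there.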
,
Nov 2 2017
+inferno for comment on #8/9
I think this machine is mis-configured. It should be using osmesa anyway to provide GL so maybe that dll is not being built or found?
What will happen:
1 - If we fail to make a frame sink, then the masks test times out. Probably the clusterfuzz test would also (it does anyways for me locally).
2 - If we make a gpu frame sink anyways, I don't know what will happen. I guess this used to work so it would probably continue to work somehow for now.
3 - I'm not sure how to debug on this machine to see what's wrong there.
,
Nov 2 2017
1 - https://chromium-review.googlesource.com/c/chromium/src/+/751722 (I made it LOG(FATAL) instead of timing out?)
2 - https://chromium-review.googlesource.com/c/chromium/src/+/751404
,
Nov 2 2017
=> inferno for feedback
,
Nov 2 2017
Assigning to current CF sheriff.
,
Nov 6 2017
Dana, thanks for your analysis here. Your CLs look reasonable to me, LGTMed.
I wonder if that GL problem is similar to what we've discussed in https://bugs.chromium.org/p/chromium/issues/detail?id=768697#c15
Do you think that we need to switch from "--use-gl=any" to an explicit value?
,
Nov 6 2017
Assigning back to Dana, please see my previous comment.
,
Nov 6 2017
Thanks, it could be that this bot should use --use-gl=swiftshader, if the gpu/driver is something old and blacklisted then that should help avoid the problem. I think the plan is to make layout tests run in swiftshader, so it would match that. +sugoi to confirm. Regarding the CLs, you gave a more enthusiastic response on option #1. Is that the one you prefer?
,
Nov 6 2017
Yes, Layout Tests on Windows and Linux already use SwiftShader. MacOS will follow as soon as Angle is enabled on MacOS.
,
Nov 6 2017
Yes, #1 is the one that I understood better :) I think that having an explicit LOG(FATAL) sounds better than timing out.
,
Nov 6 2017
Ok thanks, I will send that to the CQ. We should expect this bot to then crash on the LOG(FATAL) until hopefully the --use-gl line fixes it, unless that lands first. I could write that patch if you can point me to the right place, or lmk if u wanna and I'll reassign to you mmoroz@ once my patch lands. Thanks!
,
Nov 6 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/946e3173ce19c18e805b4fbb2218cb7149429793
commit 946e3173ce19c18e805b4fbb2218cb7149429793
Author: danakj <danakj@chromium.org>
Date: Mon Nov 06 23:46:51 2017
Do LOG(FATAL) if layout tests fall out of gpu compositing.
Layout tests require gpu compositing, so they expect an environment where that functions. Some bots seem to be unable to provide that while running layout tests, so make the error mode explicit.
R=inferno@chromium.org, piman@chromium.org
Bug: 780757
Change-Id: I97ad19b55bc2459b17d26c0babcc64afba72c723
Reviewed-on: https://chromium-review.googlesource.com/751722
Reviewed-by: Max Moroz <mmoroz@chromium.org>
Reviewed-by: Antoine Labour <piman@chromium.org>
Commit-Queue: danakj <danakj@chromium.org>
Cr-Commit-Position: refs/heads/master@{#514297}
[modify] https://crrev.com/946e3173ce19c18e805b4fbb2218cb7149429793/content/renderer/render_thread_impl.cc
,
Nov 6 2017
=> mmoroz for the --use-gl= flag
,
Nov 7 2017
ClusterFuzz has detected this issue as fixed in range 514242:514358.
Detailed report: https://clusterfuzz.com/testcase?key=5597014296625152
Fuzzer: inferno_layout_test_unmodified
Job Type: windows_syzyasan_content_shell
Platform Id: windows
Crash Type: Null-dereference
Crash Address: 0x000001d3
Crash State:
  viz::TestLayerTreeFrameSink::RequestCopyOfOutput
  content::CopyRequestSwapPromise::WillSwap
  cc::LayerTreeImpl::FinishSwapPromises
Memory Tool: SYZYASAN
Regressed: https://clusterfuzz.com/revisions?job=windows_syzyasan_content_shell&range=512946:513015
Fixed: https://clusterfuzz.com/revisions?job=windows_syzyasan_content_shell&range=514242:514358
Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5597014296625152
See https://github.com/google/clusterfuzz-tools for more information. If you suspect that the result above is incorrect, try re-doing that job on the test case report page.
,
Nov 7 2017
ClusterFuzz testcase 5597014296625152 is verified as fixed, so closing issue as verified. If this is incorrect, please add ClusterFuzz-Wrong label and re-open the issue.
,
Nov 7 2017
https://bugs.chromium.org/p/chromium/issues/detail?id=782186 is the bug for the LOG(FATAL) happening on the windows fuzzers instead of this crash now.
Comment 1 by ClusterFuzz, Nov 2 2017
Labels: Test-Predator-AutoComponents