Issue 922025

Starred by 3 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Jan 16
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: ----




All Mac Bots are having problems

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, Jan 15

Issue description

Filed by sheriff-o-matic@appspot.gserviceaccount.com on behalf of battre@google.com

This is a meta bug to collaborate on the current problem of mac bots.
 
I have been able to reproduce the crashes with the binaries I downloaded from some of the failing test runs.

I have not been able to reproduce the crashes by building the tests myself.
Issue 922024 has been merged into this issue.
Hundreds of layout tests are crashing, but the blamed culprit (r622639) doesn't even seem to apply to Mac. :/
Cc: piman@chromium.org
+piman

The Mac MSAN bots have some symbols and show similar failures:

https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8924263323536845856/+/steps/extensions_browsertests/0/logs/URLLoaderFactoryManagerBrowserTest.ContentScriptMatching_ChainTraversalForFoo__status_CRASH_/0

[ RUN      ] URLLoaderFactoryManagerBrowserTest.ContentScriptMatching_ChainTraversalForFoo

DevTools listening on ws://127.0.0.1:53107/devtools/browser/ea4e46c7-341e-46eb-9cdf-5c8f63b8db93
[12735:775:0115/045153.688062:3960382639759:ERROR:vt_video_encode_accelerator_mac.cc(513)]  VTCompressionSessionCreate failed: -12908
=================================================================
==12737==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000027bd8 at pc 0x00012a4e9a52 bp 0x7ffeec62fa30 sp 0x7ffeec62f1e0
READ of size 24 at 0x602000027bd8 thread T0
    #0 0x12a4e9a51 in __asan_memcpy ??:0:0
    #1 0x105c9998c in gpu::raster::RasterImplementation::VerifySyncTokensCHROMIUM(signed char**, int) ??:0:0
    #2 0x11d7209f4 in content::RenderThreadImpl::SharedCompositorWorkerContextProvider(bool) ??:0:0
    #3 0x11d71e3b9 in content::RenderThreadImpl::RequestNewLayerTreeFrameSink(int, scoped_refptr<content::FrameSwapMessageQueue>, GURL const&, base::OnceCallback<void (std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >)>, mojo::InterfaceRequest<content::mojom::RenderFrameMetadataObserverClient>, mojo::InterfacePtr<content::mojom::RenderFrameMetadataObserver>, char const*) ??:0:0
    #4 0x11d77b60b in content::RenderWidget::DoRequestNewLayerTreeFrameSink(base::OnceCallback<void (std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >)>) ??:0:0
    #5 0x11d77ac40 in content::RenderWidget::RequestNewLayerTreeFrameSink(base::OnceCallback<void (std::__1::unique_ptr<cc::LayerTreeFrameSink, std::__1::default_delete<cc::LayerTreeFrameSink> >)>) ??:0:0
    #6 0x11da25917 in content::LayerTreeView::RequestNewLayerTreeFrameSink() ??:0:0
    #7 0x1137fe26d in cc::ProxyMain::RequestNewLayerTreeFrameSink() ??:0:0
    #8 0x1137f9384 in base::internal::Invoker<base::internal::BindState<void (cc::ProxyMain::*)(), base::WeakPtr<cc::ProxyMain> >, void ()>::RunOnce(base::internal::BindStateBase*) ??:0:0
    #9 0x10e37f19c in base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) ??:0:0
    #10 0x10e5855ec in base::sequence_manager::internal::ThreadControllerImpl::DoWork(base::sequence_manager::internal::ThreadControllerImpl::WorkType) ??:0:0
    #11 0x10e58a874 in base::internal::Invoker<base::internal::BindState<void (base::sequence_manager::internal::ThreadControllerImpl::*)(base::sequence_manager::internal::ThreadControllerImpl::WorkType), base::WeakPtr<base::sequence_manager::internal::ThreadControllerImpl>, base::sequence_manager::internal::ThreadControllerImpl::WorkType>, void ()>::Run(base::internal::BindStateBase*) ??:0:0
    #12 0x10e37f19c in base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) ??:0:0


Could this be related to your CLs?
https://chromium-review.googlesource.com/c/chromium/src/+/1407259
https://chromium-review.googlesource.com/c/chromium/src/+/1407739
Project Member

Comment 5 by bugdroid1@chromium.org, Jan 15

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2aa2303669972f7f150b4eab9aafeaab11e3262d

commit 2aa2303669972f7f150b4eab9aafeaab11e3262d
Author: Jeremy Roman <jbroman@chromium.org>
Date: Tue Jan 15 15:49:57 2019

Revert "Rework RasterInterface::CopySubTexture"

This reverts commit 0da668539c5a880a203609988404e27100bc4b55.

Reason for revert: Speculative revert to fix Mac bots.

Original change's description:
> Rework RasterInterface::CopySubTexture
> 
> Pass mailboxes directyl instead of requiring CreateAndConsumeTexture +
> DeleteTextures.
> Simplify CreateAndConsumeTexture and DeleteTextures that are now only
> used for GPU Raster.
> Remove tracking structures in RasterDecoder and RasterImplementation*
> which become entirely unnecessary.
> 
> Bug: 829435
> 
> Change-Id: I73c3155932fd417b4f95dd99e7fe8e3511685d61
> Reviewed-on: https://chromium-review.googlesource.com/c/1407259
> Reviewed-by: Jonathan Backer <backer@chromium.org>
> Commit-Queue: Antoine Labour <piman@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#622636}

TBR=backer@chromium.org,piman@chromium.org

Change-Id: I044ecf68623faa5f6da4f2021f4f7440abd91ac0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug: 829435, 922025 
Reviewed-on: https://chromium-review.googlesource.com/c/1412552
Reviewed-by: Jeremy Roman <jbroman@chromium.org>
Commit-Queue: Jeremy Roman <jbroman@chromium.org>
Cr-Commit-Position: refs/heads/master@{#622869}
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/cc/raster/gpu_raster_buffer_provider.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/cc/raster/one_copy_raster_buffer_provider.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/cc/raster/raster_buffer_provider_perftest.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/cc/test/test_in_process_context_provider.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/components/viz/test/test_context_provider.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/build_raster_cmd_buffer.py
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_cmd_helper_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_gles.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_gles.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_gles_unittest.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_impl_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_implementation_unittest_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_interface.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/client/raster_interface_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/common/raster_cmd_format_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/common/raster_cmd_format_test_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/common/raster_cmd_ids_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/raster_cmd_buffer_functions.txt
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/service/raster_decoder.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/service/raster_decoder_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/service/raster_decoder_unittest.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/service/raster_decoder_unittest_1_autogen.h
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/gpu/command_buffer/service/raster_decoder_unittest_base.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/services/ws/public/cpp/gpu/context_provider_command_buffer.cc
[modify] https://crrev.com/2aa2303669972f7f150b4eab9aafeaab11e3262d/ui/compositor/test/in_process_context_provider.cc

Revert doesn't seem to have solved it. Looking again, new stack trace appears to be:

0   headless_browsertests               0x000000011a24d31f base::debug::StackTrace::StackTrace(unsigned long) + 31
1   headless_browsertests               0x000000011a24ce77 base::debug::(anonymous namespace)::StackDumpSignalHandler(int, __siginfo*, void*) + 4135
2   libsystem_platform.dylib            0x00007fff6955bf5a _sigtramp + 26
3   headless_browsertests               0x000000011d1f9dcd mojo::InterfaceEndpointClient::HandleValidatedMessage(mojo::Message*) + 3165
4   headless_browsertests               0x0000000122a4c95d blink::GraphicsContext::~GraphicsContext() + 1245
5   headless_browsertests               0x0000000122a6dc2d blink::GraphicsLayer::PaintWithoutCommit(blink::IntRect const*, blink::GraphicsContext::DisabledMode) + 1421
6   headless_browsertests               0x0000000122a6c27c blink::GraphicsLayer::Paint(blink::IntRect const*, blink::GraphicsContext::DisabledMode) + 332
7   headless_browsertests               0x0000000122a6ba70 blink::GraphicsLayer::PaintRecursivelyInternal(WTF::Vector<blink::GraphicsLayer*, 0u, WTF::PartitionAllocator>&) + 256
8   headless_browsertests               0x0000000122a6bc5e blink::GraphicsLayer::PaintRecursivelyInternal(WTF::Vector<blink::GraphicsLayer*, 0u, WTF::PartitionAllocator>&) + 750
9   headless_browsertests               0x0000000122a6bc5e blink::GraphicsLayer::PaintRecursivelyInternal(WTF::Vector<blink::GraphicsLayer*, 0u, WTF::PartitionAllocator>&) + 750
10  headless_browsertests               0x0000000122a6b637 blink::GraphicsLayer::PaintRecursively() + 231
11  headless_browsertests               0x00000001247ce47b blink::LocalFrameView::PaintTree() + 2699

e.g. https://logs.chromium.org/logs/chromium/buildbucket/cr-buildbucket.appspot.com/8924243604176609760/+/steps/headless_browsertests/0/logs/HeadlessWebContentsTest.WindowOpen__status_TIMEOUT_/0
Cc: danakj@chromium.org
re: #4, VerifySyncTokensCHROMIUM shouldn't be called from this stack, so this is heavily bogus. Also most likely unrelated to my CLs.

+danakj do you think it could be related to your ongoing RW/RV work? Seems like a stretch though.

@sheriffs, can you link to the earliest failure?
Probably not, this is in BeginFrame and my work is around swap out/in and startup/shutdown/navigation.
Hm, I see the request-framesink stack above, but that's dying in RenderThreadImpl, which is widget-agnostic code.
> I have been able to reproduce the crashes with the binaries I downloaded from some of the failing test runs.
> 
> I have not been able to reproduce the crashes by building the tests myself.

Would a gn clean maybe help?
First failed build on 10.10: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac10.10%20Tests/38435

There are a couple of toolchain-related CLs in the blame list. Among others, https://chromium-review.googlesource.com/c/chromium/src/+/1409558 says it only touches Android, but if I'm reading it correctly, it changes libcxx_is_shared on many platforms (msan builds in particular). Could it be related?
Cc: thakis@chromium.org thomasanderson@chromium.org
thomasanderson/thakis, see comment #11
use_custom_libcxx is false on macOS, so that at least shouldn't affect mac.
except if is_msan?
#4 suggested otherwise. Now I see that they're ASAN bots, not MSAN.

I don't know then.
Possibly related: the "Mac FYI GPU ASAN" bot has been failing/crashing consistently since this run: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Mac%20FYI%20GPU%20ASAN%20Release/3836

which corresponds to Chromium r622707-622722.

This is a bit earlier than "Mac ASan 64 Tests" started consistently failing (r622745-622753), although the latter had intermittent failures before that, as far back as r622640-r622654.
There was stale build output that was corrupting the Mac builds. The fix was to clobber the bot's output directory. See crbug.com/922069
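
For anyone who wants to rule out stale output on a local checkout, here is a minimal sketch of what "clobbering" an output directory amounts to: delete the whole directory so the next build regenerates everything. This is an illustration only, not the mechanism the bots use; the function name `clobber_output_dir` and the `out/Release` path in the usage note are assumptions.

```python
import shutil
from pathlib import Path


def clobber_output_dir(out_dir: Path) -> bool:
    """Delete a build output directory so the next build starts from scratch.

    Returns True if a directory was removed, False if there was nothing to remove.
    """
    if not out_dir.is_dir():
        # Nothing to clobber (never built, or already cleaned).
        return False
    # Remove the directory and every stale object file or generated file in it.
    shutil.rmtree(out_dir)
    return True
```

A call such as `clobber_output_dir(Path("out/Release"))` (hypothetical path) would then be followed by regenerating the build files and doing a full rebuild.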
Status: Fixed (was: Available)
