Weekly tab crashes with EXC_BAD_ACCESS / EXC_I386_GPFLT
Reported by
conrad.i...@gmail.com,
Sep 13
Issue description

UserAgent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_2) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.81 Safari/537.36

Steps to reproduce the problem:
1. Use https://mail.superhuman.com regularly
2. Wait for a few weeks...
3. You'll see Chrome's "Aw, Snap!" page

What is the expected behavior?
The tab should not crash.

What went wrong?
We're seeing "Aw, Snap" crashes reported by several users. This seems to have become more frequent around the beginning of June 2018, but due to the low incidence rate, and the even lower report rate it's hard to be certain.

From reading the internet it seems that the known causes of "Aw, Snap" are:
* Out of javascript heap space (we are also seeing these somewhat regularly, but this seems to crash as EXC_BAD_ACCESS / EXC_I386_BPT, and tend to happen with high tab uptime so I think are not related to this issue, which has happened as soon as 13 seconds after booting the page)
* An actual bug in the javascript interpreter.
* Anything else?

I've attached a few of the interesting crashes in the hope that someone can help me debug this further. I'd ideally like to know if this is a bug in our code (or a resource leak we can just fix) or whether we're triggering a bug further downstream (in which case can we work around it, or help you fix it).

If you could even just symbolicate these for me, that would be hugely helpful!

Did this work before? N/A
Chrome version: 69.0.3497.81
Channel: n/a
OS Version: OS X 10.13.2
Flash Version:
,
Sep 13
Thread 0 ( * CRASHED * EXC_BAD_ACCESS / EXC_I386_GPFLT @ 0x11224be81 )
0 [Google Chrome Framework - scrollbar.cc:629] blink::Scrollbar::SetNeedsPaintInvalidation(blink::ScrollbarPart)
1 [Google Chrome Framework - trace_event.h:1106] blink::TimerBase::RunInternal()
2 [Google Chrome Framework - callback_forward.h:11] base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
3 [Google Chrome Framework - weak_ptr.h:243] blink::scheduler::internal::ThreadControllerImpl::DoWork(blink::scheduler::internal::SequencedTaskSource::WorkType)
4 [Google Chrome Framework - callback_forward.h:11] base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*)
5 [Google Chrome Framework - vector:639] base::MessageLoop::RunTask(base::PendingTask*)
6 [Google Chrome Framework - message_loop.cc:408] base::MessageLoop::DoWork()
7 [Google Chrome Framework - message_pump_mac.mm:462] base::MessagePumpCFRunLoopBase::RunWork()
8 [Google Chrome Framework - 0x209de9a] base::mac::CallWithEHFrame(void () block_pointer)
9 [Google Chrome Framework - message_pump_mac.mm:441] base::MessagePumpCFRunLoopBase::RunWorkSource(void*)
10 [CoreFoundation - 0xa3821] __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__
11 [CoreFoundation - 0x15d4cc] __CFRunLoopDoSource0
12 [CoreFoundation - 0x862c0] __CFRunLoopDoSources0
13 [CoreFoundation - 0x8573d] __CFRunLoopRun
14 [CoreFoundation - 0x84fa3] CFRunLoopRunSpecific
15 [Foundation - 0x213f6] -[NSRunLoop(NSRunLoop) runMode:beforeDate:]
16 [Google Chrome Framework - message_pump_mac.mm:734] base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*)
17 [Google Chrome Framework - message_pump_mac.mm:311] base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*)
18 [Google Chrome Framework - run_loop.cc:136] <name omitted>
19 [Google Chrome Framework - renderer_main.cc:248] content::RendererMain(content::MainFunctionParams const&)
20 [Google Chrome Framework - content_main_runner.cc:922] content::ContentMainRunnerImpl::Run()
21 [Google Chrome Framework - main.cc:452] service_manager::Main(service_manager::MainParams const&)
22 [Google Chrome Framework - content_main.cc:19] content::ContentMain(content::ContentMainParams const&)
23 [Google Chrome Framework - chrome_main.cc:0] ChromeMain
24 [Google Chrome Helper - chrome_exe_main_mac.cc:169] main
25 [libdyld.dylib - 0x1145] start
,
Sep 14
,
Sep 14
@rsesek, thank you for the symbolicated stack trace! Is the other one similar? Looking at this it seems likely to do with scrolling, but I'm not sure what to do about that from our side. Any ideas what we might be doing to cause this?
,
Sep 14
Yes, the two traces are identical. I'm not sure of the root cause - hopefully someone with more expertise in Blink>Scroll will triage.
,
Sep 18
,
Sep 20
This one likely happens in scroll_animator_mac.mm setCurrentProgress, where the scrollbar is a wild pointer. Maybe we are missing some cleanup in ScrollAnimatorMac::Dispose.
,
Sep 21
This is continuing to happen to us at a relatively high rate (we just had another user complain about it today). Is there anything I can do on my end that would avoid this (assuming given the lack of urgency that this crash isn't high on your internal crash list), or help you to fix it?
,
Sep 24
In case that helps, here's another crash dump when I saw "Aw, snap".
,
Sep 24
@chaopeng, I see you fixed a similar bug here https://chromium-review.googlesource.com/1060628 - is it possible that there's another missing case? (I tried to read through the code, but I'm not very familiar with it). Another related change would be https://bugs.chromium.org/p/chromium/issues/detail?id=860499 ? I can't see the detail of the issue, but the commit message that fixes it also sounds similar: https://chromium.googlesource.com/chromium/src/+/7c9912717b5f0e5127d2453b02be6d58174d6860 We had another report of this crash from a user today. Are you seeing it in internal Chrome crash tracking?
,
Sep 25
I don't know how this happens. Maybe try calling cancelAnimations on vertical_scrollbar_painter_delegate_ in WillRemoveVerticalScrollbar.
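The cleanup idea in the comments above (cancel any pending scrollbar animation before the scrollbar goes away) can be illustrated with a small standalone sketch. This is not Blink code: SketchScrollbar, SketchTimerQueue, SketchFadeAnimation and RunRemovalScenario are hypothetical names invented here, and the real ScrollAnimatorMac logic may differ substantially. The sketch only shows why a timer callback that still holds a raw pointer needs that pointer cleared before the target is destroyed.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a scrollbar; not Blink's real class.
struct SketchScrollbar {
  int invalidations = 0;
  void SetNeedsPaintInvalidation() { ++invalidations; }
};

// Hypothetical stand-in for a timer/task queue that runs callbacks later.
struct SketchTimerQueue {
  std::vector<std::function<void()>> pending;
  void Post(std::function<void()> task) { pending.push_back(std::move(task)); }
  void RunAll() {
    std::vector<std::function<void()>> tasks;
    tasks.swap(pending);
    for (auto& task : tasks) task();
  }
};

// Hypothetical animation that repaints a scrollbar when its timer fires.
class SketchFadeAnimation {
 public:
  explicit SketchFadeAnimation(SketchScrollbar* scrollbar)
      : scrollbar_(scrollbar) {}

  // Schedules a callback; the animation must outlive the queued task.
  void Start(SketchTimerQueue& queue) {
    queue.Post([this] { OnTimerFired(); });
  }

  // Cleanup step: must be called before the scrollbar is destroyed.
  void Cancel() { scrollbar_ = nullptr; }

  int skipped() const { return skipped_; }

 private:
  void OnTimerFired() {
    if (!scrollbar_) {  // Scrollbar already removed: do nothing.
      ++skipped_;
      return;
    }
    scrollbar_->SetNeedsPaintInvalidation();
  }

  SketchScrollbar* scrollbar_;  // Raw pointer: dangles unless Cancel() is called.
  int skipped_ = 0;
};

// Returns how many late callbacks were safely skipped after removal.
int RunRemovalScenario() {
  SketchTimerQueue queue;
  auto scrollbar = std::make_unique<SketchScrollbar>();
  SketchFadeAnimation animation(scrollbar.get());
  animation.Start(queue);
  animation.Cancel();  // Without this line the callback would touch freed memory.
  scrollbar.reset();   // The scrollbar is destroyed while a timer is pending.
  queue.RunAll();      // The late timer fires and finds no scrollbar.
  return animation.skipped();
}
```

Calling RunRemovalScenario() exercises the removal path; omitting the Cancel() call would leave the queued callback with a dangling pointer, which is the kind of use-after-free that a crash inside SetNeedsPaintInvalidation would be consistent with.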
,
Sep 25
,
Oct 1
This just happened to me again. @chaopeng, thanks for looking into this. Is there anything I can do to help debug? (Happy to run a custom build of Chromium if you think it would be useful)
,
Oct 2
Hi conrad, do you have any simple reproduction steps? I don't know what causes this wild pointer.
,
Oct 2
Unfortunately we don't. We see this happening relatively infrequently in production, and it seems to usually be caused by a large change to the view. We use iframes to render emails, and we usually see this crash when transitioning from one email to the next. I suspect, but cannot reproduce, that sometimes this crashes when an iframe has been removed from the DOM; but it happens maybe 0.1% of the time, and so I haven't figured out the other necessary conditions. From the code, are there any things that we should try?
,
Oct 2
Chao, this looks like bug 843262 - maybe that wasn't fixed? Or it flared up somehow. Also, from crash data, it looks like it started in 69.0.3489.0. The change log between that and the previous version is https://chromium.googlesource.com/chromium/src/+log/69.0.3488.0..69.0.3489.0?pretty=fuller&n=10000. There's just one scrollbar related change there: https://chromium-review.googlesource.com/c/chromium/src/+/1128333 That looks related to paint but on the face of it doesn't seem like it should cause an issue - take a closer look at that CL and see if anything looks suspect? Is there any way we can set some crash keys to narrow down where the crash is occurring?
,
Oct 2
Actually, nvm regarding the range above. It looks like we only get about 1-2 reports per milestone while in dev/canary so the range isn't helpful. See if there's any data we could include in a crash key that'd be helpful. From the exception type it looks like we're using an object that's likely been GC'd/cleaned up.
,
Oct 5
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e149b499781e45dad92c3a21818d6520185e52dc

commit e149b499781e45dad92c3a21818d6520185e52dc
Author: chaopeng <chaopeng@chromium.org>
Date: Fri Oct 05 23:42:46 2018

Change HasBeenDisposed DCHECK to CHECK in PLSA dtor

The crash log shows SetNeedsPaintInvalidation maybe causes by PLSA destroyed before dispose. In this patch, we change the HasBeenDisposed DCHECK to CHECK to make it crash early.

Bug: 883876
Change-Id: Idc65a71861bc93462030b9f7193be074abf3343f
Reviewed-on: https://chromium-review.googlesource.com/c/1263594
Reviewed-by: Philip Rogers <pdr@chromium.org>
Commit-Queue: Jianpeng Chao <chaopeng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#597370}

[modify] https://crrev.com/e149b499781e45dad92c3a21818d6520185e52dc/third_party/blink/renderer/core/paint/paint_layer_scrollable_area.cc
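For readers unfamiliar with the distinction the commit relies on: in Chromium, a DCHECK is normally compiled out of release builds, while a CHECK is enforced in every build. Turning the HasBeenDisposed assertion into a CHECK therefore makes release (and canary) builds fail at the destructor, where the crash report points at the real cause, rather than later at a dangling pointer. The sketch below models only that difference. SketchCheck, SketchDcheck, SketchScrollableArea and FailsFast are hypothetical names, and the sketch throws an exception where the real macros terminate the process, purely so the behavior can be observed.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical failure type. The real CHECK terminates the process;
// throwing is used here only to keep the sketch observable.
struct SketchCheckFailure : std::logic_error {
  using std::logic_error::logic_error;
};

// Models CHECK: evaluated in every build type.
inline void SketchCheck(bool condition, const char* message) {
  if (!condition) throw SketchCheckFailure(message);
}

// Models DCHECK: evaluated only when debug checks are enabled.
inline void SketchDcheck(bool condition, const char* message,
                         bool debug_checks_enabled) {
  if (debug_checks_enabled && !condition) throw SketchCheckFailure(message);
}

// Hypothetical object that must be disposed before it is destroyed.
class SketchScrollableArea {
 public:
  void Dispose() { disposed_ = true; }
  bool HasBeenDisposed() const { return disposed_; }

  // Stands in for the assertion in the destructor. It is a normal method
  // here because throwing from a destructor would terminate the program.
  void VerifyBeforeDestruction(bool use_check,
                               bool debug_checks_enabled) const {
    if (use_check) {
      SketchCheck(HasBeenDisposed(), "destroyed before Dispose()");
    } else {
      SketchDcheck(HasBeenDisposed(), "destroyed before Dispose()",
                   debug_checks_enabled);
    }
  }

 private:
  bool disposed_ = false;
};

// Returns true if an un-disposed area is caught at destruction time.
bool FailsFast(bool use_check, bool debug_checks_enabled) {
  SketchScrollableArea area;  // Dispose() is deliberately never called.
  try {
    area.VerifyBeforeDestruction(use_check, debug_checks_enabled);
  } catch (const SketchCheckFailure&) {
    return true;
  }
  return false;
}
```

In this model, FailsFast(false, false) corresponds to a DCHECK in a release build, where the missing Dispose() goes unnoticed, and FailsFast(true, false) corresponds to the CHECK from the commit, which catches it in the same build.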
,
Oct 5
Thanks @chaopeng! :D If I start running canary will you start getting error reports?
,
Oct 6
Yes, new crash reports will come in with the next canary if our guess is correct.
,
Oct 9
I just got this crash while running Chrome Canary
,
Oct 9
The log landed in 71.0.3572.0, but I am still seeing Scrollbar::SetNeedsPaintInvalidation crashes after that. So the crash is not caused by the PLSA being destroyed without Dispose.
,
Oct 9
Szager's fix landed in 71.0.3574.0 (https://chromium.googlesource.com/chromium/src/+/b75f6ec6f8b552fb25c4aaa320c432a02f51925e) so let's wait a little and see if that drops the crash rate. Because of our analysis for https://crrev.com/597370, I think it's okay to leave the CHECK in the code (instead of reverting to DCHECK).
,
Oct 9
Awesome, thanks everyone! I've upgraded to the latest Canary and will let you know if I see anything.
,
Oct 11
Szager's fix seems good. No blink::Scrollbar::SetNeedsPaintInvalidation crashes since 71.0.3574.0
,
Oct 11
+cc szager fyi, you fixed this bug.
,
Oct 11
Awesome, thanks all! Out of interest, when will this fix reach Chrome stable?
,
Oct 11
M71 stable should be roughly early December. It's possible to pull fixes into releases but my intuition is that this fix was too complex and too late to merge.
,
Oct 12
This doesn't seem that complicated to me, though maybe I'm not familiar enough with the codebase. Given that this is causing crashes for our users, I'd be extremely keen to get this out sooner - what's the process for that?
,
Oct 12
Quick update: we just had another user run into this crash. (He's seeing them a few times a week.)
Comment 1 by meh...@chromium.org
, Sep 13