New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Some types of PDFs crash within seconds on newest dev channel

Project Member Reported by jwer...@chromium.org, Jul 11 2016

Issue description

Version: 53.0.2785.4 dev
OS: 8530.6.0 (Official Build) dev-channel veyron_minnie

Since the last few dev channel releases, some of my PDFs cause the PDF viewer to crash all the time within less than a minute of opening and scrolling through them (giving me the sad faced page icon). It seems to happen on all my PCB schematic PDFs while the normal text document ones are fine, so maybe it's caused by landscape vs. portrait or something.

Example PDF (Google internal and confidential, unfortunately): https://drive.google.com/a/google.com/file/d/0Bz_cehT3Y6YsVV9jaU96RFhYaDg/view
 
Do you have a crash report id I can look at?
... because it doesn't crash for me with a ToT Linux build.
I assume it's one of these (from today), but I don't really know how to look at them: https://crash.corp.google.com/browse?stbtiq=cd6c362a-e174-48b0-8b1e-92938f49bf42

When I get a completely black tab with the sad-faced page icon in the middle, does that always mean it's a crash that will get uploaded to crash/ or could it be something else?
- I think ChromeOS builds have (had?) trouble where the debugging symbols weren't  uploaded to the crash server. I forget the bug number. I can probably manually decode them.

- I haven't worked on crash reporting in a while, so I don't know the exact crash dump upload policy, but most likely all your sad-faced crashes get uploaded.
The bug number that I was forgetting is  bug 610902  - still open. :(
Owner: wjmaclean@chromium.org
wjmaclean: This is bug 616213, right? jwerner is hitting this on with 53.0.2785.4. What version is it fixed in for M53? Do we need more merges?

Thread 0 (crashed)
 0  libc-2.19.so!__libc_do_syscall + 0x5
 1  libc-2.19.so!raise [raise.c : 56 + 0x7]
 2  libc-2.19.so!abort [abort.c : 89 + 0x5]
 3  chrome!base::debug::BreakDebugger [debugger_posix.cc : 249 + 0x3]
 4  chrome!ui::FlingCurve::FlingCurve [fling_curve.cc : 48 + 0x3]
 5  chrome!ui::WebGestureCurveImpl::CreateFromDefaultPlatformCurve [web_gesture_curve_impl.cc : 46 + 0x7]
Project Member

Comment 7 by sheriffbot@chromium.org, Jul 12 2016

Labels: -M-53 M-54 MovedFrom-53
Moving this nonessential bug to the next milestone.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Bug 616213 seems to have been fixed by 52.0.2743.40/41-ish, so if it's still showing up on 53.0.2785.4 then it may be something different.

That being said, I can't seem to get it to reproduce.

I tried to reproduce this using the same document in the original report, and a tip-of-tree build (master@{#404810}), but was unsuccessful.

I am testing a CrOS build on Linux, with a touchscreen, but I've also attempt scrolling the document through its full range with mousewheel, page-up/down, dragging the scrollbar with a mouse. 

But for CrOS on Linux I don't have a touchpad. So I also built 53.0.2785.4 and deployed it on a Link running 8350.27.0, and I can't find any PDFs that crash there using either touchscreen- or touchpad-fling, despite spending maore thatn a few minutes on each pdf.

jwerner@ - there are a lot of PDF schematics at

site:bryston.com pdf schematics

Could you try and see if any of those repro for you? If so, I'll re-image my link to 8530.6 and try to repro again.
Yes, I can reproduce it easily with the first two hits from that search term. FWIW I'm not sure if there's any relation to schematics or landscape anymore... I've seen the problem with normal, text-based portrait documents too now.

I noticed that it only seems to happen when I actively scroll with the touch*pad*... I've tried the touchscreen for a few minutes now and can't make it happen with that. (Scrolling purely with arrows and keyboards also seems to prevent it.) Maybe it's even related to some specific touchpad weirdness on my device (Veyron_Minnie)... I've seen other multitouch issues with the recent builds (it often zooms when I want to scroll, and sometimes it seems to lag behind my movements by a few seconds).

I also confirmed that I can repro in guest mode now, so it's not related to my profile.
Addin kenrb@ since he's been working in this area lately, esp. w.r.t. ChromeOS and trackpad scrolling.

Ken, do you have any thoughts here?
I've re-imaged a link with 8530.6.0 and 53.0.2785.4, which appears to be the same as in the initial comment on this bug. Using http://bryston.com/PDF/Schematics/3B-8BST_SCHEMATICS.pdf I cannot seem to reproduce this, despite scrolling up and down with the trackpad for several minutes at a time. (I did see one PDF tab that appeared to have crashed while it was in the background, but it wouldn't have been scrolling at the time, so I don't know if that's related.)

I wonder if there's any chance this could be veyron_minnie-specific?

Comment 12 by cdel...@gmail.com, Jul 13 2016

I see the problem as well and I have the veyron_minnie firmware.

Comment 13 by kenrb@chromium.org, Jul 13 2016

Nobody has been able to get a symbolized crash output for this yet?

Out of curiosity, can anybody who can reproduce this try it with the Extensions Isolation flag turned off? We have that on in dev channel, and I'm wondering if it is relevant.
Actually, I just realized, if the problem re-produces in guest mode, would --isolate-extensions even be on there?
re: comment 13 - I symbolized the crash in comment 6. If you have a minidump from a production build that needs symbolizing, let me know.
Looking at comment #6, a common source of failures in this function are GestureFlingStart events with zero velocity ...
I just got  bug 627900  today, which I suspect is this bug, but that's happening on Samus, so it might be a bit easier to work with vs an ARM board.
 Issue 627900  has been merged into this issue.
I have a crash dump from  bug 627900  loaded in gdb.

#0  0x00007f8bcee72b82 in __GI_raise at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1  0x00007f8bcee748a0 in __GI_abort at abort.c:89
#2  0x00007f8bd1e50835 in base::debug::BreakDebugger at src/base/debug/debugger_posix.cc:249
#3  0x00007f8bd646d36f in ui::FlingCurve::FlingCurve at src/ui/events/gestures/fling_curve.cc:48
#4  0x00007f8bd6057a79 in CreateDefaultPlatformCurve at src/ui/events/gestures/blink/web_gesture_curve_impl.cc:46
#5  ui::WebGestureCurveImpl::CreateFromDefaultPlatformCurve at src/ui/events/gestures/blink/web_gesture_curve_impl.cc:60
#6  0x00007f8bd600f5cc in content::BlinkPlatformImpl::createFlingAnimationCurve at src/content/child/blink_platform_impl.cc:846
#7  0x00007f8bd608000c in ui::InputHandlerProxy::HandleGestureFlingStart at src/ui/events/blink/input_handler_proxy.cc:733

At frame 3, |velocity| is {x_ = 0, y_ = 0}, |max_start_velocity| is 0.

I'll keep my gdb session running. Let me know what other variables you want me to try to fish out. Minidumps are mini and don't have complete information. e.g. I can't do "print *this" at frame 3.
Cc: kenrb@chromium.org moh...@chromium.org
Ok, this is helpful.

There have been a couple of CLs recently that have changed how scroll events and touchpad events are handled, and my guess is that somewhere we're create a WebInputEvent::GestureFlingStart with zero velocity and it's getting passed along to the renderer.

I'll see if I can track it down, but I'm sheriffing today and tomorrow, so I may not get to it until the start of next week, depending on the state of the tree.
wjmaclean: I still have my gdb session open. Is there anything else I can fish out of the crash dump for you?
No, you can close it. I now have an easily reproducible test case, and I am experimenting with a live CrOS image to debug. I'll probably land a temporary fix (that prevents the crash) while we look for the root cause.
Possibly related to  bug 628525 ?
Status: Started (was: Untriaged)
Cc: pucchakayala@chromium.org dhadd...@chromium.org wjmaclean@chromium.org
 Issue 625319  has been merged into this issue.
Project Member

Comment 26 by bugdroid1@chromium.org, Jul 20 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e80690ff28bf6db9392ab9b96a31105e1a6d4205

commit e80690ff28bf6db9392ab9b96a31105e1a6d4205
Author: wjmaclean <wjmaclean@chromium.org>
Date: Wed Jul 20 21:40:27 2016

Don't create zero-velocity GestureFlingStarts.

Currently trackpad initiated GestureFlingStarts on CrOS can have zero-
velocity, but this causes crashes in the renderer. This CL converts
these events to be GestureFlingCancels (just in case they are issued
in the middle of ongoing fling).

As it is hard to find the sources of such events after the fact, a
DCHECK has been added to catch new pathways if they issue such events.

BUG= 627227 

Review-Url: https://codereview.chromium.org/2165043002
Cr-Commit-Position: refs/heads/master@{#406678}

[modify] https://crrev.com/e80690ff28bf6db9392ab9b96a31105e1a6d4205/content/browser/renderer_host/render_widget_host_input_event_router.cc
[modify] https://crrev.com/e80690ff28bf6db9392ab9b96a31105e1a6d4205/ui/events/ozone/evdev/libgestures_glue/gesture_interpreter_libevdev_cros.cc

Status: Fixed (was: Started)
 Issue 627335  has been merged into this issue.
Labels: -M-54 ReleaseBlock-Stable M-53
Status: Assigned (was: Fixed)
This was first reported on M-53 - will it need a merge?
Status: Started (was: Assigned)
Probably yes. Let's give the CL a few days to bake on M54 first.

Comment 31 by khmel@chromium.org, Jul 21 2016

We have revert CL https://codereview.chromium.org/2165043002. It introduces regression in 2 finger navigation in Chrome.

Steps to reproduce:

Open 2 pages, for example www.google.com and then www.yahoo.com
Now use slow 2 finger animation to go back. At the end of the animation wait idle a little bit. Now release finger. Animation hangs in the middle. Reproducible on my minnie 100%. Without this CL it works as expected.

Cc: tdres...@chromium.org
I did test this with 2-finger back/forth navigation, trying to hold my fingers still before lifting from the touchpad, and did not observe this ... but I was doing it on a Pixel so perhaps there's some difference there.

So this still leaves us with the issue of zero-velocity flings necessary to make this animation work. For what it's worth, I'm not comfortable with sending the zero-velocity flings through to the renderer, and it seems like a bit of a design flaw that we need to do this. If anything, I think it makes more sense to send a GestureFlingCancel in this case, and have whatever drives the animation be revised to accept that.

But I'll defer to the gesture events people on this one. I've added tdresser@ for more input.

In the short term, what happens with the hanging animation ... is is stuck for ever? Can the userfinish the gesture in any way? I'm not comfortable to go back to having it crash either.
As an experiment I'll dump out the gesture event sequence for both touchpad and touchscreen and add them to this bug shortly ...

Comment 34 by khmel@chromium.org, Jul 21 2016

Cc: warx@chromium.org abodenha@chromium.org durga.behera@chromium.org xiy...@chromium.org vsu...@chromium.org ajha@chromium.org derat@chromium.org adlr@chromium.org khmel@chromium.org asimjour@chromium.org mtomasz@chromium.org puneetster@chromium.org kavvaru@chromium.org kuscher@chromium.org mark@chromium.org w...@chromium.org osh...@chromium.org
 Issue 627623  has been merged into this issue.

Comment 35 by khmel@chromium.org, Jul 21 2016

>> what happens with the hanging animation ... is is stuck for ever? 
It is resumed if you press any button or create any touchpad/touchscreen event.

Comment 36 by khmel@chromium.org, Jul 21 2016

I did browser test: https://codereview.chromium.org/2168693003/diff/40001/chrome/browser/chrome_fling_navigation_browsertest.cc

If you change empty FlingStart to FlingCancel event this test does not pass.
Re 36: But that test is part of an unlanded CL, or am I mistaken?

I'm trying to understand why the touchpad scrolls only crash when targeted to a WebView. Reading in  InputHandlerProxy::HandleGestureFlingStart (see https://cs.chromium.org/chromium/src/ui/events/blink/input_handler_proxy.cc?rcl=1469114341&l=685) I wonder if calling RootScrollBegin might be happening on events destined for a subframe? Would that matter?

I do suspect that for scrolls targeting a WebView, InputHandlerProxy always targets the event to the main thread (but this needs verification), explaining why the renderer is seeing the zero velocity flings for WebViews but (apparently) not for main frames.

Comment 38 by khmel@chromium.org, Jul 21 2016

>> Re 36: But that test is part of an unlanded CL, or am I mistaken?
Yes, it is not landed. I use this test to check for navigation regression locally.

Does it expect the zero-velocity FlingStart, as opposed to a FlingCancel?

Comment 40 by khmel@chromium.org, Jul 21 2016

Is #39 about browser test?
If so, currently this test generates ScrollBegin, multiple ScrollUpdate, ScrollEnd and FlingStart event. There are 2 cases in the test. One case is fast fling when FlingStart is not empty and second case is slow fling with FlingStart empty. This sequence was chosen because it was the sequence which Chrome OS sent us on real device. With your CL landed we have to change empty FlingStart start to FlingCancel to have parity with new sequence of event. And in this case the test starts failing.

Let's proceed as follows: I'm going to continue to investigate why WebViews crash when non-WebViews don't, as this may provide us with a way forward. I am working on this immediately (now).

In the short term I don't think we should revert the CL since it would mean preferring crashes in PDF viewer and Hangouts Chat over an animation getting stuck (in an easily recoverable way), though I acknowledge it's important to fix the animation regression asap.

Comment 42 by khmel@chromium.org, Jul 21 2016

>> In the short term I don't think we should revert the CL
If so, do we want to have fix on Chrome side? Or probably we need fix this on Chrome OS side (converting empty FlingStart to Cancel). For me it looks kind of strange, patching event in Chrome that coming from Chrome OS. Do we have other consumers of such events that depend on this? If no I would prefer to have the fix on Chrome OS. 

WDYT?

I think the ideal solution would make CrOS touchpad two-finger swipe behave (event-wise) the same as touchscreen swipe, which seems to work at present without issue. I think there are other things we could do to minimize differences between the Touchpad/Touchscreen pathways, though that's probably a bigger chunk of work outside the scope of this discussion.

Ok, quick status update: in terms of WebView vs. non-WebView, when CrOS sends a zero velocity GestureFlingStart, in the non-WebView case it is *not* ever sent to the InputHandlerProxy::HandleGestureFlingStart(), while in the WebView case it is.

So we have a path forward: find the difference, and eliminate it. This means the fix is (almost certainly) in content/ and maybe not in the renderer process even (why send the event if it's not going to be consumed).

I'll continue to work on this; I don't expect it will take too long to find where this is happening, and come up with a proposed fix. I'll try to post a CL before I leave for home (though I won't guarantee a test tonight).
Project Member

Comment 46 by bugdroid1@chromium.org, Jul 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d51673024bb1990807d86e8acfbf7759748fbae3

commit d51673024bb1990807d86e8acfbf7759748fbae3
Author: wjmaclean <wjmaclean@chromium.org>
Date: Fri Jul 22 21:52:42 2016

Filter GestureFlingStarts with Vx=Vy=0 in RenderWidgetHostViewChildFrame

This CL fixes both (1) the renderer crash originally fixed by
https://codereview.chromium.org/2165043002, and (2) reverts the part
of that CL which prevents ChromeOS from having zero-velocity
GestureFlingStarts, which caused a regression in two-finger swipe
navigation for touchpad. It also moves the DCHECK from the original CL
to a new location which would have prevented the renderer crashes from
being introduced in the first place.

This CL doesn't include a test, though the test proposed by
khmel@ in https://codereview.chromium.org/2168693003 presumably will
land shortly after this CL.

BUG= 627227 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation

Review-Url: https://codereview.chromium.org/2173603002
Cr-Commit-Position: refs/heads/master@{#407275}

[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/content/browser/frame_host/render_widget_host_view_child_frame.cc
[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/content/browser/frame_host/render_widget_host_view_child_frame.h
[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/content/browser/renderer_host/input/input_router_impl.cc
[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/content/browser/renderer_host/render_widget_host_input_event_router.cc
[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/content/browser/renderer_host/render_widget_host_view_aura.cc
[modify] https://crrev.com/d51673024bb1990807d86e8acfbf7759748fbae3/ui/events/ozone/evdev/libgestures_glue/gesture_interpreter_libevdev_cros.cc

Cc: fukino@chromium.org sdantul...@chromium.org
 Issue 630767  has been merged into this issue.
thestig@ - is it ok to mark this as fixed now?

We'll need to request a merge at some point.
Some people mark it fixed and then request merges, others leave it as started until the merges have finished. Do whichever makes more sense to you. In either case, it will be helpful to say this is fixed in 54.x.y.z, and after the merge, fixed in 53.x.y.z.
Labels: Merge-Request-53

Comment 51 by dimu@chromium.org, Jul 25 2016

Labels: -Merge-Request-53 Merge-Approved-53 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M53 (branch: 2785)
Project Member

Comment 52 by bugdroid1@chromium.org, Jul 25 2016

Labels: -merge-approved-53 merge-merged-2785
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1e514f96624ed05d048b7adb87c4a4406be1ad54

commit 1e514f96624ed05d048b7adb87c4a4406be1ad54
Author: W. James MacLean <wjmaclean@chromium.org>
Date: Mon Jul 25 21:08:09 2016

Don't create zero-velocity GestureFlingStarts.

Currently trackpad initiated GestureFlingStarts on CrOS can have zero-
velocity, but this causes crashes in the renderer. This CL converts
these events to be GestureFlingCancels (just in case they are issued
in the middle of ongoing fling).

As it is hard to find the sources of such events after the fact, a
DCHECK has been added to catch new pathways if they issue such events.

BUG= 627227 

Review-Url: https://codereview.chromium.org/2165043002
Cr-Commit-Position: refs/heads/master@{#406678}
(cherry picked from commit e80690ff28bf6db9392ab9b96a31105e1a6d4205)

Review URL: https://codereview.chromium.org/2176373002 .

Cr-Commit-Position: refs/branch-heads/2785@{#341}
Cr-Branched-From: 68623971be0cfc492a2cb0427d7f478e7b214c24-refs/heads/master@{#403382}

[modify] https://crrev.com/1e514f96624ed05d048b7adb87c4a4406be1ad54/content/browser/renderer_host/render_widget_host_input_event_router.cc
[modify] https://crrev.com/1e514f96624ed05d048b7adb87c4a4406be1ad54/ui/events/ozone/evdev/libgestures_glue/gesture_interpreter_libevdev_cros.cc

Project Member

Comment 53 by bugdroid1@chromium.org, Jul 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9fce52f932fd3d1521fb113d61253c0b1482653a

commit 9fce52f932fd3d1521fb113d61253c0b1482653a
Author: W. James MacLean <wjmaclean@chromium.org>
Date: Mon Jul 25 21:16:37 2016

Filter GestureFlingStarts with Vx=Vy=0 in RenderWidgetHostViewChildFrame

This CL fixes both (1) the renderer crash originally fixed by
https://codereview.chromium.org/2165043002, and (2) reverts the part
of that CL which prevents ChromeOS from having zero-velocity
GestureFlingStarts, which caused a regression in two-finger swipe
navigation for touchpad. It also moves the DCHECK from the original CL
to a new location which would have prevented the renderer crashes from
being introduced in the first place.

This CL doesn't include a test, though the test proposed by
khmel@ in https://codereview.chromium.org/2168693003 presumably will
land shortly after this CL.

BUG= 627227 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_site_isolation

Review-Url: https://codereview.chromium.org/2173603002
Cr-Commit-Position: refs/heads/master@{#407275}
(cherry picked from commit d51673024bb1990807d86e8acfbf7759748fbae3)

Review URL: https://codereview.chromium.org/2182693003 .

Cr-Commit-Position: refs/branch-heads/2785@{#342}
Cr-Branched-From: 68623971be0cfc492a2cb0427d7f478e7b214c24-refs/heads/master@{#403382}

[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/content/browser/frame_host/render_widget_host_view_child_frame.cc
[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/content/browser/frame_host/render_widget_host_view_child_frame.h
[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/content/browser/renderer_host/input/input_router_impl.cc
[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/content/browser/renderer_host/render_widget_host_input_event_router.cc
[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/content/browser/renderer_host/render_widget_host_view_aura.cc
[modify] https://crrev.com/9fce52f932fd3d1521fb113d61253c0b1482653a/ui/events/ozone/evdev/libgestures_glue/gesture_interpreter_libevdev_cros.cc

 Issue 630542  has been merged into this issue.
Status: Fixed (was: Started)
This is now fixed as of master@{#407275} in M54, and has been merged to the M53 branch.

Please note! If it ever becomes necessary to revert the merge (or the ToT commits), then both the CLs indicated in comments 52 & 53 need to be reverted, in reverse order. I don't think this will be necessary, but just thought I should make it explicit here.
Status: Verified (was: Fixed)
 8530.76.0/53.0.2785.87

Sign in to add a comment