New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 747110 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android , Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Multiple crash files triggered during cheets_StartAndroid.stress

Reported by jrbarnette@chromium.org, Jul 20 2017

Issue description

Two recent CQ runs have failed due to timeouts from
cheets_StartAndroid.stress.  The timeouts trigger aborts,
and after the aborts, the job gathers a half-dozen or so
chrome crash dumps.

First failure was seen on caroline:
    https://luci-milo.appspot.com/buildbot/chromeos/caroline-paladin/627

The test was tried, and retried.  Test failures:
    http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=129299398
    http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=129290898

The second failure happened on a later run on cyan:
    https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/3264

The test was tried only once.  The test failure:
    http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=129338823

I've every reason to believe that failures like this will also be
occurring on the PFQs and in the canaries, but I haven't checked.

Passing to a randomly selected ARC constable for triage.

 
If the problem was introduced by a CL in the first failed run, this
is the blamelist:

arc-camera | henryhsu | 571256 | fails:1 | usb: Fix testCameraDeviceCreateCaptureBuilder CTS
arc-camera | henryhsu | 571623 | fails:0(1) | usb: Sync fixes from arc-camera-service
arc-camera | henryhsu | 571624 | fails:0(1) | usb: Skip invalid MJPEG
autotest-cheets | tedlai | *414589 | Add util functions for maximizing the window and rotating the screen in uiaut...
kernel | nathan.d.ciobanu-AT-intel.com | 489689 | UPSTREAM: drm/i915/kbl: Introduce the first official DMC for Kabylake.
kernel | nathan.d.ciobanu-AT-intel.com | 489690 | UPSTREAM: drm/i915: Remove misleading CSR firmware loading docs
kernel | nathan.d.ciobanu-AT-intel.com | 489691 | UPSTREAM: drm/i915/DMC/KBL: Load DMC on KBL using the no_stepping_info array
kernel | chintan.m.patel-AT-intel.com | 505912 | BACKPORT: drm/i915: Show dmc debug registers on Kabylake
kernel | sadashiva.rao.pv-AT-intel.com | 564186 | FROMLIST: PCI / PM: Fix native PME handling during system suspend/resume
kernel | luciano.coelho-AT-intel.com | 577812 | CHROMIUM: iwl7000: mvm: allow vendor command to get SAR profile with the inte...
project-cheets-private | tedlai | *413268 | Add cheets_ScreenRotation.

adding Chrome Gardener
It looks like there's not a lot of info in the Chrome logs.
I just spoke with elijahtaylor@ and he's going to take a look in a few minutes when he's out of his meeting.
Cc: jhorwich@chromium.org
Components: Platform>ARC
Labels: -Pri-2 OS-Chrome Pri-1
This looks very similar to https://bugs.chromium.org/p/chromium/issues/detail?id=746573
Seeing chrome crash dump files with stack:  chrome!base::ObserverListBase<AboutSigninInternals::Observer>::RemoveObserver(AboutSigninInternals::Observer*) [stl_iterator.h : 729 + 0x0]

Owner: hidehiko@chromium.org
One frame further, one of the chrome crashes from Caroline and many from cyan seem to be indicating a cleanup problem in ArcImeService: https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/129338823-chromeos-test/chromeos4-row6-rack9-host7/crashinfo.chromeos4-row6-rack9-host7/

 1  chrome!arc::ArcImeService::~ArcImeService() [arc_ime_service.cc : 108 + 0x6]


Hidehiko is doing some refactoring on the Chrome side, do you think this is related to the latest Chrome roll? (this is the diff which includes some of your changes: https://chromium.googlesource.com/chromium/src/+log/61.0.3160.0..61.0.3161.0?pretty=fuller&n=10000)


FYI: looking...
All paladins and Chrome PFQ are green now. 

All caroline paladin runs passed after 627.
https://luci-milo.appspot.com/buildbot/chromeos/caroline-paladin/
All cyan paladin runs passed after 3264.
https://luci-milo.appspot.com/buildbot/chromeos/cyan-paladin/

Cc: sammiequon@chromium.org
There are two CLs triggering Shutdown crash.

1) https://chromium-review.googlesource.com/c/572470/
Before the CL, the shutdown order was;
- ArcService (= ArcImeService)
- exo::WMHelper
- aura::Window (via ash::Shell).
which was changed to
- exo::WMHelper
- aura::Window
- KeyedService (= ArcImeService).

So, with the CL, when aura::Window is being destroyed, it is notified to ArcImeService, which unconditionally touches exo causing crash.
Here is the fix: https://chromium-review.googlesource.com/c/581328/

2) https://chromium-review.googlesource.com/c/572053
The tablet mode related observer was extracted from ash::Shell to TabletModeController. The destruction order is;
- VirtualKeyboardController
- TabletModeController
- Window destruction
So, at least in views dtor, null check is needed.
Here is the fix: https://chromium-review.googlesource.com/c/581329/

I locally made sure the shutdown without crash with these two patches applied.


Please expedite a fix or revert for this problem:  We're
continuing to see timeouts on the CQ.  This is the latest:
    https://luci-milo.appspot.com/buildbot/chromeos/reef-paladin/3050

Cc: -wuchengli@chromium.org
Status: Started (was: Available)
Project Member

Comment 14 by bugdroid1@chromium.org, Jul 24 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b8767566c03649866907b122c4234befd1192bd4

commit b8767566c03649866907b122c4234befd1192bd4
Author: Hidehiko Abe <hidehiko@chromium.org>
Date: Mon Jul 24 17:22:18 2017

Fix crash on Shutdown.

Window is destroyed after TabletModeControler is destroyed.
So, in dtor, null check is needed.

BUG= 747110 
TEST=Ran on DUT.

Change-Id: I62ed1a6183cd3057efc2d051e5f410bb9dea0db2
Reviewed-on: https://chromium-review.googlesource.com/581329
Commit-Queue: Hidehiko Abe <hidehiko@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Reviewed-by: Mitsuru Oshima <oshima@chromium.org>
Reviewed-by: Sammie Quon <sammiequon@chromium.org>
Cr-Commit-Position: refs/heads/master@{#489002}
[modify] https://crrev.com/b8767566c03649866907b122c4234befd1192bd4/ash/system/rotation/tray_rotation_lock.cc
[modify] https://crrev.com/b8767566c03649866907b122c4234befd1192bd4/ash/virtual_keyboard_controller.cc
[modify] https://crrev.com/b8767566c03649866907b122c4234befd1192bd4/chrome/browser/ui/views/frame/browser_non_client_frame_view_ash.cc

Project Member

Comment 15 by bugdroid1@chromium.org, Jul 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/75ad08b9227cc50f2d6d43a154efa8826adcd1ea

commit 75ad08b9227cc50f2d6d43a154efa8826adcd1ea
Author: Hidehiko Abe <hidehiko@chromium.org>
Date: Tue Jul 25 07:19:39 2017

Fix crash on Shutdown.

Now ArcImeService is migrated into KeyedService.
Before that, the service was destroyed ealier,
but now it is destroyed when Profile is destroyed,
which is after Window destruction.
Because exo::WMHelper is destroyed before Window destruction,
we need null check.

BUG= 747110 
TEST=Ran on DUT to verify no crash on shutdown.

Change-Id: I5d48a850f3c456abc076f1fdf39df8bfd3fa1969
Reviewed-on: https://chromium-review.googlesource.com/581328
Commit-Queue: Hidehiko Abe <hidehiko@chromium.org>
Reviewed-by: David Reveman <reveman@chromium.org>
Reviewed-by: Kazuhiro Inaba <kinaba@chromium.org>
Reviewed-by: Luis Hector Chavez <lhchavez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#489247}
[modify] https://crrev.com/75ad08b9227cc50f2d6d43a154efa8826adcd1ea/components/arc/ime/arc_ime_service.cc
[modify] https://crrev.com/75ad08b9227cc50f2d6d43a154efa8826adcd1ea/components/exo/wm_helper.cc
[modify] https://crrev.com/75ad08b9227cc50f2d6d43a154efa8826adcd1ea/components/exo/wm_helper.h

Labels: Merge-Request-61
Status: Fixed (was: Started)
These CLs fix the crash which happens on branch cut of M61. Requesting cherry pick.
Project Member

Comment 17 by sheriffbot@chromium.org, Jul 26 2017

Labels: -Merge-Request-61 Hotlist-Merge-Approved Merge-Approved-61
Your change meets the bar and is auto-approved for M61. Please go ahead and merge the CL to branch 3163 manually. Please contact milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), ketakid @(ChromeOS), govind@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 18 by sheriffbot@chromium.org, Jul 31 2017

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Project Member

Comment 19 by bugdroid1@chromium.org, Aug 2 2017

Labels: -merge-approved-61 merge-merged-3163
The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2b6ae5717f4a8f55c2b0f6ed0c850d61623b8bbc

commit 2b6ae5717f4a8f55c2b0f6ed0c850d61623b8bbc
Author: Hidehiko Abe <hidehiko@chromium.org>
Date: Wed Aug 02 05:36:29 2017

Fix crash on Shutdown.

Window is destroyed after TabletModeControler is destroyed.
So, in dtor, null check is needed.

BUG= 747110 
TEST=Ran on DUT.
TBR=hidehiko@chromium.org

(cherry picked from commit b8767566c03649866907b122c4234befd1192bd4)

Change-Id: I62ed1a6183cd3057efc2d051e5f410bb9dea0db2
Reviewed-on: https://chromium-review.googlesource.com/581329
Commit-Queue: Hidehiko Abe <hidehiko@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Reviewed-by: Mitsuru Oshima <oshima@chromium.org>
Reviewed-by: Sammie Quon <sammiequon@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#489002}
Reviewed-on: https://chromium-review.googlesource.com/597568
Reviewed-by: Hidehiko Abe <hidehiko@chromium.org>
Cr-Commit-Position: refs/branch-heads/3163@{#232}
Cr-Branched-From: ff259bab28b35d242e10186cd63af7ed404fae0d-refs/heads/master@{#488528}
[modify] https://crrev.com/2b6ae5717f4a8f55c2b0f6ed0c850d61623b8bbc/ash/system/rotation/tray_rotation_lock.cc
[modify] https://crrev.com/2b6ae5717f4a8f55c2b0f6ed0c850d61623b8bbc/ash/virtual_keyboard_controller.cc
[modify] https://crrev.com/2b6ae5717f4a8f55c2b0f6ed0c850d61623b8bbc/chrome/browser/ui/views/frame/browser_non_client_frame_view_ash.cc

Project Member

Comment 20 by bugdroid1@chromium.org, Aug 2 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/75b17b91e0157083bff7407c218ae178c21b2e90

commit 75b17b91e0157083bff7407c218ae178c21b2e90
Author: Hidehiko Abe <hidehiko@chromium.org>
Date: Wed Aug 02 05:38:31 2017

Fix crash on Shutdown.

Now ArcImeService is migrated into KeyedService.
Before that, the service was destroyed ealier,
but now it is destroyed when Profile is destroyed,
which is after Window destruction.
Because exo::WMHelper is destroyed before Window destruction,
we need null check.

BUG= 747110 
TEST=Ran on DUT to verify no crash on shutdown.
TBR=hidehiko@chromium.org

(cherry picked from commit 75ad08b9227cc50f2d6d43a154efa8826adcd1ea)

Change-Id: I5d48a850f3c456abc076f1fdf39df8bfd3fa1969
Reviewed-on: https://chromium-review.googlesource.com/581328
Commit-Queue: Hidehiko Abe <hidehiko@chromium.org>
Reviewed-by: David Reveman <reveman@chromium.org>
Reviewed-by: Kazuhiro Inaba <kinaba@chromium.org>
Reviewed-by: Luis Hector Chavez <lhchavez@chromium.org>
Cr-Original-Commit-Position: refs/heads/master@{#489247}
Reviewed-on: https://chromium-review.googlesource.com/597569
Reviewed-by: Hidehiko Abe <hidehiko@chromium.org>
Cr-Commit-Position: refs/branch-heads/3163@{#233}
Cr-Branched-From: ff259bab28b35d242e10186cd63af7ed404fae0d-refs/heads/master@{#488528}
[modify] https://crrev.com/75b17b91e0157083bff7407c218ae178c21b2e90/components/arc/ime/arc_ime_service.cc
[modify] https://crrev.com/75b17b91e0157083bff7407c218ae178c21b2e90/components/exo/wm_helper.cc
[modify] https://crrev.com/75b17b91e0157083bff7407c218ae178c21b2e90/components/exo/wm_helper.h

Issue 747198 has been merged into this issue.
Issue 751347 has been merged into this issue.

Comment 23 by aluo@chromium.org, Aug 17 2017

Labels: OS-Android

Sign in to add a comment