New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 690717 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Sign in to add a comment

"Cancel" tests in ImageWriterUtilityClientTest are flaky

Project Member Reported by bsep@chromium.org, Feb 10 2017

Issue description

Project Member

Comment 1 by bugdroid1@chromium.org, Feb 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/20f1a16e0ed6c595f50df84cb829e7f4fe842fd3

commit 20f1a16e0ed6c595f50df84cb829e7f4fe842fd3
Author: bsep <bsep@chromium.org>
Date: Fri Feb 10 01:25:34 2017

Disable ImageWriterUtilityClientTest.VerifyCancel and [etc].WriteCancel.

They're flaking a lot on the Win7 Debug bot.

TBR=haven@chromium.org
BUG= 690717 

Review-Url: https://codereview.chromium.org/2684103004
Cr-Commit-Position: refs/heads/master@{#449504}

[modify] https://crrev.com/20f1a16e0ed6c595f50df84cb829e7f4fe842fd3/chrome/browser/extensions/api/image_writer_private/image_writer_utility_client_browsertest.cc

Comment 2 by noel@chromium.org, Mar 22 2017

Cc: rdevlin....@chromium.org
Built a DEBUG chrome on WIN10
 - was not able to reproduce the current  bug 690717  reliably.
 - failed about 1 / 1000 browser test runs sometimes.
 - unreliable reproduction.

Changed the Cancel() code to read:

void ImageWriterUtilityClient::Cancel(const CancelCallback& cancel_callback) {	 
  DCHECK_CURRENTLY_ON(content::BrowserThread::IO);

  ResetRequest();
  task_runner_->PostDelayedTask(FROM_HERE, cancel_callback, kDelay);
}

Used a kDelay >= 600ms, this to simulate a really slow DEBUG bot, and I was able to reproduce this  bug 690717  reliably (in all runs, I did 100) in DEBUG Win10.

Problem seems to do with the task_runner_->PostTask that ImageWriterUtilityClient uses to callback its clients.

Why are the callers and ImageWriterUtilityClient code using a task_runner anyway?
 - callers bounce requests rx-ed on FILE thread to the IO thread, which
   ImageWriterUtilityClient expects.
 - seems that is a hang over from an earlier time when ImageWriterUtilityClient
   used UtilityProcessHostClient to control the utility process (needed IO
   thread for that).
 - UtilityProcessHostClient responses are bounced back from IO thread to FILE
   using task_runner_->PostTask() (Bug seems to be here).

Hmmmm.

What if we moved all caller and ImageWriterUtilityClient work to FILE thread?
  - callers are on FILE thread already.
  - mojo utility process client works fine with FILE thread callers.
  - change ImageWriterUtilityClient to expect FILE thread callers.
  - and ImageWriterUtilityClient responses will be on FILE thread, so we ...
  - can remove task_runner_->PostTask from ImageWriterUtilityClient.

Easy to test locally in my WIN10 DEBUG build: make these changes.
  - retain the kDelay >= 600ms
  - re-run the browser test with these changes.
  - can no longer reproduce this  bug 690717  in my DEBUG Win10.

Bug reproduction correctly fingers task_runner_->PostTask() as a problem.
 - Move all work onto the FILE thread, bug no longer reproduces.
  - uploaded a patch to do that.
 - Latent bug maybe, the existing unit test does not test the Cancel case.
 - Browser test I added does test the Cancel case, hence the current bug.
 - Filed bug 703514 about Cancel, production code does not seem to use it.

Other things noted:
 - this component (Platform>Apps>Image Recovery Tool) has no owners.
 - no response from past owners in bug pings [1] or in review [2].
 - maybe just go ahead and delete it all, assign new owners, etc?

[1] https://bugs.chromium.org/p/chromium/issues/detail?id=352442#c18
[2] https://codereview.chromium.org/2734773002

Comment 3 by noel@chromium.org, Mar 22 2017

Status: Started (was: Untriaged)
>- Move all work onto the FILE thread, bug no longer reproduces.
  - uploaded a patch to do that.

https://codereview.chromium.org/2756393002
Project Member

Comment 4 by bugdroid1@chromium.org, Mar 23 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/54af805eac68ac00ef7a6a9f3ee40fb49894792f

commit 54af805eac68ac00ef7a6a9f3ee40fb49894792f
Author: noel <noel@chromium.org>
Date: Thu Mar 23 01:05:26 2017

Deflake ImageWriterUtilityClient browsertest and clients

The image writer browser tests WriteCancel and VerifyCancel added on
crrev.com/449296 flaked on the Windows DEBUG bots,  issue 690717 .

Re-enable these browser tests. Move all work to the FILE thread both
in the callers (operation.cc, operation_nonchromeos.cc) of the image
writer interface, and in the browser test (to match caller behavior)
to see if this removes the flake on the Windows DEBUG bots (analysis
per  crbug.com/690717#c2  suggests that it should).

The browsertest can't test the low-level USB driver code on any port
since our bots don't have USB sticks. So I manually tested using the
the ChromeOS Recovery Tool [1], and I was able to write and verify a
ChromeOS image to my USB stick using this patch on Windows 10.

Covered by existing tests:
  image_writer_utility_client_browsertest, image_writer_unittest

[1] http://bit.ly/2jFtsZZ

BUG= 690717 , 352442 

Review-Url: https://codereview.chromium.org/2756393002
Cr-Commit-Position: refs/heads/master@{#458965}

[modify] https://crrev.com/54af805eac68ac00ef7a6a9f3ee40fb49894792f/chrome/browser/extensions/api/image_writer_private/image_writer_utility_client.cc
[modify] https://crrev.com/54af805eac68ac00ef7a6a9f3ee40fb49894792f/chrome/browser/extensions/api/image_writer_private/image_writer_utility_client.h
[modify] https://crrev.com/54af805eac68ac00ef7a6a9f3ee40fb49894792f/chrome/browser/extensions/api/image_writer_private/image_writer_utility_client_browsertest.cc
[modify] https://crrev.com/54af805eac68ac00ef7a6a9f3ee40fb49894792f/chrome/browser/extensions/api/image_writer_private/operation.cc
[modify] https://crrev.com/54af805eac68ac00ef7a6a9f3ee40fb49894792f/chrome/browser/extensions/api/image_writer_private/operation_nonchromeos.cc

Comment 5 by noel@chromium.org, Mar 24 2017

> Other things noted:
 - this component (Platform>Apps>Image Recovery Tool) has no owners.
 - no response from past owners in bug pings [1] or in review [2].
 - maybe just go ahead and delete it all, assign new owners, etc?
 - maybe create a native app users could download for win / osx

@devlin I'm not sure if these issues are best dealt with on bugs, so maybe add them to your notes.

Comment 6 by noel@chromium.org, Mar 24 2017

Status: Fixed (was: Started)
Cancel tests passing reliably (no evidence of flake) on the WIN DEBUG bots for the last 24 hours.

Sign in to add a comment