Issue 653238

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Oct 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




HWTest [bvt-inline] timeout on lumpy-chrome-pfq during json dump

Project Member Reported by glevin@chromium.org, Oct 5 2016

Issue description

Seen in HWTest [bvt-inline] on
https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-chrome-pfq/builds/9189
Tests completed successfully, but subsequent json dump timed out:


Output below this line is for buildbot consumption:
Will return from run_suite with status: OK
02:52:11: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming [...] --json_dump -m 79394408

@@@STEP_FAILURE@@@
02:55:26: ERROR: Timeout occurred- waited 15976 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.

@@@STEP_FAILURE@@@
02:55:26: ERROR: Timeout occurred- waited 15660 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
02:55:26: INFO: Running cidb query on pid 30054, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x62bc550>
02:55:27: INFO: Running cidb query on pid 30054, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x639f750>
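For scale, the two wait times in the log above can be converted from seconds to hours, minutes and seconds. This is only a minimal sketch: the helper name `hms` is hypothetical, and the log does not say from which point each wait was measured.

```python
from datetime import timedelta


def hms(seconds):
    """Format a number of seconds as H:MM:SS (hypothetical helper)."""
    return str(timedelta(seconds=seconds))


# The two values reported in the "Timeout occurred" lines above.
for waited in (15976, 15660):
    print(waited, "seconds =", hms(waited))
```

Both waits come out at well over four hours (4:26:16 and 4:21:00).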


Although the specific message is different, superficially this looks like Issue 589844 and Issue 621405 (tests passed, timeout during json dump).
+shuqianz@, jrbarnette@, dshi@, fdeng@, as owners and contributors to those issues, in case this one's actually related.
 

Comment 1 by ihf@chromium.org, Oct 5 2016

Yes, it looks like all tests passed
http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=79394408

Don't know about json.


Comment 2 by autumn@chromium.org, Oct 12 2016

Status: Unconfirmed (was: Untriaged)

Comment 3 by lpique@chromium.org, Oct 13 2016

From the Report step of, for example, today's failure:

https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-chrome-pfq/builds/9216/steps/Report/logs/stdio

    [...]
    AFDODataGenerate:
      start:    2:15:23 median 2:15:16 mean 2:15:48 min 2:14:11 max 2:20:52
      duration: 1:26:06 median 0:04:27 mean 0:21:26 min 0:04:13 max 1:18:34
      finish:   3:41:29 median 2:20:04 mean 2:37:14 min 2:18:27 max 3:37:39
    VMTest (attempt 1):
      start:    2:15:23 median 2:15:17 mean 2:15:48 min 2:14:11 max 2:20:52
      duration: 0:48:56 median 0:48:38 mean 0:48:37 min 0:47:33 max 0:49:13
      finish:   3:04:19 median 3:03:55 mean 3:04:25 min 3:02:29 max 3:09:15
    HWTest [AFDO_record]:
      start:    2:18:07 median 2:18:30 mean 2:18:55 min 2:17:06 max 2:23:37
      duration: 1:47:16 median 0:00:08 mean 0:17:36 min 0:00:08 max 1:15:26
      finish:   4:05:23 median 2:19:01 mean 2:36:31 min 2:17:34 max 3:37:58
    [...]

You may want to look at what is going on with the AFDODataGenerate and AFDO_record test steps. The duration on those two is very long compared to the mean.
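The comparison above can be sketched in Python as a rough check that flags stages whose duration far exceeds their historical mean. The function name `slow_stages` and the 2x threshold are assumptions made for illustration; they are not part of chromite or the Report stage.

```python
import re

# Stage timing summary copied from the Report log excerpt above.
REPORT = """\
AFDODataGenerate:
  start:    2:15:23 median 2:15:16 mean 2:15:48 min 2:14:11 max 2:20:52
  duration: 1:26:06 median 0:04:27 mean 0:21:26 min 0:04:13 max 1:18:34
  finish:   3:41:29 median 2:20:04 mean 2:37:14 min 2:18:27 max 3:37:39
VMTest (attempt 1):
  start:    2:15:23 median 2:15:17 mean 2:15:48 min 2:14:11 max 2:20:52
  duration: 0:48:56 median 0:48:38 mean 0:48:37 min 0:47:33 max 0:49:13
  finish:   3:04:19 median 3:03:55 mean 3:04:25 min 3:02:29 max 3:09:15
HWTest [AFDO_record]:
  start:    2:18:07 median 2:18:30 mean 2:18:55 min 2:17:06 max 2:23:37
  duration: 1:47:16 median 0:00:08 mean 0:17:36 min 0:00:08 max 1:15:26
  finish:   4:05:23 median 2:19:01 mean 2:36:31 min 2:17:34 max 3:37:58
"""

# Captures this build's duration, then the historical median and mean.
DURATION_RE = re.compile(r"duration:\s+(\S+) median (\S+) mean (\S+)")


def to_seconds(hms):
    """Convert an H:MM:SS string to seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def slow_stages(report, factor=2.0):
    """Return {stage: (duration_s, mean_s)} where duration > factor * mean.

    The factor of 2.0 is an arbitrary illustrative threshold.
    """
    slow = {}
    stage = None
    for raw in report.splitlines():
        line = raw.strip()
        if line.endswith(":"):
            # Stage header, e.g. "HWTest [AFDO_record]:"
            stage = line[:-1]
            continue
        match = DURATION_RE.match(line)
        if match and stage is not None:
            duration = to_seconds(match.group(1))
            mean = to_seconds(match.group(3))
            if duration > factor * mean:
                slow[stage] = (duration, mean)
    return slow


for name, (duration, mean) in slow_stages(REPORT).items():
    print(f"{name}: {duration / mean:.1f}x the mean duration")
```

On the excerpt above this flags AFDODataGenerate (about 4.0x the mean) and HWTest [AFDO_record] (about 6.1x the mean), while VMTest stays close to its mean.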

Comment 4 by laszio@chromium.org, Oct 13 2016

The AFDO_record stage does nothing when the Chrome version has not changed, so its running time may look flaky. The max duration, however, is pretty close to what we observed. Running time has increased substantially due to recent telemetry changes. The workaround we currently have is cutting down the training set:
https://chromium-review.googlesource.com/#/c/398199/

That should shave 17 minutes off the 75-minute run, bringing it under 60 minutes.
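The estimate above works out as follows. This is a back-of-the-envelope sketch using only the figures quoted in this comment; the variable names are illustrative.

```python
observed_minutes = 75   # approximate max duration quoted above
saving_minutes = 17     # expected saving from the smaller training set
target_minutes = 60     # target stated in this comment

projected_minutes = observed_minutes - saving_minutes
print(projected_minutes, projected_minutes < target_minutes)
```

The projected run time is 58 minutes, which leaves only about two minutes of headroom under the 60-minute target.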

Comment 5 by bugdroid1@chromium.org, Oct 14 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/0ab93253bcc62fe7a68698fd89448c20479f33b9

commit 0ab93253bcc62fe7a68698fd89448c20479f33b9
Author: Ting-Yuan Huang <laszio@chromium.org>
Date: Thu Oct 13 18:56:20 2016

telemetry_AFDOGenerate: Cutdown the training set.

This has been causing some builders to timeout.

BUG= chromium:653238 
TEST=none

Change-Id: I3a433af99b12b105a461c57a0655eb433e7a7434
Reviewed-on: https://chromium-review.googlesource.com/398199
Commit-Ready: Ting-Yuan Huang <laszio@chromium.org>
Tested-by: Luis Lozano <llozano@chromium.org>
Tested-by: Ting-Yuan Huang <laszio@chromium.org>
Reviewed-by: Luis Lozano <llozano@chromium.org>

[modify] https://crrev.com/0ab93253bcc62fe7a68698fd89448c20479f33b9/server/site_tests/telemetry_AFDOGenerate/telemetry_AFDOGenerate.py

Comment 6 by lpique@chromium.org, Oct 14 2016

Issue 650288 has been merged into this issue.
Labels: Build-Toolchain
Owner: laszio@chromium.org
Status: Assigned (was: Unconfirmed)

Comment 8 by laszio@chromium.org, Oct 18 2016

Cc: bhthompson@chromium.org
Labels: Merge-Request-55

Comment 9 by laszio@chromium.org, Oct 18 2016

The CL to M55 is here:
https://chromium-review.googlesource.com/#/c/400003/

Comment 10 by dimu@chromium.org, Oct 18 2016

Labels: -Merge-Request-55 Merge-Approved-55 Hotlist-Merge-Approved
Your change meets the bar and is auto-approved for M55 (branch: 2883)

Comment 11 by bugdroid1@chromium.org, Oct 18 2016

Labels: merge-merged-release-R55-8872.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/d7e94d571e0f814f66c30098bc072ab7002c37a8

commit d7e94d571e0f814f66c30098bc072ab7002c37a8
Author: Ting-Yuan Huang <laszio@chromium.org>
Date: Thu Oct 13 18:56:20 2016

telemetry_AFDOGenerate: Cutdown the training set.

This has been causing some builders to timeout.

BUG= chromium:653238 
TEST=none

Change-Id: I3a433af99b12b105a461c57a0655eb433e7a7434
Reviewed-on: https://chromium-review.googlesource.com/398199
Commit-Ready: Ting-Yuan Huang <laszio@chromium.org>
Tested-by: Luis Lozano <llozano@chromium.org>
Tested-by: Ting-Yuan Huang <laszio@chromium.org>
Reviewed-by: Luis Lozano <llozano@chromium.org>
(cherry picked from commit 0ab93253bcc62fe7a68698fd89448c20479f33b9)
Reviewed-on: https://chromium-review.googlesource.com/400003
Trybot-Ready: Ting-Yuan Huang <laszio@chromium.org>
Commit-Queue: Ting-Yuan Huang <laszio@chromium.org>

[modify] https://crrev.com/d7e94d571e0f814f66c30098bc072ab7002c37a8/server/site_tests/telemetry_AFDOGenerate/telemetry_AFDOGenerate.py


Comment 12 by sheriffbot@chromium.org, Oct 22 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: Fixed (was: Assigned)

Comment 14 by sheriffbot@chromium.org, Oct 26 2016

This issue has been approved for a merge. Please merge the fix to any appropriate branches as soon as possible!

If all merges have been completed, please remove any remaining Merge-Approved labels from this issue.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Approved -Merge-Approved-55

Comment 16 by dchan@google.com, Jan 21 2017

Labels: VerifyIn-57

Comment 17 by dchan@google.com, Mar 4 2017

Labels: VerifyIn-58

Comment 18 by dchan@google.com, Apr 17 2017

Labels: VerifyIn-59

Comment 19 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61
Status: Verified (was: Fixed)
Closing. Please reopen it if it's not fixed. Thanks!
Components: -Infra>Client>ChromeOS
