
Issue 718660 link

Starred by 2 users

Issue metadata

Status: Archived
Owner: ----
Closed: Jan 10
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




Decrease moblab_quick timeout

Project Member Reported by davidri...@chromium.org, May 5 2017

Issue description

guado_moblab-paladin has had a number of cases where the moblab_quick suite ran until it hit the 2h timeout. Proposal: reduce the timeout to 90 minutes.

All data here covers the period since the beginning of April (~35 days).

Successful builds normally take ~42 minutes (the 'duration' column below is the average in seconds for each status):
mysql> select build_id,  board, bst.status, bst.start_time, avg(time_to_sec(timediff(bst.finish_time, bst.start_time))) as 'duration', build_config, build_number from buildStageTable bst join buildTable bt on bst.build_id = bt.id where build_config = 'guado_moblab-paladin' and name = 'HWTest [moblab_quick]' and bt.last_updated > '2017-04-01' group by(bst.status);
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
| build_id | board        | status   | start_time          | duration  | build_config         | build_number |
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
|  1425056 | guado_moblab | fail     | 2017-04-02 07:18:42 | 3562.1186 | guado_moblab-paladin |         5513 |
|  1422874 | guado_moblab | pass     | 2017-04-01 01:03:30 | 2550.4449 | guado_moblab-paladin |         5500 |
|  1427067 | guado_moblab | inflight | 2017-04-03 16:13:28 |      NULL | guado_moblab-paladin |         5526 |
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
3 rows in set (0.17 sec)
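As a quick check of the "~42 minutes" figure, the averages in the 'duration' column can be converted from seconds to minutes. This is only a sketch: the two numbers are copied from the query output above, and the names `avg_duration_sec` and `to_minutes` are made up for illustration.

```python
# Average stage durations in seconds, copied from the query output above.
avg_duration_sec = {"fail": 3562.1186, "pass": 2550.4449}


def to_minutes(seconds):
    """Convert seconds to minutes, rounded to one decimal place."""
    return round(seconds / 60, 1)


avg_duration_min = {
    status: to_minutes(sec) for status, sec in avg_duration_sec.items()
}
print(avg_duration_min)
```

Passing runs average about 42.5 minutes; failing runs average closer to an hour.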

Only one successful build took over 90 minutes, while 15 of the failed ones did (and many/most ran the full 2h before they were aborted). Note that in this query the 'duration' column is a count of builds per status, not a time:
mysql> select build_id,  board, bst.status, bst.start_time, count(time_to_sec(timediff(bst.finish_time, bst.start_time))) as 'duration', build_config, build_number from buildStageTable bst join buildTable bt on bst.build_id = bt.id where build_config = 'guado_moblab-paladin' and name = 'HWTest [moblab_quick]' and bt.last_updated > '2017-04-01' and time_to_sec(timediff(bst.finish_time, bst.start_time)) > 5400 group by bst.status;
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
| build_id | board        | status | start_time          | duration | build_config         | build_number |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
|  1468554 | guado_moblab | fail   | 2017-04-22 06:12:13 |       15 | guado_moblab-paladin |         5707 |
|  1494127 | guado_moblab | pass   | 2017-05-04 04:00:14 |        1 | guado_moblab-paladin |         5816 |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
2 rows in set (0.17 sec)

That one "successful" build (5816) actually failed (crbug.com/717336): https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fguado_moblab-paladin%2F5816%2F%2B%2Frecipes%2Fsteps%2FHWTest__moblab_quick_%2F0%2Fstdout

A number of successful builds took over an hour, with the longest genuine pass finishing just under 90 minutes (durations in seconds):
mysql> select build_id,  board, bst.status, bst.start_time, time_to_sec(timediff(bst.finish_time, bst.start_time)) as 'duration', build_config, build_number from buildStageTable bst join buildTable bt on bst.build_id = bt.id where build_config = 'guado_moblab-paladin' and name = 'HWTest [moblab_quick]' and bt.last_updated > '2017-04-01' and time_to_sec(timediff(bst.finish_time, bst.start_time)) > 3600 and bst.status = 'pass';
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
| build_id | board        | status | start_time          | duration | build_config         | build_number |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
|  1426860 | guado_moblab | pass   | 2017-04-03 11:41:26 |     5378 | guado_moblab-paladin |         5524 |
|  1429188 | guado_moblab | pass   | 2017-04-04 10:41:44 |     4410 | guado_moblab-paladin |         5536 |
|  1462618 | guado_moblab | pass   | 2017-04-19 18:21:48 |     3807 | guado_moblab-paladin |         5685 |
|  1463297 | guado_moblab | pass   | 2017-04-19 23:30:19 |     4287 | guado_moblab-paladin |         5687 |
|  1465342 | guado_moblab | pass   | 2017-04-20 21:33:14 |     3850 | guado_moblab-paladin |         5694 |
|  1465537 | guado_moblab | pass   | 2017-04-21 00:31:17 |     3701 | guado_moblab-paladin |         5695 |
|  1467965 | guado_moblab | pass   | 2017-04-22 00:27:25 |     3736 | guado_moblab-paladin |         5705 |
|  1472798 | guado_moblab | pass   | 2017-04-25 03:19:31 |     3634 | guado_moblab-paladin |         5727 |
|  1473457 | guado_moblab | pass   | 2017-04-25 07:13:55 |     3986 | guado_moblab-paladin |         5728 |
|  1475784 | guado_moblab | pass   | 2017-04-26 07:09:40 |     4610 | guado_moblab-paladin |         5737 |
|  1482004 | guado_moblab | pass   | 2017-04-28 17:42:08 |     3864 | guado_moblab-paladin |         5764 |
|  1494127 | guado_moblab | pass   | 2017-05-04 04:00:14 |     8073 | guado_moblab-paladin |         5816 |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
12 rows in set (0.17 sec)
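To see how the proposed limit sits against these rows, here is a small sketch that compares each passing duration with a 90-minute (5400 s) threshold. The build numbers and durations are copied from the table above; the variable names are illustrative only.

```python
PROPOSED_TIMEOUT_SEC = 90 * 60  # 5400 seconds

# build_number -> duration in seconds, copied from the 'pass' rows above.
pass_durations_sec = {
    5524: 5378, 5536: 4410, 5685: 3807, 5687: 4287,
    5694: 3850, 5695: 3701, 5705: 3736, 5727: 3634,
    5728: 3986, 5737: 4610, 5764: 3864, 5816: 8073,
}

# Passing builds that would have been cut off by the proposed timeout.
over_limit = sorted(
    build for build, sec in pass_durations_sec.items()
    if sec > PROPOSED_TIMEOUT_SEC
)

# Longest passing build that still fits, and how much headroom it had.
longest_under = max(
    sec for sec in pass_durations_sec.values()
    if sec <= PROPOSED_TIMEOUT_SEC
)
margin_sec = PROPOSED_TIMEOUT_SEC - longest_under

print(over_limit, longest_under, margin_sec)
```

Only build 5816 exceeds the threshold, and that is the run that did not really pass. Build 5524 fits with very little headroom, which is the source of the small false-positive risk.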

Are people okay with reducing the time limit to 90 minutes? The cost is a very slight chance of a false positive (~2h of damage); the benefit is potentially ~7-8h saved over this period, since each of the 15 long failures would be cut off up to 30 minutes earlier.
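The ~7-8h estimate can be reproduced with a little arithmetic. This sketch assumes, as an upper bound, that all 15 failed builds over 90 minutes ran the full 2h before being aborted (the description only says "many/most" did), so the real saving would be somewhat lower. All names are illustrative.

```python
CURRENT_TIMEOUT_SEC = 120 * 60   # existing 2h limit
PROPOSED_TIMEOUT_SEC = 90 * 60   # proposed 90 minute limit

# From the count query above: failed builds that ran longer than 90 minutes.
failed_over_90_min = 15

# Assumption (upper bound): every one of them ran until the 2h timeout.
saved_per_build_sec = CURRENT_TIMEOUT_SEC - PROPOSED_TIMEOUT_SEC
max_saved_hours = failed_over_90_min * saved_per_build_sec / 3600

print(saved_per_build_sec, max_saved_hours)
```

Under that assumption the saving is at most 7.5 hours over the ~35 days, against roughly 2h lost for each run that a shorter timeout fails incorrectly.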
 

Comment 1 by sbasi@chromium.org, May 5 2017

Cc: haddowk@chromium.org
SGTM

Comment 2 by bugdroid1@chromium.org, May 5 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a

commit ac92fe9c923e5d2d2bcddf1d33517969432e4a8a
Author: David Riley <davidriley@chromium.org>
Date: Fri May 05 03:33:22 2017

chromeos_config: reduce moblab_quick timeout to 90 minutes.

BUG= chromium:718660 
TEST=cbuildbot/chromeos_config_unittest

Change-Id: I60782d24f9733183bb9e7dbbe6c3061cab47fc25
Reviewed-on: https://chromium-review.googlesource.com/496791
Tested-by: David Riley <davidriley@chromium.org>
Trybot-Ready: David Riley <davidriley@chromium.org>
Reviewed-by: Simran Basi <sbasi@chromium.org>
Commit-Queue: David Riley <davidriley@chromium.org>

[modify] https://crrev.com/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a/cbuildbot/config_dump.json
[modify] https://crrev.com/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a/cbuildbot/chromeos_config.py

Status: Archived (was: Untriaged)
Archiving P3s older than 1 year with no owner or component.
