Decrease moblab_quick timeout
Issue description

guado_moblab-paladin has had a number of cases where the moblab_quick suite times out at 2h. Reduce the timeout to 90 minutes.

All data here is since the beginning of April (~35 days).

Successful builds normally take ~42 minutes. The duration column below is the average in seconds per status; the other non-aggregated columns show one arbitrary build from each group:

mysql> select build_id, board, bst.status, bst.start_time,
         avg(time_to_sec(timediff(bst.finish_time, bst.start_time))) as 'duration',
         build_config, build_number
       from buildStageTable bst join buildTable bt on bst.build_id = bt.id
       where build_config = 'guado_moblab-paladin'
         and name = 'HWTest [moblab_quick]'
         and bt.last_updated > '2017-04-01'
       group by(bst.status);
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
| build_id | board        | status   | start_time          | duration  | build_config         | build_number |
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
|  1425056 | guado_moblab | fail     | 2017-04-02 07:18:42 | 3562.1186 | guado_moblab-paladin |         5513 |
|  1422874 | guado_moblab | pass     | 2017-04-01 01:03:30 | 2550.4449 | guado_moblab-paladin |         5500 |
|  1427067 | guado_moblab | inflight | 2017-04-03 16:13:28 |      NULL | guado_moblab-paladin |         5526 |
+----------+--------------+----------+---------------------+-----------+----------------------+--------------+
3 rows in set (0.17 sec)

There's only one successful build over 90 minutes, yet 15 of the failed ones took over 90 minutes (and many/most went 2h before they were aborted). In the query below, the 'duration' column holds the count of such builds per status:
mysql> select build_id, board, bst.status, bst.start_time,
         count(time_to_sec(timediff(bst.finish_time, bst.start_time))) as 'duration',
         build_config, build_number
       from buildStageTable bst join buildTable bt on bst.build_id = bt.id
       where build_config = 'guado_moblab-paladin'
         and name = 'HWTest [moblab_quick]'
         and bt.last_updated > '2017-04-01'
         and time_to_sec(timediff(bst.finish_time, bst.start_time)) > 5400
       group by bst.status;
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
| build_id | board        | status | start_time          | duration | build_config         | build_number |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
|  1468554 | guado_moblab | fail   | 2017-04-22 06:12:13 |       15 | guado_moblab-paladin |         5707 |
|  1494127 | guado_moblab | pass   | 2017-05-04 04:00:14 |        1 | guado_moblab-paladin |         5816 |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
2 rows in set (0.17 sec)

The "successful" build actually failed (crbug.com/717336).
https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fguado_moblab-paladin%2F5816%2F%2B%2Frecipes%2Fsteps%2FHWTest__moblab_quick_%2F0%2Fstdout

There are a number of successful builds that took over an hour, up to just under 90 minutes (duration in seconds):

mysql> select build_id, board, bst.status, bst.start_time,
         time_to_sec(timediff(bst.finish_time, bst.start_time)) as 'duration',
         build_config, build_number
       from buildStageTable bst join buildTable bt on bst.build_id = bt.id
       where build_config = 'guado_moblab-paladin'
         and name = 'HWTest [moblab_quick]'
         and bt.last_updated > '2017-04-01'
         and time_to_sec(timediff(bst.finish_time, bst.start_time)) > 3600
         and bst.status = 'pass';
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
| build_id | board        | status | start_time          | duration | build_config         | build_number |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
|  1426860 | guado_moblab | pass   | 2017-04-03 11:41:26 |     5378 | guado_moblab-paladin |         5524 |
|  1429188 | guado_moblab | pass   | 2017-04-04 10:41:44 |     4410 | guado_moblab-paladin |         5536 |
|  1462618 | guado_moblab | pass   | 2017-04-19 18:21:48 |     3807 | guado_moblab-paladin |         5685 |
|  1463297 | guado_moblab | pass   | 2017-04-19 23:30:19 |     4287 | guado_moblab-paladin |         5687 |
|  1465342 | guado_moblab | pass   | 2017-04-20 21:33:14 |     3850 | guado_moblab-paladin |         5694 |
|  1465537 | guado_moblab | pass   | 2017-04-21 00:31:17 |     3701 | guado_moblab-paladin |         5695 |
|  1467965 | guado_moblab | pass   | 2017-04-22 00:27:25 |     3736 | guado_moblab-paladin |         5705 |
|  1472798 | guado_moblab | pass   | 2017-04-25 03:19:31 |     3634 | guado_moblab-paladin |         5727 |
|  1473457 | guado_moblab | pass   | 2017-04-25 07:13:55 |     3986 | guado_moblab-paladin |         5728 |
|  1475784 | guado_moblab | pass   | 2017-04-26 07:09:40 |     4610 | guado_moblab-paladin |         5737 |
|  1482004 | guado_moblab | pass   | 2017-04-28 17:42:08 |     3864 | guado_moblab-paladin |         5764 |
|  1494127 | guado_moblab | pass   | 2017-05-04 04:00:14 |     8073 | guado_moblab-paladin |         5816 |
+----------+--------------+--------+---------------------+----------+----------------------+--------------+
12 rows in set (0.17 sec)

Are people okay with reducing the time limit to 90 minutes, at the very slight risk of a false positive (~2h of damage), to save potentially ~7-8h of wasted time before timing out?
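As a rough check of the trade-off, the sketch below redoes the arithmetic from the numbers quoted in this issue. It is one plausible reading of the ~7-8h figure: it assumes every one of the 15 slow failures ran to the full 2h timeout, so the result is an upper bound. The constant and function names are illustrative only and do not come from chromite.

```python
# Back-of-the-envelope check of the timeout trade-off, using figures quoted
# in this issue. All names here are illustrative, not part of chromite.

OLD_TIMEOUT_S = 2 * 60 * 60   # current 2h timeout
NEW_TIMEOUT_S = 90 * 60       # proposed 90 minute timeout (5400 s)
SLOW_FAILURES = 15            # failed runs that took over 90 minutes
LONGEST_REAL_PASS_S = 5378    # build 5524, slowest pass under 90 minutes


def max_seconds_saved(slow_failures, old_timeout_s, new_timeout_s):
    """Upper bound: assumes every slow failure ran to the old timeout."""
    return slow_failures * (old_timeout_s - new_timeout_s)


saved_hours = max_seconds_saved(SLOW_FAILURES, OLD_TIMEOUT_S, NEW_TIMEOUT_S) / 3600
headroom_s = NEW_TIMEOUT_S - LONGEST_REAL_PASS_S

print(f"time saved over ~35 days: up to {saved_hours:.1f}h")
print(f"headroom for the slowest passing run: {headroom_s}s")
```

Under these assumptions the saving comes to at most 7.5h, and the slowest genuinely passing run (build 5524) would have finished 22 seconds inside the new limit.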
May 5 2017
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a

commit ac92fe9c923e5d2d2bcddf1d33517969432e4a8a
Author: David Riley <davidriley@chromium.org>
Date: Fri May 05 03:33:22 2017

chromeos_config: reduce moblab_quick timeout to 90 minutes.

BUG=chromium:718660
TEST=cbuildbot/chromeos_config_unittest

Change-Id: I60782d24f9733183bb9e7dbbe6c3061cab47fc25
Reviewed-on: https://chromium-review.googlesource.com/496791
Tested-by: David Riley <davidriley@chromium.org>
Trybot-Ready: David Riley <davidriley@chromium.org>
Reviewed-by: Simran Basi <sbasi@chromium.org>
Commit-Queue: David Riley <davidriley@chromium.org>

[modify] https://crrev.com/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a/cbuildbot/config_dump.json
[modify] https://crrev.com/ac92fe9c923e5d2d2bcddf1d33517969432e4a8a/cbuildbot/chromeos_config.py
Jan 10
Archiving P3s older than 1 year with no owner or component.
Comment 1 by sbasi@chromium.org, May 5 2017