caroline gets canceled because the build takes too long to finish
Issue description

There are only 3 green M58 builds and 9 green M57 builds for caroline in the past month.

M58: https://cros-goldeneye.corp.google.com/chromeos/console/listBuild?boards=caroline&milestone=58&chromeOsVersion=&chromeVersion=&startTimeFrom=&startTimeTo=#/
M57: https://cros-goldeneye.corp.google.com/chromeos/console/listBuild?boards=caroline&milestone=57&chromeOsVersion=&chromeVersion=&startTimeFrom=&startTimeTo=#%2F

Example failure in M58: https://uberchromegw.corp.google.com/i/chromeos/builders/caroline-release/builds/456

Logs:

@@@BUILD_STEP@PaygenTestBeta@@@
************************************************************
@@@STEP_LINK@stdout-->stdio@https://luci-logdog.appspot.com/v/?s=chromeos%2Fbb%2Fchromeos%2Fcaroline-release%2F456%2F%2B%2Frecipes%2Fsteps%2FPaygenTestBeta%2F0%2Fstdout@@@
** Start Stage PaygenTestBeta - Fri, 03 Mar 2017 07:40:59 -0800 (PST)
**
** Stage that schedules the payload tests.
************************************************************
07:40:59: INFO: Running cidb query on pid 20093, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7fe5983c7dd0>
Preconditions for the stage successfully met. Beginning to execute stage...
07:40:59: INFO: Running cidb query on pid 20093, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7fe598276090>
07:40:59: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpfNJ632/tmpXa5EcM/temp_summary.json --raw-cmd --task-name caroline-release/R58-9334.0.0-paygen_au_beta --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 14400 --io-timeout 14400 --hard-timeout 14400 --expiration 1200 '--tags=priority:Build' '--tags=suite:paygen_au_beta' '--tags=build:caroline-release/R58-9334.0.0' '--tags=task_name:caroline-release/R58-9334.0.0-paygen_au_beta' '--tags=board:caroline' -- /usr/local/autotest/site_utils/run_suite.py --build caroline-release/R58-9334.0.0 --board caroline --suite_name paygen_au_beta --pool bvt --file_bugs True --priority Build --timeout_mins 180 --retry True --suite_min_duts 2 -c
Autotest instance: cautotest
03-03-2017 [07:41:04] Submitted create_suite_job rpc
03-03-2017 [07:41:11] Created suite job: http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=104589734
@@@STEP_LINK@Link to suite@http://cautotest.corp.google.com/afe/#tab_id=view_job&object_id=104589734@@@
--create_and_return was specified, terminating now.
Will return from run_suite with status: OK
07:41:13: INFO: RunCommand: /b/cbuild/internal_master/chromite/third_party/swarming.client/swarming.py run --swarming chromeos-proxy.appspot.com --task-summary-json /tmp/cbuildbot-tmpfNJ632/tmpZEaMRP/temp_summary.json --raw-cmd --task-name caroline-release/R58-9334.0.0-paygen_au_beta --dimension os Ubuntu-14.04 --dimension pool default --print-status-updates --timeout 14400 --io-timeout 14400 --hard-timeout 14400 --expiration 1200 '--tags=priority:Build' '--tags=suite:paygen_au_beta' '--tags=build:caroline-release/R58-9334.0.0' '--tags=task_name:caroline-release/R58-9334.0.0-paygen_au_beta' '--tags=board:caroline' -- /usr/local/autotest/site_utils/run_suite.py --build caroline-release/R58-9334.0.0 --board caroline --suite_name paygen_au_beta --pool bvt --file_bugs True --priority Build --timeout_mins 180 --retry True --suite_min_duts 2 -m 104589734
08:02:49: INFO: Refreshing due to a 401 (attempt 1/2)
08:02:49: INFO: Refreshing access_token
09:08:14: INFO: Refreshing due to a 401 (attempt 1/2)
09:08:14: INFO: Refreshing access_token
@@@STEP_FAILURE@@@
09:50:44: ERROR: Timeout occurred- waited 27462 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
09:50:44: INFO: Running cidb query on pid 27201, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fe5983b4b50>
@@@STEP_FAILURE@@@
09:50:45: ERROR: Timeout occurred- waited 27805 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
09:50:46: INFO: Running cidb query on pid 27201, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fe59d3bab90>
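To see where the time went, the timestamps in the log above can be compared with the reported wait. The sketch below is only a rough reading of the log: it assumes the "waited 27462 seconds" figure is counted from the start of the build (the log does not say so explicitly) and that the stage start and failure fall on the same day. The function name stage_elapsed is made up for illustration.

```python
from datetime import datetime, timedelta

# Values copied from the PaygenTestBeta log above.
STAGE_START = "07:40:59"    # "Start Stage PaygenTestBeta"
STAGE_FAILED = "09:50:44"   # first "Timeout occurred" line
WAITED_SECONDS = 27462      # "waited 27462 seconds"


def stage_elapsed(start: str, end: str) -> timedelta:
    """Elapsed time between two same-day HH:MM:SS timestamps."""
    fmt = "%H:%M:%S"
    return datetime.strptime(end, fmt) - datetime.strptime(start, fmt)


elapsed = stage_elapsed(STAGE_START, STAGE_FAILED)
waited = timedelta(seconds=WAITED_SECONDS)

# Assumption: the wait is measured from build start, so the difference
# is the time consumed before PaygenTestBeta began.
before_stage = waited - elapsed

print(f"time in PaygenTestBeta: {elapsed}")
print(f"total wait reported:    {waited}")
print(f"spent before the stage: {before_stage}")
```

Under that assumption, the stage accounts for roughly two of the seven and a half hours, which suggests looking at the earlier stages as well.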
,
Mar 8 2017
,
Mar 8 2017
+infra deputy to get some attention. Caroline is an Android N device targeted for mid-April, so we need some green builds ASAP to get CTS runs. There is only one run in R57 for caroline: https://wmatrix.googleplex.com/unfiltered?suites=arc-gts&days_back=30&releases=57
,
Mar 8 2017
The build was aborted because it took too long. We run 3 release builds a day, so each should take no more than 8 hours (master + slaves). Some stages of this build take too long. Please investigate whether the long run time is expected, or whether some unnecessary stages can be removed from the build.
04:19:20: ERROR: Timeout occurred- waited 27835 seconds, failing. Timeout reason: This build has reached the timeout deadline set by the master. Either this stage or a previous one took too long (see stage timing historical summary in ReportStage) or the build failed to start on time.
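The 8-hour figure follows from running 3 release builds per day (24 / 3 = 8 hours, or 28800 seconds). As a minimal sketch, the wait reported in a timeout line can be pulled out and compared with that budget. The names BUDGET_SECONDS and waited_seconds are made up for illustration, and how the master actually derives its deadline is not shown in this thread.

```python
import re

BUILDS_PER_DAY = 3
# 24 hours split evenly across the daily release builds: 28800 seconds.
BUDGET_SECONDS = 24 * 3600 // BUILDS_PER_DAY

# Matches the wording of the timeout message quoted in this bug.
TIMEOUT_RE = re.compile(r"Timeout occurred- waited (\d+) seconds")


def waited_seconds(line):
    """Return the wait in seconds from a timeout log line, or None."""
    match = TIMEOUT_RE.search(line)
    return int(match.group(1)) if match else None


line = "04:19:20: ERROR: Timeout occurred- waited 27835 seconds, failing."
waited = waited_seconds(line)
remaining = BUDGET_SECONDS - waited

print(f"budget: {BUDGET_SECONDS} s, waited: {waited} s, "
      f"remaining: {remaining} s ({waited / 3600:.2f} h used)")
```

The reported wait sits just under the 8-hour budget, which is consistent with the build being cut off by the master's deadline rather than by the stage's own timeout.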
,
Mar 8 2017
,
Mar 8 2017
This is blocking the CTS run in the lab on caroline. +igo, pyeh, yueherngl to find an owner.
,
Mar 10 2017
,
Mar 10 2017
I see some errors like this in the autoserv logs:

03/10 10:42:54.838 ERROR| dev_server:2434| No devserver has the capacity to be selected.
03/10 10:43:49.258 ERROR| base_utils:0280| [stderr] mux_client_request_session: read from master failed: Broken pipe
03/10 10:43:56.483 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(471)] Forcing an update by setting app_version to ForcedUpdate.
03/10 10:43:56.483 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(473)] Initiating update check and install.
03/10 10:43:56.565 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(502)] Waiting for update to complete.
03/10 11:00:51.938 ERROR| base_utils:0280| [stderr] [0310/110051:ERROR:update_engine_client.cc(217)] Update failed, current operation is UPDATE_STATUS_IDLE, last error code is ErrorCode::kDownloadTransferError(9)

https://sponge.corp.google.com/target?tab=Output+Files&sortBy=STATUS&show=ALL&id=d040d7cf-6a8a-445c-b641-5ce9f3ef2783&target=caroline-release/R59-9355.0.0/paygen_au_dev/autoupdate_EndToEndTest_paygen_au_dev_full_9355.0.0&searchFor=label:parent_job_id:105900921+user:chromeos-test

I think I found another crash log earlier in which the payloads did not seem to be downloading properly either.
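In the excerpt above, every line forwarded from stderr is tagged ERROR by autoserv, including update_engine_client messages whose own level is INFO. A small filter can separate the lines that are likely real errors. This is a sketch based only on the line format visible in this excerpt; the regular expressions and the function name genuine_errors are assumptions, not part of autotest.

```python
import re

# Outer autoserv prefix, e.g. "03/10 10:42:54.838 ERROR| dev_server:2434| ..."
AUTOSERV_RE = re.compile(
    r"^(?P<ts>\d{2}/\d{2} \d{2}:\d{2}:\d{2}\.\d{3})\s+(?P<level>\w+)\|\s*"
    r"(?P<module>\w+):(?P<lineno>\d+)\|\s*(?P<msg>.*)$"
)
# Embedded prefix, e.g. "[0310/104355:INFO:update_engine_client.cc(471)]"
EMBEDDED_RE = re.compile(r"\[\d{4}/\d{6}:(?P<level>[A-Z]+):[^\]]+\]")


def genuine_errors(lines):
    """Keep ERROR lines unless their embedded level says otherwise."""
    kept = []
    for line in lines:
        outer = AUTOSERV_RE.match(line)
        if not outer or outer.group("level") != "ERROR":
            continue
        inner = EMBEDDED_RE.search(outer.group("msg"))
        if inner and inner.group("level") != "ERROR":
            continue  # e.g. an INFO message relayed through stderr
        kept.append(line)
    return kept


# Lines copied from the autoserv excerpt above.
LOG = [
    "03/10 10:42:54.838 ERROR| dev_server:2434| No devserver has the capacity to be selected.",
    "03/10 10:43:49.258 ERROR| base_utils:0280| [stderr] mux_client_request_session: read from master failed: Broken pipe",
    "03/10 10:43:56.483 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(471)] Forcing an update by setting app_version to ForcedUpdate.",
    "03/10 10:43:56.483 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(473)] Initiating update check and install.",
    "03/10 10:43:56.565 ERROR| base_utils:0280| [stderr] [0310/104355:INFO:update_engine_client.cc(502)] Waiting for update to complete.",
    "03/10 11:00:51.938 ERROR| base_utils:0280| [stderr] [0310/110051:ERROR:update_engine_client.cc(217)] Update failed, current operation is UPDATE_STATUS_IDLE, last error code is ErrorCode::kDownloadTransferError(9)",
]

kept = genuine_errors(LOG)
for line in kept:
    print(line)
```

On this excerpt the filter leaves the devserver capacity error, the broken pipe, and the final kDownloadTransferError line.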
,
Mar 13 2017
Another open bug, about builder health on M56, M57, and M58, is related to this one: https://bugs.chromium.org/p/chromium/issues/detail?id=695366
I saw that this issue also fails in the "paygentest[canary|dev|beta]" test. Maybe we could merge this one into the one above.
,
Mar 13 2017
Is https://bugs.chromium.org/p/chromium/issues/detail?id=695366 going to fix this as well?
,
Mar 13 2017
Yes, I think this is a dup; I will mark it as such.
Comment 1 by dchan@chromium.org, Mar 7 2017
Status: Assigned (was: Untriaged)