Provision fails, but the status of the DUT is not always 'need_repair'.

Issue description
Bot fails dummy_Pass: https://chrome-swarming.appspot.com/task?id=3e190df118795210&refresh=10&show_raw=1
But its status is need_reset: https://chrome-swarming.appspot.com/bot?id=chromeos-skylab-bot-3112ee71-9e46-4351-b0ff-9eca16975f22&sort_stats=total%3Adesc
,
Jun 19 2018
Is this failing due to the same issue? Symptom is the same but I don't know more than that yet.
https://luci-milo.appspot.com/buildbot/chromeos/kip-paladin/5421
https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos%2Fkip-paladin%2F5421%2F%2B%2Frecipes%2Fsteps%2FHWTest__provision_%2F0%2Fstdout
Seems to have started failing with #5417.
,
Jun 19 2018
#5417 failed for a different reason. dummy_Pass completed successfully.
#5418 success
#5419 platform_ToolchainOptions FAIL: Test Executable Stack 1 failures, Test LOAD Writable and Exec 1 failures
"cat: /usr/local/autotest/.checksum: No such file or directory"
#5420 success
In other words, it's either something that was part of this build or a flake in ToT.
,
Jun 19 2018
Re #2 & #3: different reasons. This bug is not about autotest in prod; it is about our Skylab project, which is still in testing. In other words, failures in prod are not related to this bug. For example, build 5421 fails due to 'chromeos-firmwareupdate failed: from Google_Kip.5216.227.78 to Google_Kip.5216.227.152'. To see the real failure reason, check status.log in the logs of any failed dummy_Pass, for example https://stainless.corp.google.com/browse/chromeos-autotest-results/209870701-chromeos-test/
,
Jun 22 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/infra/lucifer/+/c47ae2b4696089a35f9465a4c9ae7b945108d609

commit c47ae2b4696089a35f9465a4c9ae7b945108d609
Author: Allen Li <ayatane@google.com>
Date: Fri Jun 22 01:47:45 2018

lucifer_run_job: Fix error returning when provision exits non-zero

If provision exits non-zero, we end up not returning an error so we continue to run the test despite provision failing.

This is a bit of a temporary fix; I plan on fixing the atutil API so it returns an error for non-zero exit so we don't need this tricky error handling logic, but I need to make sure I get the semantics right first so I don't end up with new weird edge cases that callers have to handle.

BUG= chromium:853026
TEST=None

Change-Id: I9ed10ce6c33c310984e8062e2d9c53851d1f50e8
Reviewed-on: https://chromium-review.googlesource.com/1105507
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/c47ae2b4696089a35f9465a4c9ae7b945108d609/src/lucifer/cmd/lucifer_run_job/autotest.go
,
Jun 22 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/infra/lucifer/+/12b1297a23932a2ce7c2d1fb9bc88cae93a48103

commit 12b1297a23932a2ce7c2d1fb9bc88cae93a48103
Author: Allen Li <ayatane@google.com>
Date: Fri Jun 22 19:44:24 2018

atutil: Clean up API

This commit makes some functional changes to the atutil public API to make it easier to use and harder to misuse (see bug for example of misuse).

1. RunAutoserv now returns an error if autoserv exits non-zero.
2. Result has a new Started field to differentiate errors where autoserv was not started.
3. All of the places that call RunAutoserv had to handle the non-zero exit case; these handlers have been removed since they are no longer needed.

Also:

4. runTest now calls appendJobFinished even if readTestsFailed returns an error. appendJobFinished is independent of readTestsFailed. This was done in the same commit because it touches the same code and would be annoying to commit separately.

BUG= chromium:853026
TEST=None

Change-Id: I5bee30b45c2be2d46dd8b9c8c6330bb0fd23f4da
Reviewed-on: https://chromium-review.googlesource.com/1107269
Commit-Ready: Allen Li <ayatane@chromium.org>
Tested-by: Allen Li <ayatane@chromium.org>
Reviewed-by: Prathmesh Prabhu <pprabhu@chromium.org>

[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/autotest/atutil/types.go
[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/cmd/lucifer_run_job/autotest.go
[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/cmd/lucifer_admin_task/main.go
[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/autotest/atutil/atutil.go
[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/autotest/atutil/os.go
[modify] https://crrev.com/12b1297a23932a2ce7c2d1fb9bc88cae93a48103/src/lucifer/cmd/lucifer_run_job/task.go
,
Jul 17
Done?
,
Jul 17
Comment 1 by ayatane@chromium.org, Jun 19 2018
Status: Started (was: Untriaged)