Swarming: enable task to signal new DUT_FAILURE task state
Issue description
This new task state would have the following properties:
- It's essentially a failed task.
- It marks that there was a hardware failure, so there are no valid results (as opposed to a test failure).
- As such (the twist), it will be retried the same way as a BOT_DIED failure.
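To illustrate the retry property described above, here is a minimal sketch, not Swarming's actual code. The `TaskState` enum, the `RETRIABLE_STATES` set, the `should_retry` helper, and the `max_attempts` limit are all hypothetical names introduced for illustration; only the state names BOT_DIED and DUT_FAILURE come from this issue.

```python
from enum import Enum


class TaskState(Enum):
    # Hypothetical subset of task states, for illustration only.
    COMPLETED = "COMPLETED"
    BOT_DIED = "BOT_DIED"
    DUT_FAILURE = "DUT_FAILURE"


# States that produced no valid results for infrastructure or hardware
# reasons. The proposal is that DUT_FAILURE is retried like BOT_DIED.
RETRIABLE_STATES = frozenset({TaskState.BOT_DIED, TaskState.DUT_FAILURE})


def should_retry(state: TaskState, attempt: int, max_attempts: int = 2) -> bool:
    """Return True if a task that ended in `state` should be scheduled again.

    `attempt` is 1-based; `max_attempts` is an assumed limit, not a value
    taken from Swarming.
    """
    return state in RETRIABLE_STATES and attempt < max_attempts
```

Under these assumptions, a first attempt ending in DUT_FAILURE is retried, while a task that completed with a test failure is not.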
,
Nov 7
+erikchen who I believe was asking about this for android tests a bit ago
,
Nov 7
This is (slightly) related to issue 852040: one is about task input of device state, and this issue is about task output of the device state.
,
Nov 7
Err ignore comment #3, I wanted to refer to another issue but I forget the number.
,
Nov 8
Can you clarify when DUT_FAILURE would be emitted?
,
Nov 8
- Add a new TaskProperties.dut_failure_exit_code
- When the task returns this specific value, the state is set to DUT_FAILURE.
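The two steps above could look roughly like the following. This is a sketch under assumptions, not the real implementation: the `TaskProperties` dataclass models only the proposed `dut_failure_exit_code` field, `resolve_state` is a hypothetical helper, and returning "COMPLETED" for every other exit code is a simplification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TaskProperties:
    # Only the proposed field is modelled here. None means the task did not
    # opt in, so no exit code is treated specially (an assumption).
    dut_failure_exit_code: Optional[int] = None


def resolve_state(properties: TaskProperties, exit_code: int) -> str:
    """Map a finished task's exit code to a task state name (hypothetical)."""
    if (properties.dut_failure_exit_code is not None
            and exit_code == properties.dut_failure_exit_code):
        return "DUT_FAILURE"
    # Simplification: any other exit code is an ordinary completed task,
    # whether it passed or failed.
    return "COMPLETED"
```

For example, with `TaskProperties(dut_failure_exit_code=89)` (89 is an arbitrary illustrative value), an exit code of 89 resolves to DUT_FAILURE and an exit code of 1 resolves to COMPLETED. Because the value is set per task in this sketch, each task would choose its own code.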
,
Nov 13
,
Nov 21
I wonder what "DUT" stands for? Also, why not just reuse the BOT_DIED state for this? Using a specific exit code might conflict with the task's own exit codes, e.g. our test launcher uses code 87 for INFRA_FAILURE and all other codes for normal failure. Can we use another channel to communicate this, e.g. creating some kind of file somewhere?
,
Nov 21
DUT = "Device Under Test"
#8: if your use of 87 is via https://codesearch.chromium.org/chromium/src/third_party/catapult/devil/devil/constants/exit_codes.py?rcl=d69ae20edf9486cf53e4ee007c89a1518b03abab&l=8, note that I expect us to implement support for this in devil.
,
Nov 30
Could this be generalized to INTERRUPTED? Then it would also be usable for tasks that got their bot stolen due to quotascheduler. A "rationale" field could be added to clarify the reason for the interruption.
,
Nov 30
A DUT_FAILURE task is never interrupted; it's exclusively designed to deal with irrecoverable DUT-specific (hardware) issues. It's better not to conflate this with a transient state.
,
Dec 6
,
Dec 6
,
Dec 19
Comment 1 by bpastene@chromium.org
, Nov 7