Improve UX in reporting leftover processes when displaying the webkit_layout_tests results
Issue description
In https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20Tests%20%28dbg%29%281%29/68456 shard #7 of the webkit_layout_tests step runs to completion, but apparently leaves some processes running, so the swarming bot fails the task: https://chromium-swarm.appspot.com/user/task/3cda1661e10ea210
It's not clear how this failure would get reported back to users through the merge script and displayed on the build results page or when we archive and upload the test results, since it's a harness failure, not a test failure (at least in some sense). Of course, in this case, we also didn't archive and upload the results because that was disabled :(. The CL to re-enable the archiving is https://crrev.com/c/1013294 .
Apr 16 2018
I don't know if something needs to be as specific as "leftover child processes", but some way to distinguish between "the harness thought it exited successfully" and "swarming bot code disagreed" would be a start.
Apr 17 2018
Technically, this information is one click away in the swarming task UI. Is it worth the effort to propagate it back to the build page? Many other build steps do not display specific errors and leave it to the logs, and it seems to work OK. What would be the ideal outcome for this example? A more informative error message for shard #7? (I'm assuming we don't want the entire swarming bot rant copied to the step, right?)
Apr 17 2018
The problem I'm trying to solve is that it's too difficult for a user to figure out why their CL was rejected. In this particular case, *no* tests failed, so they're going to be even more confused. Arguably, one way to deal with this would be to make this an Infra failure and turn the step purple, but that's not quite right either, which is why I wrote "unclear" in the initial bug report. I'm not sure what the right thing to do is here, just that we have to find something better than what we're doing now.
Apr 17 2018
Thanks for the explanation. To paraphrase (please correct me if I got it wrong), the issue is to improve the UX so that the error is conveyed more clearly when something other than an actual test failed (but still not due to infra). In particular, it's not necessarily the build page - maybe the logs need to contain more info or have even more specific instructions on how to debug it. I'm adding it to the chrome-client-infra-backlog for tracking, and updating the title slightly (again, feel free to correct if I got it wrong).
Apr 17 2018
Yup, that's about right.
Comment 1 by sergeybe...@chromium.org, Apr 16 2018