
Issue 832999

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug


Hotlists containing this issue:
chrome-client-infra-backlog

Improve UX in reporting leftover processes when displaying the webkit_layout_tests results

Project Member Reported by dpranke@chromium.org, Apr 14 2018

Issue description

In https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/Win7%20Tests%20%28dbg%29%281%29/68456

shard #7 of the webkit_layout_tests step runs to completion, but apparently leaves some processes running, so the swarming bot fails the task:

https://chromium-swarm.appspot.com/user/task/3cda1661e10ea210

It's not clear how this failure would get reported back to users through the merge script and displayed on the build results page or when we archive and upload the test results, since it's a harness failure, not a test failure (at least in some sense). 

Of course, in this case, we also didn't archive and upload the results because that was disabled :(. The CL to re-enable the archiving is https://crrev.com/c/1013294 .
 
I'm not sure what this issue is requesting... Do you want the build step to say that shard #7 failed because of leftover child processes?

Perhaps this calls for more informative message passing from a swarming test task back to the builder recipe?
I don't know if something needs to be as specific as "leftover child processes", but some way to distinguish between "the harness thought it exited successfully" and "swarming bot code disagreed" would be a start.
Technically, this information is one click away in the swarming task UI. Is it worth the effort to propagate it back to the build page? Many other build steps do not display specific errors and leave it to the logs, and it seems to work OK.

What would be the ideal outcome for this example? A more informative error message for shard #7? (I'm assuming we don't want the entire swarming bot rant copied to the step, right?)
The problem I'm trying to solve is that it's too difficult for a user to figure out why their CL was rejected. In this particular case, *no* tests failed, so they're going to be even more confused.

Arguably, one way to deal with this would be to make this an Infra failure and turn the step purple, but that's not quite right either, which is why I wrote "unclear" in the initial bug report. I'm not sure what the right thing to do is here, just that we have to find something better than what we're doing now.
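
To make the distinction discussed above concrete, here is a minimal Python sketch of how a shard result might be classified so that "the harness thought it exited successfully" and "swarming bot code disagreed" produce a different message than an ordinary test failure. Everything in it is an assumption for illustration: `ShardResult`, its fields, `Outcome`, `classify` and `summarize` are hypothetical names, not the real swarming result schema, merge script, or recipe API.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    PASSED = "passed"
    TEST_FAILURE = "test failure"
    HARNESS_FAILURE = "harness failure"
    INFRA_FAILURE = "infra failure"


@dataclass
class ShardResult:
    # Hypothetical fields; NOT the real swarming or recipe schema.
    shard_index: int
    harness_exit_code: int   # exit code reported by the test harness
    failed_tests: int        # number of tests the harness reported as failed
    task_failed: bool        # whether the bot marked the task as failed
    internal_failure: bool = False  # whether the infrastructure itself broke


def classify(shard: ShardResult) -> Outcome:
    """Map one shard's result onto a user-facing outcome category."""
    if shard.internal_failure:
        return Outcome.INFRA_FAILURE
    if shard.failed_tests > 0:
        return Outcome.TEST_FAILURE
    # No test failed, yet the harness or the bot reported a problem.
    if shard.harness_exit_code != 0 or shard.task_failed:
        return Outcome.HARNESS_FAILURE
    return Outcome.PASSED


def summarize(shard: ShardResult) -> str:
    """Build a one-line step summary for the build page."""
    outcome = classify(shard)
    text = f"shard #{shard.shard_index}: {outcome.value}"
    if outcome is Outcome.HARNESS_FAILURE and shard.harness_exit_code == 0:
        # The case from this bug: the harness and the bot disagree.
        text += (" (the harness exited with code 0 and no tests failed,"
                 " but the swarming task was marked failed)")
    return text


if __name__ == "__main__":
    # Shard #7 from this report: ran to completion, task failed anyway.
    print(summarize(ShardResult(shard_index=7, harness_exit_code=0,
                                failed_tests=0, task_failed=True)))
```

A one-line summary like this keeps the full bot output in the swarming task log while still telling the user that no test failed. Whether such a category should be colored as a regular failure or as an Infra failure is the open question in this bug, and the sketch does not settle it.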
Status: Available (was: Untriaged)
Summary: Improve UX in reporting leftover processes when displaying the webkit_layout_tests results (was: Unclear how to report leftover processes when displaying the webkit_layout_tests results)
Thanks for the explanation. To paraphrase (please correct me if I got it wrong), the issue is to improve the UX so that the error message is more intuitive and obvious when something other than an actual test failed (but still not due to infra). In particular, it's not necessarily about the build page: maybe the logs need to contain more info or have more specific instructions on how to debug it.

I'm adding it to the chrome-client-infra-backlog for tracking, and updating the title slightly (again, feel free to correct if I got it wrong).
Yup, that's about right.