
Issue 698603

Starred by 2 users

Issue metadata

Status: WontFix
Owner: ----
Closed: Mar 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug




HWTest Stage was executed, but not logged by builder

Reported by jrbarnette@chromium.org, Mar 5 2017

Issue description

Comment 1 by autumn@chromium.org, Mar 15 2017

Does this mean it was a one-off? "I suspect the reason for logging difference is related to the Paygen failures." Or is there more investigation needed here?

<sigh> Trying to answer the question in c#1, I went digging
a bit more.  The answer is that the Paygen failures and the
lack of HWTest logging are a fabulous example of this:
    https://en.wikipedia.org/wiki/Correlation_does_not_imply_causation

I went and looked at the cases that were missing the HWTest
logging.  In all cases, the build run time was about 7 hours,
45 minutes.  I'm going to guess that this means that the builds
were timing out.  Most likely, that was because DUTs were constantly
failing provision, and tests were constantly retrying.  That
led lots of stuff to take too long, leading to timeouts.

It's a well-understood, hard-to-fix problem that builder timeouts
lead to undiagnosable build failures, starting with the basic problem
that the builder can't even say it was killed by the timeout.

We should dig a bit more:  I'm not satisfied with an answer that says
"we can't make the failure more debuggable", not because I don't
believe that, but because I need to be convinced that the failure
can't be made rarer:
  * Why did the provisioning job failures not lead to suite failure
    sooner?
  * Why were the HWTest phases allowed to run so long that the
    hard buildbot timeouts kicked in?

I think a while back we increased the timeouts for HWTest phases.
Possibly, we should revisit that choice.

The HWTest stage suites should time out before the builder timeout is hit, IF they started on time. However, if they started late because something else was slow, their timers might still be running when the builder timeout is hit.
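
To illustrate the timing argument above, here is a minimal sketch, not the actual builder code. Every name and value in it is hypothetical: the builder timeout is assumed to match the roughly 7 hour 45 minute run time seen in the affected builds, and the 4 hour suite timeout and 15 minute margin are made up. It shows how a suite with a fixed timeout can outlive the builder when it starts late, and how clamping the suite timeout to the remaining builder budget would let the suite time out first.

```python
from datetime import timedelta

# Hypothetical values for illustration only; not the real builder configuration.
BUILDER_TIMEOUT = timedelta(hours=7, minutes=45)  # assumed hard builder limit
HWTEST_TIMEOUT = timedelta(hours=4)               # assumed fixed suite timeout


def outlives_builder(start_offset, stage_timeout=HWTEST_TIMEOUT,
                     builder_timeout=BUILDER_TIMEOUT):
    """Return True if the suite's own timer would fire after the builder limit.

    start_offset is how long after build start the suite began running.
    """
    return start_offset + stage_timeout > builder_timeout


def clamped_timeout(start_offset, stage_timeout=HWTEST_TIMEOUT,
                    builder_timeout=BUILDER_TIMEOUT,
                    margin=timedelta(minutes=15)):
    """Shrink the suite timeout so it expires `margin` before the builder limit."""
    remaining = builder_timeout - start_offset - margin
    # Never return a negative timeout; zero means there is no time left to run.
    return max(timedelta(0), min(stage_timeout, remaining))


if __name__ == "__main__":
    for hours in (3, 5):
        start = timedelta(hours=hours)
        print(start, outlives_builder(start), clamped_timeout(start))
```

With these assumed numbers, a suite starting 3 hours in finishes its timer inside the builder limit, while one starting 5 hours in does not unless its timeout is clamped to 2 hours 30 minutes.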

Comment 4 by autumn@chromium.org, Mar 21 2017

Status: WontFix (was: Available)
Known issue: builder timeouts won't generate logs.
