beaglebone paladin timed out because no output to logdog for 9000 seconds
Issue description

Has only happened once, so low priority.

https://uberchromegw.corp.google.com/i/chromeos/builders/beaglebone-paladin/builds/12256/steps/steps/logs/stdio

Timed out after 2+ hours. Towards the end, we see:

@@@STEP_CLOSED@@@
2017/01/24 07:25:50 proto: duplicate proto type registered: google.protobuf.Duration
2017/01/24 07:25:50 proto: duplicate proto type registered: google.protobuf.Timestamp
2017/01/24 07:25:50 proto: duplicate proto type registered: google.protobuf.Empty
command timed out: 9000 seconds without output, attempting to kill
process killed by signal 9
program finished with exit code -1
elapsedTime=9009.788312
-------------------------------------

Note that the build started around 7:25, and the timeout in the end is around 9:30. So, the builder logged the duplicate proto warning and then went silent for ~2 hours, at which point it was killed.
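As a quick sanity check on the timeline, the arithmetic can be sketched in a few lines of Python. This assumes the step's clock started at about the time of the last logged line (07:25:50), which the log itself does not confirm. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Timestamp of the last line logged before the builder went silent.
last_line = datetime(2017, 1, 24, 7, 25, 50)

# elapsedTime reported when the step was killed.
elapsed = timedelta(seconds=9009.788312)

# Assumption: the elapsed clock started at roughly the last logged line.
killed_at = last_line + elapsed

print("killed at about", killed_at.strftime("%H:%M:%S"))
print("silent for about", round(elapsed.total_seconds() / 3600, 2), "hours")
```

Under that assumption, the 9000 second no-output limit works out to about 2.5 hours, which would put the kill close to 09:56.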
,
Jan 24 2017
relevant master build: https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/13446
,
Jan 24 2017
,
Jan 25 2017
,
Jan 25 2017
,
Jan 25 2017
OK, we do need to look at this. + chrome-infra trooper. nodir@: Do you have any more visibility into why logdog's log stream dried up?
,
Jan 25 2017
Not much. I see that the logdog stream for recipes stdout does not exist: https://luci-logdog.appspot.com/rpcexplorer/services/logdog.Logs/Get?request={%20%20%20%20%22project%22:%20%22chromeos%22,%20%20%20%20%22path%22:%20%22bb/chromeos/stumpy-paladin/27232/+/recipes/stdout%22} Not sure why logdog appears to hang.
,
Jan 25 2017
,
Jan 25 2017
FWIW the proto warnings are irrelevant
,
Jan 25 2017
LogDog / Annotee are just proxies for an underlying command. In this case, it's the recipe engine's checkout. If that checkout fails, or hangs, or doesn't produce any logs, LogDog's not going to show any output either.

From a successful build:

2017/01/24 09:56:58 proto: duplicate proto type registered: google.protobuf.Duration
2017/01/24 09:56:58 proto: duplicate proto type registered: google.protobuf.Timestamp
2017/01/24 09:56:58 proto: duplicate proto type registered: google.protobuf.Empty
INFO:root:Running ['git', 'rev-parse', '--verify', '836fb28be142e5385a8715e4c61e23e8ee505ccd^{commit}']

This isn't a LogDog line. In fact, LogDog tooling showed exactly what it always showed. I'm not sure why you think this is a LogDog-related timeout, since the LogDog bootstrap step finished and the LogDog-relevant part of the output seems consistent.

At a glance, my guess is that Git hung when the recipe engine was performing its recipe checkout.

Assigning back to reporter in case pprabhu@ wants to take a deeper look; I don't think there's anything else for me to weigh in on here.
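To illustrate the point about proxies, here is a minimal Python sketch of a wrapper that relays a child command's output and kills the child if it stays silent for too long. It is not the buildbot or LogDog implementation; `run_with_silence_timeout` and its parameters are hypothetical names chosen for this example. The wrapper can only forward what the child prints, so a hung child looks like a silent wrapper.

```python
import subprocess
import sys
import threading
import time


def run_with_silence_timeout(cmd, silence_timeout):
    """Run cmd, relay its output, and kill it after too long without output.

    Returns (exit_code, timed_out). Hypothetical helper for illustration.
    """
    proc = subprocess.Popen(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    # Time of the most recent output line (a list so the thread can update it).
    last_output = [time.monotonic()]

    def pump():
        # Relay each line from the child and note when it arrived.
        for line in proc.stdout:
            last_output[0] = time.monotonic()
            sys.stdout.write(line)

    reader = threading.Thread(target=pump, daemon=True)
    reader.start()

    timed_out = False
    while proc.poll() is None:
        if time.monotonic() - last_output[0] > silence_timeout:
            timed_out = True
            proc.kill()  # SIGKILL on POSIX systems
            break
        time.sleep(0.05)

    proc.wait()
    reader.join(timeout=1)
    return proc.returncode, timed_out
```

A wrapper like this could be pointed at any command suspected of hanging, for example `run_with_silence_timeout(["git", "fetch"], 600)`, where the 600 second limit is an arbitrary illustrative value. A much shorter limit than 9000 seconds would surface a hang sooner.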
,
Jan 25 2017
Comment 1 by pprabhu@chromium.org, Jan 24 2017