
Issue 803998


Issue metadata

Status: Fixed
Owner:
Closed: Feb 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




Tricium: Failed to call Tracker.WorkerDone ... missing output hash

Project Member Reported by qyears...@chromium.org, Jan 19 2018

Issue description

This is a common error that appears in the logs of tricium-dev:

Example:
2018-01-19 14:27:25.719 PST
[tracker] Worker done request (run ID: 5713990399295488, Worker: Spacey_UBUNTU)
2018-01-19 14:27:25.719 PST
[tracker] Failed to call Tracker.WorkerDone :: {"error":"rpc error: code = InvalidArgument desc = missing output hash"

Error is raised here:
https://cs.chromium.org/chromium/infra/go/src/infra/tricium/appengine/tracker/rpc_worker_done.go?l=37

This could be related to (or possibly the cause of) bug 803996.
 

Comment 1 by emso@chromium.org, Jan 22 2018

Labels: Tricium
Looks like this is due to the addition of aborted tasks which will never have isolated outputs: https://cs.chromium.org/chromium/infra/go/src/infra/tricium/appengine/driver/rpc_collect.go?rcl=d1c711fb64ba29f585e99b278a6e15387d1cff87&l=89

The validation in the tracker needs to accept such requests here: https://cs.chromium.org/chromium/infra/go/src/infra/tricium/appengine/tracker/rpc_worker_done.go?rcl=d1c711fb64ba29f585e99b278a6e15387d1cff87&l=36
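
For illustration only, the relaxed validation could look roughly like the sketch below. This is not the actual code in rpc_worker_done.go: the State type, the WorkerDoneRequest struct, its field names, and validateWorkerDone are hypothetical stand-ins. Only the "missing output hash" message and the sample run ID and worker name come from the log above.

```go
package main

import "fmt"

// State is a hypothetical stand-in for the worker state reported to the tracker.
type State int

const (
	Success State = iota
	Failure
	Aborted
)

// WorkerDoneRequest is a simplified, hypothetical request shape.
type WorkerDoneRequest struct {
	RunID              int64
	Worker             string
	State              State
	IsolatedOutputHash string
}

// validateWorkerDone rejects incomplete requests, but accepts a missing
// output hash when the worker was aborted, since an aborted task never
// produces isolated output.
func validateWorkerDone(req *WorkerDoneRequest) error {
	if req.RunID == 0 {
		return fmt.Errorf("missing run ID")
	}
	if req.Worker == "" {
		return fmt.Errorf("missing worker")
	}
	if req.IsolatedOutputHash == "" && req.State != Aborted {
		return fmt.Errorf("missing output hash")
	}
	return nil
}

func main() {
	// An aborted worker without an output hash passes validation.
	aborted := &WorkerDoneRequest{RunID: 5713990399295488, Worker: "Spacey_UBUNTU", State: Aborted}
	fmt.Println("aborted:", validateWorkerDone(aborted))

	// A failed worker without an output hash is still rejected.
	failed := &WorkerDoneRequest{RunID: 5713990399295488, Worker: "Spacey_UBUNTU", State: Failure}
	fmt.Println("failed:", validateWorkerDone(failed))
}
```

With a check like this, the retried tasks in the queue would stop failing validation for aborted workers, while requests for non-aborted workers would still need an output hash.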
Additional note: The tasks in the tracker-queue task queue that are being repeatedly retried are all hitting this error.
Owner: qyears...@chromium.org
Status: Assigned (was: Available)
Related change: https://chromium-review.googlesource.com/c/infra/infra/+/848777

This is part of supporting aborted tasks; see bug 798092.
Project Member

Comment 4 by bugdroid1@chromium.org, Feb 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/d6542b90860752f9eff3354c526399fe93d21950

commit d6542b90860752f9eff3354c526399fe93d21950
Author: Quinten Yearsley <qyearsley@chromium.org>
Date: Thu Feb 22 17:35:16 2018

Consider worker aborted the same as worker failed

This CL changes it so that:

 - In a tracker "worker done" request, aborted workers
   are treated the same as failed workers, i.e.  they cause their
   functions and workflows to be considered failure.
 - In a driver "collect" request, it's OK if there's no isolated
   input hash, since this is expected for tasks for aborted workers.
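
As a rough sketch of the first point (not the code from this CL), treating aborted workers as failed when deriving a function's state might look like the following. The State values, isFailure, and functionState are hypothetical names chosen for illustration, and the rule that a function stays running until all of its workers finish is an assumption.

```go
package main

import "fmt"

// State is a hypothetical stand-in for worker and function states.
type State int

const (
	Pending State = iota
	Running
	Success
	Failure
	Aborted
)

var stateNames = []string{"PENDING", "RUNNING", "SUCCESS", "FAILURE", "ABORTED"}

// String returns a readable name for the state.
func (s State) String() string {
	if int(s) >= 0 && int(s) < len(stateNames) {
		return stateNames[s]
	}
	return fmt.Sprintf("State(%d)", int(s))
}

// isFailure reports whether a finished worker counts as failed.
// Aborted is deliberately treated the same as Failure.
func isFailure(s State) bool {
	return s == Failure || s == Aborted
}

// functionState derives a function's state from its workers' states:
// still running if any worker is unfinished, failed if any finished
// worker failed or was aborted, successful otherwise.
func functionState(workers []State) State {
	failed := false
	for _, s := range workers {
		if s == Pending || s == Running {
			return Running
		}
		if isFailure(s) {
			failed = true
		}
	}
	if failed {
		return Failure
	}
	return Success
}

func main() {
	fmt.Println(functionState([]State{Success, Aborted}))
	fmt.Println(functionState([]State{Success, Success}))
}
```

The same rule could be applied one level up, deriving a workflow's state from its functions' states.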

Bug:  803998 
Change-Id: I461259a27e446ad48bc635f38fd340a69e4a03c6
Reviewed-on: https://chromium-review.googlesource.com/929869
Reviewed-by: Marc-Antoine Ruel <maruel@chromium.org>
Commit-Queue: Quinten Yearsley <qyearsley@chromium.org>

[modify] https://crrev.com/d6542b90860752f9eff3354c526399fe93d21950/go/src/infra/tricium/appengine/driver/driver.infra_testing
[modify] https://crrev.com/d6542b90860752f9eff3354c526399fe93d21950/go/src/infra/tricium/appengine/driver/rpc_collect_test.go
[modify] https://crrev.com/d6542b90860752f9eff3354c526399fe93d21950/go/src/infra/tricium/appengine/tracker/rpc_worker_done.go
[modify] https://crrev.com/d6542b90860752f9eff3354c526399fe93d21950/go/src/infra/tricium/appengine/driver/rpc_collect.go
[modify] https://crrev.com/d6542b90860752f9eff3354c526399fe93d21950/go/src/infra/tricium/appengine/tracker/rpc_worker_done_test.go

Status: Fixed (was: Assigned)
