
Issue 890697

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 1
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




lakitu-gpu-paladin failing since #4548 with "fatal: The remote end hung up unexpectedly"

Project Member Reported by yamaguchi@chromium.org, Oct 1

Issue description

lakitu-gpu-paladin has been failing after #4548, 10 times in a row.

https://cros-goldeneye.corp.google.com/chromeos/legoland/builderHistory?buildConfig=lakitu-gpu-paladin&buildBranch=master&startCursor=Ci0SJ2oQc35jci1idWlsZGJ1Y2tldHITCxIFQnVpbGQYoJ_Zkp3V-P17DBgAIAA%3D

example log
https://luci-logdog.appspot.com/v/?s=chromeos/bb/chromeos/lakitu-gpu-paladin/4558/+/recipes/steps/InitialCheckout/0/stdout

Fetching project chromeos/overlays/overlay-lulu-private
Fetching project chromiumos/third_party/rootdev
error: object file /b/c/cbuild/repository/.repo/projects/infra_internal/skylab_inventory.git/objects/02/53ee8d28787b0568245ebca35712da1b8dc8a0 is empty
error: object file /b/c/cbuild/repository/.repo/projects/infra_internal/skylab_inventory.git/objects/02/53ee8d28787b0568245ebca35712da1b8dc8a0 is empty
fatal: loose object 0253ee8d28787b0568245ebca35712da1b8dc8a0 (stored in /b/c/cbuild/repository/.repo/projects/infra_internal/skylab_inventory.git/objects/02/53ee8d28787b0568245ebca35712da1b8dc8a0) is corrupt
fatal: The remote end hung up unexpectedly
[W git.go:283] Transient error string identified in STDERR: "fatal: The remote end hung up unexpectedly\n"
[W git.go:294] Retrying after 4.5s (rc=128): transient error string encountered
error: object file /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b is empty
error: object file /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b is empty
fatal: loose object 77d1335457b46df65411318124d5c728f2c4602b (stored in /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b) is corrupt
fatal: The remote end hung up unexpectedly
[W git.go:283] Transient error string identified in STDERR: "fatal: The remote end hung up unexpectedly\n"
[W git.go:294] Retrying after 4.5s (rc=128): transient error string encountered
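
For triage, note that the retry logic in the log keys on the last line ("The remote end hung up unexpectedly"), while the lines before it point at empty object files in the builder's local checkout. A minimal sketch of a classifier that checks for local corruption first, assuming only the message formats seen in this log; the function name and the category strings are hypothetical and are not part of git.go:

```python
import re

# Patterns copied from the log excerpt in this issue.
# The category names returned below are hypothetical.
CORRUPT_PATTERNS = (
    re.compile(r"^error: object file (?P<path>\S+) is empty$"),
    re.compile(
        r"^fatal: loose object (?P<sha>[0-9a-f]{40}) "
        r"\(stored in (?P<path>\S+)\) is corrupt$"
    ),
)
TRANSIENT_MARKER = "fatal: The remote end hung up unexpectedly"


def classify_git_stderr(stderr: str) -> str:
    """Return 'corrupt_local_object', 'transient', or 'unknown'."""
    lines = [line.strip() for line in stderr.splitlines()]
    # Local corruption takes priority: retrying the fetch cannot
    # repair an object file that is already empty on disk.
    if any(p.match(line) for line in lines for p in CORRUPT_PATTERNS):
        return "corrupt_local_object"
    if TRANSIENT_MARKER in lines:
        return "transient"
    return "unknown"
```

With a check like this, the output above would be reported as local corruption instead of being retried as a transient network failure.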
 
Cc: dgarr...@chromium.org jclinton@chromium.org
Owner: dgarr...@chromium.org
Status: Assigned (was: Available)
Is this considered an infra issue or a DUT issue, given that it says "remote end hung up"?
Cc: wonderfly@chromium.org
Owner: wonderfly@chromium.org
It's probably a lakitu-gpu product issue.

If you need to, mark it as experimental (link this bug), and hand off to the lakitu team. I'm handing to wonderfly@ because I've forgotten the proper Lakitu escalation path.


Cc: apronin@chromium.org
Summary: lakitu-gpu-paladin failing since #4548 with "fatal: The remote end hung up unexpectedly" (was: lakitu-gpu-paladin failing since #4548)
https://uberchromegw.corp.google.com/i/chromeos/builders/lakitu-gpu-paladin/builds/4581

Another failure.

error: object file /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b is empty
error: object file /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b is empty
fatal: loose object 77d1335457b46df65411318124d5c728f2c4602b (stored in /b/c/cbuild/repository/.repo/projects/src/platform/ap-daemons.git/objects/77/d1335457b46df65411318124d5c728f2c4602b) is corrupt
fatal: The remote end hung up unexpectedly
[W git.go:283] Transient error string identified in STDERR: "fatal: The remote end hung up unexpectedly\n"

But there are tons of these across various repos. That doesn't sound like lakitu's problem...
Owner: jclinton@chromium.org
FWIW, that particular blob fetches fine for me:

commit 77d1335457b46df65411318124d5c728f2c4602b (m/master, cros-internal/master)
Author: Raju Konduru <raju.konduru@globaledgesoft.com>
Date:   Thu Sep 27 22:48:43 2018 +0530

    ap-daemons: nest cam debug logs in conntrack provider
    
    This CL added additional debug logs in conntrack metric provider and
    wan usage plugin to triage nest cam's over usage issue.
    
    BUG=b:35448008
    TEST=cros_workon_make --board=gale --test ap-daemons
    
    Change-Id: I092faa231ea6e702ebed9ae6c9e3b54152e269cd
    Reviewed-on: https://chrome-internal-review.googlesource.com/687109
    Commit-Ready: Shashidhar Jodatti <jr.shashidhar@globaledgesoft.com>
    Tested-by: Shashidhar Jodatti <jr.shashidhar@globaledgesoft.com>
    Reviewed-by: Kishan Kunduru <kkunduru@chromium.org>

Could this be a flaky git fetch, after which the builder keeps its corrupt state for too long? Tentatively moving to the CI team. I'm not sure who the right owner is, but I highly doubt it's the Lakitu team.
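
One way to check the "builder keeps its corrupt state" hypothesis is to look for zero-byte loose objects in the builder's checkout. A rough sketch, assuming the standard loose-object layout (objects/<2 hex digits>/<remaining hex digits>) that the paths in the log follow; the function name is hypothetical, and running `git fsck` in each project is the more thorough check:

```python
import os
from typing import Iterator

HEX_DIGITS = "0123456789abcdef"


def find_empty_loose_objects(git_dir: str) -> Iterator[str]:
    """Yield paths of zero-byte loose objects under git_dir/objects."""
    objects_dir = os.path.join(git_dir, "objects")
    if not os.path.isdir(objects_dir):
        return
    for fanout in sorted(os.listdir(objects_dir)):
        # Loose objects live in two-hex-digit fan-out directories;
        # this skips 'pack', 'info', and anything else.
        if len(fanout) != 2 or any(c not in HEX_DIGITS for c in fanout):
            continue
        fanout_path = os.path.join(objects_dir, fanout)
        if not os.path.isdir(fanout_path):
            continue
        for name in sorted(os.listdir(fanout_path)):
            path = os.path.join(fanout_path, name)
            if os.path.isfile(path) and os.path.getsize(path) == 0:
                yield path
```

Pointing this at a project directory such as the skylab_inventory.git or ap-daemons.git paths from the log would list the empty object files without touching them.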
I super misread this as a DUT connection issue (quick scanning too many bugs).

I've just wiped the cache dirs on the machine (which includes the working repository), and rebooted it. After that, we should check that puppet is running correctly to ensure that proper credentials are in place.
Owner: dgarr...@chromium.org
Status: Fixed (was: Assigned)
I wiped /b/c (all named cache directories) and the builder seems to have recovered.
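
For reference, a hedged sketch of what wiping a cache root amounts to. This is not the tooling actually used on the builder; the function name and the dry-run default are assumptions made for illustration:

```python
import os
import shutil
from typing import List


def wipe_cache_root(cache_root: str, dry_run: bool = True) -> List[str]:
    """Remove every entry directly under cache_root; return the affected paths.

    With dry_run=True (the default) nothing is deleted and the function
    only reports what would be removed.
    """
    targets = [
        os.path.join(cache_root, name)
        for name in sorted(os.listdir(cache_root))
    ]
    if dry_run:
        return targets
    for path in targets:
        if os.path.isdir(path) and not os.path.islink(path):
            shutil.rmtree(path)  # real directory: remove recursively
        else:
            os.remove(path)  # regular file or symlink
    return targets
```

Calling `wipe_cache_root("/b/c")` first shows which named cache directories would go; passing `dry_run=False` performs the deletion and leaves the root directory itself in place.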
