Issue 781430

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Nov 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 781413
issue 782070




[LUCI-Beta-Bug] fetch_telemetry_binary_dependencies has high failure rate on swarming.

Project Member Reported by sky@chromium.org, Nov 3 2017

Issue description

Owner: no...@chromium.org
Status: Assigned (was: Untriaged)
Nodir, have you noticed this in our migration correctness stats? Anything we can do about this particular error?

Comment 2 by jam@chromium.org, Nov 8 2017

Labels: Pri-0
This is making LUCI unusable. For example,
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng?limit=200 has 53 flakes out of 200 right now
while
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng?numbuilds=200
has 0

(I'll have to opt out in the meantime.)

Comment 3 by jam@chromium.org, Nov 8 2017

Cc: jam@chromium.org

Comment 4 by no...@chromium.org, Nov 8 2017

Status: Started (was: Assigned)
Cc: vadimsh@chromium.org
This is the same bug I hit on a test LUCI CI builder around the same time; see issue 781413.

The fix should be to turn on task service accounts for luci.chromium.try. +vadimsh@

Yeah, except I don't understand why it is flaky rather than always failing. There's something we don't know about telemetry hooks...
Cc: dpranke@chromium.org nedngu...@google.com
Components: Speed>Telemetry
Summary: [LUCI-Beta-Bug] fetch_telemetry_binary_dependencies has high failure rate on swarming. (was: [LUCI-Beta-Bug] Seems like gclient runhooks fails more often on luci builders)
+people who may know specifics about why fetch_telemetry_binary_dependencies is failing on some swarming bots.
That is a sign that cloud storage authentication is not set up properly. But that script is supposed to be run only during "gclient sync"; why is it run on a swarming bot?

Comment 9 by no...@chromium.org, Nov 8 2017

Blockedon: 782070
This is a LUCI build. All LUCI builds run on swarming.
I am adding a task service account.
Blockedon: 781413
https://chromium.googlesource.com/chromium/src/+/infra/config/cr-buildbucket.cfg was updated to include a task service account for all builds on luci.chromium.try. It should help.
Labels: Type-Bug
I meant, it should fix the issue.

Comment 13 by efoo@chromium.org, Nov 9 2017

Labels: LUCI-Blocker-M4
Labels: -Pri-0 Pri-1
There have been no runhooks failures since 4:10pm PDT.
Status: Fixed (was: Started)
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng?limit=200
has no runhooks failures

sorry for taking so long to fix this

Comment 16 by sky@chromium.org, Nov 9 2017

Status: Assigned (was: Fixed)
I'm reopening this as I'm seeing it on try jobs from different patches today:

WARNING:root:Unable to import cv2 due to: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /b/swarming/w/ir/cache/builder/linux/src/third_party/catapult/telemetry/third_party/cv2/lib/cv2_linux_x86_64_85be3046d2fef651206d7daadbd1b34af2a005f5/cv2.so)
WARNING:root:Unable to import psutil due to: No module named psutil
CRITICAL:devil.utils.cmd_helper:STDERR: unable to initialize libusb: -99

https://logs.chromium.org/v/?s=chromium%2Fbuildbucket%2Fcr-buildbucket.appspot.com%2F8963372731254296368%2F%2B%2Fsteps%2Fgclient_runhooks__with_patch_%2F0%2Fstdout

and

https://ci.chromium.org/swarming/task/39bb4957dec18d10?server=chromium-swarm.appspot.com
Cc: iannucci@chromium.org d...@chromium.org
This looks similar, but it is a different failure. It looks like fetch_telemetry_binary_dependencies depends on psutil being present without declaring the dependency anywhere (see go/vpython). I think this means that gclient sync should run hooks via vpython rather than python? https://cs.chromium.org/chromium/tools/depot_tools/gclient.py?q=gclient.py&sq=package:chromium&l=210

+iannucci and +dnj if he has time
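
One way to surface this kind of implicit dependency is to ask the interpreter that runs the hook whether it can find the modules the hook imports. The following is a minimal sketch, assuming Python 3; it is not part of gclient or catapult, and the helper name `missing_modules` is hypothetical. It only detects modules that are absent (like psutil in the log above), not modules that exist but fail to load (like the cv2/GLIBCXX warning).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that this interpreter cannot find."""
    # find_spec() returns None for a top-level module that is not installed,
    # without importing the module itself.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Module name taken from the "No module named psutil" warning in the log.
    for name in missing_modules(["psutil"]):
        print("missing module: %s" % name)
```

Running this under both `python` and `vpython` on the same bot would show whether the two environments differ in what they provide.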
Labels: -Pri-1 Pri-0
I've excluded Chromium LUCI dogfooders from https://chrome-infra-auth.appspot.com/auth/groups/luci-chromium-cq-dogfood
and created a separate group https://chrome-infra-auth.appspot.com/auth/groups/project-chromium-luci-beta
with the end users. To enable again, include project-chromium-luci-beta in luci-chromium-cq-dogfood
Reverted https://chromium-review.googlesource.com/c/chromium/src/+/760878
and added the catapult roller to the LUCI dogfood list so it does not break LUCI again.
Status: Started (was: Assigned)
Labels: -Pri-0 Pri-1
The builds that are running right now, from either before the roll or after the revert, pass runhooks. With the revert and the roller added to the dogfooders, this is no longer an emergency, so P1. I will wait for a few green builds before adding dogfooders back, though.
As expected, CLs that don't include the roll work as intended (WAI): https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng
The catapult roller is now unable to break LUCI, and https://chromium-review.googlesource.com/c/chromium/src/+/762037 would make its runhooks pass.

Comment 23 by bugdroid1@chromium.org, Nov 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/depot_tools/+/0ffcc877a6242e54351557f9a5eb53518c0c381f

commit 0ffcc877a6242e54351557f9a5eb53518c0c381f
Author: Nodir Turakulov <nodir@google.com>
Date: Fri Nov 10 01:53:10 2017

[gclient hooks] add .bat to vpython on windows

Bug:  781430 
Change-Id: Idcba016f78078aa9678b8a246e964b3dcb09a016
Reviewed-on: https://chromium-review.googlesource.com/762389
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Commit-Queue: Nodir Turakulov <nodir@chromium.org>

[modify] https://crrev.com/0ffcc877a6242e54351557f9a5eb53518c0c381f/gclient.py
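
The commit title suggests that on Windows the vpython launcher has to be invoked with an explicit `.bat` suffix when a hook command is started without a shell. A minimal sketch of that idea follows; it is not the actual gclient.py change, and `resolve_hook_command` and its `platform` parameter are hypothetical names used only for illustration.

```python
import sys


def resolve_hook_command(action, platform=sys.platform):
    """Return a copy of a hook action adjusted for the host platform."""
    cmd = list(action)
    # Assumption: on Windows the vpython launcher is a batch file, so the
    # executable name needs the ".bat" suffix when run without a shell.
    if cmd and cmd[0] == "vpython" and platform.startswith("win"):
        cmd[0] = "vpython.bat"
    return cmd
```

Commands whose first element is not `vpython`, and all commands on non-Windows hosts, are returned unchanged.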


Comment 24 by bugdroid1@chromium.org, Nov 10 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/e2d81c351d09346de3f27cb784e7d2a6049f0c0c

commit e2d81c351d09346de3f27cb784e7d2a6049f0c0c
Author: Nodir Turakulov <nodir@google.com>
Date: Fri Nov 10 07:47:27 2017

[DEPS] run hooks via vpython

Hooks tend to have implicit dependencies
//.vpython covers or should cover all of them, so run all hooks via vpython

Also add missing catapult's dependencies to .vpython.

Bug:  781430 
Change-Id: I5ec7a760f44bcb806c654ad8da303f78d3dbee3f
Reviewed-on: https://chromium-review.googlesource.com/762037
Commit-Queue: Nodir Turakulov <nodir@chromium.org>
Reviewed-by: Robbie Iannucci <iannucci@chromium.org>
Cr-Commit-Position: refs/heads/master@{#515499}
[modify] https://crrev.com/e2d81c351d09346de3f27cb784e7d2a6049f0c0c/.vpython
[modify] https://crrev.com/e2d81c351d09346de3f27cb784e7d2a6049f0c0c/DEPS
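
As described in the commit message, the DEPS side of this change makes hooks run through vpython instead of plain python. A minimal sketch of that transformation is below, assuming hooks are dicts with `name` and `action` keys as in a DEPS file. The function `use_vpython` and the script path are hypothetical; this is an illustration of the idea, not how the commit itself was produced.

```python
def use_vpython(hooks):
    """Return new hook dicts whose 'python' actions run via 'vpython'."""
    converted = []
    for hook in hooks:
        action = list(hook["action"])
        if action and action[0] == "python":
            action[0] = "vpython"
        # dict(hook, action=...) copies the hook so the input is not mutated.
        converted.append(dict(hook, action=action))
    return converted


example_hooks = [
    {
        "name": "fetch_telemetry_binary_dependencies",
        # Placeholder path, for illustration only.
        "action": ["python", "path/to/fetch_telemetry_binary_dependencies"],
    },
]

for hook in use_vpython(example_hooks):
    print(hook["name"], hook["action"])
```

With hooks launched this way, the modules they import are expected to come from the environment described by the `.vpython` spec rather than from whatever happens to be installed on the bot.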

Comment 25 by no...@chromium.org, Nov 10 2017

Status: Fixed (was: Started)
https://chromium-review.googlesource.com/762037
fixed the catapult hooks on LUCI; they succeeded on the roller’s CL:
https://ci.chromium.org/swarming/task/39bf4b50f5d86210?server=chromium-swarm.appspot.com

I’ve added the project-chromium-luci-beta group back to luci-chromium-cq-dogfood.

Comment 26 by efoo@chromium.org, Jan 31 2018

Labels: -LUCI-Blocker-M4 luci-blocker-migration

Comment 27 by efoo@chromium.org, Feb 13 2018

Labels: -LUCI-blocker-migration LUCI-Chromium-CQSets LUCI-Blocker-Chromium-CQSets

Comment 28 by benhenry@google.com, Jan 16 (6 days ago)

Components: Test>Telemetry

Comment 29 by benhenry@google.com, Jan 16 (6 days ago)

Components: -Speed>Telemetry
