
Issue 832406


Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: ----




Current CHROMEOS_LKGM broken for chromeos-daisy-rel

Project Member Reported by cnardi@chromium.org, Apr 13 2018

Issue description

Not sure if this has already been reported since I'm not a Googler, but gclient runhooks is running into problems on chromeos-daisy-rel: https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.chromiumos%2Fchromeos-daisy-rel%2F93460%2F%2B%2Frecipes%2Fsteps%2Fgclient_runhooks__with_patch_%2F0%2Fstdout

WARNING: GS_ERROR: AccessDeniedException: 403 chromium.archive@gmail.com does not have storage.objects.list access to chromeos-image-archive.
 

Comment 1 by cnardi@chromium.org, Apr 13 2018

Labels: -Restrict-View-Google
This probably doesn't need to be RVG.

Comment 2 by cnardi@chromium.org, Apr 13 2018

Labels: Infra-Troopers

Comment 3 by cnardi@chromium.org, Apr 13 2018

Components: -Infra Infra>Client>ChromeOS
Labels: -Infra-Troopers

Comment 4 by cnardi@chromium.org, Apr 13 2018

Labels: Infra-ChromeOS OS-Chrome

Comment 5 by cnardi@chromium.org, Apr 17 2018

Cc: sergeybe...@chromium.org bpastene@chromium.org
This still doesn't seem to be fixed, cc-ing bpastene@ and sergeyberezin@ for some help with getting this assigned.
Cc: achuith@chromium.org
Labels: -Pri-1 Pri-2
So what's happening here is the bot's trying to download the cros sysroot for daisy at the pinned cros manifest:
https://chromium.googlesource.com/chromium/src/+/b088b9a0448a0f98e7a96628545913e6c889ca46/chromeos/CHROMEOS_LKGM

However there isn't one available for that manifest since the builder that uploads them had a string of red builds around that manifest:
https://build.chromium.org/p/chromiumos/builders/daisy-full?numbuilds=200

The bot then starts decrementing the manifest version until it finds one that's available. It starts from 10552.0.0 and ends at 10550.0.0, since that was the most recent manifest with an available sysroot:
https://build.chromium.org/p/chromiumos/builders/daisy-full/builds/18440
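
The fallback described above can be sketched roughly as follows. This is a minimal illustration, not the chrome-sdk code: `find_available_version`, `sysroot_exists`, and `max_steps` are hypothetical names, and decrementing only the first version component is an assumption based on the 10552.0.0 to 10550.0.0 walk mentioned here.

```python
from typing import Callable, Optional


def find_available_version(
    start: str,
    sysroot_exists: Callable[[str], bool],
    max_steps: int = 10,
) -> Optional[str]:
    """Walk backwards from `start` until a version with a sysroot is found.

    Assumption: only the first component of the version is decremented.
    """
    major, minor, patch = start.split(".")
    build = int(major)
    for _ in range(max_steps):
        candidate = f"{build}.{minor}.{patch}"
        if sysroot_exists(candidate):
            return candidate
        build -= 1
    return None


# Hypothetical availability data mirroring the thread: only 10550.0.0 exists.
available = {"10550.0.0"}
print(find_available_version("10552.0.0", available.__contains__))  # 10550.0.0
```

Every failed `sysroot_exists` probe in this loop corresponds to one failed lookup on the bot, so the cost of the fallback scales with how far back the nearest available sysroot is.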

The "does not have storage.objects.list access" error is somewhat of a red herring. gsutil falls back to ls'ing the bucket when you try to cat a file that doesn't exist. The bot has get permissions, but not list permissions. So in effect, it's really a 404, not a 403.

Can I ask how you noticed the failure and why you filed the bug? It's gracefully handling the failure, so I don't think anything here is blocked, but I could be wrong.

Comment 7 by cnardi@chromium.org, Apr 17 2018

I noticed this only from looking at the timing of gclient runhooks on chromeos-daisy-rel. Because of the backoffs (12 minutes), it's been taking about 13-14 minutes (e.g. https://ci.chromium.org/buildbot/tryserver.chromium.chromiumos/chromeos-daisy-rel/95797).

Probably not blocking anything, but the ChromeOS trybots have been pretty busy with the occasional backlog, and I figured this extra 12 minutes could be contributing.
Cc: dpranke@chromium.org
Owner: bpastene@chromium.org
Status: Assigned (was: Untriaged)
Summary: Current CHROMEOS_LKGM broken for chromeos-daisy-rel (was: chromium.archive@gmail.com does not have storage.objects.list access to chromeos-image-archive)
Ah, that's a really good point. Bug 832355 should help relieve some of the pressure the cros trybots are feeling, but you're right that this certainly isn't helping things.

Short of reverting the most recent lkgm bump, let me see what we can do on our side of things to smooth this over.

Comment 9 by bugdroid1@chromium.org, Apr 17 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/905b77af486a9e5c1f6c9573a43f21276fd22b90

commit 905b77af486a9e5c1f6c9573a43f21276fd22b90
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Apr 17 18:48:03 2018

Don't clear cros-chrome-sdk cache on the daisy simplechrome bot.

Runhooks is taking ~15min on chromeos-daisy-rel every build because of
broken LKGM + fallback logic.

If we don't clear the cache, we skip that fallback logic, so this
should save us around 10min on every build.

Normally wouldn't be a big problem, but that slave pool is already
over-subscribed, so hopefully this is immediate relief.

Recipe-Nontrivial-Roll: build_limited_scripts_slave
Bug:  832406 
Change-Id: I5f9694b184b4864c672fe24bea8acc5ae4918634
Reviewed-on: https://chromium-review.googlesource.com/1015535
Commit-Queue: Ben Pastene <bpastene@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/905b77af486a9e5c1f6c9573a43f21276fd22b90/scripts/slave/README.recipes.md
[modify] https://crrev.com/905b77af486a9e5c1f6c9573a43f21276fd22b90/scripts/slave/recipe_modules/chromium/api.py

Status: Fixed (was: Assigned)
Runhooks in recent builds are back down to under a minute. We may want to disable the fallback logic in the chrome-sdk entirely now that lkgm updates are going through the CQ to prevent this from happening again.

For now though, the bot should be healthy.
It's a bit tricky to do that - we want the fallback logic for developers, and want to disable it only on trybots.
12 min seems like an awfully long time - this logic should just be doing a gs cat of a small text file:
https://cs.corp.google.com/chromeos_public/chromite/cli/cros/cros_chrome_sdk.py?l=229-243


IIUC, deeper in that gs_ctx.Cat() call is linear-backoff logic. ctrl-f for "Retrying in" in the following log:
https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.chromiumos%2Fchromeos-daisy-rel%2F93460%2F%2B%2Frecipes%2Fsteps%2Fgclient_runhooks__with_patch_%2F0%2Fstdout

If the cat fails, it sleeps for 1 min, tries again, sleeps for 2 min, tries again, then sleeps for 3 min. So that's 6 min spent for each possible version. That seems a bit much; I'd expect network backoffs to be on the order of 1 to 10 seconds. (At least that's how I've written things in the past.)
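
As a quick check of that arithmetic, here is a small sketch. `linear_backoff_sleep` is a hypothetical helper, and the count of two missing versions (10552.0.0 and 10551.0.0) is an assumption that happens to match the 12 minutes of backoff reported earlier in this thread.

```python
def linear_backoff_sleep(retries: int, base_sleep_s: int = 60) -> int:
    """Total seconds slept when the n-th retry waits n * base_sleep_s."""
    return sum(base_sleep_s * attempt for attempt in range(1, retries + 1))


# 1 min + 2 min + 3 min of sleeping for each version that is missing.
per_version_s = linear_backoff_sleep(retries=3)

# Assumption: two versions were missing before an available one was found.
missing_versions = 2
total_s = per_version_s * missing_versions

print(per_version_s // 60, total_s // 60)  # prints "6 12"
```

With a base sleep of a few seconds instead of a minute, the same schedule would cost well under a minute in total.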
Ah, we can definitely fix that - thanks for looking into it.
Owner: achuith@chromium.org
Status: Started (was: Fixed)

Comment 17 by bugdroid1@chromium.org, Apr 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791

commit 328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791
Author: Achuith Bhandarkar <achuith@chromium.org>
Date: Fri Apr 20 02:28:57 2018

cros_chrome_sdk: Don't retry if gsutil cat fails.

* Set retries=0 for gs.Cat()
* Pull the gs.Cat call into a separate function.

BUG= chromium:832406 
TEST=manual

Change-Id: I052591adca9ed04f9d9d6b718035bb2ce21f58cb
Reviewed-on: https://chromium-review.googlesource.com/1019481
Commit-Ready: Achuith Bhandarkar <achuith@chromium.org>
Tested-by: Achuith Bhandarkar <achuith@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Reviewed-by: Ben Pastene <bpastene@chromium.org>

[modify] https://crrev.com/328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791/cli/cros/cros_chrome_sdk.py
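
To illustrate why setting retries=0 removes the delay, here is a rough sketch of a retrying wrapper. It is not the chromite implementation: `cat_with_retries`, `fetch`, `base_sleep_s`, and the injectable `sleep` are hypothetical, and the linear schedule is taken from the description earlier in this thread.

```python
import time
from typing import Callable


def cat_with_retries(
    fetch: Callable[[], str],
    retries: int = 3,
    base_sleep_s: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call fetch(); on IOError, retry up to `retries` times.

    The n-th retry is preceded by a sleep of n * base_sleep_s seconds.
    With retries=0 the first failure is raised immediately.
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except IOError:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(base_sleep_s * attempt)


def missing_file() -> str:
    """Stand-in for a cat of an object that does not exist."""
    raise IOError("no such object")


# Record the requested sleeps instead of actually waiting.
slept = []
try:
    cat_with_retries(missing_file, retries=0, sleep=slept.append)
except IOError:
    pass
print(slept)  # [] because no backoff happens when retries=0
```

For a lookup where a missing file is an expected outcome rather than a transient error, failing fast like this lets the caller move on to the next candidate version right away.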


Comment 18 by bugdroid1@chromium.org, Apr 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/94a729903406cc33d7b18c6be41d36362465e132

commit 94a729903406cc33d7b18c6be41d36362465e132
Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Apr 20 04:01:52 2018

Roll src/third_party/chromite/ 55fae2ff6..328ec4e13 (1 commit)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/55fae2ff6860..328ec4e1303a

$ git log 55fae2ff6..328ec4e13 --date=short --no-merges --format='%ad %ae %s'
2018-04-19 achuith cros_chrome_sdk: Don't retry if gsutil cat fails.

Created with:
  roll-dep src/third_party/chromite
BUG= chromium:832406 


The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.


TBR=chrome-os-gardeners@chromium.org

Change-Id: I2bf9a4c977f810bccdd92fb963031211d8b5c25c
Reviewed-on: https://chromium-review.googlesource.com/1019921
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#552255}
[modify] https://crrev.com/94a729903406cc33d7b18c6be41d36362465e132/DEPS


Comment 19 by bugdroid1@chromium.org, Apr 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/b915daddaea12c0fc209d422119c7dd7d6cca4b2

commit b915daddaea12c0fc209d422119c7dd7d6cca4b2
Author: Achuith Bhandarkar <achuith@chromium.org>
Date: Fri Apr 20 14:12:31 2018

cros_chrome_sdk: Comment for _GetFullVersionFromStorage.

BUG= chromium:832406 
TEST=None

Change-Id: I666820ce2988ff4d77b32da93728551317891eb9
Reviewed-on: https://chromium-review.googlesource.com/1021412
Commit-Ready: Achuith Bhandarkar <achuith@chromium.org>
Tested-by: Achuith Bhandarkar <achuith@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/b915daddaea12c0fc209d422119c7dd7d6cca4b2/cli/cros/cros_chrome_sdk.py


Comment 20 by bugdroid1@chromium.org, Apr 20 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/761e2864cb2de6c27eb629999c68cb6e619a16e8

commit 761e2864cb2de6c27eb629999c68cb6e619a16e8
Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Apr 20 16:54:39 2018

Roll src/third_party/chromite/ 8b2b38b48..b915dadda (1 commit)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/8b2b38b482ee..b915daddaea1

$ git log 8b2b38b48..b915dadda --date=short --no-merges --format='%ad %ae %s'
2018-04-20 achuith cros_chrome_sdk: Comment for _GetFullVersionFromStorage.

Created with:
  roll-dep src/third_party/chromite
BUG= chromium:832406 


The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here:
https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should
be CC'd on the roll, and stop the roller if necessary.


TBR=chrome-os-gardeners@chromium.org

Change-Id: I892fe6bb766d6f040bb1ec50f2f66f9b3b604394
Reviewed-on: https://chromium-review.googlesource.com/1021991
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#552357}
[modify] https://crrev.com/761e2864cb2de6c27eb629999c68cb6e619a16e8/DEPS

Status: Fixed (was: Started)
Closing this out as it seems the work above has been completed: no retries will be made when checking for a version. If I'm mistaken though, feel free to reopen.
