Current CHROMEOS_LKGM broken for chromeos-daisy-rel

Issue description

Not sure if this has already been reported since I'm not a Googler, but gclient runhooks is running into problems on chromeos-daisy-rel: https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.chromiumos%2Fchromeos-daisy-rel%2F93460%2F%2B%2Frecipes%2Fsteps%2Fgclient_runhooks__with_patch_%2F0%2Fstdout

WARNING: GS_ERROR: AccessDeniedException: 403 chromium.archive@gmail.com does not have storage.objects.list access to chromeos-image-archive.
,
Apr 13 2018
,
Apr 13 2018
,
Apr 13 2018
,
Apr 17 2018
This still doesn't seem to be fixed, cc-ing bpastene@ and sergeyberezin@ for some help with getting this assigned.
,
Apr 17 2018
So what's happening here is that the bot is trying to download the cros sysroot for daisy at the pinned cros manifest: https://chromium.googlesource.com/chromium/src/+/b088b9a0448a0f98e7a96628545913e6c889ca46/chromeos/CHROMEOS_LKGM However, there isn't one available for that manifest, since the builder that uploads them had a string of red builds around that manifest: https://build.chromium.org/p/chromiumos/builders/daisy-full?numbuilds=200 The bot then starts decrementing the manifest version until it finds one that's available. It starts from 10552.0.0 and ends at 10550.0.0, since that was the most recent manifest with an available sysroot: https://build.chromium.org/p/chromiumos/builders/daisy-full/builds/18440

The "does not have storage.objects.list access" error is somewhat of a red herring. gsutil falls back to ls'ing the bucket when you try to cat a file that doesn't exist. The bot has get permissions, but not list permissions. So in effect, it's really a 404, not a 403.

Can I ask how you noticed the failure/why you filed the bug? It's gracefully handling the failure, so I don't think anything here is blocked, but I could be wrong.
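The version fallback described above can be sketched roughly as follows. This is a minimal illustration, not the actual chrome-sdk code: the function name `find_available_version`, the `is_available` callback, and the `max_attempts` cap are all hypothetical, and the real logic checks Google Storage rather than an in-memory set.

```python
def find_available_version(start, is_available, max_attempts=20):
    """Walk build numbers downward from `start` until one is available.

    `start` is a manifest version such as "10552.0.0". `is_available` is a
    hypothetical callback standing in for the storage lookup. Returns the
    first version accepted by the callback, or None if none is found within
    `max_attempts` candidates.
    """
    build, branch, patch = (int(part) for part in start.split("."))
    for _ in range(max_attempts):
        candidate = f"{build}.{branch}.{patch}"
        if is_available(candidate):
            return candidate
        build -= 1  # Only the leading build number is decremented.
    return None


# Assumed state from the comment above: only 10550.0.0 has a sysroot.
uploaded = {"10550.0.0"}
print(find_available_version("10552.0.0", uploaded.__contains__))
```

With the pinned version 10552.0.0, the sketch tries 10552.0.0 and 10551.0.0 before settling on 10550.0.0, which matches the sequence reported for the bot.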
,
Apr 17 2018
I noticed this only from looking at the timing of gclient runhooks on chromeos-daisy-rel. Because of the backoffs (12 minutes), it's been taking about 13-14 minutes (e.g. https://ci.chromium.org/buildbot/tryserver.chromium.chromiumos/chromeos-daisy-rel/95797). Probably not blocking anything, but the ChromeOS trybots have been pretty busy with the occasional backlog, and I figured this extra 12 minutes could be contributing.
,
Apr 17 2018
Ah, that's a really good point. Bug 832355 should help relieve some of the pressure the cros trybots are feeling, but you're right that this certainly isn't helping things. Short of reverting the most recent lkgm bump, let me see what we can do on our side of things to smooth this over.
,
Apr 17 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/905b77af486a9e5c1f6c9573a43f21276fd22b90

commit 905b77af486a9e5c1f6c9573a43f21276fd22b90
Author: Benjamin Pastene <bpastene@chromium.org>
Date: Tue Apr 17 18:48:03 2018

Don't clear cros-chrome-sdk cache on the daisy simplechrome bot.

Runhooks is taking ~15min on chromeos-daisy-rel every build because of broken LKGM + fallback logic. If we don't clear the cache, we skip that fallback logic, so this should save us around 10min on every build. Normally wouldn't be a big problem, but that slave pool is already over-subscribed, so hopefully this is immediate relief.

Recipe-Nontrivial-Roll: build_limited_scripts_slave
Bug: 832406
Change-Id: I5f9694b184b4864c672fe24bea8acc5ae4918634
Reviewed-on: https://chromium-review.googlesource.com/1015535
Commit-Queue: Ben Pastene <bpastene@chromium.org>
Reviewed-by: Dirk Pranke <dpranke@chromium.org>

[modify] https://crrev.com/905b77af486a9e5c1f6c9573a43f21276fd22b90/scripts/slave/README.recipes.md
[modify] https://crrev.com/905b77af486a9e5c1f6c9573a43f21276fd22b90/scripts/slave/recipe_modules/chromium/api.py
,
Apr 17 2018
Runhooks in recent builds are back down to under a minute. We may want to disable the fallback logic in the chrome-sdk entirely now that lkgm updates are going through the CQ to prevent this from happening again. For now though, the bot should be healthy.
,
Apr 18 2018
It's a bit tricky to do that - we want the fallback logic for developers, and only want to disable it on trybots.
,
Apr 18 2018
12 min seems like an awfully long time - this logic should just be doing a gs cat of a small text file: https://cs.corp.google.com/chromeos_public/chromite/cli/cros/cros_chrome_sdk.py?l=229-243
,
Apr 18 2018
IIUC, deeper in that gs_ctx.Cat() call is linear-backoff logic. ctrl-f for "Retrying in" in the following log: https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.chromiumos%2Fchromeos-daisy-rel%2F93460%2F%2B%2Frecipes%2Fsteps%2Fgclient_runhooks__with_patch_%2F0%2Fstdout If the cat fails, it sleeps for 1 min, tries again, sleeps for 2 min, tries again, then sleeps for 3 min. So that's 6 min spent for each possible version. That seems a bit much; I'd expect network backoffs to be on the order of 1 to 10 seconds. (At least that's how I've written things in the past.)
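The arithmetic behind the 6 min per version (and the roughly 12 min total reported earlier in the thread) can be checked with a small model. This is only a sketch of the timing, not the gs library's retry code: the function names are hypothetical, and the count of two missing versions (10552.0.0 and 10551.0.0) is an assumption inferred from the fallback ending at 10550.0.0.

```python
def linear_backoff_sleeps(retries, base_seconds=60):
    """Return the sleep before each retry: base, 2*base, 3*base, ..."""
    return [base_seconds * attempt for attempt in range(1, retries + 1)]


def total_wait_seconds(missing_versions, retries, base_seconds=60):
    """Total time spent sleeping when every lookup for a missing version
    exhausts all of its retries."""
    return missing_versions * sum(linear_backoff_sleeps(retries, base_seconds))


# Sleeps of 1, 2 and 3 minutes for a single missing version.
print(linear_backoff_sleeps(3))            # [60, 120, 180]
# Two missing versions, three sleeps each.
print(total_wait_seconds(2, 3) / 60)       # 12.0
# The same failures with 1 to 3 second sleeps instead of minutes.
print(total_wait_seconds(2, 3, base_seconds=1))  # 12
```

In this model, setting `retries` to 0 removes the wait entirely, since a lookup for a version that was never uploaded will not succeed on a later attempt.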
,
Apr 18 2018
Ah, we can definitely fix that - thanks for looking into it.
,
Apr 19 2018
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791

commit 328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791
Author: Achuith Bhandarkar <achuith@chromium.org>
Date: Fri Apr 20 02:28:57 2018

cros_chrome_sdk: Don't retry if gsutil cat fails.

* Set retries=0 for gs.Cat()
* Pull the gs.Cat call into a separate function.

BUG= chromium:832406
TEST=manual

Change-Id: I052591adca9ed04f9d9d6b718035bb2ce21f58cb
Reviewed-on: https://chromium-review.googlesource.com/1019481
Commit-Ready: Achuith Bhandarkar <achuith@chromium.org>
Tested-by: Achuith Bhandarkar <achuith@chromium.org>
Reviewed-by: Steven Bennetts <stevenjb@chromium.org>
Reviewed-by: Ben Pastene <bpastene@chromium.org>

[modify] https://crrev.com/328ec4e1303a98f1fcd7c8e84e3b1b6ed8c0e791/cli/cros/cros_chrome_sdk.py
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/94a729903406cc33d7b18c6be41d36362465e132

commit 94a729903406cc33d7b18c6be41d36362465e132
Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Apr 20 04:01:52 2018

Roll src/third_party/chromite/ 55fae2ff6..328ec4e13 (1 commit)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/55fae2ff6860..328ec4e1303a

$ git log 55fae2ff6..328ec4e13 --date=short --no-merges --format='%ad %ae %s'
2018-04-19 achuith cros_chrome_sdk: Don't retry if gsutil cat fails.

Created with:
roll-dep src/third_party/chromite
BUG= chromium:832406

The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

TBR=chrome-os-gardeners@chromium.org

Change-Id: I2bf9a4c977f810bccdd92fb963031211d8b5c25c
Reviewed-on: https://chromium-review.googlesource.com/1019921
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#552255}

[modify] https://crrev.com/94a729903406cc33d7b18c6be41d36362465e132/DEPS
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/b915daddaea12c0fc209d422119c7dd7d6cca4b2

commit b915daddaea12c0fc209d422119c7dd7d6cca4b2
Author: Achuith Bhandarkar <achuith@chromium.org>
Date: Fri Apr 20 14:12:31 2018

cros_chrome_sdk: Comment for _GetFullVersionFromStorage.

BUG= chromium:832406
TEST=None

Change-Id: I666820ce2988ff4d77b32da93728551317891eb9
Reviewed-on: https://chromium-review.googlesource.com/1021412
Commit-Ready: Achuith Bhandarkar <achuith@chromium.org>
Tested-by: Achuith Bhandarkar <achuith@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[modify] https://crrev.com/b915daddaea12c0fc209d422119c7dd7d6cca4b2/cli/cros/cros_chrome_sdk.py
,
Apr 20 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/761e2864cb2de6c27eb629999c68cb6e619a16e8

commit 761e2864cb2de6c27eb629999c68cb6e619a16e8
Author: chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Date: Fri Apr 20 16:54:39 2018

Roll src/third_party/chromite/ 8b2b38b48..b915dadda (1 commit)

https://chromium.googlesource.com/chromiumos/chromite.git/+log/8b2b38b482ee..b915daddaea1

$ git log 8b2b38b48..b915dadda --date=short --no-merges --format='%ad %ae %s'
2018-04-20 achuith cros_chrome_sdk: Comment for _GetFullVersionFromStorage.

Created with:
roll-dep src/third_party/chromite
BUG= chromium:832406

The AutoRoll server is located here: https://chromite-chromium-roll.skia.org

Documentation for the AutoRoller is here: https://skia.googlesource.com/buildbot/+/master/autoroll/README.md

If the roll is causing failures, please contact the current sheriff, who should be CC'd on the roll, and stop the roller if necessary.

TBR=chrome-os-gardeners@chromium.org

Change-Id: I892fe6bb766d6f040bb1ec50f2f66f9b3b604394
Reviewed-on: https://chromium-review.googlesource.com/1021991
Reviewed-by: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Commit-Queue: Chromite Chromium Autoroll <chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com>
Cr-Commit-Position: refs/heads/master@{#552357}

[modify] https://crrev.com/761e2864cb2de6c27eb629999c68cb6e619a16e8/DEPS
,
May 1 2018
Closing this out as it seems the work above has been completed: no retries will be made when checking for a version. If I'm mistaken though, feel free to reopen.
Comment 1 by cnardi@chromium.org
, Apr 13 2018