https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/ToTiOSDevice/99 shows "Internal Failure"
Issue description
On the bot page (https://ci.chromium.org/buildbot/chromium.clang/ToTiOSDevice/) it says that build 99 failed with a compile error. However, visiting the page for build 99 (https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/ToTiOSDevice/99) shows internal failure and doesn't let me see the build steps. Marking P1 because investigating the compile failure, which I can't see, is blocking the clang roll.
,
Sep 26
,
Sep 26
,
Sep 26
per https://chromium-swarm.appspot.com/task?id=3fd032a7d6e67110, slice 1 isn't running because nothing has the builder cache. slice 2 should run but doesn't?
,
Sep 26
,
Sep 26
Digging more, this appears to be an issue w/ the builder's migration to luci. https://build.chromium.org/deprecated/chromium.clang/builders/ToTiOSDevice has the full history of the buildbot (& shows the compile failures).
,
Sep 26
,
Sep 26
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/0465ccf6c78bd8a13e4ce87bb92cf9868e3b716b
commit 0465ccf6c78bd8a13e4ce87bb92cf9868e3b716b
Author: John Budorick <jbudorick@chromium.org>
Date: Wed Sep 26 15:06:46 2018
luci: pull misconfigured ios chromium.clang bots off the console.
TBR=hinoka@chromium.org
No-Try: true
Bug: 889399
Change-Id: I0184ec4ab05b26c9e7abd0d5405f400e118cda2e
Reviewed-on: https://chromium-review.googlesource.com/1246241
Commit-Queue: John Budorick <jbudorick@chromium.org>
Reviewed-by: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#594314}
[modify] https://crrev.com/0465ccf6c78bd8a13e4ce87bb92cf9868e3b716b/infra/config/global/luci-milo.cfg
,
Sep 26
Bumped ToTiOSDevice's buildbot build number.
,
Sep 26
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c3f089dc172f7175f7d866597f25d0a56eb21598
commit c3f089dc172f7175f7d866597f25d0a56eb21598
Author: John Budorick <jbudorick@chromium.org>
Date: Wed Sep 26 15:53:34 2018
luci: fix recipe configs for chromium.clang:ToTiOS{,Device}.
TBR=hinoka@chromium.org
Bug: 889399, 861396, 885799
Change-Id: I7cb770cbe0d4b94f3de501c1e8bc55e37ccd77ce
Reviewed-on: https://chromium-review.googlesource.com/1246283
Reviewed-by: John Budorick <jbudorick@chromium.org>
Commit-Queue: John Budorick <jbudorick@chromium.org>
Cr-Commit-Position: refs/heads/master@{#594329}
[modify] https://crrev.com/c3f089dc172f7175f7d866597f25d0a56eb21598/infra/config/global/cr-buildbucket.cfg
,
Sep 26
ToTiOSDevice is not flipped to LUCI yet, but for some reason it's not listed in luci-migration (Probably because it was added more recently...), so Milo assumes it must be a LUCI builder. Will need to add this into luci-migration somehow.
,
Sep 26
Oh it's totally on luci-migration (just not in the right order). I think this happened due to the earlier luci-migration outage, so the luci build got marked as "prod". I can't think of a quick way to fix this. It might be better to turn off the buildbot -> luci build redirection logic; I think it causes more user confusion than it solves.
,
Sep 26
Actually it's best not to turn it off, instead just bump the build number on the buildbot side.
,
Sep 26
Did that in #9.
,
Sep 26
(though after build 103 started)
,
Sep 26
I see. The build takes 8 hours to cycle, maybe we should kill #103 and let the next one start?
,
Sep 26
I hadn't done so thus far because the build is still visible on the deprecated buildbot endpoint (https://build.chromium.org/deprecated/chromium.clang/builders/ToTiOSDevice/builds/103), but if hans/thakis would prefer killing it, that's fine with me.
,
Sep 26
There is a page & ticket (see issue 889490); I wonder if it's related. I was busy with that page and didn't notice this issue. Let me look at both.
,
Sep 26
#103 will not show up in RPC endpoints or tools because the latest build number (ie max(luci, buildbot)) is already 4000+, and Milo will only return the latest build number (and assume that there is only one "real" build for any build number, for the purposes of emulation mode). My inclination is that since this is a P0 wrt some autoroller not being able to see the build, the best way forward will be to kill the current build so that the next can start.
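To make the collision described above concrete, here is a minimal Python sketch. It is not Milo's actual code: the function names, the dictionaries, and the round number 4000 (standing in for "4000+") are all hypothetical, and the rule that the LUCI build wins when both systems have the same number is an assumption based only on the behaviour reported in this thread (the link for build 99 landing on the LUCI build).

```python
from typing import Dict, Optional


def latest_build_number(luci_latest: int, buildbot_latest: int) -> int:
    """Model of 'latest build number = max(luci, buildbot)' as described in this thread."""
    return max(luci_latest, buildbot_latest)


def resolve_build(number: int,
                  luci: Dict[int, str],
                  buildbot: Dict[int, str]) -> Optional[str]:
    """Pick the single 'real' build for a build number.

    Assumption (hypothetical): on a collision the LUCI build is returned,
    which matches the observed redirect for build 99.
    """
    if number in luci:
        return luci[number]
    return buildbot.get(number)


# Illustrative data only; 4000 stands in for "4000+".
luci_builds = {99: "luci/99 (internal failure)", 4000: "luci/4000"}
buildbot_builds = {99: "buildbot/99 (compile failure)", 103: "buildbot/103"}

# Buildbot #103 is never reported as the latest build while LUCI is ahead.
print(latest_build_number(luci_latest=4000, buildbot_latest=103))

# The colliding number resolves to the LUCI build, hiding the compile failure.
print(resolve_build(99, luci_builds, buildbot_builds))

# After bumping the buildbot number well past LUCI's, the buildbot build is the latest
# and its number no longer collides with any LUCI build.
print(latest_build_number(luci_latest=4000, buildbot_latest=10000))
```

Under these assumptions, bumping the buildbot build number past the LUCI range removes both problems at once: the buildbot build becomes the latest, and its number no longer collides with a LUCI build.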
,
Sep 26
OK, it's not related to issue 889490. hinoka@ - are you already handling this issue? Want to grab it? Do you need any help from me (as a CCI trooper)?
,
Sep 26
..a-and build 103 is dead with NO_RESOURCE. No killing required.
,
Sep 26
Sure, I'll take this. If you click on the link, you get taken to the luci build, which is incorrect. jbudorick fixed this by bumping the buildbot build number, but it hasn't yet taken effect. Buildbot still thinks #103 is the latest build number. It's past work hours in MUC, so I don't think we're going to get a response. I'll kill #103 so that we can downgrade this from a P0.
,
Sep 26
Thanks! Ping me if you need any help.
,
Sep 26
#18: the page is due to a network issue; see the ops chat. #19: my impression is that this is a P0 because hans couldn't see what the compile failure was, and fixing it blocks the manual clang roll. (I don't think clang rolls are automated?)
,
Sep 26
oh, and hinoka: note that I bumped the buildbot number *considerably* -- to 10000, well beyond the minimum safe distance -- given the rate at which the LUCI builders were cycling. w/ #10, they should be cycling more slowly given that they're not failing immediately.
,
Sep 26
I'm fine with downgrading from P0; I was able to guess and go to the logdog URL directly.
,
Sep 26
I see. I mistakenly assumed all of our rolls are automated, and that this was a tools issue. Since the build is still accessible via uberchromegw, I'll downgrade this to P1. The next build has started, but the number is #104: https://uberchromegw.corp.google.com/i/chromium.clang/builders/ToTiOSDevice/builds/104 Maybe the master needs to be restarted?
,
Sep 26
The following revision refers to this bug:
https://chrome-internal.googlesource.com/infradata/master-manager/+/98fe49e35c11dca59f2fb10448da8d25df24f1ee
commit 98fe49e35c11dca59f2fb10448da8d25df24f1ee
Author: Ryan Tseng <hinoka@google.com>
Date: Wed Sep 26 17:04:05 2018
,
Sep 26
#27: yeah, seems like it.
,
Sep 29
I didn't quite understand what happened, but I can see the builds now so it seems like this is fixed.
,
Sep 29
So which bot is correct? I see: https://ci.chromium.org/buildbot/chromium.clang/ToTiOSDevice/ Running on build155-m1 And I see: https://ci.chromium.org/p/chromium/builders/luci.chromium.ci/ToTiOSDevice Running on: build281-m9 build282-m9 build283-m9 build284-m9 build285-m9 build286-m9 For the latter, all the builds are failing because the iOS device cert isn't installed. I can fix this -- but which one is correct? Is the old machine (build155-m1) being turned down?
,
Oct 1
+efoo who I think worked on this migration. Do you know what's going on here?
,
Oct 1
This one is correct right now: https://ci.chromium.org/buildbot/chromium.clang/ToTiOSDevice/ The latter is LUCI, and because of the issues you're seeing, they haven't been flipped yet (But if you can fix them, that would be pretty fantastic for us).
,
Oct 1
dba@ can you please add the iOS device cert and mobileprovision available at https://drive.google.com/corp/drive/folders/1lB7bCARh1DvHoOKtB0OPvpVqwY-TKoZ8 to the -m9 bots listed in #31? Thanks!
,
Oct 1
justincohen - just to confirm (since I'm the one that set these bots up), even though they need device certs, they don't need actual devices because the tests are swarmed, is that correct?
,
Oct 1
#34 - Done.
,
Oct 1
dba@ thanks! hinoka@ gn now passes.
,
Oct 4
Comment 1 by h...@chromium.org
, Sep 26