Sheriff-O-Matic no longer shows failure of android-go-perf since it's moved to LUCI |
||||||||
Issue descriptionandroid-go-perf is now moved to LUCI in https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/android-go-perf/30 I no longer see its tests failure in SOM. Maybe we need to configure something? Sean: can you let us know what's missing?
,
Sep 11
Sean, can you prioritize this issue? The consequence of this is becoming serious as we are moving more perf builders to LUCI.
,
Sep 12
Probably the flip-side of https://bugs.chromium.org/p/chromium/issues/detail?id=882880 hinoka: do luci bots show up in build extracts served by milo's GetCompressedMasterJSON api?
,
Sep 13
Hinoka@: can you answer Sean's question in #3. If this bug is not fixed, I can't migrate further bots to LUCI.
,
Sep 18
Hinoka: can you take a look? This is blocking the LUCI migration, so it would be great if you can prioritize this
,
Sep 18
re #3: I believe you mean builders. Yes, LUCI builders do show up in the GetCompressedMasterJSON api. I traced one of the SoM API requests, and it looks like it's finding android-go-perf correctly: https://screenshot.googleplex.com/GjHNKFZFhK1 https://screenshot.googleplex.com/b4ZwWJC72Xe
,
Sep 19
Sean: can you drive this investigation further given Ryan's answer in #6?
,
Sep 20
So BuildExtracts we're getting from GetCompressedMasterJSON do contain builder records from LUCI, and *not* BuildBot? Asking because one of the BuildExtracts that SoM analyzed apparently contains a build record for the "android-go-perf" builder, with a BuildNumber of 477: https://screenshot.googleplex.com/DiKLexCGihE Most recent build that's failing on android-go-perf as of now is: https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/android-go-perf/383 Where's it getting the 477 from? Possibly related: it can't recognize the build State property, that's why it defaults to "Infra failure" (purple) in this case.
,
Sep 20
[disregard "State property" comment in #8]
This:
&{Basedir: BuilderName:android-go-perf CachedBuilds:[477 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441 440 439 438 437 436 435 434 433 432 431 430 429 428] Category: CurrentBuilds:[] PendingBuilds:0 Slaves:[] State:}
is what it's getting from CompressedMasterJSON's BuildExtract.Builders entry for "android-go-perf" - and none of those are valid build numbers AFAICT.
Furthermore, if I catch the analyzer when it sees "android-go-perf" and hard-code the CachedBuilds list to something valid like 359 ... 383 I get this:
https://screenshot.googleplex.com/iCLSqFtgVmm
Is Milo setting CachedBuilds wrong here?
,
Sep 20
,
Sep 20
Good catch, Milo relies on build numbers for ordering. It looks like when the builder was flipped, the build number failed to increment. I can fix this case manually, and I'll investigate why the build numbers did not increment.
,
Sep 20
This instance is fixed I think? https://screenshot.googleplex.com/KScJofk27We
,
Sep 20
Thanks Hinoka!
,
Sep 20
,
Oct 2
Sorry but I am seeing SOM doesn't display failures of many perf builders. Can you look into this again, Sean? For example: https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/mac-10_12_laptop_low_end-perf has been failing on the last few days https://sheriff-o-matic.appspot.com/chromium.perf doesn't show the failure of mac.
,
Oct 2
Seeing this in the analyzer logs:
Merging alerts with same key ("chromium.perf.mac-10_12_laptop_low_end-perf.performance_test_suite."), different title: ("rendering.desktop/twitch_pinch_2018 and 2 other(s) in performance_test_suite failing on chromium.perf/mac-10_12_laptop_low_end-perf" vs "rendering.desktop/cats_unscaled and 28 other(s) in performance_test_suite failing on chromium.perf/mac-10_12_laptop_low_end-perf")
And later on, this one which also looks relevant:
Error getting alerts for builder mac-10_12_laptop_low_end-perf: [json: cannot unmarshal string into Go struct field Build.currentStep of type messages.CurrentStep (and 2 other errors)]
,
Oct 2
Think I found the issue, it's trying to parse something it doesn't actually ever use so I'm removing it: https://chromium-review.googlesource.com/c/infra/infra/+/1257275
,
Oct 2
The following revision refers to this bug: https://chromium.googlesource.com/infra/infra/+/34d4e65d73ffe73b5aeda3cf285778bc139f5ae6 commit 34d4e65d73ffe73b5aeda3cf285778bc139f5ae6 Author: Sean McCullough <seanmccullough@chromium.org> Date: Tue Oct 02 19:26:00 2018 [som] Don't try to parse currentStep. Bug: 879417 Change-Id: I94e6cb6011bcbe3e13b56b53fda9dbff7e894798 Reviewed-on: https://chromium-review.googlesource.com/c/1257275 Reviewed-by: Tiffany Zhang <zhangtiff@chromium.org> Commit-Queue: Sean McCullough <seanmccullough@chromium.org> Cr-Commit-Position: refs/heads/master@{#17997} [modify] https://crrev.com/34d4e65d73ffe73b5aeda3cf285778bc139f5ae6/go/src/infra/monitoring/messages/buildextract.go
,
Oct 2
The change in #18 is now live on prod. PTAL and re-open if it's still a broken. |
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by nednguyen@chromium.org
, Aug 31