New issue
Advanced search Search tips

Issue 879417 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

Sheriff-O-Matic no longer shows failure of android-go-perf since it's moved to LUCI

Project Member Reported by nednguyen@chromium.org, Aug 31

Issue description

android-go-perf is now moved to LUCI in https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/android-go-perf/30

I no longer see its tests failure in SOM. Maybe we need to configure something?

Sean: can you let us know what's missing?
 
Cc: -jbudorick@google.com jbudorick@chromium.org
Cc: efoo@chromium.org
Labels: LUCI-Blocker-Chromium
Sean, can you prioritize this issue? The consequence of this is becoming serious as we are moving more perf builders to LUCI.
Probably the flip-side of https://bugs.chromium.org/p/chromium/issues/detail?id=882880  

hinoka: do luci bots show up in build extracts served by milo's GetCompressedMasterJSON api? 
Hinoka@: can you answer Sean's question in #3. If this bug is not fixed, I can't migrate further bots to LUCI.
Cc: seanmccullough@chromium.org
Owner: hinoka@chromium.org
Status: Assigned (was: Untriaged)
Hinoka: can you take a look? This is blocking the LUCI migration, so it would be great if you can prioritize this
re #3: I believe you mean builders.  Yes, LUCI builders do show up in the GetCompressedMasterJSON api.

I traced one of the SoM API requests, and it looks like it's finding android-go-perf correctly:
https://screenshot.googleplex.com/GjHNKFZFhK1
https://screenshot.googleplex.com/b4ZwWJC72Xe
Cc: hinoka@chromium.org
Owner: seanmccullough@chromium.org
Sean: can you drive this investigation further given Ryan's answer in #6?
So BuildExtracts we're getting from GetCompressedMasterJSON do contain builder records from LUCI, and *not* BuildBot?

Asking because one of the BuildExtracts that SoM analyzed apparently contains a build record for the "android-go-perf" builder, with a BuildNumber of 477:
https://screenshot.googleplex.com/DiKLexCGihE

Most recent build that's failing on android-go-perf as of now is: https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/android-go-perf/383

Where's it getting the 477 from?  Possibly related: it can't recognize the build State property, that's why it defaults to "Infra failure" (purple) in this case. 
[disregard "State property" comment in #8]

This:
&{Basedir: BuilderName:android-go-perf CachedBuilds:[477 476 475 474 473 472 471 470 469 468 467 466 465 464 463 462 461 460 459 458 457 456 455 454 453 452 451 450 449 448 447 446 445 444 443 442 441 440 439 438 437 436 435 434 433 432 431 430 429 428] Category: CurrentBuilds:[] PendingBuilds:0 Slaves:[] State:}

is what it's getting from CompressedMasterJSON's BuildExtract.Builders entry for "android-go-perf" - and none of those are valid build numbers AFAICT. 

Furthermore, if I catch the analyzer when it sees "android-go-perf" and hard-code the CachedBuilds list to something valid like 359 ... 383 I get this:

https://screenshot.googleplex.com/iCLSqFtgVmm

Is Milo setting CachedBuilds wrong here? 
Owner: hinoka@chromium.org
Good catch, Milo relies on build numbers for ordering.  It looks like when the builder was flipped, the build number failed to increment.  I can fix this case manually, and I'll investigate why the build numbers did not increment.
Status: Fixed (was: Assigned)
This instance is fixed I think?

https://screenshot.googleplex.com/KScJofk27We
Thanks Hinoka!
Cc: charliea@chromium.org
 Issue 882880  has been merged into this issue.
Owner: seanmccullough@chromium.org
Status: Started (was: Fixed)
Sorry but I am seeing SOM doesn't display failures of many perf builders. Can you look into this again, Sean?

For example:
https://ci.chromium.org/p/chrome/builders/luci.chrome.ci/mac-10_12_laptop_low_end-perf has been failing on the last few days

https://sheriff-o-matic.appspot.com/chromium.perf doesn't show the failure of mac.
Seeing this in the analyzer logs: 

Merging alerts with same key ("chromium.perf.mac-10_12_laptop_low_end-perf.performance_test_suite."), different title: ("rendering.desktop/twitch_pinch_2018 and 2 other(s) in performance_test_suite failing on chromium.perf/mac-10_12_laptop_low_end-perf" vs "rendering.desktop/cats_unscaled and 28 other(s) in performance_test_suite failing on chromium.perf/mac-10_12_laptop_low_end-perf")

And later on, this one which also looks relevant:

Error getting alerts for builder mac-10_12_laptop_low_end-perf: [json: cannot unmarshal string into Go struct field Build.currentStep of type messages.CurrentStep (and 2 other errors)]


Think I found the issue, it's trying to parse something it doesn't actually ever use so I'm removing it: https://chromium-review.googlesource.com/c/infra/infra/+/1257275
Project Member

Comment 18 by bugdroid1@chromium.org, Oct 2

The following revision refers to this bug:
  https://chromium.googlesource.com/infra/infra/+/34d4e65d73ffe73b5aeda3cf285778bc139f5ae6

commit 34d4e65d73ffe73b5aeda3cf285778bc139f5ae6
Author: Sean McCullough <seanmccullough@chromium.org>
Date: Tue Oct 02 19:26:00 2018

[som] Don't try to parse currentStep.

Bug:  879417 
Change-Id: I94e6cb6011bcbe3e13b56b53fda9dbff7e894798
Reviewed-on: https://chromium-review.googlesource.com/c/1257275
Reviewed-by: Tiffany Zhang <zhangtiff@chromium.org>
Commit-Queue: Sean McCullough <seanmccullough@chromium.org>
Cr-Commit-Position: refs/heads/master@{#17997}
[modify] https://crrev.com/34d4e65d73ffe73b5aeda3cf285778bc139f5ae6/go/src/infra/monitoring/messages/buildextract.go

Status: Fixed (was: Started)
The change in #18 is now live on prod. PTAL and re-open if it's still a broken.

Sign in to add a comment