Binary size alerts broken |
|||||||||||||
Issue descriptionThe builder was broken for a while due to issue 830843 but has been green for the last day or so: https://ci.chromium.org/buildbot/chromium.perf/Android%20Builder%20Perf/. It seems that our old graphs: https://chromeperf.appspot.com/report?sid=be5798bbf51f90ed6952ed30fb05b13eab9f7962dac1e464a154e1470d02af62&num_points=1500 have been succeeded by new ones from the builder renamed to "Android Builder Perf" https://chromeperf.appspot.com/report?sid=670c396cb2213f808343e3fbefd068015aef8fe414c3f1d197a7681cb9384c50&num_points=1500. It doesn't seem like alerts are enabled for the new graphs (and nothing shows up at go/binary-size-alerts). Rename CL here: 5f0dfea8a2626dbce3a41f2e7234b7e474f3a3c3. How do we setup alerts again? And is there a way to combine the data from before the rename so that we can maintain our historical graphs? Not sure who to route this to.
,
Apr 11 2018
Yep! So we can migrate the data from the old name to the new name, I'll get that started tomorrow. I'll also go over your sheriff configs and update as necessary to get alerting back up.
,
Apr 11 2018
Ok I've made the changes to the sheriff configs and anomaly configs, the migration is a bit more involved so I'll get that going tomorrow.
,
Apr 11 2018
Great, thanks Simon! I'll report back once I see alerts start coming in again.
,
Apr 11 2018
,
Apr 11 2018
I've got a migration under way, typically takes a day or 2.
,
Apr 12 2018
,
Apr 13 2018
Checking in, alerts seem to be working again but looks like I didn't get the migration right, will look at that.
,
Apr 13 2018
Forgot to report back yep alerts are working again, thanks!
,
Apr 13 2018
Looks like the graphs are inaccessible without login. Could you make them public?
,
Apr 13 2018
Done, will take a bit to propagate. I also noticed there's an Android arm64 Builder vs Android arm64 Builder Perf, should that also be public?
,
Apr 13 2018
We're not running resource_size.py for arm64, although we may want to... so sure!
,
Apr 17 2018
So the migration is done, but I managed to botch up the first attempt and some of the paths for resource_sizes (MonochromePublic.apk) were migrated to the wrong level. ie. ChromiumPerf/Android Builder/resource_sizes (MonochromePublic.apk)/MonochromePublic.apk_specifics/foo -> ChromiumPerf/Android Builder Perf/resource_sizes (MonochromePublic.apk)/foo I don't think any data was lost, but I'll have to sort out where each of those came from and migrate them one by one now into the proper path.
,
May 22 2018
Any update here? Besides losing the ability to see old metrics, the dropdown for choosing subtest now shows many invalid entries.
,
May 23 2018
Noticed that I hadn't seen any alerts this week. Looks like the graph may be broken again? https://chromeperf.appspot.com/report?sid=d6542096534166992e063320f8e1b7128e10ed53091e865eef3b5295644e60ce - the last point is from 5 days ago.
,
May 24 2018
Looks like Simon is OOO this week. Annie, any ideas how I can fix this or what may be causing it? Seems that our perf graphs aren't getting new data right now.
,
May 24 2018
Note: The bot seems fine, and the resource_sizes step does not show errors: https://ci.chromium.org/buildbot/chromium.perf/Android%20Builder%20Perf/ https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Builder_Perf%2F13690%2F%2B%2Frecipes%2Fsteps%2Fresource_sizes__MonochromePublic.apk_%2F0%2Fstdout# Maybe the issue is on the dashboard's side?
,
May 24 2018
Sorry for the slow response here! You can actually see what was uploaded to the perf dashboard in the json.output link under the buildbot step for resource_sizes. Here's a link to a recent one: https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Builder_Perf%2F13701%2F%2B%2Frecipes%2Fsteps%2Fresource_sizes__SystemWebView.apk_%2F0%2Flogs%2Fjson.output%2F0 In that JSON, you can see that although the step is called "resource_sizes (MonochromePublic.apk)", it uploads with the benchmark name "resource_sizes". And you can see in the graphs (if you look at the date range in the bottom sliders) that it stopped uploading as "resource_sizes (MonochromePublic.apk)" on May 15 and started uploading as "resource_sizes" on May 18: https://chromeperf.appspot.com/report?sid=48e8073d490503595b306647aeeef3cee67701a2e05cca3179a3d8e2d3597557 Unfortunately I'm not familiar with how the perf dashboard upload works from resource_sizes. Emily, do you know? I see in the logs corresponding to the JSON upload that "'--perf-dashboard-id=resource_sizes (MonochromePublic.apk)'" is passed into run_test.py (https://logs.chromium.org/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_Builder_Perf%2F13701%2F%2B%2Frecipes%2Fsteps%2Fresource_sizes__MonochromePublic.apk_%2F0%2Fstdout)
,
May 25 2018
Ah! Thanks for figuring this out & for the pointer about seeing what gets uploaded! I think I broke this :( https://chromium-review.googlesource.com/c/chromium/tools/build/+/1025870 self.m.chromium.runtest( self.c.resource_sizes, resource_sizes_args, - name=test_name, + name=test_name + step_suffix, perf_dashboard_id=test_name, point_id=None, - test_type=test_name, + test_type='resource_sizes', annotate=self.m.chromium.get_annotate_by_test_name(test_name), results_url='https://chromeperf.appspot.com', perf_id=perf_id or self.m.properties['buildername'], chartjson_file=chartjson_file) Looks like "test_type" actually maps to the "Test Suite".
,
May 25 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/25893f980f13d2f384f355955acf57f876275482 commit 25893f980f13d2f384f355955acf57f876275482 Author: Andrew Grieve <agrieve@chromium.org> Date: Fri May 25 02:57:31 2018 Fix resource_sizes tests results having incurrect "Test Suite" field Parameter was incorrectly changed in 8d8612afb3866855b664a1cb9a457b8784a0efe3. TBR=jbudorick Recipe-Nontrivial-Roll: build_limited_scripts_slave Bug: 831328 Change-Id: I9878808209d1a0b6f2a021c6159081b401253069 Reviewed-on: https://chromium-review.googlesource.com/1072321 Commit-Queue: agrieve <agrieve@chromium.org> Reviewed-by: agrieve <agrieve@chromium.org> [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/cronet.expected/android_cronet_x86_builder.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/cronet.expected/android_cronet_marshmallow_64bit_builder.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipe_modules/chromium_android/examples/full.expected/resource_size_builder_basic.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/cronet.expected/android_cronet_arm64_builder.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/cronet.expected/android_cronet_kitkat_builder.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/android/builder.expected/full_chromium_perf_Android_Builder_Perf.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipe_modules/cronet/examples/full.expected/mb_test.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipe_modules/chromium_android/tests/resource_sizes.expected/basic.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipe_modules/chromium_android/api.py [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/cronet.expected/android_cronet_lollipop_builder.json [modify] https://crrev.com/25893f980f13d2f384f355955acf57f876275482/scripts/slave/recipes/android/builder.expected/full_chromium_perf_Android_arm64_Builder_Perf.json
,
Jun 20 2018
It's probably not a priority to fix up the botched historical data. Just going to mark as fixed for now.
,
Jun 20 2018
Spoke too soon. Looks like alerts are not firing even though there are plenty of 16kb jumps. Did alerts get disabled recently for this graph? Direct link where I'm expecting alerts: https://chromeperf.appspot.com/report?sid=d6542096534166992e063320f8e1b7128e10ed53091e865eef3b5295644e60ce&start_rev=558764&end_rev=561960&num_points=1600
,
Jun 20 2018
I can take a look
,
Jun 20 2018
This is really strange, there's nothing wrong with the test as far as I can tell, sheriff config and anomaly config are set properly. Checking it out in /debug_alert seems to imply that an alert *should* have fired.
,
Jun 20 2018
So I'm finding that the alerts actually exist and are some reason aren't making their way up into the UI.
,
Jun 20 2018
So it looks like it had been marked invalid at some point, what I did was reset it to untriaged. There's a bunch more that were marked invalid, I could go through and see if they need to be reset as well. There seems to be a 2nd issue in that the alerts aren't showing on the graphs. I've filed an issue here: https://github.com/catapult-project/catapult/issues/4518 and benjhayden@ is investigating why they're not showing up.
,
Jun 21 2018
If they are marked as ignored (is that the same as invalid?) then that's probably me triaging them. I can file bugs on them through the sheriffing page, by individually selecting and then clicking the "Triage" button, just that they don't show up on the actual graphs.
,
Jun 21 2018
The not showing up on graphs issue should be fixed shortly, we just need to land a cl and redeploy.
,
Jun 21 2018
Thank you Simon for quickly debugging and getting this fixed! That'll really help with easily identifying alerts and filing bugs.
,
Jun 22 2018
Thanks for the quick fix!
,
Jun 22 2018
No problem! Closing this out if there are no more issues. |
|||||||||||||
►
Sign in to add a comment |
|||||||||||||
Comment 1 by sullivan@chromium.org
, Apr 10 2018