Bisect numbers completely different from perf dashboard
Issue description
Context: https://bugs.chromium.org/p/chromium/issues/detail?id=636420
According to the dashboard (https://chromeperf.appspot.com/group_report?bug_id=636420), the metric increased from 550,228 to 619,856 bytes. However, according to the bisect (https://bugs.chromium.org/p/chromium/issues/detail?id=636420#c3), the metric stayed at 16,384 bytes.
What confuses me is not that the bisect failed to reproduce the regression (that happens quite often), but that the bisect results are more than an order of magnitude away from the dashboard values. How is that possible? Do we use completely different hardware on the bisect bots? Is there a bug in the code that sets up the revisions during a bisect? Any ideas?
,
Aug 11 2016
Correction for #1: Dashboard: 31,007,400 → 31,421,700
,
Aug 11 2016
,
Aug 11 2016
,
Aug 11 2016
Wow, thanks for the detailed report! I am working off of this spreadsheet about our hardware: https://docs.google.com/spreadsheets/d/1LTOSY9y1_sdDiL94XQTZrXmSzokn4p44hlgHQvLnEt4/edit#gid=62496670
Comment #1:
Builder: Win 7 Perf (3) (build187-m1)
  windows 2008 R2 PowerEdge R210 II x64 1 Intel(R) Xeon(R) CPU E3-1220 V2 @ 3.10GHz 15.97 GB 2.4.4 2.8.3.windows.1
  "values": [ 619856 ]
Bisector: win_perf_bisect (build242-m4)
  windows 2008 R2 PowerEdge R220 x64 1 Intel(R) Xeon(R) CPU E3-1230 v3 @ 3.30GHz 23.83 GB 2.4.4 2.8.3.windows.1
  bisector.lkgr: RevisionState(rev=chromium@410353, values=[16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384], mean_value=16384.0, std_dev=0.0)
  bisector.fkbr: RevisionState(rev=chromium@410387, values=[16384, 16384, 16384, 16384, 16384, 16384, 16384, 16384], mean_value=16384.0, std_dev=0.0)
Dave, it looks like the hardware doesn't match! Did we bisect against the correct bisector? Or have we got a hardware mismatch in the lab? (Also, you might want to regenerate the spreadsheet to double-check.) Should this have gone to win_x64_perf_bisect, which has the same hardware? It looks like we could swap out Windows bisectors with Android bisector linux hosts to update the configs.
Petr, could that difference account for such a huge change in results?
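As a rough sanity check on the numbers quoted above, the following Python sketch recomputes the mean and standard deviation of the bisect samples and compares them with the single dashboard value. The helper name `summarize` and the use of the population standard deviation are assumptions made for illustration; this is not the bisect recipe's actual code.

```python
import statistics


def summarize(values):
    """Return (mean, population standard deviation) for a list of samples.

    Hypothetical helper; the real bisect code may compute std_dev differently.
    """
    return statistics.mean(values), statistics.pstdev(values)


# Samples copied from the bisect log (lkgr) and the perf bot output above.
bisect_lkgr = [16384] * 12
dashboard_value = 619856

mean, std_dev = summarize(bisect_lkgr)
ratio = dashboard_value / mean
print(f"bisect mean={mean}, std_dev={std_dev}, dashboard/bisect={ratio:.1f}x")
```

A standard deviation of exactly zero across every run, combined with a gap of this size, suggests the two bots are measuring something different rather than the same thing with noise.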
,
Aug 11 2016
The bots seem to be very similar (3.10 vs. 3.30 GHz and 15.97 vs. 23.83 GB). I don't see how this could bring down V8's memory allocated by malloc from 605 KiB to 16 KiB. Strange coincidence: I've just looked through the charts on the dashboard (https://chromeperf.appspot.com/group_report?bug_id=636420) and I can see that 16,384 was reported on the dashboard by a completely different bot, chromium-rel-mac10 (brown chart at the top).
,
Aug 11 2016
Following up on the case in #1: all the perf bots and bisectors are on the exact same hardware/OS (MacBookPro11,2 with OS X 10.11.6). I think there is something strange happening with the metrics, Petr. Assigning to you to triage.
,
Aug 11 2016
I definitely think the problem is in either the metric or the dashboard. Before and after traces for "ChromiumPerf/chromium-rel-mac10/system_health.memory_desktop / memory:chrome:all_processes:reported_by_chrome:v8:allocated_by_malloc:effective_size_avg / load_tools /" in https://chromeperf.appspot.com/group_report?bug_id=636420:
Before: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_32-2016-08-07_08-11-19-33994.html
After: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_34-2016-08-09_08-44-44-82743.html
Clicking on the memory dumps in both traces shows that the memory data are very similar, so I wouldn't expect a huge regression like the one seen on the graph.
,
Aug 11 2016
Note that in #5 I pulled the numbers we were seeing from the dashboard out of the chartjson from the perfbot; you could go back and check all the json.output links on the bots to verify, but I'm pretty sure it's the metric and not the dashboard.
,
Aug 12 2016
(The regression that's mentioned in the original post is now actually in https://bugs.chromium.org/p/chromium/issues/detail?id=637269).
#8: The traces you are referring to are from chromium-rel-mac10, but this issue is on Windows.
The situation is really strange. Here are the traces:
DASHBOARD (https://chromeperf.appspot.com/group_report?bug_id=637269)
  r410353:
    LOAD:social:twitter: 2 isolates (16+521 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_35-2016-08-08_10-40-38-7040.html
    BROWSE:social:twitter: 2 isolates (16+521 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_44-2016-08-08_10-40-52-44621.html
  r410387:
    LOAD:social:twitter: 2 isolates (16+589 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_35-2016-08-08_13-19-28-36286.html
    BROWSE:social:twitter: 2 isolates (16+589 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_44-2016-08-08_13-19-42-86279.html
BISECT (https://build.chromium.org/p/tryserver.chromium.perf/builders/win_perf_bisect/builds/6830)
  r410353:
    LOAD:social:twitter: 2 isolates (16+521 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_35-2016-08-12_07-36-54-50217.html
    BROWSE:social:twitter: ONLY 1 ISOLATE (16 KiB allocated by malloc) !!!!!
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_44-2016-08-12_07-37-07-72625.html
  r410387:
    LOAD:social:twitter: 2 isolates (16+589 KiB allocated by malloc)
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_35-2016-08-12_09-05-40-32766.html
    BROWSE:social:twitter: ONLY 1 ISOLATE (16 KiB allocated by malloc) !!!!!
    https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_44-2016-08-12_09-05-53-16135.html
The strange thing here is this:
DASHBOARD:
  LOAD:social:twitter: Always 2 isolates
  BROWSE:social:twitter: Always 2 isolates
BISECT:
  LOAD:social:twitter: Always 2 isolates
  BROWSE:social:twitter: Always 1 isolate !!!!!
ulan, hpayer: This is a very V8-specific thing. Do you guys have any idea why this would happen?
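To make the link between isolate count and the reported metric concrete, here is a small Python sketch that sums the per-isolate malloc sizes listed above. It assumes the metric is simply the sum over isolates; the function name is hypothetical, and because the KiB figures in the traces are rounded, the totals only approximate the byte values on the dashboard.

```python
KIB = 1024


def malloc_total_bytes(isolate_sizes_kib):
    """Sum per-isolate malloc sizes (given in KiB) and convert to bytes.

    Illustrative only: assumes the metric is a plain sum over isolates.
    """
    return sum(isolate_sizes_kib) * KIB


# r410387, BROWSE:social:twitter, using the sizes read off the traces.
dashboard_r410387 = malloc_total_bytes([16, 589])  # two isolates
bisect_r410387 = malloc_total_bytes([16])          # only one isolate

print(dashboard_r410387, bisect_r410387)
```

With two isolates the total lands close to the 619,856 bytes shown on the dashboard, while a single 16 KiB isolate gives exactly the 16,384 bytes the bisect reported, which is consistent with the missing isolate explaining the whole gap.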
,
Aug 12 2016
As for #1, again the numbers reported on the dashboard and by the bisect match the numbers in their traces:
Trace from dashboard (r405900): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2016-07-15_18-42-37-40510.html
Trace from bisect (r405900): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2016-08-08_07-23-20-71724.html
Trace from dashboard (r405918): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2016-07-15_20-44-16-59995.html
Trace from bisect (r405900): https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/trace-file-id_9-2016-08-08_08-15-59-55929.html
There's a huge difference between the size of cc on the dashboard (15.3-17.3 MiB) and in the bisect (56.1 MiB). I dug into the traces and it turns out that CC holds resources/textures of at most 1 MiB on the dashboard, but it holds quite a lot of 3-5 MiB resources/textures in the bisect.
+cc ericrk, reveman: Do you have any idea what could lead to such a huge difference in cc/gpu (e.g. different screen resolution or pixel density)? Both devices (the dashboard buildbot and the bisect trybot) seem to have exactly the same configuration.
,
Aug 12 2016
Is it possible GPU is misconfigured on some bots?
,
Aug 12 2016
For #11 - when the original regression occurred, the Mac retina bots were running MacOS 10.9. These bots appear to have been upgraded to 10.11 on June 27/28. This upgrade caused significant differences in many metrics, including these. Unfortunately, when we compare the perf dashboard to buildbot, we are comparing the original (10.9) metric to the new (10.11) bisect bot metric. I believe this accounts for the difference. The follow-up question is: why did upgrading change this metric so substantially? From looking at the size of tiles being used (for lack of a direct indicator), it appears that our largest (full-window) tile size went from 1024kb to ~3520kb. This makes me suspect that the pre-upgrade retina bots were being forced into a non-retina resolution. Looking at the tile sizes (largest tile is 1kb), it seems very unlikely that we were rendering retina content.
,
Aug 15 2016
ericrk: Your explanation seems very likely. I also suspected this could have something to do with Retina. sullivan: Is there any way to confirm/reject this hypothesis? Are changes like this tracked anywhere?
,
Aug 15 2016
I dug into this some more. You can check the OS in the following roundabout way:
1) Click the stdio link on the graph (you may need to scroll in the tooltip). You'll probably get a 404 since it's expired.
2) Delete the end of the stdio URL until it ends in the build number.
3) On the buildbot status page that results, click the [stdout] link, which is stored for longer in logdog.
4) In the log, there will be a "_LogBrowserInfo" line that starts with either "OS: mac mavericks" or "OS: mac elcapitan".
Through these steps, I found that the bot in question was upgraded to elcapitan on build 3391, which corresponds to the r407939 - r407983 range on the perf dashboard:
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.perf%2FMac_Retina_Perf__3_%2F3390%2F%2B%2Frecipes%2Fsteps%2Fblink_perf.events%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.perf%2FMac_Retina_Perf__3_%2F3391%2F%2B%2Frecipes%2Fsteps%2Fblink_perf.events.reference%2F0%2Fstdout
Eric, it looks like you were looking into the graphs linked in comment 1, https://chromeperf.appspot.com/group_report?bug_id=629108. That spike is at build 3279 (r405901 - r405918), before the upgrade. But it still sounds like there is a problem with retina being properly enabled. Here are the logs from before/after the alert on that page:
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.perf%2FMac_Retina_Perf__3_%2F3278%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_desktop%2F0%2Fstdout
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Fchromium.perf%2FMac_Retina_Perf__3_%2F3279%2F%2B%2Frecipes%2Fsteps%2Fsystem_health.memory_desktop%2F0%2Fstdout
On both of them I see:
(INFO) 2016-07-15 20:34:45,795 browser._LogBrowserInfo:124 max_resolution_height: 2160
(INFO) 2016-07-15 20:34:45,795 browser._LogBrowserInfo:124 max_resolution_width: 4096
I'm not very accustomed to reading this output, but one difference sticks out:
3278: (INFO) 2016-07-15 18:33:05,231 browser._LogBrowserInfo:128 rasterization : enabled
3279: (INFO) 2016-07-15 20:34:45,796 browser._LogBrowserInfo:128 rasterization : unavailable_software
Could that be the issue?
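Comparing the "_LogBrowserInfo" lines by eye is error-prone, so here is a minimal Python sketch that extracts the key/value pairs from two stdout logs and reports the keys whose values differ. The regular expression is inferred only from the log lines quoted above, and the function names are hypothetical; real logs may contain line shapes this does not handle.

```python
import re

# Matches lines such as:
# (INFO) 2016-07-15 18:33:05,231 browser._LogBrowserInfo:128 rasterization : enabled
# The pattern is an assumption based on the excerpts quoted in this thread.
LINE_RE = re.compile(
    r"browser\._LogBrowserInfo:\d+\s+(?P<key>[^:]+?)\s*:\s*(?P<value>.+)$"
)


def parse_browser_info(log_text):
    """Collect key/value pairs from _LogBrowserInfo lines in a stdout log."""
    info = {}
    for line in log_text.splitlines():
        match = LINE_RE.search(line)
        if match:
            info[match.group("key").strip()] = match.group("value").strip()
    return info


def diff_browser_info(before, after):
    """Return {key: (before, after)} for keys present in both logs that differ."""
    shared = sorted(set(before) & set(after))
    return {k: (before[k], after[k]) for k in shared if before[k] != after[k]}


# Excerpts copied from the logs for builds 3278 and 3279 quoted above.
LOG_3278 = (
    "(INFO) 2016-07-15 18:33:05,231 browser._LogBrowserInfo:128 "
    "rasterization : enabled"
)
LOG_3279 = "\n".join([
    "(INFO) 2016-07-15 20:34:45,795 browser._LogBrowserInfo:124 max_resolution_height: 2160",
    "(INFO) 2016-07-15 20:34:45,795 browser._LogBrowserInfo:124 max_resolution_width: 4096",
    "(INFO) 2016-07-15 20:34:45,796 browser._LogBrowserInfo:128 rasterization : unavailable_software",
])

diff = diff_browser_info(parse_browser_info(LOG_3278), parse_browser_info(LOG_3279))
print(diff)
```

Only keys that appear in both logs are compared, so a line missing from one excerpt is not reported as a change.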
,
Aug 15 2016
Thanks for the additional info! I looked into this more. There was a "regression" here in the original graph; it was caused by https://codereview.chromium.org/2151393002, which disabled GPU rasterization on 10.9 systems. There's not much we can do about this regression, as GPU rasterization was disabled to prevent visual corruption. The change to enable GPU rasterization never made it to a stable release on 10.9, so no stable users will experience this "regression". This would also explain why we had such a hard time bisecting: the bisect bot was at 10.11, so it was completely unaffected by the change. So I think we've explained the original regression. I'm still unclear on why the 10.11 and 10.9 bots are so different. The "max resolution" numbers cited above just indicate the GPU's capabilities, not the resolution we're running at. I'm still guessing that retina was not correctly enabled on the old 10.9 bot, but I'm not sure how to confirm this. I've looked at the logs some more, but I don't think we log the actual screen or window resolution. The rasterization values you mention in #15 do explain the original (small) regression, but not the huge difference between 10.9 and 10.11.
,
Oct 3 2016
I'm marking this as "Fixed" because we figured out what the problem was. Please re-open if you feel this is not appropriate.
,
Nov 9 2016
Comment 1 by petrcermak@chromium.org
, Aug 11 2016