Investigate perf regression due to WebIDBFactoryImpl IOThreadHelper class removal
Issue description: See the link to graphs below.
Jul 30
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/13d430c3a40000
Jul 31
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16b71e8fa40000
Aug 1
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/149dc0ffa40000
Aug 1
Kicked off a bisect on the Mac to see if it has better luck than the Nexus devices.
Aug 1
📍 Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/149dc0ffa40000

Remove WebIDBFactoryImpl IOThreadHelper class by cmp@chromium.org
https://chromium.googlesource.com/chromium/src/+/b5ce7038bfcaa4cdec4fe261f2dc8fc153eb4b59
10.64 → 11.35 (+0.7093)

Understanding performance regressions: http://g.co/ChromePerformanceRegressions
Aug 1
Adding others FYI. Adding a link to issue 717812 for now.
Aug 1
This is for IDBObjectStore::put timing. This may have gotten longer because we're doing more work on the main thread now (instead of punting stuff to the IO thread)?
Aug 1
The WebIDBFactoryImpl patch alone should not have a significant effect on the performance of a method like IDBObjectStore::put(), since at that point the connection to the DB has already been made, and that still goes through the old IO thread path. The only thing I can think of is that, since we now wait until the first DB connection is opened to make a connection to the browser process, the first DB connection may be a bit slower than before; but that was done in a different change.
Aug 1
It may also be interesting to note that there was a V8 roll in this commit range as well.
Aug 1
The Pinpoint job definitely found cmp@'s change as the reason. It would be nice if the Pinpoint job output the formatted results, where you can see each timing. Then we could see whether the first 'put' takes longer because we're still opening the database, if that's the reason. We do have traces, but those aren't exactly the best thing to analyze when looking for a tiny perf change over the whole range.
Aug 1
Note to self from Victor's offline comment: try perf test at 3505 and 3506. https://chromiumdash.appspot.com/commits?user=cmp%40chromium.org&platform=Mac
Aug 1
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/13d430c3a40000 The swarming task expired. The bots are likely overloaded, dead, or misconfigured.
Aug 2
📍 Found significant differences after each of 2 commits. https://pinpoint-dot-chromeperf.appspot.com/job/16b71e8fa40000

Remove WebIDBFactoryImpl IOThreadHelper class by cmp@chromium.org
https://chromium.googlesource.com/chromium/src/+/b5ce7038bfcaa4cdec4fe261f2dc8fc153eb4b59
16.71 → No values

Roll src/third_party/chromite 0e00185925aa..3b438311bbf3 (1 commits) by chromite-chromium-autoroll@skia-buildbots.google.com.iam.gserviceaccount.com
https://chromium.googlesource.com/chromium/src/+/2d5685c1264d02c4dc671cc9638d6ef02a133279
No values → 21.63

Assigning to sheriff chrome-os-gardeners@chromium.org because "Roll src/third_party/chromite 0e00185925aa..3b438311bbf3 (1 commits)" is a roll.
Understanding performance regressions: http://g.co/ChromePerformanceRegressions
Aug 2
Repeating what I mentioned in chat: There definitely seems to be some regression here, but I don't know if it is a bug in the implementation or a bug in the test. If you look at the traces before/after this change, in the after case you see that several of the get() operations don't complete in the iteration they start in (i.e. the tests do the same thing 20 times, but some of the gets from earlier iterations don't end until much later).
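The spillover described above can be sketched with a toy model (plain Python, not Chromium code; the iteration counts and task rates are illustrative, not measured): if each iteration posts more completion tasks than the loaded thread drains, a backlog accumulates and some gets finish after the iteration that started them.

```python
# Toy backlog model (illustrative only; not Chromium code).
# Each iteration posts get() completions to a thread; the thread can
# only drain a fixed amount of work per iteration. On a loaded thread
# the backlog grows, so completions spill into later iterations.
def pending_after(iterations, posted_per_iter, drained_per_iter):
    pending = 0
    for _ in range(iterations):
        pending += posted_per_iter                 # completions posted this iteration
        pending -= min(pending, drained_per_iter)  # work the thread drains
    return pending  # completions that slipped past their iteration

# Loaded main thread: drains only 3 of the 4 completions posted per iteration.
print(pending_after(20, 4, 3))  # → 20 (a steadily growing backlog)

# Underutilized IO thread: keeps up, nothing spills over.
print(pending_after(20, 4, 4))  # → 0
```

The point of the sketch is only that a small per-iteration deficit on the busier thread compounds across the 20 test iterations, which is consistent with gets from earlier iterations not ending until much later.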
Dec 15
Issue 871369 has been merged into this issue.
Dec 15
Marked some duplicates of this bug:
https://crbug.com/871252 - 25.9% regression in blink_perf.owp_storage - mac-10_13_laptop_high_end-perf
https://crbug.com/871369 - 11.2% regression in blink_perf.owp_storage - android-nexus5x-perf
Dec 15
Updating summary... here's the original: "9%-25.5% regression in blink_perf.owp_storage at 578917:578941". This bug is related to https://crbug.com/872155.
Dec 21
FYI, I investigated this a couple of times and found the most likely cause to be that the Mojo scheduling calls were moved in https://crrev.com/b5ce7038bfcaa4cdec4fe261f2dc8fc153eb4b59. Before that change, the Mojo calls would come in on the (relatively underutilized) content renderer IO thread. After that change, the Mojo calls came in on the (relatively loaded) Blink main thread. Since the Blink main thread was more heavily loaded, sometimes the Mojo calls would not fire as quickly as when they were on the content renderer IO thread. This cost each iteration of the idb-load-docs perf test around 5 ms, and occasionally enough of the tasks would back up to cause a significant overrun, increasing the overall result by an amount big enough for chromeperf to notice.

We theorized at the time that there was no way to avoid this extra cost, since onion souping IndexedDB required moving all of that work off of content renderer's IO thread. We also expected that getting through IndexedDB onion souping would yield an overall improvement on this and other perf tests.

I re-ran the idb-load-docs tests to verify this. These are the results:

~July 2018, before onion souping:
* avg 83.5 ms, median 78.1 ms
* avg 86.4 ms, median 84.8 ms
* avg 81.6 ms, median 75.6 ms

December 14, 2018:
* avg 74.2 ms, median 71.9 ms
* avg 76.7 ms, median 75.0 ms
* avg 78.4 ms, median 73.9 ms

Any regression we experienced from the earlier change has been earned back by our IndexedDB refactoring work. Based on that, I'm going to mark this as Fixed. I will point members of the Blink scheduling team at this bug in case there's additional work we should do based on our findings.
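As a quick cross-check of the run averages quoted in this comment (plain Python arithmetic; the values are copied from the comment, and `statistics` is the Python standard library):

```python
# Average the three idb-load-docs runs quoted above and compute the
# relative change between the two dates.
from statistics import mean

before = [83.5, 86.4, 81.6]  # avg ms per run, ~July 2018 (before onion souping)
after = [74.2, 76.7, 78.4]   # avg ms per run, December 14, 2018

improvement = (mean(before) - mean(after)) / mean(before) * 100
print(f"before: {mean(before):.1f} ms, after: {mean(after):.1f} ms")  # → before: 83.8 ms, after: 76.4 ms
print(f"improvement: ~{improvement:.1f}%")  # → improvement: ~8.8%
```

This is consistent with the comment's conclusion: the refactored code is faster overall than the pre-regression baseline, so the earlier ~5 ms per-iteration cost has been more than earned back.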
Comment 1 by 42576172...@developer.gserviceaccount.com, Jul 30