New issue
Advanced search Search tips

Issue 865699 link

Starred by 2 users

Issue metadata

Status: Started
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug



Sign in to add a comment

platform_MemoryPressure should measure tab switch performance

Project Member Reported by sonnyrao@chromium.org, Jul 19

Issue description

platform_MemoryPressure is useful because it shows us the tab capacity of a particular system configuration, however it doesn't give us any indication of how janky the system is near it's limits.  

It would be useful to get some kind of measurement of this -- I think the easiest might be to measure tab switch times during phase 2.

We could also consider measuring the load times, but this may vary quite a bit because we're currently using live sites, so switch time seems more reasonable.
 
Components: Tests
Cc: -sonnyrao@chromium.org cylee@chromium.org vovoy@chromium.org semenzato@chromium.org cywang@chromium.org
Owner: sonnyrao@chromium.org
I've started looking at this

Page load is still tricky because we're using live sites, so I think I'll stick with switch times for now.

It looks like in the tab switching test we use Chrome's internal histograms to measure this, and we have code in chrome under src/tools/perf/contrib/cros_benchmarks/cros_utils.py to manipulate the histograms.  I confirmed this file is present on the DUT under /usr/local/telemetry/ so we ought to be able to use it in the autotest.

We can measure tab switch duration in phase 2 and report that relatively easily.  

Status: Started (was: Untriaged)
So I have a prototype of this using the code from chromium however I'm seeing some strange behavior when I use it.  I added a phase 3 where we iterate through all of the tabs, and grab the histogram before and afterwards.  I tested it with a limited small number of abs (like 5) and it seems to work, however when I tested it as normal where we hit a discard, somehow it's not working.  So, I'm currently debugging this.
What do you mean by "somehow it's not working"?
Oh sorry -- basically the histogram isn't getting updated after I call tab.Activate() in phase 3.  It remains static after I go through the list of tabs calling Activate on all of them.

NP, thank you for the explanation.  By "it remains static" I assume you mean that it isn't getting new samples.  Unfortunately I haven't seen that behavior before.
ok, so I started just counting all of the times that we're calling tab.Activate() in phase 1 and  once I go over 5 tabs I start to see a divergence in the number of activates and the number of samples in MPArch.RWH_TabSwitchPaintDuration

Possibly a bug in chrome?
When a tab is discarded, it's devtools context was lost, tab.Activate() and many other functions would not work for discarded tab.

See RunPageInteractions() in tab_switching_stories[1], it avoids using lost context by using keyboard emulator to switch tabs and using _GetTabSwitchHistogram()[2] to get the histogram.

[1]: https://cs.chromium.org/chromium/src/tools/perf/contrib/cros_benchmarks/tab_switching_stories.py?rcl=2558336f62d2b75cb3b38025660c365631634892&l=63
[2]: https://cs.chromium.org/chromium/src/tools/perf/contrib/cros_benchmarks/cros_utils.py?rcl=dafb7b3d81a275d9bc3255769731781d110847d3&l=47
#8 I don't think there are any tab discards in this case.

#7 I wonder if there is any legitimate reason to NOT go through the code path that increments the histogram.  I presume you're waiting long enough.

Sorry, just bikeshedding.
re #8 -- yeah I thought about that -- it could be happening for sure, but in the case where I limit the number of tabs there's no discards.  So I'm not sure why the activation count wouldn't match up

re #9 -- I also though maybe i wasn't waiting long enough in phase 3 so i increased the timeout there but it made no difference -- but there was a discard in that case, so #8 might apply.  I'll try increasing the timeout in phase 1 and see if I can get it to match up better
another interesting piece of information -- If I go back to stable channel   68.0.3440.78 and this seems to work as I expect it to and I get a new sample every time I activate a tab.  So, right now it does seem like a bug in chrome :-(
I'm going to check and see if this is also broken on Linux
Cc: sadrul@chromium.org
I bisected it down to between 69.0.3449.0 and 69.0.3451.0 which contains this change:

tab-switching: Use presentation time for measuring tab-switching time.

https://chromium.googlesource.com/chromium/src/+/8bc43cc019d9983e96f9a7c5549f862674991e97

from this bug:
 https://bugs.chromium.org/p/chromium/issues/detail?id=849182

i'm pretty sure that's what is changing the behavior. I'm not sure if they meant to break this or not.  Adding sadrul@ - author of that CL
It'd be best if chromeos used the same tabs measurement used for other platforms: https://cs.chromium.org/chromium/src/tools/perf/benchmarks/tab_switching.py?l=46
re #13 Hi thanks, yes I'd like to do that.   We may need to update some of the existing benchmarks.  

Is the correct way to measure tab switch latency now to use timeline based metrics instead of using the UMA histograms directly?  

Independent of that though, I'm still not sure why the UMA histogram would stop recording samples?
yeah FYI it looks like cros_tab_switching.typical_24 is also broken by the same issue -- which is to be expected

[ RUN      ] cros_tab_switching.typical_24/cros_tab_switching_typical24
opening tab: 0
opening tab: 10
opening tab: 20
(WARNING) 2018-08-01 15:50:33,337 cros_utils.WaitTabSwitching:118  Timed out waiting for histogram count increasing.
switching tab: 0
(WARNING) 2018-08-01 15:50:49,040 cros_utils.WaitTabSwitching:118  Timed out waiting for histogram count increasing.
(WARNING) 2018-08-01 15:51:04,615 cros_utils.WaitTabSwitching:118  Timed out waiting for histogram count increasing.
(WARNING) 2018-08-01 15:51:20,193 cros_utils.WaitTabSwitching:118  Timed out waiting for histogram count increasing.

I'm still working on enabling the timeline based measurement for the autotest
If switching to a predicted tab (e.g. If you keep switch to the next right tab, it's the predicted tab), the histogram value get via javascript: statsCollectionController.getBrowserHistogram('MPArch.RWH_TabSwitchPaintDuration') doesn't update.
If switching to the non-predicted tab (e.g. click a random tab), the histogram value via javascript updates.


re #16 -- ok that makes sense, since we're always switching to the next right tab.  I didn't realize Chrome is doing some prediction there -- I'm not sure that real users operate this way though?  Maybe we should also try switching to a random tab instead?
Cc: nverne@chromium.org
"tab-switching: Use presentation time for measuring tab-switching time." (https://crrev.com/c/1083862) is the culprit.

Before the patch, UMA_HISTOGRAM_TIMES("MPArch.RWH_TabSwitchPaintDuration", ...) is called in the browser process. After the patch, UMA_HISTOGRAM_TIMES() may be called in the browser process or the renderer process. When UMA_HISTOGRAM_TIMES() is called in the renderer process, the renderer histogram data may not be merged to the global histogram data and statsCollectionController.getBrowserHistogram() gets the old data.

A solution is to call StatisticsRecorder::ImportProvidedHistograms() in RenderProcessHostImpl::GetBrowserHistogram() to merged the histogram data.

On chromebook Yorp(Intel Celeron N4000), ImportProvidedHistograms() takes about 0.02 second when there are 10 renderer processes, and takes about 0.2 second when there are 120 renderer processes.
It would really be better to switch to using trace-events to measure the latency here (like it is done for tab_switching benchmark).
Yes, I think both are worth doing.
1. Fixing GetBrowserHistogram() to get up-to-date data.
2. Using trace-events in cros_tab_switching.typical_24. cros_tab_switching uses global histogram to walk around the devtools context lost issue(some tabs may be discarded), there may be other solution.

The CL to fix the histogram issue: https://crrev.com/c/1188175
tab_switching benchmark doesn't handle the devtools context lost issue, it would crash if some tabs are discarded. cros_tab_switching handles context lost with a platform specific solution (sending key press events to switch tab and using histogram to detect if tab switching is finished). It would be better if there is a general solution to the context lost issue.
Project Member

Comment 22 by bugdroid1@chromium.org, Aug 29

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a72a0acd3fab7ef832e3715b86c15741e0dd68ab

commit a72a0acd3fab7ef832e3715b86c15741e0dd68ab
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Wed Aug 29 07:26:24 2018

cros_tab_switching: migrate to TBMv2

Migrate the cros_tab_switching benchmark to use the new TBMv2 metric for
measuring tab-switching latency. The return value of
GetBrowserHistogram("MPArch.RWH_TabSwitchPaintDuration") may be
out-of-date as the histogram data is distributed across browser and
renderer processes. Detecting tab switching completion with histogram is
not working. Use a configurable constant waiting time after tab
switching.

Bug: chromium:865699
Change-Id: I73dce8760b839e2c368bb3bfcfb976de1d817aa2
Reviewed-on: https://chromium-review.googlesource.com/1193384
Reviewed-by: Cheng-Yu Lee <cylee@chromium.org>
Commit-Queue: Vovo Yang <vovoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587031}
[modify] https://crrev.com/a72a0acd3fab7ef832e3715b86c15741e0dd68ab/tools/perf/contrib/cros_benchmarks/tab_switching_bench.py
[modify] https://crrev.com/a72a0acd3fab7ef832e3715b86c15741e0dd68ab/tools/perf/contrib/cros_benchmarks/tab_switching_stories.py

Project Member

Comment 23 by bugdroid1@chromium.org, Aug 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/9a666dd6cfd827e59291c6151cab34368adfbf76

commit 9a666dd6cfd827e59291c6151cab34368adfbf76
Author: Kuo-Hsin Yang <vovoy@chromium.org>
Date: Thu Aug 30 02:20:17 2018

Update the docstring of GetBrowserHistogram

Bug: chromium:865699
Change-Id: I0ac462d400757c3f313dd0e9b0bf555c6da6639c
Reviewed-on: https://chromium-review.googlesource.com/1188175
Reviewed-by: Brian White <bcwhite@chromium.org>
Reviewed-by: Nasko Oskov <nasko@chromium.org>
Commit-Queue: Vovo Yang <vovoy@chromium.org>
Cr-Commit-Position: refs/heads/master@{#587418}
[modify] https://crrev.com/9a666dd6cfd827e59291c6151cab34368adfbf76/content/common/renderer_host.mojom

Project Member

Comment 24 by bugdroid1@chromium.org, Dec 9

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/328aa65b60eedd93ab2bed2fd51e425353b470e5

commit 328aa65b60eedd93ab2bed2fd51e425353b470e5
Author: Luigi Semenzato <semenzato@chromium.org>
Date: Sun Dec 09 08:48:18 2018

cros-go.eclass: allow building from single Go source

CROS_GO_BINARIES can now specify single GO files, allowing
non-standard source directory layouts.

Normally, Go projects have  src/cmd and src/pkg directories
for executables and libraries, respectively.  For instance:

catapult/
  web_page_replay_go/
    src/
      cmd/
        httparchive/
        wpr/
      pkg/
        webpagereplay/

However, sources aren't always organized this way and one
could find, for instance (and NOT hypothetically):

catapult/
  web_page_replay_go/
    wpr.go
    httparchive.go
    webpagereplay/
      < ... bunch of .go files ... >

Go allows such non-standard layouts, but the executables
must be compiled by specifying the path of the individual
Go files, for instance

go build <...>/catapult/web_page_replay_go/wpr.go -o wpr

This change enables this model.

BUG=chromium:865699
TEST=works with a use case (upcoming CL)

Change-Id: I07ccd3b4555a3e4e2802caed5d5ce0ff85915904
Reviewed-on: https://chromium-review.googlesource.com/1359026
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Luigi Semenzato <semenzato@chromium.org>
Reviewed-by: Dan Erat <derat@chromium.org>

[modify] https://crrev.com/328aa65b60eedd93ab2bed2fd51e425353b470e5/eclass/cros-go.eclass

Sign in to add a comment