New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 776506 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug-Regression



Sign in to add a comment

70.9% regression in system_health.common_desktop at 507399:507489

Project Member Reported by chiniforooshan@chromium.org, Oct 19 2017

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Oct 19 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=776506

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=b00de32bfddc48f4ae3c80db75581471592864dbc541d0daf07e5e5f0638a5fe


Bot(s) for this bug's original alert(s):

chromium-rel-win7-dual
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Oct 19 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: win_perf_bisect
  Benchmark    : system_health.common_desktop
  Metric       : total:500ms_window:renderer_eqt_max/browse_news/browse_news_reddit

Revision             Result                  N
chromium@507398      760.735 +- 1530.22      21      good
chromium@507489      649.677 +- 795.286      21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.reddit system_health.common_desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8965278252256498720


For feedback, file a bug with component Speed>Bisection
Retrying on a less noisy metric.
Project Member

Comment 6 by 42576172...@developer.gserviceaccount.com, Oct 20 2017

📍 Couldn't reproduce a difference.
https://pinpoint-dot-chromeperf.appspot.com/job/14b28517780000
Cc: simonhatch@chromium.org
Status: WontFix (was: Untriaged)
Marking as won't fix since the bisect cannot find the culprit. CCing simonhatch@ to check if this is normal. The graph show a clear ~70% regression and the bisect cannot find a culprit. Maybe the regression is because of a change in the bots...
Cc: sullivan@chromium.org perezju@chromium.org
+perezju, sullivan

I'm not aware of any change on the bots that coincides with that, plus it jumped on all bots. There doesn't seem to be any ref data available anywhere, not sure if this is a known issue, +annie otherwise should look into that.
Cc: tdres...@chromium.org
+tdresser: Any thoughts on how much to investigate? We haven't been monitoring this metric long so not sure how confident we are.
Cc: -perezju@chromium.org nedngu...@google.com charliea@chromium.org
+charliea, +nednguyen who own system_health.common_*
Presumably we'll start getting ref data automatically eventually?

Do the points drawn on the dashboard come from a single run, or multiple runs?

This metric does have a fair bit of noise in some cases. Reddit looks pretty rock solid though.

The next step would be to take a look at a trace. Unfortunately, I'm getting a "An error occurred: Error uploading patch to rietveld_service" message when trying to get a debug trace.
Cc: vhang@chromium.org pschmidt@chromium.org
Status: Available (was: WontFix)
Note that this happens on multiple different bots: chromium-rel-win-7, chromium-rel-win-7-gpu-nvidia, chromium-rel-mac-12, chromium-rel-win7-x64-dual, linux-release, chromium-rel-win-7-gpu-ati, so it's super unlikely to be a simple bot hardware issue.

Besides looking at the trace at Tim mentioned, we should also look at the similar metric on other benchmark (which I suspect didn't alert because those other benchmarks have ref build).

+Vince/Peter: are you aware of any major change in the lab recently? 
Before & after traces on ChromiumPerf/chromium-rel-win7-dual/system_health.common_desktop / total:500ms_window:renderer_eqt_max / browse_news / browse_news_reddit

Before: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/browse_news_reddit_2017-10-09_09-46-29_70033.html

After: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/browse_news_reddit_2017-10-10_12-33-17_58751.html
Thanks for the traces.

I'm seeing absolutely massive |RenderAccessibilityImpl::SendPendingAccessibilityEvents| slices. They're significantly larger (longest is 400ms) in the "after" trace, but still scary (200ms) in the first trace.

EQT is pretty sensitive to changes in the duration of the longest task, so I suspect this is the source of the regression.

Any idea why this is taking so long?
Re: #12.  Can you give me actual bulder/bot names?
Cc: dmazz...@chromium.org
+dmazzoni: any ideas why RenderAccessibilityImpl::SendPendingAccessibilityEvents could take so long? Does this do any communication with the server or anything else that could behave strangely on bots?
I don't understand why you should be seeing any traces at all for RenderAccessibilityImpl::SendPendingAccessibilityEvents when running browse_news_reddit.

Accessibility is supposed to be off by default, so you shouldn't get any traces at all for that function except for the three stories where we added a command-line flag to enable accessibility.

Independent of that, the first call to SendPendingAccessibilityEvents after page load scales with the number of DOM nodes, so it wouldn't surprise me that a large Reddit page with a lot of comments would be slow. That's something we'd like to fix, and tracking it here is an important first step.

Still, the most important question is why we're seeing these traces for stories that shouldn't be enabling accessibility.

Project Member

Comment 19 by 42576172...@developer.gserviceaccount.com, Oct 24 2017


=== BISECT JOB RESULTS ===
NO Perf regression found

Bisect Details
  Configuration: win_perf_bisect
  Benchmark    : system_health.common_desktop
  Metric       : total:500ms_window:renderer_eqt_max/browse_news/browse_news_reddit

Revision             Result                  N
chromium@507335      718.532 +- 1100.95      21      good
chromium@507624      639.183 +- 741.934      21      bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=browse.news.reddit system_health.common_desktop

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8964843805262352400


For feedback, file a bug with component Speed>Bisection
Who is the right person to follow up on why these trace points are showing up?
Has anyone tried to reproduce this locally?

If this reproduces locally (i.e. you can see the traces) I can debug it from there, maybe next week.

If you run it live, try visiting chrome://accessibility in a tab and see if any checkboxes are checked.

When I added the accessibility stories I manually checked that as a sanity check - chrome://accessibility was showing accessibility enabled for those stories but not for others - so I'm not sure what's going on here.

dmazzoni@: you can click "M" button in the trace to see the commandline flag used. Can you check if the accessibility is enabled accidentally?
Owner: crouleau@chromium.org
Yep, that's it:

--force-renderer-accessibility

It looks like the extra browser args from tools/perf/page_sets/system_health/accessibility_stories.py are somehow leaking to other stories.

@crouleau, could you look at https://chromiumcodereview.appspot.com/3011293002 again?

Labels: -Pri-2 Pri-1
Status: Assigned (was: Available)
This is a bit serious bug, so I am raising this bug to P1.
Cc: perezju@chromium.org
finder_options is a mess 😢

I think the bug is here:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/page/shared_page_state.py?q=_StartBrowser

we should not be mutating the finder options there, as this is a global object shared by all stories in the set.

The fix is probably to make a copy of the object there, mutate the copy, and pass that to possible_browser.Create(..)
Hi, I will fix this. It seems like the fix is fairly straightforward.
Update: I verified that perezju@ is correct in both his diagnosis and in his suggested fix by adding log statements, verifying that flags from previous page extra flags were in the next page, and then verifying that that was no longer the case after the fix. CL is on its way. I figure we will need to add unit tests after since we want to get this fixed asap.
This should be fixed now. See https://github.com/catapult-project/catapult/issues/4000
Status: Fixed (was: Assigned)
Marking this fixed. Please verify, dmazzoni@chromium.org.
Status: Started (was: Fixed)
Caleb: can you add test that would catch the bug as well?
 dmazzoni@, you can use the trace link by clicking on graph link: https://screenshot.googleplex.com/1chA0UeMoQA.png
Status: Fixed (was: Started)
Sure, I can add a test. I will add it as a task for https://github.com/catapult-project/catapult/issues/4000, and still mark this bug fixed.
Re #32: Awesome! Thanks folks!
Cc: majidvp@chromium.org nednguyen@chromium.org
 Issue 770860  has been merged into this issue.

Sign in to add a comment