
Issue 776375 link

Starred by 4 users

Issue metadata

Status: WontFix
Owner:
Closed: Oct 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression

Blocking:
issue 773970




87% regression in thread_times.simple_mobile_sites at 509316:509373

Project Member Reported by chiniforooshan@chromium.org, Oct 19 2017

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Oct 19 2017

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=776375

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=2fb921f9ed52df17c9d9d31d06d2a82992b0162a637a041b0f2c2cab6f420e24


Bot(s) for this bug's original alert(s):

android-nexus5X
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Oct 19 2017

Cc: hajimehoshi@chromium.org
Owner: hajimehoshi@chromium.org
Status: Assigned (was: Untriaged)

=== Auto-CCing suspected CL author hajimehoshi@chromium.org ===

Hi hajimehoshi@chromium.org, the bisect results pointed to your CL, please take a look at the
results.


=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Hajime Hoshi
  Commit : 1b8485c6d66e754d5fd729d636e5d1c50a261e21
  Date   : Tue Oct 17 09:52:33 2017
  Subject: Revert "Revert "Add RendererSchedulerImpl::V8TaskQueue()""

Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : thread_times.simple_mobile_sites
  Metric       : thread_other_cpu_time_per_frame/http___m.nytimes.com_
  Change       : 63.25% | 0.29229375473 -> 0.477180665531

Revision             Result                      N
chromium@509315      0.292294 +- 0.179034        6      good
chromium@509330      0.118043 +- 0.00690289      6      good
chromium@509337      0.115045 +- 0.00563413      6      good
chromium@509338      0.118146 +- 0.00821447      6      good
chromium@509339      0.48434 +- 0.0299722        6      bad       <--
chromium@509341      0.480644 +- 0.037889        6      bad
chromium@509344      0.477082 +- 0.0211648       6      bad
chromium@509373      0.477181 +- 0.0280439       6      bad
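
As a sanity check, the reported 63.25% change can be recomputed from the endpoint means in the table above (a quick Python check, using the values as printed by the bisect bot):

```python
# Recompute the regression magnitude from the bisect endpoints:
# chromium@509315 (good) vs. chromium@509373 (bad).
good_mean = 0.292294  # thread_other_cpu_time_per_frame at 509315
bad_mean = 0.477181   # thread_other_cpu_time_per_frame at 509373

change_pct = (bad_mean - good_mean) / good_mean * 100
print(f"{change_pct:.2f}%")  # 63.25%
```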

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http...m.nytimes.com. thread_times.simple_mobile_sites

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8965293892647281360


For feedback, file a bug with component Speed>Bisection
The original CL had caused a regression too: crbug.com/772145
Cc: haraken@chromium.org altimin@chromium.org
Cc: skyos...@chromium.org
I'm not sure this is a serious regression, but I'd like to mark this WontFix, since I suspect it is a temporary regression that we can fix by improving scheduling.

haraken@, skyostil@, altimin@ wdyt?
Project Member

Comment 8 by 42576172...@developer.gserviceaccount.com, Oct 20 2017

Cc: pmeenan@chromium.org
 Issue 776756  has been merged into this issue.
Hmm. This looks like a real regression; 87% is huge. Also, 5-7% regressions are observed in Dromaeo/dom.

Do you know why introducing the V8 task queue caused the regression?


Project Member

Comment 10 by 42576172...@developer.gserviceaccount.com, Oct 23 2017

Cc: petermarshall@chromium.org
 Issue 777329  has been merged into this issue.
Blocking: 773970
Cc: tasak@google.com
I think this is a real regression and needs to be addressed.

tasak@ is looking into performance impacts of task scheduling and might have some insight.
Thank you, tasak@-san!

Comment 14 by tasak@google.com, Oct 24 2017

Looking...

Does anyone know of any documents about thread_other_cpu_time_per_frame?

I was not able to find any, so I'm reading the Python scripts.
As far as I can tell, the name is generated by ThreadCpuTimeResultName (tools/perf/metrics/timeline.py), so searching code search for 'thread_other_cpu_time_per_frame' doesn't turn anything up...

Cc: alexclarke@chromium.org
I think "other" could be one of the IO or worker threads that aren't explicitly mapped to a named bucket. I couldn't see a significant difference in any of these threads, so the first task here would be to find out exactly which thread regressed.

This is all very bizarre, since there seems to be barely any v8 task queue activity in the traces that regressed. I can't think of a reason why adding a single extra task queue would make the scheduler code noticeably more expensive (which looks like what's happening here). A perf profile might tell us a little more.
I note the new QueueType::V8 is classified as QueueClass::NONE, which means it won't get throttled. Previously these tasks were on the Loading task queue, which does get throttled. Could that be the cause?

Comment 17 by tasak@google.com, Oct 24 2017

I've just finished reading timeline metrics related code.

So I've just started analyzing the tracing files.

Comment 18 by tasak@google.com, Oct 24 2017

I think "other" (for the renderer) covers these threads:

                "name": "TaskSchedulerRendererForegroundWorker"
                "name": "TaskSchedulerRendererForegroundWorker"
                "name": "ScriptStreamerThread"
                "name": "TaskSchedulerRendererForegroundWorker"
                "name": "TaskSchedulerRendererBackgroundBlockingWorker"
                "name": "TaskSchedulerRendererForegroundWorker"
                "name": "TaskSchedulerRendererForegroundWorker"

Looking at the tracing files, "ScriptStreamerThread" and "TaskSchedulerRendererBackgroundBlockingWorker" are innocent, because they have no slices during Interaction.Gesture_ScrollAction.

I think, we need to investigate:
   "TaskSchedulerRendererForegroundWorker" 
      src_file "../../gin/v8_platform.cc"
      src_func "CallOnBackgroundThread"

Project Member

Comment 19 by 42576172...@developer.gserviceaccount.com, Oct 24 2017

Cc: kraynov@chromium.org
 Issue 777792  has been merged into this issue.
Project Member

Comment 20 by 42576172...@developer.gserviceaccount.com, Oct 24 2017

 Issue 777822  has been merged into this issue.

Comment 21 by tasak@google.com, Oct 25 2017

I finished investigating this issue.

I agree with alexclarke. I think this issue is caused by v8_tq.
The TaskSchedulerRendererForegroundWorker tasks are posted by v8_tq tasks.


Thank you for the investigation, tasak@-san!

So the solution would be:

1) Create a new throttleable policy for V8 task queue
2) Use the same policy as loading for V8 task queue for backward compatibility

I think 1) would make more sense.

Comment 23 by tasak@google.com, Oct 25 2017

The TaskSchedulerRendererForegroundWorker tasks are for V8.GC.

So after chromium@509339, V8 incremental GC spends more time than before chromium@509339.
I think the Android devices don't have much memory, so Chrome wants to invoke V8 GC. Since the new v8_tq is not throttled, the GC is not throttled either, and thread_other_cpu_time_per_frame regresses.
I'm a little confused, how do TaskSchedulerRendererForegroundWorker tasks get accounted against the "other" thread (which is what regressed here)? Shouldn't they get counted toward the main thread since that's where they execute?

In any case I agree that we need a throttling policy for this new queue which more or less matches how things were before. We can talk about adding high priority v8 tasks after that.
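
The throttling argument above can be sketched as a toy model (this is not Chromium's actual scheduler; the function and parameter names here are illustrative only). A throttled queue defers its tasks until after the gesture, so they contribute nothing to the measured interaction window; an unthrottled queue runs them in-frame:

```python
# Toy model (not Chromium's actual scheduler) of why an unthrottled
# queue raises worker-thread CPU time per frame during a scroll gesture.
def other_cpu_us_per_frame(frames, gc_task_us, gc_tasks, throttled):
    """Average worker-thread CPU time (us) per frame during the gesture."""
    # When throttled, GC tasks are deferred past the gesture and cost
    # nothing inside the measured interaction window.
    tasks_run_in_window = 0 if throttled else gc_tasks
    return tasks_run_in_window * gc_task_us / frames

# Before: v8 tasks rode on the throttled loading queue.
print(other_cpu_us_per_frame(60, 300, 40, throttled=True))   # 0.0
# After: the new v8_tq is unthrottled, so GC work lands in-frame.
print(other_cpu_us_per_frame(60, 300, 40, throttled=False))  # 200.0
```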

Comment 26 by tasak@google.com, Oct 27 2017

I don't know what 'thread_other_cpu_time_per_%s' should report.

So if we need a new metric for TaskSchedulerRendererForegroundWorker, we need to update TimelineThreadCategories in src/tools/perf/metrics/timeline.py.

c.f.
ThreadCpuTimeResultName and ThreadCategoryName in  https://cs.chromium.org/chromium/src/tools/perf/metrics/timeline.py

ThreadCpuTimeResultName creates the metric name: "thread_" + thread_category + "_cpu_time_per_" + interval_name.

ThreadCategoryName creates the thread_category by using thread name:
---
  TimelineThreadCategories = {
    "Chrome_InProcGpuThread": "GPU",
    "CrGpuMain": "GPU",
    "AsyncTransferThread": "GPU_transfer",
    "CrBrowserMain": "browser",
    "Browser Compositor": "browser",
    "CrRendererMain": "renderer_main",
    "Compositor": "renderer_compositor",
    "IOThread": "IO",
    "CompositorTileWorker": "raster",
    "DummyThreadName1": "other",
    "DummyThreadName2": "total_fast_path",
    "DummyThreadName3": "total_all"
  }

  _MatchBySubString = ["IOThread", "CompositorTileWorker"]
---
If a given thread name is not in TimelineThreadCategories.keys(), its ThreadCategoryName is 'other'.

'TaskSchedulerRendererForegroundWorker' is not in TimelineThreadCategories.keys(), and 'IOThread' and 'CompositorTileWorker' are not substrings of it. So: 'other'.
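
The lookup described above can be sketched in Python. This is a paraphrase of the timeline.py logic, not the exact source (the snake_case function names are mine):

```python
# Paraphrase of the ThreadCategoryName / ThreadCpuTimeResultName logic
# described above (src/tools/perf/metrics/timeline.py); not verbatim.
TimelineThreadCategories = {
    "Chrome_InProcGpuThread": "GPU",
    "CrGpuMain": "GPU",
    "AsyncTransferThread": "GPU_transfer",
    "CrBrowserMain": "browser",
    "Browser Compositor": "browser",
    "CrRendererMain": "renderer_main",
    "Compositor": "renderer_compositor",
    "IOThread": "IO",
    "CompositorTileWorker": "raster",
}

_MatchBySubString = ["IOThread", "CompositorTileWorker"]

def thread_category_name(thread_name):
    # Exact match first.
    if thread_name in TimelineThreadCategories:
        return TimelineThreadCategories[thread_name]
    # Then substring match for a few special cases.
    for substring in _MatchBySubString:
        if substring in thread_name:
            return TimelineThreadCategories[substring]
    # Everything else falls into the catch-all "other" bucket.
    return "other"

def thread_cpu_time_result_name(thread_category, interval_name):
    return "thread_" + thread_category + "_cpu_time_per_" + interval_name

category = thread_category_name("TaskSchedulerRendererForegroundWorker")
print(thread_cpu_time_result_name(category, "frame"))
# thread_other_cpu_time_per_frame
```

This is why the regression on the V8 GC worker threads shows up under thread_other_cpu_time_per_frame rather than under a renderer-specific metric.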


Previously we had been blocking v8 tasks during user scroll together with the rest of the loading tasks. The introduction of a v8 task queue meant that we started running some GC-related tasks during scrolling. Based on my discussions with V8 folks, I believe we should run these tasks even during scrolling: all regressions are very small (<1ms), and we don't want to stop running GC in case we have critical memory pressure and GC would save us from an OOM (these tasks check for this).

The reported memory regressions are a measurement glitch: we just record more metrics in these tests instead of disabling GC outright.

So I suggest that this is "WontFix". 
Thanks, Alexander, for the analysis! That reasoning sounds right, and if that's the case, I'm fine with marking this as WontFix.


Status: WontFix (was: Assigned)
Thank you, Alexander! I'm also fine with marking this WontFix.
