Issue metadata
87% regression in thread_times.simple_mobile_sites at 509316:509373
Issue description
See the link to graphs below.
Oct 19 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8965293892647281360
Oct 19 2017
=== Auto-CCing suspected CL author hajimehoshi@chromium.org ===
Hi hajimehoshi@chromium.org, the bisect results pointed to your CL, please take a look at the results.

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author : Hajime Hoshi
  Commit : 1b8485c6d66e754d5fd729d636e5d1c50a261e21
  Date   : Tue Oct 17 09:52:33 2017
  Subject: Revert "Revert "Add RendererSchedulerImpl::V8TaskQueue()""

Bisect Details
  Configuration: android_nexus5X_perf_bisect
  Benchmark    : thread_times.simple_mobile_sites
  Metric       : thread_other_cpu_time_per_frame/http___m.nytimes.com_
  Change       : 63.25% | 0.29229375473 -> 0.477180665531

Revision            Result                    N
chromium@509315     0.292294 +- 0.179034      6   good
chromium@509330     0.118043 +- 0.00690289    6   good
chromium@509337     0.115045 +- 0.00563413    6   good
chromium@509338     0.118146 +- 0.00821447    6   good
chromium@509339     0.48434  +- 0.0299722     6   bad  <--
chromium@509341     0.480644 +- 0.037889      6   bad
chromium@509344     0.477082 +- 0.0211648     6   bad
chromium@509373     0.477181 +- 0.0280439     6   bad

To Run This Test
  src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=http...m.nytimes.com. thread_times.simple_mobile_sites

More information on addressing performance regressions: http://g.co/ChromePerformanceRegressions
Debug information about this bisect: https://chromeperf.appspot.com/buildbucket_job_status/8965293892647281360
For feedback, file a bug with component Speed>Bisection
Oct 19 2017
The original CL had caused a regression too: crbug.com/772145
Oct 20 2017
I'm not sure this is a serious regression, but I'd like to mark this WontFix, since I guess it is a temporary regression and we should be able to fix it by improving scheduling. haraken@, skyostil@, altimin@, WDYT?
Oct 20 2017
Hmm. This looks like a real regression; 87% is huge. 5-7% regressions are also observed in Dromaeo/dom. Do you know why introducing the V8 task queue caused the regression?
Oct 24 2017
I think this is a real regression and needs to be addressed. tasak@ is looking into performance impacts of task scheduling and might have some insight.
Oct 24 2017
Thank you, tasak@-san!
Oct 24 2017
Looking... Does anyone know of any documentation about thread_other_cpu_time_per_frame? I was not able to find any, so I'm reading the Python scripts. As far as I can tell, the name is generated by ThreadCpuTimeResultName (tools/perf/metrics/timeline.py), which is why searching cs/q='thread_other_cpu_time_per_frame' doesn't turn anything up...
Oct 24 2017
I think "other" could be one of the IO or worker threads that aren't explicitly mapped to named bucket. I couldn't see a significant difference in any of these threads, so the first task here would be to find out exactly which thread regressed. This is all very bizarre since there doesn't seem to be barely any v8 task queue activity in the traces that regressed. I can't think of a reason why adding a single extra task queue would make the scheduler code noticeably more expensive (which looks like what's happening here). A perf profile might tell us a little more.
Oct 24 2017
I note the new QueueType::V8 is classified as QueueClass::NONE, which means it won't get throttled. Previously these tasks were on the Loading task queue, which does get throttled. Could that be the cause?
Oct 24 2017
I've just finished reading the timeline-metrics-related code and have started analyzing the tracing files.
Oct 24 2017
I think, "other"(of renderer) means:
"name": "TaskSchedulerRendererForegroundWorker"
"name": "TaskSchedulerRendererForegroundWorker"
"name": "ScriptStreamerThread"
"name": "TaskSchedulerRendererForegroundWorker"
"name": "TaskSchedulerRendererBackgroundBlockingWorker"
"name": "TaskSchedulerRendererForegroundWorker"
"name": "TaskSchedulerRendererForegroundWorker"
So, looking at the tracing files, "ScriptStreamerThread" and "TaskSchedulerRendererBackgroundBlockingWorker" are innocent, because they have no slices during Interaction.Gesture_ScrollAction.
I think we need to investigate the following (a rough way to pull this out of a trace is sketched after this list):
"TaskSchedulerRendererForegroundWorker"
src_file "../../gin/v8_platform.cc"
src_func "CallOnBackgroundThread"
Oct 24 2017
Issue 777822 has been merged into this issue.
Oct 25 2017
I finished investigating this issue, and I agree with alexclarke: I think this issue is caused by v8_tq. The TaskSchedulerRendererForegroundWorker tasks are posted by v8_tq tasks.
Oct 25 2017
Thank you for the investigation, tasak@-san! So the solution would be either:
1) Create a new throttleable policy for the V8 task queue, or
2) Use the same policy as loading for the V8 task queue, for backward compatibility.
I think 1) would make more sense.
Oct 25 2017
The TaskSchedulerRendererForegroundWorker tasks are for V8.GC. So after chromium@509339, V8 incremental GC spends more time than it did before chromium@509339. I think the Android devices don't have much memory, so Chrome wants to invoke V8 GC; and since the new v8_tq is not throttled, V8 GC is not throttled either, which is why thread_other_cpu_time_per_frame regressed.
Oct 26 2017
I'm a little confused: how do TaskSchedulerRendererForegroundWorker tasks get accounted against the "other" thread (which is what regressed here)? Shouldn't they get counted toward the main thread, since that's where they execute? In any case, I agree that we need a throttling policy for this new queue which more or less matches how things were before. We can talk about adding high-priority V8 tasks after that.
Oct 27 2017
I don't know what 'thread_other_cpu_time_per_%s' should report. So if we need a new metric for TaskSchedulerRendererForegroundWorker, we need to update TimelineThreadCategories in src/tools/perf/metrics/timeline.py.

c.f. ThreadCpuTimeResultName and ThreadCategoryName in https://cs.chromium.org/chromium/src/tools/perf/metrics/timeline.py

ThreadCpuTimeResultName creates the metric name: "thread_" + thread_category + "_cpu_time_per_" + interval_name.
ThreadCategoryName derives the thread_category from the thread name:
---
TimelineThreadCategories = {
  "Chrome_InProcGpuThread": "GPU",
  "CrGpuMain": "GPU",
  "AsyncTransferThread": "GPU_transfer",
  "CrBrowserMain": "browser",
  "Browser Compositor": "browser",
  "CrRendererMain": "renderer_main",
  "Compositor": "renderer_compositor",
  "IOThread": "IO",
  "CompositorTileWorker": "raster",
  "DummyThreadName1": "other",
  "DummyThreadName2": "total_fast_path",
  "DummyThreadName3": "total_all"
}

_MatchBySubString = ["IOThread", "CompositorTileWorker"]
---
If a given thread name is not in TimelineThreadCategories.keys(), its ThreadCategoryName is 'other'. 'TaskSchedulerRendererForegroundWorker' is not in TimelineThreadCategories.keys(), and 'IOThread' and 'CompositorTileWorker' are not substrings of it, so it falls into 'other'.
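A minimal, hypothetical sketch of that fallback, just to make the categorization concrete (this is my reconstruction from the description above, not the actual timeline.py code; the interval_name handling is an assumption):
---
TimelineThreadCategories = {
    "Chrome_InProcGpuThread": "GPU",
    "CrGpuMain": "GPU",
    "AsyncTransferThread": "GPU_transfer",
    "CrBrowserMain": "browser",
    "Browser Compositor": "browser",
    "CrRendererMain": "renderer_main",
    "Compositor": "renderer_compositor",
    "IOThread": "IO",
    "CompositorTileWorker": "raster",
}
_MatchBySubString = ["IOThread", "CompositorTileWorker"]

def ThreadCategoryName(thread_name):
    # Exact match against the known thread names first.
    if thread_name in TimelineThreadCategories:
        return TimelineThreadCategories[thread_name]
    # Then substring matches (e.g. "Chrome_ChildIOThread" -> "IO").
    for substring in _MatchBySubString:
        if substring in thread_name:
            return TimelineThreadCategories[substring]
    # Everything else, including TaskSchedulerRendererForegroundWorker,
    # ends up in the catch-all "other" bucket.
    return "other"

def ThreadCpuTimeResultName(thread_category, interval_name="frame"):
    return "thread_" + thread_category + "_cpu_time_per_" + interval_name

assert ThreadCategoryName("TaskSchedulerRendererForegroundWorker") == "other"
assert ThreadCpuTimeResultName("other") == "thread_other_cpu_time_per_frame"
---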
Oct 30 2017
Previously we had been blocking V8 tasks during user scrolls, together with the rest of the loading tasks. The introduction of a V8 task queue meant that we started running some GC-related tasks during scrolling. Based on my discussions with V8 folks, I believe that we should run these tasks even during scrolling: all regressions are very small (<1 ms), and we don't want to stop running GC in case we are under critical memory pressure and GC would save us from an OOM (these tasks check for this). The reported memory regressions are a measurement glitch: we just record more metrics in these tests instead of disabling GC outright. So I suggest that this is "WontFix".
Oct 30 2017
Thanks, Alexander, for the analysis! That reasoning sounds totally reasonable, and if that's the case, I'm fine with marking this as WontFix.
Oct 31 2017
Thank you, Alexander! I'm also fine with marking this WontFix.