13.1%-15.2% regression in performance_browser_tests at 485606:485695
Issue description: See the link to graphs below.
Jul 13 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8974178120078147024
Jul 13 2017
=== Auto-CCing suspected CL author sunnyps@chromium.org ===
Hi sunnyps@chromium.org, the bisect results pointed to your CL, please take a look at the results.

=== BISECT JOB RESULTS ===
Perf regression found with culprit

Suspected Commit
  Author  : Sunny Sachanandani
  Commit  : 63fbdcc12a7ef762e0652c972986fe74835ba00e
  Date    : Tue Jul 11 17:26:47 2017
  Subject : gpu: Enable gpu scheduler by default.

Bisect Details
  Configuration : mac_air_perf_bisect
  Benchmark     : performance_browser_tests
  Metric        : TabCapturePerformance_comp_gpu_novsync/Capture
  Change        : 12.57% | 16.795059 -> 18.9054815

Revision          Result                 N
chromium@485605   16.7951 +- 0.174227    6    good
chromium@485650   16.8906 +- 0.228688    6    good
chromium@485662   16.8738 +- 0.236854    6    good
chromium@485663   18.7626 +- 0.803885    6    bad   <--
chromium@485664   18.7178 +- 1.23281     6    bad
chromium@485665   18.7674 +- 0.672383    6    bad
chromium@485668   18.9298 +- 1.07151     6    bad
chromium@485673   18.5953 +- 0.88885     6    bad
chromium@485695   18.9055 +- 1.34249     6    bad

To Run This Test
  ./src/out/Release/performance_browser_tests --test-launcher-print-test-stdio=always --enable-gpu

More information on addressing performance regressions:
  http://g.co/ChromePerformanceRegressions

Debug information about this bisect:
  https://chromeperf.appspot.com/buildbucket_job_status/8974178120078147024

For feedback, file a bug with component Speed>Bisection
Jul 13 2017
cc miu@. I enabled this finch feature on waterfall bots (perf bots included) and in developer builds over 3 weeks ago: https://chromium-review.googlesource.com/c/517412/ I got some regressions from perf bots then. Do you know why this benchmark only regressed after I set the feature to enabled everywhere? Is the waterfall field trial config not applying here?
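For context, a minimal sketch of the two rollout states in Chromium's base::Feature terms. The feature name and declaration here are assumptions for illustration, not the real GPU scheduler feature:

  #include "base/feature_list.h"

  // Hypothetical declaration. Before the suspected CL, the default was off,
  // and the waterfall fieldtrial config (or Finch) turned the feature on.
  const base::Feature kGpuScheduler{"GpuScheduler",
                                    base::FEATURE_DISABLED_BY_DEFAULT};
  // The suspected CL effectively flips this to FEATURE_ENABLED_BY_DEFAULT,
  // which reaches every build, including ones that never load the
  // fieldtrial testing config.

  bool UseGpuScheduler() {
    // Call sites are identical under both rollout mechanisms.
    return base::FeatureList::IsEnabled(kGpuScheduler);
  }

If performance_browser_tests never applied the testing config, only the default flip would have reached it, which would be consistent with the regression appearing only now.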
Jul 13 2017
Is this using the is_chrome_branded = true GN arg by any chance? That would prevent the waterfall fieldtrial config from being applied.
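For anyone checking a local build or bot recipe, a one-line args.gn sketch (is_chrome_branded is a real GN arg; that it suppresses the fieldtrial testing config is per the comment above):

  # args.gn: with this set, the waterfall fieldtrial testing config is not
  # applied, so trial-enabled features stay at their code defaults.
  is_chrome_branded = true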
Jul 13 2017
+Ned: do the field trial configs only work for telemetry (not C++ perf tests)?
Jul 13 2017
miu@ is the owner of performance_browser_tests and can answer this. I have no knowledge of how the C++ perf tests are currently implemented.
Jul 14 2017
Unfortunately, I don't know anything about when fieldtrial configs are turned on. :) However, it seems that, for whatever reason, the fieldtrial config was not being applied. I'm not even sure what the policy should be: should the Chrome Perf Infra be testing what's live to end users, or what we want to make live in the future? I always assumed it was the former...

Back to the issue at hand: it seems that the latency for Capture on MacBook Air increased due to the GPU scheduler. So, for some reason, the scheduler increased latency as a result of its scheduling decisions, and maybe this is WAI? I don't know anything about the GPU scheduler, so I can't say what is good or bad behavior here.

For tab capture, what matters most is that, over a time window of ~200 ms, the number of frames captured remains the same. Other metrics, such as dropped frames and capture resolution, should not be regressing either. In other words, the throughput of GPU processing and data transfer, while possibly multiple frames are in flight, should not be regressing.
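To make that criterion concrete, here is a minimal illustrative sketch, not the benchmark's actual code; the class and the 200 ms window are assumptions based on the figure above:

  #include <cstddef>
  #include <cstdint>
  #include <deque>

  class CaptureThroughputWindow {
   public:
    explicit CaptureThroughputWindow(int64_t window_us = 200000)
        : window_us_(window_us) {}

    // Records a capture timestamp (microseconds) and returns how many frames
    // landed within the trailing window, i.e. the throughput that should not
    // regress even if per-frame latency shifts.
    std::size_t OnFrameCaptured(int64_t now_us) {
      timestamps_.push_back(now_us);
      while (now_us - timestamps_.front() > window_us_)
        timestamps_.pop_front();
      return timestamps_.size();
    }

   private:
    const int64_t window_us_;
    std::deque<int64_t> timestamps_;
  };

Under this criterion, a per-frame latency increase is acceptable as long as the per-window frame count holds steady while multiple frames are pipelined.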
Jul 15 2017
Canvas hangs in WaitForTokenInRange (for BufferData called from Skia Ganesh). In the gpu process, the canvas command buffer seems to run ~1 frame behind the display compositor context. The WaitForTokenInRange unblocks when the previous frame's canvas commands run. Ideally we should have two frames' worth of space in the ring buffer, but that doesn't seem to be the case for this benchmark. I'll add some logging to verify that we're close to hitting the ring buffer memory limit even without the scheduler.

I also investigated whether we had issues with sync tokens, but that doesn't seem to be the case. The only sync token waits are by the display compositor (indirect waits via the scheduler) when it waits for the canvas frame from 2 frames ago. Those sync token waits are short because the canvas commands run quickly in the gpu process.

Attaching traces with and without the scheduler on a MacBook Pro. The scheduler trace also has flow events for command buffer flushes.

Stack trace for WaitForTokenInRange:

[93925:775:0714/150821.934166:INFO:command_buffer_proxy_impl.cc(347)]
0   libbase.dylib                  0x00000001057e5a2c base::debug::StackTrace::StackTrace(unsigned long) + 28
1   libgpu.dylib                   0x0000000110bea709 gpu::CommandBufferProxyImpl::WaitForTokenInRange(int, int) + 105
2   libgpu.dylib                   0x0000000110ae8af3 gpu::CommandBufferHelper::WaitForToken(int) + 163
3   libgpu.dylib                   0x0000000110aeb4da gpu::RingBuffer::FreeOldestBlock() + 58
4   libgpu.dylib                   0x0000000110aeb588 gpu::RingBuffer::Alloc(unsigned int) + 56
5   libgpu.dylib                   0x0000000110aecfd1 gpu::ScopedTransferBufferPtr::Reset(unsigned int) + 97
6   libgles2_implementation.dylib  0x00000001175e56d8 gpu::gles2::GLES2Implementation::BufferDataHelper(unsigned int, long, void const*, unsigned int) + 344
7   libskia.dylib                  0x000000010c0ca735 GrGLBuffer::onUpdateData(void const*, unsigned long) + 133
8   libskia.dylib                  0x000000010c003ef7 GrBufferAllocPool::flushCpuData(GrBufferAllocPool::BufferBlock const&, unsigned long) + 199
9   libskia.dylib                  0x000000010c0042ae GrBufferAllocPool::createBlock(unsigned long) + 382
10  libskia.dylib                  0x000000010c004099 GrBufferAllocPool::makeSpace(unsigned long, unsigned long, GrBuffer const**, unsigned long*) + 153
11  libskia.dylib                  0x000000010c00491c GrVertexBufferAllocPool::makeSpace(unsigned long, int, GrBuffer const**, int*) + 44
12  libskia.dylib                  0x000000010c04504f (anonymous namespace)::AAConvexPathOp::onPrepareDraws(GrMeshDrawOp::Target*) const + 1535
13  libskia.dylib                  0x000000010c05e82a GrMeshDrawOp::onPrepare(GrOpFlushState*) + 26
14  libskia.dylib                  0x000000010c029b79 GrRenderTargetOpList::prepareOps(GrOpFlushState*) + 361
15  libskia.dylib                  0x000000010c00cb99 GrDrawingManager::internalFlush(GrSurfaceProxy*, GrResourceCache::FlushType) + 841
16  libskia.dylib                  0x000000010c00cec3 GrDrawingManager::prepareSurfaceForExternalIO(GrSurfaceProxy*) + 67
17  libskia.dylib                  0x000000010c028269 GrRenderTargetContext::prepareForExternalIO(int, GrBackendSemaphore*) + 713
18  libblink_platform.dylib        0x00000001153cc7d2 blink::Canvas2DLayerBridge::Flush() + 114
19  libblink_platform.dylib        0x00000001153ccf39 blink::Canvas2DLayerBridge::NewImageSnapshot(blink::AccelerationHint, blink::SnapshotReason) + 73
20  libblink_platform.dylib        0x00000001153cccf2 blink::Canvas2DLayerBridge::PrepareTextureMailbox(viz::TextureMailbox*, std::__1::unique_ptr<cc::SingleReleaseCallback, std::__1::default_delete<cc::SingleReleaseCallback> >*) + 146
21  libcc.dylib                    0x000000011079704b cc::TextureLayer::Update() + 107
22  libcc.dylib                    0x0000000110849c19 cc::LayerTreeHost::DoUpdateLayers(cc::Layer*) + 1145
23  libcc.dylib                    0x00000001108496b6 cc::LayerTreeHost::UpdateLayers() + 54
24  libcc.dylib                    0x000000011088f6d6 cc::ProxyMain::BeginMainFrame(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >) + 1046
25  libcc.dylib                    0x000000011088e6e5 void base::internal::Invoker<base::internal::BindState<void (cc::ProxyMain::*)(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >), base::WeakPtr<cc::ProxyMain>, base::internal::PassedWrapper<std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> > > >, void ()>::RunImpl<void (cc::ProxyMain::*)(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >), std::__1::tuple<base::WeakPtr<cc::ProxyMain>, base::internal::PassedWrapper<std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> > > >, 0ul, 1ul>(void (cc::ProxyMain::*&&)(std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> >), std::__1::tuple<base::WeakPtr<cc::ProxyMain>, base::internal::PassedWrapper<std::__1::unique_ptr<cc::BeginMainFrameAndCommitState, std::__1::default_delete<cc::BeginMainFrameAndCommitState> > > >&&, base::IndexSequence<0ul, 1ul>) + 165
26  libbase.dylib                  0x00000001057e6362 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 242
27  libblink_platform.dylib        0x00000001154a5b84 blink::scheduler::TaskQueueManager::ProcessTaskFromWorkQueue(blink::scheduler::internal::WorkQueue*, bool, blink::scheduler::LazyNow, base::TimeTicks*) + 1364
28  libblink_platform.dylib        0x00000001154a3be2 blink::scheduler::TaskQueueManager::DoWork(bool) + 434
29  libbase.dylib                  0x00000001057e6362 base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) + 242
30  libbase.dylib                  0x000000010581990b base::MessageLoop::RunTask(base::PendingTask*) + 507
31  libbase.dylib                  0x0000000105819cf5 base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) + 213
32  libbase.dylib                  0x000000010581a049 base::MessageLoop::DoWork() + 425
33  libbase.dylib                  0x000000010581d00a base::MessagePumpCFRunLoopBase::RunWork() + 42
34  libbase.dylib                  0x000000010580a42a base::mac::CallWithEHFrame(void () block_pointer) + 10
35  libbase.dylib                  0x000000010581c92f base::MessagePumpCFRunLoopBase::RunWorkSource(void*) + 63
36  CoreFoundation                 0x00007fff997bc321 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
37  CoreFoundation                 0x00007fff9979d21d __CFRunLoopDoSources0 + 557
38  CoreFoundation                 0x00007fff9979c716 __CFRunLoopRun + 934
39  CoreFoundation                 0x00007fff9979c114 CFRunLoopRunSpecific + 420
40  Foundation                     0x00007fff9b1af4e2 -[NSRunLoop(NSRunLoop) runMode:beforeDate:] + 277
41  libbase.dylib                  0x000000010581d76e base::MessagePumpNSRunLoop::DoRun(base::MessagePump::Delegate*) + 126
42  libbase.dylib                  0x000000010581c17c base::MessagePumpCFRunLoopBase::Run(base::MessagePump::Delegate*) + 92
43  libbase.dylib                  0x00000001058489d3 base::RunLoop::Run() + 51
44  libcontent.dylib               0x000000010e0d7d41 content::RendererMain(content::MainFunctionParams const&) + 641
45  libcontent.dylib               0x000000010e2088f0 content::ContentMainRunnerImpl::Run() + 368
46  libembedder.dylib              0x000000010bc8738d service_manager::Main(service_manager::MainParams const&) + 2445
47  libcontent.dylib               0x000000010e207e24 content::ContentMain(content::ContentMainParams const&) + 68
48  libchrome_dll.dylib            0x0000000105b54ac8 ChromeMain + 168
49  Chromium Helper                0x00000001057a69f3 main + 1571
50  libdyld.dylib                  0x00007fffaef16235 start + 1
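To illustrate the mechanism in the trace, here is a heavily simplified sketch (not the real gpu::RingBuffer or CommandBufferHelper) of why Alloc() ends up blocked in a token wait when the ring buffer holds less than two frames' worth of commands:

  #include <cstdint>
  #include <deque>
  #include <functional>

  // Each allocated block is tagged with the token the service side releases
  // after it has consumed that block's commands.
  struct Block {
    uint32_t bytes;
    int32_t token;
  };

  class TokenRingBuffer {
   public:
    TokenRingBuffer(uint32_t size, std::function<void(int32_t)> wait_for_token)
        : free_bytes_(size), wait_for_token_(std::move(wait_for_token)) {}

    // Reserves |bytes| (assumed <= total size), recycling the oldest blocks
    // as needed. Each recycle can block in |wait_for_token_|, the analogue of
    // the FreeOldestBlock() -> WaitForToken() -> WaitForTokenInRange() path
    // in the stack trace, where the client stalls until the previous frame's
    // canvas commands have run in the gpu process.
    void Alloc(uint32_t bytes, int32_t token_after_this_block) {
      while (free_bytes_ < bytes) {
        wait_for_token_(in_use_.front().token);  // Blocking wait.
        free_bytes_ += in_use_.front().bytes;
        in_use_.pop_front();
      }
      free_bytes_ -= bytes;
      in_use_.push_back({bytes, token_after_this_block});
    }

   private:
    uint32_t free_bytes_;
    std::deque<Block> in_use_;
    std::function<void(int32_t)> wait_for_token_;
  };

With two frames' worth of space, the wait would target a token that has usually already passed; with less, every allocation in frame N waits on frame N-1, serializing the client with the gpu process.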
Jul 15 2017
+piman@ and vmiura@, who suggested that we're running out of ring buffer memory.
Jul 15 2017
CL for dumping traces from the benchmark and extra traces: https://chromium-review.googlesource.com/c/572158/
The test to run is TabCapturePerformance.Performance/3.
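Combining this with the invocation from the bisect output above, the single test can presumably be run with a gtest filter (--gtest_filter is the standard flag; the exact test path may need a wildcard):

  ./src/out/Release/performance_browser_tests \
      --test-launcher-print-test-stdio=always --enable-gpu \
      --gtest_filter=TabCapturePerformance.Performance/3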
Sep 14 2017
sunnyps: any update here?
Oct 25 2017
sunnyps: ping on this bug?
Oct 25 2017
The benchmark has recovered. I expected that because of a scheduling change I made (https://chromium-review.googlesource.com/c/chromium/src/+/706129). However, the improvement is in this range: https://chromium.googlesource.com/chromium/src/+log/c9e738432a1bce852e99ba4deb9f9dcac8450fdd%5E..2e3e5e893c8cfd98aca703f32ec0f97fafaf5f5f?pretty=fuller&n=1000 I've started a bisect.
Oct 26 2017
Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/13f687b0f80000
Oct 26 2017
Found a significant difference after 1 commit. https://pinpoint-dot-chromeperf.appspot.com/job/13f687b0f80000

[viz] Ensure that ClientLTFS acknowledges racy BeginFrames.
By eseckler@chromium.org · Sun Sep 24 07:41:53 2017
chromium @ 1b697fc546af51e355c458bc4e10f821a1c8d551

Understanding performance regressions: http://g.co/ChromePerformanceRegressions
Oct 26 2017
Hm, interesting. Funnily enough, that same patch is also slightly regressing another related benchmark; see bug 771292. I haven't had a chance to grab a trace yet, though.
Jan 5 2018
eseckler: Any update on this bug?
Jan 8 2018
The patch in #19 has improved this benchmark (but regressed the one in bug 771292). I don't think there's anything left to do for this bug.