New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 801120 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Closed: Jan 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

Software compositing tab capture does not pool and reuse copy request backings

Project Member Reported by hjd@chromium.org, Jan 11 2018

Issue description

See the link to graphs below.
 
Project Member

Comment 1 by 42576172...@developer.gserviceaccount.com, Jan 11 2018

All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=801120

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=fe6ef1a23b183bf2b3c1e1319dac6c736123f9ef0306adfbeb11229fdfd0d5a9


Bot(s) for this bug's original alert(s):

chromium-rel-mac11-pro
Project Member

Comment 2 by 42576172...@developer.gserviceaccount.com, Jan 11 2018

๐Ÿ“ Pinpoint job started.
https://pinpoint-dot-chromeperf.appspot.com/job/14dfd3a7040000
Project Member

Comment 3 by 42576172...@developer.gserviceaccount.com, Jan 11 2018

Cc: danakj@chromium.org piman@chromium.org
Owner: danakj@chromium.org
Status: Assigned (was: Untriaged)
๐Ÿ“ Found a significant difference after 1 commit.
https://pinpoint-dot-chromeperf.appspot.com/job/14dfd3a7040000

Reduce fragmentation of gpu memory by rounding RenderPass textures.
By danakj@chromium.org ยท Wed Jan 03 18:25:14 2018
chromium @ 07fe0b6e0939a8aa223d8fb526a8040b39441858

Understanding performance regressions:
  http://g.co/ChromePerformanceRegressions

Comment 4 by m...@chromium.org, Jan 18 2018

Note: I just noticed the Y-axis label on the graphs is wrong. It should be milliseconds. Not sure why the perf infrastructure is ignoring the test stdout units...

Anyway, the failed metrics are measuring the average time between capture events. Meaning, we attempt 60 FPS, so an ideal would be ~16.7 ms. However, we are seeing they went from 36 ms to 38 ms (about 26 FPS). So, the execution of CopyOutputRequests has slowed down with this CL.

Also, notice that "_gpu" is not present in the metric name. That means these are testing the viz::SoftwareRenderer, not the viz::GLRenderer. I suspect the rounding-up of textures, which happens in viz::DirectRenderer, has increased the size of the surfaces in viz::SoftwareRenderer, and so the CPU is now doing more work. Or, perhaps, with the increased stride of the larger bitmaps in memory, CPU memory caching has suffered?

Note that the old tab capture impl is still being used at this time. This means SoftwareRenderer:CopyFromDrawnRenderPass() is being called, but it is NOT being asked to do any scaling or I420 format conversion on the copy requests. Scaling and format conversion happens downstream, in DelegatedFrameHost::CopyFromCompositingSurfaceHasResultForVideo(): https://cs.chromium.org/chromium/src/content/browser/renderer_host/delegated_frame_host.cc?rcl=e886b675233053c768ef61a25d1f9e398bde0cc9&l=698

Comment 5 by danakj@chromium.org, Jan 18 2018

Cc: m...@chromium.org
Owner: ----
Status: Available (was: Assigned)
Summary: Software compositing tap capture does not pool and reuse copy request backings (was: 6.1% regression in performance_browser_tests at 526685:526772)
I think it's because in the software renderer case we made no optimizations to prevent malloc/free of the renderpass backings every capture. If we draw any frames in between the capture events, the renderpass backings would be evicted from the SoftwareRenderer. In the GL case we have the tab capture client allocate the backing and pass it in to optimize for this. So now it's just allocating more memory.

The way to fix this though is to stop allocating each frame, not just to shrink the renderpass size.

This is where we allocate/reuse resources for the gpu path: https://cs.chromium.org/chromium/src/content/browser/renderer_host/delegated_frame_host.cc?rcl=b165dc5211edda5957d621643749d22d71152afe&l=414

That if will fail for the software path, and it has no fallback.

miu@ all the work I saw for new-fangled tab capture was in GLRenderer. Does the software story improve there?
Perf sheriff checking in: danakj, miu, any ideas of who could own this bug?

Comment 7 by danakj@chromium.org, Jan 25 2018

My claim here is that software based tab capture is already slow and unoptimized, so changing it here doesn't really matter, but it points at that fact so now we have a bug. Since it was not a priority to make it fast in the past, it doesn't seem clear that it's a priority to do so now. miu@ do you have opinions?

Comment 8 by m...@chromium.org, Jan 25 2018

Owner: danakj@chromium.org
Status: Assigned (was: Available)
Summary: Software compositing tab capture does not pool and reuse copy request backings (was: Software compositing tap capture does not pool and reuse copy request backings)
danakj: The "round-up-to-64" change wasn't meant to affect the main memory being allocated for the SoftwareRenderer, correct? I agree with you, in comments 5 and 7, that we don't care about capture performance in the SoftwareRenderer.

However, the problem we really need to attend to here is totally orthogonal to tab capture: The "round-up-to-64" change is increasing main memory usage whenever software compositing is being used. Even if we don't care about software compositing performance, the extra memory usage is taking up a precious resource from the rest of the system.

Thus, IMHO, this change should be revisited; and probably only affect texture allocations in GLRenderer.

--------------

Reply to comments 5 and 7: For GLRenderer, the "backings" are being cached in GLRendererCopier to make GPU-accelerated tab capture efficient. Note that all the tab-capture-related code in DelegatedFrameHost will be deleted soon, as just yesterday it all became dead code paths. :)

For software, we didn't bother to optimize. There's really no way to make it efficient enough to run full framerate video capture, even with the caching; so we really aren't bothering there.

Comment 9 by m...@chromium.org, Jan 25 2018

So, this crbug started as a tab capture performance regression; and we have all agreed not to bother resolving that.

Since the important issue is main memory usage when using the software renderer, perhaps a new crbug should be cut?
> danakj: The "round-up-to-64" change wasn't meant to affect the main memory being allocated for the SoftwareRenderer, correct?

No, it did. Instead of allocating a 100x100 bitmap, we'd allocate a 128x128 one. See the linked bug for the motivation, but basically meant to help with memory fragmentation as we're reusing surfaces more and they are similar sizes.

> The "round-up-to-64" change is increasing main memory usage whenever software compositing is being used. Even if we don't care about software compositing performance, the extra memory usage is taking up a precious resource from the rest of the system.

Sure.. the memory change is pretty small, render surfaces are very few, and the goal was to reduce fragmentation which reduces memory pressure. See the linked bug.

> Thus, IMHO, this change should be revisited; and probably only affect texture allocations in GLRenderer.

The premise of that bug doesn't only apply to GL textures.
 Issue 801119  has been merged into this issue.

Comment 12 by m...@chromium.org, Jan 26 2018

Status: WontFix (was: Assigned)
Okay. It sounds like this is all a WontFix then. :)
sg thanks!
Cc: mustaq@chromium.org erikc...@chromium.org
 Issue 800105  has been merged into this issue.

Sign in to add a comment