Issue metadata
Sign in to add a comment
|
1.1%-34.6% regression in memory metrics, media.tough_video_cases_tbmv2 at 476906:477865 |
||||||||||||||||||||
Issue descriptionlots of different tests on windows and mac bots
,
Jun 9 2017
Started bisect job https://chromeperf.appspot.com/buildbucket_job_status/8977236495682377952
,
Jun 9 2017
=== Auto-CCing suspected CL author dalecurtis@chromium.org === Hi dalecurtis@chromium.org, the bisect results pointed to your CL, please take a look at the results. === BISECT JOB RESULTS === Perf regression found with culprit Suspected Commit Author : Dale Curtis Commit : 24fca6fb8a70b95cae28843751485529e02378c0 Date : Wed Jun 07 03:50:37 2017 Subject: Enable parallel decode requests for VpxVideoDecoder when offloading. Bisect Details Configuration: mac_air_perf_bisect Benchmark : media.tough_video_cases_tbmv2 Metric : memory:chrome:renderer_processes:reported_by_os:system_memory:native_heap:private_dirty_size_avg/video.html?src_smpte_3840x2160_60fps_vp9.webm_seek Change : 34.47% | 113071746.0 -> 152051811.333 Revision Result N chromium@477007 113071746 +- 112886 6 good chromium@477386 113033119 +- 220947 6 good chromium@477481 113021267 +- 169866 6 good chromium@477529 113117013 +- 599026 6 good chromium@477535 113081419 +- 305149 6 good chromium@477536 112980697 +- 130670 6 good chromium@477537 152106575 +- 80073.4 6 bad <-- chromium@477538 152053877 +- 231039 6 bad chromium@477541 152072094 +- 161152 6 bad chromium@477553 152100509 +- 76413.9 6 bad chromium@477576 152120296 +- 185757 6 bad chromium@477765 152051811 +- 168621 6 bad To Run This Test src/tools/perf/run_benchmark -v --browser=release --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=video.html.src.smpte.3840x2160.60fps.vp9.webm.seek media.tough_video_cases_tbmv2 Debug Info https://chromeperf.appspot.com/buildbucket_job_status/8977236495682377952 Is this bisect wrong? https://chromeperf.appspot.com/bad_bisect?try_job_id=4548125500375040 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Speed>Bisection. Thank you!
,
Jun 10 2017
Issue 731871 has been merged into this issue.
,
Jun 10 2017
Issue 731875 has been merged into this issue.
,
Jun 10 2017
Issue 731863 has been merged into this issue.
,
Jun 10 2017
Issue 731876 has been merged into this issue.
,
Jun 10 2017
Issue 731874 has been merged into this issue.
,
Jun 12 2017
Ah, I guess the difference with GpuVideoDecoder is we have a CanReadWithoutStalling() which limits the number of max parallel decodes. With the current approach we may have up to 8 frames in flight vs the 4 before. Will see if we can limit to 5 total like GVD ends up doing.
,
Jun 12 2017
Issue 732469 has been merged into this issue.
,
Jun 13 2017
Hmm, that ends up being very hard to do since it's not about limiting parallel decodes but the number of outstanding frames... Also, per the Telemetry tests there doesn't seem to have been any real associated improvements with this CL, which is surprising. I'm going to let this soak for a few more days on canary then revert and see if the Media.RebuffersCount metric changes - there was no change in the MTBR metric AFAICT, so it doesn't seem to help there after all. If that doesn't improve then the memory increase has nothing going for it, so we'll scrap this experiment and focus on just general queue size improvements.
,
Jun 13 2017
Marking RBB so I don't forget about it.
,
Jun 13 2017
This issue is marked as a release blocker with no OS labels associated. Please add an appropriate OS label. All release blocking issues should have OS labels associated to it, so that the issue can tracked and promptly verified, once it gets fixed. Thanks for your time! To disable nags, add the Disable-Nags label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jun 13 2017
,
Jun 13 2017
Issue 732486 has been merged into this issue.
,
Jun 13 2017
As a note on severity: We're still well below the memory usage for equivalent resolution h264 software decoded material. Regardless, I'd like to see an improvement if this is actually helping though.
,
Jun 14 2017
Issue 731855 has been merged into this issue.
,
Jun 16 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/bdfd91d9a149f6189f03368a661f099976f144b1 commit bdfd91d9a149f6189f03368a661f099976f144b1 Author: Dale Curtis <dalecurtis@chromium.org> Date: Fri Jun 16 18:21:16 2017 Revert "Enable parallel decode requests for VpxVideoDecoder when offloading." This reverts commit 24fca6fb8a70b95cae28843751485529e02378c0. Reason for revert: All internal metrics are neutral to negative; YouTube reports significant _decrease_ in MTBR around this date; so lets revert and see if that improves the issue. Original change's description: > Enable parallel decode requests for VpxVideoDecoder when offloading. > > Similar to the GPUVideoDecoder, where decodes are being queued off > of the calling thread, we can improve throughput within parallel > requests. > > When VpxVideoDecoder uses an offload thread (>1024 width) we can > queue multiple decodes to the offload thread. This should improve > resiliency to glitches. > > To avoid incurring penalties to seek or shutdown, as part of this > change a base::AtomicFlag is used to abort pending decodes when > Reset() or ~VpxVideoDecoder() are used. > > Results show this improves the average read time (time from > VideoRendererImpl::AttemptRead() until VRI::FrameReady()) for a > 4K60 video from ~2.183ms to ~1.79ms an ~22% improvement. Total > time spent waiting for reads goes down from 3931.89ms to 3224.54ms; > similarly showing the 22% improvement. > > BUG=648710, 731841 > TEST=performance tests > > Change-Id: I88cbc778a8ccdd35d7be4c24d0bc12be15cfe661 > Reviewed-on: https://chromium-review.googlesource.com/518543 > Reviewed-by: Dan Sanders <sandersd@chromium.org> > Commit-Queue: Dale Curtis <dalecurtis@chromium.org> > Cr-Commit-Position: refs/heads/master@{#477537} TBR=dalecurtis@chromium.org,sandersd@chromium.org # Not skipping CQ checks because original CL landed > 1 day ago. Bug: 648710 Change-Id: Id92127f76edbecc6447234e39d77f3a74e06b660 Reviewed-on: https://chromium-review.googlesource.com/538134 Commit-Queue: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: Dale Curtis <dalecurtis@chromium.org> Cr-Commit-Position: refs/heads/master@{#480104} [modify] https://crrev.com/bdfd91d9a149f6189f03368a661f099976f144b1/media/filters/vpx_video_decoder.cc [modify] https://crrev.com/bdfd91d9a149f6189f03368a661f099976f144b1/media/filters/vpx_video_decoder.h
,
Jun 16 2017
These have returned to normal levels post-revert.
,
Mar 7 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/16adbb2a440000
,
Mar 7 2018
📍 Pinpoint job started. https://pinpoint-dot-chromeperf.appspot.com/job/13d424b2440000
,
Mar 7 2018
Whoops tagged the wrong bug for those jobs.
,
Mar 8 2018
📍 Couldn't reproduce a difference. https://pinpoint-dot-chromeperf.appspot.com/job/16adbb2a440000
,
Mar 9 2018
😿 Pinpoint job stopped with an error. https://pinpoint-dot-chromeperf.appspot.com/job/13d424b2440000
,
Mar 27 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/00eda06906b09bb0b001207e29ed8ebb8c0e03e1 commit 00eda06906b09bb0b001207e29ed8ebb8c0e03e1 Author: Dale Curtis <dalecurtis@chromium.org> Date: Tue Mar 27 03:14:31 2018 Attempt to parallelize offloaded video decoding again... http://crrev.com/477537 was the last attempt at this, it was reverted due to perf regressions and an MTBR reduction on YT. We'll watch this CL closely to ensure it doesn't cause any similar regressions to the first attempt. Here are the differences from the last attempt: - Only offloading 1 extra frame instead of 3. - No waiting for Reset() to complete (unnecessary). - GpuMemoryBuffer upload stage is in parallel now. This behavior only works with decoders that have synchronous Reset() and Decode() calls, which is all of our software decoders, but most importantly VpxVideoDecoder, which is the only offloaded video decoder at the moment. With this change the total time to read 4000 frames improves by 3x, from 11659350us to 3427885us and drops zero frames on average when running at 4K60 with https://www.youtube.com/watch?v=1La4QzGeaaQ BUG= 731841 , 801245 TEST=new tests, manual verification. Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: I20852897044c504088a83434d6bde60104822864 Reviewed-on: https://chromium-review.googlesource.com/945029 Commit-Queue: Dale Curtis <dalecurtis@chromium.org> Reviewed-by: Xiaohan Wang <xhwang@chromium.org> Cr-Commit-Position: refs/heads/master@{#545943} [modify] https://crrev.com/00eda06906b09bb0b001207e29ed8ebb8c0e03e1/media/filters/offloading_video_decoder.cc [modify] https://crrev.com/00eda06906b09bb0b001207e29ed8ebb8c0e03e1/media/filters/offloading_video_decoder.h [modify] https://crrev.com/00eda06906b09bb0b001207e29ed8ebb8c0e03e1/media/filters/offloading_video_decoder_unittest.cc |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by chcunningham@chromium.org
, Jun 9 2017