New issue
Advanced search Search tips

Issue 755381 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux , Windows
Pri: 1
Type: Bug-Regression



Sign in to add a comment

Chrome: Crash Report - PostTaskAndReplyRelay::RunTaskAndPostReply from media::WebMediaPlayerImpl::ReportMemoryUsage

Project Member Reported by cr...@system.gserviceaccount.com, Aug 14 2017

Issue description

reporter:gab@google.com

crash_analysis_section:start
crash_analysis_section:end

Magic Signature: base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply

Crash link: https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D'base%3A%3A%60anonymous%20namespace%5C'%3A%3APostTaskAndReplyRelay%3A%3ARunTaskAndPostReply'%20AND%20product.Version%20CONTAINS%20'62.0.'%20AND%20ReportID%3D'74e3b30f1054dd98'%20AND%20stable_signature%3D'base%3A%3Ainternal%3A%3AFunctorTraits%3Cvoid%20(*)(base%3A%3ACallback%3Cvoid%20(const%20disk_cache%3A%3ASimpleBackendImpl%3A%3ADiskStatResult%20%26)%2Cbase%3A%3Ainternal%3A%3ACopyMode%3A%3AMoveOnly%2Cbase%3A%3Ainternal%3A%3ARepeatMode%3A%3AOnce%3E%2C%20disk_cache%3A%3ASimpleBackendImpl%3A%3ADiskStatResult%20*)%2Cvoid%3E%3A%3AInvoke%3Cbase%3A%3ACallback%3Cvoid%20(const%20disk_cache%3A%3ASimpleBackendImpl%3A%3ADiskStatResult%20%26)%2Cbase%3A%3Ainternal%3A%3ACopyMode%3A%3AMoveOnly%2Cbase%3A%3Ainternal%3A%3ARepeatMode%3A%3AOnce%3E%2Cdisk_cache%3A%3ASimpleBackendImpl%3A%3ADiskStatResult%20*%3E-1d594648'&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D#3

-------------------------------------------------------------------------------
Sample Report
-------------------------------------------------------------------------------
Product name: Chrome
Magic Signature : base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply
Product Version: 62.0.3185.0
Report ID: 74e3b30f1054dd98
Report Url: https://crash.corp.google.com/74e3b30f1054dd98
Report Time: 2017-08-14T11:43:37-07:00
Upload Time: 2017-08-14T11:43:40.709-07:00
Uptime: 3898000 ms
CumulativeProductUptime: 0 ms
OS Name: Windows NT
OS Version: 6.1.7601 23392
CPU Architecture: x86
CPU Info: GenuineIntel family 6 model 23 stepping 10

-------------------------------------------------------------------------------
Crashing thread: Thread index: 15. Stack Quality: 78%. Thread id: 3268.
-------------------------------------------------------------------------------
0x64206572 
0x5b892658 (chrome_child.dll - bind_internal.h: 151)	base::internal::FunctorTraits<void (*)(base::Callback<void (const disk_cache::SimpleBackendImpl::DiskStatResult &),base::internal::CopyMode::MoveOnly,base::internal::RepeatMode::Once>, disk_cache::SimpleBackendImpl::DiskStatResult *),void>::Invoke<base::Callback<void (const disk_cache::SimpleBackendImpl::DiskStatResult &),base::internal::CopyMode::MoveOnly,base::internal::RepeatMode::Once>,disk_cache::SimpleBackendImpl::DiskStatResult *>
0x5b892625 (chrome_child.dll - bind_internal.h: 307)	base::internal::Invoker<base::internal::BindState<void (*)(base::Callback<disk_cache::SimpleBackendImpl::DiskStatResult (),base::internal::CopyMode::MoveOnly,base::internal::RepeatMode::Once>, disk_cache::SimpleBackendImpl::DiskStatResult *),base::Callback<disk_cache::SimpleBackendImpl::DiskStatResult (),base::internal::CopyMode::MoveOnly,base::internal::RepeatMode::Once>,disk_cache::SimpleBackendImpl::DiskStatResult *>,void ()>::RunOnce
0x5b73fff7 (chrome_child.dll - post_task_and_reply_impl.cc: 44)	base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply
0x5a896685 (chrome_child.dll - task_annotator.cc: 57)	base::debug::TaskAnnotator::RunTask(char const *,base::PendingTask *)
0x5a895f2a (chrome_child.dll - message_loop.cc: 410)	base::MessageLoop::RunTask(base::PendingTask *)
0x5b71f0b3 (chrome_child.dll - message_loop.cc: 421)	base::MessageLoop::DeferOrRunPendingTask(base::PendingTask)
0x5a88fbb4 (chrome_child.dll - message_loop.cc: 528)	base::MessageLoop::DoWork()
0x5a88fa56 (chrome_child.dll - message_pump_default.cc: 33)	base::MessagePumpDefault::Run(base::MessagePump::Delegate *)
0x5a88f9c0 (chrome_child.dll - message_loop.cc: 350)	base::MessageLoop::Run()
0x5a88f634 (chrome_child.dll - run_loop.cc: 123)	base::RunLoop::Run()
0x5a88f5fa (chrome_child.dll - thread.cc: 255)	base::Thread::Run(base::RunLoop *)
0x5a88eb34 (chrome_child.dll - thread.cc: 338)	base::Thread::ThreadMain()
0x5b711ff8 (chrome_child.dll - platform_thread_win.cc: 89)	base::`anonymous namespace'::ThreadFunc
0x7606ef1b (KERNEL32.dll + 0x0004ef1b)	BaseThreadInitThunk
0x77873679 (ntdll.dll + 0x00063679)	__RtlUserThreadStart
0x7787364c (ntdll.dll + 0x0006364c)	_RtlUserThreadStart

 

Comment 1 by gab@chromium.org, Aug 14 2017

Cc: siggi@chromium.org tzik@chromium.org
Components: Blink>Storage>CacheStorage
Labels: -Restrict-View-EditIssue
Owner: morlovich@chromium.org
Status: Assigned (was: Untriaged)
Summary: Chrome: Crash Report - SimpleCacheBackendImpl - base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply (was: Chrome: Crash Report - base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply)
The crash blames PostTaskAndReplyRelay::RunTaskAndPostReply() but really the problem comes when invoking Run() on what appears to be a use-after-free.

The callback template tells us this is replying with a disk_cache::SimpleBackendImpl::DiskStatResult which from code search can only be the reply to SimpleBackendImpl::InitializeIndex() from SimpleBackendImpl::InitCacheStructureOnDisk().

https://cs.chromium.org/chromium/src/net/disk_cache/simple/simple_backend_impl.cc?type=cs&q=%26SimpleBackendImpl::InitializeIndex&sq=package:chromium

Should this reply use a WeakPtr?

@tzik: any idea how we could better highlight the cause in crash (i.e. in this case we're lucky because the reply traits make it obvious but on a PostTaskAndReply with, e.g., a bool we would have no idea who's at fault...).
Status: Started (was: Assigned)
Thanks for the forward and the decoding. 
Hmm, the reply already uses a WeakPtr, though? 
This being recent likely means it's related to the forward direction changing from the cache thread (except this is renderer, not browser anyway..)


Also are you sure it's not in the post direction --- wouldn't the reply have RunReplyAndSelfDestruct in the backtrace? 
Status: Untriaged (was: Started)
Hmm actually kinda wondering if DiskStatResult didn't just end up winning the merged instantiation lottery for this particular binary (62.0.3185.0)

Focusing on Windows (since Android ones don't have any info past RunTaskAndPostReply):
3188 all seem to be DxDiagNode at top frame (and IOResult below)
3187 mentions DiskStatResult on top, BlobItemBytesResponse one frame below.
3186 is DxDiagNode/VideoCaptureSettings
3185 is as you described
3184 (one report) has DxDiagNode/MhtmlSaveStatus
older ones only have android stuff.

To me this suggests that the attribution via this method is likely unreliable, so if it's correct it is by accident...

Comment 6 by ajha@chromium.org, Aug 21 2017

Labels: -Type-Bug -Pri-2 ReleaseBlock-Stable M-62 Pri-1 Type-Bug-Regression
Just to update, crashes with this magic signature seems to have appeared on Windows in M-62 since chrome version:62.0.3184.0.

Link to the list of the builds on Windows:
===========================================
https://crash.corp.google.com/browse?q=custom_data.ChromeCrashProto.magic_signature_1.name%3D%27base%3A%3A%60anonymous%20namespace%5C%27%3A%3APostTaskAndReplyRelay%3A%3ARunTaskAndPostReply%27%20AND%20product.name%3D%27Chrome_Android%27&ignore_case=false&enable_rewrite=true&omit_field_name=&omit_field_value=&omit_field_opt=%3D&unnest=#-property-selector,samplereports:5,productversion:1000.

Latest Dev(62.0.3188.4) of Windows has reported 95 crashes from 90 clients and is ranked as #12 amongst top renderer crashes. 

Note: There have been no crashes reported on the last 2 Windows canary(62.0.3190.0 and 62.0.3191.0). Marking this as Stable blocker for M-62 and for tracking.

Comment 7 by tzik@chromium.org, Aug 21 2017

Components: Blink>Media
Looked into some of them.
Some of VideoCaptureSettings cases look caused by a miscompilation or a broken binary.

On DxDiagNode and DiskStatResult cases, all I saw was posted by media::WebMediaPlayerImpl::ReportMemoryUsage according to the TaskAnnotator stack. So, these are likely a part of http://crbug.com/726532. A base::Unretained to |demuxer_| likely causes an UAF.
Cc: morlovich@chromium.org
Owner: ----
Summary: Chrome: Crash Report - PostTaskAndReplyRelay::RunTaskAndPostReply from media::WebMediaPlayerImpl::ReportMemoryUsage (was: Chrome: Crash Report - SimpleCacheBackendImpl - base::`anonymous namespace'::PostTaskAndReplyRelay::RunTaskAndPostReply)
Retitling this to making it more obvious to people scanning who may want to look at it, per the analysis in comment #7.

Components: -Blink>Media Internals>Media
Components: -Blink>Storage>CacheStorage
|demuxer_| isn't cleared until ~WMPI at which point WeakPtrs will have been invalidated. It's guaranteed alive at this point. There are only a couple call sites that modify that value, so I don't think it's possible for the scenario in c#7 to occur.

The stack traces in that bug also point to the hasAudio() line, which just reads a member variable, so I don't think it's related to a null demuxer_.

Comment 13 by tzik@chromium.org, Aug 25 2017

#c12: Do we have any chance to run the posted task runs after WMPI dtor?
It looks to me that if:
 - WebMediaPlayerImpl::OnError
 - WebMediaPlayerImpl::ReportMemoryUsage
 - WebMediaPlayerImpl::~WebMediaPlayerImpl
 - Demuxer::GetMemoryUsage
are called in this order, it hits a UAF.
The returns are all bound AsWeakPtr() and ~WMPI call Pipeline::Stop() which blocks on the media thread until shutdown:

https://cs.chromium.org/chromium/src/media/base/pipeline_impl.cc?type=cs&sq=package:chromium&l=1033

By that point the post to Demuxer from ReportMemoryUsage must have completed.

Comment 15 by tzik@chromium.org, Aug 26 2017

If WebMediaPlayerImpl::OnError is called before ~WMPI, PipelineController::Stop is called from the OnError [1], and PipelineController gets to State::STOPPED state. So, another PipelineController::Stop in ~WMPI is a noop [2] and doesn't flush the media task runner.

[1]: https://cs.chromium.org/chromium/src/media/blink/webmediaplayer_impl.cc?l=1349
[2]: https://cs.chromium.org/chromium/src/media/filters/pipeline_controller.cc?l=250

Comment 16 by tzik@chromium.org, Aug 26 2017

 Issue 758935  has been merged into this issue.
Project Member

Comment 17 by ClusterFuzz, Aug 26 2017

Labels: OS-Linux
Owner: sande...@chromium.org
Status: Assigned (was: Untriaged)
Ah, thanks for your patience tzik@, yes that does seem like that is the issue then. We're not syncing the media thread if we've already stopped in error. 

=>sandersd want to pick this one up?
Cc: sande...@chromium.org

Comment 20 by w...@chromium.org, Aug 30 2017

Issue 758139 has been merged into this issue.
Owner: dalecur...@chromium.org
Dan is out sick so I'll try to merge a fix for this later today.
Cc: tguilbert@chromium.org
+tguilbert whose https://chromium-review.googlesource.com/c/chromium/src/+/612204 added the code to OnError where we stop the pipeline there now. Previously it always ran in the destructor and thus we didn't have this issue.

https://chromium-review.googlesource.com/c/chromium/src/+/644147 should fix it, but if not we may need to revert the change to Stop() during error.
Project Member

Comment 23 by bugdroid1@chromium.org, Aug 31 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8a62813291c54988b34a4c49e8ed3ce535f95f97

commit 8a62813291c54988b34a4c49e8ed3ce535f95f97
Author: Dale Curtis <dalecurtis@chromium.org>
Date: Thu Aug 31 00:35:53 2017

Don't collect demuxer memory usage after pipeline stop.

We can't rely on Pipeline::Stop() to cycle the media thread in this
case, so the demuxer may be destroyed before use.

BUG= 755381 
TEST=new unittest

Change-Id: I5c78eaccb8f2f59dd724a891f853c3269f4a07d6
Reviewed-on: https://chromium-review.googlesource.com/644147
Reviewed-by: Thomas Guilbert <tguilbert@chromium.org>
Commit-Queue: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#498696}
[modify] https://crrev.com/8a62813291c54988b34a4c49e8ed3ce535f95f97/media/blink/webmediaplayer_impl.cc
[modify] https://crrev.com/8a62813291c54988b34a4c49e8ed3ce535f95f97/media/blink/webmediaplayer_impl_unittest.cc

Status: Fixed (was: Assigned)
Labels: Merge-TBD
[Auto-generated comment by a script] We noticed that this issue is targeted for M-62; it appears the fix may have landed after branch point, meaning a merge might be required. Please confirm if a merge is required here - if so add Merge-Request-62 label, otherwise remove Merge-TBD label. Thanks.
Labels: -Merge-TBD

Sign in to add a comment