MemoryInfra: Unregistering VpxVideoDecoder::MemoryPool dump provider is not thread safe |
|||||||||||
Issue descriptionBackground context:go/memory-infra: memory profiling in chrome://tracing Failed build: https://build.chromium.org/p/tryserver.chromium.android/builders/linux_android_rel_ng/builds/121969 Error Message: F/chromium( 9647): [FATAL:memory_dump_manager.cc(323)] Check failed: (*mdp_iter)->task_runner && (*mdp_iter)->task_runner->RunsTasksOnCurrentThread(). MemoryDumpProvider "VpxVideoDecoder" attempted to unregister itself in a racy way. Please file a crbug. The VpxVideoDecoder::MemoryPool is constructed in one thread and destructed in a different thread. Is that possible?
,
Aug 15 2016
Could this be related to crbug.com/627295 ?
,
Aug 15 2016
,
Aug 22 2016
,
Aug 22 2016
+cc dalecurtis > The VpxVideoDecoder::MemoryPool is constructed in one thread and destructed in a different thread. Is that possible? It looks like that could happen if VpxVideoDecoder::Reset were called from a different thread than it was initially. Possibly also if VpxVideoDecoder were destroyed by VpxOffloadThread. Looking at the code doesn't lead me to believe either scenario is especially likely, and I believe the latter to be impossible. Any idea what might be going on here, Dale? I don't think there's anything going on within libvpx here, but I've been wrong before. Any update on whether this is related to crbug.com/627295 as mentioned in #2?
,
Aug 22 2016
,
Aug 22 2016
Isn't the memory pool refcounted? It'll be destroyed by whatever holds a reference last. If that's a frame it'll happen on the media thread. VpxVideoDecoder is always called from media thread.
,
Aug 22 2016
It is: it can be release()'d via nullptr assignment in VpxVideoDecoder::CloseDecoder() and VpxVideoDecoder::Reset(). The only way I can see that happening is if the media thread were to die or be killed, and a new one somehow gained access the previous media thread's VpxVideoDecoder. Sounds pretty impossible/insane. Just looped you in because VpxVideoDecoder seems mostly in maintenance mode, but someone working on the media stack should at least be made aware of this issue since it doesn't appear to be anything within libvpx itself. For now I choose to believe this is something to do with crbug.com/627295 ...
,
Aug 22 2016
Hmm, not directly since that's for GpuVideoDecoder not VpxVideoDecoder, but perhaps a similar mechanism within GpuMemoryBuffers is broken; i.e. it's not handling the GPU process death on the right thread.
,
Dec 17 2016
is this issue fixed? If not, I think we should disable the VpxVideoDecoder::MemoryPool dump provider, since it can cause crashes.
,
Dec 18 2016
crrev.com/2580223002 should fix this.
,
Dec 19 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/12c9d4df53fb2d12149efae6a4c25d072e839dc1 commit 12c9d4df53fb2d12149efae6a4c25d072e839dc1 Author: dcastagna <dcastagna@chromium.org> Date: Mon Dec 19 22:27:46 2016 media: Move VPX mem dump provider reg/unreg to same thread. VPX MemoryPool is refcounted and it'll be destroyed by by whatever thread holds a reference last. Currently registering the memorypool dump provider happens in the memorypool ctor while unregistering it happens in the dtor. This can cause problems since unregistering it should happen on the same thread where it was registered first. This CL moves the register/unregister of memory pool dump provider to VpxVideoDecoder that is always called from media thead. BUG= 637409 Review-Url: https://codereview.chromium.org/2580223002 Cr-Commit-Position: refs/heads/master@{#439594} [modify] https://crrev.com/12c9d4df53fb2d12149efae6a4c25d072e839dc1/media/filters/vpx_video_decoder.cc
,
Dec 19 2016
,
Dec 19 2016
,
Dec 19 2016
Given that this leads to the run-time crash described in issue 643438, and that the change is fairly minimal, I think we should get this merged. ssid/dcastagna: Is there a simple unit-test we can add to VpxVideoDecoder to avoid regressions?
,
Dec 19 2016
We have a browser test to verify that the dump providers are maintaining thread affinity. But the test was flaky in this case because the was a race. It's hard to write a test from memory-infra side to check if dump providers are doing the right thing since we aren't sure we hit all these cases in browser test.
,
Dec 19 2016
To test this specific issue we'd need to spawn another thread and register/unregister VpxVideoDecoder pool MemoryDumpProvider, so it'd be far from minimal. Considering logs already reported the issue, I'm not sure it's worth adding a test just for this.
,
Dec 20 2016
Agree with #18. The rationale of that threading check on registrations/unregistrations was precisely this: "if clients get the threading wrong (like in this case) they will end up creating data races. races are an utter pain to debug/reproduce and unrealistic to cover with tests, so let's make the full browser crash whenever we detect a situation that can possibly lead to a race". This worked greatly, in fact the reason why we are all here and have this bug, which is now fixed thanks to Daniele, is because we had that dcheck in place. The only possible action I can see here is make that dcheck even stronger: i.e. make it a CHECK, so we get crash reports and don't even have to wait for people to file bugs. But that is a delicate choice: we can do that only once we are confident that we don't have any other regression in place, or it would lead people to stop using tracing.
,
Dec 20 2016
crrev.com/2592543002 makes it little more stricter so that we can catch regression on bots better.
,
Dec 20 2016
Where should we merge this fix? Is 56 enough?
,
Dec 20 2016
I think due to the new policies we can't (and shouldn't) really go beyond 56. 55 is stable and merging there requires a postmortem and a solid explanation (e.g. being a top crasher) which is not really the case here.
,
Dec 20 2016
Your change meets the bar and is auto-approved for M56 (branch: 2924)
,
Dec 21 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b5c89bb0e2d8c54499727c849466a679cf59fec4 commit b5c89bb0e2d8c54499727c849466a679cf59fec4 Author: Daniele Castagna <dcastagna@chromium.org> Date: Wed Dec 21 16:58:45 2016 media: Move VPX mem dump provider reg/unreg to same thread. Merge to 56. VPX MemoryPool is refcounted and it'll be destroyed by by whatever thread holds a reference last. Currently registering the memorypool dump provider happens in the memorypool ctor while unregistering it happens in the dtor. This can cause problems since unregistering it should happen on the same thread where it was registered first. This CL moves the register/unregister of memory pool dump provider to VpxVideoDecoder that is always called from media thead. BUG= 637409 Review-Url: https://codereview.chromium.org/2580223002 Cr-Commit-Position: refs/heads/master@{#439594} (cherry picked from commit 12c9d4df53fb2d12149efae6a4c25d072e839dc1) Review-Url: https://codereview.chromium.org/2599463002 . Cr-Commit-Position: refs/branch-heads/2924@{#578} Cr-Branched-From: 3a87aecc31cd1ffe751dd72c04e5a96a1fc8108a-refs/heads/master@{#433059} [modify] https://crrev.com/b5c89bb0e2d8c54499727c849466a679cf59fec4/media/filters/vpx_video_decoder.cc |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by mcasas@chromium.org
, Aug 13 2016Components: Internals>Media>Video
Labels: OS-All
Owner: dcasta...@chromium.org