New issue
Advanced search Search tips

Issue 781222 link

Starred by 3 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: 2017-11-15
OS: Linux
Pri: 2
Type: Bug



Sign in to add a comment

v4l2 video decode accelerator sometimes deadlocks after switching video resolution

Reported by kiagiada...@gmail.com, Nov 3 2017

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64; rv:56.0) Gecko/20100101 Firefox/56.0

Steps to reproduce the problem:
On a platform that operates a hardware decoder through the v4l2 video decode accelerator, try the following:

1. Open a video in youtube, with quality set to Auto
2. Wait...
3. At some point, although the video has buffered a lot, you will notice that it pauses and the youtube player's spinning circles appear. At this point, it is not even possible to seek. The only way to unblock it is to reload the page.

What is the expected behavior?
Video should play uninterrupted

What went wrong?
There is a deadlock in V4L2VideoDecodeAccelerator [media/gpu/v4l2_video_decode_accelerator.cc] that causes this.

The root cause of the deadlock is this:

ServiceDeviceTask() and DevicePollTask() are meant to run in sequence and activate one another. But, they run from different threads. Normally, StartDevicePoll() schedules a call to DevicePollTask(), which then in turn schedules ServiceDeviceTask(), which schedules DevicePollTask() again and they continue like this.

When switching resolutions, there is an interesting chain of events:

1) ServiceDeviceTask() schedules DevicePollTask()
2) DevicePollTask() now runs in the other thread in parallel to ServiceDeviceTask()
3) DevicePollTask() schedules ServiceDeviceTask()
4) the previous ServiceDeviceTask() at the same time calls StartResolutionChange() before returning
5) StartResolutionChange() causes the thread that runs DevicePollTask() to get restarted by calling StopDevicePoll() & StartDevicePoll()
6) StartDevicePoll() schedules DevicePollTask()
7) ServiceDeviceTask() returns and runs again because it was scheduled earlier in (3)
8) ServiceDeviceTask() schedules DevicePollTask()
-> At this point we've ended up with 2 scheduled calls to DevicePollTask(), which are going to schedule 2 calls to ServiceDeviceTask() and so forth, which leads to problems.

The deadlock situation happens when at some point ServiceDeviceTask() schedules DevicePollTask() with poll_device=false because there are no input or output buffers available, but then the next ServiceDeviceTask() (which runs immediately because there are always 2 of them scheduled) gets an output buffer and schedules DevicePollTask with poll_device=true. At this point, the state of the VDA is such that it is waiting for an event from the v4l device, but DevicePollTask() has called poll() without the device fd, and therefore the event is never received.

I am attaching a patch that I implemented to fix this. The solution that is implemented is basically to break the chain of calls when it is detected that device_poll_thread_ has been restarted. With this additional check, at event (7) in the above example, ServiceDeviceTask() will return early and do nothing, because it will detect that it was scheduled by the previous device_poll_thread_ and not the current one.

Did this work before? N/A 

Chrome version: 61.0.3163.79 (Developer Build)  Channel: n/a
OS Version: Endless 3.4.0
Flash Version: N/A
 
0001-v4l2_VDA-fix-a-deadlock-that-can-happen-when-switchi.patch
4.9 KB Download
Components: Internals>Media>Codecs
Thanks for the patch with the fix.

Adding component for respective team to pick it up.
Cc: mcasas@chromium.org posciak@chromium.org sande...@chromium.org

Comment 3 by ajha@chromium.org, Nov 8 2017

Labels: TE-NeedsTriageHelp
Requesting someone from the dev team for help in triaging this further.
Components: Internals>GPU>Video
Labels: Needs-Feedback
NextAction: 2017-11-15
kiagiadakis.george@ thanks for the report and the patch; before
anything else, could you give details about the chromebook where
this fails? It should be ARM based, right? Then, could you in any
case please follow the instructions of [1] to submit a patch, 
that should be verified with a simplechrome [2]? Thanks!

[1] https://www.chromium.org/developers/contributing-code
[2] https://chromium.googlesource.com/chromiumos/docs/+/master/simple_chrome_workflow.md
I'm sorry I wasn't clear about that. This problem was spotted on an Endless Mini computer running Endless OS, not a chromebook. The SoC is an Amlogic S805 (meson8b) and the OS is ubuntu/debian based. Chromium is built for a linux desktop target there, with a few tweaks.

Despite this, I reported the issue here because it looks like a logic error in chromium's v4l VDA code and doesn't seem like it would affect only this board/platform. Being a race condition, it might be harder to reproduce elsewhere, but still, it could affect chromium running on any ARM board that uses the v4l VDA.

I am going to open a review ticket on gerrit asap.

Project Member

Comment 6 by sheriffbot@chromium.org, Nov 10 2017

Labels: -Needs-Feedback
Thank you for providing more feedback. Adding requester "mcasas@chromium.org" to the cc list and removing "Needs-Feedback" label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot

Comment 7 by fsamuel@google.com, Nov 10 2017

Owner: mcasas@chromium.org
Status: Assigned (was: Unconfirmed)
+mcasas@ who seems to be looking into this.
The NextAction date has arrived: 2017-11-15

Sign in to add a comment