Race condition in mojo::WaitSet::State class triggers DCHECK
Reported by
land...@opera.com,
May 12 2017
|
||
Issue descriptionWe sometimes hits a DCHECK in mojo WaitSet implementation with the following log: [19751:19753:0511/153249.723283:8574564291282:FATAL:wait_set.cc(283)] Check failed: MOJO_RESULT_CANCELLED == result (1 vs. 9) #0 0x7f3aa0e692f7 base::debug::StackTrace::StackTrace() #1 0x7f3aa0e89197 logging::LogMessage::~LogMessage() #2 0x7f3aa14fdabd mojo::WaitSet::State::Notify() #3 0x7f3aa14fdca2 mojo::WaitSet::State::Context::OnNotification() #4 0x7f3aa1ed585f mojo::edk::Watch::InvokeCallback() #5 0x7f3aa1ed3ac1 mojo::edk::RequestContext::~RequestContext() #6 0x7f3aa1ec47d0 mojo::edk::NodeChannel::OnChannelMessage() #7 0x7f3aa1eb2c40 mojo::edk::Channel::OnReadComplete() #8 0x7f3aa1eb7086 mojo::edk::(anonymous namespace)::ChannelPosix::OnFileCanReadWithoutBlocking() #9 0x7f3aa0e9da68 base::MessagePumpLibevent::OnLibeventNotification() #10 0x7f3aa0f3ff6e event_base_loop #11 0x7f3aa0e9d211 base::MessagePumpLibevent::Run() #12 0x7f3aa0e9c74f base::MessageLoop::RunHandler() #13 0x7f3aa0ec420b base::RunLoop::Run() #14 0x7f3aa0ef74b1 base::Thread::Run() #15 0x7f3aa0ef8830 base::Thread::ThreadMain() #16 0x7f3aa0ef06c3 base::(anonymous namespace)::ThreadFunc() #17 0x7f3aa6e84184 start_thread #18 0x7f3a9dff237d clone I have identified this to be a race because of insufficient synchronization between WaitSet::State::RemoveHandle() and WaitSet::State::Notify(). The reason this happens is that the handle is removed (with lock held) from handle_to_context_ in RemoveHandle() and then the lock is released before calling MojoCancelWatch() which will (always?) trigger a notification with result == MOJO_RESULT_CANCELLED. There could however be another notification happening on another thread that arrives at WaitSet::State::Notify() before the MOJO_RESULT_CANCELLED arrive and the DCHECK is triggered. So, with the current implementation the DCHECK doesn't hold. A comment states that it's not possible to extend the time the lock is held because that would cause reentrancy issues. Question is if the solution is to simply remove the DCHECK or if a more elaborate solution is needed? The problem was introduced by: https://codereview.chromium.org/2744943002
,
May 15 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/6dbb86d4a6bd9aacd50573b3473afaa857bca4c3 commit 6dbb86d4a6bd9aacd50573b3473afaa857bca4c3 Author: Ken Rockot <rockot@chromium.org> Date: Mon May 15 18:18:18 2017 Fix notification expectation in mojo::WaitSet This removes an errant DCHECK in mojo::WaitSet::State::Notify. BUG= 721689 R=yzshen@chromium.org Change-Id: Ifc9e91874e43ce33db7b94c76596435c812f60c0 Reviewed-on: https://chromium-review.googlesource.com/505417 Reviewed-by: Yuzhu Shen <yzshen@chromium.org> Commit-Queue: Ken Rockot <rockot@chromium.org> Cr-Commit-Position: refs/heads/master@{#471833} [modify] https://crrev.com/6dbb86d4a6bd9aacd50573b3473afaa857bca4c3/mojo/public/cpp/system/wait_set.cc
,
May 15 2017
Thanks again for reporting this. I agree with your analysis; the DCHECK was simply incorrect.
,
May 16 2017
Thanks for the quick resolution. |
||
►
Sign in to add a comment |
||
Comment 1 by roc...@chromium.org
, May 12 2017Labels: -Pri-3 Pri-1
Owner: roc...@chromium.org
Status: Assigned (was: Untriaged)