I've discovered a race condition in the existing design in which calls can time out because of a deadlock between threads. The deadlock isn't permanent: it clears once the timeout expires.
What happens is this: the D-Bus thread in cicerone gets a request from Chrome and makes a gRPC call into garcon, which may in turn need the D-Bus thread in garcon (the current example is getting the info for a Linux package file). In the other direction, garcon makes gRPC calls back into cicerone when it notices certain changes in the filesystem. These originate on the D-Bus thread in garcon, and cicerone will usually handle them by making a D-Bus call back to Chrome on its own D-Bus thread. If calls in both directions are in flight at the same time, each D-Bus thread ends up waiting on the other, and we deadlock until the timeouts occur.
After looking through the code more, the only case where this can actually happen is the PackageInfo call. It's the only gRPC call into garcon that blocks its return on something being executed on the D-Bus thread.
I think the cleanest solution may be to add another thread in garcon, so that the D-Bus thread no longer shares a message loop with the code that makes gRPC calls back to cicerone; then there won't be contention there.
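To make the cycle and the proposed fix concrete, here is a minimal standalone sketch in plain C++. It is not the garcon or cicerone code and uses no gRPC or D-Bus: TaskLoop, BlockingCall, RunScenario, Outcome and the 300 ms timeout are all hypothetical stand-ins, assuming each D-Bus thread behaves like a single-threaded task queue and each cross-daemon call blocks its caller until the other side's loop runs a task or the timeout expires.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <deque>
#include <functional>
#include <future>
#include <memory>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a single-threaded message loop.
class TaskLoop {
 public:
  TaskLoop() : worker_([this] { Run(); }) {}
  ~TaskLoop() {
    PostTask([this] { done_ = true; });  // Quit task runs after queued work.
    worker_.join();
  }
  void PostTask(std::function<void()> task) {
    assert(task);
    {
      std::lock_guard<std::mutex> lock(mu_);
      tasks_.push_back(std::move(task));
    }
    cv_.notify_one();
  }

 private:
  void Run() {
    while (!done_) {
      std::function<void()> task;
      {
        std::unique_lock<std::mutex> lock(mu_);
        cv_.wait(lock, [this] { return !tasks_.empty(); });
        task = std::move(tasks_.front());
        tasks_.pop_front();
      }
      task();  // One task at a time: a blocked task stalls the whole loop.
    }
  }
  std::mutex mu_;
  std::condition_variable cv_;
  std::deque<std::function<void()>> tasks_;
  bool done_ = false;   // Only read and written on the worker thread.
  std::thread worker_;  // Declared last so it starts after the other members.
};

// Models a blocking cross-daemon call: the caller waits until `target` has
// run a task, or gives up after `timeout`. Returns true on success.
bool BlockingCall(TaskLoop& target, std::chrono::milliseconds timeout) {
  auto done = std::make_shared<std::promise<void>>();
  std::future<void> reply = done->get_future();
  target.PostTask([done] { done->set_value(); });
  return reply.wait_for(timeout) == std::future_status::ready;
}

struct Outcome {
  bool package_info_ok;  // cicerone -> garcon call (needs garcon's D-Bus loop)
  bool notify_ok;        // garcon -> cicerone call (needs cicerone's D-Bus loop)
};

// Runs both calls at once. With separate_outbound_thread == false, garcon
// issues its outbound call from its D-Bus loop, as in the report.
Outcome RunScenario(bool separate_outbound_thread) {
  const std::chrono::milliseconds kTimeout(300);  // Assumed value.
  std::promise<void> a_started, b_started;
  std::shared_future<void> a_ready = a_started.get_future().share();
  std::shared_future<void> b_ready = b_started.get_future().share();
  std::promise<bool> a_result, b_result;
  // Loops are declared after the shared state so they are joined first.
  TaskLoop cicerone_dbus, garcon_dbus, garcon_outbound;
  TaskLoop& notifier = separate_outbound_thread ? garcon_outbound : garcon_dbus;

  cicerone_dbus.PostTask([&] {
    a_started.set_value();
    b_ready.wait();  // Make sure both calls are in flight together.
    a_result.set_value(BlockingCall(garcon_dbus, kTimeout));
  });
  notifier.PostTask([&] {
    b_started.set_value();
    a_ready.wait();
    b_result.set_value(BlockingCall(cicerone_dbus, kTimeout));
  });

  Outcome out;
  out.package_info_ok = a_result.get_future().get();
  out.notify_ok = b_result.get_future().get();
  return out;
}
```

Calling RunScenario(false) models the current design: at least one of the two calls has to wait out the timeout, because each loop is stuck inside a task that is waiting on the other loop. Calling RunScenario(true) moves garcon's outbound call onto its own loop, so garcon's D-Bus loop stays free to serve the incoming request and both calls should complete well inside the timeout.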
Comment 1 by bugdroid1@chromium.org, Sep 26