Fix race condition in weak pointers in safe-browsing v4 db code. |
|
Issue descriptionMoving discussion here because issue 683147 has been closed, and this is really a problem with the fix, not with the original bug. In https://codereview.chromium.org/2649643002 I brought up a concern about races in the weak pointer usage. I've thought things out more, and I'm pretty sure there's a race condition still, but that it happens to be narrower than I initially thought, which perhaps explains why ASAN/TSAN/whateverSAN hasn't flagged it. The CL calls GetWeakPtr and InvalidateWeakPtrs against the factory on the io thread, and this is safe. What is NOT safe is that it posts the weak pointer as a target to a task on the db thread. What I mean is that the weak pointer is the target of a method call, so on the db thread, code runs which is like this: T* raw_t = p.get(); if (raw_t) raw_t->TheMethod(...); The idea is that if the weak pointer is invalidated, the callback execution just falls on the floor, and whatever parameter ownership is involved unwinds (so, for instance, moved unique_ptr will be deleted, etc). The problem is that the get() is inherently racy WRT the InvalidateWeakPtrs() call. The general way this happens is that the background task calls the get() and starts running TheMethod(), then the foreground deletes the object (calling invalidate) while the method is still running. This code does not hit that case, because it manually calls invalidate and then posts a delete to the db task, so the delete is guaranteed to come after TheMethod() runs. BUT, get() and InvalidateWeakPtrs() are themselves racy against each other, because they do not access things using atomic operations. So it's possible that the get() could get inconsistent results. I believe/understand that on x86, this may be masked because the operations are accidentally atomic (if you have the right word sizes, alignment, etc, which is _probably_ all true for Chrome). But I believe that is not true on ARM. The right solution is to have the opposing thread _only_ pass WeakPtr instances as data, not to access them. In this case, it would mean that the db itself would contain the factory, and would vent weak pointers for the io thread to use. I think this would require the io thread to keep a task-runner reference (because it can't use the db's reference), and then have the io thread post a delete request to manually delete (not DeleteSoon, rather a call to a static function taking a weak pointer parameter to delete). [I'm not finding the arm-vs-x86-atomic-operations document I remember, but it's out there.]
,
Jan 26 2017
This is _not_ the document I'm thinking of: http://preshing.com/20130618/atomic-vs-non-atomic-operations/ but it does contain some of the flavor of what I'm getting at. I posted around to see if someone remembers the document for me, it was one of those ones where reading it had me saying "But, but, but", and then, finally, agreeing that if you mean atomic, you need to say atomic, and if you don't say atomic, you must assume it's not atomic. |
|
►
Sign in to add a comment |
|
Comment 1 by vakh@chromium.org
, Jan 26 2017