crrev.com/492354 somehow makes Cronet cache reuse hang |
||||||
Issue descriptionSteps to reproduce: 1. Checkout revision prior to crrev.com/492354: git checkout 2ccc2f573320a2fe2630382882cbafa32bf8c4bf 2. gclient sync 3. Apply this little patch: --- a/components/cronet/android/test/javatests/src/org/chromium/net/CronetUrlRequestContextTest.java +++ b/components/cronet/android/test/javatests/src/org/chromium/net/CronetUrlRequestContextTest.java @@ -1025,8 +1025,11 @@ public class CronetUrlRequestContextTest extends CronetTestBase { // Shutdown original context and create another that uses the same cache. cronetEngine.shutdown(); + for (int i = 0; i < 100; i++) { cronetEngine = enableDiskCache(new CronetEngine.Builder(getContext())).build(); checkRequestCaching(cronetEngine, url, true); + cronetEngine.shutdown(); + } } 4. ./components/cronet/tools/cr_cronet.py gn 5. See that test passes: ./components/cronet/tools/cr_cronet.py build-test -f \*#testEnableHttpCacheDiskNewEngine --repeat 10 6. Revert patch from step #3: git checkout components/cronet/android/test/javatests/src/org/chromium/net/CronetUrlRequestContextTest.java 7. Checkout crrev.com/492354: git checkout 036fd1f21f7a30219442b1c66669a1dc761ca0d7 8. gclient sync 9. Apply patch from step #3, note that "cronetEngine.shutdown();" is already there and doesn't need to be added. 10. See that test hangs: ./components/cronet/tools/cr_cronet.py build-test -f \*#testEnableHttpCacheDiskNewEngine --repeat 10 This stems from internal b/65237179 I'm using something like https://paste.googleplex.com/4815557513707520 but the above steps assume an ARM Android device is connected via USB; though I haven't tried it on an ARM Android device.
,
Sep 1 2017
I added some logging. Looks like it passes most of the time when BackendCleanupTracker::TryCreate() creates an entry in the map, but when it finds an already existing entry in the map (I imagine when the old CronetEngine isn't done shutting down) then it seems to hang.
,
Sep 1 2017
Looks like the BackendCleanupTracker isn't getting destructed in the failure case, so the cleanup task isn't posted.
,
Sep 1 2017
,
Sep 1 2017
Yeah, that's all as expected. The question is why it's not being deleted, that requires a backend or an entry leak.
,
Sep 1 2017
Reproduced once with a native phone, but it's not doing it again w/more debug logs, so trying emulated.
,
Sep 1 2017
Looks like an entry leak, unsurprisingly.
,
Sep 1 2017
Seems like the reply callback (to CloseOperationComplete) in here: https://cs.chromium.org/chromium/src/net/disk_cache/simple/simple_entry_impl.cc?rcl=6717d64ea77a8481b964eedbcb6dc2ed2c2a84ea&l=798 isn't getting invoked sometimes, despite us being inside the function with non-null synchronous_entry_. Any ideas on why that might happen? SimpleEntryImpl has a scoped_refptr to worker_pool_ (probably irrelevant, but makes things easier to think of).
,
Sep 1 2017
I don't know much about the simple cache, but it may be relevant that this leak likely happens as the network thread is being stopped and destroyed, so I imagine any PostTask to the network thread may return false and not execute the task.
,
Sep 1 2017
Sounds extremely plausible. Besides this, is it also destroying the scheduler stuff altogether? Though I guess the BLOCK_SHUTDOWN tasks (index write back) are hopefully guaranteed to succeed. As for solutions: 1) Disabling this stuff for HTTP/Media caches is easy, and I think pretty much gives the status quo ante (if index write back works right). Probably sensible given 62 timing. Cons: messy API, some cache writes may end up incomplete. 2) Wiring up the "cache backend is done cleaning up" callback. This seems likely to be messy since it's a bunch of layers in, but my unfamiliarity with the surrounding may mean that I am missing some opportunities. Pro: when it's done, it's really done, with all entries written out, rather than some in incomplete/corrupt state.
,
Sep 1 2017
> is it also destroying the scheduler stuff altogether? I'm not sure what you mean by "scheduler stuff", but I don't think that's being shutdown. It's just destroying the UrlRequestContext and the network thread it ran on. I imagine any static stuff (e.g. simple cache's leaked g_sequenced_worker_pool) is not going away.
,
Sep 5 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b802d1998785b72384bedf9a3e285602fe5a3869 commit b802d1998785b72384bedf9a3e285602fe5a3869 Author: Maks Orlovich <morlovich@chromium.org> Date: Tue Sep 05 19:47:59 2017 disk_cache: Disable automatic creation sequencing for {DISK,MEDIA}_CACHE Network thread shutdown may sometimes leak entries in Cronet environment, and blocking-forever further creation for that cache path is inappropriate. Further, the feature is mostly meant for other kinds of caches, which get destroyed and created repeatedly. Bug: 761331 Change-Id: Id0edd1cb3bda41723ea159a02a0d92a661e3f6e4 Reviewed-on: https://chromium-review.googlesource.com/648286 Commit-Queue: Maks Orlovich <morlovich@chromium.org> Reviewed-by: Josh Karlin <jkarlin@chromium.org> Cr-Commit-Position: refs/heads/master@{#499715} [modify] https://crrev.com/b802d1998785b72384bedf9a3e285602fe5a3869/net/disk_cache/backend_unittest.cc [modify] https://crrev.com/b802d1998785b72384bedf9a3e285602fe5a3869/net/disk_cache/disk_cache.cc [modify] https://crrev.com/b802d1998785b72384bedf9a3e285602fe5a3869/net/disk_cache/disk_cache.h
,
Sep 6 2017
,
Sep 7 2017
Your change meets the bar and is auto-approved for M62. Please go ahead and merge the CL to branch 3202 manually. Please contact milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), bhthompson@(ChromeOS), abdulsyed@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 7 2017
I don't think I am able to do the merge myself...
,
Sep 8 2017
To merge changes follow the directions here: http://dev.chromium.org/developers/how-tos/drover So to merge back your change you'd use this command: git drover --branch 3202 --cherry-pick b802d19 There is a little merge conflict which I think I addressed. Here's the CL after I fixed the merge conflict: https://chromium-review.googlesource.com/c/chromium/src/+/656617 Want me to submit it?
,
Sep 8 2017
I don't have a committer status, so no amount of instructions can help. Similarly I can't actually approve the merge CL, though informally it does look good.
,
Sep 8 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/cb7fdd075b1398a233326699e65fa705d3573086 commit cb7fdd075b1398a233326699e65fa705d3573086 Author: Paul Jensen <pauljensen@chromium.org> Date: Fri Sep 08 01:34:45 2017 disk_cache: Disable automatic creation sequencing for {DISK,MEDIA}_CACHE Network thread shutdown may sometimes leak entries in Cronet environment, and blocking-forever further creation for that cache path is inappropriate. Further, the feature is mostly meant for other kinds of caches, which get destroyed and created repeatedly. TBR=morlovich@chromium.org (cherry picked from commit b802d1998785b72384bedf9a3e285602fe5a3869) Bug: 761331 Change-Id: Id0edd1cb3bda41723ea159a02a0d92a661e3f6e4 Reviewed-on: https://chromium-review.googlesource.com/648286 Commit-Queue: Maks Orlovich <morlovich@chromium.org> Reviewed-by: Josh Karlin <jkarlin@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#499715} Reviewed-on: https://chromium-review.googlesource.com/656617 Reviewed-by: Paul Jensen <pauljensen@chromium.org> Cr-Commit-Position: refs/branch-heads/3202@{#82} Cr-Branched-From: fa6a5d87adff761bc16afc5498c3f5944c1daa68-refs/heads/master@{#499098} [modify] https://crrev.com/cb7fdd075b1398a233326699e65fa705d3573086/net/disk_cache/backend_unittest.cc [modify] https://crrev.com/cb7fdd075b1398a233326699e65fa705d3573086/net/disk_cache/disk_cache.cc [modify] https://crrev.com/cb7fdd075b1398a233326699e65fa705d3573086/net/disk_cache/disk_cache.h
,
Sep 8 2017
|
||||||
►
Sign in to add a comment |
||||||
Comment 1 by morlovich@chromium.org
, Sep 1 2017