Issue metadata
Sign in to add a comment
|
HttpNetworkTransactions leaked by HttpCache::Transaction on request cancellation
Reported by
bal...@netavis.hu,
Feb 12 2018
|
||||||||||||||||||||||||
Issue descriptionUserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Steps to reproduce the problem: 1. Sending several chained requests to our server (login, open required session, channel, etc.) 2. Setting an MJPEG stream (multipart-x-mixed-replace type) to an img tag 3. Updating the src attribute constantly, based on user interaction (jumping in the player, etc.) What is the expected behavior? At any given time, the active tcp sockets should be less then the maximum allowed 6. On previous versions of Chrome this is true, and on Firefox / Edge also (at least that's what I've tested) What went wrong? Requests are stuck in a pending... state, timing out eventually. According to chrome://net-internals tab, the previous URLs of the img tag are still running, even though the src attribute got updated since. Did this work before? Yes 63.0.3239.132 Does this work in other browsers? Yes Chrome version: 64.0.3282.140 Channel: stable OS Version: Ubuntu 16.04.3 LTS Flash Version: Since the version 64 update, Chrome seems to not close multipart sockets correctly. I was not able to reproduce it via a more simplified example (or even skipping our API). We have a kind of camera archive player, written in javascript, working through our propriertary API. Opening an archive has several steps (each as a separate GET request), but simply put it looks like the following: - login the user with credentials, opening a session (not the same as browser's session) - set the start / end of archive, opening a channel and a stream in return - set the mjpeg stream to an img tag I've attached an export file created with the latest Chrome version, reproducing the problem, and another one created with Chromium 63.0.3239.132 - on which everything works correctly and the socket limit is never reached. Please let me know if you need anything else.
,
Feb 12 2018
,
Feb 12 2018
,
Feb 12 2018
In chrome-net-export-log.json,
URL_REQUEST(ID:461) was finished ("t=1529 [st=1041] -REQUEST_ALIVE") but the corresponding socket (ID:283) was kept IN-USE.
In chromium-63-works-correctly.json,
the URL_REQUEST with a similar URL (ID:1610) was finished ("t=1547 [st=1053] -REQUEST_ALIVE") and the corresponding socket (ID:1510) became dead immediately (
t=1547 [st=1545] -SOCKET_IN_USE
t=1547 [st=1545] -SOCKET_ALIVE).
Perhaps requests were canceled in Blink/URLRequest layers but sockets were kept alive for some reasons?
net/ folks, could you take a look?
,
Feb 13 2018
Thanks for filing the issue! @Reporter: If it is possible could you please provide any repro test case or sample file for reproducing the issue and bisecting this further in better way. Thanks!
,
Feb 14 2018
Thanks for the analysis! @Viswa: As I wasn't able to reproduce it without using our API, I've put together a test case for you at: http://speedy.netavis.hu:62354/test When you open the page, you'll see an image and a button. The image is an archive stream from our server, in paused state. When the button is clicked the stream is closed and right after an identical new one is opened. In the chrome://net-internals tab you'll see it uses 2 sockets now, though the img has it's source updated to the new url already. After multiple clicks, the browser will max out the 6 available sockets, leaving any subsequent requests in pending state. It's a simplified example of how our client works, sadly I couldn't narrow it down more. Setting a multipart stream (without any of our session/archive open calls) works correctly, even when adding some random string at the end of each url. This may have something to do with chaining the requests after each other, like in the example.
,
Feb 14 2018
Thank you for providing more feedback. Adding requester "viswa.karala@chromium.org" to the cc list and removing "Needs-Feedback" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Feb 15 2018
Tested the issue on chrome reported version 64.0.3282.140 using Ubuntu 17.10 with steps mentioned below: 1) Launched chrome reported version and navigated to URL: http://speedy.netavis.hu:62354/test provided in comment#6 2) On the new page, clicked on "click me" button, stream got closed and gain new stream got opened and navigated to Chrome://net-internals > Sockets, able to see some entries in it. 3) After mumltiple clicking on "click me" button, the Handed value keeps on chaging @Reporter: Please find the attached screencast for your reference and provide your inputs on it, if possible could you please provide the screencast of the excepted behaviour of the issue which helps us in further triaging it. Thanks!
,
Feb 15 2018
@Viswa: Thanks for the check. Please note that on the test page, the video should be *paused* at the first 5 clicks, until the max 6 sockets are reached for the host. Afterwards the site may not work correctly as the requests doesn't reach the server anymore. Thus the expected behaviour of the test page is: after any given click on the button, a paused video should be shown (meaning all requests reached the server correctly), and no more than 1 socket are open for the host. I'm uploading 2 screencasts: - socket_pool_for_host_filled: reproducing the error on with Chrome 64, showing also the network tab and pending request (once limit is reached) - works_correctly_on_chromium63: the expected behaviour, that works correctly on Chromium 63
,
Feb 15 2018
Thank you for providing more feedback. Adding requester "viswa.karala@chromium.org" to the cc list and removing "Needs-Feedback" label. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Feb 15 2018
Something weird is happening in the flaky log - we have a URLRequest (461) holding onto a socket (283) after the "-REQUEST_ALIVE" event. The request in question was cancelled. REQUEST_ALIVE is logged on URLRequest destruction, so presumably something must be hanging on to the socket in an intermediary layer. [+mgersh]: Can the cache do that, after all requests to a URL were closed / cancelled?
,
Feb 15 2018
Did you mean morlovich?
,
Feb 15 2018
[+shivanisha] Sorry, Miriam. It was Shivani did the cache lock fix, not sure why I was thinking it was you.
,
Feb 16 2018
Able to reproduce the issue on reported chrome version 64.0.3282.140 and latest canary 66.0.3347.0 using Mac 10.13.1, Windows-10 and Ubuntu 14.04 hence providing Bisect Info Bisect Info: ================ Good build: 64.0.3247.0 Bad build: 64.0.3248.0 You are probably looking for a change made after 510818 (known good), but no later than 510819 (first known bad). https://chromium.googlesource.com/chromium/src/+log/7b3d579f00d83dd6403cf8cb2883e1658a1cabcb..c18f976cdcb74086ce6bab2d2574718166c53517 Reviewed-on: https://chromium-review.googlesource.com/578172 @Shivani Sharma: Please confirm the issue and help in re-assigning if it is not related to your change. Adding ReleaseBlock-Stable as it is seems a receent break, feel free to remove it if not applicable. Thanks!
,
Feb 16 2018
It does look like the HttpNetworkTransaction is leaked...
,
Feb 16 2018
,
Feb 16 2018
Dumping some state info --- not sure I'd be able to do anything with it myself (will poke around some more), but may at least save Shivani some time. HttpCache::Transaction gets destroyed by the job getting cancelled, and the destructor calls HttpCache::DoneWithEntry(false), which calls HttpCache::Writers::RemoveTransaction In there, the transaction removed is the last one (all_writers_ is empty after the EraseTransaction) call, success is false, should_keep_entry_ is true, and next_state_ is NETWORK_READ_COMPLETE I do not, however, seem to see DoNetworkReadComplete getting called.
,
Feb 16 2018
@Reporter: is it possible you're doing something clever that stops the server from pushing things before you navigate away? If so, it's possible that not doing that might be a doable workaround till we get this sorted.
,
Feb 16 2018
,
Feb 16 2018
Multipart-mixed-replace or some sort of other hanging get are probably part of the equation, too. Relevant context: https://bugs.chromium.org/p/chromium/issues/detail?id=804868&desc=2#c18 Come to think of it, this is probably a duplicate?
,
Feb 16 2018
Ah, I thought that one was fixed. I agree it's a duplicate.
,
Feb 19 2018
Thanks for narrowing it down, and finding the real culprit! @morlovich: we already have a workaround, by not using mutlipart images, but sending individual jpegs instead. Thanks for the suggestion anyway. |
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by krajshree@chromium.org
, Feb 12 2018