Possible FD leak when looping videos and play/pause events. |
|||||||||
Issue descriptionSpun out from Bug 731808 I am trying to reproduce the issue on Kevin and Cyan. (will barrow a veyron for testing later.) No lock to reproduce the issue yet after about 1 hour (following the instruction in c57). In which process, do we observed the fd leaking? The tab, gpu, or browser. The only number I saw keeping increasing is the fd # in browser process on Cyan. And I found most of them are those deleted buffers of cras. lrwx------. 1 chronos chronos 64 Jun 29 09:37 927 -> /dev/shm/cras-1707-stream-000b00de (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:37 928 -> /dev/shm/cras-1707-stream-000b00de (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 932 -> /dev/shm/cras-1707-stream-000b00e1 (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 933 -> /dev/shm/cras-1707-stream-000b00e1 (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 934 -> /dev/shm/cras-1707-stream-000b00e2 (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 935 -> /dev/shm/cras-1707-stream-000b00e2 (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 936 -> /dev/shm/cras-1707-stream-000b00e3 (deleted) lrwx------. 1 chronos chronos 64 Jun 29 09:38 937 -> /dev/shm/cras-1707-stream-000b00e3 (deleted) #localhost fd # ls -l | grep deleted | grep cras | wc 590 7080 57820 But the same behavior is not observed on kevin. BTW, it has been increased to 664 while I am typing. #localhost fd # ls -l | grep deleted | grep cras | wc 752 7968 65083 Repro Steps: Re #54: I don't. But I can repro the problem with the mini html extracted from customer's app in issue 735643 . The page loops 4 customer mp4 videos and switch every 1 second (to stress test). I load the html in a tab on veyron_fievel and the renderer crashed eventually. During the time, the renderer memory goes up and down between 170MB and 700+MB. Devtools does not show leaks in v8 (delta is minimum between two snapshots taken 8 hours apart).
,
Jun 30 2017
,
Jun 30 2017
<!DOCTYPE html>
<html>
<body>
<video id="testvideo"></video>
<script>
setInterval(function(){
var video = document.getElementsByTagName('video')[0];
video.setAttribute('src', '27406.mp4');
video.load(); //Does not appear to change the behavior at all
video.play();
},200);
</script>
</body>
</html>
When the time interval is less than 200 ms, issue starts to happen.
When the time interval is set to 100 ms or less, the issue happens more.
The shm file name is formed by cras-%pid-stream-%stream_id
e.g.:
/dev/shm/cras-1363-stream-001410fa
/dev/shm/cras-1363-stream-001410fa
/dev/shm/cras-1363-stream-00141102
/dev/shm/cras-1363-stream-00141102
/dev/shm/cras-1363-stream-00141109
/dev/shm/cras-1363-stream-00141109
/dev/shm/cras-1363-stream-0014110a
/dev/shm/cras-1363-stream-0014110a
/dev/shm/cras-1363-stream-0014110b
/dev/shm/cras-1363-stream-0014110b
/dev/shm/cras-1363-stream-00141113
/dev/shm/cras-1363-stream-00141113
/dev/shm/cras-1363-stream-00141119
/dev/shm/cras-1363-stream-00141119
The stream ids in the unclosed fd were not continuous, so it seems that when the stream is open and closed too fast, sometimes the fd is not closed.
I will need to check how Chrome handles the closed stream and the shm it used.
,
Jun 30 2017
I also checked whether CRAS closes unused shm after stream is removed and then destroyed (after draining). This path is executed successfully. This can be verified by checking /dev/shm There will be no more than one shm for stream at a time: localhost fd # ls -al /dev/shm total 32 drwxrwxrwt. 2 root root 80 Jun 30 16:24 . drwxr-xr-x. 16 root root 2220 Jun 30 15:41 .. -rw-------. 1 cras cras 139136 Jun 30 15:41 cras-1553 -rw-------. 1 cras cras 16456 Jun 30 16:24 cras-1553-stream-00090137 ------> this will be removed as soon as stream is closed. Still need to check how chrome handles shm fd.
,
Jun 30 2017
I have been looking at how chrome uses cras_client and stop/close a stream. CrasUnifiedStream::Stop: https://cs.chromium.org/chromium/src/media/audio/cras/cras_unified.cc?type=cs&l=233 void CrasUnifiedStream::Stop() { if (!client_) return; // Removing the stream from the client stops audio. cras_client_rm_stream(client_, stream_id_); is_playing_ = false; } CrasUnifiedStream::Close https://cs.chromium.org/chromium/src/media/audio/cras/cras_unified.cc?q=WriteAudio&l=128 void CrasUnifiedStream::Close() { if (client_) { cras_client_stop(client_); cras_client_destroy(client_); client_ = NULL; } // Signal to the manager that we're closed and can be removed. // Should be last call in the method as it deletes "this". manager_->ReleaseOutputStream(this); } It shows that cras_unified is just a thin wrapper that uses these calls in libcras cras_client: cras_client_rm_stream cras_client_stop cras_client_destroy I should look into these functions and see whether they are all executed successfully. In cras_client: client_thread_rm_stream https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?type=cs&q=client_thread_rm_stream+package:%5Echromeos_public$&l=1556 It calls free_shm(stream) /* Release shm areas if references to them are held. */ static void free_shm(struct client_stream *stream) { if (stream->capture_shm.area) { munmap(stream->capture_shm.area, stream->capture_shm_size); } if (stream->play_shm.area) { munmap(stream->play_shm.area, stream->play_shm_size); } stream->capture_shm.area = NULL; stream->play_shm.area = NULL; } I will need to check whether the munmap call succeeds. On the other hand we should check server side when it close shm: void cras_shm_close_unlink (const char *name, int fd) { shm_unlink(name); close(fd); } We should observe some difference when we set time interval below 200 ms and above. I will look into it next week.
,
Jun 30 2017
After digging into the code I realized that stream_connected in cras_client is the place where fd of shm should be closed. https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1442 when stream is created: A 1. server create shm using shm_open 2. server mmap it to use it. 2. server passes the fd of shm to client by socket using SCM_RIGHTS 3. client receives CRAS_CLIENT_STREAM_CONNECTED message, along with the fd (just like dup(2) which duplicate the fd, but this is on difference process). 4. client mmap the shm using fd. 5. client closes fd. A reference to shm is created when mmap is called, so client can close fd and still use that mmap. (ref https://stackoverflow.com/questions/17490033/do-i-need-to-keep-a-file-open-after-calling-mmap-on-it ) When client disconnect stream: B 1. client send disconnet stream message to server. 2. client remove the reference to shm using munmap. 3. server drains the stream. 4. server destroy the stream: calls munmap, shm_unlink, then close. In this issue, fd used in client process is not closed, so we should check whether A5 is executed successfully. Maybe after client receives CRAS_CLIENT_STREAM_CONNECTED from server and does A3,A4, it did not do A5 successfully.
,
Jun 30 2017
On Fri, Jun 30, 2017 at 01:32:55PM -0700, cychiang via monorail wrote: Looking at `stream_connected` in libcras, that seems unlikely. It would need a corrupted message. Can the client ever get in a state where `wake_aud_thread` would block? Maybe if the audio thread stopped reading from it and the pipe filled up? That's the only way to avoid closing those FDs.
,
Jul 3 2017
Thank you for the advice, Dylan. I added some debug messages to stream_connected, and found that the unclosed fd is caused by client removed stream so early such that the stream has not been connected yet. 2017-07-03T19:03:50.368137+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream 2017-07-03T19:03:50.368372+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream, assign id:0x290008 2017-07-03T19:03:50.368461+08:00 ERR chrome[27614]: cras_client: start_aud_thread stream id:0x290008 2017-07-03T19:03:50.371904+08:00 ERR chrome[27614]: cras_client: client_thread_rm_stream, id:0x290008 2017-07-03T19:03:50.372107+08:00 ERR chrome[27614]: cras_client: stop_aud_thread stream id:0x290008, join:1 2017-07-03T19:03:50.373886+08:00 ERR cras_server[1855]: fetch err: -32 for 290008 The fetch error -32 was caused by the fact the pipe to client stream had already been closed when server side audio thread tried to fetch audio data. (I remember there was a bug tracking -32 error. This might be one of the cause to that issue.) In comparison, normal stream flow: 2017-07-03T19:03:48.750267+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream 2017-07-03T19:03:48.750529+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream, assign id:0x290007 2017-07-03T19:03:48.750624+08:00 ERR chrome[27614]: cras_client: start_aud_thread stream id:0x290007 2017-07-03T19:03:48.765405+08:00 ERR chrome[27614]: cras_client: stream_connected called, id 0x290007 2017-07-03T19:03:48.765760+08:00 ERR chrome[27614]: cras_client: stream_connected calls wake_aud_thread, id 0x290007 2017-07-03T19:03:48.765912+08:00 ERR chrome[27614]: cras_client: stream_connected calls close, id 0x290007, ret rc0: 0, rc1: 0 2017-07-03T19:03:48.806399+08:00 ERR chrome[27614]: cras_client: client_thread_rm_stream, id:0x290007 2017-07-03T19:03:48.806691+08:00 ERR chrome[27614]: cras_client: stop_aud_thread stream id:0x290007, join:1 To fix this, we can close stream_fds in stop_aud_thread like closing wake_fds: https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?type=cs&q=stop_aud_thread+package:%5Echromeos_public$&l=1276 That is, check whether stream_fds are closed by stream_connected, if not, close them.
,
Jul 3 2017
Sorry, I checked the code more carefully and got the idea what happened: 1. client_thread_rm_stream removed stream from streams linked list https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1581 2. client thread received stream_connected message, and got the fd for shm. However, it could not find the stream id in the stream list, so it break without executing stream_connected callback. https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1693 3. The fd for this shm remained opened. To fix this, we need to close fd in 2 before break. I will test the fix tomorrow.
,
Jul 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/adhd/+/91493ab663ead053cae0a7c54d690db252ae8a79 commit 91493ab663ead053cae0a7c54d690db252ae8a79 Author: Cheng-Yi Chiang <cychiang@chromium.org> Date: Thu Jul 06 04:54:55 2017 CRAS: cras_client - Close fd to shm if stream is removed Sometimes a stream is removed before it is connected. In this case, client needs to close fd to shm and then ignore the stream connected message. If client received incorrect number of fds, returns error. BUG= chromium:738023 TEST=Use the test page in the issue and monitor there is no fd like /dev/shm/cras-1707-stream-000b00de (deleted) in /proc/<pid>/fd. Change-Id: I2fe2008106a5215670f281a4ccc1f6e3bf8686bd Reviewed-on: https://chromium-review.googlesource.com/558182 Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org> Tested-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Hsinyu Chao <hychao@chromium.org> [modify] https://crrev.com/91493ab663ead053cae0a7c54d690db252ae8a79/cras/src/libcras/cras_client.c
,
Jul 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/adhd/+/ea6269092046804e0b42b11111706a0c10e489d3 commit ea6269092046804e0b42b11111706a0c10e489d3 Author: Cheng-Yi Chiang <cychiang@chromium.org> Date: Thu Jul 06 12:00:30 2017 CRAS: cras_client - Remove debug message Remove debug message so libcras does not pollute the log of users. BUG= chromium:738023 TEST=None Change-Id: I71d364a34c773f04c2da5e341376bc3f657ed068 Reviewed-on: https://chromium-review.googlesource.com/561038 Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org> Tested-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Hsinyu Chao <hychao@chromium.org> [modify] https://crrev.com/ea6269092046804e0b42b11111706a0c10e489d3/cras/src/libcras/cras_client.c
,
Jul 10 2017
Verified on 9725.0.0 on elm. Add merge request for R60.
,
Jul 10 2017
This bug requires manual review: M60 has already been promoted to the beta branch, so this requires manual review Please contact the milestone owner if you have questions. Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Jul 12 2017
,
Jul 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/adhd/+/4c5d810158a74943182743f9981ca7467683c0c4 commit 4c5d810158a74943182743f9981ca7467683c0c4 Author: Cheng-Yi Chiang <cychiang@chromium.org> Date: Thu Jul 13 06:36:41 2017 CRAS: cras_client - Close fd to shm if stream is removed Sometimes a stream is removed before it is connected. In this case, client needs to close fd to shm and then ignore the stream connected message. If client received incorrect number of fds, returns error. BUG= chromium:738023 TEST=Use the test page in the issue and monitor there is no fd like /dev/shm/cras-1707-stream-000b00de (deleted) in /proc/<pid>/fd. Change-Id: I2fe2008106a5215670f281a4ccc1f6e3bf8686bd Reviewed-on: https://chromium-review.googlesource.com/558182 Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org> Tested-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Hsinyu Chao <hychao@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/569548 Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org> Commit-Queue: Cheng-Yi Chiang <cychiang@chromium.org> [modify] https://crrev.com/4c5d810158a74943182743f9981ca7467683c0c4/cras/src/libcras/cras_client.c
,
Jul 13 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/adhd/+/6e9aa2b7a694ee79518f05d544999468c4fd7055 commit 6e9aa2b7a694ee79518f05d544999468c4fd7055 Author: Cheng-Yi Chiang <cychiang@chromium.org> Date: Thu Jul 13 06:36:44 2017 CRAS: cras_client - Remove debug message Remove debug message so libcras does not pollute the log of users. BUG= chromium:738023 TEST=None Change-Id: I71d364a34c773f04c2da5e341376bc3f657ed068 Reviewed-on: https://chromium-review.googlesource.com/561038 Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org> Tested-by: Cheng-Yi Chiang <cychiang@chromium.org> Reviewed-by: Hsinyu Chao <hychao@chromium.org> Reviewed-on: https://chromium-review.googlesource.com/569549 Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org> Commit-Queue: Cheng-Yi Chiang <cychiang@chromium.org> [modify] https://crrev.com/6e9aa2b7a694ee79518f05d544999468c4fd7055/cras/src/libcras/cras_client.c
,
Jul 13 2017
Thanks Bernie!
,
Jul 18 2017
Verified on 9756.0.0, 61.0.3159.0 on cyan device by running html file in https://bugs.chromium.org/p/chromium/issues/detail?id=731808#c58. Not observed any crash. I'll verify on M60 n then close the issue.
,
Jul 20 2017
Verified on 9592.66.0, 60.0.3112.72 on cyan device. |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by cychiang@chromium.org
, Jun 30 2017