New issue
Advanced search Search tips

Issue 738023 link

Starred by 3 users

Issue metadata

Status: Verified
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



Sign in to add a comment

Possible FD leak when looping videos and play/pause events.

Project Member Reported by dgreid@chromium.org, Jun 29 2017

Issue description

Spun out from  Bug 731808 

I am trying to reproduce the issue on Kevin and Cyan. (will barrow a veyron for testing later.)
No lock to reproduce the issue yet after about 1 hour (following the instruction in c57). 

In which process, do we observed the fd leaking? The tab, gpu, or browser. 

The only number I saw keeping increasing is the fd # in browser process on Cyan. And I found most of them are those deleted buffers of cras. 

lrwx------. 1 chronos chronos 64 Jun 29 09:37 927 -> /dev/shm/cras-1707-stream-000b00de (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:37 928 -> /dev/shm/cras-1707-stream-000b00de (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 932 -> /dev/shm/cras-1707-stream-000b00e1 (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 933 -> /dev/shm/cras-1707-stream-000b00e1 (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 934 -> /dev/shm/cras-1707-stream-000b00e2 (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 935 -> /dev/shm/cras-1707-stream-000b00e2 (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 936 -> /dev/shm/cras-1707-stream-000b00e3 (deleted)
lrwx------. 1 chronos chronos 64 Jun 29 09:38 937 -> /dev/shm/cras-1707-stream-000b00e3 (deleted)


 #localhost fd # ls -l | grep  deleted | grep cras | wc
      590    7080   57820

But the same behavior is not observed on kevin.

BTW, it has been increased to 664 while I am typing.

  #localhost fd # ls -l | grep  deleted | grep cras | wc
      752    7968   65083



Repro Steps:
Re #54: I don't. But I can repro the problem with the mini html extracted from customer's app in  issue 735643 . The page loops 4 customer mp4 videos and switch every 1 second (to stress test). I load the html in a tab on veyron_fievel and the renderer crashed eventually. During the time, the renderer memory goes up and down between 170MB and 700+MB. Devtools does not show leaks in v8 (delta is minimum between two snapshots taken 8 hours apart).
 
I can reproduce the issue using test html file in https://bugs.chromium.org/p/chromium/issues/detail?id=731808#c58

I will check why chrome still holds those shm fd.
Status: Started (was: Assigned)
<!DOCTYPE html>
<html>
  <body>
    <video id="testvideo"></video>
    <script>
      setInterval(function(){
        var video = document.getElementsByTagName('video')[0];
        video.setAttribute('src', '27406.mp4');
        video.load();  //Does not appear to change the behavior at all
        video.play();
      },200);
    </script>
  </body>
</html>

When the time interval is less than 200 ms, issue starts to happen.
When the time interval is set to 100 ms or less, the issue happens more.

The shm file name is formed by cras-%pid-stream-%stream_id
e.g.:
/dev/shm/cras-1363-stream-001410fa
/dev/shm/cras-1363-stream-001410fa
/dev/shm/cras-1363-stream-00141102
/dev/shm/cras-1363-stream-00141102
/dev/shm/cras-1363-stream-00141109
/dev/shm/cras-1363-stream-00141109
/dev/shm/cras-1363-stream-0014110a
/dev/shm/cras-1363-stream-0014110a
/dev/shm/cras-1363-stream-0014110b
/dev/shm/cras-1363-stream-0014110b
/dev/shm/cras-1363-stream-00141113
/dev/shm/cras-1363-stream-00141113
/dev/shm/cras-1363-stream-00141119
/dev/shm/cras-1363-stream-00141119

The stream ids in the unclosed fd were not continuous, so it seems that when the stream is open and closed too fast, sometimes the fd is not closed.
I will need to check how Chrome handles the closed stream and the shm it used.

Comment 4 Deleted

I also checked whether CRAS closes unused shm after stream is removed and then destroyed (after draining).
This path is executed successfully. This can be verified by checking
/dev/shm
There will be no more than one shm for stream at a time:

localhost fd # ls -al /dev/shm
total 32
drwxrwxrwt.  2 root root     80 Jun 30 16:24 .
drwxr-xr-x. 16 root root   2220 Jun 30 15:41 ..
-rw-------.  1 cras cras 139136 Jun 30 15:41 cras-1553
-rw-------.  1 cras cras  16456 Jun 30 16:24 cras-1553-stream-00090137   ------> this will be removed as soon as stream is closed.

Still need to check how chrome handles shm fd.
I have been looking at how chrome uses cras_client and stop/close a stream.

CrasUnifiedStream::Stop:

https://cs.chromium.org/chromium/src/media/audio/cras/cras_unified.cc?type=cs&l=233

void CrasUnifiedStream::Stop() {
  if (!client_)
    return;

  // Removing the stream from the client stops audio.
  cras_client_rm_stream(client_, stream_id_);

  is_playing_ = false;
}


CrasUnifiedStream::Close

https://cs.chromium.org/chromium/src/media/audio/cras/cras_unified.cc?q=WriteAudio&l=128

void CrasUnifiedStream::Close() {
  if (client_) {
    cras_client_stop(client_);
    cras_client_destroy(client_);
    client_ = NULL;
  }

  // Signal to the manager that we're closed and can be removed.
  // Should be last call in the method as it deletes "this".
  manager_->ReleaseOutputStream(this);
}

It shows that cras_unified is just a thin wrapper that uses these calls in libcras cras_client:

cras_client_rm_stream
cras_client_stop
cras_client_destroy

I should look into these functions and see whether they are all executed successfully.


In cras_client:

client_thread_rm_stream
https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?type=cs&q=client_thread_rm_stream+package:%5Echromeos_public$&l=1556

It calls free_shm(stream)

/* Release shm areas if references to them are held. */
static void free_shm(struct client_stream *stream)
{
	if (stream->capture_shm.area) {
		munmap(stream->capture_shm.area, stream->capture_shm_size);
	}
	if (stream->play_shm.area) {
		munmap(stream->play_shm.area, stream->play_shm_size);
	}
	stream->capture_shm.area = NULL;
	stream->play_shm.area = NULL;
}

I will need to check whether the munmap call succeeds.

On the other hand we should check server side when it close shm:

void cras_shm_close_unlink (const char *name, int fd)
{
        shm_unlink(name);
        close(fd);
}


We should observe some difference when we set time interval below 200 ms and above.
I will look into it next week.



After digging into the code I realized that stream_connected in cras_client is the place where fd of shm should be closed.

https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1442

when stream is created:

A
1. server create shm using shm_open
2. server mmap it to use it.
2. server passes the fd of shm to client by socket using SCM_RIGHTS
3. client receives CRAS_CLIENT_STREAM_CONNECTED message, along with the fd (just like dup(2) which duplicate the fd, but this is on difference process).
4. client mmap the shm using fd.
5. client closes fd.

A reference to shm is created when mmap is called, so client can close fd and still use that mmap.
(ref https://stackoverflow.com/questions/17490033/do-i-need-to-keep-a-file-open-after-calling-mmap-on-it )


When client disconnect stream:

B
1. client send disconnet stream message to server.
2. client remove the reference to shm using munmap.
3. server drains the stream.
4. server destroy the stream: calls munmap, shm_unlink, then close.

In this issue, fd used in client process is not closed, so we should check whether A5 is executed successfully.
Maybe after client receives CRAS_CLIENT_STREAM_CONNECTED from server and does A3,A4, it did not do A5 successfully.

Comment 8 by dgreid@chromium.org, Jun 30 2017

On Fri, Jun 30, 2017 at 01:32:55PM -0700, cychiang via monorail wrote:

Looking at `stream_connected` in libcras, that seems unlikely.  It would
need a corrupted message.

Can the client ever get in a state where `wake_aud_thread` would block?
Maybe if the audio thread stopped reading from it and the pipe filled
up?  That's the only way to avoid closing those FDs.
Thank you for the advice, Dylan.
I added some debug messages to stream_connected, and found that the unclosed fd is caused by client removed stream so early such that the stream has not been connected yet.

2017-07-03T19:03:50.368137+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream
2017-07-03T19:03:50.368372+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream, assign id:0x290008
2017-07-03T19:03:50.368461+08:00 ERR chrome[27614]: cras_client: start_aud_thread stream id:0x290008
2017-07-03T19:03:50.371904+08:00 ERR chrome[27614]: cras_client: client_thread_rm_stream, id:0x290008
2017-07-03T19:03:50.372107+08:00 ERR chrome[27614]: cras_client: stop_aud_thread stream id:0x290008, join:1
2017-07-03T19:03:50.373886+08:00 ERR cras_server[1855]: fetch err: -32 for 290008

The fetch error -32 was caused by the fact the pipe to client stream had already been closed when server side audio thread tried to fetch audio data.
(I remember there was a bug tracking -32 error. This might be one of the cause to that issue.)

In comparison, normal stream flow:

2017-07-03T19:03:48.750267+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream
2017-07-03T19:03:48.750529+08:00 ERR chrome[27614]: cras_client: client_thread_add_stream, assign id:0x290007
2017-07-03T19:03:48.750624+08:00 ERR chrome[27614]: cras_client: start_aud_thread stream id:0x290007
2017-07-03T19:03:48.765405+08:00 ERR chrome[27614]: cras_client: stream_connected called, id 0x290007
2017-07-03T19:03:48.765760+08:00 ERR chrome[27614]: cras_client: stream_connected calls wake_aud_thread, id 0x290007
2017-07-03T19:03:48.765912+08:00 ERR chrome[27614]: cras_client: stream_connected calls close, id 0x290007, ret rc0: 0, rc1: 0
2017-07-03T19:03:48.806399+08:00 ERR chrome[27614]: cras_client: client_thread_rm_stream, id:0x290007
2017-07-03T19:03:48.806691+08:00 ERR chrome[27614]: cras_client: stop_aud_thread stream id:0x290007, join:1

To fix this, we can close stream_fds in stop_aud_thread like closing wake_fds:

https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?type=cs&q=stop_aud_thread+package:%5Echromeos_public$&l=1276

That is, check whether stream_fds are closed by stream_connected, if not, close them.


Sorry, I checked the code more carefully and got the idea what happened:

1. client_thread_rm_stream removed stream from streams linked list
https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1581

2. client thread received stream_connected message, and got the fd for shm. However, it could not find the stream id in the stream list, so it break without executing stream_connected callback.

https://cs.corp.google.com/chromeos_public/src/third_party/adhd/cras/src/libcras/cras_client.c?q=cras_client.c+package:%5Echromeos_public$&dr=CSs&l=1693

3. The fd for this shm remained opened.


To fix this, we need to close fd in 2 before break.

I will test the fix tomorrow.

Project Member

Comment 11 by bugdroid1@chromium.org, Jul 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/adhd/+/91493ab663ead053cae0a7c54d690db252ae8a79

commit 91493ab663ead053cae0a7c54d690db252ae8a79
Author: Cheng-Yi Chiang <cychiang@chromium.org>
Date: Thu Jul 06 04:54:55 2017

CRAS: cras_client - Close fd to shm if stream is removed

Sometimes a stream is removed before it is connected.
In this case, client needs to close fd to shm and then ignore the stream
connected message.
If client received incorrect number of fds, returns error.

BUG= chromium:738023 
TEST=Use the test page in the issue and monitor there is no fd like
 /dev/shm/cras-1707-stream-000b00de (deleted) in /proc/<pid>/fd.

Change-Id: I2fe2008106a5215670f281a4ccc1f6e3bf8686bd
Reviewed-on: https://chromium-review.googlesource.com/558182
Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org>
Tested-by: Cheng-Yi Chiang <cychiang@chromium.org>
Reviewed-by: Hsinyu Chao <hychao@chromium.org>

[modify] https://crrev.com/91493ab663ead053cae0a7c54d690db252ae8a79/cras/src/libcras/cras_client.c

Project Member

Comment 12 by bugdroid1@chromium.org, Jul 6 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/adhd/+/ea6269092046804e0b42b11111706a0c10e489d3

commit ea6269092046804e0b42b11111706a0c10e489d3
Author: Cheng-Yi Chiang <cychiang@chromium.org>
Date: Thu Jul 06 12:00:30 2017

CRAS: cras_client - Remove debug message

Remove debug message so libcras does not pollute the log of users.

BUG= chromium:738023 
TEST=None

Change-Id: I71d364a34c773f04c2da5e341376bc3f657ed068
Reviewed-on: https://chromium-review.googlesource.com/561038
Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org>
Tested-by: Cheng-Yi Chiang <cychiang@chromium.org>
Reviewed-by: Hsinyu Chao <hychao@chromium.org>

[modify] https://crrev.com/ea6269092046804e0b42b11111706a0c10e489d3/cras/src/libcras/cras_client.c

Labels: Merge-Request-60
Verified on 9725.0.0 on elm.
Add merge request for R60.
Project Member

Comment 14 by sheriffbot@chromium.org, Jul 10 2017

Labels: -Merge-Request-60 Hotlist-Merge-Review Merge-Review-60
This bug requires manual review: M60 has already been promoted to the beta branch, so this requires manual review
Please contact the milestone owner if you have questions.
Owners: amineer@(Android), cmasso@(iOS), josafat@(ChromeOS), bustamante@(Desktop)

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Labels: -Hotlist-Merge-Review -Merge-Review-60 Merge-Approved-60
Project Member

Comment 16 by bugdroid1@chromium.org, Jul 13 2017

Labels: merge-merged-release-R60-9592.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/adhd/+/4c5d810158a74943182743f9981ca7467683c0c4

commit 4c5d810158a74943182743f9981ca7467683c0c4
Author: Cheng-Yi Chiang <cychiang@chromium.org>
Date: Thu Jul 13 06:36:41 2017

CRAS: cras_client - Close fd to shm if stream is removed

Sometimes a stream is removed before it is connected.
In this case, client needs to close fd to shm and then ignore the stream
connected message.
If client received incorrect number of fds, returns error.

BUG= chromium:738023 
TEST=Use the test page in the issue and monitor there is no fd like
 /dev/shm/cras-1707-stream-000b00de (deleted) in /proc/<pid>/fd.

Change-Id: I2fe2008106a5215670f281a4ccc1f6e3bf8686bd
Reviewed-on: https://chromium-review.googlesource.com/558182
Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org>
Tested-by: Cheng-Yi Chiang <cychiang@chromium.org>
Reviewed-by: Hsinyu Chao <hychao@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/569548
Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org>
Commit-Queue: Cheng-Yi Chiang <cychiang@chromium.org>

[modify] https://crrev.com/4c5d810158a74943182743f9981ca7467683c0c4/cras/src/libcras/cras_client.c

Project Member

Comment 17 by bugdroid1@chromium.org, Jul 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/adhd/+/6e9aa2b7a694ee79518f05d544999468c4fd7055

commit 6e9aa2b7a694ee79518f05d544999468c4fd7055
Author: Cheng-Yi Chiang <cychiang@chromium.org>
Date: Thu Jul 13 06:36:44 2017

CRAS: cras_client - Remove debug message

Remove debug message so libcras does not pollute the log of users.

BUG= chromium:738023 
TEST=None

Change-Id: I71d364a34c773f04c2da5e341376bc3f657ed068
Reviewed-on: https://chromium-review.googlesource.com/561038
Commit-Ready: Cheng-Yi Chiang <cychiang@chromium.org>
Tested-by: Cheng-Yi Chiang <cychiang@chromium.org>
Reviewed-by: Hsinyu Chao <hychao@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/569549
Reviewed-by: Cheng-Yi Chiang <cychiang@chromium.org>
Commit-Queue: Cheng-Yi Chiang <cychiang@chromium.org>

[modify] https://crrev.com/6e9aa2b7a694ee79518f05d544999468c4fd7055/cras/src/libcras/cras_client.c

Labels: -Merge-Approved-60
Status: Fixed (was: Started)
Thanks Bernie!
Labels: M-60
Verified on 9756.0.0, 61.0.3159.0 on cyan device by running html file in https://bugs.chromium.org/p/chromium/issues/detail?id=731808#c58. Not observed any crash. 
I'll verify on M60 n then close the issue.
Cc: vsu...@chromium.org avkodipelli@chromium.org
Status: Verified (was: Fixed)
Verified on 9592.66.0, 60.0.3112.72 on cyan device.

Sign in to add a comment