
Issue 765836

Starred by 1 user

Issue metadata

Status: Duplicate
Owner:
Closed: Nov 2017
Cc:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 3
Type: Bug




OOP HP stops working after running for a few days on macOS

Project Member Reported by erikc...@chromium.org, Sep 15 2017

Issue description

Observed this with both --memlog=browser and --memlog=minimal.

Observations:
  1) There's no CPU activity in the profiling process.
  2) But the profiling process still uses a large amount of memory [~1GB]
  3) chrome://memory-internals and chrome://tracing both fail to produce heap dumps.
  4) Attempting to take a chrome://memory-internals dump with tracing [of IPC events] enabled causes the profiling process to disappear. No crash is generated.
  5) Sampling the browser process shows that we still shim malloc, but no longer send memlog messages.

The hypothesis that best fits these observations: somehow, the memlog pipe between the browser and profiling processes is destroyed without either process going down. The browser process continues to shim malloc, but doesn't send any messages [or rather, all sends immediately fail]. The profiling process has its MemlogConnection torn down, but this does not clear the atoms from the BacktraceStorage, which accounts for the continued high memory usage.

Subsequent requests for heap dumps produce no results, since the connection no longer exists.
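
To illustrate the "all sends immediately fail" part of this hypothesis, here is a minimal sketch using a plain AF_UNIX socketpair. It is not the memlog sender code: the function name is hypothetical, and the socket type (SOCK_STREAM) is an assumption, since this report does not say what the memlog pipe actually uses. Once the receiving end is closed, a send on the other end fails right away with EPIPE, provided SIGPIPE is ignored.

```cpp
#include <cerrno>
#include <csignal>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper, not part of Chromium.
// Returns the errno reported by send() after the peer end of a socketpair has
// been closed, 0 if the send unexpectedly succeeded, or -1 if setup failed.
int ErrnoFromSendAfterPeerClose() {
  // Ignore SIGPIPE so that a failed send reports EPIPE instead of
  // terminating the process.
  std::signal(SIGPIPE, SIG_IGN);

  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return -1;

  // Simulate the receiving end of the pipe going away while the sending
  // process stays alive.
  close(fds[1]);

  const char msg[] = "alloc";
  ssize_t sent = send(fds[0], msg, sizeof(msg), 0);
  int saved_errno = (sent < 0) ? errno : 0;

  close(fds[0]);
  return saved_errno;
}
```

If the sender ignores this error, it would keep shimming malloc while every message is silently dropped, which would match observation 5.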
 
I'm guessing the problem lies somewhere around:
https://cs.chromium.org/chromium/src/chrome/profiling/memlog_receiver_pipe_posix.cc?q=memlog_receiver&sq=package:chromium&l=70

Possibility 1: There's an errno that we haven't accounted for, which causes the profiling process to stop watching the file descriptor in question. In this case, the profiling process should stop reading from the socket, which should cause the browser process to eventually deadlock on write when the socket fills up. Since we haven't observed that, this seems unlikely.
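
For reference, one way a read loop could separate transient errors from fatal ones is sketched below. This is not what memlog_receiver_pipe_posix.cc does; the enum and function names are hypothetical, and which errno values deserve a retry is my assumption.

```cpp
#include <cerrno>
#include <sys/types.h>

// Hypothetical classification of a recv()/recvmsg() result.
enum class ReadOutcome {
  kData,        // bytes_read > 0: hand the bytes to the parser.
  kPeerClosed,  // bytes_read == 0: treated as end of stream.
  kRetry,       // transient error: keep watching the file descriptor.
  kFatal,       // any other error: tear the connection down (and log it).
};

// |err| is the value of errno captured immediately after the read call.
ReadOutcome ClassifyRead(ssize_t bytes_read, int err) {
  if (bytes_read > 0)
    return ReadOutcome::kData;
  if (bytes_read == 0)
    return ReadOutcome::kPeerClosed;
  // EAGAIN and EWOULDBLOCK may or may not be the same value, so check both.
  if (err == EINTR || err == EAGAIN || err == EWOULDBLOCK)
    return ReadOutcome::kRetry;
  return ReadOutcome::kFatal;
}
```

Logging the errno in the kFatal branch would tell us directly whether an unexpected value is what ends the connection.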

Possibility 2: (bytes_read == 0) is true even though the browser hasn't closed its end of the socket. This causes the profiling process to close the socket. According to "man recvmsg" on macOS:

"""
     These calls return the number of bytes received, or -1 if an error occurred.
     For TCP sockets, the return value 0 means the peer has closed its half side of the connection.
"""

Since this isn't a TCP socket, I wonder if a return value of "0" might be possible, even when the socket hasn't closed?
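
A small experiment along these lines, as a sketch only. The function names are hypothetical, and I don't know from this report which socket type the memlog pipe uses. The Linux recv(2) man page says datagram sockets can deliver zero-length datagrams, in which case the return value is 0. I have not verified the datagram case on macOS.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: recv() result on a SOCK_STREAM socketpair after the
// peer has closed its end. Returns -2 if setup failed.
ssize_t RecvAfterStreamPeerClose() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
    return -2;
  close(fds[1]);  // Peer closes; the reader should see end of stream.
  char buf[16];
  ssize_t n = recv(fds[0], buf, sizeof(buf), 0);
  close(fds[0]);
  return n;
}

// Hypothetical helper: recv() result on a SOCK_DGRAM socketpair when the peer
// sends a zero-length datagram and keeps its end open. Returns -2 if setup or
// the send failed.
ssize_t RecvZeroLengthDatagram() {
  int fds[2];
  if (socketpair(AF_UNIX, SOCK_DGRAM, 0, fds) != 0)
    return -2;
  ssize_t n = -2;
  if (send(fds[1], "", 0, 0) == 0) {
    char buf[16];
    n = recv(fds[0], buf, sizeof(buf), 0);  // Peer is still open here.
  }
  close(fds[0]);
  close(fds[1]);
  return n;
}
```

If both helpers return 0, then a zero return alone cannot distinguish "peer closed" from "empty message" on a datagram socket. On a stream socket, a zero return should still mean the peer closed.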

This has happened again. I took a sample of the profiling process while I triggered a dump. Observations:

1) The profiling process is still parsing/handling allocations.
2) The profiling process received a callback from memory_instrumentation for memory_instrumentation::ClientProcessImpl::RequestOSMemoryDump. This in turn must have been triggered by the profiling process itself. So there's an error happening somewhere in here that's cancelling the dump...we just don't know exactly where.
Attachment: sample_of_profiling_process.rtf (118 KB)

Ahh, the profiling process is still able to dump for the GPU process, but not the browser process. It also seems likely that the profiling process is only handling allocations for the GPU process.

Comment 4 by brettw@chromium.org, Sep 20 2017

If it gets a stack or a context string that is too long, I believe it will close the pipe. Maybe that's what's happening?

Comment 5 by bugdroid1@chromium.org, Sep 25 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b5fd11537b9d4b19fb9f10bb18fb9689526bb7d4

commit b5fd11537b9d4b19fb9f10bb18fb9689526bb7d4
Author: Erik Chen <erikchen@chromium.org>
Date: Mon Sep 25 04:52:41 2017

Temporary debugging for unexpected memlog connection error.

OOP HP unexpectedly stops working after running for a few days on macOS. This
temporary logging will help narrow down the cause.

Bug:  765836 
Change-Id: I5d15ee4ea6c7ca9ad4be952da65908ad038aa864
TBR: brettw@chromium.org
Reviewed-on: https://chromium-review.googlesource.com/676467
Reviewed-by: Erik Chen <erikchen@chromium.org>
Commit-Queue: Erik Chen <erikchen@chromium.org>
Cr-Commit-Position: refs/heads/master@{#503981}
[modify] https://crrev.com/b5fd11537b9d4b19fb9f10bb18fb9689526bb7d4/chrome/profiling/memlog_receiver_pipe_posix.cc
[modify] https://crrev.com/b5fd11537b9d4b19fb9f10bb18fb9689526bb7d4/chrome/profiling/memlog_stream_parser.cc

Mergedinto: 778439
Status: Duplicate (was: Assigned)

Comment 7 by bugdroid1@chromium.org, Nov 3 2017
