New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 725907 link

Starred by 3 users

Issue metadata

Status: Archived
Owner: ----
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug-Regression



Sign in to add a comment

traces contain huge number of MessageLoop::RunTask events

Reported by dave.rod...@arm.com, May 24 2017

Issue description

UserAgent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/57.0.2987.133 Safari/537.36
Platform: 9517.0.2017_05_03_1620 (Test Build - davrod01) developer-build elm

Steps to reproduce the problem:
1. capture trace with toplevel category enabled

What is the expected behavior?
Should be able to load trace in trace viewer, but it is too big.

What went wrong?
Trace contains around 70,000 MessageLoop::RunTask events per second and as a result are too big to be loaded in the trace viewer.

Looking at an older trace file, it has around 2000 of these events per second.

The extra events look similar to these three events from the Browser Chrome_IOThread thread, repeated many times.

{'ts': 108264772603, 's': None, 'dur': 6, 'category': None, 'cat': ['toplevel'], 'flow_out': None, 'timestamp': 21705.727, 'bind_id': None, 'args':
{'src_func': 'Watcher', 'src_file': '../../../../../../../home/pdswauto/chrome_root/src/base/files/file_descriptor_watcher_posix.cc'}
, 'tid': 30495, 'id': None, 'pid': 30451, 'tdur': 4, 'ph': 'X', 'name': 'MessagePumpLibevent::OnLibeventNotification', 'scope': None, 'flow_in': None, 'tts': 12005269}
{'ts': 108264772611, 's': None, 'dur': 26, 'category': None, 'cat': ['toplevel'], 'flow_out': None, 'timestamp': 21705.735, 'bind_id': None, 'args':
{'src_func': 'OnFileCanReadWithoutBlocking', 'src_file': '../../../../../../../home/pdswauto/chrome_root/src/base/files/file_descriptor_watcher_posix.cc'}
, 'tid': 30495, 'id': None, 'pid': 30451, 'tdur': 25, 'ph': 'X', 'name': 'MessageLoop::RunTask', 'scope': None, 'flow_in': None, 'tts': 12005276}
{'ts': 108264772640, 's': None, 'dur': 7, 'category': None, 'cat': ['toplevel'], 'flow_out': None, 'timestamp': 21705.764, 'bind_id': None, 'args':
{'src_func': 'StartWatching', 'src_file': '../../../../../../../home/pdswauto/chrome_root/src/base/files/file_descriptor_watcher_posix.cc'}

Did this work before? Yes Not sure, likely no earlier than April 2017.

Chrome version: 60.0.3109.0  Channel: dev
OS Version: 9517.0.2017_05_03_1620
Flash Version: (Disabled)
 

Comment 1 by dave.rod...@arm.com, May 24 2017

Observed on builds dated 2017-05-18 (elm), 2017-05-17 (kevin), 2017-05-03 (elm).
Components: Internals>Network
Owner: xunji...@chromium.org
Status: Assigned (was: Unconfirmed)
Assigning to xunjieli who added that trace event late 2016.
Labels: Needs-Feedback Performance
Thanks for filing the bug. 
Are these extra events from Chrome_IOThread? 
Could you attach a trace file?
Cc: ssid@chromium.org
Yes, these are in the Browser Chrome_IOThread. Please see attached file for an example (from an elm test build dated 2017-05-09).
elm_wikipedia_20170509.trace.gz
7.1 MB Download
Please let me know if I can help e.g. by providing additional traces or other info.
Cc: xunji...@chromium.org
Owner: ----
Status: Untriaged (was: Assigned)
There are a large number of MessageLoop::RunTasks from file_descriptor_watcher_posix.cc StartWatching(), Watcher() and OnFileCanReadWithoutBlocking() on Chrome_IOThread. These three methods seem to repeat many times. 
 
I am unable to reproduce this on Linux. This might be ChromeOS specific. 
dave.rodgman@: Could you follow the instructions at https://www.google.com/url?q=https://sites.google.com/a/chromium.org/dev/for-testers/providing-network-details to get a net-export trace?
Thank you.
Please see attached. This was captured on elm by:

- starting recording a trace (toplevel category only)
- starting the net-export capture
- switching to a tab showing Wikipedia for ~5 seconds
- stopping the net-export capture
chrome-net-export-log.json
56.8 KB View Download

Comment 9 by stephen....@arm.com, Jun 14 2017

It would appear that the following commit is responsible for us seeing so many of these 'Watcher', 'StartWatching', and 'OnFileCanReadWithoutBlocking' MessageLoop::RunTask events on the Browser Chrome_IOThread:

https://chromium.googlesource.com/chromium/src/+/7092984694bd56fc05488582a93a0fda0c8d7f29
Cc: -xunji...@chromium.org -ssid@chromium.org lhchavez@chromium.org
Components: -Internals>Network Speed>Tracing Platform>ARC
Thanks for the netlog. This is not a network stack issue.
r468306 looks possible, but I am not familiar with the code. +lhchavez who might know more about the change. Over to Speed>Tracing and Platform>ARC for triage. 

Hi, has anyone been able to look into this?

I currently revert the mentioned commit (7092984694...), and this resolves the issue. Is this a problem with the way that I build test images of ChromeOS? i.e., if they don't have ARC correctly configured, then using tracing is generating this extra traffic on the Browser IO thread as the tracing system attempts to communicate with the ARC container? I'm just wondering why no one else seems to have noticed this problem.
It would appear to be a problem limited to the way in which we build CrOS test images - i.e., we don't insert Google API keys? This means that ARC appears to be available, but the call to 

mojom::TracingInstance* tracing_instance =
      ARC_GET_INSTANCE_FOR_METHOD(arc_bridge_service_->tracing(), StartTracing);

in chrome/browser/chromeos/arc/tracing/arc_tracing_bridge.cc

doesn't receive the tracing instance. The write socket is closed, but the read socket (which has already been given to the IOThread to watch) is not, and so the IOThread wastes its time receiving notification that there is something to read, and then reading nothing, because the corresponding socket has been closed.

As such, the simplest solution seemed to be to swap the order - initialise the mojo tracing instance with the write socket first, and if that succeeds, then pass the read socket to the IOThread to watch.

Submitted fix at https://chromium-review.googlesource.com/c/596970
Project Member

Comment 13 by bugdroid1@chromium.org, Aug 7 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cdc46a1983b188bc795c1db0c6799ae77e7c8db2

commit cdc46a1983b188bc795c1db0c6799ae77e7c8db2
Author: Stephen Kyle <stephen.kyle@arm.com>
Date: Mon Aug 07 18:38:29 2017

arc/tracing: ensure ARC tracing is set up only when fully available

In test builds without API keys, it is possible to have a chromiumos
system where ARC is installed on the system, but the mojo ARC tracing
system doesn't exist.

In order to set up ARC tracing, a socketpair is created, and the write
socket is given to the ArcTracingBridge, and the read socket is given to
the ArcTracingReader, which runs on the Browser's IOThread.

Currently if the ArcTracingBridge fails (because the mojo ARC tracing
system doesn't exist), then the write socket is closed, but nothing
stops the ArcTracingReader. It then consumes all the Browser IOThread's
time receiving OnFileCanReadWithoutBlocking calls because the write
socket has been closed.

In the ArcTracingAgent, reverse the current order to set up the
ArcTracingBridge before setting up the ArcTracingReader.  Then only set
up the ArcTracingReader if the ArcTracingBridge successfully started
tracing.

BUG= 725907 
TEST=Build cros+chrome (for example, kevin) without any API keys.
     Run a Telemetry benchmark that includes the 'toplevel' trace
     category, and note that the Browser IOThread is filled with
     thousands of OnFileCanReadWithoutBlocking and StartWatching events.
     Repeat after applying this patch, and note the Browser IOThread is
     now much quieter.

R=lhchavez@chromium.org

Change-Id: I0db26527bd531d806e2fce2db6341178c7e24f10
Reviewed-on: https://chromium-review.googlesource.com/596970
Reviewed-by: Alex Moshchuk <alexmos@chromium.org>
Reviewed-by: Luis Hector Chavez <lhchavez@chromium.org>
Commit-Queue: Adenilson Cavalcanti <cavalcantii@chromium.org>
Cr-Commit-Position: refs/heads/master@{#492368}
[modify] https://crrev.com/cdc46a1983b188bc795c1db0c6799ae77e7c8db2/chrome/browser/chromeos/arc/tracing/arc_tracing_bridge.cc
[modify] https://crrev.com/cdc46a1983b188bc795c1db0c6799ae77e7c8db2/chrome/browser/chromeos/arc/tracing/arc_tracing_bridge.h
[modify] https://crrev.com/cdc46a1983b188bc795c1db0c6799ae77e7c8db2/content/public/browser/arc_tracing_agent.cc
[modify] https://crrev.com/cdc46a1983b188bc795c1db0c6799ae77e7c8db2/content/public/browser/arc_tracing_agent.h

Cc: stephen....@arm.com
Owner: xunji...@chromium.org
Status: Assigned (was: Untriaged)
Did the CL in comment#13 fix this issue?
Cc: cavalcantii@chromium.org alex...@chromium.org
Owner: ----
Status: Untriaged (was: Assigned)
(unassigning myself.. I helped to triage this bug but this is not mine.)

Folks cc-ed (lhchavez@chromium.org, stephen.kyle@arm.com, alexmos@chromium.org, cavalcantii@chromium.org), did the CL fix this issue?

#14: Yes, the CL fixed this issue.
Status: Fixed (was: Untriaged)

Comment 18 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)

Sign in to add a comment