New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 907069 link

Starred by 1 user

Issue metadata

Status: Verified
Owner:
Closed: Jan 11
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 913878
issue 907067



Sign in to add a comment

Perfetto: Support flushing or recovering data from worker threads

Project Member Reported by eseckler@chromium.org, Nov 20

Issue description

We would currently have to flush worker threads (i.e. commit their SMB chunks) after each trace event to ensure that their data isn't lost (perfetto only provides committed SMB chunks to consumers), but this is too inefficient to do.

We can't simply flush once stopping, because we can't post messages to the individual worker threads, as they don't have a MessageLoop (and thus they are not part of TraceLog's thread_task_runners_ [1]).

This includes TaskScheduler workers as well as Compositor workers (content::CategorizedWorkerPoolThread), and other non-MessageLoop threads.

Can we either a) support recovering this uncommitted data on perfetto service side or b) support a generic way to flush all these threads on demand?

[1] https://cs.chromium.org/chromium/src/base/trace_event/trace_log.cc?type=cs&sq=package:chromium&g=0&l=417
 
Status: Available (was: Untriaged)
Description: Show this description
Blocking: 907067
After some discussion with hjd@ and skyostil@ it sounds like a perfetto service-side solution would be most suitable to this problem. The service would have to scrape the remaining uncommitted chunks out of the shared memory buffers once tracing is disabled (i.e. after all data sources have acked stop).
Cc: hjd@chromium.org
Description: Show this description
Blocking: 900935
Owner: eseckler@chromium.org
Status: Assigned (was: Available)
Any update on this? This is causing redness on the waterfall because we're not writing clock sync markers that are necessary to import and sync the subtraces and compute trace-based metrics for Telemetry.
We're actively working on scraping buffers from worker threads in the perfetto service. It's technically complex (we're basically want to steal data in shared memory from untrusted producers, so we need to be careful...), so likely will stretch into Q1. You can follow b/73828976 which tracks the work in perfetto.

Dunno if there's a work around for the clock sync marker issue in the meantime. How are they emitted and by whom? Oystein, do you have any suggestions?
If clock syncs don't happen super frequently (do they?) maybe we could force a flush on the recording thread immediately after?
What's the current impact of this? Looking at the https://ci.chromium.org/buildbot/internal.client.clank/integration-android-low-end-phone/ waterfall for example, the flakiness is super-rare:

One run:

  PASSED  ] 1199 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ]  memory.long_running_dual_browser_test/http_search_yahoo_com_search__ylt_p_google@{'phase': 'on_chrome'}

1 FAILED TEST

Another run:

[  PASSED  ] 1199 tests.
[  FAILED  ] 1 test, listed below:
[  FAILED  ]  memory.long_running_dual_browser_test/http_en_m_wikipedia_org_wiki_Science@{'phase': 'on_chrome'}

1 FAILED TEST

i.e. one random flake out of 1200 memory.long_running_dual_browser_test tests each run.

The tree redness is annoying and causes extra sheriff attention drain in the meantime, of course, so I agree a quick fix (if we can find one) would be nice.

Random idea: Can we just emit the clocksync markers twice? One at the beginning of the trace, and one at the end (like we currently do)?
Blocking: 913878
Blocking: -900935
#12: Looks like we currently emit at the beginning which is biting us here. See issue 900935#c24.
Project Member

Comment 16 by bugdroid1@chromium.org, Jan 9

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/0f6182210893904e32d5d623de12cf0f2a2912b0

commit 0f6182210893904e32d5d623de12cf0f2a2912b0
Author: Eric Seckler <eseckler@chromium.org>
Date: Wed Jan 09 11:10:30 2019

perfetto: Enable SMB scraping

Enable scraping of shared memory buffers used by the perfetto tracing
service. This should help reduce missing data in traces. Because of
that, this change is expected to cause some changes in perf waterfall
metrics.

Bug:  907069 
Change-Id: I3a4592ffa9d5113885d5859d87a2f5708255ee26
Reviewed-on: https://chromium-review.googlesource.com/c/1402462
Commit-Queue: Eric Seckler <eseckler@chromium.org>
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Cr-Commit-Position: refs/heads/master@{#621112}
[modify] https://crrev.com/0f6182210893904e32d5d623de12cf0f2a2912b0/services/tracing/perfetto/perfetto_service.cc

Status: Fixed (was: Assigned)
I'm gonna optimistically close this. Let's see how many regressions we'll see :D
Project Member

Comment 18 by bugdroid1@chromium.org, Jan 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3359b08464e431c49494e8678125b1063d1bbcc5

commit 3359b08464e431c49494e8678125b1063d1bbcc5
Author: Eric Seckler <eseckler@chromium.org>
Date: Thu Jan 10 17:06:59 2019

Revert "perfetto: Enable SMB scraping"

This reverts commit 0f6182210893904e32d5d623de12cf0f2a2912b0.

Reason for revert: No effect yet because trace writer registration is still missing in ProducerClient. Will reland once that's fixed, so that any performance regressions are attributed correctly.

Original change's description:
> perfetto: Enable SMB scraping
>
> Enable scraping of shared memory buffers used by the perfetto tracing
> service. This should help reduce missing data in traces. Because of
> that, this change is expected to cause some changes in perf waterfall
> metrics.
>
> Bug:  907069 
> Change-Id: I3a4592ffa9d5113885d5859d87a2f5708255ee26
> Reviewed-on: https://chromium-review.googlesource.com/c/1402462
> Commit-Queue: Eric Seckler <eseckler@chromium.org>
> Reviewed-by: Primiano Tucci <primiano@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#621112}

TBR=primiano@chromium.org,eseckler@chromium.org
# Skipping presubmit as this is a safe change without any effect.

No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  907069 
Change-Id: Id3a5a3fe8864caa2e95cad7051d2faef0cd343f5
Reviewed-on: https://chromium-review.googlesource.com/c/1405309
Commit-Queue: Eric Seckler <eseckler@chromium.org>
Reviewed-by: Eric Seckler <eseckler@chromium.org>
Cr-Commit-Position: refs/heads/master@{#621615}
[modify] https://crrev.com/3359b08464e431c49494e8678125b1063d1bbcc5/services/tracing/perfetto/perfetto_service.cc

Status: Started (was: Fixed)
Project Member

Comment 20 by bugdroid1@chromium.org, Jan 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/555e6b16175aabd036ec7d9780c730ae91f9031c

commit 555e6b16175aabd036ec7d9780c730ae91f9031c
Author: Eric Seckler <eseckler@chromium.org>
Date: Thu Jan 10 19:13:57 2019

perfetto: Add support for registering TraceWriters to ProducerHost

This is necessary to support SMB scraping.

Bug:  907069 
Change-Id: I6368dfa10566e1331948d35a4807ff2475e11560
Reviewed-on: https://chromium-review.googlesource.com/c/1405330
Reviewed-by: Primiano Tucci <primiano@chromium.org>
Reviewed-by: Tom Sepez <tsepez@chromium.org>
Commit-Queue: Eric Seckler <eseckler@chromium.org>
Cr-Commit-Position: refs/heads/master@{#621685}
[modify] https://crrev.com/555e6b16175aabd036ec7d9780c730ae91f9031c/services/tracing/perfetto/producer_host.cc
[modify] https://crrev.com/555e6b16175aabd036ec7d9780c730ae91f9031c/services/tracing/perfetto/producer_host.h
[modify] https://crrev.com/555e6b16175aabd036ec7d9780c730ae91f9031c/services/tracing/public/cpp/perfetto/producer_client.cc
[modify] https://crrev.com/555e6b16175aabd036ec7d9780c730ae91f9031c/services/tracing/public/mojom/perfetto_service.mojom

Project Member

Comment 21 by bugdroid1@chromium.org, Jan 10

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/884929d17a33fadadde7c772e08128e9191c56b6

commit 884929d17a33fadadde7c772e08128e9191c56b6
Author: Eric Seckler <eseckler@chromium.org>
Date: Thu Jan 10 20:15:30 2019

Reland "perfetto: Enable SMB scraping"

This is a reland of 0f6182210893904e32d5d623de12cf0f2a2912b0

Now that the TraceWriters are actually registered, this will
actually have some effect :)

Original change's description:
> perfetto: Enable SMB scraping
>
> Enable scraping of shared memory buffers used by the perfetto tracing
> service. This should help reduce missing data in traces. Because of
> that, this change is expected to cause some changes in perf waterfall
> metrics.
>
> Bug:  907069 
> Change-Id: I3a4592ffa9d5113885d5859d87a2f5708255ee26
> Reviewed-on: https://chromium-review.googlesource.com/c/1402462
> Commit-Queue: Eric Seckler <eseckler@chromium.org>
> Reviewed-by: Primiano Tucci <primiano@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#621112}

TBR=primiano@chromium.org

Bug:  907069 
Change-Id: I63c890671ffe12a0d466a7b8667dd81c685b2036
Reviewed-on: https://chromium-review.googlesource.com/c/1405511
Reviewed-by: Eric Seckler <eseckler@chromium.org>
Commit-Queue: Eric Seckler <eseckler@chromium.org>
Cr-Commit-Position: refs/heads/master@{#621722}
[modify] https://crrev.com/884929d17a33fadadde7c772e08128e9191c56b6/services/tracing/perfetto/perfetto_service.cc

Status: Verified (was: Started)
This seems to work now. For example, this trace [1] shows data from two worker threads in the last few ms before the trace is completed. Whereas a trace before the change [2] shows that the last few ms from the TaskSchedulerSingleThreadBackgroundBlocking1 task runner are missing.

[1] https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/balls_svg_animations_2019-01-11_03-29-51_19152.html
[2] https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/balls_svg_animations_2019-01-10_21-56-24_40113.html

Sign in to add a comment