New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 879884 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Sep 17
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

rendering.desktop/text_hover_50000_pixels_per_second fails on win-10-perf

Project Member Reported by nednguyen@chromium.org, Sep 2

Issue description

Failure log:
Duplicate draw events: 5
    at Object.addPipelineHistograms (/tracing/metrics/rendering/pipeline.html:127:19)
    at new renderingMetric (/tracing/metrics/rendering/rendering_metric.html:90:26)
    at runMetrics (c:\b\s\w\ir\third_party\catapult\tracing\tracing\metrics\metric_map_function.html:61:16)
    at metricMapFunction (c:\b\s\w\ir\third_party\catapult\tracing\tracing\metrics\metric_map_function.html:190:24)
    at Object.mapSingleTrace (/tracing/mre/map_single_trace.html:39:7)
    at eval (c:\b\s\w\ir\third_party\catapult\tracing\tracing\mre\map_single_trace_cmdline.html:61:18)
    at Object.runAndConvertErrorsToFailures (/tracing/mre/map_single_trace.html:24:10)
    at mapSingleTraceWithResult (c:\b\s\w\ir\third_party\catapult\tracing\tracing\mre\map_single_trace_cmdline.html:52:12)
    at Object.mapSingleTraceMain (c:\b\s\w\ir\third_party\catapult\tracing\tracing\mre\map_single_trace_cmdline.html:77:20)
    at eval (c:\b\s\w\ir\third_party\catapult\tracing\tracing\mre\map_single_trace_cmdline.html:90:15)

https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/1bf9ebde-ae3a-11e8-a992-d89ef387258f


Sadrul@: can you own this as well?
 
Ping
Status: Started (was: Untriaged)
Sample trace from a failing run: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/text_hover_20000_pixels_per_second_2018-09-10_20-15-14_90819.html

It looks like there are overlapping DrawAndSwap()s happening (see in VizCompositorThread in Gpu process): one DrawAndSwap() starts at 2381ms, and while it's waiting for the async-ack, it does another DrawAndSwap() at 2382ms. I seem to remember we avoid doing a DrawAndSwap() while still waiting for the ack for the previous DrawAndSwap().

+sunnyps@: fyi. Anything you can think of to investigate for this? I am continuing to look ...
Cc: sunn...@chromium.org
Some initial findings: it looks like technically, there can be overlapping DrawAndSwap(), depending on |max_pending_swaps_|, which is determined by the |OutputSurface::Capabilities::max_frames_pending| (which defaults to 1 [0]). GLOutputSurfaceBufferQueue sets |max_frames_pending| to 2 [1], but it looks like that is used only on ozone or mac, so this shouldn't be affecting Windows. GLOutputSurface sets |max_frames_pending| to |gpu::Capabilities::num_surface_buffers - 1| [2]. |gpu::Capabilities::num_surface_buffers| defaults to 2 ([3, 4]), and is set to 3 only for chrome-cast when using triple-buffering [5]. So as far as I can tell, there should still be no overlapping DrawAndSwap() on Windows.

[0] https://cs.chromium.org/chromium/src/components/viz/service/display/output_surface.h?l=47
[1] https://cs.chromium.org/chromium/src/components/viz/service/display_embedder/gl_output_surface_buffer_queue.cc?type=cs&sq=package:chromium&g=0&l=38
[2] https://cs.chromium.org/chromium/src/components/viz/service/display_embedder/gl_output_surface.cc?type=cs&sq=package:chromium&g=0&l=41
[3] https://cs.chromium.org/chromium/src/gpu/command_buffer/common/capabilities.h?l=128
[4] https://cs.chromium.org/chromium/src/ui/gl/gl_surface.cc?l=239
[5] https://cs.chromium.org/chromium/src/ui/ozone/platform/cast/gl_surface_cast.cc?l=130
Cc: fsam...@chromium.org kylec...@chromium.org samans@chromium.org
/cc some more folks who might have ideas.
Hm, looks like there may be two browser windows submitting compositor-frames, which would also explain overlapping DrawAndSwap()s. For example, see "LayerTreeHost::DoUpdateLayers" event in CrBrowserMain at ~2376ms with source_frame_number = 83, followed by another DoUpdateLayers event at ~2381ms, with source_frame_number = 0, which likely means a new browser window (or some other toplevel widget with its own ui::Compositor) opened and is submitting frames. Hrm ...

Is it possible to get screenshots at the time of failure?
Latest failure is in https://ci.chromium.org/buildbot/chromium.perf/win-10-perf/575

Sadly there are two reasons why we don't have screenshot here:
1) We don't support platform screenshots on windows yet, just the devtool screenshot
2) We currently don't take screenshot on metrics failure. Note that at this point, the browser under test is already closed. 
The second window could be the URL popup when you hover the mouse over a link, that is a separate window with a separate ui::Compositor on Windows.
It could probably be a tooltip too :(

I will put up a fix for the benchmark to ignore the second window for now, instead of fataling out. But perhaps we should consider ignoring native input events during the tests.
Shouldn't those async events have different IDs?
Here:

https://cs.chromium.org/chromium/src/components/viz/service/display/display.cc?l=288

Instead of increasing swapped_trace_id_ and using it later in trace events, shouldn't we do the following?

int local_swapped_trace_id = GetNextSwappedTraceId(); // <- increments swapped_trace_id_ and returns it.

// later traces in DrawAndSwap()
TRACE_EVENT...(..., local_swapped_trace_id);
Do you mean |swapped_trace_id_| becomes a static variable, so that multiple instances of Display doesn't end up re-using the ids?

That becomes trickier with |last_acked_trace_id_| etc.
Ah, I thought it's a reentrancy issue. If the issue is that we have multiple display instances, then can we have a static variable that increases only when you create an instance of Display, to build a display number, and then scope trace event IDs with that display number so that they are not mixed with events from other displays? Or, a little bit easier but hackier way is to scope IDs by the memory address of the display, maybe?
Project Member

Comment 15 by bugdroid1@chromium.org, Sep 12

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1647a8ece33b4b8bba5d2d59ed3b3c8f0d1da0da

commit 1647a8ece33b4b8bba5d2d59ed3b3c8f0d1da0da
Author: Sadrul Habib Chowdhury <sadrul@chromium.org>
Date: Wed Sep 12 20:27:41 2018

viz: Assign Display-specific trace ids.

Assign each Display a separate starting point for the trace ids, so
that the trace-ids do not all start from zero, which makes it difficult
to associate the trace-events with the Display intances.

BUG= 879884 

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel
Change-Id: I79146409a5acff362c348fb32493a247f6736f72
Reviewed-on: https://chromium-review.googlesource.com/1222586
Reviewed-by: Fady Samuel <fsamuel@chromium.org>
Commit-Queue: Sadrul Chowdhury <sadrul@chromium.org>
Cr-Commit-Position: refs/heads/master@{#590805}
[modify] https://crrev.com/1647a8ece33b4b8bba5d2d59ed3b3c8f0d1da0da/components/viz/service/display/display.cc
[modify] https://crrev.com/1647a8ece33b4b8bba5d2d59ed3b3c8f0d1da0da/components/viz/service/display/display.h

Project Member

Comment 16 by bugdroid1@chromium.org, Sep 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c741eb8100cf931d3c4a6ffec135dcbc1f42143e

commit c741eb8100cf931d3c4a6ffec135dcbc1f42143e
Author: Sadrul Habib Chowdhury <sadrul@chromium.org>
Date: Fri Sep 14 01:52:43 2018

viz: Fix trace-id for canceled draw events.

It is possible that while waiting for the ack of a previous draw, another
draw is attempted and aborted. In such cases, the trace-event for the
first draw would incorrectly be terminated. Fix that bug by fixing the
trace-id for such canceled draw events. Also, instrument such events so
that telemetry tests can identify such events.

Also, fix the trace-id to be int64_t in SurfaceAggregator, which was
mistakenly left unchanged in crrev.com/590805

BUG=883592,  879884 

Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel
Change-Id: Id39bf22559bd9f02e2765e69cc32a8cb1e9c62a2
Reviewed-on: https://chromium-review.googlesource.com/1223326
Reviewed-by: Ehsan Chiniforooshan <chiniforooshan@chromium.org>
Commit-Queue: Sadrul Chowdhury <sadrul@chromium.org>
Cr-Commit-Position: refs/heads/master@{#591259}
[modify] https://crrev.com/c741eb8100cf931d3c4a6ffec135dcbc1f42143e/components/viz/service/display/display.cc
[modify] https://crrev.com/c741eb8100cf931d3c4a6ffec135dcbc1f42143e/components/viz/service/display/surface_aggregator.cc
[modify] https://crrev.com/c741eb8100cf931d3c4a6ffec135dcbc1f42143e/components/viz/service/display/surface_aggregator.h

Status: Fixed (was: Started)

Sign in to add a comment