New issue
Advanced search Search tips

Issue 871955 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Nov 14
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug

Blocking:
issue 602688



Sign in to add a comment

TimelineBasedMeasurementTest.testFirstPaintMetricSmoke failing with passthrough command decoder

Project Member Reported by kbr@chromium.org, Aug 7

Issue description

This test:
telemetry.web_perf.timeline_based_page_test_unittest.TimelineBasedMeasurementTest.testFirstPaintMetricSmoke

is failing when attempting to switch Chrome from its current command buffer implementation to the new pass-through implementation to be used with ANGLE, in Issue 602688.

Here's one example failure:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/158921

from this CL:
https://chromium-review.googlesource.com/1163989

Here's the failing shard:
https://chromium-swarm.appspot.com/task?id=3f2ee648b5108610&refresh=10&show_raw=1

and the failure excerpt:
[281/294] telemetry.web_perf.timeline_based_page_test_unittest.TimelineBasedMeasurementTest.testFirstPaintMetricSmoke failed unexpectedly 14.4080s:
  Chrome build location for linux_x86_64 not found. Browser will be run without Flash.
  Traceback (most recent call last):
    File "/b/s/w/ir/third_party/catapult/telemetry/telemetry/web_perf/timeline_based_page_test_unittest.py", line 231, in testFirstPaintMetricSmoke
      self.assertGreater(v_ttfcp_max[0].value, 0)
  AssertionError: None not greater than 0


I'm struggling to understand how the timeToFirstContentfulPaint histogram is generated, which the test is querying here:
https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/web_perf/timeline_based_page_test_unittest.py?q=testFirstPaintMetricSmoke&sq=package:chromium&g=0&l=212

It looks like it's produced here:
https://cs.chromium.org/chromium/src/components/data_reduction_proxy/content/browser/data_reduction_proxy_pingback_client_impl.cc?type=cs&q=time_to_first_contentful_paint&sq=package:chromium&g=0&l=66

but I haven't traced back through all of the callers to understand how the page load metrics are generated. Can anyone from the Speed team help us understand whether there are some trace events which are missing in the passthrough command decoder, for example?

 
Cc: sullivan@chromium.org tdres...@chromium.org kouhei@chromium.org
+speed metrics folks: can you help?
Cc: penghuang@chromium.org
Correction: we're seeing failures both with the pass-through command decoder, and when running on top of SwiftShader.

With just the pass-through command decoder:
https://chromium-review.googlesource.com/c/chromium/src/+/1163989/3

the tests run on top of the machine's default Mesa installation:
https://ci.chromium.org/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng/158564
https://chromium-swarm.appspot.com/task?id=3f2dc51c959e4d10&refresh=10&show_raw=1

    gl_renderer         : ANGLE (VMware, Inc. Gallium 0.4 on llvmpipe (LLVM 3.4, 256 bits) OpenGL 2.1)

When running on top of SwiftShader, the system falls back to using the validating command decoder.

It looks like this test only passes when using the "real" GL driver on the machine?

Was this related to the hookup of presentation timestamps throughout Chrome's GPU stack? CC'ing penghuang.

Cc: sadrul@chromium.org
I am not familiar to this test. Maybe sadrul@ know anything about it.
Owner: sadrul@chromium.org
Status: Assigned (was: Untriaged)
Sadrul, can you dig in? My guess is this has to do with using presentation timestamps.
When using the --disable-gpu flag, the test passes.  I think it makes sense to run telementry_unittests with that flag given they are on VMs.
The first-paint metrics do use the presentation-time to report the metrics. Does presentation-time not work with passthrough command decoder?

The presentation-time is requested from the renderer here: https://cs.chromium.org/chromium/src/content/renderer/gpu/layer_tree_view.cc?type=cs&g=0&l=93

If the test works with --disable-gpu, is that a proper fix to run the test with the flag? It would still be useful to understand the failure though I think.
sadrul@ could you please provide more pointers into the code about where request_presentation_feedback is honored, and which CompositorFrameSinkClient is typically called back with the gfx::PresentationFeedback?

I get lost around the call to Surface::QueueFrame here:
https://cs.chromium.org/chromium/src/components/viz/service/frame_sinks/compositor_frame_sink_support.cc?type=cs&g=0&l=429

and am not sure whether we aren't calling back the client here:
https://cs.chromium.org/chromium/src/components/viz/service/frame_sinks/compositor_frame_sink_support.cc?type=cs&g=0&l=477

or whether something else is going wrong in the gathering of presentation feedback. Can you show us where it is gathered in response to request_presentation_feedback being set in the compositor frame metadata?

Since presentation feedback seems to not be working *either* with the passthrough command decoder *or* with SwiftShader, I'm guessing that there's both a missing EGL extension (maybe not advertised from SwiftShader) and a missing feature in the passthrough command decoder.

If we decide to go ahead with disabling GPU compositing on this test, the CL to do it is here: https://chromium-review.googlesource.com/c/chromium/src/+/1167894
I think we should disable GPU compositing for this test suite now, and perhaps try to re-enable it. Could you please send out https://chromium-review.googlesource.com/1167894 for official review? Just LGTM'd it proactively.

GLSurfacePresentationHelper responds with the presentation-feedback here: https://cs.chromium.org/chromium/src/ui/gl/gl_surface_presentation_helper.cc?sq=package:chromium&dr=CSs&g=0&l=216

This is a callback to PassThroughImageTransportSurface::BufferPresented, which is registered here: https://cs.chromium.org/chromium/src/gpu/ipc/service/pass_through_image_transport_surface.cc?type=cs&g=0&l=57

This then goes to GLES2CommandBufferStub:BufferPresented, which sends an IPC here: https://cs.chromium.org/chromium/src/gpu/ipc/service/gles2_command_buffer_stub.cc?type=cs&g=0&l=403

The IPC is handled in CommandBufferProxyImpl::OnBufferPresented here: https://cs.chromium.org/chromium/src/gpu/ipc/client/command_buffer_proxy_impl.cc?type=cs&g=0&l=790

This then calls into GLES2Implementation::OnSwapBufferPresented, which runs a callback here: https://cs.chromium.org/chromium/src/gpu/command_buffer/client/gles2_implementation.cc?type=cs&g=0&l=411

This callback was registered previously in GLES2Implementation::Swap here: https://cs.chromium.org/chromium/src/gpu/command_buffer/client/gles2_implementation.cc?l=4738

The callback comes from OutputSurface, specifically GpuBrowserCompositorOutputSurface::SwapBuffers here: https://cs.chromium.org/chromium/src/content/browser/compositor/gpu_browser_compositor_output_surface.cc?g=0&l=131

When this callback runs, it runs the [potentially many] presentation-callbacks that have been stored in viz::Display here: https://cs.chromium.org/chromium/src/components/viz/service/display/display.cc?g=0&l=489

These callbacks come from the various viz::Surface instances, which got the callback from CompositorFrameSinkSupport in the previously linked code (in comment #7).

I am not sure which of these stages would be affected by the passthrough command-decoder.
Thank you sadrul@ for the thorough explanation. (Is it possible to write up some of this for posterity, maybe in src/docs/gpu/presentation_feedback.md?)

Looking through this code, it seems that gl_surface_presentation_helper.cc is the area most likely to have differences in behavior:
https://cs.chromium.org/chromium/src/ui/gl/gl_surface_presentation_helper.cc?sq=package:chromium&dr=CSs&g=0&l=216

Since the passthrough command decoder implies using ANGLE, my best guess about the common failure mode between (passthrough + ANGLE) and (validating + SwiftShader) is that both ANGLE and SwiftShader are missing some extension causing either GPUTiming, GLFence or the vsync querying APIs to fail. It looks like there isn't an EGL-specific implementation of VSyncProvider::SupportGetVSyncParametersIfAvailable:

https://cs.chromium.org/chromium/src/ui/gfx/vsync_provider.h?sq=package:chromium&dr=CSs&g=0&l=36

so I'm guessing it's either GPUTiming or GLFence which aren't supported.

Project Member

Comment 12 by bugdroid1@chromium.org, Aug 9

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d1f6e47c3337853fb41fa5ea496276d727a3145b

commit d1f6e47c3337853fb41fa5ea496276d727a3145b
Author: Geoff Lang <geofflang@chromium.org>
Date: Thu Aug 09 00:10:20 2018

Disable GPU compositing for telemetry_unittests.

These tests are run on VMs without real GPU access.  This avoids trying
to use the vmware GL drivers.

BUG= 871955 
BUG=602688

Change-Id: If8cbd19c80b3bb29788abab342c5c14a2d482d5f
Reviewed-on: https://chromium-review.googlesource.com/1167894
Commit-Queue: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Kenneth Russell <kbr@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Cr-Commit-Position: refs/heads/master@{#581728}
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.clang.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.fyi.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.linux.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.mac.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.swarm.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/chromium.win.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/client.v8.chromium.json
[modify] https://crrev.com/d1f6e47c3337853fb41fa5ea496276d727a3145b/testing/buildbot/test_suites.pyl

Owner: geoffl...@chromium.org
Status: Fixed (was: Assigned)

Comment 14 by benhenry@google.com, Jan 16 (6 days ago)

Components: Test>Telemetry

Comment 15 by benhenry@google.com, Jan 16 (6 days ago)

Components: -Speed>Telemetry

Sign in to add a comment