
Issue 893289

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 2
Type: Bug




Black flashes due to direct composition

Project Member Reported by brucedaw...@chromium.org, Oct 8

Issue description

On my home laptop I recently (a week ago?) started seeing black flashes in Chrome. The first flash observed was when playing a video in full-screen - there was a reproducible black flash when the UI elements faded out, presumably during the transition to the true full-screen display mode.

After that I started noticing the black flashes happening somewhat randomly on https://www.smbc-comics.com/ and https://twitter.com. I don't know if those sites were doing something special to trigger these black flashes or if I just spend most of my time on them.

I blame direct composition because of the flash when going to full screen and because when I added --disable-direct-composition to my command line arguments the problem went away.

I have not double checked that the problem returns when I remove the flag.
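
A minimal sketch of the A/B check described above: build one Chrome command line with `--disable-direct-composition` and one without, so the two runs differ only in that switch. The install path and the helper name `chrome_command` are assumptions for illustration; only the switch itself comes from this report.

```python
from typing import List

# Switch quoted in the report; it turns off direct composition.
DISABLE_DIRECT_COMPOSITION = "--disable-direct-composition"


def chrome_command(chrome_path: str, disable_direct_composition: bool) -> List[str]:
    """Build an argv list for launching Chrome with or without the switch."""
    args = [chrome_path]
    if disable_direct_composition:
        args.append(DISABLE_DIRECT_COMPOSITION)
    return args


# Hypothetical install path; adjust for the machine under test.
CHROME_EXE = r"C:\Program Files (x86)\Google\Chrome\Application\chrome.exe"

with_flag = chrome_command(CHROME_EXE, disable_direct_composition=True)
without_flag = chrome_command(CHROME_EXE, disable_direct_composition=False)

# To launch a run, pass one of the lists to subprocess.Popen
# (after `import subprocess`); nothing is launched here.
print(with_flag)
print(without_flag)
```

Running Chrome once with each list and browsing the same pages would confirm whether the flashes return when the switch is removed.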

The flashing happens frequently enough to be quite annoying and therefore important.

I'm happy to do whatever additional diagnosis would be useful.

 
Cc: zmo@chromium.org
A bisect would be useful.  There were some changes to swap chain resizing logic recently that might've caused this: https://chromium-review.googlesource.com/c/chromium/src/+/1235316

I'll try reproducing on the Kaby Lake system that I have.
I couldn't reproduce on a Lenovo Yoga 710 even with NV12 enabled (--enable-features=DirectCompositionPreferNV12Overlays), although I did notice two other issues:
1) Video controls do not behave correctly, regardless of whether direct composition is enabled.  The fullscreen toggle doesn't let you go back to windowed, and video doesn't resize correctly when switching from fullscreen to windowed.
2) I don't see overlays being used for YouTube even when fullscreen.  GPU.DirectComposition.DCLayerResult is either 1 (unsupported quad) or 4 (occluded).  I don't see any obvious element on top of the video, but it's possible some new UI element was added to YouTube or we're detecting something invisible as occluding.

Can you attach chrome://gpu contents after you see flashing?  It contains GPU process log output so it might be useful.

Also, does the entire window flash or only some parts?


I removed the switch and then went back and forth between normal and full-screen mode while playing a video several times but never saw the flash.

I then browsed through smbc.com for a bit and quickly hit a flash. I then went to chrome://gpu, copied the contents to the clipboard, and then attached it to this comment.

I doubt the specifics matter but I've been going backwards through old smbc comics and I'd made it to this one when the most recent flash happened:

https://www.smbc-comics.com/comic/2013-09-17

I tried getting a video screen capture but the flash didn't happen while capturing, either through bad luck or because the capturing changes the behavior.

I believe the flash is of the entire web page.

chrome_gpu.txt
59.9 KB
I have the exact same GPUs on my new company laptop as yours. Is your laptop a Dell?

I'll see if I can reproduce on my side.
My laptop is a Lenovo P51, purchased around April 2017.

The timing of the bug appearing doesn't seem to correlate to a Chrome update, Windows update, or driver update, so I'm a bit confused about what's going on.

There might have been a Finch experiment change, though.
First, that's M69, so the swap chain size CL can't be the culprit.

Second, I clicked "random" on smbc.com more than 50 times and couldn't repro. I have exactly the same dual GPUs on my Dell.

By the way, I have almost exactly the same error messages in about:gpu as you do. We need to look into getting rid of this "noise", but I think it is irrelevant to this bug.
Any thoughts on how I can capture data to help with the analysis? chrome://tracing? ETW tracing? Custom build with instrumentation?
Maybe two things:
1) Enable logging and send us the Chrome log when you reproduce this.
2) Send us a trace (rendering) if you can reproduce this while recording.
I'm pretty sure the video decoder errors are a red herring because:
1) The errors are for initializing the VP8 HW decoder, which, from a quick look at the code, isn't supported on Windows anyway.
2) They happen about a full second before the other errors.

Can you take a look at chrome://crashes and paste the crash ids here?  The finch hashes from chrome://version would also be helpful.
Note that smbc.com is quite different from smbc-comics.com :-)

I don't think crashes are related to this, but who knows? I do have two crash reports from yesterday morning when I was investigating and reproducing this, so maybe.

I looked at one of them and it's a gpu-process crash in the Intel driver, so maybe this is simply an Intel driver bug. If so then it's impressive that the GPU process recovers so quickly and smoothly that all I perceived was a brief flash of black.

I've attached the two crashes from yesterday and another one from five days ago. I'm not sure if that is enough crashes to explain all of the black flashes, but maybe.

97bfb423-1ad1-4a64-b66c-046d047ef03d.dmp
1.1 MB
c58a23a4-1bf4-4529-9dee-81e52b4669e0.dmp
1.0 MB
13f17199-94b6-45f1-aef5-2b0a52944151.dmp
1.1 MB
chrome_version.txt
2.1 KB
Cc: yang...@intel.com
Labels: GPU-Intel
A GPU process crash does cause a brief flash, after which the process recovers quickly. From your log, we do see a GPU crash. However, it would be useful to understand the root cause of this crash. If it's a driver issue, we need to route it back to the Intel driver team for a fix and potentially add a workaround in Chrome first.

Yang: do you have this hardware in house, and can you reproduce it?
To correct my comments in #7, I don't have a GPU crash on my side, although I do have similar error reports from the GPU process as in #3.
Owner: yang...@intel.com
Status: Assigned (was: Untriaged)
I grabbed 29 crashes from my Chrome crashpad\crashdumps directory and analyzed them at work. They fell into four buckets:

15 instances of crashes in ReportOOMErrorInMainThread
9 instances of crashes in InitializeVideoProcessor - CHECK(SUCCEEDED(hr)); fails with hr = 0x80070057
3 instances of crashes in igd10iumd64 - accounting for some of the black screens but not all?
2 instances of crashes in NativeModule::FunctionCount - called with a completely bogus 'this' pointer, circa May 2018
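
As a side note on the InitializeVideoProcessor bucket, the failing `hr` can be decoded by splitting it into the standard HRESULT bit fields. The sketch below is illustrative only (the helper name `decode_hresult` is made up here); it shows that 0x80070057 is a failure in the Win32 facility with code 87, i.e. ERROR_INVALID_PARAMETER, the value also known as E_INVALIDARG. That says an argument was rejected, not which one.

```python
def decode_hresult(hr: int) -> dict:
    """Split a 32-bit HRESULT into its severity, facility, and code fields."""
    hr &= 0xFFFFFFFF
    return {
        "failed": bool(hr >> 31),        # bit 31: severity (1 = failure)
        "facility": (hr >> 16) & 0x7FF,  # bits 16-26: facility
        "code": hr & 0xFFFF,             # bits 0-15: error code
    }


FACILITY_WIN32 = 7            # facility used for wrapped Win32 error codes
ERROR_INVALID_PARAMETER = 87  # Win32 error code 0x57

fields = decode_hresult(0x80070057)
print(fields)
```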

One thing I noticed is that I was unable to usefully analyze the two recent igd10iumd64 crashes on my work machine. That's because the igd10iumd64 module is not available and without its metadata the stack walk cannot proceed beyond that module - such is the nature of x64 stacks. I understand that Intel may not want to publish their driver *symbols* but it is crucial that Intel publish their driver *binaries* on a symbol server. AMD and NVIDIA have both recently done this.

Of the 15 most recent crashes, 12 are OOM crashes (I think these are all due to a single badly behaved web site) and 3 are due to this Intel driver bug. In all cases the crash is from dereferencing a near-null pointer (the pointer value is 0x10). So, there are two action items for Intel here:


1) Please look at the three attached crash dumps in the Intel driver and see if the bug that they reveal can be fixed.
2) Please try to get Intel's binaries published on a symbol server so that we have some hope of analyzing crashes like this where the stack walk crosses Intel's module boundaries. I am not asking for *symbols* to be published (although that would be nice), just *binaries*. If you 

AMD's symbol server:
https://gpuopen.com/amd-driver-symbol-server/

NVIDIA's symbol server:
https://developer.nvidia.com/nvidia-driver-symbol-server

Intel's symbol server:
[404]
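
For context on how a published symbol server gets used: Windows debuggers read a search path (commonly set through the `_NT_SYMBOL_PATH` environment variable) made of `srv*<local cache>*<server URL>` entries separated by semicolons. The sketch below only assembles such a string. The Microsoft URL is the public symbol server; the vendor URL is a deliberately fake placeholder, and the cache directory and helper name are assumptions.

```python
from typing import Iterable


def symbol_path(cache_dir: str, servers: Iterable[str]) -> str:
    """Join symbol servers into a `srv*<cache>*<url>` search path."""
    return ";".join(f"srv*{cache_dir}*{url}" for url in servers)


MICROSOFT_SERVER = "https://msdl.microsoft.com/download/symbols"
# Hypothetical placeholder, not a real vendor endpoint.
VENDOR_SERVER = "https://symbols.example.invalid/gpu"

path = symbol_path(r"C:\symbols", [MICROSOFT_SERVER, VENDOR_SERVER])
print(path)
```

With a driver vendor's server in that list, the debugger can fetch the matching driver binary and continue the stack walk past the driver module.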



The crashes are not enough to account for all of the black flashes that I have seen, but I can also no longer reproduce the problem so I don't think there is anything more that we can do on our end at this moment.


Please let me know if you will be able to do the two requested tasks - they are both quite important, beyond just Chrome.

The igd10iumd64 crashes might already be fixed if you are able to upgrade from 23.20.16.4973 to 24.20.100.6286. This is the top Chrome GPU crash on Windows, and the Intel driver team has likely fixed it in the latest driver.
We don't have this type of laptop. We seldom buy laptops; what we have now are some Dell XPS machines.
We will report these three crash dumps to the internal driver team. I agree that the more important thing is to host a symbol server. Let me start a discussion internally and see how far I can get. Thanks!
The Intel driver team believes this might be a bug that has already been fixed. Could you try the latest Intel driver?
You mean the black flashes might be a driver bug? What driver version do I need to upgrade to in order to get the fix?

I usually let Lenovo's software manage updates for me so I'm not sure if I can upgrade my driver ahead of schedule without causing myself future problems, but I can try.

24.20.100.6286 is stable and the version we use on our Windows/Intel GPU bots on the waterfalls.
Owner: jie.a.c...@intel.com
Intel usually releases its graphics drivers at https://downloadcenter.intel.com/product/80939/Graphics-Drivers. The latest version is 25.20.100.6326. I don't have that type of Lenovo laptop to try it myself, so I'm not sure it will work for you.
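
A small sketch for checking whether an installed driver is older than the versions mentioned in this thread. It assumes that comparing the dotted fields numerically, left to right, reflects release order for these three versions; that ordering rule is an assumption here, not something stated by Intel in this thread. Comparing the raw strings would be fragile, since "100" sorts before "16" as text.

```python
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '24.20.100.6286' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


installed = parse_version("23.20.16.4973")     # reporter's driver
bot_version = parse_version("24.20.100.6286")  # version used on the GPU bots
latest = parse_version("25.20.100.6326")       # latest version named above

needs_upgrade = installed < bot_version
print(needs_upgrade, bot_version < latest)
```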

