
Issue 608436 link

Starred by 0 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug

Blocked on:
issue 561763

Blocking:
issue 703263




"CompositorTest.CreateAndReleaseOutputSurface" is flaky

Project Member Reported by chromium...@appspot.gserviceaccount.com, May 2 2016

Issue description

"CompositorTest.CreateAndReleaseOutputSurface" is flaky.

This issue was created automatically by the chromium-try-flakes app. Please find the right owner to fix the respective test/step and assign this issue to them. If the step/test is infrastructure-related, please add Infra-Troopers label and change issue status to Untriaged. When done, please remove the issue from Sheriff Bug Queue by removing the Sheriff-Chromium label.

We have detected 3 recent flakes. List of all flakes can be found at https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw.

Flaky tests should be disabled within 30 minutes unless culprit CL is found and reverted. Please see more details here: https://sites.google.com/a/chromium.org/dev/developers/tree-sheriffs/sheriffing-bug-queues#triaging-auto-filed-flakiness-bugs
 
Cc: vollick@chromium.org piman@chromium.org
Components: Internals>Compositing
Labels: -Sheriff-Chromium
Owner: danakj@chromium.org
Can one of you compositing owners take a look, please?
Cc: siev...@chromium.org

Comment 3 by kbr@chromium.org, May 3 2016

Cc: jaydasika@chromium.org kbr@chromium.org
Please investigate. Example failure: https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/215071


[ RUN      ] CompositorTest.CreateAndReleaseOutputSurface
Backtrace:
	(No symbol) [0x005AC270]
	GetHandleVerifier [0x01210684+890996]
	GetHandleVerifier [0x0120FABC+887980]
	GetHandleVerifier [0x0127AA14+1326084]
	GetHandleVerifier [0x011C3E28+577560]
	GetHandleVerifier [0x0113F157+33607]
	(No symbol) [0x0112E4FB]
	(No symbol) [0x0112D935]
	GetHandleVerifier [0x011404DA+38602]
	GetHandleVerifier [0x01140C8A+40570]
	(No symbol) [0x0112E037]
	(No symbol) [0x0111AEF9]
	GetHandleVerifier [0x012785E5+1316821]
	(No symbol) [0x010D5FF2]
	GetHandleVerifier [0x01252360+1160528]
	GetHandleVerifier [0x012524DA+1160906]
	GetHandleVerifier [0x01252AF9+1162473]
	GetHandleVerifier [0x01252836+1161766]
	GetHandleVerifier [0x01168421+202257]
	GetHandleVerifier [0x0116603D+193069]
	GetHandleVerifier [0x01165D38+192296]
	(No symbol) [0x01117AE9]
	GetHandleVerifier [0x0157C444+4478516]
	BaseThreadInitThunk [0x76AC336A+18]
	RtlInitializeExceptionChain [0x772292B2+99]
	RtlInitializeExceptionChain [0x77229285+54]
[152/152] CompositorTest.CreateAndReleaseOutputSurface (CRASHED)

Comment 4 by piman@chromium.org, May 3 2016

That stack trace doesn't make any sense :( I don't think we have symbols - the function offsets are all over the place.
The trace from the x64 bot is equally nonsensical:

[ RUN      ] CompositorTest.CreateAndReleaseOutputSurface
Backtrace:
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F2D8DD4+1668004]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F2D8409+1665497]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F2D7B65+1663285]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F77F244+6543380]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F77F244+6543380]
	GetHandleVerifier [0x000000014014BB0C+389116]
	ScaleYUVToRGB32Row_SSE2_X64 [0x00000001400D489C+16330348]
	ScaleYUVToRGB32Row_SSE2_X64 [0x00000001400D38B0+16326272]
	GetHandleVerifier [0x000000014014D158+394824]
	GetHandleVerifier [0x000000014014DFA4+398484]
	ScaleYUVToRGB32Row_SSE2_X64 [0x00000001400AECFE+16175822]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F1E3FD5+664997]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F1E3F2B+664827]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F1944B5+338565]
	GetHandleVerifier [0x00000001401611B9+476841]
	GetHandleVerifier [0x000000014016EB12+532482]
	GetHandleVerifier [0x000000014016ED01+532977]
	GetHandleVerifier [0x000000014016EBE9+532697]
	GetHandleVerifier [0x000000014016F0A7+533911]
	GetHandleVerifier [0x0000000140161259+477001]
	GetHandleVerifier [0x000000014016EE41+533297]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F1F84B7+748167]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F20EA83+839763]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F20E680+838736]
	ScaleYUVToRGB32Row_SSE2_X64 [0x000000013F1D1680+588880]
	GetHandleVerifier [0x00000001404035DC+3238604]
	BaseThreadInitThunk [0x0000000077685A4D+13]
	RtlUserThreadStart [0x00000000778BB831+33]
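
For reference, the "offsets are all over the place" observation can be checked mechanically. The following is a minimal sketch in standalone C++, assuming frames have the `Symbol [0xADDRESS+OFFSET]` shape seen in the traces above. `ParseFrame`, `LooksMisattributed`, and the 64 KiB threshold are hypothetical names and values chosen for illustration; none of them are part of Chromium.

```cpp
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// One parsed backtrace frame of the form "Symbol [0xADDRESS+OFFSET]".
struct Frame {
  std::string symbol;
  std::uint64_t offset;  // Decimal byte offset from the start of `symbol`.
};

// Parses a frame line. Returns std::nullopt for lines that carry no
// "+OFFSET" part, such as "(No symbol) [0x005AC270]".
std::optional<Frame> ParseFrame(const std::string& line) {
  static const std::regex kPattern(
      R"(^\s*(.+?)\s+\[0x[0-9A-Fa-f]+\+(\d+)\]\s*$)");
  std::smatch match;
  if (!std::regex_match(line, match, kPattern)) return std::nullopt;
  return Frame{match[1].str(), std::stoull(match[2].str())};
}

// Assumed heuristic: an offset larger than `max_plausible_bytes` suggests the
// address was attributed to a nearby symbol rather than the real function.
bool LooksMisattributed(const Frame& frame,
                        std::uint64_t max_plausible_bytes = 64 * 1024) {
  return frame.offset > max_plausible_bytes;
}
```

With this threshold, a frame such as `GetHandleVerifier [0x01210684+890996]` is flagged, while `BaseThreadInitThunk [0x76AC336A+18]` is not. A small offset does not prove the symbol is right, so this only helps spot traces that are clearly unsymbolized.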

Comment 6 by chromium...@appspot.gserviceaccount.com, May 3 2016

Labels: Sheriff-Chromium
Detected 5 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).

Comment 7 by kbr@chromium.org, May 3 2016

Blockedon: 561763
Cc: dpranke@chromium.org mar...@chromium.org
Components: Infra>Platform>Swarming
Looking at the run from https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/215071 :

and the task https://chromium-swarm.appspot.com/user/task/2e8c6fbbdf57f810 :

and downloading the isolate:

python tools\swarming_client\isolateserver.py download -I https://isolateserver.appspot.com -s fc22ca3cf33bc19a2b9dd105be38a23ff1396d70 -t foo

Confirmed: the compositor_unittests isolate contains no .pdb files.

Key .pdb files were recently added to chrome.isolate to enable debugging in issue 561763. Presumably adding them to compositor_unittests.isolate will fix this?
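
The manual check of the downloaded isolate (the `foo` directory passed via `-t` above) could be scripted. This is a rough standalone C++17 sketch, not an existing Chromium or swarming tool; `FindPdbFiles` is a hypothetical helper name.

```cpp
#include <filesystem>
#include <vector>

namespace fs = std::filesystem;

// Returns the paths of all regular files under `root` whose extension is
// exactly ".pdb" (case-sensitive comparison). An empty result corresponds to
// the "no .pdb files in the isolate" finding. Throws
// std::filesystem::filesystem_error if `root` cannot be opened.
std::vector<fs::path> FindPdbFiles(const fs::path& root) {
  std::vector<fs::path> found;
  for (const auto& entry : fs::recursive_directory_iterator(root)) {
    if (entry.is_regular_file() && entry.path().extension() == ".pdb") {
      found.push_back(entry.path());
    }
  }
  return found;
}
```

Calling `FindPdbFiles("foo")` after the download and checking whether the result is empty would reproduce the check. Files named with an upper-case `.PDB` extension would be missed by this sketch.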

Comment 8 by kbr@chromium.org, May 3 2016

Also, note that the switch from GYP to GN on Windows is supposed to automatically add the dependent .pdbs to the isolate. If this is close to being done then maybe we should push that work through now.

Labels: -Sheriff-Chromium
Removing "Sheriff-Chromium" since this bug is assigned.

Comment 10 by chromium...@appspot.gserviceaccount.com, May 5 2016

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).

Comment 11 by chromium...@appspot.gserviceaccount.com, May 6 2016

Detected 3 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
Status: Assigned (was: Untriaged)
Without seeing a real stack trace yet, I'm guessing this is actually 608946

Comment 14 by chromium...@appspot.gserviceaccount.com, May 10 2016

Labels: Sheriff-Chromium
Detected 3 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
Still happening; removing Sheriff-Chromium again unless you think this needs a different owner, danakj

Comment 16 by chromium...@appspot.gserviceaccount.com, May 11 2016

Labels: Sheriff-Chromium
Detected 4 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
I'm not sure what to do with this without stack traces, but I'm going to keep plugging away and getting output surface destroyed on the compositor thread and see if that helps.

Comment 18 by chromium...@appspot.gserviceaccount.com, May 12 2016

Labels: Sheriff-Chromium
Detected 7 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).

Comment 19 by chromium...@appspot.gserviceaccount.com, May 12 2016

Cc: stale-flakes-reports@google.com
Reporting to stale-flakes-reports@google.com to investigate why this issue has been in the appropriate queue 5 times or more.
Cc: -stale-flakes-reports@google.com
I wonder why this test hasn't been disabled yet. The instructions in the first message explain that flaky tests should be disabled by a Sheriff within 30 minutes unless a culprit CL is found and reverted. Are these instructions not clear enough? I would really welcome some feedback.

Regarding the stacktrace, I can see a backtrace in https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_x64_rel_ng/builds/212506/steps/compositor_unittests%20%28with%20patch%29%20on%20Windows-7-SP1/logs/stdio. Also if more details are necessary, why not add some code printing them for debugging purposes?
The stacktraces aren't correct. See comments #3, #4, #5, and #7. I've been working on fixing races in context destruction, hoping those fixes help here.

I don't know why sheriffs didn't disable it, maybe because it's try bots and not the waterfall?

> Also if more details are necessary, why not add some code printing them for debugging purposes?

Printf debugging on ToT is possible, I suppose. I hadn't considered that.
[ RUN      ] CompositorTest.CreateAndReleaseOutputSurface
[1128:3060:0512/171558:16259890:ERROR:compositor_unittest.cc(98)] > Setup
[1128:3060:0512/171558:16259890:ERROR:compositor_unittest.cc(104)] < Setup
[1128:3060:0512/171558:16259890:ERROR:compositor_unittest.cc(106)] > ScheduleDraw 1
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(109)] < ScheduleDraw 1
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(112)] > SetVisible(false)
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(114)] < SetVisible(false)
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(115)] > ReleaseAcceleratedWidget
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(118)] < ReleaseAcceleratedWidget
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(121)] > SetAcceleratedWidget
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(123)] < SetAcceleratedWidget
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(124)] > SetVisible(true)
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(126)] < SetVisible(true)
[1128:3060:0512/171558:16259921:ERROR:compositor_unittest.cc(128)] > ScheduleDraw 2

Backtrace:
	(No symbol) [0x2A792E73]
	GetHandleVerifier [0x00AE3454+901156]
	GetHandleVerifier [0x00AE288C+898140]
  ....
The sad news is that this means the crash is probably in some completely async call stack, so I've narrowed down very little.
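
For anyone repeating this kind of printf debugging, the `> step` / `< step` pairs in the log above can be produced by a small RAII helper. This is a minimal standalone sketch, not the code that was actually landed in compositor_unittest.cc. `ScopedStepLog`, `LogSink`, and `WriteToStderr` are hypothetical names, and plain `std::cerr` is an assumption standing in for whatever logging macro the test used.

```cpp
#include <functional>
#include <iostream>
#include <string>
#include <utility>

// Receives one complete log line (without a trailing newline).
using LogSink = std::function<void(const std::string&)>;

// Default sink: writes to stderr and flushes, so the line is not lost if the
// process crashes immediately afterwards.
inline void WriteToStderr(const std::string& line) {
  std::cerr << line << std::endl;
}

// Emits "> step" on construction and "< step" on destruction. A "> step"
// line with no matching "< step" line brackets where the crash occurred.
class ScopedStepLog {
 public:
  explicit ScopedStepLog(std::string step, LogSink sink = WriteToStderr)
      : step_(std::move(step)), sink_(std::move(sink)) {
    sink_("> " + step_);
  }
  ~ScopedStepLog() { sink_("< " + step_); }

  ScopedStepLog(const ScopedStepLog&) = delete;
  ScopedStepLog& operator=(const ScopedStepLog&) = delete;

 private:
  std::string step_;
  LogSink sink_;
};
```

Wrapping a step as `{ ScopedStepLog step("ScheduleDraw 2"); /* step body */ }` yields the same enter/exit pattern. As the log shows, this only localizes the crash to a step on the calling thread; it says nothing about work posted asynchronously.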

Comment 25 by bugdroid1@chromium.org, May 13 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/60271333ba58503032993dcef9b4228f96d21d28

commit 60271333ba58503032993dcef9b4228f96d21d28
Author: danakj <danakj@chromium.org>
Date: Fri May 13 01:19:19 2016

Disable CreateAndReleaseOutputSurface test on windows as it's flaky.

R=piman@chromium.org
BUG=608436

Review-Url: https://codereview.chromium.org/1976443003
Cr-Commit-Position: refs/heads/master@{#393418}

[modify] https://crrev.com/60271333ba58503032993dcef9b4228f96d21d28/ui/compositor/compositor_unittest.cc


Comment 26 by chromium...@appspot.gserviceaccount.com, May 13 2016

Detected 3 new flakes for test/step "CompositorTest.CreateAndReleaseOutputSurface". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyNwsSBUZsYWtlIixDb21wb3NpdG9yVGVzdC5DcmVhdGVBbmRSZWxlYXNlT3V0cHV0U3VyZmFjZQw. This message was posted automatically by the chromium-try-flakes app. Since flakiness is ongoing, the issue was moved back into Sheriff Bug Queue (unless already there).
Labels: -Sheriff-Chromium
The flaky analyzer is late.
Ran it on the debug trybot and got this equally unreasonable backtrace:

[216:6540:0516/140138:40317787:ERROR:compositor_unittest.cc(131)] > Wait for draw 2

Backtrace:

	cc::PictureLayerImpl::PictureLayerImpl [0x016A61FB+2507518]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A5492+2504085]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A3E9D+2498464]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A4B79+2501756]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A098B+2484878]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A0931+2484788]
	cc::PictureLayerImpl::PictureLayerImpl [0x016A55E6+2504425]
	cc::PictureLayerImpl::PictureLayerImpl [0x014EA5CE+689873]
	cc::PictureLayerImpl::PictureLayerImpl [0x014EA062+688485]
	cc::PictureLayerImpl::PictureLayerImpl [0x014E95BB+685758]
	cc::PictureLayerImpl::PictureLayerImpl [0x014E94F1+685556]
	cc::PictureLayerImpl::PictureLayerImpl [0x014EA616+689945]
	base::MessagePumpForUI::~MessagePumpForUI [0x00FDBDBE+89571]
	base::MessagePumpForUI::~MessagePumpForUI [0x0100D8C4+293097]
	base::MessagePumpForUI::~MessagePumpForUI [0x01078AE0+731909]
	base::MessagePumpForUI::~MessagePumpForUI [0x010769BD+723426]
	base::MessagePumpForUI::~MessagePumpForUI [0x01076FA4+724937]
	base::MessagePumpForUI::~MessagePumpForUI [0x0107FE43+761448]
	base::MessagePumpForUI::~MessagePumpForUI [0x0108106B+766096]
	base::MessagePumpForUI::~MessagePumpForUI [0x01078821+731206]
	base::MessagePumpForUI::~MessagePumpForUI [0x01120104+1417513]
	(No symbol) [0x004D0280]
	(No symbol) [0x004D01BC]
	(No symbol) [0x00471B55]
	(No symbol) [0x008E4324]
	(No symbol) [0x00900D27]
	(No symbol) [0x00900FAD]
	(No symbol) [0x00900E4F]
	(No symbol) [0x00901435]
	(No symbol) [0x008E4414]
	(No symbol) [0x0090112F]
	(No symbol) [0x0052950F]
	(No symbol) [0x0052965D]
	(No symbol) [0x004CF4FB]
	(No symbol) [0x004CF4B8]
	(No symbol) [0x004CF699]
	(No symbol) [0x0058E7FE]
	(No symbol) [0x0058DAC1]
	(No symbol) [0x0058D8EB]
	(No symbol) [0x004CF720]
	(No symbol) [0x00A6C39E]
	(No symbol) [0x00A6C20A]
	(No symbol) [0x00A6C09D]
	(No symbol) [0x00A6C3B8]
	BaseThreadInitThunk [0x7616336A+18]
	RtlInitializeExceptionChain [0x774492B2+99]
	RtlInitializeExceptionChain [0x77449285+54]

Comment 29 by kbr@chromium.org, May 16 2016

Sorry. That's really unfortunate. Chrome's in-process stack walking code should pick up program database (PDB) files alongside the binaries. Which bot did you run this on? (Link to the tryjob?) The recent switch of Windows to GN should cause PDBs to be included for all isolates, and if they're not there then that should be fixed.

https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_dbg_ng/builds/1326/steps/compositor_unittests%20%28with%20patch%29%20on%20Windows-7-SP1/logs/stdio

I re-read your comment about pdbs and am trying to understand what GN magic to invoke to have the right debugging stuff. The CL you linked to did a bunch of stuff with crash reporter, but I'm hoping it's just the pdb parts we need?
Cc: brucedaw...@chromium.org scottmg@chromium.org wfh@chromium.org
That bot is currently configured to build w/ minimal_symbols.

Would switching to full symbols help?

scottmg/brucedawson/wfh - any ideas on what we might need to get better stacktraces?
Should win_chromium_rel_ng be giving better symbols than it was last week? I can try those again.
minimal_symbols should have function-level symbols for stack traces, so it sounds like pdbs aren't getting to the runner.
last week win_chromium_rel_ng was GYP; this week (or, at least today) is GN.

I will see if I can tell if the PDBs are part of the isolate or not.
@scottmg is right; the pdbs aren't getting to the runner. Fixing that should not be tricky, but we haven't done it yet.
is there a bug for that? we can block this on it.
@danakj - I don't think so. Feel free to file one.

I think the only real change that is needed is to modify the `component` template in GN to include `data = [ '{{output_name}}.pdb' ]` in a component build or something roughly like that.
Ping. Should this be Pri-1? If so please assign a milestone.
Labels: -Pri-1 Pri-2
I ran the test on 8 Windows trybots without it flaking. I'll try some more, but maybe it's fixed.
Blocking: 703263
Owner: ----
Status: Available (was: Assigned)
Components: -Infra>Platform>Swarming Infra>Platform>Swarming>Admin
