New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 638744 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug

Blocked on:
issue 599794

Blocking:
issue 600440



Sign in to add a comment

trace_test fails on Windows GPU.FYI bots

Project Member Reported by ynovikov@chromium.org, Aug 17 2016

Issue description

Including Win7 Release (NVIDIA), Win8 Release (NVIDIA), Win7 Debug (ATI), Win7 x64 Release (NVIDIA), Win7 x64 Debug (NVIDIA).

It started here: https://build.chromium.org/p/chromium.gpu.fyi/builders/Win8%20Release%20%28NVIDIA%29/builds/24036

Failure log contains:
[1288:452:0817/093457:FATAL:trace_event_impl.cc(188)] Check failed: duration_.ToInternalValue() == -1 (11 vs. -1)
Backtrace:
	base::debug::StackTrace::StackTrace [0x6F1CB3A7+23]
	logging::LogMessage::~LogMessage [0x6F164471+49]
	base::trace_event::TraceEvent::UpdateDuration [0x6F1DB772+98]
	base::trace_event::TraceLog::UpdateTraceEventDuration [0x6F17B1CE+366]
	base::WorkerPool::RunsTasksOnCurrentThread [0x6F1C4051+657]
	TpReleaseWork [0x773040BF+1231]
	WinSqmIncrementDWORD [0x77301978+392]
	BaseThreadInitThunk [0x75F57C04+36]
	RtlInitializeExceptionChain [0x7733AD1F+143]
	RtlInitializeExceptionChain [0x7733ACEA+90]

There were couple of catapult rolls in that run, plus v8, skia and other changes. No idea who to blame.
 

Comment 1 by kbr@chromium.org, Aug 17 2016

Components: Blink>JavaScript Platform>DevTools>Tracing Tests>Telemetry Internals>Tracing
Here are the first failing builds on a few bots:

https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28NVIDIA%29/builds/27027
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win8%20Release%20%28NVIDIA%29/builds/24036
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Release%20%28ATI%29/builds/23128
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28ATI%29/builds/5486

The failure might be flaky. I can't tell clearly from the bots.

I scanned the blamelists of some of these failing builds and a couple of earlier green builds but haven't found a suspect CL yet.

Comment 2 by kbr@chromium.org, Aug 17 2016

Cc: zh...@chromium.org

Comment 3 by kbr@chromium.org, Aug 17 2016

Cc: oysteine@chromium.org

Comment 4 by kbr@chromium.org, Aug 17 2016

Blockedon: 599794
Cc: ssid@chromium.org
The last time this showed up was  Issue 599794 . Has a similar kind of change been reintroduced?

Comment 5 by kbr@chromium.org, Aug 17 2016

Cc: vollick@chromium.org
The patch which broke this in  Issue 599794  added a TRACE_EVENT on a worker thread. I note that https://codereview.chromium.org/2254623002 modifies the compositor worker. CC'ing vollick@.

Comment 6 by kbr@chromium.org, Aug 17 2016

Cc: -kbr@chromium.org -vollick@chromium.org -zh...@chromium.org loyso@chromium.org r...@chromium.org enne@chromium.org
Could the revert in  https://chromium.googlesource.com/chromium/src/+/2f03c7f2b75275f951efaf6b5e5f61cc52d74a8e have bypassed the CQ and changed the threading in the compositor slightly, including where TRACE_EVENTs are installed?

Comment 7 by kbr@chromium.org, Aug 17 2016

Blocking: 600440

Comment 8 by kbr@chromium.org, Aug 17 2016

Possibly slightly better stack trace from:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20Debug%20%28ATI%29/builds/5486/steps/trace_test%20on%20ATI%20GPU%20on%20Windows%20on%20Windows-2008ServerR2-SP1/logs/stdio :

[4532:4720:0817/111045:FATAL:trace_event_impl.cc(188)] Check failed: duration_.ToInternalValue() == -1 (106 vs. -1)
Backtrace:
	base::debug::StackTrace::StackTrace [0x10079367+23]
	logging::LogMessage::~LogMessage [0x100C626B+59]
	base::trace_event::TraceEvent::UpdateDuration [0x102A6934+164]
	base::trace_event::TraceLog::UpdateTraceEventDuration [0x102CEA57+375]
	trace_event_internal::ScopedTracer::~ScopedTracer [0x1007C751+65]
	base::WorkerPool::RunsTasksOnCurrentThread [0x10230ECB+475]
	TpAllocWork [0x7788672F+799]
	TpCallbackIndependent [0x77872D89+2012]
	BaseThreadInitThunk [0x75A1336A+18]
	RtlInitializeExceptionChain [0x778592B2+99]
	RtlInitializeExceptionChain [0x77859285+54]

Comment 9 by kbr@chromium.org, Aug 17 2016

Cc: -loyso@chromium.org -vmi...@chromium.org dskiba@chromium.org simonhatch@chromium.org
Since this is triggered from WorkerPool::RunsTasksOnCurrentThread, I'm guessing this is the callback in worker_pool_win.cc:24 which does a TRACE_TASK_EXECUTION but it's hard to know for sure. That trace event has been there for months though :/. 

Comment 11 Deleted

#11: That CL landed after this first broke, as far as I can see 

Comment 13 by kbr@chromium.org, Aug 17 2016

Cc: primiano@chromium.org

Comment 14 by kbr@chromium.org, Aug 17 2016

Owner: oysteine@chromium.org
Status: Assigned (was: Untriaged)
oysteine@ said he would drive the investigation.

Found a different assert triggered in one build:

[3956:4380:0817/124602:FATAL:trace_buffer.h(34)] Check failed: index < size(). 
Backtrace:
	base::debug::StackTrace::StackTrace [0x10079367+23]
	logging::LogMessage::~LogMessage [0x100C626B+59]
	base::trace_event::TraceBufferChunk::GetEventAt [0x10286D34+164]
	base::trace_event::TraceLog::GetEventByHandleInternal [0x102CBC15+197]
	base::trace_event::TraceLog::UpdateTraceEventDuration [0x102CE9A3+195]
	trace_event_internal::ScopedTracer::~ScopedTracer [0x1007C751+65]
	base::WorkerPool::RunsTasksOnCurrentThread [0x10230ECB+475]
	TpAllocWork [0x7773672F+799]
	TpCallbackIndependent [0x77722D89+2012]
	BaseThreadInitThunk [0x7603337A+18]
	RtlInitializeExceptionChain [0x777092B2+99]
	RtlInitializeExceptionChain [0x77709285+54]
0  Id: 3f4.150 Suspend: 0 Teb: 000007ff`fffde000 Unfrozen
	RetAddr           : Args to Child                                                           : Call Site
	00000000`77037c56 : 00000000`00000058 000007fe`f2ed5469 000007fe`00000000 00000000`00000000 : ntdll!NtWaitForKeyedEvent+0xa
	000007fe`f0ac94bf : 00000000`00351a50 000007fe`00000000 00000000`0de582e0 00000000`00385dc0 : ntdll!RtlFindMostSignificantBit+0xf6
	000007fe`f0aa6b73 : 00000000`00000000 00000000`00000000 00000000`00351dd8 000007fe`f0aa9fe2 : chrome_7feef590000!base::internal::LockImpl::Lock+0x1f
	000007fe`f0aa7688 : ffffffff`fffffffe 00000000`0022d0f0 00000002`79b0386b 00000000`3f800000 : chrome_7feef590000!base::trace_event::TraceLog::ThreadLocalEventBuffer::AddTraceEvent+0x43
	000007fe`f153c076 : 00000002`79b0386b 00000002`79b0386b 00000000`00000001 00000000`00000000 : chrome_7feef590000!base::trace_event::TraceLog::AddTraceEventWithThreadIdAndTimestamp+0x6d8
	000007fe`f2435d88 : 000007fe`f2ed5469 00000000`0022d7f0 00000000`0022dea0 00000000`09277e80 : chrome_7feef590000!cc::ScrollAndScaleSet::~ScrollAndScaleSet+0x116
	000007fe`f2436539 : 00000000`0022df90 00000000`0935d230 00000000`0935bd60 00000000`00000000 : chrome_7feef590000!cc::DirectRenderer::DrawRenderPass+0x88
	000007fe`f2435a75 : 00000000`0935d230 00000000`0935d230 00000000`0935d238 00000000`0022db70 : chrome_7feef590000!cc::DirectRenderer::DrawRenderPassAndExecuteCopyRequests+0xa9
	000007fe`f15eaff8 : 00000000`09277e80 00000000`09241998 00000000`00000000 0000030c`0000040c : chrome_7feef590000!cc::DirectRenderer::DrawFrame+0x505
	000007fe`f15e92c7 : 00000000`06c147a0 00000000`0022e590 00000000`00000000 0000030c`0000040c : chrome_7feef590000!cc::Display::DrawAndSwap+0x53c
	000007fe`f15e8cbf : 000007fe`00000001 000007fe`00000000 000007fe`f2ed5469 000007fe`f2a39f88 : chrome_7feef590000!cc::DisplayScheduler::DrawAndSwap+0x14f
	000007fe`f15e9524 : 000007fe`f2ed5469 00000000`0022e8c9 000007fe`f2a39f88 00000000`034969d0 : chrome_7feef590000!cc::DisplayScheduler::AttemptDrawAndSwap+0x77
	000007fe`f209f9b3 : 000007fe`f15e949c 000007fe`f2ed5456 000007fe`f2ed5456 00000000`0022ed30 : chrome_7feef590000!cc::DisplayScheduler::OnBeginFrameDeadline+0x88
	000007fe`f209f9b3 : 000007fe`ef716240 00000000`00000000 000007fe`f2ed5456 000007fe`f27cad48 : chrome_7feef590000!base::internal::Invoker<base::internal::BindState<void (__cdecl base::CancelableCallback<void __cdecl(void)>::*)(void)const __ptr64,base::WeakPtr<base::CancelableCallback<void __cdecl(void)> > >,void __cdecl(void)>::Run+0x3b
	000007fe`f0b225e8 : 000007fe`f2a39ce8 000007fe`f2a39ce8 00000000`00000001 00000000`0022e8c9 : chrome_7feef590000!base::internal::Invoker<base::internal::BindState<void (__cdecl base::CancelableCallback<void __cdecl(void)>::*)(void)const __ptr64,base::WeakPtr<base::CancelableCallback<void __cdecl(void)> > >,void __cdecl(void)>::Run+0x3b
	000007fe`f0a9cadc : 00000002`79b037d1 00000002`79b037d1 00000000`00a23966 000007fe`00a23966 : chrome_7feef590000!base::debug::TaskAnnotator::RunTask+0x148
	000007fe`f0a9bc5b : 00000000`00000606 00000000`0038f7a0 00000000`00000001 00000002`79b037d1 : chrome_7feef590000!base::MessageLoop::RunTask+0x36c
	000007fe`f0b23aa8 : 00000000`00355f00 00000000`00000001 00000000`00000001 00000000`00000001 : chrome_7feef590000!base::MessageLoop::DoWork+0x19b
	000007fe`f0b24974 : 00000000`00000001 00000000`00355f00 00000000`0022f3b0 000007fe`f0afd32e : chrome_7feef590000!base::MessagePumpForUI::DoRunLoop+0x78
	000007fe`f0af2bae : 00000000`0022ef00 00000000`0038f7a0 00000000`00000001 00000000`00000003 : chrome_7feef590000!base::MessagePumpWin::Run+0x54
	000007fe`f07c1155 : 00000000`00000000 00000000`0038c150 00000000`0022f3b0 00000000`00000001 : chrome_7feef590000!base::RunLoop::Run+0x2e
	000007fe`ef82644a : 00000000`00000000 00000000`0038bf80 00000000`0022f580 000007fe`f0ac9656 : chrome_7feef590000!ChromeBrowserMainParts::MainMessageLoopRun+0x125
	000007fe`ef827e4a : 00000000`00989680 00000000`000003e8 00000002`78fde430 00000000`000e1761 : chrome_7feef590000!content::BrowserMainLoop::RunMainMessageLoopParts+0x6a
	000007fe`ef821e90 : 00000000`00000000 00000000`00370560 00000000`00370560 00000000`0022f4d0 : chrome_7feef590000!content::BrowserMainRunnerImpl::Run+0xae
	000007fe`f0689170 : 00000000`00000000 00000000`00370560 00000000`0022f580 00000000`00000000 : chrome_7feef590000!content::BrowserMain+0xa8
	000007fe`f0688f56 : 00000000`00000000 00000000`0022f480 00000000`003683b0 000007fe`00000000 : chrome_7feef590000!content::RunNamedProcessTypeMain+0x1c0
	000007fe`f06881a4 : 00000001`3f990000 00000000`0036d710 00000000`00000000 00000000`ffffffff : chrome_7feef590000!content::ContentMainRunnerImpl::Run+0x166
	000007fe`ef674746 : 000007fe`f42cecd8 000007fe`f42cecd8 00000000`00000000 00000000`0022f6d8 : chrome_7feef590000!content::ContentMain+0x30
The above is from a dump in https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20x64%20Release%20%28NVIDIA%29/builds/6434/steps/trace_test%20on%20NVIDIA%20GPU%20on%20Windows%20on%20Windows-2008ServerR2-SP1/logs/stdio and shows several other threads in mid-process of adding trace events when the assert occurs. May or may not be relevant.
Dumping current thoughts, in case simonhatch@ gets to look at this before me tomorrow :).

The DCHECK triggers specifically in the case when a trace event is added in a thread without a messageloop, which means TraceLog::thread_shared_chunk_ is used. I'm suspecting a race condition related to the shared chunk, and I suspect this bug has been there always, it just triggers more now possibly due to WorkerPools being used more (or the timing has changed); unsure.

The reason why there's usually several other threads in the midst of adding a trace event looks like may just be a side-effect of the tracelog lock being held when thread_shared_chunk_ is used.

Tangential thought: Very high-frequency events like this (for every task) may not be a good idea for threads where tracing uses the shared buffer instead of TLS, and grabs a lock.
It failed on Linux Debug bot now.
There are a lot of VERBOSE1 messages in the log, maybe it will give you a hint where a new trace was introduced.
https://build.chromium.org/p/chromium.gpu.fyi/builders/Linux%20Debug%20%28New%20Intel%29/builds/3475/steps/trace_test/logs/stdio
Project Member

Comment 20 by bugdroid1@chromium.org, Aug 18 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/67b18170c9542318566e14b1ed1bcc371af5b00b

commit 67b18170c9542318566e14b1ed1bcc371af5b00b
Author: oysteine <oysteine@chromium.org>
Date: Thu Aug 18 23:00:25 2016

Tracing: Add more DCHECKS to get more info on recent test flakiness

TBR=simonhatch@chromium.org
BUG= 638744 

Review-Url: https://codereview.chromium.org/2256913003
Cr-Commit-Position: refs/heads/master@{#412964}

[modify] https://crrev.com/67b18170c9542318566e14b1ed1bcc371af5b00b/base/trace_event/trace_log.cc

DCHECKs not firing; still investigating.

Also found a crash in this test on the 15th of August, lending credence to the idea that this has started flaking due to more tracing being used and exposing existing an existing.

https://build.chromium.org/p/chromium.gpu.fyi/builders/Win7%20x64%20Debug%20%28NVIDIA%29/builds/5083
Project Member

Comment 22 by bugdroid1@chromium.org, Aug 19 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a6506fef1e007d5c2c2441288c0e03b624c03bca

commit a6506fef1e007d5c2c2441288c0e03b624c03bca
Author: oysteine <oysteine@chromium.org>
Date: Fri Aug 19 21:07:19 2016

Tracing: Added some temp debug prints to investigate  crbug.com/638744 

R=simonhatch@chromium.org
BUG= 638744 

Review-Url: https://codereview.chromium.org/2265613002
Cr-Commit-Position: refs/heads/master@{#413235}

[modify] https://crrev.com/a6506fef1e007d5c2c2441288c0e03b624c03bca/base/trace_event/trace_log.cc

Comment 23 by kbr@chromium.org, Aug 19 2016

Note: this is failing at least sometimes on the commit queue. Example:

https://codereview.chromium.org/2254393002/

Failing tryjob:

https://build.chromium.org/p/tryserver.chromium.win/builders/win_optional_gpu_tests_rel/builds/3068

Full stdout attached in case it expires.

stdout.txt
1005 KB View Download
Labels: Hotlist-PixelWrangler
Also failing on the ANGLE CQ. Example:

https://build.chromium.org/p/tryserver.chromium.angle/builders/win_angle_rel_ng/builds/2000

Cc: jmad...@chromium.org
(Still looking into this; latest angle of attack is getting my Windows workstation back up and running)
Cc: vmp...@chromium.org
Still flaking on https://build.chromium.org/p/chromium.gpu.fyi/builders/Win8%20Release%20(NVIDIA)

Should we disable the test while the investigation is ongoing?
Still flaking on the ANGLE CQ as well, causing try and CQ jobs to fail.

Comment 29 by kbr@chromium.org, Aug 24 2016

This is really distressing -- it looks like trace_test is the only end-to-end test of the tracing subsystem. If the tracing team will commit to fixing this bug quickly, then yes, the test can be temporarily disabled. Otherwise, it is best to not disable it, so that permanent accidental breakage can not slip in to the source tree.

+Ken: the only end-to-end test that cover the gpu trace code path. We have a bunch of other tracing tests in telemetry_unittest & telemetry_perf_unittests that assert state of collected traces.

Comment 31 by kbr@chromium.org, Aug 24 2016

Thanks Ned. Are those tests passing reliably right now, or are they affected by this bug too?

AFAIK, I haven't seen any flakes from those tracing tests in the past month.

Comment 33 by kbr@chromium.org, Aug 24 2016

OK. Vlad, in that case, if you could prepare as narrow a suppression as possible in trace_test_expectations.py, I'll be glad to review. Thanks.

As a workaround, can we retry the test by like 3 times instead of disabling it?
Techstop is coming by tomorrow to get my Windows workstation working again; the added debug info we added gave some clues but not enough, so I still need to repro this. i.e. it'll be broken at least for a couple of more days.

Comment 36 by kbr@chromium.org, Aug 24 2016

Owner: vmp...@chromium.org
Ned: yes, a flaky expectation can be added. Vlad, could you please try to find the exact test names that are failing and the exact configurations (Windows only? Other platforms?), and add an entry to trace_test_expectations.py?

Vlad, assigning to you to add the suppression. Will assign back to Oystein after.

Will do.
Project Member

Comment 38 by bugdroid1@chromium.org, Aug 25 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8d51c44d357327406654c4f07204d453fc86ca94

commit 8d51c44d357327406654c4f07204d453fc86ca94
Author: vmpstr <vmpstr@chromium.org>
Date: Thu Aug 25 01:01:09 2016

trace_tests: Mark some windows tests as flaky.

BUG= 638744 
R=kbr@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel
NOTRY=true

Review-Url: https://codereview.chromium.org/2276123002
Cr-Commit-Position: refs/heads/master@{#414238}

[modify] https://crrev.com/8d51c44d357327406654c4f07204d453fc86ca94/content/test/gpu/gpu_tests/trace_test_expectations.py

Owner: oysteine@chromium.org
Back to oysteine@. I'll update the expectations if more flakes of the same type creep up. 

Comment 40 by kbr@chromium.org, Aug 25 2016

Thanks Vlad, and thanks in advance Oystein for getting to the bottom of this.

Project Member

Comment 41 by bugdroid1@chromium.org, Aug 26 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ad46abcca1ef130f810d56a8eb682658bd2cfa51

commit ad46abcca1ef130f810d56a8eb682658bd2cfa51
Author: vmpstr <vmpstr@chromium.org>
Date: Fri Aug 26 02:43:51 2016

trace_tests: Change flaky tests to fail tests.

BUG= 638744 
TBR=kbr@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2279963002
Cr-Commit-Position: refs/heads/master@{#414636}

[modify] https://crrev.com/ad46abcca1ef130f810d56a8eb682658bd2cfa51/content/test/gpu/gpu_tests/trace_test_expectations.py

Project Member

Comment 42 by bugdroid1@chromium.org, Aug 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f4c1b518e9b9a8f679a57260aa8a9d9e5a328c60

commit f4c1b518e9b9a8f679a57260aa8a9d9e5a328c60
Author: vmpstr <vmpstr@chromium.org>
Date: Fri Aug 26 23:55:58 2016

trace_test: Actually mark the trace tests as flaky.

TBR=kbr@chromium.org
BUG= 638744 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2283963003
Cr-Commit-Position: refs/heads/master@{#414864}

[modify] https://crrev.com/f4c1b518e9b9a8f679a57260aa8a9d9e5a328c60/content/test/gpu/gpu_tests/trace_test_expectations.py

Comment 43 by perkj@chromium.org, Aug 29 2016

Does the expectations actually work as expected?
Should the expectation be set to fail for the time being since the bot still gets red? 

https://build.chromium.org/p/tryserver.chromium.win/builders/win_chromium_rel_ng/builds/282986

Comment 44 by perkj@chromium.org, Aug 29 2016

 Issue 627044  has been merged into this issue.
Project Member

Comment 45 by chromium...@appspot.gserviceaccount.com, Aug 30 2016

Detected 8 new flakes for test/step "trace_test (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw. This message was posted automatically by the chromium-try-flakes app.
Happens on main GPU waterfall too. And still flaking the try servers. Should I change expectation to fail instead of flaky?

Comment 47 by kbr@chromium.org, Aug 31 2016

weiliangc@: yes, please. Thanks.

Project Member

Comment 48 by chromium...@appspot.gserviceaccount.com, Aug 31 2016

Detected 14 new flakes for test/step "trace_test (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw. This message was posted automatically by the chromium-try-flakes app.
Project Member

Comment 49 by bugdroid1@chromium.org, Sep 1 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/55cb1387798384b78903e1e5f7337427652c476e

commit 55cb1387798384b78903e1e5f7337427652c476e
Author: weiliangc <weiliangc@chromium.org>
Date: Thu Sep 01 17:29:43 2016

trace_tests: Mark some trace test on Windows as fail

BUG= 638744 
TBR=kbr@chromium.org
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

NOTRY=true

Review-Url: https://codereview.chromium.org/2306683002
Cr-Commit-Position: refs/heads/master@{#415985}

[modify] https://crrev.com/55cb1387798384b78903e1e5f7337427652c476e/content/test/gpu/gpu_tests/trace_test_expectations.py

Comment 50 by kbr@chromium.org, Sep 1 2016

What's the status of the investigation of this bug? The tracing subsystem appears to actually be unreliable and that needs to be fixed.

Project Member

Comment 51 by bugdroid1@chromium.org, Sep 1 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8b8ca0fecf1454e534b6007499a88591abbd7b6e

commit 8b8ca0fecf1454e534b6007499a88591abbd7b6e
Author: weiliangc <weiliangc@chromium.org>
Date: Thu Sep 01 19:17:15 2016

trace_test: Skip flaking test on Win

BUG= 638744 
TBR=kbr@chromium.org
NOTRY=true
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2304663002
Cr-Commit-Position: refs/heads/master@{#416011}

[modify] https://crrev.com/8b8ca0fecf1454e534b6007499a88591abbd7b6e/content/test/gpu/gpu_tests/trace_test_expectations.py

#50: In progress; apologies for the delay; it took much, much longer than I expected to get my Windows desktop back up and running (just in terms of techstop ticket latency).

In terms of seriousness, the problem is fairly narrow in scope: Scoped trace events added on a thread without a messageloop and hence using a shared buffer, will sometimes have the incorrect duration. The offending events in this case are only used by memory-infra, which use them for pseudo stacks and don't care about the duration, so the problem is limited in effect to causing flakes.

Comment 53 by kbr@chromium.org, Sep 1 2016

Thanks for the information. Why are these trace events causing problems for the GPU trace test then? Can src/content/test/gpu/gpu_tests/trace_test.py be adjusted to pass command line options that will disable the memory-infra trace events? The test only cares about seeing gl_category type events, as far as I can tell.

Project Member

Comment 54 by chromium...@appspot.gserviceaccount.com, Sep 2 2016

Detected 17 new flakes for test/step "trace_test (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw. This message was posted automatically by the chromium-try-flakes app.
#53: Yes, we could do that.

How the category_filters work is a mess, but I believe the current test setup of "included_categories": ["disabled-by-default-gpu.service", "disabled-by-default-gpu.device"] will also include all of the default categories as there's no default categories explicitly specified and hence '*' is assumed... Adding an exclusion for * should work:

config.chrome_trace_config.category_filter.AddExcludedCategory('*')

Comment 56 by kbr@chromium.org, Sep 2 2016

oysteine: could you please try modifying the test locally to do that and see whether it still passes?

Seem to work still; will try a few trybot runs.

https://codereview.chromium.org/2306033003/
Project Member

Comment 58 by chromium...@appspot.gserviceaccount.com, Sep 3 2016

Detected 3 new flakes for test/step "trace_test (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw. This message was posted automatically by the chromium-try-flakes app.
Project Member

Comment 59 by bugdroid1@chromium.org, Sep 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/2258e283c912c509fcd45b7f6141a9a37c6a6a45

commit 2258e283c912c509fcd45b7f6141a9a37c6a6a45
Author: oysteine <oysteine@chromium.org>
Date: Wed Sep 07 22:14:07 2016

Reenabled GPU trace_test and constrained enabled categories.

R=kbr@chromium.org
BUG= 638744 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel

Review-Url: https://codereview.chromium.org/2306033003
Cr-Commit-Position: refs/heads/master@{#417083}

[modify] https://crrev.com/2258e283c912c509fcd45b7f6141a9a37c6a6a45/content/test/gpu/gpu_tests/trace_test.py
[modify] https://crrev.com/2258e283c912c509fcd45b7f6141a9a37c6a6a45/content/test/gpu/gpu_tests/trace_test_expectations.py

Project Member

Comment 60 by chromium...@appspot.gserviceaccount.com, Sep 16 2016

Detected 3 new flakes for test/step "trace_test (with patch)". To see the actual flakes, please visit https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw. This message was posted automatically by the chromium-try-flakes app.

Comment 61 by kbr@chromium.org, Sep 17 2016

Thanks oysteine@ for your work on this. It looks like the Windows flakiness is fixed.

A few flaky timeouts showed up on Linux today. I'm turning down the logging verbosity in https://codereview.chromium.org/2343063005/ and will skip those tests:
trace_test.WebGLGreenTriangle.NoAA.Alpha
trace_test.WebGLGreenTriangle.AA.Alpha

if it doesn't clear it up.

https://chromium-try-flakes.appspot.com/all_flake_occurrences?key=ahVzfmNocm9taXVtLXRyeS1mbGFrZXNyIgsSBUZsYWtlIhd0cmFjZV90ZXN0ICh3aXRoIHBhdGNoKQw

I'll close this as fixed once the above CL lands. Thanks again for your help.

Project Member

Comment 62 by bugdroid1@chromium.org, Sep 17 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/911e6e4e3b6b47416dc89c0a61fb585f86bb1afa

commit 911e6e4e3b6b47416dc89c0a61fb585f86bb1afa
Author: kbr <kbr@chromium.org>
Date: Sat Sep 17 20:15:53 2016

Turn down verbosity in trace_test.

The logs are huge when these tests fail, and for no good reason as far
as I can tell. The huge amount of logging might actually be slowing
down the tests.

BUG= 638744 
CQ_INCLUDE_TRYBOTS=master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel;master.tryserver.chromium.android:android_optional_gpu_tests_rel
TBR=zmo@chromium.org

Review-Url: https://codereview.chromium.org/2343063005
Cr-Commit-Position: refs/heads/master@{#419381}

[modify] https://crrev.com/911e6e4e3b6b47416dc89c0a61fb585f86bb1afa/content/test/gpu/gpu_tests/trace_test.py

Comment 63 by kbr@chromium.org, Sep 19 2016

Status: Fixed (was: Assigned)
Things are looking good. Thank you oysteine@ for tracking down the cause of the Windows failures and fixing them!

Comment 64 by kbr@chromium.org, Sep 19 2016

Status: Verified (was: Fixed)

Sign in to add a comment