Issue 612523

Issue metadata

Status: Fixed
Owner:
Closed: May 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Several tests failing consistently on Win7 Tests (dbg)(1) since build #48776

Project Member Reported by qyears...@chromium.org, May 17 2016

Issue description

Recent builds:

https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29?numbuilds=50

Tests that are failing include:

telemetry_unittests: telemetry.internal.browser.browser_unittest.TestBrowserOperationDoNotLeakTempFiles
browser_tests (...some shards did not complete...)
interactive_ui_tests (> 20 tests) including AutofillInteractiveTest...
sync_integration_tests (> 100 tests) including TwoClientBookmarksSyncTest...
unit_tests (> 20 tests) including DownloadProtectionServiceTest...

Blame range for build #48776: https://chromium.googlesource.com/chromium/src/+log/71017aa70fb2feaa60c8178e253b100969800b63%5E..5c14299c1f3b0ff1926b9e05607afd96cfadd809?pretty=fuller
 
Cc: vakh@chromium.org aga...@chromium.org
Labels: Infra-Troopers
Not sure if this could be an infra issue -- would it make sense to try restarting the machine?

Otherwise the only idea I have is to try speculatively reverting CLs in that range

Comment 2 by vakh@chromium.org, May 17 2016

In the case of DownloadProtectionServiceTest tests, the stack trace for the crash looks very generic and has no symbols from that class. That looks suspicious.

Comment 3 by vakh@chromium.org, May 17 2016

In fact, all failed tests have an identical backtrace:

	(No symbol) [0x010AB18A]
	ovly_debug_event [0x048AAE8A+1879834]
	ovly_debug_event [0x048A31DA+1847914]
	RelaunchChromeBrowserWithNewCommandLineIfNeeded [0x029A20F8+8039160]
	RelaunchChromeBrowserWithNewCommandLineIfNeeded [0x028CADD1+7157713]
	RelaunchChromeBrowserWithNewCommandLineIfNeeded [0x028C3E46+7129158]
	RelaunchChromeBrowserWithNewCommandLineIfNeeded [0x028C3BE5+7128549]
	RelaunchChromeBrowserWithNewCommandLineIfNeeded [0x028CA2AC+7154860]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A17BDBE+89561]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A1AD8C4+293087]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A218AE0+731899]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A2169BD+723416]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A216FA4+724927]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A21FE43+761438]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A22106B+766086]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A218821+731196]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A2C0164+1417599]
	base::MessagePumpForUI::~MessagePumpForUI [0x0A2C01B6+1417681]
	(No symbol) [0x017AA1BA]
	SetCrashKeyValueImpl [0x0586B6F4+11595460]
	SetCrashKeyValueImpl [0x0587F365+11676469]
	SetCrashKeyValueImpl [0x0587F54D+11676957]
	SetCrashKeyValueImpl [0x0587F42F+11676671]
	SetCrashKeyValueImpl [0x0587F955+11677989]
	SetCrashKeyValueImpl [0x0586B7B4+11595652]
	SetCrashKeyValueImpl [0x0587F68F+11677279]
	(No symbol) [0x01B179DF]
	(No symbol) [0x01B17AFD]
	(No symbol) [0x01C2AB26]
	(No symbol) [0x01AF21AB]
	(No symbol) [0x01AF2168]
	(No symbol) [0x01AF2319]
	(No symbol) [0x01B5638E]
	(No symbol) [0x01B55921]
	(No symbol) [0x01B5577B]
	(No symbol) [0x01AF2436]
	IsSandboxedProcess [0x06BCA9BE+8147870]
	IsSandboxedProcess [0x06BCA8AA+8147594]
	IsSandboxedProcess [0x06BCA74D+8147245]
	IsSandboxedProcess [0x06BCA9D8+8147896]
	BaseThreadInitThunk [0x7702336A+18]
	RtlInitializeExceptionChain [0x77DD9882+99]
	RtlInitializeExceptionChain [0x77DD9855+54]
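
One way to check mechanically that the failures share a backtrace is to reduce each trace to its sequence of symbol names, ignoring addresses and offsets, and compare the results across tests. The sketch below is a minimal illustration only: the frame format is inferred from the trace above, and the names `parse_frame`, `backtrace_signature`, and `large_offset_frames`, as well as the 64 KiB threshold, are assumptions made for this example rather than part of any Chromium tooling. The large-offset check is a heuristic: an offset of several megabytes from a symbol often means the symbolizer fell back to the nearest exported name, which may be why the trace looks generic.

```python
import re

# Matches frames such as "ovly_debug_event [0x048AAE8A+1879834]" or
# "(No symbol) [0x010AB18A]". The format is inferred from the trace above.
FRAME_RE = re.compile(
    r"^\s*(?P<symbol>.+?)\s+\[0x(?P<addr>[0-9A-Fa-f]+)(?:\+(?P<offset>\d+))?\]\s*$"
)


def parse_frame(line):
    """Return (symbol, address, offset) for one frame line, or None if it is not a frame."""
    m = FRAME_RE.match(line)
    if m is None:
        return None
    offset = m.group("offset")
    return (
        m.group("symbol"),
        int(m.group("addr"), 16),
        int(offset) if offset is not None else None,
    )


def backtrace_signature(text):
    """Reduce a backtrace to a tuple of symbol names, dropping addresses and offsets."""
    frames = (parse_frame(line) for line in text.splitlines())
    return tuple(frame[0] for frame in frames if frame is not None)


def large_offset_frames(text, threshold=0x10000):
    """List (symbol, offset) pairs whose offset exceeds the (arbitrary) threshold."""
    frames = (parse_frame(line) for line in text.splitlines())
    return [
        (frame[0], frame[2])
        for frame in frames
        if frame is not None and frame[2] is not None and frame[2] > threshold
    ]
```

Two failing tests whose logs produce equal `backtrace_signature` values crashed along the same symbolized path, which points to a shared cause rather than to the individual test classes.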

It might be worth restarting the machine before reverting the CLs to identify any culprit(s).
At around 12:20, agable restarted the machine. It is now in a weird state: it says build 48789 is still running even though 48790 has completed.

After that, a full build completed but the same tests failed again.
Cc: sadrul@chromium.org
Note: Tried reverting 26771182e06d28496aca4c73e095d3c9a43713ce (Reland: Initialize and reset V4LocalDBManager, r393989) but that didn't appear to help.

Now choosing another CL in that range to try to revert: https://codereview.chromium.org/1985283002/ (reverts r393985)
Relanding that CL (r393985), since sadrul pointed out that the code it modifies doesn't get built on Windows.

The remaining CLs in the blame range are:
5c14299 [Media Router WebUI] Add slight delay before requesting initial data for Mac. by apacible · 23 hours ago
824ed16 Update WebGL2 conformance test expectations on Linux. by zmo · 23 hours ago
6f4816f Roll src/third_party/skia/ d2c7ef949..edea94c35 (2 commits). by skia-deps-roller · 24 hours ago
eb23b41 Make sure ScriptWrappables have a wrapper before calling setReference() by adamk · 24 hours ago

Note: it's still possible that the machine is in a weird state, even though restarting it once apparently didn't help.
Update: the bot is still failing (and it still says it's doing build 48789 even though it's on to 48806). Looked again at the CLs in the blame range; everything looks irrelevant to these test failures.

Comment 8 by aga...@chromium.org, May 18 2016

It will continue to say it is still running 48789 until the master restarts. That should be a totally unrelated problem.

I don't think restarting the bot will help any further: all of the failures are on swarming, and they're all actual failures or timeouts, not capacity issues, as far as I can tell.

Comment 9 by sadrul@chromium.org, May 18 2016

Cc: -sadrul@chromium.org
The problem also occurs on the trybot win_chromium_dbg_ng, so I'm reverting each CL in turn on win_chromium_dbg_ng to look for the culprit.
Culprit: 26771182e06d28496aca4c73e095d3c9a43713ce (Reland: Initialize and reset V4LocalDBManager, r393989) 
- On win_chromium_dbg_ng, reverting the CL fixed the test failure (my local CL: https://codereview.chromium.org/1993133002)
- Comment 5 says reverting the commit didn't appear to help. https://codereview.chromium.org/1984283003/ says "Revert didn't help the problem (tests still failing as before".
  However, the tests passed on https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%28dbg%29%281%29/builds/48793, where the commit was reverted, and the tests started failing again after that. Perhaps the commit was relanded?

I'll revert the commit again.
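
The search described in this comment, reverting one CL at a time and rerunning the tests, can be sketched as a simple linear loop. This is an illustration under stated assumptions, not the tooling that was actually used: `find_culprit` and the `passes_with_revert` callback are hypothetical names, and the callback stands in for whatever step applies a revert, runs the trybot, and reports whether the tests pass.

```python
from typing import Callable, Iterable, Optional


def find_culprit(
    candidates: Iterable[str],
    passes_with_revert: Callable[[str], bool],
) -> Optional[str]:
    """Return the first CL whose revert makes the tests pass, or None if none does.

    `passes_with_revert` is a hypothetical callback: given a CL identifier, it
    should revert only that CL, run the failing tests, and return True when
    they pass.
    """
    for cl in candidates:
        if passes_with_revert(cl):
            return cl
    return None
```

As the thread shows, the result is only meaningful if it is read from a build that actually contains the revert and nothing else; checking a build after the CL has been relanded can make a correct revert look ineffective.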
Labels: -Infra-Troopers
The builder seems to be healthy since 2:13am today. Is it fixed then?

Either way, I don't think it's a trooper issue anymore, removing the label.
Status: Fixed (was: Assigned)
After I reverted r393989, the bot turned green. Fixed.
\o/ thanks hiroshige, I definitely relanded too early the first time and didn't look closely enough.
