New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 784182 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Dec 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 1
Type: Bug



Sign in to add a comment

SafeBrowsingService browser_tests flaky on Win7

Project Member Reported by dgozman@chromium.org, Nov 12 2017

Issue description

Builders failed on Win7 Tests (1): https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%281%29

Example tests are:

MaybeSetMetadata/SafeBrowsingServiceMetadataTest.MalwareImg/0
MaybeSetMetadata/SafeBrowsingServiceMetadataTest.MalwareImg/1
SafeBrowsingServiceTest.SubResourceHitOnFreshTab
SafeBrowsingServiceTest.MalwareWithWhitelist
SafeBrowsingServiceWebSocketNoInterstitialTest.MalwareWebSocketBlocked/1
SafeBrowsingServiceWebSocketInterstitialTest.MalwareWebSocketBlocked/0
SafeBrowsingServiceTest.SubResourceHitWithMainFrameRendererInitiatedSlowLoad
... and more ...


They all crash in the same line:

Received fatal exception EXCEPTION_ACCESS_VIOLATION
Backtrace:
	rtc::BasicNetworkManager::`vector deleting destructor' [0x0577974D+125]
	safe_browsing::SafeBrowsingServiceTest::SetupResponseForUrl [0x01B3EFFE+76]
	safe_browsing::SafeBrowsingServiceTest_SubResourceHitWithMainFrameRendererInitiatedSlowLoad_Test::RunTestOnMainThread [0x01B42450+336]
	content::BrowserTestBase::ProxyRunTestOnMainThreadLoop [0x02F2F83F+239]
	ChromeBrowserMainParts::PreMainMessageLoopRunImpl [0x035F5547+3493]
	ChromeBrowserMainParts::PreMainMessageLoopRun [0x035F46E6+150]
	content::BrowserMainLoop::PreMainMessageLoopRun [0x02174E32+66]
	content::StartupTaskRunner::RunAllTasksNow [0x023DD42F+23]
	content::BrowserMainLoop::CreateStartupTasks [0x02173935+479]
	content::BrowserMainRunnerImpl::Initialize [0x02177A3B+507]
	content::BrowserMain [0x021722A8+136]
	content::RunNamedProcessTypeMain [0x02DDFE17+123]
	content::ContentMainRunnerImpl::Run [0x02DE02B2+146]
	service_manager::Main [0x03DC8C2C+620]
	content::ContentMain [0x02DDFD45+49]
	content::BrowserTestBase::SetUp [0x02F2F689+1273]
	InProcessBrowserTest::SetUp [0x02EA32C2+274]
	safe_browsing::SafeBrowsingServiceTest::SetUp [0x01B500C7+231]
	testing::Test::Run [0x01C9898D+103]
	testing::TestInfo::Run [0x01C990CB+201]
	testing::TestCase::Run [0x01C99459+235]
	testing::internal::UnitTestImpl::RunAllTests [0x01C9D183+595]
	testing::UnitTest::Run [0x01C9CE59+151]
	base::TestSuite::Run [0x02EB9848+100]
	ChromeTestSuiteRunner::RunTestSuite [0x054B2ECB+75]
	content::LaunchTests [0x02F24E01+353]
	LaunchChromeTests [0x054B31E3+143]
	main [0x054B2E41+109]
	__scrt_common_main_seh [0x05F01CBA+248] (f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl:283)
	BaseThreadInitThunk [0x7731336A+18]
	RtlInitializeExceptionChain [0x77B69882+99]
	RtlInitializeExceptionChain [0x77B69855+54]

Assigning to csharrison for triage.

 
Cc: yzshen@chromium.org jam@chromium.org
The earliest build that I can find with these failures is https://build.chromium.org/p/chromium.win/builders/Win7%20Tests%20%281%29/builds/73155 (r#514446). It's pretty near the bottom of the past 200 builds, so it's possible that there are earlier builds with the problem.

CC'ing a couple of folks who have been poking around SafeBrowsing and might know something.
Cc: csharrison@chromium.org
Owner: ----
Status: Untriaged (was: Assigned)
Moving myself to cc. I am not super familiar with these tests.

Comment 3 by yzshen@chromium.org, Nov 13 2017

Owner: yzshen@chromium.org
Status: Assigned (was: Untriaged)
It is possible that my change to the loading deferral logic reveals some previously hidden races? I enabled S13nSafeBrowsingParallelUrlCheck on our bulidbots on Oct 29.

I could take a look. (But please feel free to take over if you are interested in it.)

Comment 4 by kbr@chromium.org, Nov 14 2017

Cc: kbr@chromium.org
Please triage this urgently. It's causing failures on one of the main Chromium tryservers, win7_chromium_rel_ng, e.g.:

https://ci.chromium.org/buildbot/tryserver.chromium.win/win7_chromium_rel_ng/43662

Failed:

MaybeSetMetadata/SafeBrowsingServiceMetadataTest.MalwareMainFrame/0
ChromeTracingDelegateBrowserTest.BackgroundTracingThrottleTimeElapsed
SafeBrowsingServiceTest.MainFrameHitWithReferrer
SafeBrowsingServiceTest.SubResourceHitWithMainFrameBrowserInitiatedSlowLoad

[ RUN      ] SafeBrowsingServiceTest.SubResourceHitWithMainFrameBrowserInitiatedSlowLoad
[2564:424:1114/140537.405:WARNING:chrome_browser_main_win.cc(613)] Command line too long for RegisterApplicationRestart:  --brave-new-test-launcher --cfi-diag=0 --gtest_also_run_disabled_tests --gtest_filter=SafeBrowsingServiceTest.SubResourceHitWithMainFrameBrowserInitiatedSlowLoad --single_process --test-launcher-bot-mode --test-launcher-output="C:\Users\CHROME~2\AppData\Local\Temp\scoped_dir5360_20285\results5360_15136\test_results.xml" --test-launcher-summary-output="e:\b\swarm_slave\w\iogv9sxd\output.json" --user-data-dir="C:\Users\CHROME~2\AppData\Local\Temp\scoped_dir5360_20285\d5360_21467" --disable-offline-auto-reload --safebrowsing-disable-auto-update --no-first-run --no-default-browser-check --enable-logging=stderr --safebrowsing-disable-auto-update --disable-default-apps --wm-window-animations-disabled --disable-component-update --test-type=browser --force-color-profile=srgb --disable-zero-browsers-open-for-tests --ipc-connection-timeout=30 --allow-file-access-from-files --dom-automation --log-gpu-control-list-decisions --disable-backgrounding-occluded-windows --disable-gl-drawing-for-tests --override-use-software-gl-for-tests --force-color-profile=srgb --enable-features=TestFeatureForBrowserTest1 --disable-features=NetworkPrediction,TestFeatureForBrowserTest2 --flag-switches-begin --flag-switches-end --restore-last-session about:blank
[3668:5976:1114/140537.468:INFO:media_foundation_video_encode_accelerator_win.cc(370)] Windows versions earlier than 8 are not supported.
Received fatal exception EXCEPTION_ACCESS_VIOLATION
Backtrace:
	testing::internal::TuplePrefix<1>::ExplainMatchFailuresTo<std::tuple<testing::Matcher<security_interstitials::UnsafeResource const &> >,std::tuple<security_interstitials::UnsafeResource const &> > [0x011F1A47+3137]
	safe_browsing::SafeBrowsingServiceTest::SetupResponseForUrl [0x011DEAEC+76]
	safe_browsing::SafeBrowsingServiceTest_SubResourceHitWithMainFrameBrowserInitiatedSlowLoad_Test::RunTestOnMainThread [0x011E293C+332]
	content::BrowserTestBase::ProxyRunTestOnMainThreadLoop [0x02C4A8B9+265]
	ChromeBrowserMainParts::PreMainMessageLoopRunImpl [0x03492D47+3615]
	ChromeBrowserMainParts::PreMainMessageLoopRun [0x03491E6B+171]
	content::BrowserMainLoop::PreMainMessageLoopRun [0x019EB892+66]
	content::StartupTaskRunner::RunAllTasksNow [0x01D02E2C+30]
	content::BrowserMainLoop::CreateStartupTasks [0x019EA384+632]
	content::BrowserMainRunnerImpl::Initialize [0x019EE783+531]
	content::BrowserMain [0x019E8B38+136]
	content::RunNamedProcessTypeMain [0x02AB35A2+146]
	content::ContentMainRunnerImpl::Run [0x02AB3B26+278]
	service_manager::Main [0x03E322B7+675]
	content::ContentMain [0x02AB34B9+49]
	content::BrowserTestBase::SetUp [0x02C4A6E2+1634]
	InProcessBrowserTest::SetUp [0x02BA770A+346]
	safe_browsing::SafeBrowsingServiceTest::SetUp [0x011F0457+231]
	testing::Test::Run [0x013396CB+103]
	testing::TestInfo::Run [0x01339DA5+201]
	testing::TestCase::Run [0x0133A133+235]
	testing::internal::UnitTestImpl::RunAllTests [0x0133DED3+595]
...

Comment 7 by yzshen@chromium.org, Nov 16 2017

Status: Started (was: Assigned)
Will look at it now and report back progress. If it is not related to anything that I am familiar with or touched, I will ask for re-assign.

Comment 8 by kbr@chromium.org, Nov 16 2017

Cc: -dft@google.com dougt@chromium.org

Comment 9 by yzshen@chromium.org, Nov 16 2017

Cc: vakh@chromium.org
Owner: vakh@chromium.org
Status: Assigned (was: Started)
kbr@: May I ask why do you think this is mojo/servicification-related? (I noticed that you are CCing relevant people.) SafeBrowsing "Service" is not a mojo service. But if you have seen anything suggesting that it might be mojo/servicification-related, please let me know.

If I read the stack correctly, this access violation is very unlikely to be related to my recent change. -- the code that I touched is not involved in those tests.

Adding SafeBrowsing experts to better triage the bug. (At the same time I will continue looking)
Varun: Would you please triage this bug? Please feel free to assign it back to me if you think there is any possibility that it could be related to my recent work.

Thanks!

Comment 10 by vakh@chromium.org, Nov 16 2017

Taking a look. Is it possible to identify the culprit CL(s) or a range?

Comment 11 by kbr@chromium.org, Nov 16 2017

yzshen@: I don't have any particular evidence except your comment above https://bugs.chromium.org/p/chromium/issues/detail?id=784182#c3 .

The point is that this needs to be diagnosed and fixed urgently. It's causing a large number of flaky retries on one of Chrome's main trybots as well as hiding other failures.

kbr@: Sorry for any delay that I have caused! Will keep looking until it is fixed or the SB experts finds some better owner.

Comment 13 by vakh@chromium.org, Nov 17 2017

Status: Started (was: Assigned)
I am disabling the tests that have been listed in the description since that code is now deprecated and slated for removal soon.

I still haven't been able to find the rootcause.

Comment 14 by vakh@chromium.org, Nov 17 2017

https://crrev.com/c/775905 has been LGTM'd and is being submitted to CQ
I looked at the code and the issue may be caused by TestProtocolManager and TestSafeBrowsingDatabase being accessed from multiple threads.

SetupResponseForUrl is called on the main thread, but they could be accessed from IO thread.

Keep looking...

https://chromium-review.googlesource.com/c/chromium/src/+/776202 protects the data of those two classes with locks.

Trying it on win7_chromium_rel_ng trybot to see whether it addresses the issue.
Project Member

Comment 17 by bugdroid1@chromium.org, Nov 17 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/5f40f9c308e02afabd8f4c5f9c3e35fe14f91952

commit 5f40f9c308e02afabd8f4c5f9c3e35fe14f91952
Author: Varun Khaneja <vakh@chromium.org>
Date: Fri Nov 17 03:37:24 2017

Disable flaky PVer3 tests since PVer3 is no longer used

Bug:  784182 
Change-Id: If7935838a2b0bb296f943e8498039de12e6c585f
Reviewed-on: https://chromium-review.googlesource.com/775905
Commit-Queue: Varun Khaneja <vakh@chromium.org>
Reviewed-by: Varun Khaneja <vakh@chromium.org>
Reviewed-by: Yuzhu Shen <yzshen@chromium.org>
Reviewed-by: Nathan Parker <nparker@chromium.org>
Cr-Commit-Position: refs/heads/master@{#517279}
[modify] https://crrev.com/5f40f9c308e02afabd8f4c5f9c3e35fe14f91952/chrome/browser/safe_browsing/safe_browsing_service_browsertest.cc

Result of #16:

With lock of TestProtocolManager and TestSafeBrowsingDatabase I still see access violation. When trying to access the lock itself. That suggests that likely either TestProtocolManager or TestSafeBrowsingDatabase is invalid at that point.


One example on trybot with patch from comment #16:

https://logs.chromium.org/v/?s=chromium%2Fbb%2Ftryserver.chromium.win%2Fwin7_chromium_rel_ng%2F46277%2F%2B%2Frecipes%2Fsteps%2Fbrowser_tests__with_patch_%2F0%2Fstdout

Received fatal exception EXCEPTION_ACCESS_VIOLATION
Backtrace:
	RtlTryAcquireSRWLockExclusive [0x770B1E57+8]
	base::internal::LockImpl::Lock [0x02C80242+34]
	safe_browsing::SafeBrowsingServiceTest::SetupResponseForUrl [0x0133205D+63]
	safe_browsing::SafeBrowsingServiceTest_CheckResourceUrl_Test::RunTestOnMainThread [0x013388B0+240]
	content::BrowserTestBase::ProxyRunTestOnMainThreadLoop [0x02DB4B49+265]
	ChromeBrowserMainParts::PreMainMessageLoopRunImpl [0x035FC617+3615]
	ChromeBrowserMainParts::PreMainMessageLoopRun [0x035FB73B+171]
	content::BrowserMainLoop::PreMainMessageLoopRun [0x01B50362+66]
	content::StartupTaskRunner::RunAllTasksNow [0x01E6AF88+30]
	content::BrowserMainLoop::CreateStartupTasks [0x01B4EE44+632]
New patch uploaded to https://chromium-review.googlesource.com/c/chromium/src/+/776202
which fixed the issue that TestProtocolManager/TestSafeBrowsingDatabase could be accessed on the UI thread before they are created on the IO thread.

(Varun's CL on comment #17 should already eliminate the flakiness from our bot. This CL is for investigating the root cause.)

Okay, I verified that https://chromium-review.googlesource.com/c/chromium/src/+/776202 has fixed the access violation issue.

Varun: Are you planning to re-enable the tests or remove them? If you would like to re-enable them, I could land 776202.

In any case, I think we could lower the priority of this bug now. WDYT?
Labels: -Sheriff-Chromium

Comment 22 by kbr@chromium.org, Nov 17 2017

Labels: -Pri-0 Pri-1
Owner: jialiul@chromium.org
--> jialiul while vakh is out.
Hi, Jialiu.

Please see my comment #20. I have no concerns if you prefer removing the tests. If you would like to re-enable the tests, however, please review 776202.

Thanks!
Labels: SafeBrowsing-Triaged
Project Member

Comment 26 by bugdroid1@chromium.org, Dec 8 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c535dfb4753effb51ee45b1889f098223d72003b

commit c535dfb4753effb51ee45b1889f098223d72003b
Author: Yuzhu Shen <yzshen@chromium.org>
Date: Fri Dec 08 19:49:33 2017

safe_browsing_service_browsertest: protect data which is accessed from multiple threads.

BUG= 784182 

Change-Id: Ib9500aeeda7cf7984973ea745b398fe260d2c5b4
Reviewed-on: https://chromium-review.googlesource.com/776202
Reviewed-by: Jialiu Lin <jialiul@chromium.org>
Reviewed-by: Varun Khaneja <vakh@chromium.org>
Commit-Queue: Yuzhu Shen <yzshen@chromium.org>
Commit-Queue: Jialiu Lin <jialiul@chromium.org>
Cr-Commit-Position: refs/heads/master@{#522848}
[modify] https://crrev.com/c535dfb4753effb51ee45b1889f098223d72003b/chrome/browser/safe_browsing/safe_browsing_service_browsertest.cc

Status: Fixed (was: Started)

Sign in to add a comment