New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 836785 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 6
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Fuchsia
Pri: 1
Type: Bug-Regression



Sign in to add a comment

PipelineIntegrationTest.TrackStatusChangesWhileSuspended flakily hangs

Project Member Reported by w...@chromium.org, Apr 25 2018

Issue description

Observed under Fuchsia/x64/Debug FYI bot e.g. at https://ci.chromium.org/buildbot/chromium.fyi/Fuchsia%20%28dbg%29/18598.

Stack trace of the test main thread when hung is:

#01: pc 0x5fdf6e5ab8f0 sp 0x8289d9a81a8 (<vDSO>,0x68f0)
#02: pc 0x5fdf6e5ab5a9 sp 0x8289d9a81b0 (<vDSO>,0x65a9)
#03: pc 0x7aacdaddf4a1 sp 0x8289d9a81d0 (libc.so,0x434a1)
#04: pc 0x7aacdadb3a58 sp 0x8289d9a8210 (libc.so,0x17a58)
#05: base::ConditionVariable::Wait() at ??:?
#06: base::WaitableEvent::TimedWaitUntil(base::TimeTicks const&) at ??:?
#07: base::WaitableEvent::Wait() at ??:?
#08: base::MessagePumpDefault::Run(base::MessagePump::Delegate*) at ??:?
#09: base::MessageLoop::Run(bool) at ??:?
#10: base::RunLoop::Run() at ??:?
#11: media::PipelineIntegrationTestBase::RunUntilIdleEndedOrErrorInternal(base::RunLoop*, bool, bool) at pipeline_integration_test_base.cc:?
#12: media::PipelineIntegrationTestBase::RunUntilIdleOrEnded(base::RunLoop*) at pipeline_integration_test_base.cc:?
#13: media::PipelineIntegrationTestBase::WaitUntilOnEnded() at pipeline_integration_test_base.cc:?
#14: media::PipelineIntegrationTest_TrackStatusChangesWhileSuspended_Test::TestBody() at pipeline_integration_test.cc:?
#15: testing::Test* testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::TestFactoryBase, testing::Test*>(testing::internal::TestFactoryBase*, testing::Test* (testing::internal::TestFactoryBase::*)(), char const*) at gtest.cc:?
#16: void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*) at gtest.cc:?
#17: testing::Test::Run() at gtest.cc:?
#18: testing::TestInfo::Run() at gtest.cc:?
#19: testing::TestCase::Run() at gtest.cc:?
#20: testing::internal::UnitTestImpl::RunAllTests() at gtest.cc:?
#21: bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at gtest.cc:?
#22: bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*) at gtest.cc:?
#23: testing::UnitTest::Run() at gtest.cc:?
#24: RUN_ALL_TESTS() at test_suite.cc:?
#25: base::TestSuite::Run() at test_suite.cc:?
#26: void base::internal::FunctorTraits<void (media::AliveChecker::*)(), void>::Invoke<media::AliveChecker*>(void (media::AliveChecker::*)(), media::AliveChecker*&&) at alive_checker_unittest.cc:?
#27: void base::internal::InvokeHelper<false, void>::MakeItSo<void (media::AliveChecker::*)(), media::AliveChecker*>(void (media::AliveChecker::*&&)(), media::AliveChecker*&&) at alive_checker_unittest.cc:?
#28: int base::internal::Invoker<base::internal::BindState<int (base::TestSuite::*)(), base::internal::UnretainedWrapper<TestSuiteNoAtExit> >, int ()>::RunImpl<int (base::TestSuite::* const&)(), std::__1::tuple<base::internal::UnretainedWrapper<TestSuiteNoAtExit> > const&, 0ul>(int (base::TestSuite::* const&)(), std::__1::tuple<base::internal::UnretainedWrapper<TestSuiteNoAtExit> > const&, std::__1::integer_sequence<unsigned long, 0ul>) at run_all_unittests.cc:?
#29: base::internal::Invoker<base::internal::BindState<int (base::TestSuite::*)(), base::internal::UnretainedWrapper<TestSuiteNoAtExit> >, int ()>::Run(base::internal::BindStateBase*) at run_all_unittests.cc:?
#30: base::OnceCallback<int ()>::Run() && at unit_test_launcher.cc:?
#31: base::(anonymous namespace)::LaunchUnitTestsInternal(base::OnceCallback<int ()>, unsigned long, int, bool, base::OnceCallback<void ()>) at unit_test_launcher.cc:?
#32: base::LaunchUnitTests(int, char**, base::OnceCallback<int ()>) at unit_test_launcher.cc:?
#33: main at run_all_unittests.cc:?
#34: pc 0x7aacdadb6abe sp 0x8289d9aafe0 (libc.so,0x1aabe)
#35: pc 0 sp 0x8289d9ab000

Fuchsia/x64/Debug runs tests under nested virtualization, under the Fuchsia thread scheduler, and of course is a Debug build, so issues of this sort are usually race-conditions that will flake much less frequently on other bots.

This appears to have regressed following https://chromium-review.googlesource.com/c/chromium/src/+/899843 landing.  That CL reorganized how the pipeline handles state changes, and this test's failure mode appears to be to timeout, so this suggests a possible bug in the new implementation.

However, the test appears to rely on waiting for particular timestamps to be reached, which means it may also be flaky in the presence of unexpected delays (e.g. if the test system is under high load, so threads have less time to run).

 

Comment 1 by w...@chromium.org, Apr 30 2018

Cc: dalecur...@chromium.org
Ted, could you take a look at this ASAP, please?

Comment 2 by w...@chromium.org, May 3 2018

Cc: x...@chromium.org
+xjz, who modified the PipelineIntegrationTestBase implementation as per of the media remoting work.

Looking at this test-base, I've found two issues, though I'm not sure if either actually explains these flakes:

1. The PipelineIntegrationTestBase mixes EXPECT_CALL()s in with application logic. Interleaving call-expectations with calls to the mocked methods actually has undefined behaviour.

2. Various tests call WaitUntilEnded(), which appears to be designed to run until either ended or error, and then return True if no error occurred, but is actually not waiting on errors.

#2 looks the likely cause of the hangs; I expect that fixing it will cause the test to fail, in a more helpful way!

Comment 3 by w...@chromium.org, May 3 2018

Cc: tmathmeyer@chromium.org
Owner: w...@chromium.org
Status: Started (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, May 3 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/157985202218231a93bd09f12236d1f3687b851e

commit 157985202218231a93bd09f12236d1f3687b851e
Author: Wez <wez@chromium.org>
Date: Thu May 03 17:09:06 2018

Fix PipelineIntegrationTestBase::WaitUntilEnded() to fail on error.

Previously WaitUntilEnded() would not install an on-error closure, and
so would fail to exit on pipeline errors, causing the test to hang
rather than fail.

WaitUntilEnded() now waits for either end or error, and EXPECTs a
PIPELINE_OK status before returning.

Also tweaked the naming of RunUntilIdle* to RunUntilQuit to avoid
confusion with the RunLoop::RunIntilIdle() API behaviour.

Bug:  836785 
Change-Id: Idcf9bbec5ac0d7d78e2ba0530eaed0748a1626c8
Reviewed-on: https://chromium-review.googlesource.com/1041807
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#555777}
[modify] https://crrev.com/157985202218231a93bd09f12236d1f3687b851e/media/test/pipeline_integration_test_base.cc
[modify] https://crrev.com/157985202218231a93bd09f12236d1f3687b851e/media/test/pipeline_integration_test_base.h

Comment 5 by w...@chromium.org, May 12 2018

Cc: -tmathmeyer@chromium.org
Owner: tmathmeyer@chromium.org
We are still seeing this test hang, presumably still in the WaitUntilOnEnded(). I can't immediately see how we can manage to reach the RunUntilQuitOrEndedOrError() call in there if |ended_| is already true, so it seems the OnEnded is never dispatched, somehow?

tmathmeyer: Any ideas?
Project Member

Comment 6 by bugdroid1@chromium.org, May 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/11f4ce4066cbd1bda4f00ea56e6d9b41ac27adcb

commit 11f4ce4066cbd1bda4f00ea56e6d9b41ac27adcb
Author: Michał Pichliński <mpichlinski@opera.com>
Date: Tue May 22 16:32:27 2018

Wait for completion of changing enabled tracks in PipelineIntegrationTests.

Calling Seek/Resume on Pipeline when OnEnabledAudioTracksChanged or
OnSelectedVideoTrackChanged have not finished their job will end with a crash:
- PlaybackWithVideoTrackDisabledThenEnabled crashes on
  DCHECK(!(pending_audio_track_change_ || pending_video_track_change_));
  in RendererImpl::Flush when calling Seek after unfinished
  OnSelectedVideoTrackChanged,
- TrackStatusChangesWhileSuspended crashes on
  return time_source_->CurrentMediaTime();
  in RendererImpl::GetMediaTime() due to nullptr time_source_
  when OnEnabledAudioTracksChanged finishes its job after calling Resume.

In production code PipelineController is responsible for gating operations
via callbacks in order to prevent race conditions.

Sometimes RunUntilIdle may be insufficient for changing track state, instead
of it wait in the RunLoop until completion callback has been called.

Bug:  836785 
Change-Id: I8a784d668b6d92bebd90fafd6e6dbddb18c9ceb4
Reviewed-on: https://chromium-review.googlesource.com/1066015
Commit-Queue: Wez <wez@chromium.org>
Reviewed-by: Matthew Wolenetz <wolenetz@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#560640}
[modify] https://crrev.com/11f4ce4066cbd1bda4f00ea56e6d9b41ac27adcb/media/test/pipeline_integration_test.cc

Comment 7 by w...@chromium.org, May 24 2018

Ted, we are still seeing the TrackStatusChangesWhileSuspended test hanging pretty frequently under Fuchsia.
Cc: xhw...@chromium.org
Ted, can you prioritize this? Last we talked I indicated there may be an issue where the track change and ended may race and prevent delivery of the ended event.

We should make sure there isn't a lingering issue around ended signaling and track changes before M68 goes to stable or we may end up with some nasty hangs due to background track disable/enable.

Since I'll be OOO soon, +xhwang who is familiar with our pipeline and can help advise investigation.
Project Member

Comment 9 by bugdroid1@chromium.org, May 24 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c348474f11c937b8c3a19df36c9bb55211facf55

commit c348474f11c937b8c3a19df36c9bb55211facf55
Author: Kevin Marshall <kmarshall@chromium.org>
Date: Thu May 24 18:32:15 2018

Fuchsia: Filter PipelineIntegrationTest.TrackStatusChangesWhileSuspended.

PipelineIntegrationTest.TrackStatusChangesWhileSuspended times out
frequently on the FYI bots, so filter it out for now.

Bug:  836785 
Change-Id: Id209d83a72e16a52cd7cc8b9281bf167a63cf636
Reviewed-on: https://chromium-review.googlesource.com/1068617
Reviewed-by: Wez <wez@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#561561}
[modify] https://crrev.com/c348474f11c937b8c3a19df36c9bb55211facf55/testing/buildbot/filters/fuchsia.base_unittests.filter

Project Member

Comment 10 by bugdroid1@chromium.org, May 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/ebac442c103f42e897de49c16f6d831c6eae44b7

commit ebac442c103f42e897de49c16f6d831c6eae44b7
Author: Wez <wez@chromium.org>
Date: Fri May 25 17:21:00 2018

Revert "Fuchsia: Filter PipelineIntegrationTest.TrackStatusChangesWhileSuspended."

This reverts commit c348474f11c937b8c3a19df36c9bb55211facf55.

Reason for revert: Filter was added to the wrong test suite, so had no effect.

Original change's description:
> Fuchsia: Filter PipelineIntegrationTest.TrackStatusChangesWhileSuspended.
> 
> PipelineIntegrationTest.TrackStatusChangesWhileSuspended times out
> frequently on the FYI bots, so filter it out for now.
> 
> Bug:  836785 
> Change-Id: Id209d83a72e16a52cd7cc8b9281bf167a63cf636
> Reviewed-on: https://chromium-review.googlesource.com/1068617
> Reviewed-by: Wez <wez@chromium.org>
> Commit-Queue: Wez <wez@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#561561}

TBR=wez@chromium.org,kmarshall@chromium.org

Change-Id: I057d4db914418cfd2a65a03a827c36a72c9439b0
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  836785 
Reviewed-on: https://chromium-review.googlesource.com/1073491
Reviewed-by: Wez <wez@chromium.org>
Commit-Queue: Wez <wez@chromium.org>
Cr-Commit-Position: refs/heads/master@{#561928}
[modify] https://crrev.com/ebac442c103f42e897de49c16f6d831c6eae44b7/testing/buildbot/filters/fuchsia.base_unittests.filter

Project Member

Comment 11 by bugdroid1@chromium.org, May 25 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/cb593be96c0552c08d75c43076b569884874ba94

commit cb593be96c0552c08d75c43076b569884874ba94
Author: Ted Meyer <tmathmeyer@chromium.org>
Date: Fri May 25 21:03:31 2018

Fix for flaky track change test.

Omitting kNoClockless allowed the renderer to pull frames quickly rather
than trying to ratelimit them at the video fps. This was causing the
video to finish playback in the short timespan between being resumed and
suspended, causing the 'WaitUntilOnEnded' call to incorrectly return
early, as the 'ended_' attribute on the PipelineIntegrationTest had been
set to true. This ended_ boolean is only reset on a seek or a resume,
none of which had come after the suspend. This was flaking because
sometimes the video _wouldn't_ finish before the Suspend call, leaving
the pipeline test in an unended state.

Bug:  836785 
Change-Id: I89d4382354e4cfcd6593b69f6cc02d6821492e41
Reviewed-on: https://chromium-review.googlesource.com/1072955
Commit-Queue: Ted Meyer <tmathmeyer@chromium.org>
Reviewed-by: Wez <wez@chromium.org>
Reviewed-by: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#562008}
[modify] https://crrev.com/cb593be96c0552c08d75c43076b569884874ba94/media/test/pipeline_integration_test.cc

Status: Fixed (was: Started)

Sign in to add a comment