TabRestoreTest.RestoreTabWithSpecialURL flaky under CrOS ASan |
|||||
Issue descriptionIt looks like TabRestoreTest.RestoreTabWithSpecialURL has a flaky failure on "Linux Chromium OS ASan LSan Tests (1)". This test seems to flake very rarely. I see one failure in the past 200 runs now from this test. I had also noted a failure in https://crbug.com/742623#c12 . I didn't see any failures from TabRestoreTest.RestoreTabWithSpecialURL at later dates and assumed the test was fixed. Here is the failing run: https://build.chromium.org/p/chromium.memory/builders/Linux%20Chromium%20OS%20ASan%20LSan%20Tests%20%281%29/builds/24926 Here is the relevant part of the logs for the leak (which is truncated and pretty useless): <truncated (513094 bytes)> lib/connector.cc:469:10 #16 0x116a1523 in mojo::Connector::OnHandleReadyInternal(unsigned int) mojo/public/cpp/bindings/lib/connector.cc:374:3 #17 0x7960e1f in Run base/callback.h:94:12 #18 0x7960e1f in mojo::SimpleWatcher::DiscardReadyState(base::RepeatingCallback<void (unsigned int)> const&, unsigned int, mojo::HandleSignalsState const&) mojo/public/cpp/system/simple_watcher.h:193 #19 0x11717b4f in Run base/callback.h:94:12 #20 0x11717b4f in mojo::SimpleWatcher::OnHandleReady(int, unsigned int, mojo::HandleSignalsState const&) mojo/public/cpp/system/simple_watcher.cc:276 #21 0x11718d84 in Invoke<const base::WeakPtr<mojo::SimpleWatcher> &, const int &, const unsigned int &, const mojo::HandleSignalsState &> base/bind_internal.h:194:12 #22 0x11718d84 in MakeItSo<void (mojo::SimpleWatcher::*const &)(int, unsigned int, const mojo::HandleSignalsState &), const base::WeakPtr<mojo::SimpleWatcher> &, const int &, const unsigned int &, const mojo::HandleSignalsState &> base/bind_internal.h:297 #23 0x11718d84 in void base::internal::Invoker<base::internal::BindState<void (mojo::SimpleWatcher::*)(int, unsigned int, mojo::HandleSignalsState const&), base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState>, void ()>::RunImpl<void (mojo::SimpleWatcher::* const&)(int, unsigned int, mojo::HandleSignalsState const&), std::__1::tuple<base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState> const&, 0ul, 1ul, 2ul, 3ul>(void (mojo::SimpleWatcher::* const&)(int, unsigned int, mojo::HandleSignalsState const&), std::__1::tuple<base::WeakPtr<mojo::SimpleWatcher>, int, unsigned int, mojo::HandleSignalsState> const&, std::__1::integer_sequence<unsigned long, 0ul, 1ul, 2ul, 3ul>) base/bind_internal.h:351 #24 0xd242eee in Run base/callback.h:65:12 #25 0xd242eee in base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) base/debug/task_annotator.cc:57 #26 0xcaae93f in blink::scheduler::TaskQueueManager::ProcessTaskFromWorkQueue(blink::scheduler::internal::WorkQueue*, bool, blink::scheduler::LazyNow, base::TimeTicks*) third_party/WebKit/Source/platform/scheduler/base/task_queue_manager.cc:522:19 #27 0xcaa7218 in blink::scheduler::TaskQueueManager::DoWork(bool) third_party/WebKit/Source/platform/scheduler/base/task_queue_manager.cc:313:13 #28 0xcab5e3e in Invoke<const base::WeakPtr<blink::scheduler::TaskQueueManager> &, const bool &> base/bind_internal.h:194:12 #29 0xcab5e3e in MakeItSo<void (blink::scheduler::TaskQueueManager::*const &)(bool), const base::WeakPtr<blink::scheduler::TaskQueueManager> &, const bool &> base/bind_internal.h:297 #30 0xcab5e3e in RunImpl<void (blink::scheduler::TaskQueueManager::*const &)(bool), const std::__1::tuple<base::WeakPtr<blink::scheduler::TaskQueueManager>, bool> &, 0, 1> base/bind_internal.h:351 #31 0xcab5e3e in base::internal::Invoker<base::internal::BindState<void (blink::scheduler::TaskQueueManager::*)(bool), base::WeakPtr<blink::scheduler::TaskQueueManager>, bool>, void ()>::Run(base::internal::BindStateBase*) base/bind_internal.h:333 #32 0xd242eee in Run base/callback.h:65:12 #33 0xd242eee in base::debug::TaskAnnotator::RunTask(char const*, base::PendingTask*) base/debug/task_annotator.cc:57 #34 0xd4cafde in base::internal::IncomingTaskQueue::RunTask(base::PendingTask*) base/message_loop/incoming_task_queue.cc:128:19 #35 0xd2bf6fc in base::MessageLoop::RunTask(base::PendingTask*) base/message_loop/message_loop.cc:394:25 #36 0xd2c02cc in base::MessageLoop::DeferOrRunPendingTask(base::PendingTask) base/message_loop/message_loop.cc:406:5 #37 0xd2c0c23 in base::MessageLoop::DoWork() base/message_loop/message_loop.cc:450:16 #38 0xd2c5912 in base::MessagePumpDefault::Run(base::MessagePump::Delegate*) base/message_loop/message_pump_default.cc:37:31 SUMMARY: AddressSanitizer: 211569 byte(s) leaked in 1171 allocation(s). +sky for help with triage since you're the only one in the chrome/browser/sessions/OWNERS file. It doesn't look like anyone has modified the test in years at this point. Any ideas?
,
Nov 28 2017
Here's a link to the flakiness dashboard: https://test-results.appspot.com/dashboards/flakiness_dashboard.html#tests=TabRestoreTest.RestoreTabWithSpecialURL. It looks like it was failing approx every other run for a while, but now it seems less so. The truncated log is a problem. That looks like a feature of test_launcher. +dpranke: do you know who to talk to about changing the way ASan logs are generated so that the useful part of the stack isn't chopped off? Alternatively, it reasonable to increase the max log length for ASan tests?
,
Nov 29 2017
Yes, we intentionally truncate logs that are overly large. There was some discussion on this last year: https://groups.google.com/a/chromium.org/d/topic/infra-dev/Hh5XeJuvpfo/discussion . The logic in the code seems a bit puzzling; it looks like we check to see if we got > 300k of output, and, if so, throw away 5000 lines of code. https://cs.chromium.org/chromium/src/base/test/launcher/test_launcher.cc?l=602 We could certainly play with this, but I'd prefer it if we focused more on eliminating useless log messages, it looks like there are a lot of them in that log.
,
Dec 5 2017
I was able to reproduce the failure locally when I tried again. I've attached the full test output because it's very long. There are a bunch of memory leaks coming from blink. Could this be because the test case runs with --single-process and something in blink isn't shut down correctly? dpranke: The output There are ~100 lines of output from the test case itself. 80 of those lines are kinda useless, but I don't think that is the issue, because there is another ~3000 lines of output from ASan about leaks. It seems like the large number of leaks is the problem here? Looking at the logic in TestLauncher::OnTestFinished() *I think* if the test output is over 300k then we throw away everything except the last 5000 characters. The comments/variable names are misleading.
,
Dec 5 2017
Actually, I think there is a bug in TestLauncher::OnTestFinished(). It uses the wrong constant when truncating, so instead of the last 300kB of log output we get the last 5000 bytes. kOutputSnippetLinesLimit is used instead of kOutputSnippetBytesLimit. +phajdan.jr: Does this change look correct to you? https://crrev.com/c/809085
,
Dec 5 2017
I think you just rediscovered what I wrote in #c3 :).
,
Dec 6 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8c25eab6514af05f6ce3dc1e40d83764410ea3f9 commit 8c25eab6514af05f6ce3dc1e40d83764410ea3f9 Author: kylechar <kylechar@chromium.org> Date: Wed Dec 06 02:25:19 2017 Truncate test output by kOutputSnippetBytesLimit. If test output is over kOutputSnippetBytesLimit, which is 300kb, then TestLauncher::OnTestFinished() trunactes the string to the last 5000 bytes. It looks like this a mix up of kOutputSnippetBytesLimit and kOutputSnippetLinesLimit. Bug: 788739 Change-Id: I4aacc4b62f51f764143060bdaeef00031053d1b5 Reviewed-on: https://chromium-review.googlesource.com/809085 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Reviewed-by: Scott Violet <sky@chromium.org> Commit-Queue: kylechar <kylechar@chromium.org> Cr-Commit-Position: refs/heads/master@{#521957} [modify] https://crrev.com/8c25eab6514af05f6ce3dc1e40d83764410ea3f9/base/test/launcher/test_launcher.cc
,
Dec 6 2017
dpranke: Yep, it appears so! The test output is more complete when TabRestoreTest.RestoreTabWithSpecialURL fails on the bots now. I still have no idea what is going wrong and causing so many memory leaks from blink code though.
,
Apr 13 2018
,
Aug 3
This bug has an owner, thus, it's been triaged. Changing status to "assigned". |
|||||
►
Sign in to add a comment |
|||||
Comment 1 by kylec...@chromium.org
, Nov 27 2017