FATAL:top_sites_backend.cc(84)] Check failed: !db_. |
|||||||||
Issue description
,
Jul 31 2017
,
Jul 31 2017
This has been causing a lot of flakiness on linux_chromium_rel_ng trybots, so please treat this as a P1. Though I don't think the suspected range in comment 2 actually contain the culprit.
,
Jul 31 2017
For completeness, here is the backtrace: FATAL:top_sites_backend.cc(84)] Check failed: !db_. #0 0x000002d45397 base::debug::StackTrace::StackTrace() #1 0x000002d5de11 logging::LogMessage::~LogMessage() #2 0x000004b93511 history::TopSitesBackend::~TopSitesBackend() #3 0x000004b71b8b history::TopSitesImpl::~TopSitesImpl() #4 0x000004b71bc9 history::TopSitesImpl::~TopSitesImpl() #5 0x000002faeb0f ThumbnailListSource::~ThumbnailListSource() #6 0x000001e65b35 content::URLDataSourceImpl::~URLDataSourceImpl() #7 0x000001e61bae content::URLDataManager::DeleteDataSources() #8 0x000001a1cb90 content::BrowserMainLoop::ShutdownThreadsAndCleanUp() #9 0x000001a1eea1 content::BrowserMainRunnerImpl::Shutdown()
,
Jul 31 2017
,
Jul 31 2017
,
Jul 31 2017
OK, I spent 2 minutes reading r490764 and I don't see how it can cause this. Rather than just throwing out CL numbers as suspects, it may be more helpful to also say why it might be the cause.
,
Jul 31 2017
Apologies thestig@, it was identified as the only commit in a regression range in sheriff-o-matic for interactive_ui_tests failures, so I thought it might be worth pointing it out (and it's one commit before the suspect range that is mentioned earlier). You're right though, it can't cause this.
,
Jul 31 2017
Trying to reproduce locally. The crash was reproduced locally once, but I haven't encountered the second occurrence yet.
,
Jul 31 2017
This was a month ago, but given the issue is that db_ is intermittently not unset, could this be relevant? "Remove BrowserThread::DB usage in TopSites": https://chromium.googlesource.com/chromium/src/+/db7be2a305a4c064915a1e64773ebc78a0f55618 It did touch some shutdown code, so if it wasn't a 1:1 migration from BrowserThread::DB and MessageLoop it's probably worth investigating.
,
Jul 31 2017
Right - TopSitesBackend is asserting that Shutdown() was invoked on it, by TopSitesImpl::ShutdownOnUIThread - the implication is that the RefcountedKeyedService infrastructure is not invoking it, or the task posted to the DB thread is being dropped. Is the new sequenced worker that the class uses sufficiently high-priority to ensure that the task gets executed during shutdown?
,
Jul 31 2017
+yutak@ as Chromium Sheriff in Tokyo timezone.
,
Aug 1 2017
Issue 750957 has been merged into this issue.
,
Aug 1 2017
,
Aug 1 2017
Regarding the CL in comment 11: it uses SequencedTaskRunner, but is this okay? i.e. SequencedTaskRunner is weaker than SingleThreadTaskRunner in that the tasks may be invoked on different threads (but not simultaneously). This smells a bit.
,
Aug 1 2017
Or it may need to set a ShutdownBehavior in TaskTraits.
,
Aug 1 2017
OK this seems to be due to the CL in comment 11. The CL moved to use TaskScheduler to post tasks. However, TaskScheduler skips the remaining tasks on shutdown by default, so the posted task TopSitesBackend::ShutdownDBOnDBThread may not be run. As a bandaid, we can set BLOCK_SHUTDOWN shutdown behavior and this should fix the issue. I'm going to land this with TBR. In the long term, the use of BLOCK_SHUTDOWN is undesirable, so a better solution will be needed.
,
Aug 1 2017
https://chromium-review.googlesource.com/c/595350/ is in CQ. This should fix the current flakiness (I hope)
,
Aug 1 2017
For other people searching for this bug: It also affects linux_chromium_chromeos_rel_ng so it may explain the CQ failing your Chrome OS CLs.
,
Aug 1 2017
,
Aug 1 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2eda659509015d2a9e3d5908d9f0035fece774af commit 2eda659509015d2a9e3d5908d9f0035fece774af Author: Yuta Kitamura <yutak@chromium.org> Date: Tue Aug 01 04:45:19 2017 Execute TopSitesBackend's shutdown task reliably. Currently, we see many flaky DCHECKs saying the db_ variable is not cleared on TopSitesBackend's destruction. This was caused by a bug introduced in a change db7be2a, which migrates TopSitesBackend's task runner to task scheduler's. However, by default, the task scheduler doesn't run tasks remaining on shutdown, unlike the old task runner, which may cause the ShutdownDBOnDBThread() task to get skipped. This patch adds BLOCK_SHUTDOWN trait to run ShutDownDBOnDBThread() task reliably. This should stop the current flakiness. However, this trait is not desirable in a long term, so a better solution will need to be implemented. TBR=treib@chromium.org,sky@chromium.org Bug: 750786 Change-Id: If311de8949572c80862de035694e82d787d7e706 Reviewed-on: https://chromium-review.googlesource.com/595350 Commit-Queue: Yuta Kitamura <yutak@chromium.org> Reviewed-by: Scott Violet <sky@chromium.org> Reviewed-by: Yuta Kitamura <yutak@chromium.org> Cr-Commit-Position: refs/heads/master@{#490850} [modify] https://crrev.com/2eda659509015d2a9e3d5908d9f0035fece774af/components/history/core/browser/top_sites_backend.cc
,
Aug 1 2017
Linux trybots have become a lot better: https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng?numbuilds=200 The fix seems working well. There seems another flakiness around OmniboxViewTest. I'll file another bug for this.
,
Aug 1 2017
Thanks for fixing this, and sorry for the trouble! Task scheduling stuff is tricky... :-/ |
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by hirosh...@chromium.org
, Jul 31 2017