Issue metadata
Sign in to add a comment
|
Chromium crashes FATAL:site_engagement_service.cc(318) last_engagement_time() <= rebase_time |
||||||||||||||||||||||||
Issue descriptionVersion: 56.0.2887.0 OS: Win7 What steps will reproduce the problem? (1) Build Chromium (2) Run it (3) Type in omnibox RESULT: Browser crashes from failed DCHECK https://chromium.googlesource.com/chromium/src/+/709246a0f3afee77f19d056a6306d2f63b613fad --------------------------- Fatal error --------------------------- [4472:992:1010/162855:FATAL:site_engagement_service.cc(318)] Check failed: score.last_engagement_time() <= rebase_time (2016-10-10 15:42:53.045 UTC vs. 2016-10-10 13:28:55.681 UTC) Backtrace: base::win::RegistryValueIterator::ValueSize [0x1007B857+336407] base::win::RegistryValueIterator::ValueSize [0x100CAC7B+661051] policy::PolicyDetails::operator= [0x049316C5+21980873] policy::PolicyDetails::operator= [0x0493135B+21979999] policy::PolicyDetails::operator= [0x049278A8+21940396] policy::PolicyDetails::operator= [0x04927AB9+21940925] policy::PolicyDetails::operator= [0x04927C44+21941320] policy::PolicyDetails::operator= [0x04933F94+21991320] policy::PolicyDetails::operator= [0x0378162F+3434035] policy::PolicyDetails::operator= [0x046D03F2+19485686] policy::PolicyDetails::operator= [0x046CD767+19474283] policy::PolicyDetails::operator= [0x046CD834+19474488] policy::PolicyDetails::operator= [0x046CD909+19474701] policy::PolicyDetails::operator= [0x046D02C4+19485384] base::win::RegistryValueIterator::ValueSize [0x1004127E+97342] base::win::RegistryValueIterator::ValueSize [0x100809B1+357233] base::win::RegistryValueIterator::ValueSize [0x100F9606+851910] base::win::RegistryValueIterator::ValueSize [0x100F765C+843804] base::win::RegistryValueIterator::ValueSize [0x100F7A1C+844764] base::win::RegistryValueIterator::ValueSize [0x10101431+884209] base::win::RegistryValueIterator::ValueSize [0x10103710+893136] base::win::RegistryValueIterator::ValueSize [0x100FFA46+877574] gapfnScSendMessage [0x764862FA+818] GetThreadDesktop [0x76486D3A+215] CharPrevW [0x764877C4+312] DispatchMessageW [0x7648788A+15] base::win::RegistryValueIterator::ValueSize [0x1010256F+888623] base::win::RegistryValueIterator::ValueSize [0x10102761+889121] base::win::RegistryValueIterator::ValueSize [0x10101122+883426] base::win::RegistryValueIterator::ValueSize [0x10102E0B+890827] base::win::RegistryValueIterator::ValueSize [0x100F9359+851225] base::win::RegistryValueIterator::ValueSize [0x1019CC64+1521188] policy::PolicyDetails::operator= [0x04B15755+23963481] content::RenderThreadImpl::GetFileThreadTaskRunner [0x115E03FD+15492529] content::RenderThreadImpl::GetFileThreadTaskRunner [0x115E5D8F+15515459] content::RenderThreadImpl::GetFileThreadTaskRunner [0x115D3BEB+15441311] content::RenderThreadImpl::GetFileThreadTaskRunner [0x12EBD537+41563883] content::RenderThreadImpl::GetFileThreadTaskRunner [0x12EBD3F8+41563564] content::RenderThreadImpl::GetFileThreadTaskRunner [0x12EBB1D4+41554824] policy::PolicyDetails::operator= [0x036E7362+2802534] policy::PolicyDetails::operator= [0x0042E00F+114020] policy::PolicyDetails::operator= [0x00428E1C+93041] policy::PolicyDetails::operator= [0x004C7FEE+744771] policy::PolicyDetails::operator= [0x004C7E50+744357] policy::PolicyDetails::operator= [0x004C7CED+744002] policy::PolicyDetails::operator= [0x004C8008+744797] BaseThreadInitThunk [0x765D336A+18] RtlInitializeExceptionChain [0x77C59902+99] RtlInitializeExceptionChain [0x77C598D5+54] --------------------------- OK ---------------------------
,
Oct 11 2016
Hi Eric, was this with a clean profile? If not, can you run Chromium with a clean user-data-dir, and let me know if the crash happens then? Also see crbug.com/627550
,
Oct 11 2016
This was not a clean profile. I presume this code isn't even run in a Clean Profile because CleanupEngagementScores should only be running when IsLastEngagementStale == true, right? Before I could try with a clean profile, I pulled and rebuilt Chromium 56.0.2888.0, and the repro disappeared.
,
Oct 12 2016
Crash has reappeared with my Version 56.0.2889.0 build. Does not repro when I use the command line argument --user-data-dir=C:\temp\, but otherwise Chromium crashes on each startup. --------------------------- Fatal error --------------------------- [11068:2228:1012/142057:FATAL:site_engagement_service.cc(318)] Check failed: score.last_engagement_time() <= rebase_time (2016-10-14 11:25:46.575 UTC vs. 2016-10-12 11:20:57.648 UTC) Backtrace: base::PersistentHistogramAllocator::Id [0x1007B987+336371] base::PersistentHistogramAllocator::Id [0x100CADCB+661047] policy::PolicyDetails::operator= [0x048F1B65+21723503] policy::PolicyDetails::operator= [0x048F17FB+21722629] policy::PolicyDetails::operator= [0x048E79A8+21682098] policy::PolicyDetails::operator= [0x048E7BB9+21682627] policy::PolicyDetails::operator= [0x048E7D44+21683022] policy::PolicyDetails::operator= [0x048F44B4+21734078] policy::PolicyDetails::operator= [0x0377FBEF+3430905] policy::PolicyDetails::operator= [0x04689F12+19201308] policy::PolicyDetails::operator= [0x04687287+19189905] policy::PolicyDetails::operator= [0x04687354+19190110] policy::PolicyDetails::operator= [0x04687429+19190323] policy::PolicyDetails::operator= [0x04689DE4+19201006] base::PersistentHistogramAllocator::Id [0x100413AE+97306] base::PersistentHistogramAllocator::Id [0x10080AE1+357197] base::PersistentHistogramAllocator::Id [0x100FAB78+857060] base::PersistentHistogramAllocator::Id [0x100F89EC+848472] base::PersistentHistogramAllocator::Id [0x100F8DAC+849432] base::PersistentHistogramAllocator::Id [0x10102B20+889740] base::PersistentHistogramAllocator::Id [0x1010479B+897031] base::PersistentHistogramAllocator::Id [0x100FA86C+856280] base::PersistentHistogramAllocator::Id [0x1019E9B4+1528352] policy::PolicyDetails::operator= [0x04AD6D35+23710527] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x1160362D+15632204] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x11608FBF+15655134] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x115F6ECB+15581162] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EB03E7+41506054] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EB02A8+41505735] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EAE084+41496995] policy::PolicyDetails::operator= [0x036E3E22+2792492] policy::PolicyDetails::operator= [0x0042E00F+114020] policy::PolicyDetails::operator= [0x00428E1C+93041] policy::PolicyDetails::operator= [0x004C7FEE+744771] policy::PolicyDetails::operator= [0x004C7E50+744357] policy::PolicyDetails::operator= [0x004C7CED+744002] policy::PolicyDetails::operator= [0x004C8008+744797] BaseThreadInitThunk [0x765D336A+18] RtlInitializeExceptionChain [0x77C59902+99] RtlInitializeExceptionChain [0x77C598D5+54] --------------------------- OK --------------------------- --------------------------- Fatal error --------------------------- [10868:9528:1012/142126:FATAL:site_engagement_service.cc(318)] Check failed: score.last_engagement_time() <= rebase_time (2016-10-14 11:25:46.575 UTC vs. 2016-10-12 11:21:26.822 UTC) Backtrace: base::PersistentHistogramAllocator::Id [0x1007B987+336371] base::PersistentHistogramAllocator::Id [0x100CADCB+661047] policy::PolicyDetails::operator= [0x048F1B65+21723503] policy::PolicyDetails::operator= [0x048F17FB+21722629] policy::PolicyDetails::operator= [0x048E79A8+21682098] policy::PolicyDetails::operator= [0x048E7BB9+21682627] policy::PolicyDetails::operator= [0x048E7D44+21683022] policy::PolicyDetails::operator= [0x048F44B4+21734078] policy::PolicyDetails::operator= [0x0377FBEF+3430905] policy::PolicyDetails::operator= [0x04689F12+19201308] policy::PolicyDetails::operator= [0x04687287+19189905] policy::PolicyDetails::operator= [0x04687354+19190110] policy::PolicyDetails::operator= [0x04687429+19190323] policy::PolicyDetails::operator= [0x04689DE4+19201006] base::PersistentHistogramAllocator::Id [0x100413AE+97306] base::PersistentHistogramAllocator::Id [0x10080AE1+357197] base::PersistentHistogramAllocator::Id [0x100FAB78+857060] base::PersistentHistogramAllocator::Id [0x100F89EC+848472] base::PersistentHistogramAllocator::Id [0x100F8DAC+849432] base::PersistentHistogramAllocator::Id [0x10102B20+889740] base::PersistentHistogramAllocator::Id [0x1010479B+897031] base::PersistentHistogramAllocator::Id [0x100FA86C+856280] base::PersistentHistogramAllocator::Id [0x1019E9B4+1528352] policy::PolicyDetails::operator= [0x04AD6D35+23710527] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x1160362D+15632204] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x11608FBF+15655134] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x115F6ECB+15581162] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EB03E7+41506054] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EB02A8+41505735] IPC::MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void>::~MessageT<ScreenOrientationMsg_LockSuccess_Meta,std::tuple<int>,void> [0x12EAE084+41496995] policy::PolicyDetails::operator= [0x036E3E22+2792492] policy::PolicyDetails::operator= [0x0042E00F+114020] policy::PolicyDetails::operator= [0x00428E1C+93041] policy::PolicyDetails::operator= [0x004C7FEE+744771] policy::PolicyDetails::operator= [0x004C7E50+744357] policy::PolicyDetails::operator= [0x004C7CED+744002] policy::PolicyDetails::operator= [0x004C8008+744797] BaseThreadInitThunk [0x765D336A+18] RtlInitializeExceptionChain [0x77C59902+99] RtlInitializeExceptionChain [0x77C598D5+54] --------------------------- OK ---------------------------
,
Oct 13 2016
,
Oct 13 2016
Could this be some daylight savings thing?
,
Oct 13 2016
Either way from a quick scan of the code I think this DCHECK is problematic - I think it can fire in normal chrome operation if people change their clocks. The code should deal with this in some sane way.
,
Oct 13 2016
Thanks Eric for the follow-up. benwells: I don't think this is to do with daylight savings (that hasn't happened in the US yet), but I do see a potential issue when clocks move. This only comes about because of the more aggressive decay period. Score rebasing kicks in at 5 hours after no activity was detected, and tries to rebase scores to a time 4 hours in the past from right now. Let's say it's time 0 right now and you last used Chrome on site X 4 hours ago. Then you change your clock back a day, and then use Chrome on site Y, then leave it for 6 hours. Score rebasing kicks in. The overall last engagement time is time -18 hours. Site Y was last used at time -24 hours. But site X was last used at time -4 hours, and we try to rebase it to time -22 hours, which fails the DCHECK. I think this was potentially an issue before except the time periods involved were much bigger - you would have to change your clock by weeks to get it to kick in. I'll send a CL to fix.
,
Oct 13 2016
As far as I know, the clock on my system is behaving normally-- I definitely haven't seen anything indicating adjustments by anything even approaching a day, and this is a wired PC on Google's corporate network with the default time servers. Is this the sort of bug where the profile could've had bad data from long ago (due to whatever cause) and the problem will repeat indefinitely into the future until that cached data is cleared?
,
Oct 13 2016
Hmmm. Data from a long time ago shouldn't trigger this issue at all, because the DCHECK is that a site's last engagement time is *earlier* than the time we want to rebase the scores to. The bad data would have to set a site's last engagement time far in the future, because the rebase time is calculated relative to the current time. The problem shouldn't recur indefinitely because time moves forward, and eventually the rebase time will eclipse any last engagement times that aren't updated. Additionally, the code that the DCHECK failure is in should only ever be entered if the overall last engagement time is further in the past than the rebase_time, which is why the issue in comment #8 could occur (you have a last engagement time that's affected by your adjusted clock). I'm fairly sure the code is careful to update the overall last engagement time each time an individual score's last engagement time is changed. There's just a 5 minute difference between the two times listed in the crash, so perhaps there's some edge case here triggered when the more aggressive accumulation that was recently added.
,
Oct 13 2016
[11068:2228:1012/142057:FATAL:site_engagement_service.cc(318)] Check failed: score.last_engagement_time() <= rebase_time (2016-10-14 11:25:46.575 UTC vs. 2016-10-12 11:20:57.648 UTC) There's something weird going on in this line. You have a score with a last_engagement_time ~2 days in the future? Super curious! Something is moving your scores around in an odd way. Are you sure your machine hasn't had any clock movements? I think that's the most reasonable explanation of why you have a score time in the future (the moved scores might have caused the rebasing to actually move the last_engagement_time to something forward in time). How about I put in a fix for the clock moving wrt rebasing, and we'll see if you trigger this crash again once that fix lands. WDYT?
,
Oct 13 2016
I saw this problem without my computer's clock changing. It started happening just after I ran Chrome 52 on the same profile, then re-ran Chrome 55. The problem would occur on every startup after that. I have a copy of the profile data saved, if it's useful.
,
Oct 13 2016
elawrence, pkasting: If you can send me the your profile data that would be very useful. Now there's a second report there might be something systemic going on, and I'll need to look at the prefs and content settings to see exactly what's happening when the crash happens.
,
Oct 13 2016
Sent.
,
Oct 19 2016
elawrence, pkasting: i) do you remember if you closed Chrome gracefully, or if you killed it the last time it ran correctly before starting to crash? It may be that the inconsistency stemmed from the site engagement service's overall last engagement time not being persisted to disk while the individual score's last engagement time was being persisted. The pref system and when it persists content settings vs prefs is very confusing.
,
Oct 19 2016
I'm pretty sure I closed gracefully, since I started Chrome 52 not-under-a-debugger to take screenshots of some pre-MD UI, closed it, then launched 55.
,
Oct 19 2016
Thanks pkasting. I haven't managed to work out exactly how to recreate the data in your profile, but it has an overall last engagement time of X, but several individual scores with last engagement times later than X, which shouldn't happen. Every time a last engagement time is written, the overall one should be written too. My suspicion is that the per-score times are being persisted, but the overall last engagement time isn't due to some strangeness in the prefs system and the timing of writing out lossy prefs. I'm submitting a CL to change the DCHECKing bit of code to do more sanity checking, and to adjust times appropriately if it faces input that previously triggered the DCHECK. Thanks very much for the data to help debug here.
,
Oct 21 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/ed64b2a6aaeb060283ca5583abc1583a879b1ce1 commit ed64b2a6aaeb060283ca5583abc1583a879b1ce1 Author: dominickn <dominickn@chromium.org> Date: Fri Oct 21 02:25:51 2016 Make the site engagement service more robust to clock changes and time inconsistencies. This CL adjusts the site engagement service's logic for rebasing last engagement times when no engagement has been earned for some time. This code was prone to errors when users changed their clocks, where last engagement times could be moved into the future or into the past based on a clock change. This has only recently become an issue because the rebasing codepath now triggers if no engagement is earned for ~9 hours, as opposed to three weeks. The code was also brittle if the overall last engagement time (a lossy pref) became out of sync with any single engagement score's last engagement time (a lossy content setting). It is theoretically possible that the content setting would be written to prefs, but the overall last engagement time would not due to the prefs system buffering the lossy write and then failing to complete it before browser shutdown. This CL changes the rebase logic to also sanely handle unexpected times caused by changing clocks or pref writing race conditions. Time rebasing now occurs when the last engagement time is detected to be in the future as well as in the past. Regression tests are added to ensure the correct behaviour. BUG= 654560 Review-Url: https://chromiumcodereview.appspot.com/2427343002 Cr-Commit-Position: refs/heads/master@{#426696} [modify] https://crrev.com/ed64b2a6aaeb060283ca5583abc1583a879b1ce1/chrome/browser/engagement/site_engagement_service.cc [modify] https://crrev.com/ed64b2a6aaeb060283ca5583abc1583a879b1ce1/chrome/browser/engagement/site_engagement_service.h [modify] https://crrev.com/ed64b2a6aaeb060283ca5583abc1583a879b1ce1/chrome/browser/engagement/site_engagement_service_unittest.cc
,
Oct 25 2016
Requesting merge of #18 to M55.
,
Oct 26 2016
[Automated comment] There appears to be on-going work (i.e. bugroid changes), needs manual review.
,
Oct 26 2016
[Automated comment] There appears to be on-going work (i.e. bugroid changes), needs manual review.
,
Oct 26 2016
,
Oct 26 2016
Your change meets the bar and is auto-approved for M55 (branch: 2883)
,
Oct 27 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb commit 932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb Author: Dominick Ng <dominickn@chromium.org> Date: Thu Oct 27 00:01:50 2016 Make the site engagement service more robust to clock changes and time inconsistencies. This CL adjusts the site engagement service's logic for rebasing last engagement times when no engagement has been earned for some time. This code was prone to errors when users changed their clocks, where last engagement times could be moved into the future or into the past based on a clock change. This has only recently become an issue because the rebasing codepath now triggers if no engagement is earned for ~9 hours, as opposed to three weeks. The code was also brittle if the overall last engagement time (a lossy pref) became out of sync with any single engagement score's last engagement time (a lossy content setting). It is theoretically possible that the content setting would be written to prefs, but the overall last engagement time would not due to the prefs system buffering the lossy write and then failing to complete it before browser shutdown. This CL changes the rebase logic to also sanely handle unexpected times caused by changing clocks or pref writing race conditions. Time rebasing now occurs when the last engagement time is detected to be in the future as well as in the past. Regression tests are added to ensure the correct behaviour. BUG= 654560 Review-Url: https://chromiumcodereview.appspot.com/2427343002 Cr-Commit-Position: refs/heads/master@{#426696} (cherry picked from commit ed64b2a6aaeb060283ca5583abc1583a879b1ce1) Review URL: https://codereview.chromium.org/2452123003 . Cr-Commit-Position: refs/branch-heads/2883@{#328} Cr-Branched-From: 614d31daee2f61b0180df403a8ad43f20b9f6dd7-refs/heads/master@{#423768} [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service.cc [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service.h [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service_unittest.cc
,
Oct 27 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb commit 932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb Author: Dominick Ng <dominickn@chromium.org> Date: Thu Oct 27 00:01:50 2016 Make the site engagement service more robust to clock changes and time inconsistencies. This CL adjusts the site engagement service's logic for rebasing last engagement times when no engagement has been earned for some time. This code was prone to errors when users changed their clocks, where last engagement times could be moved into the future or into the past based on a clock change. This has only recently become an issue because the rebasing codepath now triggers if no engagement is earned for ~9 hours, as opposed to three weeks. The code was also brittle if the overall last engagement time (a lossy pref) became out of sync with any single engagement score's last engagement time (a lossy content setting). It is theoretically possible that the content setting would be written to prefs, but the overall last engagement time would not due to the prefs system buffering the lossy write and then failing to complete it before browser shutdown. This CL changes the rebase logic to also sanely handle unexpected times caused by changing clocks or pref writing race conditions. Time rebasing now occurs when the last engagement time is detected to be in the future as well as in the past. Regression tests are added to ensure the correct behaviour. BUG= 654560 Review-Url: https://chromiumcodereview.appspot.com/2427343002 Cr-Commit-Position: refs/heads/master@{#426696} (cherry picked from commit ed64b2a6aaeb060283ca5583abc1583a879b1ce1) Review URL: https://codereview.chromium.org/2452123003 . Cr-Commit-Position: refs/branch-heads/2883@{#328} Cr-Branched-From: 614d31daee2f61b0180df403a8ad43f20b9f6dd7-refs/heads/master@{#423768} [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service.cc [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service.h [modify] https://crrev.com/932ce7acd24d0b4b2f1fd2eb17be6fe5594962cb/chrome/browser/engagement/site_engagement_service_unittest.cc
,
Nov 2 2016
,
Nov 4 2016
[Automated comment] removing mislabelled merge-merged-2840
,
Nov 4 2016
[Automated comment] removing mislabelled merge-merged-2840 |
|||||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||||
Comment 1 by benwells@chromium.org
, Oct 11 2016