New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 134695 link

Starred by 6 users

Issue metadata

Status: Fixed
Closed: Jun 2012
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

  • Only users with EditIssue permission may comment.

Sign in to add a comment

Chrome: Crash Report - Stack Signature: base::`anonymous namespace'::OnNoMemory()-9...

Reported by, Jun 26 2012

Issue description

kaiwang@ could consistently reproduce it in his machine.

Product: Chrome
Stack Signature: base::`anonymous namespace'::OnNoMemory()-559748
New Signature Label: base::`anonymous namespace'::OnNoMemory()
New Signature Hash: 97a82f64_959b8e4e_9da16f17_f7d04147_3df7e38c

Report link: http://go/crash/reportdetail?reportid=dd23179213e07ffa

Meta information:
Product Name: Chrome
Product Version: 22.0.1186.0
Report ID: dd23179213e07ffa
Report Time: 2012/06/26 05:38:13, Tue
Uptime: 18 sec
Cumulative Uptime: 0 sec
OS Name: Windows NT
OS Version: 6.1.7601 Service Pack 1
CPU Architecture: x86
CPU Info: GenuineIntel family 6 model 37 stepping 5
ptype: browser

Thread 0 *CRASHED* ( EXCEPTION_BREAKPOINT @ 0x60d88fa5 )

0x60d88fa5	 [chrome.dll]	 -	
base::`anonymous namespace'::OnNoMemory()
0x608516d2	 [chrome.dll]	 -	
0x608515b4	 [chrome.dll]	 -	
0x6085abe6	 [chrome.dll]	 - xmemory:187]	
std::allocator::allocate(unsigned int)
0x60862f37	 [chrome.dll]	 - vector:751]	
std::vector,std::allocator >,std::allocator,std::allocator > > >::reserve(unsigned int)
0x60862ee5	 [chrome.dll]	 - vector:1297]	
std::vector,std::allocator >,std::allocator,std::allocator > > >::_Reserve(unsigned int)
0x60862e50	 [chrome.dll]	 - vector:991]	
std::vector,std::allocator >,std::allocator,std::allocator > > >::push_back(std::basic_string,std::allocator > const &)
0x6088b5a6	 [chrome.dll]	 -	
base::SplitStringT,std::allocator > >
0x6088b4ba	 [chrome.dll]	 -	
base::SplitString(std::basic_string,std::allocator > const &,char,std::vector,std::allocator >,std::allocator,std::allocator > > > *)
0x61669caf	 [chrome.dll]	 -	
TemplateURLService::CreateTemplateURLFromTemplateURLAndSyncData(Profile *,TemplateURL *,SyncData const &,std::vector > *)
0x6166befd	 [chrome.dll]	 -	
TemplateURLService::MergeDataAndStartSyncing(syncable::ModelType,std::vector > const &,scoped_ptr,scoped_ptr)
0x617b3599	 [chrome.dll]	 -	
0x617b3372	 [chrome.dll]	 -	
browser_sync::UIDataTypeController::StartAssociating(base::Callback const &)
0x61841f7d	 [chrome.dll]	 -	
0x61841209	 [chrome.dll]	 -	
0x61840fcd	 [chrome.dll]	 - bind_internal.h:938	
base::internal::InvokeHelper<1,void,base::internal::RunnableAdapter,void (base::WeakPtr const &,syncable::ModelType const &,SyncError const &)>::MakeItSo(base::internal::RunnableAdapter,base::WeakPtr const &,syncable::ModelType const &,SyncError const &)
0x61841025	 [chrome.dll]	 - bind_internal.h:1317	
base::internal::Invoker<1,base::internal::BindState,void (browser_sync::ModelAssociationManager *,syncable::ModelType,SyncError),void (base::WeakPtr)>,void (browser_sync::ModelAssociationManager *,syncable::ModelType,SyncError)>::Run(base::internal::BindStateBase *,syncable::ModelType const &,SyncError const &)
0x617b3323	 [chrome.dll]	 -	
0x617b1b22	 [chrome.dll]	 -	
browser_sync::AppNotificationDataTypeController::Observe(int,content::NotificationSource const &,content::NotificationDetails const &)
0x608f400d	 [chrome.dll]	 -	
NotificationServiceImpl::Notify(int,content::NotificationSource const &,content::NotificationDetails const &)
0x6166b637	 [chrome.dll]	 -	
0x6166bc25	 [chrome.dll]	 -	
TemplateURLService::OnWebDataServiceRequestDone(int,WDTypedResult const *)
0x60cd6a3f	 [chrome.dll]	 -	
0x608dcf02	 [chrome.dll]	 - bind_internal.h:1254	
base::internal::Invoker<2,base::internal::BindState,void (media::GpuVideoDecoder *,int),void (media::GpuVideoDecoder *,int)>,void (media::GpuVideoDecoder *,int)>::Run(base::internal::BindStateBase *)
0x60878029	 [chrome.dll]	 -	
MessageLoop::RunTask(base::PendingTask const &)
0x60876f07	 [chrome.dll]	 -	
0x609ebfa3	 [chrome.dll]	 -	
0x60876a98	 [chrome.dll]	 -	
0x60cc66b8	 [chrome.dll]	 -	
MessageLoopForUI::RunWithDispatcher(base::MessagePumpDispatcher *)
0x60cc6414	 [chrome.dll]	 -	
ChromeBrowserMainParts::MainMessageLoopRun(int *)
0x60cc638a	 [chrome.dll]	 -	
0x60cc62fc	 [chrome.dll]	 -	
`anonymous namespace'::BrowserMainRunnerImpl::Run()
0x608e138e	 [chrome.dll]	 -	
BrowserMain(content::MainFunctionParams const &)
0x60867b08	 [chrome.dll]	 -	
content::RunNamedProcessTypeMain(std::basic_string,std::allocator > const &,content::MainFunctionParams const &,content::ContentMainDelegate *)
0x60867a89	 [chrome.dll]	 -	
0x6085a2b7	 [chrome.dll]	 -	
content::ContentMain(HINSTANCE__ *,sandbox::SandboxInterfaceInfo *,content::ContentMainDelegate *)
0x6085a243	 [chrome.dll]	 -	
0x0101627c	 [chrome.exe]	 -	
MainDllLoader::Launch(HINSTANCE__ *,sandbox::SandboxInterfaceInfo *)
0x0101547b	 [chrome.exe]	 -	
RunChrome(HINSTANCE__ *)
0x010154e6	 [chrome.exe]	 -	
0x0106d962	 [chrome.exe]	 - crt0.c:275	
0x75ad3399	 [kernel32.dll]	 + 0x00013399]	
0x77e19ef1	 [ntdll.dll]	 + 0x00039ef1]	
0x77e19ec4	 [ntdll.dll]	 + 0x00039ec4]	
kaiwang, can you debug this and see what's actually going wrong here?

I can see at least one bad thing about this code but I'd be kind of surprised if it caused this crash.
Actually, this could be really bad.

SplitString() doesn't actually clear the destination vector.  As a result I think this code could blow up exponentially: every time the same TemplateURL is run through CreateTemplateURLFromTemplateURLAndSyncData(), the input encoding list doubles in size.

As part of the fix I probably need to run a normalization pass to crunch down anyone's data that has blown up this way.

Steve, is there a way we can quickly see how bad this is in the wild, by looking ion the sync servers for users whose TemplateURL input_encodings have too many entries (or are just too many characters)?

Comment 3 by, Jun 26 2012

cc+Tim who would know more about querying the server-side data. Tim - is there a query we can run quickly to see how pressing this new issue is?

kaiwang - Do you mind emailing me your "Web Data" file from the affected profile so I can see which search engine entry might have been prone to this? (perhaps any entry with some valid Encoding list, but I haven't looked at the code in detail yet)

Comment 4 by, Jun 26 2012

cc -Tim +Raz

Raz: First off, this issue is orthogonal to the DSP issues we've been tacking in the past two weeks. Secondly, let us know when you can help us with this query.

 Issue 134676  has been merged into this issue.
Sent my "Web Data" file to Steve.

Comment 7 by, Jun 26 2012

Inspecting the Web Data file with SQL Lite database browser, it appears that the encoding values are being duplicated as you expected. See the attached file for what the encodings might look like.

I also took a look at a random earlier Web Data file (probably from around M21 or so, but uncertain) and it appears that these values are OK (they show maybe one or two encoding types in the string, with no repeats).

So there appears to be two issues here:
1. The issue you mentioned, Peter, where SplitString isn't clearing the vector.
2. A more recent changed has caused this code to put TemplateURLs through that CreateTemplateURLFromTemplateURLAndSyncData method over and over again.

37.7 KB View Download
Labels: -Mstone-22 Mstone-20 ReleaseBlock-Stable
Status: Started
Using SplitString() incorrectly here was caused by r131224 which is part of M20.  I don't think there's another change that makes the CreateXXX method get called more frequently, I think it just takes a little bit of time for this to blow up.

We should not ship M20 with this bug as it will cause people's startup times to balloon and then Chrome to begin crashing constantly.  Fixing SplitString() to clear its vector is safe (I audited all callers) and will at least prevent the problem from worsening.  I can check that in now and request merges.  Correcting the existing damage will be a little trickier as I need to figure out precisely where to do the correction.
Peter, just so you know, I just pushed M20 to stable. Let's find out how bad is this bug in M20. We should certainly get it merged when the fix is properly verified in canary.
By the time you see this bug appear as a crash, it is way late to be addressing it.

I would halt the push for this.

Comment 11 by, Jun 26 2012

If the client tries to commit an entity that's larger than 10k, the server will return success but *not actually commit the item to durable storage*.

We are seeing over 15 / second commit attempts of large search engine entities.

I think we have some varz's counting occurrences. Standby..

Comment 12 by, Jun 26 2012

The second sentence (15 / second) in my comment 11 is incorrect.  Disregard.  New numbers coming.
I think that will help to reduce the growth of the bug here from "exponential" to "first exponential, then linear", but it won't halt the bug entirely (and the "linear" phase won't kick in until we have close to 10K of encodings, which means this will still grow fairly quickly).

Still waiting on try results for the first of two fixes here, but I'm busy coding the second as we speak.

Comment 14 by, Jun 26 2012

Yeah, I agree with Peter. I didn't mean to mitigate the concern. Dropping commits causes a steaming mess of corruption in addition to the OOM issue, and there is no great way to recover this corruption server side.  And having clients out of sync like that will lead to unrecoverable errors / weird crashes client side.

It looks like .3% of search engine commits server side are > 10k in size.  This is *a lot*. The scale behind that ratio is quite large.
Can we drill down to figure out what fraction of our existing entries have duplicate input_encoding values?
Also, if we have historical data, it'd be nice to know what the fraction has looked like over time.  If this has ramped up from a much smaller value over the last two months, that's another data point that suggests that this bug may be to blame.
When I look at the crash rates in http://crash, M20 isn't significantly affected. Majority of crashes were observed in M21.

Version        Crash counts
22.0.1187.0	46
22.0.1186.0	132
22.0.1185.0	76
22.0.1184.0	1
22.0.1183.1	48
22.0.1183.0	55
22.0.1182.0	80
22.0.1181.0	18
21.0.1180.4	51
21.0.1180.2	2
21.0.1180.11	26
21.0.1180.0	23
21.0.1179.0	5
21.0.1178.0	7
21.0.1171.0	2
21.0.1170.0	5
20.0.1132.43	2
20.0.1132.42	1
20.0.1132.39	3

Comment 18 by, Jun 26 2012

actually this also implies a huge jump from 1180 to 1182 considering anything past 1180 has only been on canary vs the 1180 numbers that are from dev channel. so there are two points where it gets bad. 1179-1180 and 1181-1182

Comment 19 by, Jun 26 2012

oh and clearly something made it even worse between 1185 and 1186
Yes, some change in M21 might have tickled pkasting@ code more compared to M20.
I think you're both mistaken in assuming that higher rates for a version mean something changed in that version.

This is a bug that, once the client is new enough to trigger it (anything r131224 and newer), will progressively worsen on the client, little by little, each time the client syncs.  Thus clients which are used the most frequently are most likely to show problems here.  These are also the clients likely to be on the newest versions.
(Note for example that people that people who are still using M20, i.e. beta users, will have had significantly less time using a buggy build, and probably also less usage per unit time, than Dev/Canary users.)

Comment 24 by, Jun 26 2012

This graph (http://shortn/_hqZFPRGyTE) shows the increase in frequency of large search engine commits on our stable server (which means beta channel and stable channel chrome only) around June 15th.  This graph (http://shortn/_iLlxCWG9uN) shows the same on our "dev" server instance, which means dev channel / canary chrome.  The rate is certainly higher on dev (mostly m21), but it is absolutely happening on m20 -- the 0.3% I referred to above was from the 'stable' instance alone.

(I can't yet explain the drop / bounce on the stable graph a few days after the initial spike, before spiking back up...)

Was this bug present in m20 when m20 was in dev channel?  Is it possible dev channel clients are more likely to hit the issue / have more search engines and hence may trigger this more easily?
Peter's change is part of M20 even before it branched. According to the stats in comment #17, there isn't any crashes on M20 dev.

This is an issue for sure in M20 but based on the current data it doesn't seem to be alarming. So I'm not going to rollback Stable at this moment. However, if we have a clean and simple patch, we can try spinning a build to stable either on Thursday or Friday.
Labels: Merge-Requested
First fix landed in r144323.

This should be safe to merge to M20 and M21 and will prevent the problem from getting any worse.
Project Member

Comment 27 by, Jun 26 2012

Summary: Chrome: Crash Report - Stack Signature: base::`anonymous namespace'::OnNoMemory()-9...
The following revision refers to this bug:

r144323 | | Tue Jun 26 16:37:02 PDT 2012

Changed paths:

Make SplitString() and variants clear their outparam vector.  (Note that SplitStringIntoKeyValues() and SplitStringIntoKeyValuePairs() already did this.)  This is more in line with what other APIs that take outparams do.

I audited all callers, and the only ones affected by this are the buggy ones in  bug 134695  that already wanted this behavior, as well as the couple of places in this CL that were manually calling clear() and now don't have to.

BUG= 134695 
Review URL:

Comment 28 by, Jun 27 2012

Thanks Peter for the fix! Since your patch is touching the file in /base, I'm little nervous on how this will affect other parts of chrome which calls it. Is there a simpler patch that will affect only your code and not the base function?

If there isn't, then let's get this patch baked in tomorrow's canary and see if it fixes or regresses any other issues.
Yes, we could merge a one-liner instead to M20 that just touched  I wouldn't want to do that for M21, as I don't want the trunk and the branch to get too out of sync, but I can put this together for M20 later tonight.
Meanwhile, fix part 2 is at .
For clarity, do you want me to just go ahead and land the one-liner fix on M20, or do you want to see it beforehand?

Basically, it's the addition of this line:


...before the SplitString() call in TemplateURLService::CreateTemplateURLFromTemplateURLAndSyncData().
Status: Fixed
Second fix in in r144387.

I would like to merge this fix to M21 but leave it out of M20.
r144390 fixed a compile error due to a near-simultaneous landing with another sync change (r144385); that shouldn't be merged back.
Project Member

Comment 35 by, Jun 27 2012

The following revision refers to this bug:

r144390 | | Tue Jun 26 20:45:52 PDT 2012

Changed paths:

Fix compile error due to interaction with akalin's r144385.

BUG= 134695 


Comment 36 by, Jun 27 2012

Yes, that's the change that I expected in M20. Could you please check out 1132_43 branch (yes, its _43) and add your change directly in the branch? Thanks!
Project Member

Comment 37 by, Jun 27 2012

Labels: merge-merged-1132_43
The following revision refers to this bug:

r144471 | | Wed Jun 27 10:13:23 PDT 2012

Changed paths:

Minimal mitigation for  bug 134695  for the M20 branch.

BUG= 134695 

Labels: -Mstone-20 -ReleaseBlock-Stable Mstone-21 ReleaseBlock-Beta
This has now been mitigated for M20.  I'd still like to merge the real fixes to M21.

Comment 39 by, Jun 27 2012

we have a little bit of time before beta2 cut for m21, can we let it soak in canary a bit?
Hey Peter - is there anything we can do to validate that this is improving things in canary? Perhaps play with sync for a while and ensure that we're seeing healthy Web Data contents?

When do you think we'll see Tim's graph of giant Search Engine commits calming down?
I'd look at the number of >10k submits coming from users who have this patch versus ones that don't.  If those 10k entries are caused by this bug, then this patch ought to eliminate them immediately.  It should also subsequently eliminate them from those users' other clients once those clients sync.
Project Member

Comment 43 by, Jun 29 2012

Labels: merge-merged-1132
The following revision refers to this bug:

r144962 | | Fri Jun 29 12:44:49 PDT 2012

Changed paths:

Merge 144471 - Minimal mitigation for  bug 134695  for the M20 branch.

BUG= 134695 
Review URL:

Comment 44 by, Jun 29 2012

will review/approve for m21 on july 9 since there's nothing going out before then.
Update on some numbers:

  - Not seeing this OOM crash anymore on canary channel (which confirms the fix)
  - For the current dev channel (21.0.1180.15), this OOM is more than 6% of browser crashes.
  - It is low incidence on the stable channel. Only saw a handful of these and they were in in 20.0.1132.43

Comment 46 by, Jul 9 2012

Labels: -Merge-Requested Merge-Approved
thanks eric!! ok let's go ahead and land it. 144387 is the one u want for m21, right?
Labels: -Merge-Approved Merge-Merged
Merged to 1180.
Issue 136211 has been merged into this issue.
Project Member

Comment 50 by, Oct 13 2012

Labels: Restrict-AddIssueComment-Commit
This issue has been closed for some time. No one will pay attention to new comments.
If you are seeing this bug or have new data, please click New Issue to start a new bug.
Project Member

Comment 51 by, Mar 10 2013

Labels: -Area-UI -Mstone-21 M-21 Cr-UI
Project Member

Comment 52 by, Mar 14 2013

Labels: -Restrict-AddIssueComment-Commit Restrict-AddIssueComment-EditIssue

Comment 53 by, Jul 24 2013


Sign in to add a comment