chrome://tracing can crash the browser if closed between Record being pressed and the dialog being displayed
Issue description

Chrome Version: 57.0.2951.0 dev
OS: ChromeOS Panther

What steps will reproduce the problem?
[(0) I am running with Memory Coordinator enabled, so this may be an MC bug ;)]
(1) Use ChromeOS for a while (I have six windows, each with 3+ tabs, open).
(2) Open chrome://tracing.
(3) Hit Record.
(4) If you don't see the recording dialog come up, use Task Manager to kill the process.
(5) Repeat #2 and #3.

What is the expected result?
Expect that the record dialog shows up promptly after #3.

What happens instead?
The recording dialog always takes 2-3 seconds to appear, and at one point (with Chrome having been "up" and in active use for several hours) it didn't respond at all, so I did #4 and #5, at which point the OS crashed out (see crash ID d106ec1300000000).
,
Dec 17 2016
,
Dec 19 2016
Eeeh, this is hanging while dumping metadata, which is a bunch of strings and should take a few milliseconds at most. Do you happen to have a trace that took time (but didn't crash) coming from CrOS? I think it should be pretty easy to spot the culprit by looking at which _metadata event has some enormous value in the trace.
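A rough sketch of the check suggested above: rank the metadata entries of a saved trace by serialized size, so an oversized value stands out. It assumes the trace has been loaded as a JSON object with a top-level "metadata" dictionary; the helper name and the sample values are hypothetical and only for illustration.

```python
import json


def rank_metadata_by_size(trace, top=5):
    """Return (key, serialized_size) pairs, largest first."""
    # Assumption: metadata lives under a top-level "metadata" dict.
    metadata = trace.get("metadata", {})
    sizes = [(key, len(json.dumps(value))) for key, value in metadata.items()]
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes[:top]


# Hypothetical sample standing in for a real trace file.
sample_trace = {
    "traceEvents": [],
    "metadata": {
        "os-name": "CrOS",
        "gpu-gl-vendor": "x" * 100000,  # deliberately oversized value
        "num-cpus": 4,
    },
}

ranked = rank_metadata_by_size(sample_trace)
for key, size in ranked:
    print(key, size)
```

For a real trace, the dictionary would come from json.load() on the saved file instead of the inline sample.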
,
Dec 19 2016
Re #3: No; trying to take a trace (at least with memory-infra ticked) always crashes (see issue 643438). Perhaps the slowness is related to the memory-infra tracing brokenness; I'll try a trace with some other components ticked.
,
Dec 19 2016
Re #3: Oh, I think there's a misunderstanding: This bug is for it taking a long time after pressing Record before the dialog to select what to record is even displayed, so there isn't any trace data at this point, surely? (same as issue 604885 ; we could perhaps close that since it was originally specific to performance event logging)
,
Dec 19 2016
> Perhaps the slowness is related to the memory-infra tracing brokenness
Is there a specific bug you are thinking of? memory-infra is disabled by default, so it doesn't easily explain why you would get into this state. Or does it happen only when ticking memory-infra?
,
Dec 19 2016
Re memory-infra brokenness: Yes, I was thinking of issue 643438. ;) Re the explanation: No, I agree; since this is happening before I've even had the opportunity to choose what to record, presumably it can't be related to any specific tracing type. Based on the crash stack it looks like we do gather some metadata when the user hits Record, before we show the dialog, though?
,
Dec 19 2016
> Re #3: Oh, I think there's a misunderstanding: This bug is for it taking a long time after pressing Record before the dialog to select what to record is even displayed, so there isn't any trace data at this point, surely?
Well, the record dialog IIRC starts and stops tracing for a small interval of time to populate the categories. But the disabled-by-default categories are still disabled. From your stack trace this is really the metadata population (which should be really dumb code) failing. Maybe one of the strings (e.g. the GPU vendor or such) is not null-terminated and we end up trying to dump the whole heap when we get to that string? If you have a debugger, this would be quite easy to narrow down by adding a breakpoint to TracingControllerImpl::AddFilteredMetadata and seeing which key causes the hang.
,
Dec 19 2016
RE #7: Ah, I see (yup, unfortunately we have coverage on a lot of devices on the perf waterfall but not on CrOS. We are looking into adding more browsertest coverage as part of Issue 670828). Anyway, I'd rule it out in this bug. That cannot be triggered unless you explicitly tick memory-infra, which is not the case here because, as you say, this happens even before we start recording.
,
Dec 19 2016
Re #9: Agreed; memory-infra does not seem to be the issue here. :)

Re #8: I believe that the crash only occurs if you actually kill the chrome://tracing renderer before the recording dialog comes up. I've just let the tab stay open for several minutes, and eventually the dialog did get displayed, so I think that rules out bogus metadata as the cause of the crash.

If the metadata gathering were not async-safe (i.e. couldn't cope with the target having been torn down before it completes), then a regression in time taken to gather metadata would make that crash scenario more obvious, so perhaps we can try to repro on other platforms by introducing a delay in completion of metadata gathering. I think it's also necessary to actually start a second Record operation having killed the first, stalled, one, to repro the crash, FWIW.

FWIW once I have allowed the tracing page to run long enough to show the dialog, opening it up again and hitting Record completes very quickly.
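To illustrate what "async-safe" means here, a minimal sketch of a pending request whose requester may disappear before the reply arrives. This is not Chromium's code: TracingPage, MetadataRequest and their methods are hypothetical stand-ins, and the "delay" is simulated simply by completing the request after the page has been closed.

```python
import weakref


class TracingPage:
    """Hypothetical stand-in for the page that asked for metadata."""

    def __init__(self):
        self.metadata = None
        self.closed = False

    def close(self):
        self.closed = True

    def on_metadata(self, metadata):
        # An unguarded caller would hit this after teardown.
        if self.closed:
            raise RuntimeError("reply delivered to a torn-down page")
        self.metadata = metadata


class MetadataRequest:
    """A pending request that may complete after its requester is gone."""

    def __init__(self, page):
        # Hold only a weak reference so the request never keeps the page alive.
        self._page_ref = weakref.ref(page)

    def complete(self, metadata):
        page = self._page_ref()
        if page is None or page.closed:
            return False  # requester went away: drop the reply
        page.on_metadata(metadata)
        return True


# First Record: the page is killed while the request is still pending.
first_page = TracingPage()
first_request = MetadataRequest(first_page)
first_page.close()

# Second Record on a fresh page, then both replies arrive late.
second_page = TracingPage()
second_request = MetadataRequest(second_page)
dropped = first_request.complete({"os-name": "CrOS"})
delivered = second_request.complete({"os-name": "CrOS"})
print(dropped, delivered)
```

The guard in complete() is the part that matters: without it, the late reply for the first request would be delivered to a page that no longer exists.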
,
Dec 19 2016
> Re #8: I believe that the crash only occurs if you actually kill the chrome://tracing renderer before the recording dialog comes up.
So wait, I'm missing something here. Usually the chrome://tracing dialog comes up instantaneously after you press Record. I can't imagine how to kill it in time. So is the problem here that even the first time it takes a long time to come up? (If the answer is "yes", then I don't understand the reason for killing anything to repro this bug. You should now be in the state of grabbing a trace, no?)
> FWIW once I have allowed the tracing page to run long enough to show the dialog, opening it up again and hitting Record completes very quickly.
Yes, because we won't repeat the category warmup and re-collect metadata.
,
Dec 19 2016
Primiano and I discussed offline and realised that the delay/hang is specific to tracing with MemoryCoordinator enabled, when MC has chosen to suspend one or more processes. Filed issue 675735 to track the MC problem. Repurposing this bug for the crashiness in chrome://tracing.
,
Dec 20 2016
,
Jan 5 2017
Issue 678696 has been merged into this issue.
,
Jan 5 2017
+bashi, since it appears to be necessary to have a suspended renderer to easily repro this, since that blocks the Tracing dialog popping up ( issue 675735 ).
,
Jan 6 2017
Assigning to me for investigation and de-prioritizing, as renderer suspension is disabled now.
,
Feb 10 2017
,
Feb 10 2017
,
Mar 22 2017
Users experienced this crash on the following builds: Mac Canary 59.0.3047.0 - 0.76 CPM, 1 reports, 1 clients (signature base::DictionaryValue::MergeDictionary) If this update was incorrect, please add "Fracas-Wrong" label to prevent future updates. - Go/Fracas
,
Apr 11 2017
We gave up suspending tabs for now. This shouldn't happen anymore.
,
Apr 11 2017
Re #20: This bug was actually for the browser crasher, which can occur with _any_ renderer hang in-progress, not just MC, so I would suggest keeping this open.
,
Apr 12 2017
I see. Re-opening. Not sure who is the right owner. Assigning primiano@ for triage.
,
Apr 13 2017
kraynov, can you take a look at this? I don't think this is specific to CrOS; it should reproduce on any OS (just try on Linux). See comment #10.
,
Apr 18 2017
kraynov: ping
,
Apr 18 2017
Cannot reproduce on Linux yet (or it is quite hard to). Steps I took:
1. Launched Chrome with many different tabs:
   out/Release/chrome --enable-memory-coordinator --enable-remote-debugging --remote-debugging-port=9222 http://edition.cnn.com http://bbc.co.uk http://reddit.com http://news.google.com http://crbug.com http://android.com http://samsung.com http://lg.com http://o2.co.uk http://vodafone.co.uk http://gov.uk http://usa.com http://theverge.com http://hsbc.co.uk http://barclays.co.uk http://telegraph.co.uk http://spotify.com http://play.google.com http://apple.com http://nokia.com http://htc.com http://github.com http://dhl.com http://klm.com http://facebook.com http://twitter.com http://plug.google.com http://instagram.com http://youtube.com http://vimeo.com
2. Slightly hacked https://cs.chromium.org/chromium/src/third_party/catapult/tracing/bin/memory_infra_remote_dump to invoke the memory pressure listener; just had to call self.send('Memory.simulatePressureNotification', {'level': level})
3. Waited 5 mins
4. Invoked memory pressure
5. Waited a bit
6. Invoked again
7. Repeated again multiple times
8. Got tabs switching laggy (but not much)

But the tracing categories dialog still shows almost instantly. However, during my previous experiments (multiple tabs with media playback and heavy content) on a device with 8 GB of physical RAM, I observed a noticeable delay before the tracing dialog appeared. Unfortunately, killing random tabs during this delay didn't crash the browser.

Do you have any recipe to strictly limit the amount of RAM in order to simulate real memory pressure? Fake memory pressure on the workstation didn't make things slow enough.
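For reference, a small sketch of the payload behind the self.send(...) call in step 2. It only builds the JSON message for the DevTools protocol method Memory.simulatePressureNotification; actually sending it over the remote-debugging connection is left out. The helper name and the message_id parameter are hypothetical, and the accepted levels are assumed to be "moderate" and "critical".

```python
import json

# Assumed pressure levels for Memory.simulatePressureNotification.
VALID_LEVELS = ("moderate", "critical")


def build_pressure_message(level, message_id=1):
    """Build the JSON text for a simulated memory pressure notification."""
    if level not in VALID_LEVELS:
        raise ValueError("unknown pressure level: %r" % (level,))
    return json.dumps({
        "id": message_id,  # hypothetical request id chosen by the caller
        "method": "Memory.simulatePressureNotification",
        "params": {"level": level},
    })


message = build_pressure_message("critical", message_id=7)
print(message)
```

Repeating steps 4 to 7 then amounts to sending this message several times with increasing ids.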
,
Apr 18 2017
As per comment #20, Memory Coordinator no longer tries to suspend tabs, so it's expected that the original repro steps will no longer repro this. This bug is not really about the MC case, but about the fact that closing a hung tab after hitting Record crashes things, so I'd expect that steps like:
1. Navigate to chrome://hang.
2. Open chrome://tracing.
3. Start recording.
4. Close the chrome://hang tab.
should repro the issue.
,
Apr 18 2017
kraynov, as per comment #20 I think that if you add a sleep() in one of the renderer->browser tracing IPCs (TracingHostMsg_ChildSupportsTracing, TracingHostMsg_TraceDataCollected, TracingHostMsg_EndTracingAck) and use wez's repro in #26, the issue should show up again.
,
Apr 19 2017
So I tried my best with combinations of various approaches (IPC sleeps, memory pressure, killing pids, hung tabs, real memory pressure, etc.) and nothing led to a browser crash. I don't know what further progress we can make on this issue.
,
Jun 1 2017
Cannot reproduce
Comment 1 by w...@chromium.org, Dec 16 2016