crashes are not seen by metrics
Issue description

Chrome Version: 63.0.3239.132 (Official Build) (32-bit) (cohort: Stable Installs Only)
OS: Windows 7 build 7601 service pack 1 32-bit

What steps will reproduce the problem?
(1) crash renderer using chrome://crash or chrome://memory-exhaust
(2) check for evidence that this was caught/logged correctly in chrome://crashes, chrome://histograms and chrome://local-state

What is the expected result?
1. A crash is reported in chrome://crashes and it gets bucketed into "[Out of Memory] content::ExhaustMemory" e.g. c55eb75c0f595905
2. CrashExitCodes.Renderer gets entries for each crash, the OOM type for chrome://memory-exhaust (536870904) and the exception type for chrome://crash (-1073741819)
3. renderer_crash_count is incremented in the system profile cache in chrome://local-state

What happens instead?
1 happens, but 2 and 3 don't happen. This is concerning.
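Editorial note: the two exit codes quoted in the description are easier to recognise in hex. The sketch below (the helper name ExitCodeToHex is hypothetical, not a Chromium function) reinterprets a signed 32-bit exit code as unsigned and prints it. -1073741819 becomes 0xC0000005, the Windows STATUS_ACCESS_VIOLATION code. 536870904 is 0x1FFFFFF8, and its negation is 0xE0000008; that this matches Chromium's OOM exception code, and that the histogram stores the absolute value, are assumptions not confirmed in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats a signed 32-bit process exit code as the
// unsigned hex value that Windows status codes are usually written in.
std::string ExitCodeToHex(std::int32_t exit_code) {
  char buf[11];  // "0x" + 8 hex digits + terminating NUL
  const int written = std::snprintf(
      buf, sizeof(buf), "0x%08X",
      static_cast<unsigned int>(static_cast<std::uint32_t>(exit_code)));
  assert(written == 10);  // the output always has exactly 10 characters
  (void)written;          // avoid an unused-variable warning under NDEBUG
  return std::string(buf);
}
```

For example, ExitCodeToHex(-1073741819) yields "0xC0000005".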
,
Jan 18 2018
Have you tried on other platforms/channels yet?
,
Jan 18 2018
well I couldn't repro this for a bisect so perhaps this is some experiment on stable.
,
Jan 18 2018
and I can't (easily) bisect with matching experiment config from the machine(s) I can repro on, because of issue 694675
,
Jan 18 2018
For future reference and for when I can come back to this bug - the experiments from the Chrome Stable I can repro the bug on are: c134752e-b8b72c88 3095aa95-3f4a17df 6c43306f-ca7d8d80 47e5d3db-3d47f4f4 1210a805-ecd831c b1edbc38-cf4f6ead ba3f87da-45bda656 776de70c-eadfd437 79616653-3f4a17df 9e201a2b-6e3ce1c 68812885-4d2fac87 5e3a236d-4113a79e f347910c-3d47f4f4 4b61504a-d25ea691 9773d3bd-f23d1dea 8e3b2dc5-93702590 9e5c75f1-ffd2375f f79cb77b-3d47f4f4 4ea303a6-49c9e003 d92562a9-ca7d8d80 90bcbadc-3f4a17df 447469ba-13d9f35f 7aa46da5-c946b150 25fc488a-4d2fac87 58a025e3-c2b41702 1bced4a3-90fa85cd b2f0086-93053e47 ef25c1eb-3f4a17df 494d8760-6843eff2 f47ae82a-86f22ee5 3ac60855-486e2a9c f296190c-a90023b1 4442aae2-6e597ede ed1d377-e1cc0f14 75f0f0a0-6bdfffe7 e2b18481-bd104136 e7e71889-4ad60575 94e68624-803f8fc4 f141d4bc-28ad44a e9ce63c1-36ab09a2 da4aaa01-ca7d8d80
,
Jan 18 2018
I believe gayane is working on 694675 right now actually. I don't believe it's finished so you can't use it yet unfortunately.
,
Jan 19 2018
I can also repro on a win10 machine from a fresh stable (64-bit) install. I cannot repro with any bisect, and obviously the tests are passing, so perhaps this is something to do with being on a real channel. This does still seem quite concerning to me.
,
Jan 19 2018
Debugging the running process, content::RenderProcessHostImpl::ProcessDied is being called correctly, but ChromeStabilityMetricsProvider::Observe is not being called, almost as if it is not registered as an observer. Still trying to work out why this would be the case.
,
Jan 19 2018
metrics is being stopped here:
0:000> kn
# ChildEBP RetAddr
00 04daed18 6a89da90 chrome_69190000!ChromeStabilityMetricsProvider::OnRecordingDisabled [c:\src\gclient\src\chrome\browser\metrics\chrome_stability_metrics_provider.cc @ 66]
01 04daed28 6a896f58 chrome_69190000!metrics::DelegatingProvider::OnRecordingDisabled+0x16 [c:\src\gclient\src\components\metrics\delegating_provider.cc @ 50]
02 04daedf0 6b1657b2 chrome_69190000!metrics::MetricsService::DisableRecording+0x88 [c:\src\gclient\src\components\metrics\metrics_service.cc @ 336]
03 04daeec0 6b165886 chrome_69190000!metrics_services_manager::MetricsServicesManager::UpdateRunningServices+0xca [c:\src\gclient\src\components\metrics_services_manager\metrics_services_manager.cc @ 129]
04 04daef8c 6b165959 chrome_69190000!metrics_services_manager::MetricsServicesManager::UpdatePermissions+0x96 [c:\src\gclient\src\components\metrics_services_manager\metrics_services_manager.cc @ 105]
05 04daefac 69cbda7d chrome_69190000!metrics_services_manager::MetricsServicesManager::UpdateUploadPermissions+0x45 [c:\src\gclient\src\components\metrics_services_manager\metrics_services_manager.cc @ 166]
06 04daf018 69cbf981 chrome_69190000!ChromeBrowserMainParts::StartMetricsRecording+0xb5 [c:\src\gclient\src\chrome\browser\chrome_browser_main.cc @ 759]
07 04daf190 69cbf82f chrome_69190000!ChromeBrowserMainParts::PreMainMessageLoopRunImpl+0x93 [c:\src\gclient\src\chrome\browser\chrome_browser_main.cc @ 1411]
08 04daf1d8 695bf9bc chrome_69190000!ChromeBrowserMainParts::PreMainMessageLoopRun+0xad [c:\src\gclient\src\chrome\browser\chrome_browser_main.cc @ 1218]
09 04daf220 698e05c4 chrome_69190000!content::BrowserMainLoop::PreMainMessageLoopRun+0x44 [c:\src\gclient\src\content\browser\browser_main_loop.cc @ 1182]
0a (Inline) -------- chrome_69190000!base::RepeatingCallback<int ()>::Run+0xb [c:\src\gclient\src\base\callback.h @ 94]
0b 04daf238 695be48c chrome_69190000!content::StartupTaskRunner::RunAllTasksNow+0x1e [c:\src\gclient\src\content\browser\startup_task_runner.cc @ 42]
0c 04daf340 695c24a8 chrome_69190000!content::BrowserMainLoop::CreateStartupTasks+0x292 [c:\src\gclient\src\content\browser\browser_main_loop.cc @ 968]
0d 04daf3b4 695bccb6 chrome_69190000!content::BrowserMainRunnerImpl::Initialize+0x210 [c:\src\gclient\src\content\browser\browser_main_runner.cc @ 117]
0e 04daf3fc 69c28a2e chrome_69190000!content::BrowserMain+0x8a [c:\src\gclient\src\content\browser\browser_main.cc @ 42]
0f 04daf4cc 69c28f6a chrome_69190000!content::RunNamedProcessTypeMain+0xee [c:\src\gclient\src\content\app\content_main_runner.cc @ 426]
10 04daf5c8 69c40785 chrome_69190000!content::ContentMainRunnerImpl::Run+0x118 [c:\src\gclient\src\content\app\content_main_runner.cc @ 720]
11 04daf6d8 69c28917 chrome_69190000!service_manager::Main+0x2a5 [c:\src\gclient\src\services\service_manager\embedder\main.cc @ 456]
12 04daf718 6919119e chrome_69190000!content::ContentMain+0x33 [c:\src\gclient\src\content\app\content_main.cc @ 19]
13 04daf788 008c59aa chrome_69190000!ChromeMain+0x122 [c:\src\gclient\src\chrome\app\chrome_main.cc @ 131]
14 04daf814 008c1551 chrome!MainDllLoader::Launch+0x230 [c:\src\gclient\src\chrome\app\main_dll_loader_win.cc @ 199]
15 04daf98c 009a5dd8 chrome!wWinMain+0x551 [c:\src\gclient\src\chrome\app\chrome_exe_main_win.cc @ 231]
16 (Inline) -------- chrome!invoke_main+0x1a [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 118]
17 04daf9d8 778a8654 chrome!__scrt_common_main_seh+0xf6 [f:\dd\vctools\crt\vcstartup\src\startup\exe_common.inl @ 283]
WARNING: Stack unwind information not available. Following frames may be wrong.
18 04daf9ec 779d4a77 KERNEL32!BaseThreadInitThunk+0x24
19 04dafa34 779d4a47 ntdll!RtlGetAppContainerNamedObjectPath+0x137
1a 04dafa44 00000000 ntdll!RtlGetAppContainerNamedObjectPath+0x107
void MetricsServicesManager::UpdateRunningServices() {
  DCHECK(thread_checker_.CalledOnValidThread());
  metrics::MetricsService* metrics = GetMetricsService();
  const base::CommandLine* cmdline = base::CommandLine::ForCurrentProcess();
  if (cmdline->HasSwitch(metrics::switches::kMetricsRecordingOnly)) {
    metrics->StartRecordingForTests();
    GetRapporServiceImpl()->Update(true, false);
    return;
  }
  client_->UpdateRunningServices(may_record_, may_upload_);
  if (may_record_) {
    if (!metrics->recording_active())
      metrics->Start();
    if (may_upload_)
      metrics->EnableReporting();
    else
      metrics->DisableReporting();
  } else {
    metrics->Stop();  // <- HERE
  }
  UpdateUkmService();
  GetRapporServiceImpl()->Update(may_record_, may_upload_);
}
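Editorial note: the branch structure above can be exercised in isolation. The following is a simplified model, not the real code: FakeMetricsService and CallsFor are hypothetical names, and the command-line switch, UKM and Rappor paths are left out. It only shows which MetricsService calls each (may_record, may_upload) combination leads to.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for metrics::MetricsService that logs each call.
struct FakeMetricsService {
  bool recording_active = false;
  std::string calls;  // comma-separated call log

  void Log(const std::string& name) {
    if (!calls.empty()) calls += ",";
    calls += name;
  }
  void Start() { recording_active = true; Log("Start"); }
  void Stop() { recording_active = false; Log("Stop"); }
  void EnableReporting() { Log("EnableReporting"); }
  void DisableReporting() { Log("DisableReporting"); }
};

// Mirrors only the may_record_/may_upload_ branches quoted above.
void UpdateRunningServicesModel(FakeMetricsService& metrics,
                                bool may_record, bool may_upload) {
  if (may_record) {
    if (!metrics.recording_active) metrics.Start();
    if (may_upload)
      metrics.EnableReporting();
    else
      metrics.DisableReporting();
  } else {
    metrics.Stop();  // the path hit in the stack trace
    assert(!metrics.recording_active);
  }
}

// Returns the call log for a freshly created service.
std::string CallsFor(bool may_record, bool may_upload) {
  FakeMetricsService metrics;
  UpdateRunningServicesModel(metrics, may_record, may_upload);
  return metrics.calls;
}
```

In this model, CallsFor(false, false) is the only kind of input that reaches Stop(), which matches the observation that the client believed recording was not permitted.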
for some reason metrics thinks it's disabled, but settings shows it's not (see screenshot).
,
Jan 19 2018
turning reporting off, restarting, turning it back on, restarting, and then performing the steps in #0 results in the metrics being recorded again.
,
Jan 19 2018
Is it possible your client is being sampled out by the UMA opt out sampling? We only receive data from 10% of users on Windows by design. go/uma-opt-out-faq
,
Jan 19 2018
re: #11 would this correspond to having running experiment: MetricsAndCrashSampling-OutOfReportingSample
,
Jan 19 2018
I think toggling UMA reporting state will reset your UMA client id. The UMA client id is used in Finch experiment randomization. So toggling the state as you've done in comment 10 will re-roll Finch experiments and possibly put you in a different "being sampled" state. Presumably if you do it 100 times, the expected value is 10 of those times would be reported while 90 wouldn't be.
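Editorial note: the "10 out of 100" estimate is a plain expected value. The sketch below assumes, purely for illustration, that each client-id re-roll is an independent draw with a 10% chance of landing in the reporting sample; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Expected number of re-rolls that land in the reporting sample.
double ExpectedInSample(int rerolls, double sample_rate) {
  assert(rerolls >= 0 && sample_rate >= 0.0 && sample_rate <= 1.0);
  return rerolls * sample_rate;
}

// Probability that at least one of `rerolls` independent re-rolls lands
// in the reporting sample: 1 - (1 - p)^n.
double ProbAtLeastOneInSample(int rerolls, double sample_rate) {
  assert(rerolls >= 0 && sample_rate >= 0.0 && sample_rate <= 1.0);
  return 1.0 - std::pow(1.0 - sample_rate, rerolls);
}
```

Under that assumption, ExpectedInSample(100, 0.10) is 10, and it takes about 22 re-rolls before the chance of having been in the sample at least once exceeds 90%.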
,
Jan 19 2018
re: 12, yeah - I just added a Q about it to go/uma-opt-out-faq
,
Jan 19 2018
when trying to do a bisect I went through each of the field trial configs from #5 that I considered might have an effect on this bug (based on guesswork, and not wanting to have to generate command line for every experiment), and came up with:
LoadingWithMojo-Enabled_Launch
MetricsAndCrashSampling-OutOfReportingSample
ResourceLoadScheduler-Default
SignInProcessIsolation-Enabled_100_20180103
(Given issue 694675 is still being worked on) I manually went through the experiment configs and came up with the command line switches to replicate this environment, which came up as:
--enable-features=LoadingWithMojo,sign-in-process-isolation --disable-features=MetricsReporting,ResourceLoadScheduler
I was surprised that even with that command line, I was still unable to reproduce this...? Does "--disable-features=MetricsReporting" not equate to being in the MetricsAndCrashSampling-OutOfReportingSample experiment group?
,
Jan 19 2018
Hmm. I would expect --disable-features=MetricsReporting to be equivalent to MetricsAndCrashSampling-OutOfReportingSample. Are you able to confirm that running with that you still see it work? (in your current install where it's now working)
,
Jan 19 2018
hmm I was doing --disable-features=MetricsReporting on a bisect command line:
python bisect_builds.py -a win64 -r -g 63.0.3239.0 -b 63.0.3239.132 --verify-range -- --enable-features=LoadingWithMojo,sign-in-process-isolation --disable-features=MetricsReporting,ResourceLoadScheduler
for both ends of the range I saw that CrashExitCodes.Renderer was being incremented when I visited chrome://crash.
perhaps metrics reporting is force enabled for developer/unknown builds? I can try the switch on a real stable build, but I've already reverted my VM so I'll have to keep rolling the die until I get into the 10% group again :)
,
Jan 19 2018
okay I rolled a '1' and was opted into metrics reporting which I verified by seeing 5e3a236d-59e286d0 in chrome://version. In this configuration I see CrashExitCodes.Renderer correctly incremented upon a crash. I then add --disable-features=MetricsReporting to command line and then the behavior reverts to not recording metrics as you predicted it would. So, I can't explain why running builds from a bisect with --disable-features=MetricsReporting shows metrics for crashes (in #17 and #7).
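Editorial note: the check described here amounts to looking up one entry in the variations list from chrome://version. In the list posted earlier in the thread the entry is 5e3a236d-4113a79e, while the opted-in client shows 5e3a236d-59e286d0, so 5e3a236d appears to identify the sampling trial and the suffix the group. That reading is an inference from the thread, and FindGroupHash below is a hypothetical helper, not part of Chrome.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: scans a whitespace-separated list of
// "<trial>-<group>" tokens and returns the group part for `trial_hash`,
// or an empty string if that trial is not present.
std::string FindGroupHash(const std::string& variations,
                          const std::string& trial_hash) {
  assert(!trial_hash.empty());
  std::istringstream in(variations);
  std::string token;
  while (in >> token) {
    const std::string::size_type dash = token.find('-');
    if (dash == std::string::npos) continue;  // skip malformed tokens
    if (token.compare(0, dash, trial_hash) == 0)
      return token.substr(dash + 1);
  }
  return std::string();
}
```

Pasting the variations list as the first argument and "5e3a236d" as the second returns the group suffix, which can then be compared against the two values seen in this thread.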
,
Feb 22 2018
Thinking about this more, I think we should always connect the Chrome stability metrics logger to the render process host, because it's often really useful in diagnosing local issues to see these histograms and also see the entries in the system profile. We can log them but just not upload them to UMA. Can someone in metrics do this, or shall I land a CL?
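Editorial note: a toy model of the proposal, to make the behavioural difference concrete. This is not the real ChromeStabilityMetricsProvider; the class and function names are hypothetical, and the "old" behaviour (dropping the crash observer when recording is disabled) is what the stack trace above suggests rather than something read from the source.

```cpp
#include <cassert>

// Hypothetical toy model of a stability recorder. In this sketch the
// recorder starts out observing renderer crashes.
class StabilityRecorderSketch {
 public:
  explicit StabilityRecorderSketch(bool stop_observing_when_disabled)
      : stop_observing_when_disabled_(stop_observing_when_disabled) {}

  void OnRecordingEnabled() { observing_ = true; }

  void OnRecordingDisabled() {
    // Old behaviour (assumed): stop observing crashes entirely.
    // Proposed behaviour: keep observing so local pages still show them.
    if (stop_observing_when_disabled_) observing_ = false;
  }

  // Called when a renderer process dies abnormally.
  void OnRendererCrashed() {
    if (observing_) ++renderer_crash_count_;
  }

  int renderer_crash_count() const { return renderer_crash_count_; }

 private:
  const bool stop_observing_when_disabled_;
  bool observing_ = true;
  int renderer_crash_count_ = 0;
};

// Counts how many of `crashes` are recorded after recording is disabled.
int CrashesSeenWhileDisabled(bool stop_observing_when_disabled, int crashes) {
  assert(crashes >= 0);
  StabilityRecorderSketch recorder(stop_observing_when_disabled);
  recorder.OnRecordingDisabled();
  for (int i = 0; i < crashes; ++i) recorder.OnRendererCrashed();
  return recorder.renderer_crash_count();
}
```

With the old policy, CrashesSeenWhileDisabled(true, 2) is 0; with the proposed one, CrashesSeenWhileDisabled(false, 2) is 2. Whether anything is uploaded is a separate decision that this sketch does not model.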
,
Feb 26 2018
The NextAction date has arrived: 2018-02-26
,
Feb 26 2018
Will, I think you're the most likely person to pick this up, assuming you're still interested in it.
,
Mar 5 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a613a90c3a45fa037344903d0c00f5b43add78c2

commit a613a90c3a45fa037344903d0c00f5b43add78c2
Author: Will Harris <wfh@chromium.org>
Date: Mon Mar 05 00:04:04 2018

Only stop recording Chrome stability metrics during destruction.

BUG= 803621
Change-Id: I2a028ce864808a88f71c0b0a0e0d3f493e0f1f77
Reviewed-on: https://chromium-review.googlesource.com/940616
Reviewed-by: Steven Holte <holte@chromium.org>
Commit-Queue: Will Harris <wfh@chromium.org>
Cr-Commit-Position: refs/heads/master@{#540773}

[modify] https://crrev.com/a613a90c3a45fa037344903d0c00f5b43add78c2/chrome/browser/metrics/chrome_stability_metrics_provider.cc
,
May 31 2018
verified fixed on m67 stable (67.0.3396.62 (Official Build) (64-bit) (cohort: 67_win_62)) by checking client is in MetricsAndCrashSampling/OutOfReportingSample group and still seeing entry in CrashExitCodes.Renderer after visiting chrome://crash
Comment 1 by wfh@chromium.org
, Jan 18 2018