Consider adding a data source that could tell us that we were completely failing to start.
Issue description

Following up on https://bugs.chromium.org/p/chromium/issues/detail?id=604923 (failure to launch in SRP configurations), one possible improvement to process was "Should we have a data source that could tell us that we were failing to start?" Capturing discussion here for future reference and further discussion.

scottmg: I think since this was so early, we wouldn't see this as crashes, or unclean shutdowns, or any other UMA? Perhaps we should move/add another deadman trigger to be the very-first-absolute-first-for-real thing?

wfh: I don't think we can get any earlier than the chrome_elf DllMain... The only thing I can think of (and clutching at straws) is perhaps some kind of integration with Omaha: it could check the last access on the shortcut and compare that with the timestamp on the "dr" key and report any inconsistencies?

scottmg: Yeah, I was vaguely thinking of creating a file as the first line of DllMain() and then removing it when we get to WinMain() cleanly (or whenever) but hadn't thought through how we might actually report on the fact that it's sitting on disk if we fail to start cleanly. Hmm. Maybe bug reports are the most reasonable "data source" for this.

wfh: Perhaps a new key in Omaha beside the "dr" (did run) key which is "unclean launch" ("ul") which, if set, Omaha will report back in the Omaha pings alongside the dr key. We can then monitor for this key on the Omaha backend. The key would be created in DllMain of chrome_elf then cleared later in chrome launch. There would be a possible race if Omaha checks the key while Chrome is launching, but given we'd be using this for determining large spikes in launch failures, I think that false positive rate would be acceptable. +sorin for thoughts.

sorin: We could do that, we would need to clear with privacy (there is an ongoing concern with increasing entropy of the update checks) but otherwise lgtm. Please open a bug if a decision has been made to go this route.
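
For future readers, the set-early/clear-later idea discussed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Chrome or Omaha code: a plain file stands in for the proposed "ul" registry value, and every name here (`MARKER_NAME`, `mark_launch_started`, `mark_launch_succeeded`, `updater_sees_unclean_launch`) is hypothetical.

```python
import os
import tempfile

# Hypothetical marker name. The thread proposes a "ul" value stored next to
# Omaha's "dr" value; a file is used here only to keep the sketch portable.
MARKER_NAME = "unclean_launch.marker"


def mark_launch_started(state_dir):
    """Earliest startup step: record that a launch has begun."""
    path = os.path.join(state_dir, MARKER_NAME)
    with open(path, "w") as f:
        f.write("1")
    return path


def mark_launch_succeeded(state_dir):
    """Later startup step: clear the marker once launch got far enough."""
    try:
        os.remove(os.path.join(state_dir, MARKER_NAME))
    except FileNotFoundError:
        pass  # Already cleared, or the launch never set it.


def updater_sees_unclean_launch(state_dir):
    """What an updater-side check would observe before its next ping."""
    return os.path.exists(os.path.join(state_dir, MARKER_NAME))


if __name__ == "__main__":
    state_dir = tempfile.mkdtemp()
    mark_launch_started(state_dir)
    # If the process died here, the marker would remain for the updater.
    print(updater_sees_unclean_launch(state_dir))
    mark_launch_succeeded(state_dir)
    print(updater_sees_unclean_launch(state_dir))
```

As noted in the discussion, a check that runs between the two steps of a healthy launch would also see the marker, so a signal like this is only useful in aggregate, for spotting large spikes.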
Nov 2 2016
May 21 2018
Bulk edit: This bug has the label Postmortem-Followup but has not been updated in 3+ weeks. We are working on a new workflow to improve postmortem followthrough. Postmortems and postmortem bugs are very important in making sure we don't repeat prior mistakes and for making Chrome better for all. We will be taking a closer look at these bugs in the coming weeks. Please take some time to work on this, reassign, or close if the issue has been fixed. Thank you.
Comment 1 by sorin@chromium.org, Nov 1 2016
Components: Internals>Installer
Owner: sorin@chromium.org
Status: Assigned (was: Untriaged)