(Re?)consider merging chrome.dll and chrome_child.dll on Windows |
||||
Issue descriptionNico's asked & forgotten a couple times, and I'm little fuzzy on the concrete reasons at this point, so I thought I should make a bug to track and log current/future status of the chrome.dll and chrome_child.dll split. The reasons that come to mind: - PGO is more tractable when we don't have one big blob so that we can only PGO part of the code that we care about - cl/link are prone to falling over (.pdb, .ilk, thursday, etc.) , this gives us a modicum of escape potential, or at least a smaller starting point - ISTR win32k lockdown required chrome_child.dll to avoid dll loads (but I'm not sure why /delayload * wouldn't work) - Potentially some impact on startup metrics, though we haven't A/B'd that in so long that it'd be hard to predict which way it'd move. - It seems maybe logically nice that browser code isn't in the children and vice versa, but there's quite a lot in "common/" subdirs anyway. If/when we switch to clang/lld, some of those constraints might change, so maybe we should consider merging the cleaving then. It would simplify some icky #ifs, and likely reduce the overall binary footprint. Please add any pros/cons you feel are important.
,
May 24 2017
I was asking Nico about it this morning, thanks for writing it down. I was thinking about it in the context of start-up time: to open the browser and the first renderer, we're loading both dll's. If there is common code between them (but I don't know how much there is), that means loading some code twice, and I was wondering if just having one dll might be faster.
,
May 24 2017
I think this would be perf positive as well, possibly memory as well. Especially for subsequent renderer process loads.
,
May 25 2017
another factor to consider is courgette patch memory usage on client, the amount of memory needed to apply a differential patch is dependent on binary size, so having one larger DLL might mean more failures to patch on client. zucchini might help with this.
,
May 25 2017
I think merging also reduce total DLL sizes and patch sizes, by removing duplicates (e.g., base)? Would this only be for 64-bit? I recall there was some 2 GB PDB limit that contributed the the separation in the first place -- is that still an issue? Re. patching: Yes merging will increase peak RAM for Courgette. However, recently ( http://crbug.com/660980#c11 ) I decreased Courgette "RAM floor" (RAM allowance boundary before failure) and "disk churn" (total TempMapping usage) each by 40%, really for the sake of reduction. If you want to consume the improvement and cite it as a catalyst, please do. :) And yes, Zucchini would push patching RAM requirements down even more.
,
May 25 2017
SyzyAsan also requires the Split DLL so we can only instrument the browser process. Last time we talked about this Justin WontFix'd the bug because of the win32k lockdown mitigation : crbug.com/448218
,
May 25 2017
And I'm almost certainly gonna WontFix it again! 😜 Because, delay loads don't matter when Windows decides to invoke spooky action at a distance because it thinks you're a GUI process (if they did we wouldn't have to hook). Also, I'm dubious of any claimed performance advantages by merging DLLs. I'm honestly far more inclined to go in the other direction to reduce duplicate binary code, by break everything out into maybe half a dozen logical DLLs.
,
May 25 2017
Dammit. Should have been "if they *didn't* we wouldn't have to hook". My attempt at humor has now failed.
,
May 25 2017
It would be interesting to see how much memory we save by doing this - I would WAG it as maybe 10 MB of shared-memory saved? There would also be some performance gain because the i-cache would be shared between renderers and the browser process, which would make our IPC cheaper. The actual benefit from this would be monumentally hard to predict, and might be tough to measure. I'd love to get some measurements for the benefits of this change but I fear it would be a non-trivial project. > Would this only be for 64-bit? I would assume this would be for 32-bit and 64-bit. The increased per-process code footprint should not be enough to cause 32-bit processes any problems. > I recall there was some 2 GB PDB limit that contributed the the separation in the first place The 32-bit toolchain was part of the reason for splitting the DLLs, along with various file-size limits. I think those have been raised or eliminated such that we *should* be able to do a unified build, especially with VS 2017. Build times might be hurt because currently we link chrome.dll and chrome_child.dll in parallel, and linking the unified DLL would probably be slower, especially on PGO/LTCG builds.
,
Jan 10
Archiving P3s older than 1 year with no owner or component. |
||||
►
Sign in to add a comment |
||||
Comment 1 by wfh@chromium.org
, May 24 2017