Courgette: Replace old Label allocation code with LabelManager
Issue description
This is a tracking issue for the plan to reduce Courgette peak memory using LabelManager. Implementation details: go/courgette-use-label-manager.
May 19 2016
For (1), the "48.0.2563.0 to 49.0.2623.0" results under Courgette-apply are:
- 32-bit courgette.exe:
  - Baseline: Fails for 69 MiB, works for 70 MiB.
  - New: Fails for 51 MiB, works for 52 MiB.
  - So 70 MiB to 52 MiB => 18 MiB saving, ~25% reduction.
- 64-bit courgette:
  - Baseline: Fails for 112 MiB, works for 113 MiB.
  - New: Fails for 77 MiB, works for 78 MiB.
  - So 113 MiB to 78 MiB => 35 MiB saving, ~30% reduction.
Therefore we claim 25% peak RAM reduction for Courgette-apply under the "choke RAM until failure" experiment. Note that the 18 MiB (~19 MB) saving for 32-bit is in line with the 17 MB saving predicted.
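A minimal sketch of how the fail/work boundary and the percentages above could be computed. This is not the procedure actually used (the limits may well have been tried by hand); it assumes that success is monotonic in the RAM limit, and `FindMinWorkingLimitMiB` and `PercentReduction` are hypothetical helper names.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: returns the smallest limit in [lo, hi] (in MiB) for
// which works(limit) is true. Assumes works() is monotonic (false below the
// boundary, true at and above it) and that works(hi) is true.
int FindMinWorkingLimitMiB(int lo, int hi,
                           const std::function<bool(int)>& works) {
  assert(lo <= hi);
  while (lo < hi) {
    int mid = lo + (hi - lo) / 2;
    if (works(mid))
      hi = mid;      // Boundary is at mid or below.
    else
      lo = mid + 1;  // Boundary is strictly above mid.
  }
  return lo;
}

// Percentage reduction going from |baseline| to |updated|.
double PercentReduction(double baseline, double updated) {
  assert(baseline > 0);
  return 100.0 * (baseline - updated) / baseline;
}
```

In a real run, `works` would launch Courgette-apply under the given limit and report whether it succeeded. With the numbers above, `PercentReduction(70, 52)` is about 25.7 and `PercentReduction(113, 78)` is about 31.0.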
May 19 2016
For (2), we still have to gather data from the field. In the meantime, we ran a *local* experiment by modifying courgette_tools.cc:
********
#include "base/process/process_metrics.h"
...
int main(int argc, const char* argv[]) {
  ...
  // Dump peak memory usage. Both getters return size_t, so print with %zu
  // rather than %d.
  std::unique_ptr<base::ProcessMetrics> process_metrics(
      base::ProcessMetrics::CreateProcessMetrics(
          base::GetCurrentProcessHandle()));
  fprintf(stderr, "Peak page file: %zu\n",
          process_metrics->GetPeakPagefileUsage());
  fprintf(stderr, "Peak working set: %zu\n",
          process_metrics->GetPeakWorkingSetSize());
  return 0;
}
********
*Local experiment* result (32-bit courgette.exe only), in bytes:
- Baseline:
  - No RAM limit
    - Peak page file: 105,226,240
    - Peak working set: 638,951,424
  - Constrain RAM to 70 MiB (fails for 69 MiB)
    - Peak page file: 73,400,320
    - Peak working set: 638,910,464
- New:
  - No RAM limit
    - Peak page file: 85,381,120
    - Peak working set: 639,246,336
  - Constrain RAM to 52 MiB (fails for 51 MiB)
    - Peak page file: 54,304,768
    - Peak working set: 639,700,992
We see that "Peak working set" is high regardless, likely because it accounts for memory-mapped IO. Meanwhile, "Peak page file" under choked RAM is consistent with (1); without a constraint (on my computer) it shows a reduction from 105 MB to 85 MB => 20 MB saving, or ~19% reduction.
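As a quick check of the "consistent with (1)" remark, the constrained "Peak page file" values can be converted from bytes to MiB and compared against the limits. This is only an illustrative sketch; `BytesToMiB` and `FitsWithinMiB` are hypothetical helpers, not part of Courgette.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kBytesPerMiB = 1024ull * 1024ull;

// Converts a byte count to MiB (binary megabytes).
double BytesToMiB(uint64_t bytes) {
  return static_cast<double>(bytes) / static_cast<double>(kBytesPerMiB);
}

// True if |bytes| fits within a limit given in whole MiB.
bool FitsWithinMiB(uint64_t bytes, uint64_t limit_mib) {
  assert(limit_mib <= UINT64_MAX / kBytesPerMiB);  // Guard against overflow.
  return bytes <= limit_mib * kBytesPerMiB;
}
```

Under these definitions, the baseline value 73,400,320 is exactly 70 MiB, and the new value 54,304,768 is about 51.8 MiB: it fits within 52 MiB but not within 51 MiB, matching the boundaries found in (1).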
We've collected ~1.5 weeks of baseline data on UMA now, including Canary (useless?) and Dev (will use). I'm committing the final CL http://crrev.com/1935203002/ soon, after which we can compare against data from the field for (2).
May 19 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/c803763b7f4ee725717839b679cadeb1637b6be5
commit c803763b7f4ee725717839b679cadeb1637b6be5
Author: huangs <huangs@chromium.org>
Date: Thu May 19 18:16:40 2016
[Courgette] Using LabelManager to reduce Courgette-apply peak RAM by 25%.
AssemblyProgram previously allocates new Label instances as it parses an executable and emits instructions. This CL replaces the flow by using LabelManager to precompute Labels in one array. This allows us to reduce Courgette-apply peak RAM by 25%, measured by "choke RAM until failure" method.
Details:
- We precompute Labels in AssemblyProgram::PrecomputeLabels(), which relies on RvaVisitor inherited classes for architecture-specific extraction of abs32 and rel32 targets.
- TrimLabel()'s complex post-processing flow is simplified using PrecomputeLabels(), which runs before main file parse.
- This requires RemoveUnusedRel32Locations() to update rel32.
- Deprecating C_TRIM_FAILED error message.
- Moving more common functionality to Disassembler, but duplicating some code for win32-x86 and win32-x64 to follow existing pattern.
BUG= 613216
Review-Url: https://codereview.chromium.org/1935203002
Cr-Commit-Position: refs/heads/master@{#394815}
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/adjustment_method_unittest.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/assembly_program.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/assembly_program.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/courgette.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32_arm.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32_arm.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32_x86.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_elf_32_x86.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_win32_x64.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_win32_x64.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_win32_x86.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/disassembler_win32_x86.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/encoded_program.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/encoded_program.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/encoded_program_unittest.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/image_utils.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/label_manager.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/label_manager.h
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/program_detector.cc
[modify] https://crrev.com/c803763b7f4ee725717839b679cadeb1637b6be5/courgette/rel32_finder_win32_x86.cc
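To illustrate the visitor-based precompute step described in the commit message, here is a rough sketch. Only the name RvaVisitor comes from the commit; the interface methods, the two concrete visitors, `Rel32Ref`, and `PrecomputeTargets` are assumptions made for illustration and may differ from the real Courgette code. The rel32 target formula (location + 4 + displacement) assumes x86-style 32-bit relative references.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using RVA = uint32_t;  // Relative virtual address, assumed 32-bit here.

// Hypothetical iterator-style interface over target RVAs.
class RvaVisitor {
 public:
  virtual ~RvaVisitor() = default;
  virtual size_t Remaining() const = 0;  // Number of targets left to visit.
  virtual RVA Get() const = 0;           // Target RVA at current position.
  virtual void Next() = 0;               // Advance to the next target.
};

// abs32-style: each entry already holds the target RVA.
class Abs32Visitor : public RvaVisitor {
 public:
  explicit Abs32Visitor(const std::vector<RVA>& targets) : targets_(targets) {}
  size_t Remaining() const override { return targets_.size() - pos_; }
  RVA Get() const override {
    assert(pos_ < targets_.size());
    return targets_[pos_];
  }
  void Next() override { ++pos_; }

 private:
  const std::vector<RVA>& targets_;
  size_t pos_ = 0;
};

// rel32-style: target is derived from where the reference sits.
struct Rel32Ref {
  RVA location;          // RVA of the 4-byte displacement field.
  int32_t displacement;  // Signed offset stored in that field.
};

class Rel32Visitor : public RvaVisitor {
 public:
  explicit Rel32Visitor(const std::vector<Rel32Ref>& refs) : refs_(refs) {}
  size_t Remaining() const override { return refs_.size() - pos_; }
  RVA Get() const override {
    assert(pos_ < refs_.size());
    const Rel32Ref& ref = refs_[pos_];
    // Unsigned arithmetic wraps, so negative displacements work as expected.
    return ref.location + 4 + static_cast<RVA>(ref.displacement);
  }
  void Next() override { ++pos_; }

 private:
  const std::vector<Rel32Ref>& refs_;
  size_t pos_ = 0;
};

// Drains |visitor|, appending every target RVA to |out|.
void CollectTargets(RvaVisitor* visitor, std::vector<RVA>* out) {
  for (; visitor->Remaining() > 0; visitor->Next())
    out->push_back(visitor->Get());
}

// Gathers abs32 and rel32 targets into one sorted, duplicate-free array.
std::vector<RVA> PrecomputeTargets(RvaVisitor* abs32, RvaVisitor* rel32) {
  std::vector<RVA> rvas;
  CollectTargets(abs32, &rvas);
  CollectTargets(rel32, &rvas);
  std::sort(rvas.begin(), rvas.end());
  rvas.erase(std::unique(rvas.begin(), rvas.end()), rvas.end());
  return rvas;
}
```

The point of the visitor indirection in this sketch is that the collection loop stays architecture-neutral: only the `Get()` implementations know how a target is encoded.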
Mar 24 2017
Comment 1 by hua...@chromium.org, May 19 2016
The main idea is to precompute Label instances into a sorted array, instead of computing them on demand and storing them in a map. Analysis on chrome.dll and chrome_child.dll estimates ~17 MB reduction in peak *RAM* for Courgette-apply.
Courgette-apply still uses 100's of MB to store intermediate data. However, these large chunks use memory-mapped IO, and are moved to disk if RAM is constrained.
We use 2 methods to measure peak memory consumption (Windows):
(1) "Choke RAM until failure":
- Write a program (lomem.exe) that invokes an executable via ::CreateProcess(), while assigning ProcessMemoryLimit in JOBOBJECT_EXTENDED_LIMIT_INFORMATION.
- Choose the chrome.7z benchmark, with a pre-generated Courgette patch (we used 48.0.2563.0 to 49.0.2623.0).
- For {baseline, new} Courgette, call Courgette-apply with different RAM limits, and find the boundary above which it works and below which it fails.
(2) Track peak memory from the installer, and collect from the field via UMA:
- http://crrev.com/1959463002/
- Using base::ProcessMetrics functions GetPeakPagefileUsage() and GetPeakWorkingSetSize().
- Metrics: Setup.Install.PeakPagefileUsage and Setup.Install.PeakWorkingSetSize.
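The "sorted array instead of map" idea can be sketched roughly as follows. This is a simplified illustration, not the actual LabelManager code: `Label` here is a stripped-down stand-in, and `SortedLabelStore` is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using RVA = uint32_t;  // Relative virtual address, assumed 32-bit here.

// Simplified stand-in for a label: a target address plus an index that is
// assigned later.
struct Label {
  explicit Label(RVA rva) : rva_(rva) {}
  RVA rva_;
  int index_ = -1;  // -1 means "not assigned yet".
};

// Hypothetical store that keeps all labels in one contiguous sorted vector,
// avoiding the per-node allocation overhead of a map.
class SortedLabelStore {
 public:
  // Builds the label array from every target RVA seen (duplicates allowed).
  explicit SortedLabelStore(std::vector<RVA> rvas) {
    std::sort(rvas.begin(), rvas.end());
    rvas.erase(std::unique(rvas.begin(), rvas.end()), rvas.end());
    labels_.reserve(rvas.size());
    for (RVA rva : rvas)
      labels_.emplace_back(rva);
    assert(labels_.size() == rvas.size());
  }

  // Binary search by RVA; returns nullptr if no label targets |rva|.
  Label* Find(RVA rva) {
    auto it = std::lower_bound(
        labels_.begin(), labels_.end(), rva,
        [](const Label& label, RVA value) { return label.rva_ < value; });
    return (it != labels_.end() && it->rva_ == rva) ? &*it : nullptr;
  }

  size_t size() const { return labels_.size(); }

 private:
  std::vector<Label> labels_;  // Sorted by rva_, no duplicates.
};
```

Because the vector is fully built before any lookup, pointers returned by `Find()` stay valid for the lifetime of the store, which is what lets later parsing refer to labels without allocating new ones.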