Issue metadata
Sign in to add a comment
|
580kb regression in resource_sizes (MonochromePublic.apk) at 541745:541745 |
||||||||||||||||||||
Issue descriptionCaused by an AFDO roll c2b84ad191135be504a0abf4860b97b59f9d8b52 gbiv - was there anything odd about this roll?
,
Mar 8 2018
Note: this is on the heels of a 75kb increase from a roll: 10ba697d817145582da1aef2ef014be1dbd26b54 My (somewhat made up in my head) understanding is that as code within -O2 targets get renamed / added, the symbol names no longer match up to those in the profile marked as "hot". Then, once the roll happens, all the hot functions come back to being built with -O2. Is that about how it works?
,
Mar 8 2018
From tools/binary_size/diagnose_bloat.py c2b84ad1 --cloud:
85% of the increase is from just 25 symbols. 180kb from just one!:
Section Legend: t=.text, r=.rodata, R=.data.rel.ro, d=.data, b=.bss, p=.pak.translations, P=.pak.nontranslated, o=.other
Index | Running Total | Section@Address | Δ PSS (Δ size_without_padding) | Path
------------------------------------------------------------
~ 0) 180464 (34.7%) t@0x1332470 180464 (24408->204872) v8/src/compiler/simplified-lowering.cc
v8::internal::compiler::RepresentationSelector::VisitNode
~ 1) 255376 (49.0%) t@0x1374d60 74912 (5780->80692) v8/src/compiler/typer.cc
v8::internal::compiler::Typer::Visitor::Reduce
~ 2) 272096 (52.2%) t@0xbc20e0 16720 (12736->29456) third_party/sqlite/amalgamation/sqlite3.c
yy_reduce
~ 3) 286328 (55.0%) t@0xbb1648 14232 (22236->36468) third_party/sqlite/amalgamation/sqlite3.c
sqlite3VdbeExec
~ 4) 299284 (57.5%) t@0x141421c 12956 (11240->24196) v8/src/heap/mark-compact.cc
v8::internal::MarkCompactCollector::ProcessMarkingWorklist
~ 5) 311782 (59.9%) t@0x1ee4ec0 12498 (44020->56518) third_party/WebKit/Source/core/html/parser/HTMLTokenizer.cpp
blink::HTMLTokenizer::NextToken
~ 6) 324124 (62.2%) t@0x12b195c 12342 (6118->18460) v8/src/compiler/escape-analysis.cc
v8::internal::compiler::EscapeAnalysis::Reduce
~ 7) 335700 (64.5%) t@0x15ad1f8 11576 (6444->18020) v8/src/wasm/function-body-decoder.cc
v8::internal::wasm::WasmFullDecoder<>::DecodeFunctionBody
~ 8) 346840 (66.6%) t@0x145d9f0 11140 (2016->13156) v8/src/interpreter/bytecode-generator.cc
v8::internal::interpreter::BytecodeGenerator::VisitNoStackOverflowCheck
~ 9) 357592 (68.7%) t@Group 10752 (349952->360704) {{no path}}
** lld merge strings (count=2)
~ 10) 366192 (70.3%) t@0x12fe990 8600 (6376->14976) v8/src/compiler/machine-operator-reducer.cc
v8::internal::compiler::MachineOperatorReducer::Reduce
~ 11) 373764 (71.8%) t@0x9baaa0 7572 (1608->9180) third_party/skia/src/shaders/gradients/Sk4fLinearGradient.cpp
SkLinearGradient::LinearGradient4fContext::shadeSpan
~ 12) 380312 (73.0%) t@0x6ecfb4 6548 (4596->11144) third_party/icu/source/i18n/regexcmp.cpp
icu_60::RegexCompile::doParseActions
~ 13) 386364 (74.2%) t@0x12bf3d0 6052 (10756->16808) v8/src/compiler/instruction-selector.cc
v8::internal::compiler::InstructionSelector::VisitNode
~ 14) 392240 (75.3%) t@0xbd1e58 5876 (2800->8676) third_party/sqlite/amalgamation/sqlite3.c
sqlite3ExprCodeTarget
~ 15) 397324 (76.3%) t@0x1371e50 5084 (4140->9224) v8/src/compiler/typer.cc
v8::internal::compiler::Typer::Visitor::TypeNode
~ 16) 402400 (77.3%) t@0x12e6034 5076 (556->5632) v8/src/compiler/js-native-context-specialization.cc
v8::internal::compiler::JSNativeContextSpecialization::Reduce
~ 17) 407366 (78.2%) t@0x12df878 4966 (1030->5996) v8/src/compiler/js-generic-lowering.cc
v8::internal::compiler::JSGenericLowering::Reduce
~ 18) 412238 (79.2%) t@0x14d1474 4872 (3752->8624) v8/src/parsing/parser.cc
v8::internal::ParserBase<>::ParseAssignmentExpression
~ 19) 416638 (80.0%) t@0x14302bc 4400 (5100->9500) v8/src/heap/scavenger.cc
v8::internal::Scavenger::ScavengeObject
~ 20) 420942 (80.8%) t@0x6819a8 4304 (5324->9628) base/third_party/dmg_fp/dtoa_wrapper.cc
dmg_fp::dtoa
~ 21) 425186 (81.6%) t@0x938bec 4244 (9284->13528) third_party/skia/src/core/SkScan_AAAPath.cpp
SkScan::AAAFillPath
~ 22) 429286 (82.4%) t@0x1432ef4 4100 (6104->10204) v8/src/heap/scavenger.cc
v8::internal::Scavenger::EvacuateShortcutCandidate
~ 23) 433302 (83.2%) t@0x12f6020 4016 (944->4960) v8/src/compiler/load-elimination.cc
v8::internal::compiler::LoadElimination::Reduce
~ 24) 437166 (83.9%) t@0x14955cc 3864 (2348->6212) v8/src/objects.cc
v8::internal::JSObject::SetNormalizedProperty
~ 25) 440978 (84.7%) t@0x14f10b0 3812 (4692->8504) v8/src/parsing/preparser.cc
v8::internal::ParserBase<>::ParseLeftHandSideExpression
,
Mar 8 2018
The previous roll:
507 symbols added (+), 1945 changed (~), 827 removed (-), 848381 unchanged (not shown)
Of changed symbols, 1588 grew, 1691 shrank
Number of unique symbols 500559 -> 500533 (-26)
0 paths added, 0 removed, 789 changed
Showing 3,279 symbols (1,848 -> 1,843 unique) with total pss: 70406 bytes
Histogram of symbols based on PSS:
(-4096,-2048]: 3 (-128,-64]: 145 (-4,-2]: 157 [2,4): 156 [64,128): 157 [2048,4096): 8
(-2048,-1024]: 11 (-64,-32]: 155 (-2,-1]: 155 [4,8): 150 [128,256): 121 [4096,8192): 1
(-1024,-512]: 18 (-32,-16]: 145 (-1,0): 407 [8,16): 163 [256,512): 89 [16384,32768): 1
(-512,-256]: 47 (-16,-8]: 142 (0,1): 320 [16,32): 178 [512,1024): 26
(-256,-128]: 82 (-8,-4]: 224 [1,2): 47 [32,64): 160 [1024,2048): 11
.text=68.6kb .rodata=112 bytes .data.rel.ro=0 bytes .data=0 bytes .bss=0 bytes .pak.translations=0 bytes .pak.nontranslated=0 bytes .other=-2 bytes total=68.8kb
Number of unique paths: 828
Section Legend: t=.text, r=.rodata, R=.data.rel.ro, d=.data, b=.bss, p=.pak.translations, P=.pak.nontranslated, o=.other
Index | Running Total | Section@Address | Δ PSS (Δ size_without_padding) | Path
------------------------------------------------------------
~ 0) 27500 (37.1%) t@0x1390250 27500 (20088->47588) v8/src/heap/concurrent-marking.cc
v8::internal::ConcurrentMarking::Run
~ 1) 32156 (43.4%) t@0x13ba0d8 4656 (6584->11240) v8/src/heap/mark-compact.cc
v8::internal::MarkCompactCollector::ProcessMarkingWorklist
+ 2) 36216 (48.9%) t@0x139d370 4060 (0->4060) v8/src/heap/concurrent-marking.cc
v8::internal::ConcurrentMarkingVisitor::VisitMap
~ 3) 40064 (54.1%) t@0x13c7c3c 3848 (4280->8128) v8/src/heap/mark-compact.cc
v8::internal::BodyDescriptorApply<>
~ 4) 43480 (58.7%) t@0x1e6e2b0 3416 (40604->44020) third_party/WebKit/Source/core/html/parser/HTMLTokenizer.cpp
blink::HTMLTokenizer::NextToken
~ 5) 40140 (54.2%) t@0x1330a4c -3340 (9120->5780) v8/src/compiler/typer.cc
v8::internal::compiler::Typer::Visitor::Reduce
~ 6) 36960 (49.9%) t@0x1d0dd40 -3180 (5416->2236) third_party/WebKit/Source/core/css/parser/CSSSelectorParser.cpp
blink::CSSSelectorParser::RecordUsageAndDeprecations
~ 7) 39508 (53.3%) t@0x13d3210 2548 (2552->5100) v8/src/heap/scavenger.cc
v8::internal::Scavenger::ScavengeObject
~ 8) 37014 (50.0%) t@0x148d834 -2494 (13568->11074) v8/src/parsing/scanner.cc
v8::internal::Scanner::Scan
~ 9) 39382 (53.2%) t@0x1491260 2368 (1412->3780) v8/src/parsing/scanner.cc
v8::internal::Scanner::ScanNumber
~ 10) 41722 (56.3%) t@Group 2340 (347348->349688) {{no path}}
** lld merge strings (count=2)
~ 11) 43884 (59.3%) t@0x12a28fc 2162 (3956->6118) v8/src/compiler/escape-analysis.cc
v8::internal::compiler::EscapeAnalysis::Reduce
+ 12) 45998 (62.1%) t@0x1490a10 2114 (0->2114) v8/src/parsing/scanner.cc
v8::internal::Scanner::ScanString
~ 13) 44158 (59.6%) t@0x8159c0 -1840 (2264->424) third_party/skia/src/gpu/ops/GrAAFillRectOp.cpp
AAFillRectOp::onPrepareDraws
~ 14) 45758 (61.8%) t@0x12acc40 1600 (9156->10756) v8/src/compiler/instruction-selector.cc
v8::internal::compiler::InstructionSelector::VisitNode
~ 15) 44238 (59.7%) t@0x14b14d0 -1520 (4668->3148) v8/src/regexp/regexp-parser.cc
v8::internal::RegExpParser::ParseDisjunction
~ 16) 42734 (57.7%) t@0x1345f54 -1504 (1884->380) v8/src/conversions.cc
v8::internal::NumberParseIntHelper::HandleSpecialCases
~ 17) 44174 (59.7%) t@0x144f820 1440 (748->2188) v8/src/objects.cc
v8::internal::StringTable::LookupString
~ 18) 42758 (57.7%) t@0x15238f0 -1416 (4132->2716) v8/src/transitions.cc
v8::internal::TransitionsAccessor::Insert
~ 19) 41372 (55.9%) t@0x1b8108a -1386 (1690->304) blink/core/ComputedStyleBase.cpp
blink::ComputedStyleBase::ScrollAnchorDisablingPropertyChanged
~ 20) 39992 (54.0%) t@0x1285e68 -1380 (4746->3366) v8/src/compiler/bytecode-graph-builder.cc
v8::internal::compiler::BytecodeGraphBuilder::VisitSingleBytecode
+ 21) 41368 (55.9%) t@0x815c30 1376 (0->1376) third_party/skia/src/gpu/ops/GrAAFillRectOp.cpp
generate_aa_fill_rect_geometry
~ 22) 42716 (57.7%) t@0x1439f40 1348 (140->1488) v8/src/objects.cc
v8::internal::Object::SetProperty
~ 23) 44030 (59.5%) t@0x144954c 1314 (566->1880) v8/src/objects.cc
v8::internal::Map::GetOrCreatePrototypeChainValidityCell
~ 24) 42718 (57.7%) t@0x1e656b4 -1312 (2004->692) third_party/WebKit/Source/core/html/parser/HTMLEntityParser.cpp
blink::ConsumeHTMLEntity
~ 25) 41452 (56.0%) t@0x120f44e -1266 (3100->1834) v8/src/api-natives.cc
v8::internal::InstantiateFunction
~ 26) 40222 (54.3%) t@0x1f356a6 -1230 (3260->2030) third_party/WebKit/Source/core/layout/LayoutBlockFlowLine.cpp
,
Mar 8 2018
Digging now. I plan to leave the roller on while I'm doing this in hopes that it'll correct itself. If the regressions keep getting worse, I'll turn it off + revert us to the profile before 10ba697d817145582da1aef2ef014be1dbd26b54. > gbiv - was there anything odd about this roll? I don't know of any changes to the pipeline recently. +180KB from a single symbol definitely sounds off, though. > My (somewhat made up in my head) understanding is that as code within -O2 targets get renamed / added, the symbol names no longer match up to those in the profile marked as "hot". Then, once the roll happens, all the hot functions come back to being built with -O2. > > Is that about how it works? Yeah. I know there are some places where AFDO takes the shape of functions into account (so if the function was previously `{ return &foo; }` but now does substantial work, you'll probably get a mismatch), though I don't know in detail how that interacts with the hotness calculations we do.
,
Mar 9 2018
So... this new profile is pretty great. I triaged the size increase of VisitNode (the +180KB symbol), and it appears that we just try to inline the world into it. Given that it's a massive function to begin with, this ends very poorly. :) The new AFDO profile says that VisitNode is 57x hotter than it was yesterday, and is where Chrome spends > 1.3% of its total execution time. Old AFDO profiles say that Chrome spends more like 0.02%-0.04% of its time in this function, which is far more believable to me. Assuming all goes well, we should get a new profile later tonight. I'll compare the two tomorrow to see if we get similarly crazy values.
,
Mar 9 2018
The next roll indeed fixed it: -544700 bytes (back to the normal regression size expected by profile updates). |
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Mar 8 2018