New issue
Advanced search Search tips

Issue 820107 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Mar 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression



Sign in to add a comment

580kb regression in resource_sizes (MonochromePublic.apk) at 541745:541745

Project Member Reported by agrieve@chromium.org, Mar 8 2018

Issue description

Caused by an AFDO roll
c2b84ad191135be504a0abf4860b97b59f9d8b52

gbiv - was there anything odd about this roll?
 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=820107

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=44b0f3aa197843528234273e7c4425f0bcff85160c47f83463a06617dc48ef34


Bot(s) for this bug's original alert(s):

Android Builder
Note: this is on the heels of a 75kb increase from a roll:
10ba697d817145582da1aef2ef014be1dbd26b54

My (somewhat made up in my head) understanding is that as code within -O2 targets get renamed / added, the symbol names no longer match up to those in the profile marked as "hot". Then, once the roll happens, all the hot functions come back to being built with -O2.

Is that about how it works?


From tools/binary_size/diagnose_bloat.py c2b84ad1 --cloud:


85% of the increase is from just 25 symbols. 180kb from just one!:

Section Legend: t=.text, r=.rodata, R=.data.rel.ro, d=.data, b=.bss, p=.pak.translations, P=.pak.nontranslated, o=.other
Index | Running Total | Section@Address | Δ PSS (Δ size_without_padding) | Path
------------------------------------------------------------
~ 0)     180464 (34.7%) t@0x1332470  180464 (24408->204872) v8/src/compiler/simplified-lowering.cc
               v8::internal::compiler::RepresentationSelector::VisitNode
~ 1)     255376 (49.0%) t@0x1374d60  74912 (5780->80692) v8/src/compiler/typer.cc
               v8::internal::compiler::Typer::Visitor::Reduce
~ 2)     272096 (52.2%) t@0xbc20e0   16720 (12736->29456) third_party/sqlite/amalgamation/sqlite3.c
               yy_reduce
~ 3)     286328 (55.0%) t@0xbb1648   14232 (22236->36468) third_party/sqlite/amalgamation/sqlite3.c
               sqlite3VdbeExec
~ 4)     299284 (57.5%) t@0x141421c  12956 (11240->24196) v8/src/heap/mark-compact.cc
               v8::internal::MarkCompactCollector::ProcessMarkingWorklist
~ 5)     311782 (59.9%) t@0x1ee4ec0  12498 (44020->56518) third_party/WebKit/Source/core/html/parser/HTMLTokenizer.cpp
               blink::HTMLTokenizer::NextToken
~ 6)     324124 (62.2%) t@0x12b195c  12342 (6118->18460) v8/src/compiler/escape-analysis.cc
               v8::internal::compiler::EscapeAnalysis::Reduce
~ 7)     335700 (64.5%) t@0x15ad1f8  11576 (6444->18020) v8/src/wasm/function-body-decoder.cc
               v8::internal::wasm::WasmFullDecoder<>::DecodeFunctionBody
~ 8)     346840 (66.6%) t@0x145d9f0  11140 (2016->13156) v8/src/interpreter/bytecode-generator.cc
               v8::internal::interpreter::BytecodeGenerator::VisitNoStackOverflowCheck
~ 9)     357592 (68.7%) t@Group      10752 (349952->360704) {{no path}}
               ** lld merge strings (count=2)
~ 10)    366192 (70.3%) t@0x12fe990  8600 (6376->14976) v8/src/compiler/machine-operator-reducer.cc
               v8::internal::compiler::MachineOperatorReducer::Reduce
~ 11)    373764 (71.8%) t@0x9baaa0   7572 (1608->9180)  third_party/skia/src/shaders/gradients/Sk4fLinearGradient.cpp
               SkLinearGradient::LinearGradient4fContext::shadeSpan
~ 12)    380312 (73.0%) t@0x6ecfb4   6548 (4596->11144) third_party/icu/source/i18n/regexcmp.cpp
               icu_60::RegexCompile::doParseActions
~ 13)    386364 (74.2%) t@0x12bf3d0  6052 (10756->16808) v8/src/compiler/instruction-selector.cc
               v8::internal::compiler::InstructionSelector::VisitNode
~ 14)    392240 (75.3%) t@0xbd1e58   5876 (2800->8676)  third_party/sqlite/amalgamation/sqlite3.c
               sqlite3ExprCodeTarget
~ 15)    397324 (76.3%) t@0x1371e50  5084 (4140->9224)  v8/src/compiler/typer.cc
               v8::internal::compiler::Typer::Visitor::TypeNode
~ 16)    402400 (77.3%) t@0x12e6034  5076 (556->5632)   v8/src/compiler/js-native-context-specialization.cc
               v8::internal::compiler::JSNativeContextSpecialization::Reduce
~ 17)    407366 (78.2%) t@0x12df878  4966 (1030->5996)  v8/src/compiler/js-generic-lowering.cc
               v8::internal::compiler::JSGenericLowering::Reduce
~ 18)    412238 (79.2%) t@0x14d1474  4872 (3752->8624)  v8/src/parsing/parser.cc
               v8::internal::ParserBase<>::ParseAssignmentExpression
~ 19)    416638 (80.0%) t@0x14302bc  4400 (5100->9500)  v8/src/heap/scavenger.cc
               v8::internal::Scavenger::ScavengeObject
~ 20)    420942 (80.8%) t@0x6819a8   4304 (5324->9628)  base/third_party/dmg_fp/dtoa_wrapper.cc
               dmg_fp::dtoa
~ 21)    425186 (81.6%) t@0x938bec   4244 (9284->13528) third_party/skia/src/core/SkScan_AAAPath.cpp
               SkScan::AAAFillPath
~ 22)    429286 (82.4%) t@0x1432ef4  4100 (6104->10204) v8/src/heap/scavenger.cc
               v8::internal::Scavenger::EvacuateShortcutCandidate
~ 23)    433302 (83.2%) t@0x12f6020  4016 (944->4960)   v8/src/compiler/load-elimination.cc
               v8::internal::compiler::LoadElimination::Reduce
~ 24)    437166 (83.9%) t@0x14955cc  3864 (2348->6212)  v8/src/objects.cc
               v8::internal::JSObject::SetNormalizedProperty
~ 25)    440978 (84.7%) t@0x14f10b0  3812 (4692->8504)  v8/src/parsing/preparser.cc
               v8::internal::ParserBase<>::ParseLeftHandSideExpression


Summary: 580kb regression in resource_sizes (MonochromePublic.apk) at 541745:541745 (was: 450kb regression in resource_sizes (MonochromePublic.apk) at 541745:541745)
The previous roll:

507 symbols added (+), 1945 changed (~), 827 removed (-), 848381 unchanged (not shown)
Of changed symbols, 1588 grew, 1691 shrank
Number of unique symbols 500559 -> 500533 (-26)
0 paths added, 0 removed, 789 changed

Showing 3,279 symbols (1,848 -> 1,843 unique) with total pss: 70406 bytes
Histogram of symbols based on PSS:
    (-4096,-2048]: 3    (-128,-64]: 145   (-4,-2]: 157     [2,4): 156      [64,128): 157     [2048,4096): 8
    (-2048,-1024]: 11    (-64,-32]: 155   (-2,-1]: 155     [4,8): 150     [128,256): 121     [4096,8192): 1
     (-1024,-512]: 18    (-32,-16]: 145    (-1,0): 407    [8,16): 163     [256,512): 89    [16384,32768): 1
      (-512,-256]: 47     (-16,-8]: 142     (0,1): 320   [16,32): 178    [512,1024): 26
      (-256,-128]: 82      (-8,-4]: 224     [1,2): 47    [32,64): 160   [1024,2048): 11
.text=68.6kb     .rodata=112 bytes  .data.rel.ro=0 bytes    .data=0 bytes    .bss=0 bytes    .pak.translations=0 bytes    .pak.nontranslated=0 bytes    .other=-2 bytes   total=68.8kb
Number of unique paths: 828

Section Legend: t=.text, r=.rodata, R=.data.rel.ro, d=.data, b=.bss, p=.pak.translations, P=.pak.nontranslated, o=.other
Index | Running Total | Section@Address | Δ PSS (Δ size_without_padding) | Path
------------------------------------------------------------
~ 0)      27500 (37.1%) t@0x1390250  27500 (20088->47588) v8/src/heap/concurrent-marking.cc
               v8::internal::ConcurrentMarking::Run
~ 1)      32156 (43.4%) t@0x13ba0d8  4656 (6584->11240) v8/src/heap/mark-compact.cc
               v8::internal::MarkCompactCollector::ProcessMarkingWorklist
+ 2)      36216 (48.9%) t@0x139d370  4060 (0->4060)     v8/src/heap/concurrent-marking.cc
               v8::internal::ConcurrentMarkingVisitor::VisitMap
~ 3)      40064 (54.1%) t@0x13c7c3c  3848 (4280->8128)  v8/src/heap/mark-compact.cc
               v8::internal::BodyDescriptorApply<>
~ 4)      43480 (58.7%) t@0x1e6e2b0  3416 (40604->44020) third_party/WebKit/Source/core/html/parser/HTMLTokenizer.cpp
               blink::HTMLTokenizer::NextToken
~ 5)      40140 (54.2%) t@0x1330a4c  -3340 (9120->5780) v8/src/compiler/typer.cc
               v8::internal::compiler::Typer::Visitor::Reduce
~ 6)      36960 (49.9%) t@0x1d0dd40  -3180 (5416->2236) third_party/WebKit/Source/core/css/parser/CSSSelectorParser.cpp
               blink::CSSSelectorParser::RecordUsageAndDeprecations
~ 7)      39508 (53.3%) t@0x13d3210  2548 (2552->5100)  v8/src/heap/scavenger.cc
               v8::internal::Scavenger::ScavengeObject
~ 8)      37014 (50.0%) t@0x148d834  -2494 (13568->11074) v8/src/parsing/scanner.cc
               v8::internal::Scanner::Scan
~ 9)      39382 (53.2%) t@0x1491260  2368 (1412->3780)  v8/src/parsing/scanner.cc
               v8::internal::Scanner::ScanNumber
~ 10)     41722 (56.3%) t@Group      2340 (347348->349688) {{no path}}
               ** lld merge strings (count=2)
~ 11)     43884 (59.3%) t@0x12a28fc  2162 (3956->6118)  v8/src/compiler/escape-analysis.cc
               v8::internal::compiler::EscapeAnalysis::Reduce
+ 12)     45998 (62.1%) t@0x1490a10  2114 (0->2114)     v8/src/parsing/scanner.cc
               v8::internal::Scanner::ScanString
~ 13)     44158 (59.6%) t@0x8159c0   -1840 (2264->424)  third_party/skia/src/gpu/ops/GrAAFillRectOp.cpp
               AAFillRectOp::onPrepareDraws
~ 14)     45758 (61.8%) t@0x12acc40  1600 (9156->10756) v8/src/compiler/instruction-selector.cc
               v8::internal::compiler::InstructionSelector::VisitNode
~ 15)     44238 (59.7%) t@0x14b14d0  -1520 (4668->3148) v8/src/regexp/regexp-parser.cc
               v8::internal::RegExpParser::ParseDisjunction
~ 16)     42734 (57.7%) t@0x1345f54  -1504 (1884->380)  v8/src/conversions.cc
               v8::internal::NumberParseIntHelper::HandleSpecialCases
~ 17)     44174 (59.7%) t@0x144f820  1440 (748->2188)   v8/src/objects.cc
               v8::internal::StringTable::LookupString
~ 18)     42758 (57.7%) t@0x15238f0  -1416 (4132->2716) v8/src/transitions.cc
               v8::internal::TransitionsAccessor::Insert
~ 19)     41372 (55.9%) t@0x1b8108a  -1386 (1690->304)  blink/core/ComputedStyleBase.cpp
               blink::ComputedStyleBase::ScrollAnchorDisablingPropertyChanged
~ 20)     39992 (54.0%) t@0x1285e68  -1380 (4746->3366) v8/src/compiler/bytecode-graph-builder.cc
               v8::internal::compiler::BytecodeGraphBuilder::VisitSingleBytecode
+ 21)     41368 (55.9%) t@0x815c30   1376 (0->1376)     third_party/skia/src/gpu/ops/GrAAFillRectOp.cpp
               generate_aa_fill_rect_geometry
~ 22)     42716 (57.7%) t@0x1439f40  1348 (140->1488)   v8/src/objects.cc
               v8::internal::Object::SetProperty
~ 23)     44030 (59.5%) t@0x144954c  1314 (566->1880)   v8/src/objects.cc
               v8::internal::Map::GetOrCreatePrototypeChainValidityCell
~ 24)     42718 (57.7%) t@0x1e656b4  -1312 (2004->692)  third_party/WebKit/Source/core/html/parser/HTMLEntityParser.cpp
               blink::ConsumeHTMLEntity
~ 25)     41452 (56.0%) t@0x120f44e  -1266 (3100->1834) v8/src/api-natives.cc
               v8::internal::InstantiateFunction
~ 26)     40222 (54.3%) t@0x1f356a6  -1230 (3260->2030) third_party/WebKit/Source/core/layout/LayoutBlockFlowLine.cpp

Comment 5 by g...@chromium.org, Mar 8 2018

Digging now. I plan to leave the roller on while I'm doing this in hopes that it'll correct itself. If the regressions keep getting worse, I'll turn it off + revert us to the profile before 10ba697d817145582da1aef2ef014be1dbd26b54.

> gbiv - was there anything odd about this roll?

I don't know of any changes to the pipeline recently. +180KB from a single symbol definitely sounds off, though.

> My (somewhat made up in my head) understanding is that as code within -O2 targets get renamed / added, the symbol names no longer match up to those in the profile marked as "hot". Then, once the roll happens, all the hot functions come back to being built with -O2.
>
> Is that about how it works?

Yeah. I know there are some places where AFDO takes the shape of functions into account (so if the function was previously `{ return &foo; }` but now does substantial work, you'll probably get a mismatch), though I don't know in detail how that interacts with the hotness calculations we do.

Comment 6 by g...@chromium.org, Mar 9 2018

So... this new profile is pretty great.

I triaged the size increase of VisitNode (the +180KB symbol), and it appears that we just try to inline the world into it. Given that it's a massive function to begin with, this ends very poorly. :)

The new AFDO profile says that VisitNode is 57x hotter than it was yesterday, and is where Chrome spends > 1.3% of its total execution time. Old AFDO profiles say that Chrome spends more like 0.02%-0.04% of its time in this function, which is far more believable to me.

Assuming all goes well, we should get a new profile later tonight. I'll compare the two tomorrow to see if we get similarly crazy values.
Status: Fixed (was: Assigned)
The next roll indeed fixed it: -544700 bytes (back to the normal regression size expected by profile updates).

Sign in to add a comment