Issue metadata
Sign in to add a comment
|
bogus list of perf-report |
||||||||||||||||||||||
Issue descriptionWhen working on crbug.com/860029, I got this perf profile after running speedometer2: 2.15% 3977 chrome [.] vp8_decode_mb_row_no_filter 1.14% 2121 TaskSchedulerFo [.] v8::internal::ParserBase<v8::internal::Parser>::ParsePrimaryExpression 0.63% 1171 chrome [.] blink::HTMLTokenizer::NextToken 0.55% 1025 chrome [.] v8::internal::ConcurrentMarking::Run 0.40% 744 TaskSchedulerFo [.] v8::internal::KeywordOrIdentifierToken 0.40% 735 chrome [.] quant_all_bands 0.36% 660 chrome [.] v8::internal::Scanner::Scan 0.35% 654 chrome [.] v8::internal::V8HeapExplorer::GetStrongGcSubrootName 0.33% 616 chrome [.] v8::internal::ParserBase<v8::internal::Parser>::ParsePrimaryExpression 0.32% 591 chrome [.] v8::internal::ParserBase<v8::internal::Parser>::ParseClassLiteral ... The hottest function is vp8_decode_mb_row_no_filter. As speedometer2 doesn't play any video, I assume that it is a bug in either perf-report or debug info. The test image is here: gs://chromeos-image-archive/reef-release-tryjob/R68-10718.73.0-b2811777/ perf.data is attached.
,
Sep 14
I built a local image and ran speedometer2 and captured the perf profile. However, I do not see vp8 symbols in chrome. gbiv@ had similar comments. lsazio@ Can you share the steps you used so that we can try to repro it.
,
Sep 14
,
Sep 14
Manoj ninja'd me :).
FWIW, my haphazard attempted repro steps, in case I'm doing anything obviously wrong:
- Grabbed the given image
- `unzip`ed image.zip
- `./unpack_partitions.sh`
- Copied dir_3/* (the root partition, apparently) to $writable_place
- Unzipped debug.tgz into ${writable_place}/usr/lib/debug
- `perf report -i perf.data --dso=chrome --symfs=$writable_place > $out`
$out looks like:
```
1.23% chrome [.] v8::internal::LookupIterator::Start<false>
0.82% TaskSchedulerFo [.] v8::internal::compiler::Scheduler::ComputeSchedule
0.75% chrome [.] blink::HTMLTokenizer::NextToken
0.66% TaskSchedulerFo [.] v8::internal::ConcurrentMarking::Run
0.56% TaskSchedulerFo [.] v8::internal::compiler::GraphReducer::ReduceNode
0.40% chrome [.] (anonymous namespace)::do_malloc
0.36% chrome [.] v8::internal::JsonParser<true>::ParseJsonValue
0.34% chrome [.] v8::internal::StringTable::LookupStringIfExists_NoAllocate
0.31% TaskSchedulerFo [.] v8::internal::Scavenger::ScavengeObject
0.31% TaskSchedulerFo [.] v8::internal::compiler::GraphTrimmer::TrimGraph
0.27% chrome [.] v8::internal::Runtime_LoadIC_Miss
...
```
`grep 'vp8|quant' $out` gives 0 relevant results.
I also tried generating a local perf.data on falco + checking that, but got similar results to ^.
Might you have used a different perf than the one we're using? If so, that would scream "perf bug" quite loudly. :)
,
Sep 18
Ping :)
,
Sep 18
Sorry I missed this thread. Please feel free to ping me if you need my attention. Here is my steps to reproduce: $ #download the attached perf.data $ gsutil cp gs://chromeos-image-archive/reef-release-tryjob/R68-10718.73.0-b2811777/debug.tgz . $ tar zxf debug.tgz $ pushd debug/opt/google/chrome $ ln -s chrome.debug chrome $ popd $ perf report --symfs Both perf in CrOS SDK and glinux have this problem.
,
Sep 18
Hmmm, there is a obvious difference in #4 and my steps: the way chrome.debug comes into play. Probably it is the problem.
,
Sep 19
chrome.debug is a "split" debug file resulted by dwp and the section offsets are different from a normal executable with debug info. I guess the problem might be that perf doesn't know that when it is specified directly in #6. Probably the output in #4 is correct and there's nothing to be fixed.
,
Sep 19
I have not done this in a while, but I think the split debug file used to work. Also, if that is the reason, it is strange that it works for most of the symbols and it does not for a few of them...
,
Sep 19
My guess is that there's a shift in the addresses. So the report may still look legitimate, but completely useless. BTW, it may probably be worth also checking *AFDO*, debugger, minidump/breakpad and anything else that needs debug info as well.
,
Sep 19
yes, it would be really bad if this problem extends to other tools. Anyway, I think this deserves a "perf" bug.. at the least.
,
Sep 19
Yeah, I can confirm that I can now repro both laszio's and my results now. The difference being that laszio's results happen if you `ln -s chrome.debug chrome`, and mine happen if you `ln -s /path/to/dir_3/opt/google/chrome/chrome chrome`. IOW, having the stripped Chrome binary where perf expects it meaningfully changes the output. Will continue to dig, since the fact that perf appears silently OK with this is concerning :) Thanks for the notes about where else to check, too
,
Sep 19
Further confirmation that laszio's guess is right :) Unsurprisingly, `perf script` is reporting identical sample addrs with different symbols. When we match these samples up with the addrs that `nm` reports about chrome.debug, the symbols that the 'bad' perf invocation reports are all at similar offsets from the good symbols. Concretely, the first 17 offsets (good_symbol_addr - bad_symbol_addr) were: -0x6b820, -0x6b390, -0x68a10, -0x6b850, -0x6b710, -0x6b990, -0x6b7d0, -0x6b760, -0x6b1b0, -0x6b440, -0x6a990, -0x6ba00, -0x6b100, -0x6b320, -0x6b6d0, -0x6b3f0, -0x69460 So, when handed chrome.debug as chrome, perf starts handing us symbols that are around 0x6b000-0x6c000 bytes farther in the binary than if we handed it chrome.stripped. `readelf` notes that the .text segment for chrome.stripped starts at 0x6bb40 bytes into the binary. It notes the same for chrome.debug, but there's some different markup around the 0x6bb40 in the readelf for chrome.debug. Given that: - this guess would be really convenient if it's correct, - the numbers here are eerily close, and - I don't know what other symbol/debug info perf would want to extract from chrome.stripped ...I'm going down this path. I have a copy of perf's source locally; I'll briefly try poking it and we'll see what falls out.
,
Sep 19
(FWIW, I also hand-symbolized a few samples from the 'good' perf invocation to be sure that it all looks OK. All the sample addr+perf-reported-symbol pairs I tried matched up with what `nm`s said the symbol addrs should be)
,
Sep 19
+gmx and lakshmana fyi. I'm going to poke create_llvm_prof to be sure it doesn't have the same issue (I'm willing to bet it doesn't). Will keep this open until investigation of create_llvm_prof is done. tl;dr: yeah, user error that perf silently accepts. Don't have enough context on the creative uses of ELF + perf to say whether detection and warning on this misuse is possible/desirable. https://github.com/torvalds/linux/blob/master/tools/perf/util/symbol-elf.c#L1089 is where the symbol address adjustment is happening. For chrome.stripped, .text's shdr.sh_offset == shdr.sh_addr, so the adjustment is a nop. For chrome.debug, .text's shdr.sh_addr == 0x6bb40 and shdr.sh_offset == 0x2d0, so we end up shifting the address of each symbol back by 0x6b870 bytes. The .text section (among many others) in chrome.debug is a special SHT_NOBITS section, which makes it (and, importantly, all SHT_NOBITS sections before it) occupy zero bytes. Hence the difference in sh_offsets. The issue is that perf is told that `chrome.debug` is the binary we're loading into memory, so it treats those offsets as final, even though they have no bits. Maaaybe we can put a special warning/heuristic in perf to the effect of "hey, this section is NOBITS, yet it's .text. Maybe you're pointing us at split debuginfo. Don't do that." I don't know how common it is for sh_addr != sh_offset; if it's really uncommon, we may even be able to say "Maybe you have split debuginfo, but we're going to try our best anyway." But I don't have enough context on all of this to guarantee whether {that's a good idea,that'll be accepted by upstream}.
,
Sep 19
these used to be the instructions I followed long time ago: https://wiki.corp.google.com/twiki/bin/view/Main/ProfilingWithPerf we should put these somewhere more "modern" to make sure we are not making the same mistake again.
,
Sep 19
and definetly having an automated way to detect this is a good idea.
,
Sep 20
|
|||||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||||
Comment 1 by laszio@chromium.org
, Sep 3