New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 873847 link

Starred by 1 user

Issue metadata

Status: WontFix
Owner:
Last visit > 30 days ago
Closed: Sep 3
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug
Build-Toolchain



Sign in to add a comment

Investigate how AFDO profile gathering works with CFI/ThinLTO

Project Member Reported by g...@chromium.org, Aug 13

Issue description

(Assigning to you, laszio, at llozano's request. If you can't get to it, I'm happy to pick it up :) )

When doing other AFDO investigations, I found that CFI can cause us to emit different mangled symbols (`chrome` is Chrome/Linux in this case):

~/l/l/AFDO-3481 [ 3.92s ] ~> nm chrome | wc -l
617054
~/l/l/AFDO-3481 [ 4.00s ] ~> nm chrome | grep '\.cfi$' | wc -l
75477

It appears that ThinLTO will also (potentially) tag symbols with a hash-like suffix:

~/l/l/AFDO-3481 [ 4.14s ] ~> nm chrome | grep '\.cfi$' | head -n20 | cut -d' ' -f3
a2i_ipadd.cfi
aac_decode_close$936cd9b7879c207b48f2913effcacacf.cfi
aac_decode_frame$936cd9b7879c207b48f2913effcacacf.cfi
aac_decode_init$936cd9b7879c207b48f2913effcacacf.cfi
aac_parse_init$cbfa93a464bf2b41dc7181fd39bfa2f6.cfi
aac_static_table_init$936cd9b7879c207b48f2913effcacacf.cfi
aac_sync$cbfa93a464bf2b41dc7181fd39bfa2f6.cfi
ABGRToI420.cfi
ABGRToUVRow_Any_SSSE3.cfi
ABGRToUVRow_C.cfi
ABGRToUVRow_SSSE3.cfi
ABGRToYRow_Any_SSSE3.cfi
ABGRToYRow_C.cfi
ABGRToYRow_SSSE3.cfi
abort_message.cfi
absFunc$c0ebdb0c05b4ae8b8d2321e6f41a5205.cfi
ACCESS_DESCRIPTION_free.cfi
access_virt_barray$6d492c98ee523c524cb677da1abfc522.cfi
access_virt_sarray$6d492c98ee523c524cb677da1abfc522.cfi
AddGreenToBlueAndRed_SSE2$2f46114d91e02a1a72b3290089da97d9.cfi

It would likely be good to verify that neither of these keeps our `perf` traces from matching with symbols in AFDO (since profile data matching presumably happens before cfi/this hash are applied, and "this is `aac_decode_close`, not `aac_decode_close$936cd9b7879c207b48f2913effcacacf.cfi`! Guess I don't have any profile data for it, ..." seems like a subtle issue)

It's also interesting to note that we may emit both a cfi'ed and non-cfi'ed variant of the same symbol:
~/l/l/AFDO-3481 [ 7.91s ] ~> nm chrome | cut -d' ' -f3 | sort -u | wc -l
611644
~/l/l/AFDO-3481 [ 8.12s ] ~> nm chrome | cut -d' ' -f3 | sed 's/\.cfi$//' | sort -u | wc -l
536504

So uh, even if we see samples from a function that was CFI'ed in the final profile, it may be that we're calling the non-CFI'ed variant? I dunno how CFI works in enough detail to say anything about that.
 
Cc: cmt...@chromium.org
+cmtice

Not sure if I'm doing it right, but

$ nm -a chrome.debug | grep '\.cfi$'

gives me nothing on latest chell dev image. Trying a few symbols in the above:

$ nm -a chrome.debug | grep aac_decode_close
000000000348efc0 t aac_decode_close

$ nm -a chrome.debug | grep ABGRToUVRow_Any_SSSE3
00000000033cafb0 t ABGRToUVRow_Any_SSSE3

$ nm -a chrome.debug | grep  absFunc
0000000004214cb0 t absFunc

Everything looks fine :) Looking at the ebuild, we only enabled "is_cfi" and "use_cfi_cast". A few cfi flags available in gn are not enabled in Chrome OS. Perhaps it's them making the differences and the problem hasn't come to Chrome OS yet.

Because I can't reproduce it with Chrome OS, I got a few questions that I can't easily verify:
1. Are they additional symbols, or are they transformed from "the original" symbols?
2. Can you show addresses and sizes of them?
3. (probably wait for cmtice@ to answer) how are those cfi symbols used?
> 1. Are they additional symbols, or are they transformed from "the original" symbols?

To be clear, do they replace the original symbols?
If we don't see this, that's great. :)

Like said, I dunno much about cfi's implementation details, so it would be really nice if this isn't something we'll ever encounter. If we do decide to rely on that (and decide that it does potentially break afdo/etc), it may be worth a test somewhere?

> 1. Are they additional symbols, or are they transformed from "the original" symbols?

It appears to be a mixture of both. e.g. from my Linux chrome build:

00000000019603f8 t a2i_ipadd
0000000003e8ca90 t a2i_ipadd.cfi

a2i_ipadd boils down to tailcall to a2i_ipadd.cfi: "19603f8:       e9 93 c6 52 02          jmpq   3e8ca90 <a2i_ipadd.cfi>". a2i_ipadd.cfi looks like it does a lot of work, so I assume that's the "real" version of this function.

There's quite a few cases where we don't have both symbols. As noted above, I count 75,477 CFI-like symbols (all in .text). In 75,140 cases, we have a non-".cfi" suffix symbol to match. So, in 337 cases, we only have the .cfi variant.

10 examples of this are:
BIO_snprintf.cfi
WelsSleep.cfi
_Z13SkStrEndsWithPKcS0_.cfi
_Z15FX_CharSetIsCJKh.cfi
_Z26SkNotifyBitmapGenIDIsStalej.cfi
_ZN10DevToolsUI11GetProxyURLERKNSt3__112basic_stringIcNS0_11char_traitsIcEENS0_9allocatorIcEEEE.cfi
_ZN10SkTextBlob14MakeFromBufferER12SkReadBuffer.cfi
_ZN10disk_cache11simple_util15GetEntryHashKeyERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE.cfi
_ZN10extensions9Extension25GetBaseURLFromExtensionIdERKNSt3__112basic_stringIcNS1_11char_traitsIcEENS1_9allocatorIcEEEE.cfi
_ZN10spellcheck27GetSpellCheckLanguageRegionEN4base16BasicStringPieceINSt3__112basic_stringIcNS2_11char_traitsIcEENS2_9allocatorIcEEEEEE.cfi

...I'm failing to see why these are special, though. The non-.cfi versions of the above have no references (obviously), but a2i_ipadd has no grepable references in disassembly, either (though I can't tell what references may lie in e.g. tables sitting in rodata).

> 2. Can you show addresses and sizes of them?

I think the notes above gives you what you were looking for with this? If not, happy to dump some output here (or you can come poke me and I'm happy to chat/look at chrome interactively)
> (since profile data matching presumably happens before cfi/this hash ...

Is cfi applied before most of the optimization passes? If not,

IIUC this is the life of a sample in afdo:
1. sampled on target
2. saved in virtual address in perf.data
3. resolved into a symbol + offset and then source line by create_llvm_prof 
4. corresponding source line count += 1

then compiler uses the aggregated profile, which is represented in source line counts, to make decisions.

Chances are, if cfi handles debug info correctly (s.t. offsets can be correctly mapped to source lines), we should be fine.


If cfi is applied before the optimization passes, I guess some translation has to be done.

Having said that, only performance test tells. We should definitely do performance tests before enabling more cfi flags.
Cc: p...@chromium.org
The purpose of the .cfi symbols is to point the "real" definition of the function. The non-.cfi symbols point to jump tables that are used by indirect calls. The debug info is preserved when the real function definition is renamed so assuming that the lookup is based on source lines I don't see why things wouldn't work.
Status: WontFix (was: Untriaged)
According to #5, it doesn't look like a problem. Marking it won't fix for now.

Sign in to add a comment