CodeStubAssembler is breaking bad when it sees a deferred block
Issue description

CodeStubAssembler is breaking bad when it sees a deferred block. Let's try to improve the code below by "deferring" the Smi case (the instruction at offset 33):

0x13237a31fda0     0  488bdd        REX.W movq rbx,rbp
0x13237a31fda3     3  488b5bf0      REX.W movq rbx,[rbx-0x10]
0x13237a31fda7     7  488b5b2f      REX.W movq rbx,[rbx+0x2f]
0x13237a31fdab    11  488b5b0f      REX.W movq rbx,[rbx+0xf]
0x13237a31fdaf    15  f6c201        testb rdx,0x1
0x13237a31fdb2    18  0f8409000000  jz 33 (0x13237a31fdc1)
0x13237a31fdb8    24  488b7aff      REX.W movq rdi,[rdx-0x1]
0x13237a31fdbc    28  e904000000    jmp 37 (0x13237a31fdc5)
0x13237a31fdc1    33  498b7df8      REX.W movq rdi,[r13-0x8]   <-- Let's defer this unlikely Smi case.
0x13237a31fdc5    37  4c8bc0        REX.W movq r8,rax
0x13237a31fdc8    40  49c1e81d      REX.W shrq r8, 29
<... cut ...>

The result is surprisingly awesome:

0x13237a31fdc0     0  488bdd        REX.W movq rbx,rbp
0x13237a31fdc3     3  488b5bf0      REX.W movq rbx,[rbx-0x10]
0x13237a31fdc7     7  488b5b2f      REX.W movq rbx,[rbx+0x2f]
0x13237a31fdcb    11  488b5b0f      REX.W movq rbx,[rbx+0xf]
0x13237a31fdcf    15  f6c201        testb rdx,0x1
0x13237a31fdd2    18  0f84bd010000  jz 469 (0x13237a31ff95)
0x13237a31fdd8    24  488b7aff      REX.W movq rdi,[rdx-0x1]   <-- Normal fall-through to likely non-Smi case.
0x13237a31fddc    28  4c8bc0        REX.W movq r8,rax
0x13237a31fddf    31  49c1e81d      REX.W shrq r8, 29
<... cut ...>
0x13237a31ff95   469  4c8bd0        REX.W movq r10,rax  |
0x13237a31ff98   472  488bc2        REX.W movq rax,rdx  |
0x13237a31ff9b   475  498bd2        REX.W movq rdx,r10  |
0x13237a31ff9e   478  4c8bd6        REX.W movq r10,rsi  |
0x13237a31ffa1   481  488bf0        REX.W movq rsi,rax  |
0x13237a31ffa4   484  498bc2        REX.W movq rax,r10  |  WAT? o_O
0x13237a31ffa7   487  4c8bd0        REX.W movq r10,rax  |
0x13237a31ffaa   490  488bc6        REX.W movq rax,rsi  |
0x13237a31ffad   493  498bf2        REX.W movq rsi,r10  |
0x13237a31ffb0   496  4c8bd2        REX.W movq r10,rdx  |
0x13237a31ffb3   499  488bd0        REX.W movq rdx,rax  |
0x13237a31ffb6   502  498bc2        REX.W movq rax,r10  |
0x13237a31ffb9   505  498b7df8      REX.W movq rdi,[r13-0x8]
0x13237a31ffbd   509  e91afeffff    jmp 28 (0x13237a31fddc)
0x13237a31ffc2   514  0f0b          ud2

I was told it's a register allocator issue, therefore I'm assigning this to Mircea. To reproduce, defer the |load_smi_map| label in CodeStubAssembler::LoadReceiverMap() in this CL: https://codereview.chromium.org/2031753003/.
Jun 3 2016
I'm not sure I understand what the perceived problem is, and what the expected outcome would be. What I see is what I'd expect: the deferred path is penalized by the register allocator to potentially have more moves, in favor of avoiding that elsewhere in the hot code. This has the tradeoff of potentially larger and less efficient code on the deferred path.
Jun 3 2016
What I'm complaining about is that this whole block is one big NOP which shuffles the registers for no actual reason (note also that none of these registers is used in the block itself):

rax -> r10 -> rdx -> r10 -> rax
rdx -> rax -> rsi -> rax -> rdx
rsi -> r10 -> rax -> r10 -> rsi
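The claim that the block is a no-op can be checked mechanically. A minimal Python sketch (register contents are symbolic placeholders, not part of the original report) replays the twelve moves from the deferred block at offsets 469-502:

```python
# Replay the deferred block's moves (movq dst,src  =>  regs[dst] = regs[src]).
moves = [
    ("r10", "rax"), ("rax", "rdx"), ("rdx", "r10"),
    ("r10", "rsi"), ("rsi", "rax"), ("rax", "r10"),
    ("r10", "rax"), ("rax", "rsi"), ("rsi", "r10"),
    ("r10", "rdx"), ("rdx", "rax"), ("rax", "r10"),
]

regs = {"rax": "A", "rdx": "D", "rsi": "S", "r10": "?"}
for dst, src in moves:
    regs[dst] = regs[src]

# rax, rdx, and rsi all end up holding their original values:
print(regs)  # -> {'rax': 'A', 'rdx': 'D', 'rsi': 'S', 'r10': 'A'}
```

Aside from clobbering the r10 scratch register, the sequence composes to the identity, confirming the block is pure overhead.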
Jun 3 2016
Thanks for the clarification. I think this is related to how we deal with cycles in moves. It's on my radar, and I want to get to it as soon as higher priority items get out of the picture.
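For context on the cycle handling mentioned above: a set of parallel moves (all sources notionally read at once) cannot always be serialized directly, because cycles such as a swap or a rotation need a scratch register. The sketch below is a minimal illustration of the idea only; it is not V8's actual GapResolver, and the function name is invented:

```python
def resolve(moves, scratch):
    """Serialize parallel moves [(dst, src), ...] into ordered moves,
    breaking cycles with the `scratch` register."""
    pending = dict(moves)  # dst -> src
    out = []
    while pending:
        # A destination that no pending move still reads can be written now.
        free = [d for d in pending if d not in pending.values()]
        if free:
            d = free[0]
            out.append((d, pending.pop(d)))
        else:
            # Only cycles remain: park one source in scratch and
            # redirect every read of it to the scratch register.
            _, s = next(iter(pending.items()))
            out.append((scratch, s))
            pending = {d: (scratch if src == s else src)
                       for d, src in pending.items()}
    return out

# A two-register swap costs three sequential moves:
print(resolve([("rax", "rdx"), ("rdx", "rax")], "r10"))
# -> [('r10', 'rdx'), ('rdx', 'rax'), ('rax', 'r10')]
```

Each cycle costs one extra move through the scratch register; the surprise in this issue is not that scratch traffic, but that the emitted sequence as a whole composes to the identity.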
Oct 11 2016
This should be addressed in https://crrev.com/33629651584b669c0bd24a9be7bad868f01ea809. I validated it on similar occurrences in Octane. The patch mentioned in this issue does not apply cleanly anymore, and I'm missing context on what has happened since. For example, did this change land under a different CL? Igor, do you have another repro I could use, or do we consider the Octane evidence to be sufficient validation? Thanks!
Comment 1 by ishell@chromium.org, Jun 3 2016