New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 601948 link

Starred by 3 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug



Sign in to add a comment

Courgette incorrectly disassembles nacl_irt_x86_32.nexe

Project Member Reported by hua...@chromium.org, Apr 8 2016

Issue description

Version: 51
OS: Windows

What steps will reproduce the problem?

(1) Go to Chrome's install directly and copy nacl_irt_x86_32.nexe to temp directory. It's an ELF-x86 executable, about 3.1 MB.
(2) Use Courgette to convert this to assembly:
  courgette.exe -dis nacl_irt_x86_32.nexe nacl_irt_x86_32.nexe.asm
(3) Examine the size of output file nacl_irt_x86_32.nexe.asm
(4) Convert the assembly file back to .exe
  courgette.exe -asm nacl_irt_x86_32.nexe.asm test
(5) Compare final output with original nacl_irt_x86_32.nexe:
  fc /b nacl_irt_x86_32.nexe test

What is the expected output?
The assembly output should only be slightly bigger than the original. The final comparison should yield no (or little; since -dis might discard data, like in PE) difference.

What do you see instead?
Assembly program output is about double the size (6.3 MB).  Final output is also about that size.

Previously the ELF flow was broken, so nacl_irt_x86_32.nexe would just be disassembled.  But now that support for the file may be "fixed", Courgette may try to read it and do a poor job.  In particular, patch sizes may increase.  Although the final Courgette pass fix discrepancies, this would involve storing extra bytes.


 
Cc: chrisha@chromium.org
I think the root cause is that Courgette assumes ELF file sections are sorted by sh_offset , but this does not hold for nacl_irt_x86_32.nexe.  Dumping the 51.0.2700.0 version of the file yields:

*** Section Headers ***
                  : o:00000000 +00000000 => r:00000000
             .text: o:00090000 +00262680 => r:0FD80000
.note.gnu.build-id: o:000000F4 +00000024 => r:3EF600F4
           .rodata: o:00000120 +0003B3EC => r:3EF60120
 .gcc_except_table: o:0003B50C +0000941C => r:3EF9B50C
         .eh_frame: o:00044928 +0004604C => r:3EFB4928
            .tdata: o:0008A978 +00000478 => r:3EFFA978
             .tbss: o:0008ADF0 +00000014 => r:3EFFADF0
       .init_array: o:0008ADF0 +00000004 => r:3EFFADF0
       .fini_array: o:0008ADF4 +00000004 => r:3EFFADF4
          .got.plt: o:0008ADF8 +0000000C => r:3EFFADF8
             .data: o:0008AE08 +00000CD8 => r:3EFFAE08
              .bss: o:0008BAE0 +00002540 => r:3EFFBAE0
          .comment: o:00300000 +000000ED => r:00000000
    .gnu_debuglink: o:003000ED +00000020 => r:00000000
         .shstrtab: o:0030010D +00000099 => r:00000000

(o:XXXXXXXX => file offsets).  Note that between .bss and .comment, the current offset and size went
from: o:0008BAE0 +00002540
  to: o:00300000 +000000ED

So the bytse between are copied.  But really this was already covered in the .text section!

Solution would be to sort the sections first.  Experimenting with that.
Project Member

Comment 2 by bugdroid1@chromium.org, Apr 9 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/8cffb28385dffb40c2c773b5b4c0730a533cd025

commit 8cffb28385dffb40c2c773b5b4c0730a533cd025
Author: huangs <huangs@chromium.org>
Date: Sat Apr 09 19:43:50 2016

[Courgette] Sort section headers by sh_offset in ELF flows.

Courgette ELF flows assumes that sections are sorted by |sh_offset|, but
this assumption may flow. In particular, nacl_irt_x86_32.nexe breaks
this, and this impacts Windows x86 Chrome because the .nexe file is
included. Solution is to do sort sections so we process them in the
order in file offset order.

This does not affect unittests. All test ELF data have "properly" sorted
sections, except elf-32-high-bss. And for this file, the offending .bss
section has sh_type=SHT_NOBITS, so DisassemblerElf32::ParseFile()
would ignore it.

BUG= 601948 

Review URL: https://codereview.chromium.org/1870293002

Cr-Commit-Position: refs/heads/master@{#386306}

[modify] https://crrev.com/8cffb28385dffb40c2c773b5b4c0730a533cd025/courgette/disassembler_elf_32.cc
[modify] https://crrev.com/8cffb28385dffb40c2c773b5b4c0730a533cd025/courgette/disassembler_elf_32.h

Cc: andrewhayden@chromium.org

Comment 4 by hua...@chromium.org, Apr 11 2016

Some numbers from running Courgette on nacl_irt_x86_32.nexe, from 48.0.2563.0 to 49.0.2623.0 (size 3,081,256 for both).  Patching works before fix and after fix, but patch sizes changed:

Before fix: 662,353
After fix:  288,442

So there's benefit to patch size!  Running experiment on chrome.7z now.


Comment 5 by hua...@chromium.org, Apr 11 2016

... but to be fair, the large patch size was a recent regression: Before ELF flow got fixed ( http://crbug.com/579206 ), Courgette would not recognize the file and use BSDiff, so before before I started touching ELF flow we had:

BSDiff: 369,034

Therefore the real saving in this case was 369,034 -> 288,442, or ~21.8% for this file.

Comment 6 by wfh@chromium.org, Apr 11 2016

We missed the M51 cut but I think this could be merged to M51 branch.

Comment 7 by hua...@chromium.org, Apr 11 2016

Compressed patch sizes:

             BSDiff: 114,541  (from 369,034)
Sections not sorted: 157,570  (from 662,353)
    Sections sorted:  83,522  (from 288,442)

so ~27% reduction, although not a big % of the ensemble patch.

Comment 8 by hua...@chromium.org, Apr 18 2016

There are more problems with ELF (focusing on DisassemblerElf32)
(1) ParseProgbitsSection() now visits section sorted by file offset order.  It should also visit abs32 and rel32 in the same order. As seen in RVAsToFileOffsets() (2 copies) this is done for abs32, but NOT rel32.

As seen in nacl_irt_x86_32.nexe sections from comment #1, (o: = file offset, r: = RVA):

           .rodata: o:00000120 +0003B3EC => r:3EF60120
             .text: o:00090000 +00262680 => r:0FD80000

So RVA order does not necessarily match file offset order.

(2) ParseRel32RelocsFromSections() is too permissive in deciding which section to parse. Sometimes it extracts bogus rel32 pointers from non-code sections, and bloats patches.
For nacl_irt_x86_32.nexe, an example is the .rodata section.

Project Member

Comment 9 by bugdroid1@chromium.org, Apr 28 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/3da0dd9308f66ae1122bbaefda2c99a4f90372d4

commit 3da0dd9308f66ae1122bbaefda2c99a4f90372d4
Author: huangs <huangs@chromium.org>
Date: Thu Apr 28 22:14:45 2016

[Courgette] ELF: Fix abs32 / rel32 ordering in ParseFile() and restrict rel32 parsing to .text.

This CL fixes 2 problems in Courgette ELF parsing:
1. ParseFile() scans through file bytes in file offset order while
   visiting abs32 and rel32 locations in lockstep. However, these
   locations were previously not sorted by file offset, resulting in
   some abs32 and rel32 locations being ignored.
2. ParseRel32RelocsFromSections() is too permissive, and extracts bogus
   rel32 addresses from non-code.

To solve (1) we sort abs32 and rel32 addresses by file offset in
ParseFile(). To solve (2) we restrict rel32 parsing to ".text" section
(heuristic). Also updating ELF test results.

BUG= 601948 

Review-Url: https://codereview.chromium.org/1928683002
Cr-Commit-Position: refs/heads/master@{#390506}

[modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32.cc
[modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32.h
[modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32_x86_unittest.cc
[modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/encode_decode_unittest.cc

Status: Fixed (was: Assigned)

Sign in to add a comment