Courgette incorrectly disassembles nacl_irt_x86_32.nexe |
|||
Issue descriptionVersion: 51 OS: Windows What steps will reproduce the problem? (1) Go to Chrome's install directly and copy nacl_irt_x86_32.nexe to temp directory. It's an ELF-x86 executable, about 3.1 MB. (2) Use Courgette to convert this to assembly: courgette.exe -dis nacl_irt_x86_32.nexe nacl_irt_x86_32.nexe.asm (3) Examine the size of output file nacl_irt_x86_32.nexe.asm (4) Convert the assembly file back to .exe courgette.exe -asm nacl_irt_x86_32.nexe.asm test (5) Compare final output with original nacl_irt_x86_32.nexe: fc /b nacl_irt_x86_32.nexe test What is the expected output? The assembly output should only be slightly bigger than the original. The final comparison should yield no (or little; since -dis might discard data, like in PE) difference. What do you see instead? Assembly program output is about double the size (6.3 MB). Final output is also about that size. Previously the ELF flow was broken, so nacl_irt_x86_32.nexe would just be disassembled. But now that support for the file may be "fixed", Courgette may try to read it and do a poor job. In particular, patch sizes may increase. Although the final Courgette pass fix discrepancies, this would involve storing extra bytes.
,
Apr 9 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/8cffb28385dffb40c2c773b5b4c0730a533cd025 commit 8cffb28385dffb40c2c773b5b4c0730a533cd025 Author: huangs <huangs@chromium.org> Date: Sat Apr 09 19:43:50 2016 [Courgette] Sort section headers by sh_offset in ELF flows. Courgette ELF flows assumes that sections are sorted by |sh_offset|, but this assumption may flow. In particular, nacl_irt_x86_32.nexe breaks this, and this impacts Windows x86 Chrome because the .nexe file is included. Solution is to do sort sections so we process them in the order in file offset order. This does not affect unittests. All test ELF data have "properly" sorted sections, except elf-32-high-bss. And for this file, the offending .bss section has sh_type=SHT_NOBITS, so DisassemblerElf32::ParseFile() would ignore it. BUG= 601948 Review URL: https://codereview.chromium.org/1870293002 Cr-Commit-Position: refs/heads/master@{#386306} [modify] https://crrev.com/8cffb28385dffb40c2c773b5b4c0730a533cd025/courgette/disassembler_elf_32.cc [modify] https://crrev.com/8cffb28385dffb40c2c773b5b4c0730a533cd025/courgette/disassembler_elf_32.h
,
Apr 11 2016
,
Apr 11 2016
Some numbers from running Courgette on nacl_irt_x86_32.nexe, from 48.0.2563.0 to 49.0.2623.0 (size 3,081,256 for both). Patching works before fix and after fix, but patch sizes changed: Before fix: 662,353 After fix: 288,442 So there's benefit to patch size! Running experiment on chrome.7z now.
,
Apr 11 2016
... but to be fair, the large patch size was a recent regression: Before ELF flow got fixed ( http://crbug.com/579206 ), Courgette would not recognize the file and use BSDiff, so before before I started touching ELF flow we had: BSDiff: 369,034 Therefore the real saving in this case was 369,034 -> 288,442, or ~21.8% for this file.
,
Apr 11 2016
We missed the M51 cut but I think this could be merged to M51 branch.
,
Apr 11 2016
Compressed patch sizes:
BSDiff: 114,541 (from 369,034)
Sections not sorted: 157,570 (from 662,353)
Sections sorted: 83,522 (from 288,442)
so ~27% reduction, although not a big % of the ensemble patch.
,
Apr 18 2016
There are more problems with ELF (focusing on DisassemblerElf32)
(1) ParseProgbitsSection() now visits section sorted by file offset order. It should also visit abs32 and rel32 in the same order. As seen in RVAsToFileOffsets() (2 copies) this is done for abs32, but NOT rel32.
As seen in nacl_irt_x86_32.nexe sections from comment #1, (o: = file offset, r: = RVA):
.rodata: o:00000120 +0003B3EC => r:3EF60120
.text: o:00090000 +00262680 => r:0FD80000
So RVA order does not necessarily match file offset order.
(2) ParseRel32RelocsFromSections() is too permissive in deciding which section to parse. Sometimes it extracts bogus rel32 pointers from non-code sections, and bloats patches.
For nacl_irt_x86_32.nexe, an example is the .rodata section.
,
Apr 28 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/3da0dd9308f66ae1122bbaefda2c99a4f90372d4 commit 3da0dd9308f66ae1122bbaefda2c99a4f90372d4 Author: huangs <huangs@chromium.org> Date: Thu Apr 28 22:14:45 2016 [Courgette] ELF: Fix abs32 / rel32 ordering in ParseFile() and restrict rel32 parsing to .text. This CL fixes 2 problems in Courgette ELF parsing: 1. ParseFile() scans through file bytes in file offset order while visiting abs32 and rel32 locations in lockstep. However, these locations were previously not sorted by file offset, resulting in some abs32 and rel32 locations being ignored. 2. ParseRel32RelocsFromSections() is too permissive, and extracts bogus rel32 addresses from non-code. To solve (1) we sort abs32 and rel32 addresses by file offset in ParseFile(). To solve (2) we restrict rel32 parsing to ".text" section (heuristic). Also updating ELF test results. BUG= 601948 Review-Url: https://codereview.chromium.org/1928683002 Cr-Commit-Position: refs/heads/master@{#390506} [modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32.cc [modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32.h [modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/disassembler_elf_32_x86_unittest.cc [modify] https://crrev.com/3da0dd9308f66ae1122bbaefda2c99a4f90372d4/courgette/encode_decode_unittest.cc
,
Sep 20 2016
|
|||
►
Sign in to add a comment |
|||
Comment 1 by hua...@chromium.org
, Apr 8 2016I think the root cause is that Courgette assumes ELF file sections are sorted by sh_offset , but this does not hold for nacl_irt_x86_32.nexe. Dumping the 51.0.2700.0 version of the file yields: *** Section Headers *** : o:00000000 +00000000 => r:00000000 .text: o:00090000 +00262680 => r:0FD80000 .note.gnu.build-id: o:000000F4 +00000024 => r:3EF600F4 .rodata: o:00000120 +0003B3EC => r:3EF60120 .gcc_except_table: o:0003B50C +0000941C => r:3EF9B50C .eh_frame: o:00044928 +0004604C => r:3EFB4928 .tdata: o:0008A978 +00000478 => r:3EFFA978 .tbss: o:0008ADF0 +00000014 => r:3EFFADF0 .init_array: o:0008ADF0 +00000004 => r:3EFFADF0 .fini_array: o:0008ADF4 +00000004 => r:3EFFADF4 .got.plt: o:0008ADF8 +0000000C => r:3EFFADF8 .data: o:0008AE08 +00000CD8 => r:3EFFAE08 .bss: o:0008BAE0 +00002540 => r:3EFFBAE0 .comment: o:00300000 +000000ED => r:00000000 .gnu_debuglink: o:003000ED +00000020 => r:00000000 .shstrtab: o:0030010D +00000099 => r:00000000 (o:XXXXXXXX => file offsets). Note that between .bss and .comment, the current offset and size went from: o:0008BAE0 +00002540 to: o:00300000 +000000ED So the bytse between are copied. But really this was already covered in the .text section! Solution would be to sort the sections first. Experimenting with that.