
Issue 826305

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: May 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 836663
issue 759794
issue 789981




Find a way to avoid hard errors during merge when some dumps are corrupted

Project Member Reported by mmoroz@chromium.org, Mar 27 2018

Issue description

An example of that:

error: dumps/service_manager_unittests/dump.14855357469612657648_3.profraw: Invalid instrumentation profile data (file header is corrupt)


After that message, "llvm-profdata merge" exits with an error and leaves a .profdata file of 0 bytes. I think we need to either turn that error into a warning, or find a way to run a quick sanity check on each *.profraw file and exclude the corrupted ones.

I'm also not sure why some files get corrupted, but it happens very rarely. Even in this particular case, only one file out of ~650 is corrupted:

$ ls -l dumps/service_manager_unittests
total 126256
-rw-rw-r-- 1 mmoroz mmoroz  3739816 Mar 27 06:33 dump.14787693995747527664_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3910440 Mar 27 06:33 dump.14841300326566677488_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3952848 Mar 27 06:33 dump.14853985015889742832_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3962680 Mar 27 06:33 dump.14855357469612657648_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3962680 Mar 27 06:33 dump.14855357469612657648_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz    81920 Mar 27 06:33 dump.14855357469612657648_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3958896 Mar 27 06:33 dump.14855955461059202032_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3958896 Mar 27 06:33 dump.14855955461059202032_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3965736 Mar 27 06:33 dump.14857732386835483632_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3961808 Mar 27 06:33 dump.14858998896365226992_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  3961808 Mar 27 06:33 dump.14858998896365226992_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4006720 Mar 27 06:33 dump.14864647133376891888_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4040416 Mar 27 06:33 dump.14872828073745927152_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4040416 Mar 27 06:33 dump.14872828073745927152_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4023320 Mar 27 06:33 dump.14878122654581476336_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4023320 Mar 27 06:33 dump.14878122654581476336_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4023320 Mar 27 06:33 dump.14878122654581476336_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4023320 Mar 27 06:33 dump.14878122654581476336_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4086128 Mar 27 06:33 dump.14883876727446391792_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4086128 Mar 27 06:33 dump.14883876727446391792_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4086128 Mar 27 06:33 dump.14883876727446391792_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4111184 Mar 27 06:33 dump.14891371443556535280_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4111184 Mar 27 06:33 dump.14891371443556535280_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4111184 Mar 27 06:33 dump.14891371443556535280_2.profraw
-rw-rw-r-- 1 mmoroz mmoroz  4111184 Mar 27 06:33 dump.14891371443556535280_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz 10965896 Mar 27 06:33 dump.18442437479409972138_0.profraw
-rw-rw-r-- 1 mmoroz mmoroz 10965896 Mar 27 06:33 dump.18442437479409972138_1.profraw
-rw-rw-r-- 1 mmoroz mmoroz 10965896 Mar 27 06:33 dump.18442437479409972138_3.profraw
-rw-rw-r-- 1 mmoroz mmoroz    16806 Mar 27 06:33 log.txt


 

Comment 1 by mmoroz@chromium.org, Apr 20 2018

Experienced that problem once again while working on issue 834781:

$ time third_party/llvm-build/Release+Asserts/bin/llvm-profdata merge -sparse out/full2/new*.profraw -o out/full2/new.profdata
error: out/full2/new.7052560453082142202_1.profraw: Invalid instrumentation profile data (file header is corrupt)

real	0m32.218s
user	4m5.920s
sys	0m38.440s

$ ls -l out/full2/new.7052560453082142202_*.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:09 out/full2/new.7052560453082142202_0.profraw
-rw-r--r-- 1 mmoroz primarygroup 16379904 Apr 20 08:33 out/full2/new.7052560453082142202_1.profraw <--- !
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:33 out/full2/new.7052560453082142202_2.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:03 out/full2/new.7052560453082142202_3.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:04 out/full2/new.7052560453082142202_4.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:08 out/full2/new.7052560453082142202_5.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:09 out/full2/new.7052560453082142202_6.profraw
-rw-r--r-- 1 mmoroz primarygroup 81989480 Apr 20 08:09 out/full2/new.7052560453082142202_7.profraw


$ ls -l out/full2/new.profdata
-rw-r--r-- 1 mmoroz primarygroup 0 Apr 20 08:35 out/full2/new.profdata
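
A failed merge like the one above leaves a 0-byte output behind, so a caller may want to check both the return code and the output size. This is a minimal Python 3 sketch of such a check, not the code from coverage.py; the helper names `merge_succeeded` and `checked_merge` are hypothetical, and the only llvm-profdata options used are the `merge -sparse ... -o` form shown in the command above.

```python
import os
import subprocess


def merge_succeeded(return_code, output_path):
    """True only if the tool exited cleanly and wrote a non-empty file."""
    return (return_code == 0 and os.path.isfile(output_path)
            and os.path.getsize(output_path) > 0)


def checked_merge(profraw_paths, output_path, tool='llvm-profdata'):
    """Runs one merge and cleans up the empty output if it failed.

    `tool` is an assumed path to the llvm-profdata binary.
    """
    cmd = [tool, 'merge', '-sparse'] + list(profraw_paths) + ['-o', output_path]
    return_code = subprocess.call(cmd)
    if merge_succeeded(return_code, output_path):
        return True
    # Remove the leftover so later steps cannot mistake it for a profile.
    if os.path.exists(output_path):
        os.remove(output_path)
    return False
```

Deleting the empty file on failure is a design choice: a later step that only tests for the file's existence would otherwise treat it as a valid profile.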


Comment 2 by mmoroz@chromium.org, Apr 20 2018

Labels: -Pri-2 Coverage-v1-Blocker Pri-1
It feels quite important to avoid a hard error when using the script, so I'm adding the Coverage-v1-Blocker label here.

I see 3 potential solutions:

1) Fix llvm-profdata to treat such a case as a warning rather than an error. Probably a bad idea, as users of llvm-profdata should not pass invalid inputs; it's their fault if they do.

2) Fix the script. Check the return code and output of "llvm-profdata merge". If there is an error like this one, print a WARNING, exclude the corrupted *.profraw file, and retry "llvm-profdata merge".

3) Implement an additional sanity check function in the script (a bad idea, as we'd have to understand the .profraw format and hope that it's not going to change), or add another option to llvm-profdata for performing a check.


I'm in favor of option 2), since llvm-profdata merge runs quite fast and the corrupted-profile error does not happen often.
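
Option 2) could look roughly like the Python 3 sketch below. It is an illustration under stated assumptions, not the change that eventually landed. The regular expression is based only on the error text quoted in this issue and may not match other LLVM versions, and `run_llvm_profdata_merge` and `merge_with_retry` are hypothetical names.

```python
import re
import subprocess

# Assumption: stderr contains a line shaped like the one quoted in this issue:
#   error: <path>: Invalid instrumentation profile data (...)
_CORRUPT_RE = re.compile(
    r'^error: (?P<path>.+?): Invalid instrumentation profile data',
    re.MULTILINE)


def run_llvm_profdata_merge(profraw_paths, output_path, tool='llvm-profdata'):
    """Runs one merge and returns (return_code, stderr_text)."""
    cmd = [tool, 'merge', '-sparse'] + list(profraw_paths) + ['-o', output_path]
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    return proc.returncode, proc.stderr


def merge_with_retry(profraw_paths, output_path,
                     run_merge=run_llvm_profdata_merge):
    """Retries the merge, dropping each input that is reported as corrupt.

    Returns the list of excluded paths. Any failure that does not name one
    of the remaining inputs raises RuntimeError, so unrelated errors stay
    fatal.
    """
    remaining = list(profraw_paths)
    excluded = []
    while remaining:
        code, stderr = run_merge(remaining, output_path)
        if code == 0:
            return excluded
        match = _CORRUPT_RE.search(stderr)
        if not match or match.group('path') not in remaining:
            raise RuntimeError('merge failed: %s' % stderr.strip())
        bad = match.group('path')
        print('WARNING: excluding corrupted profile %s' % bad)
        remaining.remove(bad)
        excluded.append(bad)
    raise RuntimeError('no valid .profraw files left to merge')
```

Each retry repeats the whole merge, so the cost grows with the number of corrupted files. That is acceptable only as long as corruption stays as rare as described above.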

Comment 3 by mmoroz@chromium.org, Apr 24 2018

Owner: mmoroz@chromium.org
Status: Assigned (was: Untriaged)
Gave it another thought; we should fix it in the following way (similar to my previous comment, but more detailed):

1) _GetProfileRawDataPathsByExecutingCommands should group .profraw files for each (target, command). Currently we have a single list of .profraw files generated by all the commands/targets.

2) _CreateCoverageProfileDataFromProfRawData should merge those groups of .profraw files separately rather than all of them via a single command.

3) After every per-target merge command we should check the return code and the output. If there is an error like "Invalid instrumentation profile data (file header is corrupt)", go to step 4).

4) Call "llvm-profdata merge" for every individual .profraw file corresponding to the given target. Find corrupted profile(s), exclude them from further merge.

5) Repeat step 2) if needed.

6) Call "llvm-profdata merge" for the resulting .profdata files (if we have more than one). This step is super fast and is less error-prone compared to merge of .profraw files.
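
The steps above can be sketched in Python 3 as follows. This is an illustrative outline only, not the code from coverage.py. `merge` and `merge_per_target` are hypothetical helpers, the input is assumed to be already grouped per target (step 1), and for brevity step 3 checks only the return code rather than also parsing the output.

```python
import os
import subprocess


def merge(inputs, output, tool='llvm-profdata'):
    """Runs `llvm-profdata merge -sparse`; returns True on success."""
    cmd = [tool, 'merge', '-sparse'] + list(inputs) + ['-o', output]
    return subprocess.call(cmd) == 0


def merge_per_target(profraw_by_target, out_dir, final_output, merge_fn=merge):
    """profraw_by_target maps a target name to its list of .profraw paths."""
    per_target_outputs = []
    for target, paths in sorted(profraw_by_target.items()):
        target_output = os.path.join(out_dir, target + '.profdata')
        good = list(paths)
        if not merge_fn(good, target_output):  # Steps 2) and 3).
            # Step 4): merge each file alone to find the corrupted one(s).
            probe = os.path.join(out_dir, target + '.probe.profdata')
            good = [p for p in paths if merge_fn([p], probe)]
            if os.path.exists(probe):
                os.remove(probe)
            # Step 5): redo the per-target merge without the bad files.
            if not good or not merge_fn(good, target_output):
                print('WARNING: no usable profiles for %s' % target)
                continue
        per_target_outputs.append(target_output)
    if not per_target_outputs:
        raise RuntimeError('no profiles could be merged')
    # Step 6): combine the per-target .profdata files into the final one.
    if not merge_fn(per_target_outputs, final_output):
        raise RuntimeError('failed to merge per-target profiles')
    return per_target_outputs
```

Probing files one by one happens only for a target whose grouped merge failed, so the common case still costs one merge per target plus the final merge.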


Assigning to myself, hopefully will start working on this later today.

Comment 4 by mmoroz@chromium.org, Apr 25 2018

Blocking: 836663
Owner: infe...@chromium.org
Passing this over to Abhishek. Please see comment #3 for the solution proposed.
Project Member

Comment 6 by bugdroid1@chromium.org, May 4 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/c19bc5ef3e4c85dd5f314f53e71d80bb4d51e5e4

commit c19bc5ef3e4c85dd5f314f53e71d80bb4d51e5e4
Author: Abhishek Arya <inferno@chromium.org>
Date: Fri May 04 22:10:02 2018

Make profraw merge error resilient in code coverage script.

R=mmoroz@chromium.org

Bug:  826305 
Change-Id: I3e3c196ad0f792a3aa94b3cb31fbe8d8df8fb5e1
Reviewed-on: https://chromium-review.googlesource.com/1044952
Commit-Queue: Abhishek Arya <inferno@chromium.org>
Reviewed-by: Max Moroz <mmoroz@chromium.org>
Cr-Commit-Position: refs/heads/master@{#556217}
[modify] https://crrev.com/c19bc5ef3e4c85dd5f314f53e71d80bb4d51e5e4/tools/code_coverage/coverage.py

Project Member

Comment 7 by bugdroid1@chromium.org, May 7 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/tools/code-coverage/+/c89e355320aad3823c9cce7f8785ec8cee5cea81

commit c89e355320aad3823c9cce7f8785ec8cee5cea81
Author: Abhishek Arya <inferno@chromium.org>
Date: Mon May 07 15:03:23 2018

Project Member

Comment 8 by bugdroid1@chromium.org, May 7 2018

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chrome/tools/code-coverage/+/8c8b1ee19e7085c18937d077446fe51170fe19d2

commit 8c8b1ee19e7085c18937d077446fe51170fe19d2
Author: Abhishek Arya <inferno@chromium.org>
Date: Mon May 07 15:19:35 2018

Status: Fixed (was: Assigned)
