New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 606413 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Android
Pri: 3
Type: Bug



Sign in to add a comment

Chrome crashes when trying to read pak files of which size are zero.

Project Member Reported by agrieve@chromium.org, Apr 25 2016

Issue description

When Chrome crashes, we can see that the size of pak files is zero and think that they are corrupted.
But unfortunately we didn't get them so I can't share the real pak files now. 

[ Issue Case ]
root@m209n:/data/data/com.android.chrome/app_chrome/paks # ls -l
rw------ u0_a79 u0_a79 0 1970-01-02 10:27 en-US.pak
rw------ u0_a79 u0_a79 0 1970-01-02 10:27 ko.pak
rw------ u0_a79 u0_a79 0 1970-01-02 10:27 pak_timestamp-249007601-1455593443000

[ Normal Case ]
root@m209n:/data/data/com.android.chrome/app_chrome/paks # ls -l
rw------ u0_a79 u0_a79 78902 1970-01-01 14:59 en-US.pak
rw------ u0_a79 u0_a79 97509 1970-01-01 14:59 ko.pak
rw------ u0_a79 u0_a79 0 1970-01-01 14:59 pak_timestamp-249007601-1455593443000 

2. Is there any repro steps for the corruption? 
Reproducible step is simple but reproduce rate is very low.
Just try to launch Chrome.
 
Looking at ResourceExtractor.java:
https://code.google.com/p/chromium/codesearch#chromium/src/base/android/java/src/org/chromium/base/ResourceExtractor.java

- When Chrome is updated, we delete all extracted files
- When extracting files, we explicitly check for .length != 0 right after extracting
- If any part of extracting fails, we delete all extracted files.

So, the only guess i have here is that extraction finishes successfully, OS / filesystem-level caching is happening, causing our explicit .length != 0 check to pass, and then something crashes and the OS / filesystem cache never actually flushes to disk.

Possible workarounds:
1. Make the timestamp file size != 0, and then consider it invalid if it turns out to be 0.
2. On start-up, check each file to see if its length is 0.
3. On start-up, check each file to see if its length is wrong.

#3 seems like the most robust, but to do efficiently requires some build rules to generate the expected sizes as a .java file. Since our effort to use .apk splits failed, and having locale .pak files as android resources (instead of assets) causes excess .arsc bloat, I think it probably is worth doing. However, it at least makes sense to wait for GYP to be gone before messing with the build scripts related to this.

As a quick fix, I'm going to implement #2, then revisit for #3.

Comment 3 by torne@chromium.org, Apr 26 2016

I looked at the ResourceExtractor code and tried to dig up my recollection of how Java file stuff maps to syscalls (then googled everything to see if the internet knows either), and I have a few thoughts that I believe are right:

1) Calling os.flush() is pointless in the code at present, you should just close the file once you're done writing it and then check the length after you close it. Close implies flush, the only reason to call flush is if you are going to do more with the file afterwards. On *nix this *probably* doesn't make any practical difference but it's at least within the scope of possible behaviour people worry about (possibly on some windows configs? internet is unclear) that the file metadata isn't updated until close even if you flush the stream, so at worst this change makes the function shorter and at best it might fix something :)

2) Once you call flush() (directly or implicitly in close() the JVM is supposed to have guaranteed that it's submitted all the writes to the kernel. You're right that this doesn't guarantee they are flushed to the actual disk, but after this point the *only* way for these writes to not eventually end up on disk is if the *kernel* crashes, the device loses power, or the disk itself fails the writes. Chrome crashing, or even the entire Android java framework dying, would *not* prevent the disk writes. So, while this is possible, it's pretty unlikely, that would have to be a very unfortunately timed kernel crash.

3) The scheme with having different timestamp files generated from version info seems complicated and while I don't see a bug there might be one. It might be better to just have a fixed name for the "done" file and write the string we expect to find into it as data?

4) If we think the case we care about *actually is* the system dying before the kernel's buffer cache is written to disk then you need to be *much* more careful about things here than the proposals above: disk writes aren't ordered (so a particular file being okay doesn't mean others written earlier are) and checking length isn't enough to decide if they're valid (files can have the correct length but empty or garbage contents, as metadata writes aren't guaranteed to be ordered wrt. data writes either). The algorithm you'd want to follow would be something like:

- For each file that should be extracted:
  a) Write the correct contents into it
  b) Call flush() on the output stream to ensure all writes are issued to the kernel
  c) Call getFD().sync() on the output stream to issue an fsync() on the underlying file descriptor, which will block until the kernel believes the data is *actually on disk* (note: disk may still be lying but this is not fixable from userspace)
  d) Close it
- Once all files are done without any of them throwing an exception:
  a) Open (for create) the file used to mark that it's done
  b) Call sync() on the file descriptor to make sure that the existence of the file is actually on disk
  c) Close it

Comment 4 by torne@chromium.org, Apr 26 2016

Cc: torne@chromium.org
Project Member

Comment 5 by bugdroid1@chromium.org, Apr 27 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b28607d4c72261657ea9e24ea9195bc23a472b3f

commit b28607d4c72261657ea9e24ea9195bc23a472b3f
Author: agrieve <agrieve@chromium.org>
Date: Wed Apr 27 14:43:11 2016

Check for 0-length files in ResourceExtractor

Also moves from an on-disk version timestamp file to SharedPreferences
(should be faster).

BUG= 606413 

Review URL: https://codereview.chromium.org/1920893003

Cr-Commit-Position: refs/heads/master@{#390068}

[modify] https://crrev.com/b28607d4c72261657ea9e24ea9195bc23a472b3f/base/android/java/src/org/chromium/base/ResourceExtractor.java

Status: Fixed (was: Started)
Didn't go to quite the extremes as detailed 4) above, but at least simplified the timestamp file logic and made it re-extract when it encounters a 0-length file.
Project Member

Comment 7 by bugdroid1@chromium.org, May 11 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/a3f2dd5d896ca693224403ad072cfad70882174b

commit a3f2dd5d896ca693224403ad072cfad70882174b
Author: qyearsley <qyearsley@chromium.org>
Date: Wed May 11 19:24:23 2016

Revert of Check for 0-length files in ResourceExtractor (patchset #3 id:40001 of https://codereview.chromium.org/1920893003/ )

Reason for revert:
Possibly regressed memory usage on android ( http://crbug.com/607598 )

Speculative revert to see if vm_private_dirty_final_browser values go back down for android bots on the perf waterfall

Original issue's description:
> Check for 0-length files in ResourceExtractor
>
> Also moves from an on-disk version timestamp file to SharedPreferences
> (should be faster).
>
> BUG= 606413 
>
> Committed: https://crrev.com/b28607d4c72261657ea9e24ea9195bc23a472b3f
> Cr-Commit-Position: refs/heads/master@{#390068}

TBR=yfriedman@chromium.org,torne@chromium.org,wnwen@chromium.org,agrieve@chromium.org
# Not skipping CQ checks because original CL landed more than 1 days ago.
BUG= 606413 

Review-Url: https://codereview.chromium.org/1948033004
Cr-Commit-Position: refs/heads/master@{#393023}

[modify] https://crrev.com/a3f2dd5d896ca693224403ad072cfad70882174b/base/android/java/src/org/chromium/base/ResourceExtractor.java

Project Member

Comment 8 by bugdroid1@chromium.org, May 19 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f17984d06c7a736aa2cf7aac3d8aa6dc2628ef9a

commit f17984d06c7a736aa2cf7aac3d8aa6dc2628ef9a
Author: agrieve <agrieve@chromium.org>
Date: Thu May 19 21:17:06 2016

Reland of Check for 0-length files in ResourceExtractor

Reason for reland:
No longer using ZipFile. Hopefully this was the reason
for the dirty memory.

TBR=yfriedman@chromium.org,torne@chromium.org,wnwen@chromium.org,qyearsley@chromium.org
BUG= 606413 , 607598 

Review-Url: https://codereview.chromium.org/1996663002
Cr-Commit-Position: refs/heads/master@{#394866}

[modify] https://crrev.com/f17984d06c7a736aa2cf7aac3d8aa6dc2628ef9a/base/android/java/src/org/chromium/base/ResourceExtractor.java

Sign in to add a comment