New issue
Advanced search Search tips

Issue 821262 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner: ----
Closed: Jun 2018
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Sign in to add a comment

ReadFileToStringWithMaxSize should be optimized

Reported by misto...@yandex-team.ru, Mar 13 2018

Issue description

1. Currently ReadFileToStringWithMaxSize uses constant buffer size (64kb). Function can't know actual file size and has to read file sequentially till EOF. We can use GetFileSize function as a hint for buffer size, to speed up file reading (avoid reading to small buffer and appending that buffer to string in a loop). Also many fread syscalls may lead to frequent rescheduling if there are not enough idle threads (for example on browser startup) which leads to performance degradation.

2. Currently reading loop ends when fread returns 0. The latest fread syscall can be avoided using feof function which is just a flag check.

Performance results of these optimizations:
https://docs.google.com/spreadsheets/d/1P8x9_k8aEKlWs2kd8ltlMWKh4r9sJo4MpWUuIJZ9l7M/edit?usp=sharing
 
A brief glance at the speedups don't seem to show significant gains. A typical speedup is from 39.7 to 38.7. I'm not sure what the units are but either way the gain is modest. If the optimized version is simpler as well as faster then it still makes sense, but mostly for the simplification rather than the speedup.
All measurements in doc are in milliseconds (forgot to add it to doc).

The gain is valuable when ReadFileToString appears for large files (larger than buffer size) while all CPU threads are busy. For 5% of measurements there are notable slowdowns up to 2x times from mean (see 13+ threads, 1mb in performance results - 95 percentile).
I guess many users have 2-core CPUs and situation when all threads are busy and reading occurs can easily happen for them. And this fix may save about 20-50ms on file reading for 5% of them.
It sounds like you have a fix ready, do you want to create and upload a CL? I can't really comment on the proposed implementation without seeing it.

Since the benefits are greatest with larger buffers, maybe test with a 2 MB buffer, or even larger?

It looks like the mean is not capturing the improvement well, maybe add 50th percentile and 95th percentile numbers? Although, I'm still not sure why the mean doesn't show more movement on the 13-thread 1 MB version.

Note: if one advantage of this change is that it avoids resizing the buffer, with all of the extra allocating, copying, and freeing that this implies. On Windows this maps to reduced CPU time in the Chrome process (faulting in pages and faulting them out again is expensive) and also reduced CPU time in the system process (which is responsible for zeroing recently freed pages). It would be interesting to see if this affects CPU usage more than it affects elapsed time.

I would expect the fastest way to read a file to be either to memory map it or to read it in one read call. Feel free to CC me on your CL when you upload it so that I can see what you are proposing.

Components: Internals>Core
Cc: mark@chromium.org brucedaw...@chromium.org
Labels: Pri-3
Status: Started (was: Unconfirmed)
The CL looks reasonable, just needs some cleanup. Marking as assigned.
Project Member

Comment 7 by bugdroid1@chromium.org, May 16 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1ede155664606236fe8958b0a4c7e695bb378c39

commit 1ede155664606236fe8958b0a4c7e695bb378c39
Author: Mikhail Istomin <mistomin@yandex-team.ru>
Date: Wed May 16 17:40:24 2018

Optimize ReadFileToStringWithMaxSize function

GetFileSize can be used to estimate buffer size if file size is
available. So use it as a hint for buffer size for file reading.
Also avoid latest fread syscall in reading loop using feof.

Bug:  821262 
Change-Id: I7dc30483a2f95d4da2c0d3cb0049941d2634a281
Reviewed-on: https://chromium-review.googlesource.com/958861
Commit-Queue: Daniel Cheng <dcheng@chromium.org>
Reviewed-by: Daniel Cheng <dcheng@chromium.org>
Cr-Commit-Position: refs/heads/master@{#559168}
[modify] https://crrev.com/1ede155664606236fe8958b0a4c7e695bb378c39/base/files/file_util.cc
[modify] https://crrev.com/1ede155664606236fe8958b0a4c7e695bb378c39/base/files/file_util_unittest.cc

If this is done then you can mark it as fixed. It looks like it's working smoothly.
Status: Fixed (was: Started)
Marking as fixed. Please reopen if there is more work to do, with a description of the work remaining.

Sign in to add a comment