Excessive disk I/O when rewriting History file
Reported by lightlyf...@gmail.com, Apr 17 2017
Issue description

Chrome Version: 58.0.3004.3 (Developer Build)

Over the past couple of months, I've begun to notice Chrome causing significant disk I/O for somewhere around 3-6 minutes at a time. The browser remains fully responsive during this time, but the disk LED is on pretty solid. I'm on a fairly old (okay, 11-year-old) machine with a spinning HDD. Naturally I was curious about the cause of this continuous, sustained disk I/O (and why the browser remained responsive during it :P) and did a bit of digging. I looked in htop to find the source of the I/O, and straced the couple of processes I found. I was rewarded with:

    Process 646 attached
    Process 695 attached
    [pid 646] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
    [pid 695] ftruncate64(215, 0) = 0
    [pid 695] fdatasync(215) = 0
    [pid 695] pread64(65, "SQLite format 3\0\20\0\1\1\0@ \0\0M\32\0\0 \2"..., 4096, 0) = 4096
    [pid 695] pread64(65, "\5\0\0\0\2\17\364\0\0\0\2<\17\372\17\364\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 40960) = 4096
    [pid 695] pread64(65, "\5\6\246\1\200\4&\0\0\0\20\357\17\373\17\366\6\241\17\361\17\354\17\347\6\234\17\342\17\334\17\326"..., 4096, 634880) = 4096
    [pid 695] pread64(65, "\r\0\0\0\17\1r\0\1r\17\26\16*\r@\fT\vl\n\256\t\321\10T\7g\5\356\4\360"..., 4096, 655360) = 4096
    [pid 695] pwrite64(215, "\0\0\0\0\0\0\0\0\0\0\0\0$\251\31\221\0\0 \10\0\0\2\0\0\0\20\0\0\0\0\0"..., 512, 0) = 512
    [pid 695] pwrite64(215, "\0\0\0\241", 4, 512) = 4
    [pid 695] pwrite64(215, "\r\0\0\0\17\1r\0\1r\17\26\16*\r@\fT\vl\n\256\t\321\10T\7g\5\356\4\360"..., 4096, 516) = 4096
    [pid 695] pwrite64(215, "$\251\37\217", 4, 4612) = 4
    [pid 695] pread64(215, "", 8, 5120) = 0
    [pid 695] fdatasync(215) = 0
    [pid 695] pwrite64(215, "\331\325\5\371 \241c\327\0\0\0\1", 12, 0) = 12
    [pid 695] fdatasync(215) = 0
    [pid 695] pwrite64(65, "\r\0\0\0\17\1r\0\1r\17\26\16*\r@\fT\vl\n\256\t\321\10T\7g\5\356\4\360"..., 4096, 655360) = 4096
    [pid 695] fdatasync(65) = 0
    [pid 695] ftruncate64(215, 0) = 0
    [pid 695] fdatasync(215) = 0
    [pid 695] pread64(65, "SQLite format 3\0\20\0\1\1\0@ \0\0M\32\0\0 \2"..., 4096, 0) = 4096
    [pid 695] pread64(65, "\5\0\0\0\2\17\364\0\0\0\2<\17\372\17\364\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 4096, 40960) = 4096
    (...)

The next obvious step was to chase down the file descriptors:

    $ ls /proc/695/fd/65 -l
    lrwx------ 1 i336 users 64 Apr 17 19:08 /proc/695/fd/65 -> /home/i336/.config/chromium/Default/History
    $ ls /proc/695/fd/215 -l
    lrwx------ 1 i336 users 64 Apr 17 19:08 /proc/695/fd/215 -> /home/i336/.config/chromium/Default/History-journal

Hmm. I did some reading about SQLite3 some time ago, and among other things I learned how its atomicity mechanisms (described at https://sqlite.org/atomiccommit.html) work. In short (to quote a random paragraph):

> SQLite does a "flush" or "fsync" operation at key points. This means that it explicitly forces the OS to sync the data written to disk; it doesn't trust the OS-level periodically-written I/O buffers. This is primarily to mitigate power loss.

The full details of this process can be found in sections

- 3.5, "Creating A Rollback Journal File"
- 3.6, "Flushing The Rollback Journal File To Mass Storage"
- 3.10, "Flushing Changes To Mass Storage", and
- 3.11, "Deleting The Rollback Journal".

I was able to observe this process happening in real time as I ran "watch --i 0 ls -lh History-journal": the journal file's size continuously jumped between 4.6K and 0 bytes.
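For what it's worth, this churn is easy to reproduce outside Chrome. Below is a minimal sketch in Python (whose built-in sqlite3 module wraps the same library); the demo.db path and urls table are made up for illustration, and the TRUNCATE journal mode is my assumption based on the ftruncate64(fd, 0) calls in the strace - the journal is truncated rather than unlinked - so Chrome's actual configuration may differ.

    # Minimal reproduction sketch, NOT Chrome's code: demo.db and the urls
    # table are hypothetical. With isolation_level=None the connection is in
    # autocommit mode, so each bare INSERT is its own transaction - the same
    # situation as the "bare" statements described above.
    import sqlite3

    db = sqlite3.connect("demo.db", isolation_level=None)
    # Assumed from the strace: truncate the journal to zero bytes on commit
    # instead of deleting it (ftruncate64 rather than unlink).
    db.execute("PRAGMA journal_mode=TRUNCATE")
    db.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT)")

    for i in range(10000):
        # Each INSERT runs the full commit cycle from sections 3.5-3.11:
        # write the journal, fdatasync it, write the database page,
        # fdatasync the database, then truncate the journal again.
        db.execute("INSERT INTO urls VALUES (?)", ("http://example.com/%d" % i,))

Running "watch --i 0 ls -lh demo.db-journal" alongside this shows the same bouncing journal size, and stracing it shows the same fdatasync/ftruncate pattern as above.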
The top of the above-linked page is informative:

> Atomic commit means that either all database changes within a single transaction occur or none of them occur.

Now, I'm not sure where I read this, but I also learned that a straightforward "bare" UPDATE or INSERT is treated as its own discrete transaction. Taking into account the amount of data my History file contains...

    $ grep -o 'INSERT INTO "[^"]\+"' History.txt | uniq -c
        4 INSERT INTO "meta"
    35854 INSERT INTO "urls"
    42735 INSERT INTO "visits"
    14980 INSERT INTO "visit_source"
     4827 INSERT INTO "keyword_search_terms"
    15367 INSERT INTO "downloads"
    15690 INSERT INTO "downloads_url_chains"
     1042 INSERT INTO "segments"
     2140 INSERT INTO "segment_usage"

...the numbers I see in the strace log I captured while watching the I/O suggest only one thing:

    $ grep -o 'fdatasync\|ftruncate' chromium-hdd2.txt | sort | uniq -c
    31836 fdatasync
     5892 ftruncate

And that's that a significant number of the writes being made to this file are each being done as an independent transaction. I assume that the fdatasync()s are database commits and the ftruncate()s are journal manipulations, but I'm not sure how to break down the difference between them. (The strace excerpt above does show several fdatasync() calls - journal, then database - per ftruncate() pair, which is roughly consistent with the ~5:1 ratio here.) Regardless, /ouch/.

That being said, this only runs every few days, and my pathological worst case is that it runs for a few minutes. So what? Well, regardless of how well my old laptop can keep up, this is really inefficient, and it would be nice to fix it. Clearing the SQLite3 flag that enforces atomicity (presumably PRAGMA synchronous=OFF, though I'm not sure) is probably not the greatest way to go about it, as I'm sure you'll agree: Chrome is "just" a consumer desktop application, not a high-assurance system, but you've already got SQLite3 working to handle unexpected power loss - which, across all users, probably occurs thousands of times a day :) - so killing that would be a huge setback. Rather, I think the (fairly obvious) solution would be to identify whatever code is causing these writes, and see if you can wrap them in more BEGINs and COMMITs than are currently being employed, if any. (That won't lose any atomicity, note - see the batching sketch below.)

You may have noticed I said "History.txt", not "History", a couple of commands ago. I thought I should dump the SQL out to a text file before grepping it, figuring the export would take a while: I'm typing this on a ThinkPad T43, with

- a single-core Pentium M (I'm using the unofficial 32-bit Chromium build I found on Launchpad)
- 2GB RAM
- a 5400RPM HDD (TIL! - I just Googled the model... I had no idea)

Well, it took all of 1.48 seconds for sqlite3 to .dump the entire database, which it wrote to disk at 18.1MB/s according to pipeviewer (pv). While watching htop's I/O column, I had noted Chrome slowly - painfully - iterating over my History file at around 100-300KB/s. It just doesn't make sense.
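Since I'm suggesting BEGIN/COMMIT batching, here is the other half of the sketch, continuing the hypothetical demo.db example from above: the same inserts wrapped in a single explicit transaction. The batch is still all-or-nothing, but the journal write/fdatasync/truncate cycle happens once rather than once per row.

    # Batching sketch, continuing the hypothetical demo.db example: one
    # explicit transaction around all 10,000 inserts. Atomicity is preserved
    # for the batch as a whole; the commit cycle runs once instead of
    # 10,000 times.
    import sqlite3

    db = sqlite3.connect("demo.db", isolation_level=None)
    db.execute("PRAGMA journal_mode=TRUNCATE")
    db.execute("CREATE TABLE IF NOT EXISTS urls (url TEXT)")

    db.execute("BEGIN")
    try:
        for i in range(10000):
            db.execute("INSERT INTO urls VALUES (?)", ("http://example.com/%d" % i,))
        db.execute("COMMIT")  # one journal/fsync cycle for the whole batch
    except Exception:
        db.execute("ROLLBACK")  # on failure, none of the rows land
        raise

Of course the real fix would live wherever Chrome issues its history writes; the point is only that the syscall count is a function of transaction boundaries, not of how much data is written.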
Apr 20 2017

Apr 24 2017

Sep 16 2017
Adding a small note that this behavior appears to have been noticed by someone else as issue 765881, which I've tentatively suggested is a duplicate of this one. I'm making a note of it here because that user points out that the activity described in this and the other issue is causing poor system performance. I don't notice performance issues myself, but that's very likely because I know my computer is old and don't push it too much when I see the disk spinning a lot. In all fairness, if there's a ton of disk I/O - and there is; in this issue I mention above that the HDD LED is _on solid_ - then that's going to have a knock-on effect on the rest of the system in any case.
Nov 2 2017
May not be related, but there is a significant jump in I/O activity in R58. In particular, the 99th percentile of the instantaneous I/O load doubled: https://uma.googleplex.com/p/chrome/timeline_v2/?sid=e7743122c922d88bfd12220e0315afcb Linking to issue 768851 for reference.
Nov 10 2017

Nov 12 2018
Issue has not been modified or commented on in the last 365 days; please re-open or file a new bug if this is still an issue. For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Dec 15 2018
The following revision refers to this bug:
https://chromium.googlesource.com/chromiumos/overlays/portage-stable/+/900cef8c0cb027d281077e3aa8a4bb25b645fa7d

commit 900cef8c0cb027d281077e3aa8a4bb25b645fa7d
Author: Gwendal Grignou <gwendal@chromium.org>
Date: Sat Dec 15 06:40:33 2018

    iotop: upgraded package to upstream

    Upgraded sys-process/iotop to version 0.6 on amd64, arm, x86
    Used cros_portage_upgrade --board=eve:kevin:x86-generic --upgrade --unstable-ok sys-process/iotop

    Note:
    - Needs to be installed in /usr/local (to find python libraries)
    - Needs 4.14 or I/O accounting support enabled (CONFIG_TASKSTATS, CONFIG_TASK_DELAY_ACCT, CONFIG_TASK_IO_ACCOUNTING). Added.

    BUG=chromium:712166
    TEST=Instal on eve [move from /usr to /usr/local]

    Change-Id: Icabb1342d53fd80bf92143e35361369fe1846187
    Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
    Reviewed-on: https://chromium-review.googlesource.com/804495
    Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
    Reviewed-by: Mike Frysinger <vapier@chromium.org>

[add] https://crrev.com/900cef8c0cb027d281077e3aa8a4bb25b645fa7d/sys-process/iotop/metadata.xml
[add] https://crrev.com/900cef8c0cb027d281077e3aa8a4bb25b645fa7d/sys-process/iotop/files/iotop-0.6-setup.py3.patch
[add] https://crrev.com/900cef8c0cb027d281077e3aa8a4bb25b645fa7d/sys-process/iotop/iotop-0.6.ebuild
[add] https://crrev.com/900cef8c0cb027d281077e3aa8a4bb25b645fa7d/metadata/md5-cache/sys-process/iotop-0.6
[add] https://crrev.com/900cef8c0cb027d281077e3aa8a4bb25b645fa7d/sys-process/iotop/Manifest
Comment 1 by ranjitkan@chromium.org, Apr 18 2017