New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 923700 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
hobby only
Closed: Yesterday
Components:
EstimatedDays: ----
NextAction: ----
OS: Linux
Pri: 1
Type: Bug



Sign in to add a comment

Out-of-memory in csv_reader_fuzzer

Project Member Reported by ClusterFuzz, Jan 20 (3 days ago)

Issue description

Detailed report: https://clusterfuzz.com/testcase?key=5141169822629888

Fuzzer: libFuzzer_csv_reader_fuzzer
Fuzz target binary: csv_reader_fuzzer
Job Type: libfuzzer_chrome_msan
Platform Id: linux

Crash Type: Out-of-memory (exceeds 2048 MB)
Crash Address: 
Crash State:
  csv_reader_fuzzer
  
Sanitizer: memory (MSAN)

Regressed: https://clusterfuzz.com/revisions?job=libfuzzer_chrome_msan&range=623778:623783

Reproducer Testcase: https://clusterfuzz.com/download?testcase_id=5141169822629888

Issue filed automatically.

See https://www.chromium.org/developers/testing/memorysanitizer#TOC-Reproducing-ClusterFuzz-Bugs for instructions to reproduce this bug locally.
 
Project Member

Comment 1 by ClusterFuzz, Jan 20 (3 days ago)

Cc: vabr@chromium.org
Labels: ClusterFuzz-Auto-CC
Automatically adding ccs based on OWNERS file / target commit history.

If this is incorrect, please add ClusterFuzz-Wrong label.

Comment 2 by vabr@chromium.org, Jan 20 (2 days ago)

Cc: -vabr@chromium.org
Components: UI>Browser>Passwords
Owner: vabr@chromium.org
Status: Started (was: Untriaged)
I can confirm that this is caused by my recent https://crrev.com/c/1409457.
I will work on fixing this.

Comment 3 by vabr@chromium.org, Jan 20 (2 days ago)

And because running msan is a little complicated due to the use of docker, I am glad this can be reproduced in other configurations (asan debug in my case) with lowering the memory limit using -rss_limit_mb=

Comment 4 by vabr@chromium.org, Jan 20 (2 days ago)

I spoke too soon. On asan (as opposed to msan), the memory consumed after my CL is actually lower. Will investigate further.

Comment 5 by vabr@chromium.org, Jan 20 (2 days ago)

It seems that the parser actually changed the behaviour quite significantly. On the fuzzer's input, before the CL it produced about 30 thousands of records, after the CL it was over 500 thousand (!). The input contains a lot of line breaks, which seems to match this. I'm investigating what exactly happens, how to fix it and what regression test to add.

Comment 6 by vabr@chromium.org, Jan 20 (2 days ago)

== Reason of breakage ==

The input starts with the first row followed by a lot of '\r' (CR) characters and then a '\n' (LF).

The old parser would understand the '\r's as part of the string value, while the new one would just treat '\r' the same as '\n' -- end of line.

The RFC 4180 seems to only specify the line break as "\r\n", so actually neither of the parsers follows it properly (both accept single "\n" as line breaks). But for good portability and because the line break is represented differently on Win (CRLF), Mac (CR) and Linux (LF), it seems more desirable to accept just about any combination of CR and LF as a line break.

Following that idea, the second parser is correct not to regard the CRs as part of the field value. However, adding "empty" records, i.e., records where there is only one field, with an empty string as a value, is useless (remember: this should be used for importing passwords, which are always non-empty).

== Plan ==

I'll keep the parsing of the new parser, but make it skip "empty" records.

Comment 7 by vabr@chromium.org, Jan 20 (2 days ago)

For the record, after implementing the plan from #6, the number of records extracted from the fuzzer's input dropped to about 1 thousand.

Comment 8 by vabr@chromium.org, Jan 20 (2 days ago)

CL in https://crrev.com/c/1424302
Project Member

Comment 9 by bugdroid1@chromium.org, Yesterday (45 hours ago)

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/fac4c96fb07f09b54011efc9a7fc0ab61be1d1de

commit fac4c96fb07f09b54011efc9a7fc0ab61be1d1de
Author: Vaclav Brozek <vabr@chromium.org>
Date: Mon Jan 21 08:26:29 2019

Skip empty lines in CSV parser

If the input to the CSV parser contains a sequence of characters from
the set {'\r', '\n'}, the parser sees a lot of empty rows.
Transforming those into the structured output creates a lot of
single-element maps with the empty string as the value. Those are not
useful for the task the parser is used for (importing passwords) but
adding them costs memory and performance.

Therefore this CL starts skipping empty rows.

Bug:  923700 
Change-Id: I1ec5ec10c2425d497c4c611167240469e38ede2f
Reviewed-on: https://chromium-review.googlesource.com/c/1424302
Reviewed-by: Jan Wilken Dörrie <jdoerrie@chromium.org>
Commit-Queue: Vaclav Brozek <vabr@chromium.org>
Cr-Commit-Position: refs/heads/master@{#624531}
[modify] https://crrev.com/fac4c96fb07f09b54011efc9a7fc0ab61be1d1de/components/password_manager/core/browser/import/csv_reader.cc
[modify] https://crrev.com/fac4c96fb07f09b54011efc9a7fc0ab61be1d1de/components/password_manager/core/browser/import/csv_reader_unittest.cc

Comment 10 by vabr@chromium.org, Yesterday (45 hours ago)

Status: Fixed (was: Started)
On my machine I could confirm that r624531 above fixed the issue on msan.

Comment 11 by jdoerrie@chromium.org, Yesterday (45 hours ago)

Re #c6: Mac switched to LF endings with Mac OS X, released in 2001 [1]. Because of this, I think we should restore the old parser's behavior, and only treat '\r\n' and '\n' as new lines. Implementation wise this should be straight forward, as it's effectively splitting on '\n', and removing a single trailing '\r' if present.

[1] https://superuser.com/q/439440

Comment 12 by vabr@chromium.org, Yesterday (44 hours ago)

Thanks for pointing this out, Jan! Not sure where I picked up the CR myth, I'm pretty sure I have not even seen a Mac before 2001. :)

I filed bug 923811 and am writing on my virtual blackboard: "I will check my assumptions" 100 times. :)

Sign in to add a comment