
Issue 624752

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2018
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 539572




Research on dictionary generation

Project Member Reported by mmoroz@chromium.org, Jun 30 2016

Issue description

Experiment with automated dictionary generation 
 

Comment 1 by mmoroz@chromium.org, Jun 30 2016

Cc: och...@chromium.org mbarbe...@chromium.org infe...@chromium.org metzman@google.com kcc@chromium.org aizatsky@chromium.org
Labels: Restrict-View-EditIssue
Spreadsheet with benchmarks of different experiments: https://docs.google.com/a/google.com/spreadsheets/d/1r-dz8b5nG7SIRT1p1LDwF_r9J68UTkHgSU0_jq1cOJM/edit?usp=sharing

It may be hard to follow at the moment, but I'll add a description of the testing methodology and the things I've tried.

I also have a design doc, but I need to update it before sharing.

Comment 2 by mmoroz@chromium.org, Jun 30 2016

Uploaded generation script for review: https://codereview.chromium.org/2115563002/

My plan now is to do some work on the CF (ClusterFuzz) side for continuous dictionary updates based on feedback from libFuzzer.

Then, update libFuzzer to provide more feedback (on recommended and useless dictionary elements).

After that, we can experiment more with generation techniques, or maybe drop static generation and rely on dynamic analysis by libFuzzer plus continuous dictionary updating on CF.

Basically, the script uploaded for review is useful for text formats when you don't have any dictionary. It generates something in ~1 second, and in most cases it doesn't make things worse :)
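
As a rough illustration of the idea (not the actual dictionary_generator.py), a static generator for text formats could collect frequently repeated tokens from a sample of the format, such as a spec or a few inputs, and emit them in libFuzzer's dictionary syntax, one quoted entry per line. The function names, the token regex, and the thresholds below are all assumptions made for this sketch:

```python
import re
from collections import Counter
from typing import List


def escape_token(token: str) -> str:
    """Escape a token so it is valid inside a quoted libFuzzer dictionary entry."""
    out = []
    for byte in token.encode("utf-8"):
        ch = chr(byte)
        if ch == "\\":
            out.append("\\\\")          # backslash must be escaped
        elif ch == '"':
            out.append('\\"')           # so must the double quote
        elif 0x20 <= byte < 0x7F:
            out.append(ch)              # printable ASCII is kept as-is
        else:
            out.append("\\x%02x" % byte)  # everything else as \xNN
    return "".join(out)


def generate_dictionary(text: str, min_count: int = 2,
                        max_entries: int = 100) -> List[str]:
    """Return dictionary lines for tokens seen at least min_count times.

    The regex (identifier-like words of 3+ characters) is a hypothetical
    choice for this sketch, not what the reviewed script does.
    """
    tokens = re.findall(r"[A-Za-z_][A-Za-z0-9_\-]{2,}", text)
    counts = Counter(tokens)
    frequent = [tok for tok, n in counts.most_common() if n >= min_count]
    return ['"%s"' % escape_token(tok) for tok in frequent[:max_entries]]


if __name__ == "__main__":
    sample = "SELECT a FROM t; SELECT b FROM t; INSERT"
    print("\n".join(generate_dictionary(sample)))
```

The resulting lines can be written to a .dict file and passed to a libFuzzer target with the -dict= option.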

Comment 3 by bugdroid1@chromium.org, Jul 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1a6bef1675e05626a4692ab6fa43cbbc5515299b

commit 1a6bef1675e05626a4692ab6fa43cbbc5515299b
Author: mmoroz <mmoroz@chromium.org>
Date: Thu Jul 07 12:07:44 2016

[libfuzzer] Added script for dictionary generation.

BUG= 624752 
R=inferno@chromium.org, ochang@chromium.org, sky@chromium.org

Review URL: https://codereview.chromium.org/2115563002 .

Cr-Commit-Position: refs/heads/master@{#404133}

[modify] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/content/test/BUILD.gn
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/content/test/data/fuzzer_dictionaries/renderer_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/dictionary_generator.py
[modify] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/BUILD.gn
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/libxml_xml_read_memory_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/sqlite3_prepare_v2_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/url_parse_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/v8_script_parser_fuzzer.dict
[delete] https://crrev.com/45a5dde9824768ca016b4abae13a05abbd99014c/testing/libfuzzer/fuzzers/dicts/js.dict
[delete] https://crrev.com/45a5dde9824768ca016b4abae13a05abbd99014c/testing/libfuzzer/fuzzers/dicts/sql.dict


Comment 5 by bugdroid1@chromium.org, Jul 7 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/4a561d36be6ea398dd54313ed552e1d154b21661

commit 4a561d36be6ea398dd54313ed552e1d154b21661
Author: mmoroz <mmoroz@chromium.org>
Date: Thu Jul 07 17:45:12 2016

[libfuzzer] Add or update dictionaries for //net fuzzers.

BUG= 624752 

Review-Url: https://codereview.chromium.org/2128583006
Cr-Commit-Position: refs/heads/master@{#404175}

[modify] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/BUILD.gn
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_dns_hosts_parse_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_dns_record_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_get_domain_and_registry_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_host_resolver_impl_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_http_proxy_client_socket_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_http_stream_parser_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_mime_sniffer_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_parse_data_url_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_url_request_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_websocket_frame_parser_fuzzer.dict
[delete] https://crrev.com/5f24850ea0b9b4b4a8573c97f11758507e22ebde/net/data/http/http.dict

Reminder to myself: address the comments from https://codereview.chromium.org/2128583006/ during the next update of the generation script.

Comment 7 by ricea@chromium.org, Aug 30 2016

I just noticed the net_websocket_frame_parser_fuzzer.dict file and I'm quite confused. The parser parses frames with a 2, 4 or 10-byte binary header followed by opaque data. I don't understand why the dictionary is helping.
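
For context, the header sizes mentioned above follow from the payload-length encoding in the WebSocket framing format (RFC 6455). The following is a minimal Python sketch, not Chromium's parser. The function name is hypothetical, and the 4-byte masking key is included per the RFC even though the sizes above count only unmasked headers:

```python
def websocket_header_length(first_two: bytes) -> int:
    """Return the total frame header size implied by the first two bytes."""
    if len(first_two) < 2:
        raise ValueError("need at least two bytes")
    second = first_two[1]
    masked = bool(second & 0x80)   # high bit: a masking key is present
    length_code = second & 0x7F    # low 7 bits: payload length or a marker
    size = 2                       # base header
    if length_code == 126:
        size += 2                  # 16-bit extended payload length
    elif length_code == 127:
        size += 8                  # 64-bit extended payload length
    if masked:
        size += 4                  # 32-bit masking key
    return size
```

Since every field here is a bit flag or a binary length, there are no textual keywords for a dictionary to supply, which is the source of the confusion.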

Comment 8 by mmoroz@chromium.org, Aug 30 2016

That's a fair point. Our intern metzman@ experimented with net_mime_sniffer_fuzzer and different dictionaries.

10 minutes of fuzzing:
----------------------------------------------------------------
 coverage | dictionary type
----------------------------------------------------------------
   1318   | first 200 words from Romeo and Juliet
    317   | no dictionary
   1208   | automatically generated dictionary
   1296   | manually written dictionary
   1374   | combination of automatically + manually created ones
----------------------------------------------------------------

overnight fuzzing:
----------------------------------------------------------------
 coverage | dictionary type
----------------------------------------------------------------
   1518   | first 200 words from Romeo and Juliet
   1522   | no dictionary
   1524   | automatically generated dictionary
   1535   | manually written dictionary
   1549   | combination of automatically + manually created ones
----------------------------------------------------------------
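
To make the comparison easier to read, the two tables can be expressed as coverage gained over the "no dictionary" run. This is only a sketch over the numbers quoted above. The short keys ("romeo", "none", "generated", "manual", "combined") and the function name are shorthand introduced here:

```python
# Coverage numbers copied from the two tables above.
RESULTS = {
    "10 minutes": {
        "romeo": 1318, "none": 317, "generated": 1208,
        "manual": 1296, "combined": 1374,
    },
    "overnight": {
        "romeo": 1518, "none": 1522, "generated": 1524,
        "manual": 1535, "combined": 1549,
    },
}


def gain_over_baseline(run: dict, baseline: str = "none") -> dict:
    """Return coverage minus the baseline coverage for every other entry."""
    base = run[baseline]
    return {name: cov - base for name, cov in run.items() if name != baseline}


if __name__ == "__main__":
    for duration, run in RESULTS.items():
        print(duration, gain_over_baseline(run))
```

After 10 minutes every dictionary is roughly 900 to 1050 coverage units ahead of the baseline, while overnight the spread shrinks to between -4 and +27.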

Comment 9 by mmoroz@chromium.org, Aug 30 2016

I would say this shows how effective libFuzzer's feedback-driven approach is. Although the numbers for some fuzzers are pretty impressive, my conclusion is to put more effort into libFuzzer's dynamic analysis than into static generation of dictionaries.

I'll update the script and clean up the dictionaries later this week.
I took a quick look at the stats last week, and it seemed that the recommended dictionary doesn't help much anymore on internal ClusterFuzz.

On OSS-Fuzz it seems to have been broken since we moved to the trusted/untrusted architecture: issue 797310.

I should look at the stats breakdown by fuzz target, and either disable the recommended dictionary strategy for now or keep it enabled for some fuzzers only.
Labels: -Restrict-View-EditIssue
Cc: -aizatsky@chromium.org
Status: Fixed (was: Started)
This was done a while ago. The final piece was https://reviews.llvm.org/D30940
