Research on dictionary generation
Issue description: Experiment with automated dictionary generation
,
Jun 30 2016
Uploaded the generation script for review: https://codereview.chromium.org/2115563002/ My plan now is to do some work on the CF side for continuous dictionary updates based on feedback from libFuzzer, and then to update libFuzzer to provide more feedback (on recommended and useless dictionary elements). After that we can experiment more with generation techniques, or maybe drop static generation altogether and rely on dynamic analysis by libFuzzer plus continuous dictionary updating at CF. Basically, the script uploaded for review is useful for text formats when you don't have any dictionary. It generates something in ~1 second, and in most cases it doesn't make things worse :)
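For readers unfamiliar with the idea, here is a minimal sketch of static dictionary generation for a text format. It is not the script under review: the function name, the token regex, and the frequency cutoff are assumptions made for illustration. It ranks word-like tokens in sample text by frequency and emits them in libFuzzer's dictionary syntax, one double-quoted entry per line.

```python
import re
from collections import Counter


def generate_dictionary(text, max_entries=50, min_length=2):
    """Return dictionary lines for the most frequent tokens in `text`.

    Hypothetical heuristic: a token is a run of letters, digits, '_' or '-'.
    Such tokens contain no quotes or backslashes, so no escaping is needed.
    """
    tokens = re.findall(r'[A-Za-z0-9_-]+', text)
    counts = Counter(t for t in tokens if len(t) >= min_length)
    # Sort by descending frequency, then alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return ['"%s"' % tok for tok, _ in ranked[:max_entries]]


if __name__ == '__main__':
    sample = 'SELECT name FROM users WHERE id = 1; SELECT id FROM users;'
    print('\n'.join(generate_dictionary(sample)))
```

The resulting file would be passed to the fuzzer with libFuzzer's `-dict=` flag.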
,
Jul 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/1a6bef1675e05626a4692ab6fa43cbbc5515299b

commit 1a6bef1675e05626a4692ab6fa43cbbc5515299b
Author: mmoroz <mmoroz@chromium.org>
Date: Thu Jul 07 12:07:44 2016

[libfuzzer] Added script for dictionary generation.

BUG= 624752
R=inferno@chromium.org, ochang@chromium.org, sky@chromium.org

Review URL: https://codereview.chromium.org/2115563002 .
Cr-Commit-Position: refs/heads/master@{#404133}

[modify] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/content/test/BUILD.gn
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/content/test/data/fuzzer_dictionaries/renderer_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/dictionary_generator.py
[modify] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/BUILD.gn
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/libxml_xml_read_memory_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/sqlite3_prepare_v2_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/url_parse_fuzzer.dict
[add] https://crrev.com/1a6bef1675e05626a4692ab6fa43cbbc5515299b/testing/libfuzzer/fuzzers/dicts/generated/v8_script_parser_fuzzer.dict
[delete] https://crrev.com/45a5dde9824768ca016b4abae13a05abbd99014c/testing/libfuzzer/fuzzers/dicts/js.dict
[delete] https://crrev.com/45a5dde9824768ca016b4abae13a05abbd99014c/testing/libfuzzer/fuzzers/dicts/sql.dict
,
Jul 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/a767005aeaabee98139f75be37ffabea0f19fd62

commit a767005aeaabee98139f75be37ffabea0f19fd62
Author: mmoroz <mmoroz@chromium.org>
Date: Thu Jul 07 13:03:08 2016

[libfuzzer] Fix escaping of quotes in dictionary_generator.py and dicts affected.

BUG= 624752
TBR=inferno@chromium.org,sky@chromium.org

Review URL: https://codereview.chromium.org/2130463004 .
Cr-Commit-Position: refs/heads/master@{#404143}

[modify] https://crrev.com/a767005aeaabee98139f75be37ffabea0f19fd62/content/test/data/fuzzer_dictionaries/renderer_fuzzer.dict
[modify] https://crrev.com/a767005aeaabee98139f75be37ffabea0f19fd62/testing/libfuzzer/dictionary_generator.py
[modify] https://crrev.com/a767005aeaabee98139f75be37ffabea0f19fd62/testing/libfuzzer/fuzzers/dicts/generated/libxml_xml_read_memory_fuzzer.dict
[modify] https://crrev.com/a767005aeaabee98139f75be37ffabea0f19fd62/testing/libfuzzer/fuzzers/dicts/generated/url_parse_fuzzer.dict
[modify] https://crrev.com/a767005aeaabee98139f75be37ffabea0f19fd62/testing/libfuzzer/fuzzers/dicts/generated/v8_script_parser_fuzzer.dict
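Escaping matters here because every dictionary entry is wrapped in double quotes, so an unescaped quote or backslash inside a token produces a line the fuzzer cannot parse. The sketch below shows one way to escape an entry, assuming the usual libFuzzer/AFL dictionary syntax (`\\`, `\"`, and `\xNN` for non-printable bytes). The function name is hypothetical, and this is not the code changed in the revision above.

```python
def escape_entry(data):
    """Render raw bytes as a double-quoted dictionary entry."""
    out = []
    for byte in bytearray(data):
        ch = chr(byte)
        if ch == '\\':
            out.append('\\\\')          # backslash -> \\
        elif ch == '"':
            out.append('\\"')           # quote -> \"
        elif 0x20 <= byte < 0x7f:
            out.append(ch)              # printable ASCII is kept as-is
        else:
            out.append('\\x%02X' % byte)  # everything else -> \xNN
    return '"%s"' % ''.join(out)


if __name__ == '__main__':
    for raw in (b'<a href="', b'\\n', b'\x00\xff'):
        print(escape_entry(raw))
```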
,
Jul 7 2016
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4a561d36be6ea398dd54313ed552e1d154b21661

commit 4a561d36be6ea398dd54313ed552e1d154b21661
Author: mmoroz <mmoroz@chromium.org>
Date: Thu Jul 07 17:45:12 2016

[libfuzzer] Add or update dictionaries for //net fuzzers.

BUG= 624752

Review-Url: https://codereview.chromium.org/2128583006
Cr-Commit-Position: refs/heads/master@{#404175}

[modify] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/BUILD.gn
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_dns_hosts_parse_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_dns_record_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_get_domain_and_registry_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_host_resolver_impl_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_http_proxy_client_socket_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_http_stream_parser_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_mime_sniffer_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_parse_data_url_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_url_request_fuzzer.dict
[add] https://crrev.com/4a561d36be6ea398dd54313ed552e1d154b21661/net/data/fuzzer_dictionaries/net_websocket_frame_parser_fuzzer.dict
[delete] https://crrev.com/5f24850ea0b9b4b4a8573c97f11758507e22ebde/net/data/http/http.dict
,
Jul 8 2016
Reminder to myself: address the comments from https://codereview.chromium.org/2128583006/ during the next update of the generation script.
,
Aug 30 2016
I just noticed the net_websocket_frame_parser_fuzzer.dict file and I'm quite confused. The parser parses frames with a 2, 4 or 10-byte binary header followed by opaque data. I don't understand why the dictionary is helping.
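For context, the three header sizes follow from the payload-length encoding in RFC 6455: the low 7 bits of the second header byte either hold the length directly or signal a 16-bit or 64-bit extended length field. A small sketch, with a hypothetical function name, and assuming the 4-byte masking key (which the sizes above leave out) is counted separately when the mask bit is set:

```python
def websocket_header_length(second_byte):
    """Return the frame header size in bytes, given the second header byte."""
    masked = bool(second_byte & 0x80)   # high bit: a 4-byte masking key follows
    length_code = second_byte & 0x7F    # low 7 bits: payload length or marker
    if length_code == 126:
        size = 2 + 2                    # 16-bit extended payload length
    elif length_code == 127:
        size = 2 + 8                    # 64-bit extended payload length
    else:
        size = 2                        # length fits in the 7 bits
    return size + (4 if masked else 0)


if __name__ == '__main__':
    for second in (0x05, 126, 127, 0x80 | 127):
        print(hex(second), websocket_header_length(second))
```

Nothing in this header is textual, which is why a token dictionary looks like an odd fit for this parser.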
,
Aug 30 2016
That is a fair point. Our intern metzman@ experimented with net_mime_sniffer_fuzzer and different dictionaries.
10 minutes of fuzzing:
----------------------------------------------------------------
coverage | dictionary type
----------------------------------------------------------------
1318 | first 200 words from Romeo and Juliet
317 | no dictionary
1208 | automatically generated dictionary
1296 | manually written dictionary
1374 | combination of automatically + manually created ones
----------------------------------------------------------------
overnight fuzzing:
----------------------------------------------------------------
coverage | dictionary type
----------------------------------------------------------------
1518 | first 200 words from Romeo and Juliet
1522 | no dictionary
1524 | automatically generated dictionary
1535 | manually written dictionary
1549 | combination of automatically + manually created ones
----------------------------------------------------------------
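The two tables are easier to compare when each run is expressed relative to the no-dictionary baseline. The sketch below does that arithmetic using only the figures reported above. The variable and function names are made up for this example.

```python
# Coverage figures copied from the two tables above.
RESULTS = {
    '10 minutes': {
        'first 200 words from Romeo and Juliet': 1318,
        'no dictionary': 317,
        'automatically generated dictionary': 1208,
        'manually written dictionary': 1296,
        'combination of automatically + manually created ones': 1374,
    },
    'overnight': {
        'first 200 words from Romeo and Juliet': 1518,
        'no dictionary': 1522,
        'automatically generated dictionary': 1524,
        'manually written dictionary': 1535,
        'combination of automatically + manually created ones': 1549,
    },
}


def gain_over_baseline(run, baseline='no dictionary'):
    """Return the percentage change in coverage versus the baseline run."""
    base = float(run[baseline])
    return {name: 100.0 * (cov - base) / base
            for name, cov in run.items() if name != baseline}


if __name__ == '__main__':
    for duration, run in RESULTS.items():
        print(duration)
        for name, gain in sorted(gain_over_baseline(run).items()):
            print('  %+8.2f%%  %s' % (gain, name))
```

For the overnight runs, every dictionary variant lands within 2% of the no-dictionary baseline.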
,
Aug 30 2016
I would say this shows how intelligent libFuzzer's feedback-driven approach is. Though the numbers for some fuzzers are pretty impressive, my conclusion is to put more effort into libFuzzer's dynamic analysis than into static generation of dictionaries. I'll update the script and clean up the dictionaries later this week.
,
Jan 8 2018
I took a quick look at the stats last week, and it seems the recommended dictionary doesn't help much anymore on internal ClusterFuzz. On OSS-Fuzz it seems to have been broken since we moved to the trusted/untrusted architecture: issue 797310. I should look at the stats breakdown by fuzz target and either disable the recommended dictionary strategy for now or keep it enabled for some fuzzers only.
,
Apr 9 2018
This was done a while ago. The final piece was https://reviews.llvm.org/D30940
Comment 1 by mmoroz@chromium.org, Jun 30 2016
Labels: Restrict-View-EditIssue