Issue 603964

Issue metadata

Status: WontFix
Owner: ----
Closed: Apr 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug



Fuzzers don't seem to be using dictionaries?

Project Member Reported by mmenke@chromium.org, Apr 15 2016

Issue description

I've written a fuzzer (not yet landed) for HTTP proxy code and made a dictionary for it, but it seems to ignore the dictionary when I tell it to use it.  I'm checking this by looking at the corpus it creates, which completely lacks the strings I'm giving it.  It could be that it doesn't think my dictionary entries are interesting enough, but it considers enough weird inputs interesting that this seems unlikely.

The command line I'm using:
out/Fuzzer/net_http_proxy_client_socket_fuzzer --dict=net/data/http/http.dict /tmp/fuzz/

My dictionary (No idea if longer or shorter strings work better, just started experimenting):
# Copyright 2016 The Chromium Authors. All rights reserved.
# Use of this source code is governed by a BSD-style license that can be
# found in the LICENSE file.

"HTTP/1.1 200 OK\x0a\x0a" # Tried both \n's and \x0a's.
"HTTP/1.1 401 Unauthorized\r\nWWW-Authenticate: Basic realm=\"Middle-Earth\"\r\n\r\n"
"HTTP/1.1 407 Proxy Authentication Required\r\nProxy-Authenticate: Digest realm=\"Middle-Earth\", nonce=\"aaaaaaaaaa\"\r\n\r\n"

"Content-Length: 0\r\n"
"Content-Length: 500\r\n"
"Content-Encoding: Chunked\r\n\r\n5\r\n12345\r\n0\r\n\r\n"
 

Comment 1 by mmenke@chromium.org, Apr 15 2016

And a link to the fuzzer, in its current state:  https://codereview.chromium.org/1890193002/

I've verified that my fuzzer works, with the right input, so the issue doesn't seem to be a bug on that side of things.
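
The corpus check mentioned in the issue description (looking for the dictionary strings in the generated corpus) can be automated. This is only a minimal sketch: the function name is made up for illustration, and it assumes the corpus is a flat directory of files such as /tmp/fuzz/ and that the tokens are passed in as raw bytes.

```python
from pathlib import Path


def tokens_missing_from_corpus(corpus_dir, tokens):
    """Return the tokens (bytes) that appear in no file directly under corpus_dir."""
    missing = set(tokens)
    for path in Path(corpus_dir).iterdir():
        if not path.is_file():
            continue
        data = path.read_bytes()
        # Keep only the tokens that this corpus unit does not contain.
        missing = {t for t in missing if t not in data}
        if not missing:
            break
    return sorted(missing)
```

For example, tokens_missing_from_corpus("/tmp/fuzz", [b"Content-Length", b"Proxy-Authenticate"]) lists whichever of the two strings never shows up in any corpus file.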

Comment 2 by kcc@chromium.org, Apr 15 2016

Can you copy-paste the libFuzzer output after running it for a few minutes?
If the dictionary is successfully read, you will see this: 
Dictionary: 122 entries


If the dictionary entries are being used for successful mutations,
you will see "AddFromManualDict" in the output, something like this:
#134	NEW    cov: 13163 bits: 16395 indir: 115 units: 69 exec/s: 0 L: 27 MS: 3 ShuffleBytes-AddFromManualDict-AddFromManualDict- DE: "inline"-"<<="-
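
The two checks above can be run over a saved log. This is a rough sketch rather than anything libFuzzer ships: the function name is hypothetical, and it matches only the exact strings quoted above ("Dictionary: N entries" and "AddFromManualDict"), which may differ in other libFuzzer versions.

```python
import re


def summarize_dictionary_use(log_text):
    """Report whether a libFuzzer log shows the dictionary being loaded and used."""
    loaded = re.search(r"^Dictionary: (\d+) entries", log_text, re.MULTILINE)
    # Count status lines whose mutation sequence includes a manual-dictionary step.
    hits = sum(1 for line in log_text.splitlines() if "AddFromManualDict" in line)
    return {
        "entries_loaded": int(loaded.group(1)) if loaded else None,
        "lines_with_dict_mutation": hits,
    }
```

If "entries_loaded" comes back as None, the dictionary was never read, which points at the command line rather than at the dictionary contents.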


Comment 3 by mmenke@chromium.org, Apr 15 2016

Status: WontFix (was: Untriaged)
Gah, it wants "-dict" instead of "--dict" - the docs are right, I've just internalized the "-- for multiple-character parameters" rule, and missed the warning about it in the wall of text.  Sorry about that.

Comment 4 by kcc@chromium.org, Apr 15 2016

I've added a slightly better warning against such confusions. 
libFuzzer will now print this:
WARNING: did you mean '-dict=foo' (single dash)?
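
The same kind of check can be done in a wrapper script before launching a fuzzer. The sketch below is not libFuzzer's implementation: the function name and the detection rule (any argument starting with exactly two dashes) are assumptions, and only the wording of the message is taken from the warning above.

```python
def double_dash_warnings(argv):
    """Return a warning for every argument that starts with exactly two dashes."""
    warnings = []
    for arg in argv[1:]:
        # Flag '--name...' but not a bare '--' or anything starting with '---'.
        if arg.startswith("--") and len(arg) > 2 and arg[2] != "-":
            warnings.append(f"WARNING: did you mean '{arg[1:]}' (single dash)?")
    return warnings
```

Called with the command line from the issue description, this yields one warning suggesting '-dict=net/data/http/http.dict'.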

As for your question (No idea if longer or shorter strings work better, just started experimenting):

I'd expect that shorter strings would work a bit better.
Also make sure to provide a good seed corpus.


Comment 5 by mmenke@chromium.org, Apr 15 2016

What makes you say that?

Just so we're on the same page, I'm thinking of things like "Content-Length" vs "Content-Length:" vs "\x0AContent-Length: 0" and "\x0AContent-Length: 20".  And putting the entire header string needed for proxy authentication in one string (407 challenge + realm string).  With smaller strings, the fuzzer may have to make several related changes in very specific places to make "interesting" changes, while with the larger strings, it would just have to make one change, and then could decompose it to find interesting variations.

I'm not disagreeing with you - you doubtless know better than I do, just curious what the intuition is.

Comment 6 by kcc@chromium.org, Apr 15 2016

I actually have no good data to support my statement, just gut feeling based on previous experiments with similar targets. 
Also, if you have a good test corpus, the long strings will be pulled from there. 

Note that we should not be trying to minimize the time to find the first bug, 
but rather maximize the number of bugs we find by running the target for a long time. And I *think* that adding 3 smaller tokens (e.g. "Content-Length", ":", "20") is better in the long run than adding 1 large token ("Content-Length: 20").
You may actually do both. 
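
Doing both can be scripted. The sketch below emits one dictionary entry for the full string and one per smaller token. The function names and the splitting rule (whitespace-separated words, with ":" as its own token) are assumptions chosen to reproduce the example above. Quotes, backslashes and non-printable bytes are written as \xNN escapes.

```python
import re


def dict_entry(token):
    """Format bytes as a quoted dictionary entry, hex-escaping anything unsafe."""
    out = []
    for b in token:
        ch = chr(b)
        if ch in '"\\' or not (0x20 <= b < 0x7F):
            out.append(f"\\x{b:02x}")
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'


def long_and_short_entries(header_line):
    """Return entries for the whole line plus its smaller pieces, without duplicates."""
    # Split into words, keeping ':' as a separate token.
    pieces = re.findall(rb"[^\s:]+|:", header_line)
    unique = dict.fromkeys([header_line] + pieces)  # preserves first-seen order
    return [dict_entry(t) for t in unique]
```

For b"Content-Length: 20" this produces four lines: the full header, "Content-Length", ":" and "20".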

 

Comment 7 by mmoroz@chromium.org, Apr 18 2016

+1 to having both :)

I agree that a seed corpus is very important for a target like HTTP.
Also, I want to implement a custom mutator for HTTP this week to have a sort of reference example before the Fixit week.