Currently, bisect-kit has "noisy (boolean) binary search", which classifies each test result as a boolean (old/new). This is good for functional tests but not enough for performance tests.
1. Performance tests return numeric values (such as time or fps). Converting them to booleans loses information. Taking the value distribution into account in the probability calculation should give more accurate results and shorter bisect runs.
2. Currently we classify results by a fixed threshold (the midpoint between the "old" and "new" values).
a. A "midpoint" threshold is not good enough.
b. It is better to adjust the threshold as more test results are observed than to keep it fixed.
If the threshold is wrong, noisy boolean binary search may compensate with more runs, but if the data is too noisy it may not, leading to a wrong result.
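To illustrate the two points above, here is a minimal Python sketch. It is not the algorithm Owen provided and not bisect-kit code; the function names are hypothetical, and it assumes the old and new values follow normal distributions with known mean and standard deviation. It compares a fixed midpoint classification with a probability computed from the value distributions.

```python
import math


def classify_by_midpoint(value, old_mean, new_mean):
    """Boolean classification with a fixed threshold halfway between old and new.

    Hypothetical helper for illustration only.
    """
    threshold = (old_mean + new_mean) / 2.0
    if new_mean > old_mean:
        return 'new' if value > threshold else 'old'
    return 'new' if value < threshold else 'old'


def normal_log_pdf(x, mean, stdev):
    """Log density of a normal distribution (assumed model, not from the issue)."""
    z = (x - mean) / stdev
    return -0.5 * z * z - math.log(stdev) - 0.5 * math.log(2 * math.pi)


def posterior_new(values, old_mean, old_stdev, new_mean, new_stdev,
                  prior_new=0.5):
    """Probability that the observed values came from the "new" distribution."""
    log_old = math.log(1.0 - prior_new)
    log_new = math.log(prior_new)
    for v in values:
        log_old += normal_log_pdf(v, old_mean, old_stdev)
        log_new += normal_log_pdf(v, new_mean, new_stdev)
    # Subtract the larger log before exponentiating to avoid underflow.
    m = max(log_old, log_new)
    p_old = math.exp(log_old - m)
    p_new = math.exp(log_new - m)
    return p_new / (p_old + p_new)


# Example with made-up numbers: "old" is stable (mean 100, stdev 2) and
# "new" is slower and noisier (mean 110, stdev 10).
label = classify_by_midpoint(105, old_mean=100, new_mean=110)
prob = posterior_new([105], 100, 2, 110, 10)
print(label, prob)
```

With these made-up numbers the midpoint rule labels a value of 105 as "old", while the distribution-based estimate favors "new", because 105 is 2.5 standard deviations from the old mean but only 0.5 from the new one. Each additional value updates the probability instead of contributing a single boolean.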
Owen already provided me with the algorithm for noisy value binary search. It should be integrated into bisect-kit.
https://paste.googleplex.com/5275351815028736
This feature should help to deal with regressions in noisy tests like issue 877919, issue 877028, issue 878622, and issue 875659.
Comment 1 by kcwu@chromium.org, Oct 3