Today, we rely on each perf test being run consistently on the same physical device so that we have good perf coverage.
However, this model is hard to scale because hardware is inherently unreliable, especially consumer hardware.
We have a proposal for doing perf regression prevention without relying on device affinity, as follows:
1) At every build, we will run the perf test twice: once at the previous build revision and once at the current build revision.
2) We tune the "--repeat" param so that we can detect a statistically significant difference in perf results between the two build revisions within an acceptable range (a minimal sketch of such a comparison follows below). More machines can be added as needed to keep the total cycle time acceptable.
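To illustrate step 2, here is a minimal sketch of how the two-revision comparison could be decided. This is not from the bug itself; it assumes each revision's perf results are a list of per-repeat scores (lower is better) and uses a Mann-Whitney U test plus a relative-change threshold, both of which are assumptions for illustration only.

#!/usr/bin/env python3
# Hypothetical sketch: decide whether the current revision regressed vs. the
# previous one, given per-repeat perf scores from both runs.
from statistics import median
from scipy.stats import mannwhitneyu  # assumed choice of statistical test


def is_regression(prev_scores, curr_scores, alpha=0.01, min_rel_change=0.02):
    """Flag a regression only if the change is statistically significant
    AND larger than the noise floor we are willing to tolerate."""
    # Two-sided test: are the two samples plausibly from the same distribution?
    _, p_value = mannwhitneyu(prev_scores, curr_scores, alternative="two-sided")
    # Relative change in the median score (positive means current is slower).
    rel_change = (median(curr_scores) - median(prev_scores)) / median(prev_scores)
    return p_value < alpha and rel_change > min_rel_change


if __name__ == "__main__":
    # e.g. page-load times in ms collected with --repeat=10 at each revision
    prev = [103, 101, 99, 104, 102, 100, 103, 101, 98, 105]
    curr = [110, 108, 112, 109, 111, 107, 113, 110, 109, 112]
    print("regression detected:", is_regression(prev, curr))

The "--repeat" count from step 2 is what controls how small a rel_change the test can reliably detect; the alpha and threshold values above are placeholders, not tuned numbers.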
A more detailed doc that discusses the pros & cons of this approach will be sent out near the end of this quarter (since we have our plate full & this is a big direction change).
I am just using this bug as a reminder for now; no immediate action needs to be taken on this yet.