
Issue 791118

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug-Regression




2%-27.5% regression in webrtc_perf_tests at 20874:20874

Project Member Reported by terelius@chromium.org, Dec 1 2017

Issue description

Mix of improvements and regressions. Please discuss with sprang@ for severity assessment, but e.g. a clear drop from 21 to 19 fps seems like it would warrant an investigation.

The indicated CL is almost certainly not responsible; it just removes an (audio) unit test. If I had to guess, I'd say it was a device/infrastructure change. Is there a log of all maintenance and upgrades that have been performed on the devices? Maybe we could set a flag on the device when any infra change has been performed, report the flag through chromeperf.appspot.com on the next run and then clear the flag?
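A minimal sketch of the flag idea, assuming the flag is a small JSON file somewhere the test runner can read. The file name and the functions `mark_infra_change` and `consume_infra_flag` are hypothetical; nothing like this exists in the current infra, and reporting the value to chromeperf.appspot.com is left out.

```python
import json
from pathlib import Path
from typing import Optional


def mark_infra_change(flag_path: Path, reason: str) -> None:
    """Called by whoever performs maintenance: record that something changed."""
    flag_path.write_text(json.dumps({"reason": reason}))


def consume_infra_flag(flag_path: Path) -> Optional[str]:
    """Called at the start of a perf run.

    Returns the recorded reason and clears the flag, or returns None
    if no infra change has been recorded since the previous run.
    """
    if not flag_path.exists():
        return None
    reason = json.loads(flag_path.read_text())["reason"]
    flag_path.unlink()  # clear the flag so only the next run reports it
    return reason
```

The run that follows a maintenance event would get the reason back and could attach it to its results; later runs would get `None`.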

 
All graphs for this bug:
  https://chromeperf.appspot.com/group_report?bug_id=791118

(For debugging:) Original alerts at time of bug-filing:
  https://chromeperf.appspot.com/group_report?sid=e3fa3e1430caeb12cb3503d0b7569637805471088da7f701302cec50cc0899d1


Bot(s) for this bug's original alert(s):

webrtc-android-tests-nexus5-kitkat
webrtc-android-tests-nexus6-nougat
webrtc-android-tests-nexus9
Owner: sprang@chromium.org
If you click through to the buildbot status pages for a build before the regression and a build after it, you can see that the same devices ran in both cases:

https://build.chromium.org/p/client.webrtc.perf/builders/Android32%20Tests%20%28N%20Nexus6%29/builds/4318
shamu NRD91G ZY222X5LBG
shamu NRD91G ZX1G22WWWD

https://build.chromium.org/p/client.webrtc.perf/builders/Android32%20Tests%20%28N%20Nexus6%29/builds/4333
shamu NRD91G ZY222X5LBG
shamu NRD91G ZX1G22WWWD

Therefore, there has been no device change. We generally try to tell you when this happens, but we are mere humans, so sometimes we forget. In this case that is not the problem, at least.
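The manual comparison above could be sketched as a small script. This assumes the device lines have been copied by hand from the two status pages in the `model build serial` form shown above; `parse_devices` and `changed_devices` are hypothetical helpers, not part of any existing tool.

```python
def parse_devices(status_text: str) -> set:
    """Collect (model, os_build, serial) tuples from pasted status lines."""
    devices = set()
    for line in status_text.splitlines():
        parts = line.split()
        if len(parts) == 3:  # skip blank lines and anything not in the expected form
            devices.add(tuple(parts))
    return devices


def changed_devices(before_text: str, after_text: str) -> set:
    """Return devices present in only one of the two builds."""
    return parse_devices(before_text) ^ parse_devices(after_text)


# Device lines as listed for builds 4318 and 4333.
build_4318 = """shamu NRD91G ZY222X5LBG
shamu NRD91G ZX1G22WWWD"""
build_4333 = """shamu NRD91G ZY222X5LBG
shamu NRD91G ZX1G22WWWD"""

print(changed_devices(build_4318, build_4333))
```

An empty set means neither the device serials nor the OS build strings differ between the two builds.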

That leaves the possibilities of 1) it's a legit regression, 2) the test doesn't measure the right thing and 3) cosmic rays/unrelated changes/bad luck. 

I'll leave it to sprang@ to check if it's 1) or 2), otherwise we just write it off as 3).
Cc: phoglund@chromium.org
This is incredibly time-consuming to triage, though, especially since it seems we cannot trust the CL range. That's why I am reluctant to write it off as bad luck.

Apart from a device change, could the test be affected by something else, such as an OS upgrade?
With that said:

1) Edward will look into the CL range error. I agree this is important to fix.
2) Maybe we should run fewer tests on Android? We should periodically clean up tests that turn out to be unstable or otherwise not providing value. Could you file a bug to do such a cleanup? It helps a lot if you have data here, so we're not basing decisions on anecdotes alone. If you don't have any data, then give us the anecdotes you have, and we'll try to work with that.
Thanks.

Unfortunately, I don't have any solid data, only anecdotal evidence. My experience is that the test works fine in general, but that this specific bot (and possibly the Nexus 4) is unreliable. I suggest we gather more input from the upcoming sheriffs to determine the right course of action.
