Issue metadata
Sign in to add a comment
|
loading.mobile and media.mse_cases failing on chromium.perf Android One Perf (3) |
||||||||||||||||||||
Issue descriptionRevision range: 435222-435255 Link to failing step log: https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_One_Perf__3_%2F4509%2F%2B%2Frecipes%2Fsteps%2Floading.mobile%2F0%2Fstdout Loading.mobile also failed two runs before this revision range, but it seems that it was due to AdbCommandFailed errors, whereas all failures afterwards were due to javascript expression timeouts. I'm going to kick off a bisect. If the test is disabled, please downgrade to Pri-2.
,
Dec 12 2016
,
Dec 12 2016
Also going to lump in the media.mse_cases failure on the Android Ones: both loading.mobile and media.mse_cases started failing in the same revision range with the same error (timed out waiting for IsJavascriptExpressionTrue), so I highly suspect that fixing one will fix the other.
,
Dec 12 2016
===== BISECT JOB RESULTS ===== Status: started ===== TESTED REVISIONS ===== Revision Exit Code Std Dev N Good? chromium@435221 0 N/A 5 good chromium@435255 0 N/A 5 bad Bisect job ran on: android_one_perf_bisect Bug ID: 673461 Test Command: src/tools/perf/run_benchmark -v --browser=android-chromium --output-format=chartjson --upload-results --pageset-repeat=1 --also-run-disabled-tests --story-filter=flower loading.mobile Test Metric: timeToFirstContentfulPaint_avg/https___www.google.com_search?q_flower_q_flower+delivery Relative Change: 0.00% Buildbot stdio: http://build.chromium.org/p/tryserver.chromium.perf/builders/android_one_perf_bisect/builds/1873 Job details: https://chromeperf.appspot.com/buildbucket_job_status/8993450393562116736 Not what you expected? We'll investigate and get back to you! https://chromeperf.appspot.com/bad_bisect?try_job_id=5883138138439680 | O O | Visit http://www.chromium.org/developers/speed-infra/perf-bug-faq | X | for more information addressing perf regression bugs. For feedback, | / \ | file a bug with component Tests>AutoBisect. Thank you!
,
Dec 13 2016
Weird: when I look through the logs for the latest run (https://luci-logdog.appspot.com/v/?s=chrome%2Fbb%2Fchromium.perf%2FAndroid_One_Perf__3_%2F4512%2F%2B%2Frecipes%2Fsteps%2Floading.mobile%2F0%2Fstdout), I'm seeing some strange results. When I search for failed ] https://www.google.com/search?q=flower#q=flower+delivery I get two results. When I search for ok ] https://www.google.com/search?q=flower#q=flower+delivery I also get two results. Intuitively, it looks like it makes sense that there's one "failed" and one "ok": it looks like we're trying to run the test twice, once with traffic shaping and one w/o traffic shaping. But why are we running it four times?
,
Dec 13 2016
Ah, it's obvious that this regression was caused by the traffic-shaped 3G benchmarks being introduced in the first place: https://codereview.chromium.org/2493063002 Here's an example stack trace from a failure: https://paste.googleplex.com/4615545282363392 This certainly makes it look like we're timing out waiting for the page to load. This screenshot also seems to confirm this: https://console.developers.google.com/m/cloudstorage/b/chrome-telemetry-output/o/profiler-file-id_59-2016-12-12_08-42-2061933.png Strangely, in the stack trace I linked to, one of the error messages reads: TimeoutException: Timed out while waiting 26s for IsJavaScriptExpressionTrue. I also found: "TimeoutException: Timed out while waiting 0s for IsJavaScriptExpressionTrue." This doesn't make any frickin' sense. I need to look into how this timeout reaches the error message in order to make sure that we're doing the right thing here.
,
Dec 15 2016
It's worth noting that some of these timeouts have messages like "TimeoutException: Timed out while waiting 0s for IsJavaScriptExpressionTrue." See: https://paste.googleplex.com/5494691936075776 My suspicion is that we're passing along a duration that means something like "time that this test has left to run", and eventually that number hits zero before we even get a chance to run WaitForDocumentReadyStateToBeInteractiveOrBetter.
,
Dec 15 2016
And here's (https://paste.googleplex.com/5845565799661568) another similar, but slightly different stack trace. Both of these stack traces indicate that Telemetry is timing out while waiting for a Javascript expression to become true, but in two different ways.
,
Dec 15 2016
It looks like this (https://cs.corp.google.com/github/catapult-project/catapult/telemetry/telemetry/internal/browser/web_contents.py?q=p:github+f:%5Ecatapult-project/catapult/+%2290%22+f:.py&dr=C&l=13) is the source of the timeout from https://paste.googleplex.com/5845565799661568. Still looking for the timeout associated with the longer stack trace.
,
Dec 15 2016
,
Dec 16 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/1d47683961dc5963f758463508404ff114133d9a commit 1d47683961dc5963f758463508404ff114133d9a Author: charliea <charliea@chromium.org> Date: Fri Dec 16 00:48:10 2016 Double the navigation timeouts for page cycler stories This is in response to problems with loading.mobile on Android One/3G, where almost all stories were timing out. BUG= 673461 Review-Url: https://codereview.chromium.org/2577243002 Cr-Commit-Position: refs/heads/master@{#438969} [modify] https://crrev.com/1d47683961dc5963f758463508404ff114133d9a/tools/perf/page_sets/page_cycler_story.py
,
Dec 16 2016
My CL yesterday seems to have drastically helped the situation, but it doesn't seem to have made the test green: we're down from 23 failing stories to just one, http://www.ibicn.com. This story is listed as "tough_tti", so it's not too surprising that this one is the one that's causing problems. I'm going to play around with timeouts on my local Android One device in order to see what a good timeout for this story might be.
,
Dec 16 2016
I've attached the failure image. This thing's ridiculously close to being done loading. I talked to Ned and we decided that increasing the timeout by another 50% should probably be sufficient to resolve the issue.
,
Dec 16 2016
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/33dea85c0961db70473126d594815c5f1739bc84 commit 33dea85c0961db70473126d594815c5f1739bc84 Author: charliea <charliea@chromium.org> Date: Fri Dec 16 22:18:01 2016 Further increase the navigation timeout to 180s for page cycler stories The increase done in https://codereview.chromium.org/2577243002 was effective in decreasing the number of failing 3G/Android One stories from 23 to 1, and hopefully this increase gets that number to zero. BUG= 673461 Review-Url: https://codereview.chromium.org/2576403005 Cr-Commit-Position: refs/heads/master@{#439214} [modify] https://crrev.com/33dea85c0961db70473126d594815c5f1739bc84/tools/perf/page_sets/page_cycler_story.py
,
Dec 20 2016
|
|||||||||||||||||||||
►
Sign in to add a comment |
|||||||||||||||||||||
Comment 1 by 42576172...@developer.gserviceaccount.com
, Dec 12 2016