Issue 676304 link

Issue metadata

Status: Fixed
Owner:
Closed: Jan 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: ----
Type: ----

Blocked on:
issue 664765
issue 670316

Blocking:
issue 671156



"run_benchmark try" hangs on "Performance Test" with no output

Project Member Reported by perezju@chromium.org, Dec 21 2016

Issue description

I've started a couple of try jobs with the command:

    tools/perf/run_benchmark try android-nexus5 system_health.memory_mobile

Both timed out after one hour on the "Performance Test (With Patch) 1 of 1" step without producing any output.

The two try jobs:
- https://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5_perf_bisect/builds/4475
- https://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5_perf_bisect/builds/4476

And the CL (which doesn't really do much; I was in fact writing the docs on how to run this sort of try job):
https://codereview.chromium.org/2595613002
 
Cc: simonhatch@chromium.org
Components: Tests>AutoBisect
Can you try re-running with -v?
I thought that was always implicit, but will retry with it now.
Blocking: 671156
Cc: hjd@chromium.org
Build failed again with no output.

Same thing happened here from another CL:
https://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5_perf_bisect/builds/4477
The command you appear to be running is:

    src/tools/perf/run_benchmark --browser=android-chromium system_health.memory_mobile --pageset-repeat 5 --verbose


Pulling the timing info from a normal run of system_health.memory_mobile on the Nexus 5 perf waterfall, one run takes ~7000 s, or close to 2 hours (I think the default is 3 repeats?), so with --pageset-repeat=5 this would take over 3 hours.
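For reference, a rough sketch of that estimate in Python. It assumes the ~7000 s figure covers the default run and that the default really is 3 repeats (unconfirmed above); the function and constant names are made up for illustration.

```python
# Back-of-the-envelope runtime estimate for system_health.memory_mobile.
# Assumptions (not confirmed in this thread): ~7000 s covers a default
# waterfall run, and that default run uses 3 pageset repeats.
WATERFALL_RUN_SECONDS = 7000.0
ASSUMED_DEFAULT_REPEATS = 3


def estimate_hours(repeats,
                   run_seconds=WATERFALL_RUN_SECONDS,
                   default_repeats=ASSUMED_DEFAULT_REPEATS):
    """Scale the observed run time linearly with the number of repeats."""
    seconds_per_repeat = run_seconds / default_repeats
    return seconds_per_repeat * repeats / 3600.0


if __name__ == "__main__":
    for repeats in (3, 5):
        print("repeats=%d -> %.2f hours" % (repeats, estimate_hours(repeats)))
```

Under those assumptions, 5 repeats come out to a bit over 3 hours, well past a one-hour window either way.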

Anybody know if there's a 1 hour limit on a single step?
I think there is a timeout if the step _does not_ produce any output for a period of time. It's unclear to me, however, why we're not seeing any output at all.

I kicked off another try job with a story filter so that, in theory, the job should be able to complete in less than 1 hour.

Running here:
https://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5_perf_bisect/builds/4481
Cc: primiano@chromium.org
Interesting, the try job in #6 did work since each run of "Performance Test" took only a few minutes.

I guess the problem is how this step works. By "capturing" all output and only producing it all in one go when it's done, the lack of output makes buildbot think that the step is stuck and kills it.

This is bad for system health, since running the entire plan to test the effects of memory for a CL on a variety of scenarios is one of the main reasons devs would want to run try jobs.

Additionally:

- The step failed later at "Post bisect results" with the dashboard responding "Error response: 400". That should probably be a separate bug?

- I was expecting to see a link to a results2.html file somewhere. Wasn't it supposed to be there?
Blockedon: 670316 664765
Cc: mikec...@chromium.org
+mikecase

Adding mikecase@ since he's done some work in this area while investigating the mysterious timeouts. I vaguely recall the old perf/test_runner.py used to have a heartbeat logger that prevented timeouts for long-running tests.

Blocking on:

- crbug.com/664765 for the failed post to dashboard
- crbug.com/670316 for the missing results file
Not sure heartbeat will work since we intercept all stdout from the test run. Even the heartbeat would be intercepted unless we found some clever way to do it.

How about a change like this (pretty rough CL):
https://chromium-review.googlesource.com/c/423387/1/scripts/slave/recipe_modules/bisect_tester/perf_test.py

Basically, instead of intercepting stdout, just do...

    test_run_cmd.py | tee captured_stdout.txt


Then stdout should get streamed to buildbot to avoid timeouts. And we still get stdout in recipes to parse and stuff.
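The same tee idea can be sketched in plain Python, in case shelling out to `tee` isn't convenient. This is only an illustration, not what the linked CL does: `run_and_tee` is a hypothetical helper, and it assumes the child process flushes its output line by line (a child that buffers stdout when writing to a pipe would still look silent).

```python
import subprocess
import sys


def run_and_tee(cmd, sink=None):
    """Run cmd, echoing each output line as it arrives while also keeping it.

    Hypothetical helper: returns (return_code, captured_output).
    """
    sink = sink if sink is not None else sys.stdout
    captured = []
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so nothing is lost
        universal_newlines=True,   # text mode, line-oriented reads
    )
    for line in proc.stdout:
        sink.write(line)   # stream to the build log right away
        sink.flush()       # so the step is never seen as silent
        captured.append(line)
    proc.stdout.close()
    return proc.wait(), "".join(captured)


if __name__ == "__main__":
    code, output = run_and_tee([sys.executable, "-u", "-c", "print('hello')"])
    print("exit code:", code)
```

The captured string can then be parsed the same way the intercepted stdout was, while the log keeps getting output during the run.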
link to cleaned up CL: https://chromium-review.googlesource.com/c/423387/

Owner: mikec...@chromium.org
Status: Fixed (was: Untriaged)
This is awesome! Thanks mikecase for the fix!

1. Some random test CL: https://codereview.chromium.org/2651713004
2. Try job build: https://build.chromium.org/p/tryserver.chromium.perf/builders/android_nexus5_perf_bisect/builds/4566
3. With easy link to Results HTML (thanks simonhatch for fixing bug 670316!)
4. Results FULL system health: https://console.developers.google.com/m/cloudstorage/b/chromium-telemetry/o/html-results/results-2017-01-23_22-44-51

This is going to make the lives of developers chasing regressions so much easier!
Attachment: ESpJsvB59Z5.png (331 KB)
Components: Speed>Bisection