
Issue 661745

Starred by 2 users

Issue metadata

Status: Assigned
Owner:
OOO until 2019-01-24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Feature

Blocked on:
issue 628464




It's hard to reason about GPU test output.

Project Member Reported by machenb...@chromium.org, Nov 2 2016

Issue description

V8 sometimes breaks gpu tests or optional gpu tests when rolling. We need to reason about the nature of the failures (e.g. check if we see V8 stack traces).

Unlike with many other tests on Chromium trybots, gpu tests seem to require an analysis of the test runner's stdout - which is huge.

Example link:
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_optional_gpu_tests_rel/builds/3653/steps/webgl2_conformance_tests%20on%20NVIDIA%20GPU%20on%20Linux%20%28with%20patch%29%20on%20Linux/logs/stdio

Or logdog, if link gets outdated:
https://luci-logdog.appspot.com/v/?s=chromium%2Fbb%2Ftryserver.chromium.linux%2Flinux_optional_gpu_tests_rel%2F3653%2F%2B%2Frecipes%2Fsteps%2Fwebgl2_conformance_tests_on_NVIDIA_GPU_on_Linux__with_patch__on_Linux%2F0%2Fstdout

Other Chromium tests, e.g. browser tests, provide links to failure output containing everything one needs to know (and nothing else). Can we get that for gpu tests too?
 

Comment 1 by kbr@chromium.org, Nov 2 2016

Blocking: 628464
Cc: nedngu...@google.com eyaich@chromium.org
Components: Internals>GPU>Testing Tests>Telemetry
What's the exact change desired here? There have been attempts to trim down the GPU tests' output -- see Issue 628464. At this point most of the output for passing tests has already been eliminated. Searching through the logs for the failed test name works for me, though I agree the process could be faster.

Is the desire to provide the log excerpt from just the failed test as a link in the waterfall results? I know the gtest harness provides this, but I'm not sure exactly how it works and how to do the same for Telemetry.

Yes, just like the gtest harness. So that at each test step for each failing test there is an extra link containing the log of only that test.

The maximum number of such logs should be limited, though, so that CLs causing numerous test failures do not kill the master's disk pickles.

Technically this would require the following (not sure how much of the gtest logic could be reused):
1. The test driver needs to provide formal failure output in a file, e.g. a json file (this could be in addition to the stdout).
2. The recipe calls the driver with a new flag pointing to a tmp file, e.g. --json-output=/tmp/sometmpfile.json
3. The recipe reads the json file after step execution and changes the step presentation. E.g. api.step.active_result.presentation.logs[failure_name] = failure_lines.
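A minimal sketch of step 3 under stated assumptions: the JSON layout (a top-level "failures" dictionary mapping test names to lists of log lines), the function name and the cap of 10 are all hypothetical, since the thread does not define a schema or a limit. It only shows how per-failure logs could be extracted and capped before being attached to the step.

```python
import json

# Assumed cap; the thread only says the number of logs should be limited.
MAX_FAILURE_LOGS = 10


def failure_logs_from_json(json_text, max_logs=MAX_FAILURE_LOGS):
    """Map failing test names to their log lines, keeping at most max_logs."""
    data = json.loads(json_text)
    logs = {}
    # Hypothetical schema: {"failures": {"<test name>": ["line 1", ...]}}
    # Sorting makes the selection deterministic when the cap is hit.
    for name, lines in sorted(data.get("failures", {}).items()):
        if len(logs) >= max_logs:
            break
        logs[name] = list(lines)
    return logs
```

In a recipe, each returned entry could then be assigned as in step 3, e.g. api.step.active_result.presentation.logs[name] = lines.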

Example failure on chromium:
https://build.chromium.org/p/chromium.win/builders/Win7%20%2832%29%20Tests/builds/12169/steps/browser_tests%20on%20Windows-7-SP1

Example failure on V8 (using different logic):
https://build.chromium.org/p/client.v8.ports/builders/V8%20Linux%20-%20arm64%20-%20sim%20-%20nosnap%20-%20debug/builds/2918/steps/Check
Cc: dpranke@chromium.org

Comment 4 by kbr@chromium.org, Nov 3 2016

Status: Available (was: Untriaged)
These tests already produce a JSON format file, but it's abbreviated. Does Telemetry have to capture the failing tests' output and put it in that file? What is the dictionary entry (by convention) into which these logs go?

Can you point me to the recipe logic which handles this presentation? I don't see anything putting entries into step_result.presentation.logs in tools/build/scripts/slave/recipe_modules/chromium_tests/steps.py .

Looks like the browser tests don't use the recipe modules for the logic, but write the annotations directly.

The browser test stdout has STEP_LOG_LINE annotations, and annotator creates log files based on these lines.

I have no idea about any conventions. I remember that slashes in the file name might not be escaped, which screws up the links, but that's all.
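For illustration, a rough sketch of how such annotation lines could be produced for one failing test. The exact line syntax (@@@STEP_LOG_LINE@name@line@@@ followed by @@@STEP_LOG_END@name@@@) is my recollection of the annotator format and should be checked against annotation_utils.py. The helper name and the choice to replace slashes with underscores are assumptions, not an established convention.

```python
def step_log_annotations(log_name, lines):
    """Build annotator lines that attach `lines` to a log named `log_name`."""
    # Assumption: replace slashes so the name cannot break the log link.
    safe_name = log_name.replace("/", "_")
    out = ["@@@STEP_LOG_LINE@%s@%s@@@" % (safe_name, line) for line in lines]
    # Close the log so the annotator knows no more lines will follow.
    out.append("@@@STEP_LOG_END@%s@@@" % safe_name)
    return out
```

Printing each returned line to stdout is what would make the annotator pick it up, assuming the format above is right.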

I also checked this sample (case with really many failures):
https://build.chromium.org/p/chromium.mac/builders/Mac10.11%20Tests/builds/3208/steps/browser_tests%20on%20Mac-10.11

The code location that restricts the number of logs automatically seems to be:
https://cs.chromium.org/chromium/build/scripts/master/chromium_step.py?q=%22More+Logs%22&sq=package:chromium&l=607&dr=C
@machenbach - 

I don't think that's the browser_tests binary; all of the @STEP_LOG_LINE output is showing up after the binary has already exited. 

I think that's the wrapper adding that, and it probably shouldn't be. It might be runit.py, or maybe something in swarming is getting confused?
You're right. I think I found it now.

The swarming collect script processes a json with test results:
https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/swarming/resources/collect_gtest_task.py?l=133

And for failures (and some other things), logs are written:
https://cs.chromium.org/chromium/build/scripts/slave/annotation_utils.py?l=87

Comment 8 by kbr@chromium.org, Nov 8 2016

Labels: -Pri-1 Pri-2
Owner: kbr@chromium.org
Status: Assigned (was: Available)
Taking this, but I can't treat it as P1. You can search the logs for the failing test names in the interim. Hopefully the tests are not so flaky that you have to do this frequently. Please let me know otherwise.

Components: -Infra Infra>Client>Chrome

Comment 10 by kbr@chromium.org, Jan 12 2017

Cc: zmo@chromium.org kainino@chromium.org
zmo@ pointed out this morning that the non-Swarmed GPU FYI bots like this one:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win10%20Release%20(Intel%20HD%20530)

aren't summarizing even the names of the failed tests, like in this build:
https://build.chromium.org/p/chromium.gpu.fyi/builders/Win10%20Release%20%28Intel%20HD%20530%29/builds/9

Need to look at the LocalIsolatedScriptTest recipe step and see if it's processing things significantly differently than SwarmingIsolatedScriptTest.

Comment 11 by kbr@chromium.org, May 24 2017

Blockedon: 628464

Comment 12 by kbr@chromium.org, May 24 2017

Blocking: -628464
Components: Test>Telemetry
Components: -Tests>Telemetry
