Issue 730250

Starred by 0 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocking:
issue 703837



Getting "TEST RESULTS WERE INVALID" for linux_chromium_rel_ng

Project Member Reported by jeffcarp@chromium.org, Jun 6 2017

Issue description

Hi, not sure if this is a tooling failure or my fault. In this CL:
https://chromium-review.googlesource.com/c/461722/

linux_chromium_rel_ng keeps failing with "TEST RESULTS WERE INVALID"
https://build.chromium.org/p/tryserver.chromium.linux/builders/linux_chromium_rel_ng/builds/472864

Can someone more familiar with Linux swarming take a look? I don't see any layout test failures.
 
Cc: dpranke@chromium.org
webkitpy.layout_tests.merge_results: [DEBUG] Creating merged /b/rr/tmpJlIM80/w/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt from ['/tmp/tmpcmBbi1/1/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt', '/tmp/tmpcmBbi1/2/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt', '/tmp/tmpcmBbi1/4/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt']
Traceback (most recent call last):
  File "/b/c/b/linux/src/third_party/WebKit/Tools/Scripts/merge-layout-test-results", line 209, in <module>
    main(sys.argv[1:])
  File "/b/c/b/linux/src/third_party/WebKit/Tools/Scripts/merge-layout-test-results", line 191, in main
    merger.merge(args.output_directory, args.input_directories)
  File "/b/c/b/linux/src/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py", line 498, in merge
    merge_func(out_path, to_merge)
  File "/b/c/b/linux/src/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py", line 291, in __call__
    to_merge)
webkitpy.layout_tests.merge_results.MergeFailure: Failure merging /b/rr/tmpJlIM80/w/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt:  File contents don't match:
/tmp/tmpcmBbi1/2/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt
/tmp/tmpcmBbi1/4/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt
Trying to merge ['/tmp/tmpcmBbi1/1/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt', '/tmp/tmpcmBbi1/2/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt', '/tmp/tmpcmBbi1/4/layout-test-results/external/wpt/html/syntax/parsing/html5lib_menuitem-element-diff.txt'].
WARNING:root:merge_cmd had non-zero return code: 1
step returned non-zero exit code: 1
This means that the external/wpt/html/syntax/parsing/html5lib_menuitem-element.html test is being run on multiple shards.

The only way this happens is if the test is somehow included twice. The last time I saw a problem like this, the test was listed in TestExpectations twice in some way.
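
For readers unfamiliar with the traceback above, here is a minimal sketch of the kind of check that produces a "File contents don't match" error. It assumes a merger that requires every shard's copy of an output file to be byte-identical. The names `merge_identical` and this `MergeFailure` class are illustrative only, not the actual code in merge_results.py.

```python
import filecmp
import os
import shutil
import tempfile


class MergeFailure(Exception):
    """Raised when per-shard copies of one output file disagree (illustrative)."""


def merge_identical(out_path, to_merge):
    # Hypothetical helper: compare every shard's copy against the first one.
    first = to_merge[0]
    for other in to_merge[1:]:
        # shallow=False forces a byte-by-byte comparison of the contents.
        if not filecmp.cmp(first, other, shallow=False):
            raise MergeFailure(
                "Failure merging %s: File contents don't match:\n%s\n%s"
                % (out_path, first, other))
    # All copies agree, so any one of them can serve as the merged file.
    shutil.copyfile(first, out_path)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for shard, text in enumerate(["diff A\n", "diff A\n", "diff B\n"]):
            path = os.path.join(tmp, "shard%d-diff.txt" % shard)
            with open(path, "w") as f:
                f.write(text)
            paths.append(path)
        try:
            merge_identical(os.path.join(tmp, "merged-diff.txt"), paths)
        except MergeFailure as e:
            print(e)
```

Under this assumption, a test that runs on only one shard can never trip the check, which is why the error points to the same output file being written by more than one shard.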
Owner: jeffcarp@chromium.org
Aha, so that file is not listed in TestExpectations, but this CL does change the way we locate and collect tests, so the new behavior is probably returning duplicate tests in some cases. Thanks for looking into this. I'll try to see why it's returning duplicates.
Blocking: 703837
After investigating, it looks like duplicate tests show up between shards even on master. Here's the output of a script I wrote that compares the test files run between two shards (using --shard-index=i --total-shards=6):

shard tests-to-run-branch-0-of-6 has 9819 tests
shard tests-to-run-branch-1-of-6 has 9861 tests
found 1594 dupes across 2 shards

On master:

shard tests-to-run-master-0-of-6 has 9925 tests
shard tests-to-run-master-1-of-6 has 9913 tests
found 1650 dupes across 2 shards

I'm confused why duplicate tests on master don't cause the same problem.
Oops, I realized those are being run in random order without a consistent seed, which explains the result.
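
A small sketch of why an inconsistent seed produces apparent duplicates when comparing shards. It assumes a simplified sharding scheme (shuffle the full list, then take every Nth test); `shard` and `find_dupes` are hypothetical helpers, not the real run-webkit-tests logic or the comparison script mentioned above.

```python
import random


def shard(tests, index, total, seed):
    # Hypothetical sharding: shuffle a sorted copy with the given seed,
    # then hand every `total`-th test to shard `index`.
    ordered = sorted(tests)
    random.Random(seed).shuffle(ordered)
    return ordered[index::total]


def find_dupes(shard_a, shard_b):
    # Tests that appear in both shards.
    return sorted(set(shard_a) & set(shard_b))


tests = ["test%03d.html" % i for i in range(60)]

# Same seed for both shards: they slice one ordering, so they cannot overlap.
same_seed = find_dupes(shard(tests, 0, 6, seed=1), shard(tests, 1, 6, seed=1))
print(len(same_seed))  # 0

# Different seed per invocation: each shard slices a different ordering,
# so overlap between shards becomes possible.
mixed_seed = find_dupes(shard(tests, 0, 6, seed=1), shard(tests, 1, 6, seed=2))
print(len(mixed_seed))
```

In other words, the shard lists are only comparable when every invocation uses the same ordering, so the dupe counts above say nothing about what happens on the bots.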
Status: Started (was: Untriaged)
Update on this bug: the duplicate test problem is weirdly being caused by 180 tests all in this folder:
virtual/layout_ng/external/wpt/css/CSS2/floats-clear/

I'm looking into why that might be the case and what is different about those tests.
So I think the root problem here is that the CL I'm working on causes one file to be represented as potentially multiple test files, but upon merge the script tries to merge them back into one file, which isn't possible.
I don't quite understand what you mean?

The point of the merge script is that it merges multiple files of the same kind together. A single test which creates multiple outputs should be fine?
@tansell my CL enables running WPT tests that don't correspond to actual files - for instance:

html/syntax/parsing/html5lib_adoption01.html

Now becomes 3 tests:
/html/syntax/parsing/html5lib_adoption01.html?run_type=uri
/html/syntax/parsing/html5lib_adoption01.html?run_type=write
/html/syntax/parsing/html5lib_adoption01.html?run_type=write_single

They have different outputs, but the merge script is trying to merge them all into one file, which causes the failure.

I was able to get the LUCI runners to pass by returning early here:
https://chromium-review.googlesource.com/c/461722/47/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/merge_results.py#286

I'm trying to find what determines the filename for the test results files, because I think it's currently using the base filename and not the (virtual) filename of the test that's actually being run.

Ah I see what's happening - when a test result output filename is made, it strips off the extension (including anything after, like ?run_type=uri) and adds -expected.txt or -stderr.txt or any other suffix. This effectively turns the 3 different tests above into one results file, but the contents don't match so it errors out.
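
To make the collision concrete, here is a minimal sketch assuming the output name is derived by splitting off everything from the last dot onward. `output_filename` is a hypothetical stand-in for the real naming code, and `variant_aware_filename` is only one possible way to keep the names distinct, not necessarily what the CL ended up doing.

```python
import os
import re


def output_filename(test_name, suffix):
    # Hypothetical stand-in for the behavior described above: everything
    # from the last "." onward is dropped, including "?run_type=...".
    base, _ext = os.path.splitext(test_name)
    return base + suffix


def variant_aware_filename(test_name, suffix):
    # Illustrative alternative (an assumption, not the actual fix):
    # fold the query string into the base name so variants stay distinct.
    path, _, query = test_name.partition("?")
    base, _ext = os.path.splitext(path)
    if query:
        base += "_" + re.sub(r"[^A-Za-z0-9]+", "_", query)
    return base + suffix


variants = [
    "/html/syntax/parsing/html5lib_adoption01.html?run_type=uri",
    "/html/syntax/parsing/html5lib_adoption01.html?run_type=write",
    "/html/syntax/parsing/html5lib_adoption01.html?run_type=write_single",
]

collapsed = {output_filename(v, "-expected.txt") for v in variants}
print(len(collapsed))  # 1: all three variants map to the same results file

distinct = {variant_aware_filename(v, "-expected.txt") for v in variants}
print(len(distinct))  # 3
```

With the first function, three tests with different output all write to one path, which matches the mismatch the merge script reported.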
This will be happening when you run it locally too, right? It'll just be overwriting the existing file and you'll randomly end up with one of the results.
I'd imagine that's correct, yes - if it weren't for the merge script throwing an exception we might not have caught this behavior.
Status: Fixed (was: Started)