New issue
Advanced search Search tips

Issue 866520 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Jul 24
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Mac
Pri: 0
Type: Bug



Sign in to add a comment

webkit_layout_tests step completely fail on Mac

Project Member Reported by nednguyen@chromium.org, Jul 23

Issue description

There is no clear test failure. I am guessing there is some problem with the results merging.

https://ci.chromium.org/p/chromium/builders/luci.chromium.try/mac_chromium_rel_ng/100391

This is blocking https://chromium-review.googlesource.com/c/chromium/src/+/1146495

p/s Dirk: this is an example which a CL doesn't touch blink code or anything related but trigger the layout test :-)

P0 since this is blocking CQ
 
Cc: bcwh...@chromium.org dvadym@chromium.org timloh@chromium.org lazyboy@chromium.org
CC chromium sherrifs
Failure logs:


  File "/b/s/w/ir/cache/builder/src/third_party/blink/tools/merge_web_test_results.py", line 12, in <module>
    main(sys.argv[1:])
  File "/b/s/w/ir/cache/builder/src/third_party/blink/tools/blinkpy/web_tests/merge_results.py", line 823, in main
    merger.merge(args.output_directory, args.input_directories)
  File "/b/s/w/ir/cache/builder/src/third_party/blink/tools/blinkpy/web_tests/merge_results.py", line 514, in merge
    merge_func(out_path, to_merge)
  File "/b/s/w/ir/cache/builder/src/third_party/blink/tools/blinkpy/web_tests/merge_results.py", line 307, in __call__
    to_merge)
blinkpy.web_tests.merge_results.MergeFailure: Failure merging /b/s/w/ir/kitchen-workdir/layout-test-results/external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables-stderr.txt:  File contents don't match:
/b/s/w/ir/tmp/t/tmptphIor/5/layout-test-results/external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables-stderr.txt
Trying to merge ['/b/s/w/ir/tmp/t/tmptphIor/0/layout-test-results/external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables-stderr.txt', '/b/s/w/ir/tmp/t/tmptphIor/5/layout-test-results/external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables-stderr.txt'].
Command ['/b/s/w/ir/cache/vpython/cbb391/bin/python', '/b/s/w/ir/cache/builder/src/third_party/blink/tools/merge_web_test_results.py', '--build-properties', '{"attempt_start_ts": 1532345229000000, "blamelist": ["nednguyen@google.com"], "bot_id": "vm1000-m4", "buildbucket": {"build": {"bucket": "luci.chromium.try", "created_by": "user:5071639625-1lppvbtck1morgivc6sq4dul7klu27sd@developer.gserviceaccount.com", "created_ts": 1532348466812370, "id": "8940209890022033296", "project": "chromium", "tags": ["builder:mac_chromium_rel_ng", "buildset:patch/gerrit/chromium-review.googlesource.com/1146495/1", "cq_experimental:false", "master:luci.chromium.try", "user_agent:cq"]}, "hostname": "cr-buildbucket.appspot.com"}, "buildername": "mac_chromium_rel_ng", "buildnumber": 100391, "category": "cq", "got_angle_revision": "ba365939b90c76e5692aa97d2825dea6fe8bb200", "got_buildtools_revision": "0dd5c6f980d22be96b728155249df2da355989d9", "got_nacl_revision": "aebc9e5205dcaad5d5954a18abc532316a954644", "got_revision": "353108ea6461c59370af55fb019f361c0cdd7e63", "got_revision_cp": "refs/heads/master@{#577153}", "got_swarming_client_revision": "9a518d097dca20b7b00ce3bdfc5d418ccc79893a", "got_v8_revision": "d7b61abe7b48928aed739f02bf7695732d359e7e", "got_v8_revision_cp": "refs/heads/6.9.427@{#1}", "got_webrtc_revision": "94ef424b3160262f0f8298b8abd47c29bb9ee3da", "got_webrtc_revision_cp": "refs/heads/master@{#24059}", "mastername": "tryserver.chromium.mac", "patch_gerrit_url": "https://chromium-review.googlesource.com", "patch_issue": 1146495, "patch_project": "chromium/src", "patch_ref": "refs/changes/95/1146495/1", "patch_repository_url": "https://chromium.googlesource.com/chromium/src", "patch_set": 1, "patch_storage": "gerrit", "path_config": "generic", "reason": "CQ", "recipe": "chromium_trybot", "repository": "https://chromium.googlesource.com/chromium/src", "revision": "HEAD"}', '--summary-json', '/b/s/w/ir/tmp/t/tmptphIor/summary.json', '--task-output-dir', '/b/s/w/ir/tmp/t/tmptphIor', u'--verbose', '-o', '/b/s/w/ir/tmp/t/tmpmK3Smb.json', '/b/s/w/ir/tmp/t/tmptphIor/0/output.json', '/b/s/w/ir/tmp/t/tmptphIor/1/output.json', '/b/s/w/ir/tmp/t/tmptphIor/10/output.json', '/b/s/w/ir/tmp/t/tmptphIor/11/output.json', '/b/s/w/ir/tmp/t/tmptphIor/2/output.json', '/b/s/w/ir/tmp/t/tmptphIor/3/output.json', '/b/s/w/ir/tmp/t/tmptphIor/4/output.json', '/b/s/w/ir/tmp/t/tmptphIor/5/output.json', '/b/s/w/ir/tmp/t/tmptphIor/6/output.json', '/b/s/w/ir/tmp/t/tmptphIor/7/output.json', '/b/s/w/ir/tmp/t/tmptphIor/8/output.json', '/b/s/w/ir/tmp/t/tmptphIor/9/output.json'] returned exit code 1
WARNING:root:merge_cmd had non-zero return code: 1
step returned non-zero exit code: 1
Labels: OS-Mac
Owner: robertma@chromium.org
Status: Assigned (was: Untriaged)
Summary: webkit_layout_tests step completely fail on Mac (was: webkit_layout_tests step completely fail)
Thanks for filing the bug. I was also investigating this just now.

The root cause is that two tests in the same directory have a duplicate basename:

external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables.{html,xhtml}

And their outputs (e.g. -stderr.txt) would have the same filename and hence collide with each other. I'll rename the tests to unblock CQ.


The tests have been there for a long time and the issue was only triggered today because content_shell started to print a lot of stderr in almost all layout tests on Mac:

[12226:775:0723/081727.016090:ERROR:image_transport_surface_overlay_mac.mm(114)] No context current!

This is a separate issue and also needs to be looked into.
Cc: ccameron@chromium.org
The error message was added in
https://chromium.googlesource.com/chromium/src/+/3bfab66af63dca6cda637f52077d8bdac2cf0b5e%5E%21/#F1

This is producing a lot of stderr (in almost all layout tests). Shall we revert it?
Labels: Sheriff-Chromium
Attempting to rename the test: https://chromium-review.googlesource.com/c/chromium/src/+/1146989

If the CL still doesn't pass CQ (likely because of more tests with duplicate basenames), I'll try reverting 3bfab66af63dca6cda637f52077d8bdac2cf0b5e 
Let's just delete the DLOG(ERROR) -- it's not a new error, but logging of a pre-existing error.

Someone recently (~1 week ago) broke a bunch of assumptions about how macOS GL contexts work, and it's causing serious stability issues on Canary as well.
#3: 
> The tests have been there for a long time and the issue was only triggered today because content_shell started to print a lot of stderr in almost all layout tests on Mac:

do you mind explain it further? Why would the extra stderr cause the duplicate base names error to fail now?
Re #9: 

Background is issue 764662. In short, Blink test runner mangles filenames a lot, e.g. stderr of foo-test.html goes to foo-test-stderr.txt. Historically the code assumes basenames are always unique in a directory. If this assumption is violated and the tests in question produce outputs (-stderr.txt, -actual.txt, etc.), output files will collide and the suite may fail with an exception (usually during merging results).

In this case, the duplicate form-in-tables.{html,xhtml} used to pass without any outputs before, but 3bfab66af63dca6cda637f52077d8bdac2cf0b5e made both of them producing form-in-tables-stderr.txt.

And sorry I considered issue 764662 low-priority because I thought it would only block new culprit CLs on CQ. I don't know why 3bfab66af63dca6cda637f52077d8bdac2cf0b5e could pass CQ and hence block all other CLs (apparently it retried four times; perhaps there's some nondeterminism somewhere).
Project Member

Comment 11 by bugdroid1@chromium.org, Jul 23

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/d63a2e68d79250d0da55fa00aa2ae0f680942676

commit d63a2e68d79250d0da55fa00aa2ae0f680942676
Author: Christopher Cameron <ccameron@chromium.org>
Date: Mon Jul 23 18:53:58 2018

Mac: Disable warnings about having no GLContextCGL

ImageTransportSurfaceOverlayMac is written under the assumption that it
is only ever paired with a GLContextCGL. This assumption has recently
been broken. The ERROR here was, as a result, creating an unruly amout
of console spam.

Bug:  866520 
Cq-Include-Trybots: luci.chromium.try:android_optional_gpu_tests_rel;luci.chromium.try:linux_optional_gpu_tests_rel;luci.chromium.try:mac_optional_gpu_tests_rel;luci.chromium.try:win_optional_gpu_tests_rel
Change-Id: I3f3524f3f45e9e4cdaae09aeb672c9ab3b8edafa
Reviewed-on: https://chromium-review.googlesource.com/1147100
Reviewed-by: Robert Ma <robertma@chromium.org>
Commit-Queue: ccameron <ccameron@chromium.org>
Cr-Commit-Position: refs/heads/master@{#577211}
[modify] https://crrev.com/d63a2e68d79250d0da55fa00aa2ae0f680942676/gpu/ipc/service/image_transport_surface_overlay_mac.mm

Project Member

Comment 12 by bugdroid1@chromium.org, Jul 23

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/1736c65757bac37455e16a11e27b4f58bc226955

commit 1736c65757bac37455e16a11e27b4f58bc226955
Author: Robert Ma <robertma@chromium.org>
Date: Mon Jul 23 22:22:00 2018

Rename a WPT which has a duplicate basename

as another test in the same directory.

Blink test runner isn't able to handle such duplicate basenames because
it relies on filename mangling extensively.

Bug:  866520 
Change-Id: I0c9671e83dcf9c61bc20f16a43dcace827145024
Reviewed-on: https://chromium-review.googlesource.com/1146989
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Commit-Queue: Robert Ma <robertma@chromium.org>
Cr-Commit-Position: refs/heads/master@{#577297}
[rename] https://crrev.com/1736c65757bac37455e16a11e27b4f58bc226955/third_party/WebKit/LayoutTests/external/wpt/html/rendering/non-replaced-elements/tables/form-in-tables-xhtml.xhtml

hi Robert, can this bug be closed or downgraded from P0 now? It looks like webkit_layout_tests is no longer failing on the merge.
Status: Fixed (was: Assigned)
Yes. With the two CLs landed, the bug itself is fixed.
Anyone know how the culrpit CL that triggered this get through the CQ in the first place?
Labels: -Sheriff-Chromium
Re #15: I just confirmed my theory in my comment #10. The error messages added by the culprit CL were a bit flaky; in rare cases they were not printed.

At the fourth retry, the culprit CL got lucky, and form-in-tables.xhtml didn't produce stderr:
https://chromium-swarm.appspot.com/task?id=3ed633412aa15310&refresh=10&show_raw=1
(search for "form-in-tables.xhtml passed" in the logs)

Hence there was no collision and the CL passed CQ.

Sign in to add a comment