
Issue 679742

Starred by 6 users

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Feature




Mark WPT sub-tests as expected failing, without being sensitive to minor message string diffs

Project Member Reported by rbyers@chromium.org, Jan 10 2017

Issue description

From discussion here: https://github.com/w3c/web-platform-tests/pull/4378

Rather than falling back to brittle text comparison, maybe it would be better if we could mark specific sub-tests as expected to fail?  Dunno.

 
Cc: domenic@chromium.org
Pros:
 - May reduce flakiness in any case where the test output text can change while the actual test results don't change.
Cons:
 - Changing the way we do things would take time and possibly increase complexity.

From domenic: "+1 on not using console output comparison, but instead comparing results from the testharness directly. I've had severe flakiness problems in the past with importing WPTs into Chrome due to that."

How Mozilla does it: https://developer.mozilla.org/en-US/docs/Mozilla/QA/web-platform-tests#Metadata_files
Expectations are kept in a metadata directory, with one expectation file per test, with all different platform-specific expectations in that file.

Given the way that Blink layout test baselines work, it's probably easier for us to do things the way we do things for layout tests - expectations are kept in baseline files next to tests, or in a platform-specific baseline directory.

One possible option would be to change our -expected.txt for testharness tests to -expected.json or similar. Although, I'm not convinced that it's worth it now.

Comment 2 by rbyers@chromium.org, Jan 11 2017

Would it work as a quick hack to just post-process the output (even with a simple grep as a short-term hack) so that it ONLY contains a list of FAIL lines for each sub-test that failed?  That way the only changes that would trigger a diff are changes to the sub-test name of failing subtests or an addition/subtraction from the set of subtests that are failing.  It looks like the Mozilla infrastructure would fail in exactly those cases too.
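A minimal sketch of that grep-style post-processing, assuming the text output puts each sub-test on its own line starting with "PASS " or "FAIL " (the sample output and the function name are made up for illustration, not taken from any CL):

```python
def failing_subtests(output_text):
    """Return only the FAIL lines from testharness text output, in order."""
    return [line for line in output_text.splitlines()
            if line.startswith("FAIL ")]


# Hypothetical sample output; the exact line format is an assumption.
sample = """This is a testharness.js-based test.
CONSOLE ERROR: line 12: Uncaught (in promise) TypeError: x is not a function
PASS first subtest
FAIL second subtest assert_equals: expected 1 but got 2
Harness: the test ran to completion.
"""

filtered = failing_subtests(sample)
print("\n".join(filtered))
```

Note that under this assumed format the assertion message is still part of each FAIL line, so a changed message would still produce a diff unless the message is stripped separately.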
Cc: delph...@chromium.org
We could just add a testRunner.setDumpConsoleMessages(false) to testharnessreport.js if the test path matches the imported WPT test directory. We then need to rebase all the expected.txt files to remove console logging.

I have this change here along with unrelated change that unclamps setTimeout: https://codereview.chromium.org/2587963002/

Just gone on paternity leave so won't be able to commit for a month if anyone wants to take this further.

Comment 4 by mek@chromium.org, Jan 12 2017

Cc: mek@chromium.org
Disabling console messages for imported WPT tests sounds good to me. However, I don't think the CL's way of detecting whether a test is an imported WPT test will work when wptserve is enabled: at least currently, the server's root in that case is the imported/wpt directory, so the script won't see that directory in its own path.
Does it sound like a bad idea to disable console messages for *all* testharness.js tests?

If this were done, then that would also solve a couple of other little problems:
 - Different/more console messages when testharness.js is updated in auto-import (see related bug 685854)
 - Flakiness when console messages in tests may contain non-deterministic strings in non-wpt testharness.js tests (e.g. bug 685778)

The main disadvantage would be that people writing tests could have error messages that they're not aware of; the main advantage would be that it would reduce flakiness and make import simpler.

Comment 6 by mek@chromium.org, Jan 27 2017

Sounds like a reasonable default to me. testharness.js tests that explicitly do want to test console output can still enable console messages. Not sure how many testharness.js based tests with -expected.txt files we currently have and so how much effort it would be to go through those to try to figure out if they have meaningful console output they're trying to verify?
Labels: -Pri-3 -OS-Chrome Pri-2
Owner: qyears...@chromium.org
Status: Assigned (was: Available)
Well, one way to check would be to try it out and see :-) I'll make a CL based on delphick@'s CL to try that.
Made a CL to try it out: https://codereview.chromium.org/2658093003

After making that change and rebaselining, that change involved:
  173 changed -expected.txt files
  55 deleted -expected.txt files
  115 console warning lines
  13157 console error lines (many of which were very repetitive)

tkent@ brought up a point in that CL:
> Console messages are helpful to find test errors.  Actually, testharness.js shows some console messages for wrong usage of the harness, and it's difficult to find such wrong usage without console messages.

My guess is that error messages are probably very helpful when developing tests, but maybe less helpful if nobody is looking at them.

I think the main concern about matching results based on console messages in testharness output was that it may increase flakiness -- it's more "brittle". Maybe the better policy is to still have a default of printing console messages, and if in some particular case we don't want console messages (because they are non-deterministic, for example) then we can disable these messages by adding `testRunner.setDumpConsoleMessages(false)` at the top of the test, with an explanation.
A quick thought: One thing that would solve our problem is if baseline text matching of testharness.js test results was handled differently than other text baselines -- if there was specific logic to look only at PASS/FAIL and test name, and ignore everything else, then we could keep the console error messages, but still allow test results to match even if some text is different.
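One way to picture that matching logic, as a rough sketch only: it assumes sub-test results are already available as structured records rather than raw text, and SubtestResult and results_match are hypothetical names, not existing webkitpy APIs.

```python
from typing import NamedTuple


class SubtestResult(NamedTuple):
    status: str   # "PASS" or "FAIL"
    name: str     # sub-test name
    message: str  # assertion text; deliberately ignored when comparing


def results_match(expected, actual):
    """Compare two result lists on (status, name) only, ignoring messages and order."""
    def key(result):
        return (result.status, result.name)
    return sorted(map(key, expected)) == sorted(map(key, actual))


baseline = [SubtestResult("PASS", "first", ""),
            SubtestResult("FAIL", "second", "expected 1 but got 2")]
# Same outcome, different failure message: still counts as a match.
same_outcome = [SubtestResult("PASS", "first", ""),
                SubtestResult("FAIL", "second", "expected 1 but got 3")]
# A sub-test changed status: no longer a match.
regressed = [SubtestResult("FAIL", "first", "boom"),
             SubtestResult("FAIL", "second", "expected 1 but got 2")]

print(results_match(baseline, same_outcome))
print(results_match(baseline, regressed))
```

Console lines and other free text never enter the comparison here, so they could stay in the output for debugging without affecting the result.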
Cc: tkent@chromium.org
The problem here was console messages with unreliable timing, whereas most console messages have reliable timing and are useful for debugging failures and for verifying that deprecation messages in fact are logged, as tkent@ pointed out in https://codereview.chromium.org/2658093003#msg6.

In https://github.com/w3c/web-platform-tests/pull/4378 it seems to me that the problem is that it's not clear whether the promise is resolved or rejected. If that were clear, presumably nobody would object to requiring the expected behavior or at least failing on the unexpected behavior?

Do we suspect we will have cases where the spec is very clear, but the timing is non-deterministic, and we won't want to tinker with the tests to avoid the unhandled promise rejection console message?
In https://codereview.chromium.org/2677763002/ a related issue came up -- how do we triage failing subtests and link them to bugs?

We use TestExpectations to link failures to bugs in other cases, could we perhaps do something like this?

 https://crbug.com/1234  external/wpt/a/test.html#Test title [ Failure ]
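For illustration, a line in that proposed form could be parsed roughly as follows. This is a sketch of the suggested syntax only: TestExpectations does not support a "#subtest" suffix as far as this thread states, and the regex and function name are assumptions.

```python
import re

# Hypothetical grammar: <bug> <test path>#<subtest title> [ <expectations> ]
SUBTEST_LINE = re.compile(
    r"^(?P<bug>\S+)\s+(?P<path>[^#\s]+)#(?P<subtest>.*?)\s+"
    r"\[\s*(?P<expectations>[^\]]*?)\s*\]$"
)


def parse_subtest_expectation(line):
    """Split a proposed sub-test expectation line into its parts, or return None."""
    match = SUBTEST_LINE.match(line.strip())
    if match is None:
        return None
    parts = match.groupdict()
    parts["expectations"] = parts["expectations"].split()
    return parts


parsed = parse_subtest_expectation(
    "https://crbug.com/1234  external/wpt/a/test.html#Test title [ Failure ]")
print(parsed)
```

Because sub-test titles may contain spaces, the sketch treats everything between "#" and the bracketed expectation list as the title.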
> Do we suspect we will have cases where the spec is very clear, but the timing is non-deterministic, and we won't want to tinker with the tests to avoid the unhandled promise rejection console message?

In the streams tests, there is a lot of nondeterminism (in a clear spec). The tests are written so that there are no unhandled rejections when they pass, but for a while, we were failing several of them as we lagged the spec. This issue meant that it was impossible to include test expectations for those tests, because the position of the unhandled rejection message kept bouncing around, so no expected.txt that we committed would ever consistently match. Thus we were unable to use the imported WPTs, and had to maintain our own in-tree copy with the failing tests commented out.

In general it just seems very problematic that we're using WPTs but our tooling is verifying things that are not verified in other engines or as part of the official runner. That can only lead to extra, spurious failures.
OK, so clearly https://github.com/w3c/web-platform-tests/pull/4378 was not an isolated case then.

To simply omit console output would probably make it harder to understand why something has failed on the bots, so that's not great. Tracking sub-tests only with something like #12 and having no -expected.txt files at all seems feasible, if in the case of failure one could still see the whole output as we currently see it. qyearsley@, any hunches about how tricky that might be?

Comment 15 by hta@chromium.org, Feb 7 2017

My two cents:
Console output is very valuable, and essential in debugging. But it's lousy for detecting pass/fail cases. So we shouldn't touch it while working on pass/fail detection.

If we can extend the approach from TestExpectations, where a single entry tells you the name of the test that's failing and the bug that's filed for getting it fixed, that would seem like the lowest possible overhead route to me.

Could we push the console output out of the test output and into stderr, which doesn't gate whether a test passes?
Owner: ----
Status: Available (was: Assigned)
If it's possible to output console messages (and other text besides subtest names and PASS/FAIL) to stderr, that sounds like it would be a good solution!

The idea of extending TestExpectations to allow for subtest-level-granularity sounds good, although my feeling now is that may not be simple to implement.
I share your suspicion. Because one can only know which subtests exist by running the test, removing entries from TestExpectations for subtests that no longer exist seems trickier than when we track only files.
Cc: jochen@chromium.org
Issue 685778 has been merged into this issue.

Comment 20 by mkwst@chromium.org, Feb 16 2017

Cc: mkwst@chromium.org
If we decide that we don't care about console messages when determining whether or not a test is passing, perhaps we could simply remove the check at https://cs.chromium.org/chromium/src/third_party/WebKit/Tools/Scripts/webkitpy/layout_tests/models/testharness_results.py?rcl=be2c7115703a6610b25128809eb388ac28d07285&l=15? Then we could keep the data in the output, but not treat it as failing the test.
That would be pretty straightforward, but it would presumably leave console messages in the -expected.txt files that are no longer emitted, so that run-webkit-tests --reset-results would appear to cause unrelated changes? Maybe that kind of confusion is better than what we have, though?
It's already confusing now, as the order of console messages is not necessarily deterministic, and you end up having to skip the tests anyway.

Comment 24 by mkwst@chromium.org, Feb 16 2017

foolip@: You wouldn't need the `-expected.txt` file if we stopped treating console warnings/errors as failing the tests. I guess I don't understand where the "unrelated changes" would come from in that scenario.
For tests that have some failing subtest, a -expected.txt file is still the only concrete proposal for handling that. We could omit console output entirely, which might be fine, but if it is included but ignored, then presumably we would land changes that change the console output but don't update the -expected.txt files. The next person to touch it would see some unrelated changes.
Crazy idea: add some methods to content_shell so it understands whether a testharness-based test passed or failed.


That bit is probably not too hard, at worst a matter of parsing the existing output, but how to mark a subtest as failing if we don't have the -expected.txt files? I had a suggestion in https://bugs.chromium.org/p/chromium/issues/detail?id=679742#c12 but think it'd be frustrating having to mint the URL by hand when ignoring subtests.
In https://codereview.chromium.org/2678043002, I filtered all of the test output to remove console log messages and push them onto stderr. Currently it only applies to WPT, since there are tests outside WPT that are deliberately looking for console output.

If there are tests outside of WPT that also want to ignore console output we have two choices:

1) Add testRunner.setDumpConsoleMessages(false); at the start of the test.
2) Adapt the console filtering code from my change to detect a string written to console that says don't write console messages to the test output.

The advantage of (2) is that you still get the console messages for debugging purposes.

A final 3rd approach might be to turn on console filtering for all tests and selectively disable it for tests like https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/plugins/window-open.html which need the console output. Might be a lot of effort to determine what tests to label in this way.
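The filtering idea can be sketched in a few lines of Python. This is only an illustration of the approach, not the code in the CL (which changes content_shell); it assumes console lines in the text output start with "CONSOLE ", and the function names are hypothetical.

```python
import io
import sys

CONSOLE_PREFIX = "CONSOLE "  # assumed prefix of console lines in the text output


def split_console_lines(output_text):
    """Separate console lines from the lines that should be compared."""
    kept, console = [], []
    for line in output_text.splitlines():
        (console if line.startswith(CONSOLE_PREFIX) else kept).append(line)
    return "\n".join(kept), console


def emit(output_text, err=sys.stderr):
    """Return the comparable output and write console lines to err instead."""
    result_text, console = split_console_lines(output_text)
    for line in console:
        print(line, file=err)
    return result_text


# Demo with an in-memory stream standing in for stderr.
fake_stderr = io.StringIO()
result = emit("CONSOLE WARNING: deprecated\nPASS a\nFAIL b oops\n", err=fake_stderr)
print(result)
```

The console text is still available for debugging on the error stream, but it no longer takes part in the baseline comparison.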

Comment 29 by bugdroid1@chromium.org, Mar 13 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/f8d3488e588458268f165f6e21e2e86bff0b9b6d

commit f8d3488e588458268f165f6e21e2e86bff0b9b6d
Author: delphick <delphick@chromium.org>
Date: Mon Mar 13 13:29:31 2017

Hide console log messages for imported WPT tests

Move console output messages for WPT tests from the test output file into stderr so they no longer affect whether the test passes or fails but can still be used when debugging a test. Remove all CONSOLE lines from external/wpt/**/*-expected.txt.

Update tests in following directories to add a console.warn statement to prevent the presubmit from trying to remove the -expected.txt file since these tests rely on console messages to determine whether they pass/fail:

third_party/WebKit/LayoutTests/csspaint (logs are from worklets which have no other way to communicate the test status)

Discussion on console messages in testharness tests can be found here:
https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/3Ob2lq14KEs

BUG=679742

Review-Url: https://codereview.chromium.org/2678043002
Cr-Commit-Position: refs/heads/master@{#456362}

[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/browser/layout_test/blink_test_controller.cc
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/browser/layout_test/blink_test_controller.h
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/common/shell_messages.h
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/renderer/layout_test/blink_test_runner.cc
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/renderer/layout_test/blink_test_runner.h
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/test_runner/test_interfaces.cc
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/test_runner/web_frame_test_client.cc
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/content/shell/test_runner/web_test_delegate.h
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/FileAPI/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/WebIDL/current-realm-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/dom/events/EventTarget-dispatchEvent-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/dom/interfaces-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/dom/ranges/Range-mutations-dataChange-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/dom/ranges/Range-mutations-splitText-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/browsers/browsing-the-web/navigating-across-documents/012-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/browsers/browsing-the-web/unloading-documents/001-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/browsers/offline/application-cache-api/api_update-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/dom/reflection-embedded-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/allow-scripts-flag-changing-1-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/allow-scripts-flag-changing-2-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/dynamic-append-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/moving-documents-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/semantics/embedded-content/media-elements/interfaces/TextTrack/mode-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/semantics/embedded-content/media-elements/loading-the-media-resource/resource-selection-invoke-set-src-networkState-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/semantics/embedded-content/the-img-element/update-the-source-set-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/semantics/forms/the-input-element/date-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/forms/the-input-element/type-change-state-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/scripting-1/the-script-element/module/imports-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/scripting-1/the-script-element/nomodule-set-on-external-module-script-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/semantics/scripting-1/the-script-element/nomodule-set-on-inline-module-script-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/syntax/parsing/html5lib_innerHTML_adoption01-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/syntax/parsing/html5lib_tests11-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/idle-callbacks/callback-suspended-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/events/inline-event-handler-ordering-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/events/invalid-uncompiled-raw-handler-compiled-late-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/events/invalid-uncompiled-raw-handler-compiled-once-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/addEventListener-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/body-onerror-compile-error-data-url-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/body-onerror-compile-error-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/body-onerror-runtime-error-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-cross-origin-setInterval-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-cross-origin-setTimeout-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-in-body-onerror-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-in-setInterval-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-in-setTimeout-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/compile-error-same-origin-with-hash-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/runtime-error-cross-origin-setInterval-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/runtime-error-cross-origin-setTimeout-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/runtime-error-in-setInterval-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/runtime-error-in-setTimeout-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/runtime-error-same-origin-with-hash-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/window-onerror-with-cross-frame-event-listeners-1-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/html/webappapis/scripting/processing-model-2/window-onerror-with-cross-frame-event-listeners-2-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/media-capabilities/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/mediasession/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/pointerevents/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-00-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-04-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-12-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-16-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-20-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-24-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-28-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-32-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/selection/addRange-36-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/selection/removeAllRanges-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/service-workers/cache-storage/window/cache-add.https-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/service-workers/cache-storage/worker/cache-add.https-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/service-workers/service-worker/fetch-request-fallback.https-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/service-workers/service-worker/fetch-request-redirect.https-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/service-workers/service-worker/fetch-request-resources.https-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/service-workers/service-worker/websocket.https-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/streams/piping/multiple-propagation-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/streams/piping/multiple-propagation.dedicatedworker-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/streams/piping/pipe-through-expected.txt
[delete] https://crrev.com/2ac5618843f8e7b3044b9931c571d45aaf2915ca/third_party/WebKit/LayoutTests/external/wpt/streams/piping/pipe-through.dedicatedworker-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/svg/interfaces-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/web-animations/interfaces/Animatable/animate-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/web-animations/interfaces/KeyframeEffect/constructor-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webrtc/rtcpeerconnection/rtcpeerconnection-constructor-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webrtc/rtcpeerconnection/rtcpeerconnection-idl-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webrtc/simplecall-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webvtt/api/VTTCue/align-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webvtt/api/VTTCue/line-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webvtt/api/VTTCue/snapToLines-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webvtt/api/VTTCue/text-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/external/wpt/webvtt/api/VTTCue/vertical-expected.txt
[add] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/harness-tests/console_logging-expected.txt
[add] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/harness-tests/console_logging.html
[add] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/harness-tests/wpt/README.txt
[add] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/harness-tests/wpt/console_logging-expected.txt
[add] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/harness-tests/wpt/console_logging.html
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/linux/external/wpt/html/browsers/history/the-location-interface/location-protocol-setter-non-broken-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/linux/external/wpt/html/browsers/history/the-location-interface/location-protocol-setter-non-broken-weird-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/dom/interfaces-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/dynamic-append-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/pointerevents/extension/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-00-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-04-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-12-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-16-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-20-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-24-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-28-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-32-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/selection/addRange-36-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/web-animations/interfaces/Animatable/animate-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/web-animations/interfaces/KeyframeEffect/constructor-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/api/VTTCue/align-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/api/VTTCue/text-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/api/VTTCue/vertical-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/parsing/cue-text-parsing/tests/entities-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/parsing/cue-text-parsing/tests/tags-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/parsing/cue-text-parsing/tests/timestamps-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-mac10.11/external/wpt/webvtt/parsing/cue-text-parsing/tests/tree-building-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/dom/interfaces-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/html/semantics/document-metadata/the-meta-element/pragma-directives/attr-meta-http-equiv-refresh/dynamic-append-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/pointerevents/extension/idlharness-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-00-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-04-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-12-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-16-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-20-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-24-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-28-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-32-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/selection/addRange-36-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/web-animations/interfaces/Animatable/animate-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/web-animations/interfaces/KeyframeEffect/constructor-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/webvtt/api/VTTCue/align-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/webvtt/api/VTTCue/text-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/webvtt/api/VTTCue/vertical-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/webvtt/parsing/cue-text-parsing/tests/entities-expected.txt
[modify] https://crrev.com/f8d3488e588458268f165f6e21e2e86bff0b9b6d/third_party/WebKit/LayoutTests/platform/mac-retina/external/wpt/webvtt/parsing/cue-text-parsing/te
I'd like to revive this conversation --

I believe the above change (hiding console log messages for imported WPT tests) reduced flakiness for web-platform-tests by removing some text that shouldn't affect whether the test passes or fails, but I think it didn't go far enough yet.

In other contexts where wpt tests are run (e.g. wptrunner, Mozilla's metadata files), neither the message that goes with a failure nor the order of the subtests is an essential part of the result.

Ideally, in order to reduce flakiness, we'd want to ignore failure text completely, but this would make it harder to debug failures. Following the example in the above change, we could print failure messages to stderr, and remove them from stdout.

For example, the result might just look like:

  PASS a-subtest
  FAIL b-subtest
  FAIL c-subtest

and the stderr may contain:

  FAIL c-subtest: another detailed error message
  FAIL b-subtest: detailed error message, potentially containing times or generated strings
  CONSOLE LOG: Some logged console message

When debugging, one would check the stderr, but when running tests and comparing to baselines, we would check only the essential information.
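
As a rough sketch of that split (not existing webkitpy code; `SubtestResult` and `split_output` are hypothetical names, and it assumes the harness results are already available as structured status/name/message records), the stable text and the debugging text could be produced separately like this:

```python
from typing import Iterable, NamedTuple, Optional, Tuple


class SubtestResult(NamedTuple):
    """Hypothetical record for one testharness subtest result."""
    status: str                    # e.g. "PASS" or "FAIL"
    name: str                      # subtest name
    message: Optional[str] = None  # failure detail; may vary between runs


def split_output(results: Iterable[SubtestResult]) -> Tuple[str, str]:
    """Return (baseline_text, debug_text).

    baseline_text holds only status and subtest name, so it stays the same
    when a failure message changes. debug_text keeps the full messages and
    is meant for stderr.
    """
    baseline_lines = []
    debug_lines = []
    for result in results:
        baseline_lines.append(f"{result.status} {result.name}\n")
        if result.message:
            debug_lines.append(
                f"{result.status} {result.name}: {result.message}\n")
    return "".join(baseline_lines), "".join(debug_lines)
```

With this shape, two runs whose failure messages differ only in a timestamp or a path would still produce identical baseline text.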

How does that sound?
That sounds tentatively good, but since it's never affected me personally, do you have specific examples of flaky tests that such a change would help?

I've been pretty happy with things since console output has been removed, BTW, so kudos for that :).
Blocking: 701234
In the recent WPT auto-imports that failed (http://crrev.com/c/479211, http://crrev.com/c/477230, http://crrev.com/c/476311) there were a bunch of such tests, especially in XMLHttpRequest.

Example test results:
 - https://storage.googleapis.com/chromium-layout-test-archives/linux_chromium_rel_ng/431670/layout-test-results/results.html
 - https://storage.googleapis.com/chromium-layout-test-archives/linux_chromium_rel_ng/430508/layout-test-results/results.html

Some of these tests have failure messages with paths, which are different on different machines, but consistent on the same machine for multiple re-runs, e.g.:

  FAIL XMLHttpRequest: anonymous mode unsupported assert_equals: The deprecated anonymous:true should be ignored, cookie sent anyway expected "Cookie: test=anonymous-mode-unsupported\n" but got "{\"error\": {\"message\": \"Traceback (most recent call last):\\n  File \\"/b/s/w/ir/third_party/WebKit/Tools/Scripts/webkitpy/thirdparty/wpt/wpt/tools/wptserve/wptserve/handlers.py\\", line 246, ...

Some other tests have times, e.g.:

  FAIL lastModified set to related HTTP header if provided assert_equals: expected 1492107871000 but got 1492081416000

https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/TestExpectations?l=1893

Example related bugs: bug 705490, bug 711493
#30 SGTM. I know how to find the stderr when looking at results from the bots, but some instruction about how to see it when running tests locally would also be good. I could never figure out if it's possible except by looking at the HTML report.

If the change affects only external/wpt, then https://chromium.googlesource.com/chromium/src/+/master/docs/testing/web_platform_tests.md would be a good place to explain the behavior and how to debug.
You can just run your tests with --driver-logging. The stderr output has an ERR: prefix.

I had no idea about that. qyearsley@, if you go ahead with the final changes for this issue, can you point to that in the documentation?
Definitely! --driver-logging is useful to know about when debugging layout tests.
Components: Blink>Infra>Ecosystem
Components: -Blink>Infra>Predictability
Project Member

Comment 40 by bugdroid1@chromium.org, Aug 4 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/21f578129b2b47b16460be10b2fe84670a826536

commit 21f578129b2b47b16460be10b2fe84670a826536
Author: Philip Jägenstedt <foolip@chromium.org>
Date: Fri Aug 04 10:00:11 2017

Enable tests expected to fail/timeout but passing in wpt

Based on the "Expected to timeout, but passed" and "Expected to fail,
but passed" sections of a full run of LayoutTests/external/wpt on Linux
Debug.

Bug: 310004, 679742, 698256
Change-Id: Ifc5966c6e8d611b2deb67c56bf2fb06fe35e71d9
Reviewed-on: https://chromium-review.googlesource.com/599728
Reviewed-by: Raphael Kubo da Costa (rakuco) <raphael.kubo.da.costa@intel.com>
Commit-Queue: Philip Jägenstedt <foolip@chromium.org>
Cr-Commit-Position: refs/heads/master@{#491979}
[modify] https://crrev.com/21f578129b2b47b16460be10b2fe84670a826536/third_party/WebKit/LayoutTests/TestExpectations

Cc: -qyears...@chromium.org
Owner: qyears...@chromium.org
Status: Started (was: Available)
Summary: Marking WPT sub-tests as expected failing (without failing due to minor message string differences) (was: Consider some mechanism for marking WPT sub-tests as expected failing)
Summary: Mark WPT sub-tests as expected failing, without being sensitive to minor message string diffs (was: Marking WPT sub-tests as expected failing (without failing due to minor message string differences))
Blocking: -701234
Labels: -Type-Bug -Pri-2 Pri-3 Type-Feature
So, after considering stripping error messages from testharness output for wpt, I'm not sure if that's the right thing to do since it might make debugging more difficult.

Also, the number of tests that have nondeterministic error message strings is fairly small. In particular, the tests with such problems are currently:

  Bug 705490 external/wpt/fetch/api/request/...
  Bug 679742 external/wpt/mediacapture-streams/...
  Bug 711493 external/wpt/XMLHttpRequest/...
Owner: ----
Status: Available (was: Started)
Project Member

Comment 45 by sheriffbot@chromium.org, Aug 24

Labels: Hotlist-Recharge-Cold
Status: Untriaged (was: Available)
This issue has been Available for over a year. If it's no longer important or seems unlikely to be fixed, please consider closing it out. If it is important, please re-triage the issue.

Sorry for the inconvenience if the bug really should have been left as Available.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Cc: robertma@chromium.org
Status: Available (was: Untriaged)
This would still be useful. As it is, the subtest order and failure messages must be stable, which is an extra constraint that doesn't exist in upstream wpt or in Gecko's infrastructure. This created trouble for me today in issue 888470.
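
To illustrate the looser comparison, here is a minimal sketch (hypothetical helper names, not an existing Chromium or wptrunner API) that treats a result as a mapping from subtest name to status, so neither subtest order nor failure message text can cause a diff. It assumes subtest names are unique within a test:

```python
from typing import Dict, Iterable, List, Tuple


def summarize(results: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map subtest name -> status from (status, name) pairs.

    Order is discarded. Assumes subtest names are unique within a test;
    a duplicate name would overwrite the earlier entry.
    """
    return {name: status for status, name in results}


def diff_results(expected: Iterable[Tuple[str, str]],
                 actual: Iterable[Tuple[str, str]]) -> List[str]:
    """List subtests whose status differs or that were added or removed."""
    exp = summarize(expected)
    act = summarize(actual)
    changes = []
    for name in sorted(exp.keys() | act.keys()):
        if exp.get(name) != act.get(name):
            changes.append(
                f"{name}: expected {exp.get(name, 'MISSING')}, "
                f"got {act.get(name, 'MISSING')}")
    return changes
```

An empty list means the run matches expectations; the only things that register as changes are a status flip or a subtest appearing or disappearing.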
