New issue
Advanced search Search tips

Issue 837047 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Sep 21
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

Clarify the "is_unexpected" field in the JSON test result format

Project Member Reported by robertma@chromium.org, Apr 26 2018

Issue description

https://chromium.googlesource.com/chromium/src/+/master/docs/testing/json_test_results_format.md says that "is_unexpected"

> If present and true, the failure was unexpected (a regression). If false (or if the key is not present at all), the failure was expected and will be ignored.

It doesn't cover a bunch of edge cases.

1. Unexpected passes currently also have is_unexpected set to true, which I think is sensible and should be documented.

2. Things are more complicated when we retry. For example, if a test is expected to pass but has a 'FAIL PASS' actual result, should is_unexpected be true? In  issue 822078 , I looked at recipe_modules' util.py, especially the logics related to "unexpected flake", and thought is_unexpected would be true in this case, but apparently that's not the case in blinkpy:
https://cs.chromium.org/chromium/src/third_party/blink/tools/blinkpy/web_tests/models/test_run_results.py?l=279&rcl=cb6928930fc1fb8e29aebc6dd24af31367719bfb

In layout tests, is_unexpected is set to true when *none* of the results is expected. The document is unclear about this edge case, but I think we should set is_unexpected to true when *any* result is unexpected, so that we can conveniently identify unexpected flakes, which will be of great help for FindIt.

Based on my understanding of the infra, my proposed change will make the webkit_layout_tests step to be orange more often on the bots because unexpected flakes are displayed as warnings, but won't fail the build. This will also make it easier for people to spot bad (flaky) layout tests.
 

Comment 1 by foolip@chromium.org, Apr 27 2018

 Issue 837050  has been merged into this issue.

Comment 2 by foolip@chromium.org, Apr 27 2018

 Issue 837052  has been merged into this issue.

Comment 3 by foolip@chromium.org, Apr 27 2018

Cc: foolip@chromium.org
 Issue 837245  has been merged into this issue.

Comment 4 by foolip@chromium.org, Apr 27 2018

Status: Available (was: Untriaged)
Oh sorry for the duplicates. Monorail had some database issues yesterday.
> 1. Unexpected passes currently also have is_unexpected set to true, 
> which I think is sensible and should be documented.

You're right, this should read "the result was unexpected" and it isn't necessarily a regression. We've discussed adding a separate "is_regression" field to disambiguate (see  bug 822078 ).

> 2. Things are more complicated when we retry.

The value corresponds to the result of the *last* attempt, so in the 'FAIL PASS' case is_unexpected should be false. This works only because we stop retrying after we get a passing result.

The is_unexpected result is not about detecting flakes. You can tell that something is a flake simply by seeing that it ran more than once.

Of course, if you are using --repeat-count or something like that to intentionally run a test more than once, a bunch of stuff in this format doesn't work well.

Components: Build
Labels: -Pri-3 Build-Tools-TYP Pri-1
Owner: dpranke@chromium.org
Status: Assigned (was: Available)
> The value corresponds to the result of the *last* attempt, so in the 'FAIL PASS' case is_unexpected should be false. This works only because we stop retrying after we get a passing result.

There are two reasons why I proposed to set is_unexpected to true when any of the results (attempts) fails:

1. This way we can easily tell "unexpected flakes" (pass after retry) apart from other passes. Besides, util.py already knows how to handle this case: https://cs.chromium.org/chromium/build/scripts/slave/recipe_modules/test_utils/util.py?rcl=546a13ec5959964fda7750bac483bc5efd891f85&l=116

2. In  issue 828605 , I'm adding results from all repeats/iterations to the "actual" field for FindIt. My proposed new definition of is_unexpected will also be more useful for them. (Besides, in the repeat/iteration use case, we don't stop after we get a passing result.)
Cc: seanmccullough@chromium.org
There are multiple different questions you might ask:

- was "overall" (== last) result unexpected?
- was any result unexpected?
- how many (%age) results were unexpected?

The current implementation answers the first question. Your proposed change answers the second question, but breaks the first one, i.e., it's not a backwards-compatible change.

That doesn't seem like a strict improvement to me. I prefer to optimize for the "simpler" case rather than the more complicated case, and I would prefer we keep compatibility where possible. It seems fine to me to ask the FindIt code to do more work if it's trying to do something more complicated.



Ping robertma@chromium.org. What's the status of this clarification?
I plan to land a CL updated the text today.
Status: Started (was: Assigned)
Project Member

Comment 14 by bugdroid1@chromium.org, Jun 12 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/b5f530d4fbf2083d511467906156a4891c38cf5c

commit b5f530d4fbf2083d511467906156a4891c38cf5c
Author: Dirk Pranke <dpranke@chromium.org>
Date: Tue Jun 12 22:51:32 2018

Update json_test_results_format docs.

This CL adds the `is_regression` per-test field (though it won't
yet be set by anything), and adds a bunch of text clarifying the
differences between failures, unexpected results, and regressions.

It also does some general editorial cleanup, fixing typos and
re-sorting some of the enumerated values.

BUG= 822078 ,  837047 

Change-Id: If78f6ef66c01d7c2924bfd98589832fe9299968f
Reviewed-on: https://chromium-review.googlesource.com/1096486
Commit-Queue: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Robert Ma <robertma@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Chan Li <chanli@chromium.org>
Reviewed-by: Stephen Martinis <martiniss@chromium.org>
Cr-Commit-Position: refs/heads/master@{#566615}
[modify] https://crrev.com/b5f530d4fbf2083d511467906156a4891c38cf5c/docs/testing/json_test_results_format.md

Project Member

Comment 15 by bugdroid1@chromium.org, Jun 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/53235652f36d355532c65674a91968cedc197e56

commit 53235652f36d355532c65674a91968cedc197e56
Author: Dirk Pranke <dpranke@chromium.org>
Date: Fri Jun 22 00:17:36 2018

Update json test results for webkit_layout_tests.

This brings the test results more into compliance with the
current test result standard in bit.ly/chromium-json-test-results-format.

Notable changes:
- If a test is run multiple times, we explicitly return every
  `actual` result. Previously, if the test produced the same result every
  time, we'd only return a single value for `actual`
- If a test is skipped unexpectedly, that will be considered a regression
  and an unexpected result.
- The test results will contain `is_unexpected`, `is_flaky`, and
  `is_regression` fields for the matching conditions.

Bug:  837047 ,  822078 
Change-Id: I4896e61469d3b576ea9e7dbbe16fac709f74b6b9
Reviewed-on: https://chromium-review.googlesource.com/1103611
Commit-Queue: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Robert Ma <robertma@chromium.org>
Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
Cr-Commit-Position: refs/heads/master@{#569466}
[modify] https://crrev.com/53235652f36d355532c65674a91968cedc197e56/third_party/blink/tools/blinkpy/web_tests/models/test_expectations.py
[modify] https://crrev.com/53235652f36d355532c65674a91968cedc197e56/third_party/blink/tools/blinkpy/web_tests/models/test_expectations_unittest.py
[modify] https://crrev.com/53235652f36d355532c65674a91968cedc197e56/third_party/blink/tools/blinkpy/web_tests/models/test_run_results.py
[modify] https://crrev.com/53235652f36d355532c65674a91968cedc197e56/third_party/blink/tools/blinkpy/web_tests/models/test_run_results_unittest.py
[modify] https://crrev.com/53235652f36d355532c65674a91968cedc197e56/third_party/blink/tools/blinkpy/web_tests/run_webkit_tests_unittest.py

Project Member

Comment 16 by bugdroid1@chromium.org, Jun 22 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/687768e979b009ac2bb90ebca5d4d1af6777b0d6

commit 687768e979b009ac2bb90ebca5d4d1af6777b0d6
Author: Dirk Pranke <dpranke@chromium.org>
Date: Fri Jun 22 18:27:07 2018

Revert "Update json test results for webkit_layout_tests."

This reverts commit 53235652f36d355532c65674a91968cedc197e56.

Reason for revert: Build is no longer reporting failures correctly.

Original change's description:
> Update json test results for webkit_layout_tests.
> 
> This brings the test results more into compliance with the
> current test result standard in bit.ly/chromium-json-test-results-format.
> 
> Notable changes:
> - If a test is run multiple times, we explicitly return every
>   `actual` result. Previously, if the test produced the same result every
>   time, we'd only return a single value for `actual`
> - If a test is skipped unexpectedly, that will be considered a regression
>   and an unexpected result.
> - The test results will contain `is_unexpected`, `is_flaky`, and
>   `is_regression` fields for the matching conditions.
> 
> Bug:  837047 ,  822078 
> Change-Id: I4896e61469d3b576ea9e7dbbe16fac709f74b6b9
> Reviewed-on: https://chromium-review.googlesource.com/1103611
> Commit-Queue: Dirk Pranke <dpranke@chromium.org>
> Reviewed-by: Robert Ma <robertma@chromium.org>
> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#569466}

TBR=qyearsley@chromium.org,dpranke@chromium.org,seanmccullough@chromium.org,robertma@chromium.org

Change-Id: Icf1882e8eea328b115a458afa6378b35bb11a638
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  837047 ,  822078 
Reviewed-on: https://chromium-review.googlesource.com/1112178
Reviewed-by: Dirk Pranke <dpranke@chromium.org>
Commit-Queue: Dirk Pranke <dpranke@chromium.org>
Cr-Commit-Position: refs/heads/master@{#569703}
[modify] https://crrev.com/687768e979b009ac2bb90ebca5d4d1af6777b0d6/third_party/blink/tools/blinkpy/web_tests/models/test_expectations.py
[modify] https://crrev.com/687768e979b009ac2bb90ebca5d4d1af6777b0d6/third_party/blink/tools/blinkpy/web_tests/models/test_expectations_unittest.py
[modify] https://crrev.com/687768e979b009ac2bb90ebca5d4d1af6777b0d6/third_party/blink/tools/blinkpy/web_tests/models/test_run_results.py
[modify] https://crrev.com/687768e979b009ac2bb90ebca5d4d1af6777b0d6/third_party/blink/tools/blinkpy/web_tests/models/test_run_results_unittest.py
[modify] https://crrev.com/687768e979b009ac2bb90ebca5d4d1af6777b0d6/third_party/blink/tools/blinkpy/web_tests/run_webkit_tests_unittest.py

Project Member

Comment 17 by bugdroid1@chromium.org, Jul 31

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/src.git/+/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3

commit 6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3
Author: Dirk Pranke <dpranke@chromium.org>
Date: Tue Jul 31 20:43:13 2018

Re-land "Update json test results for webkit_layout_tests."

This re-lands r569466 (reverting 569703) with the needed fixes.

This brings the test results more into compliance with the
current test result standard in bit.ly/chromium-json-test-results-format.

Notable changes:
 - If a test is run multiple times, we explicitly return every
   `actual` result. Previously, if the test produced the same result every
   time, we'd only return a single value for `actual`
 - If a test is skipped unexpectedly, that will be considered a regression
   and an unexpected result.
 - The test results will contain `is_unexpected`, `is_flaky`, and
   `is_regression` fields for the matching conditions.

Bug:  837047 ,  822078 
Change-Id: I98a80deb2e7e095fe471507a04e80b8b583ed5cc
Reviewed-on: https://chromium-review.googlesource.com/1154107
Commit-Queue: Dirk Pranke <dpranke@chromium.org>
Reviewed-by: Ned Nguyen <nednguyen@google.com>
Reviewed-by: Robert Ma <robertma@chromium.org>
Cr-Commit-Position: refs/heads/master@{#579556}
[modify] https://crrev.com/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3/third_party/blink/tools/blinkpy/web_tests/models/test_expectations.py
[modify] https://crrev.com/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3/third_party/blink/tools/blinkpy/web_tests/models/test_expectations_unittest.py
[modify] https://crrev.com/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3/third_party/blink/tools/blinkpy/web_tests/models/test_run_results.py
[modify] https://crrev.com/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3/third_party/blink/tools/blinkpy/web_tests/models/test_run_results_unittest.py
[modify] https://crrev.com/6f42c06285f026d8a9a9cf3e4c86c076cf05a5f3/third_party/blink/tools/blinkpy/web_tests/run_webkit_tests_unittest.py

Status: Fixed (was: Started)
I think this is fixed now.

Sign in to add a comment