webkit-patch rebaseline-cl removes lines with [ Failure Pass ] or [ Skip ] |
||||
Issue descriptionSteps to reproduce: 1. Create a local branch 2. git cl patch --force https://chromium-review.googlesource.com/c/chromium/src/+/664242 3. webkit-patch rebaseline-cl Expected result: For tests marked Skip or flaky on some platforms, we should rebaseline them on other platforms, and leave the TestExpectations lines as-is. Actual result: The script removes lines containing [ Failure Pass ] or [ Skip ], etc. from TestExpectations: $ git diff LayoutTests/TestExpectations --- a/third_party/WebKit/LayoutTests/TestExpectations +++ b/third_party/WebKit/LayoutTests/TestExpectations @@ -2915,7 +2915,6 @@ crbug.com/736177 [ Mac10.12 ] css2.1/t1202-counter-04-b.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] css2.1/t1202-counters-04-b.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] fast/block/basic/001.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] fast/block/float/intruding-painted-twice.html [ Failure Pass ] - crbug.com/736177 [ Mac10.12 ] fast/borders/inline-mask-overlay-image-outset.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] fast/css-generated-content/012.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] fast/css-generated-content/014.html [ Failure Pass ] crbug.com/736177 [ Mac10.12 ] fast/css/acid2-pixel.html [ Failure Pass ] @@ -3484,7 +3483,6 @@ crbug.com/757165 [ Win ] http/tests/inspector/extensions/extensions-panel.html [ crbug.com/757165 [ Win ] http/tests/inspector/sources/ui-source-code-metadata.html [ Skip ] crbug.com/757165 [ Win ] http/tests/misc/client-hints-accept-meta-preloader.html [ Skip ] crbug.com/757165 [ Win ] inspector-protocol/debugger/debugger-evaluate-in-worker-while-pause-in-page.js [ Skip ] -crbug.com/757165 [ Win ] paint/invalidation/filter-repaint-accelerated-child-with-filter-child.html [ Skip ] crbug.com/757165 [ Win ] paint/invalidation/filter-repaint-on-accelerated-layer.html [ Skip ] crbug.com/757165 [ Win ] paint/invalidation/scrollbar-damage-and-full-viewport-repaint.html [ Skip ] crbug.com/757165 [ Win ] virtual/gpu/fast/canvas/canvas-filter-modified-save-restore.html [ Skip ]
,
Sep 14 2017
#c1 SGTM. Just one comment about Skip: perhaps it might be better to assume that the test produces the same new result on the specified platforms as on the fallback platforms. This will reduce extra platform baselines. For example, for paint/invalidation/filter-repaint-accelerated-child-with-filter-child.html, we have only the common baseline, but rebaseline-cl will create new baselines for linux and mac by assuming windows still produces the old result. If we assumed that windows also produce the same new result, we would update the common baseline only. The script can handle this in the same way as it fills in missing results for builders without result.
,
Dec 18 2017
robertma@, have you observed this to be a problem for wpt import?
,
Dec 19 2017
I didn't realize rebaseline-cl would also rebaseline tests marked as flaky... That's a bit counter-intuitive. But I don't remember seeing examples of this. . As for skipped tests, I'm confused why they'd be rebaselined. IIUC, rebaseline-cl downloads results from the try jobs; and since the tests are skipped, they wouldn't have results in the first place. What do I miss? I'll double check the code and the repro case.
,
Dec 20 2017
I'm not sure, but I think the complicating issue here is: what happens when a test is expected flaky or skipped on some platforms, and we want to rebaseline for the other platforms, e.g. marked flaky on mac, skipped on android, and we want to update baselines for linux and win. In theory, there is logic right now in rebaseline.py to handle rebaselining some platforms but not others: https://cs.chromium.org/chromium/src/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline.py?l=383 In AbstractParallelRebaselineCommand.rebaseline, the platforms to rebaseline are kept inside the TestBaselineSet object that's passed in, and the lines/platforms to update in the TestExpectations file should be tracked in the lines_to_remove variable. I haven't yet tried to reproduce this so I don't know what the actual behavior is :-/ Looking back on rebaseline.py I see that there were still more refactoring changes that I think should be made, and the updating of the test expectations files could possibly be simplified...
,
Jan 4 2018
,
Jan 4 2018
Starting to work on this as a P1 as it blocks wpt-import.
,
Jan 5 2018
I plan to fix the wrong update of expectations first, i.e., stop removing flaky or skipped expectations incorrectly. As for optimizing the baselines, it'd be a bit more trickier. Though we do have logic for filling in missing results, it currently works on the level of builds, i.e. if a whole build is missing (did not complete), another build is selected to fill in that port. Yet in this case, the skipped platform will have complete build, but the skipped test cannot be found in the build. The fill-missing logic has to be modified to work on individual test level. It probably won't be too hard, but I'd like to separate the tasks: fix first, and then optimize.
,
Jan 5 2018
Besides, regarding optimization, optimize-baseline might be in the better position. There, we can pretend skipped platforms have some kind of "wildcard baselines" that match everything, and then optimize-baseline can exercise its de-dupe algorithm. This also has the extra benefit of always achieving the optimal result, which cannot be guaranteed with the fill-missing approach (if baselines on other non-skipped ports are not all the same).
,
Jan 9 2018
https://crrev.com/c/857600 for the fix
,
Jan 10 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/78fefdf15bd63c0ef3c462ce97d2e86f2997d107 commit 78fefdf15bd63c0ef3c462ce97d2e86f2997d107 Author: Robert Ma <robertma@chromium.org> Date: Wed Jan 10 23:16:16 2018 Stop removing flaky or skip expectations when rebaselining Previously, `wekbit-patch rebaseline{-*}` would remove an expectation if no baseline suffixes could be found for a test run, which was intended to remove failure expectations for passing tests. (Behaviour was introduced at 04d907f857afaabad4705d8b9c4919950d238070). However, there were two problems in using the triggering condition: 1. Only failing_results.json was fetched, which does not include entries for skipped tests, so skipped tests might be incorrectly considered as unexpected passes and the skip expectations would be removed. 2. Flaky expectations were not honored. A passing result of a flaky test would cause its expectation to be removed. Hence, we should 1) fetch full results; 2) rely on the "is_unexpected" field in the results. In addition, 3) a pre-caution is added to check the port name of the builder, and only proceed to update expectations if the builder's port is same as the port we are rebaselining. This is necessary because rebaseline-cl can fill in missing results of a port with other builders, and we don't want to update expectations in such case. Unit tests are also overhauled and reorganized. Bug: 765273 Change-Id: I2631a7244e2c95a33ab0ef2322d24dfc946899f3 Reviewed-on: https://chromium-review.googlesource.com/857600 Commit-Queue: Robert Ma <robertma@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> Cr-Commit-Position: refs/heads/master@{#528467} [modify] https://crrev.com/78fefdf15bd63c0ef3c462ce97d2e86f2997d107/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline.py [modify] https://crrev.com/78fefdf15bd63c0ef3c462ce97d2e86f2997d107/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline_cl.py [modify] https://crrev.com/78fefdf15bd63c0ef3c462ce97d2e86f2997d107/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline_test_unittest.py [modify] https://crrev.com/78fefdf15bd63c0ef3c462ce97d2e86f2997d107/third_party/WebKit/Tools/Scripts/webkitpy/tool/commands/rebaseline_unittest.py
,
Jan 11 2018
The CL above should provide correctness. The optimization issue is tracked in a new issue 801287, which is a desired feature, but not an urgent bug. Closing this bug as FIXED. |
||||
►
Sign in to add a comment |
||||
Comment 1 by qyears...@chromium.org
, Sep 14 2017Status: Available (was: Untriaged)