Android try bot for rebaseline-cl seems to incorrectly pass layout tests. |
|||
Issue descriptionI'm running rebaseline-cl on https://codereview.chromium.org/2820743003/ patchset 6. The patch changes how we paint underlines, so I'd expect anywhere from 20 - 150 tests with modified images on every platform, including Android (which should be similar to Linux). For some reason, the android_blink_rel build is green and shows all layout tests passing, even though every other platform fails as I expected. https://build.chromium.org/p/tryserver.chromium.android/builders/android_blink_rel/builds/2310 I'd expect it to look like linux results: https://build.chromium.org/p/tryserver.blink/builders/linux_trusty_blink_rel/builds/8111 I can use --fill-missing but I'm worried that there may be some actual Android specific diffs, which IIRC would mean that while I would clear the CQ, my patch would make the Android build go red, and I'd end up either having to revert or figure out how to manually update baselines or redo those with the rebaseline bot subsequently. Also, I did experiment with --fill-missing via: 1. Ran rebaseline-cl normally, it fetched and updated ~100-ish expectations but logging noted android results were missing. 2. Ran it again with --fill-missing, expecting to see linux/android baselines merged in some manner. What actually happened was another ~50 - 100 updates, where it looked liked it collapsed the win7-specific subdirs down into just 'win'. I'm not sure whether the above was actually incorrect. Figured worth filing bug to hear thoughts, I will look at more tomorrow.
,
Apr 18 2017
The reason why Android passes here is because it only runs the smoke tests (https://cs.chromium.org/chromium/src/third_party/WebKit/LayoutTests/SmokeTests), and the set of smoke tests has no overlap with the set of failing tests for that change. That is, that change almost certainly affects results on Android, but we only run a very small number of tests on Android, so we don't discover the results that are different. I'm surprised that it said that it said that android results were missing though, since they should be available (https://storage.googleapis.com/chromium-layout-test-archives/android_blink_rel/2310/layout-test-results/results.html). I expect that when you run rebaseline-cl it should log something like: Got 404 response from: https://storage.googleapis.com/chromium-layout-test-archives/android_blink_rel/2310/layout-test-results/failing_results.json if it failed to get results. Do you see that?
,
Apr 18 2017
Yes I saw something like what you said. Ok, hmm. What is my best next step? How do we produce, fetch and integrate Android changes?
,
Apr 18 2017
BTW I will re-run script tonight/tomorrow and attach specific output so we have it for record.
,
Apr 18 2017
Well, the status quo is that we're not covering Android currently, so for just that particular change, we could go ahead and ignore android for now. In order to actually add more coverage in Android, we just need to add more tests to SmokeTests ( bug 412732 ). AFAIK there's no good reason why we didn't do this before; this would be done by https://chromium-review.googlesource.com/c/481042/. You could add tests to SmokeTests yourself for that change in order to get android_blink_rel to run the tests -- or you could go ahead for now with rebaselining for other platforms, and when that change above is submitted then the relevant tests should be covered.
,
Apr 19 2017
Ah, thanks. I thought we were running most/all paint-related layout tests on Android builder on waterfall, but it sounds like if we were, perhaps we aren't currently? If we enable some we just need to make sure there is a trybot people can add to their change to see and somehow resolve failures before landing. We had an issue with this in past, see http://crbug.com/512856. And then I would have thought the new trybot added for use with rebaseline-cl would be another potential source but that one also seems to only run SmokeTests. I am actually ok with the current status quo for now. Using more Android bot resources and increasing time to run, etc. doesn't seem warranted unless we're actually breaking functionality or burning dev time for some reason without enhancing smoke tests. +pdr as he may have more background/opinion on layout tests on Android.
,
Apr 21 2017
Yep, so the current status is that android_blink_rel also runs only smoke tests, and currently if you want to try your change on android on tests that aren't in smoke tests, you have to include a change to smoke tests as part of the patch. No new bot was added for rebaseline-cl for android, we just use android_blink_rel. Update: I've now committed https://chromium-review.googlesource.com/c/481042/ which increases the number of smoke tests to about 1000, including many of the tests with different results in https://codereview.chromium.org/2820743003/. For example, for fast/text/international/bidi-neutral-directionality-paragraph-start.html: https://chromium-review.googlesource.com/c/481042/6/third_party/WebKit/LayoutTests/platform/android/fast/text/international/bidi-neutral-directionality-paragraph-start-expected.png |
|||
►
Sign in to add a comment |
|||
Comment 1 by wkorman@chromium.org
, Apr 18 2017Owner: qyears...@chromium.org
Status: Assigned (was: Available)