Fix telemetry unit tests on chromeos mojo bots
Issue description
We are currently running telemetry unit-tests on the chromeos FYI bots [1]. Some of the tests are currently failing.
telemetry_perf_unittests telemetry_perf_unittests
Run on OS: 'Ubuntu-14.04'
Total tests: 319
* Passed: 288 (288 expected, 0 unexpected)
* Skipped: 27 (27 expected, 0 unexpected)
* Failed: 1 (0 expected, >>>1 unexpected<<<)
* Flaky: 3 (3 expected, 0 unexpected)
Unexpected Failures:
* scripts_smoke_unittest.ScriptsSmokeTest.testRunTelemetryBenchmarkAsGoogletest
and
telemetry_unittests telemetry_unittests
Run on OS: 'Ubuntu-14.04'
Total tests: 1159
* Passed: 1072 (1072 expected, 0 unexpected)
* Skipped: 86 (86 expected, 0 unexpected)
* Failed: 1 (0 expected, >>>1 unexpected<<<)
* Flaky: 0 (0 expected, 0 unexpected)
Unexpected Failures:
* telemetry.page.page_run_end_to_end_unittest.ActualPageRunEndToEndTests.testTrafficSettings
We should fix (or disable, as appropriate) these tests, to turn the tests green.
[1] https://build.chromium.org/p/chromium.fyi/builders/Mojo%20ChromiumOS
,
Aug 23 2017
Ned is out for a week, I am happy to investigate. Removing him and I will add the appropriate person once I triage.
,
Aug 23 2017
Oh I didn't see this was on chrome os. Chrome os team needs to investigate. Let me know if you have specific questions. Thanks!
,
Aug 23 2017
Yeah the problem is that while the build is chromeos, the telemetry tests think it is linux. As a result linux_platform_backend.py is being used to determine which tests are enabled. I'm assigning this to Achuith who is on the chromeos telemetry team.
,
Aug 23 2017
I would start by figuring out how to get it to use cros_platform_backend.py instead. https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/platform/cros_platform_backend.py
,
Aug 23 2017
Yeah, we could use some pointers to make sure that our configuration actually picks up cros_platform_backend.py
We've set up the run to use --browser=exact and are pointing it to a test executable:
{
"args": [
"--browser=exact",
"--browser-executable=./test_chrome",
"--jobs=1"
],
"isolate_name": "telemetry_unittests",
"name": "telemetry_unittests",
"swarming": {
"can_use_on_swarming_builders": true,
"shards": 4
}
}
This leads the bot to run the command:
/usr/bin/python ../../testing/scripts/run_telemetry_as_googletest.py --xvfb ../../tools/perf/run_telemetry_tests -v --jobs=1 --chrome-root ../../ --browser=exact --browser-executable=./test_chrome --jobs=1 --isolated-script-test-output=/b/s/w/ioV8O9la/output.json --isolated-script-test-chartjson-output=/b/s/w/ioV8O9la/chartjson-output.json
,
Aug 23 2017
+ sullivan Annie, is there someone who knows the code for the different backends better and could answer their questions?
,
Aug 23 2017
+ bccheng Ben, can anyone on your team dig into this?
,
Aug 24 2017
Failing tests:
telemetry_perf_unittests
scripts_smoke_unittest.ScriptsSmokeTest.testRunTelemetryBenchmarkAsGoogletest
telemetry_unittests
telemetry.internal.actions.action_runner_unittest.ActionRunnerTest.testWaitForElement
telemetry.page.page_run_end_to_end_unittest.ActualPageRunEndToEndTests.testTrafficSettings
,
Aug 25 2017
,
Aug 25 2017
After speaking to achuith@, it is not possible to run the chromeos versions of the tests on a linux chromeos bot. The chromeos backend depends on lower-level chromeos functions that are not present in this build. As a result we should look into disabling the tests that fail in this hybrid configuration.
I was looking at the decorators, but the usage notes state that these support "platforms, os versions or browser types." I don't want to disable these tests across general linux runs. We specify our executable via --browser=exact, and locally I confirmed that I can disable these tests with "exact" as a decorator argument. That also seems a bit heavy-handed, though. Is there a more appropriate way to disable these?
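To make the "heavy-handed" concern concrete, here is a minimal sketch of attribute-based disabling. It is not telemetry's decorators.py: `disabled` and `should_skip` are hypothetical stand-ins, and the attribute sets are assumed values for illustration only.

```python
def disabled(*tags):
    """Hypothetical stand-in for a Disabled-style decorator.

    It only records the tags on the test function; a runner decides later
    whether the test should be skipped.
    """
    def decorate(func):
        func.disabled_tags = set(tags)
        return func
    return decorate


def should_skip(test_func, run_attributes):
    """Skip when any disabled tag matches an attribute of the current run.

    run_attributes is assumed to mix platform name, OS version name and
    browser type, as the decorator usage quoted above suggests.
    """
    tags = getattr(test_func, 'disabled_tags', set())
    return bool(tags & set(run_attributes))


@disabled('exact')
def test_traffic_settings():
    pass


# Assumed attribute sets for three kinds of runs.
linux_chromeos_bot = {'linux', 'exact'}   # the hybrid configuration
plain_linux_bot = {'linux', 'release'}    # a general linux run
mac_exact_run = {'mac', 'exact'}          # unrelated run using --browser=exact

print(should_skip(test_traffic_settings, linux_chromeos_bot))
print(should_skip(test_traffic_settings, plain_linux_bot))
print(should_skip(test_traffic_settings, mac_exact_run))
```

Under these assumptions, tagging on "exact" leaves general linux runs alone, but it also skips the test for any other run that happens to pass --browser=exact, which is why it reads as too broad.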
,
Aug 25 2017
Randy can you briefly weigh in on how to disable in the new world? Here is an overview (if this is even what you are looking for) of this new disabling technique (we are moving away from decorators in telemetry) in case this doesn't make it to the top of Randy's list: https://chromium.googlesource.com/chromium/src/+/master/docs/speed/perf_bot_sheriffing.md#disabling-telemetry-tests
,
Aug 25 2017
Randy indicated the following: Unit tests still use decorators.py decorators. Story expectations are for benchmarks only. So the story expectations disabling link I sent you might not be relevant. He will chime in with more specifics when he can.
,
Aug 25 2017
Not much to add, but here are some examples of unit tests for telemetry that are disabled: https://cs.chromium.org/search/?q=f:tools/perf+decorators.Disabled&sq=package:chromium&type=cs
,
Aug 25 2017
> Locally I confirmed that I can disable these tests with "exact" as a
> decorator. Though that also seems a bit heavy-handed.
Maybe we could filter based on some command line flag? For example, '--override-os-version=linux-chromeos' could override the value returned from GetOSVersionName() [1] to be 'linux-chromeos', which would allow us to filter based on that. Or, override GetOSVersionDetailString(), and include that in the list of attributes used for filtering [2].
[1] https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/internal/platform/linux_platform_backend.py?l=53
[2] https://cs.chromium.org/chromium/src/third_party/catapult/telemetry/telemetry/decorators.py?l=349
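A rough sketch of how the proposed override could feed into filtering. The flag here is only the one suggested in this comment, not an existing telemetry option, and the helper names and the detected value 'trusty' are assumptions made for illustration.

```python
import argparse


def get_os_version_name(detected_name, override=None):
    """Return the override when given, else the detected OS version name.

    Hypothetical helper; it only mirrors the idea of overriding what
    GetOSVersionName() reports.
    """
    return override or detected_name


def filter_attributes(platform_name, os_version_name, browser_type):
    """Collect the attributes a Disabled-style decorator could match on."""
    return {platform_name, os_version_name, browser_type}


parser = argparse.ArgumentParser()
# Proposed flag from this comment; assumed, not an existing option.
parser.add_argument('--override-os-version', default=None)
args = parser.parse_args(['--override-os-version=linux-chromeos'])

# 'trusty' stands in for whatever the linux backend would detect.
os_version = get_os_version_name('trusty', args.override_os_version)
attrs = filter_attributes('linux', os_version, 'exact')
print(sorted(attrs))
```

With the override in place, a test could be disabled for 'linux-chromeos' alone, leaving regular linux runs and other --browser=exact runs unaffected.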
,
Aug 28 2017
I'm working on a patch based on #15, which will then let us set up decorators based on a command line flag.
,
Aug 28 2017
Adding Ned for suggestions he may have on how to proceed. I'm not sure the solution suggested in #15 works for the telemetry team.
,
Aug 29 2017
The review is available here: https://codereview.chromium.org/3006733002/
As noted in the review, we will wait for Ned to return from OOO to comment on the suggestion in #15.
There is a precedent for this with other OS filtering on versions, e.g. 'snowleopard' has some decorators disabling tests for it. However, there doesn't currently seem to be a good way to detect linux-chromeos automatically.
I have seen the command line flag --skip; however, this seems to expect the entire set of skips to be on the command line. This is similar to how we can filter gtests, but it looks like it lacks a file option. We have some bots configured with --test-launcher-filter-file=, which lets us keep common files of skips, whereas here we'd need to keep the set in the trybot json config files.
,
Aug 29 2017
,
Aug 29 2017
,
Aug 30 2017
Seems like this is part of a bigger effort to run Telemetry tests (unittest/perf?) on a new platform? Can someone point me to a design doc or a roadmap for this effort?
,
Aug 30 2017
We are not really bringing this up on a new platform. We are trying to run these tests on a chromeos build, on linux bots, so that we can add it to the CQ. We do this for other tests on the waterfall/cq (in fact, that is the only config we have in the chromium bots for testing chromeos). Issue 759163 is the meta-bug for the initial set of tasks we are doing. Is there additional information we can provide?
,
Aug 30 2017
Hmhh, why not run these tests on ChromeOS VM instead of running this on Linux? I would rather spend a bit more CPU/memory to make the test environment more similar to the production environment.
,
Aug 30 2017
There are already some telemetry tests being run on actual ChromeOS devices, but they are not in the CQ. That does, however, give us full coverage of the production environment. Currently the cycle of uprevving a chromium version to match ChromeOS takes a long time, with changes landing on the CQ often breaking compatibility with ChromeOS. Because of this, the linux-chromeos configuration has been used for years to run various unittests on the CQ. We are interested in adding telemetry coverage at CQ time, so that we can reject changes that break performance earlier in the process. We also don't want to expose performance testing to the impact of a VM.
,
Aug 30 2017
> We also don't want to expose performance testing to the impact of a VM.
We only test for breakage on CQ, so it doesn't matter if we are using VM, right?
+Dirk: is using Linux platform for test coverage of ChromeOS on CQ a recommended practice?
,
Aug 30 2017
Ned, we currently run chromeos browser tests in the waterfall using this particular setup - chromeos-chrome build running on a linux GCE instance. This particular configuration is useful in the chromeos waterfall and for chromeos developers, but has not been interesting from the perspective of telemetry - the performance numbers are meaningless since we're running in a linux VM, and it's not useful for chromeos integration tests either since chromeos system services are not available on linux. So the chrome for chromeos on linux browser is treated as a linux browser by telemetry (and it will continue to do so). There is a long-standing servicification effort in chromium called Mojo: https://chromium.googlesource.com/chromium/src/+/master/mojo/README.md This is an attempt to break components of the monolithic chrome executable into smaller independent services, and use IPC to communicate between these services. So we want to break apart the chromeos window manager, the chromeos OOBE/login screen, etc from chrome. Some of these refactorings may end up regressing performance, and this is what jonross@ is interested in catching. Jon - please correct me if I'm wrong on any of this.
,
Aug 30 2017
Achuith was correct. We already have an FYI bot:
https://build.chromium.org/p/chromium.fyi/builders/Mojo%20ChromiumOS
which we are using to run the classic versions of various unittest suites, next to variants running in the servicified worlds (--mash, --mus). Once we get tests working in the servicified variant, we then go and enable these on the main CQ, on the existing Linux ChromiumOS builder:
https://build.chromium.org/p/chromium.chromiumos/builders/Linux%20ChromiumOS%20Tests%20%281%29
We are now looking at telemetry, so that we can compare the classic mode with the servicified ones and catch performance regressions as they happen.
,
Aug 31 2017
I think we're talking slightly past each other, which is quite common in this particular context. There are 3-5 different configurations here, using (my) terminology:
1) regular linux desktop builds (Linux Builder / Linux Tests on the chromium.linux waterfall)
2) Linux Ozone (work being done to work w/ wayland or w/o X11, e.g. the Ozone Linux builder on the chromium.fyi waterfall)
3) Linux ChromeOS, aka "desktop ChromeOS". This is a regular linux build except that we build with target_os="chromeos" (the Linux ChromiumOS Builder on the chromium.chromiumos waterfall)
4) the "simplechrome" builds, which use pinned versions of the actual CrOS environment and build in a chroot (e.g., ChromiumOS amd64-generic Compile on the chromium.chromiumos waterfall)
5) the "ebuild" builds (only found on the chromeos waterfalls)
The first three configurations run on a stock Linux machine and are where we do the bulk of our linux and "ChromeOS" testing. The last two produce binaries that run only on "real" CrOS images or hardware. We currently do not have either of those last two running tests on the public Chrome waterfalls. We have been working (slowly) with achuith@ and bccheng@ to fix this, as this lack of coverage is bad. achuith@ and I are working on getting CrOS VM images to run properly under qemu; Ben is working on getting actual devices to work.
The Mojo ChromiumOS bot, however, is a "desktop ChromeOS" build, which is halfway between linux and true CrOS. It sounds like the telemetry scaffolding doesn't actually know how to deal with this configuration properly. I do not know offhand if this particular configuration is either interesting or viable from telemetry's point of view. Obviously, we wouldn't want to use it for actual performance numbers, since it's not a real configuration that we ship. However, if we can use it to get functional coverage of some or all of the cros-specific code in telemetry, we should do that.
I'm not sure what the right way to configure the telemetry tests would be to make them aware of this configuration, but hopefully we can figure out (or define) a new set of flags that'll get things done. Does that help at all?
,
Aug 31 2017
Thanks Dirk for the info. What I would want to know the most is whether (3) will be the way we run ChromeOS test for the long haul, or it will eventually be replaced by (4) or (5). If it's temporary, then I am fine with the patch in #18 as it's a bit hacky but easy to revert. If we rely on this linux host + ChromeOS browser for the long haul, we should modify cros_browser_backend to accept the possibility of a linux platform backend instance.
,
Aug 31 2017
I doubt it'll be replaced by (4) or (5) completely, the overhead of chroots + VMs / real hardware isn't worth it.
,
Aug 31 2017
So I synced with Ned offline. The posted review has been cancelled.
Currently the Mojo FYI bots are using --skip to handle the consistent failures. We would like to look into adding --skip-json, which can be pointed at a file of tests to skip. This matches the pattern of filter files used by gtests to skip known failures.
We will want to begin using cros_browser_backend, but with linux_platform_backend. This will enable running linux-chromeos with more appropriate test coverage. (There are differences between what desktop_browser_backend and cros_browser_backend run.) This may require introducing --browser=linux-chromeos, but that will depend on investigations I'll start next week.
,
Aug 31 2017
Don't use --skip-json. The filter files aren't JSON, and neither are the layout_test TestExpectations. We should try to be closer to one of the other formats (probably the filter files) instead.
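For reference, a small sketch of what reading a gtest-style filter file could look like on the telemetry side. It assumes the usual filter-file layout (one pattern per line, '#' for comments, a leading '-' for tests to exclude); the function names are made up for illustration and no such option exists in telemetry as of this comment.

```python
import fnmatch


def parse_filter_lines(lines):
    """Split filter-file lines into positive and negative patterns."""
    positive, negative = [], []
    for raw in lines:
        line = raw.split('#', 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith('-'):
            negative.append(line[1:])
        else:
            positive.append(line)
    return positive, negative


def is_enabled(test_name, positive, negative):
    """A test runs unless a negative pattern matches it.

    When positive patterns are present, the test must also match one.
    """
    if any(fnmatch.fnmatchcase(test_name, p) for p in negative):
        return False
    return not positive or any(
        fnmatch.fnmatchcase(test_name, p) for p in positive)


FILTER_TEXT = """\
# Known failures in the linux-chromeos configuration.
-telemetry.page.page_run_end_to_end_unittest.ActualPageRunEndToEndTests.testTrafficSettings
-telemetry.internal.actions.action_runner_unittest.ActionRunnerTest.testWaitForElement
"""

positive, negative = parse_filter_lines(FILTER_TEXT.splitlines())
print(is_enabled(
    'telemetry.internal.actions.action_runner_unittest.'
    'ActionRunnerTest.testWaitForElement', positive, negative))
```

Keeping the skips in a file like this would let the bots share one list instead of repeating --skip entries in each json config.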
,
Aug 31 2017
I am fine with the filter file, though I suggest that we don't add the platform name to the filter file for now (until we consider removing the @decorator.Disabled(..) system in Telemetry).
,
Sep 6 2017
> .. This may require introducing --browser=linux-chromeos ...
One thing to note is that we use a different binary (test_chrome) when running the tests for linux-chromeos (this is to make sure that the test-only APIs are not available in the regular chrome binary). We currently need to use --browser=exact for that. So if we use --browser=linux-chromeos, we need to make sure we pick up the correct binary for the test (test_chrome instead of chrome).
,
Sep 11 2017
,
Sep 20 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/e724b1772409ec4e6848e762289b7842ab097b22
commit e724b1772409ec4e6848e762289b7842ab097b22
Author: Jonathan <jonross@chromium.org>
Date: Wed Sep 20 02:40:15 2017
Add Telemetry to linux-chromiumos bot
We've been running telemetry_unittests and telemetry_perf_unittests on the Mojo FYI bots. They've been cycling green for a week, so we'd like to add them to the main linux_chromeos bot.
TBR=sky@chromium.org
Bug: 758065
Change-Id: I6ea2c3f619237124ebba5b691ac08bffb93652a6
Reviewed-on: https://chromium-review.googlesource.com/657858
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Reviewed-by: Sadrul Chowdhury <sadrul@chromium.org>
Cr-Commit-Position: refs/heads/master@{#503031}
[modify] https://crrev.com/e724b1772409ec4e6848e762289b7842ab097b22/testing/buildbot/chromium.chromiumos.json
,
Sep 20 2017
Perhaps it makes sense to add a catapult trybot with this configuration and make it part of the default catapult CQ run?
,
Sep 22 2017
Perhaps, I'm not familiar enough with those trybots to be able to comment on if that is a more appropriate location. I can't seem to find their status on: https://build.chromium.org/p/tryserver.client.catapult/console I was also looking into them in codesearch, but couldn't find where the configs point to GN args. I did find the builder files: build/masters/master.tryserver.client.catapult/builders.pyl
,
Sep 23 2017
I believe the catapult trybots *only* check out catapult, and so it doesn't make sense to talk about running a chromium build in that configuration. It would make sense to talk about making sure that configuration is tested when catapult is rolled into Chromium, though. But it's possible I'm wrong and the catapult CQ is testing chromium builds as well; maybe sullivan@ or nednguyen@ could confirm that.
,
Sep 25 2017
#39: Dirk is right that the catapult trybots only check out catapult. They currently download a Chrome binary from cloud storage to run tests.
,
Oct 3 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/3f2ff8cf8a6d6f6e2bd82273ac9c98388cfd7df2
commit 3f2ff8cf8a6d6f6e2bd82273ac9c98388cfd7df2
Author: Jonathan Ross <jonross@chromium.org>
Date: Tue Oct 03 00:25:20 2017
Revert "Add Telemetry to linux-chromiumos bot"
This reverts commit e724b1772409ec4e6848e762289b7842ab097b22.
Reason for revert: There have been a series of flakes on the trybots, that were no occurring on the Mojo FYI bots. Removing these tests from the trybots until we can find out why.
Original change's description:
> Add Telemetry to linux-chromiumos bot
>
> We've been running telemetry_unittests and telemetry_perf_unittests on the Mojo
> FYI bots. They've been cycling green for a week, so we'd like to add them to
> the main linux_chromeos bot.
>
> TBR=sky@chromium.org
>
> Bug: 758065
> Change-Id: I6ea2c3f619237124ebba5b691ac08bffb93652a6
> Reviewed-on: https://chromium-review.googlesource.com/657858
> Commit-Queue: Jonathan Ross <jonross@chromium.org>
> Reviewed-by: Sadrul Chowdhury <sadrul@chromium.org>
> Cr-Commit-Position: refs/heads/master@{#503031}
TBR=sadrul@chromium.org,sky@chromium.org,jonross@chromium.org
# Not skipping CQ checks because original CL landed > 1 day ago.
Bug: 758065
Change-Id: I682ef5e4c7ac45384ce4b4c435ab19f23f0b80ec
Reviewed-on: https://chromium-review.googlesource.com/696301
Reviewed-by: Jonathan Ross <jonross@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#505877}
[modify] https://crrev.com/3f2ff8cf8a6d6f6e2bd82273ac9c98388cfd7df2/testing/buildbot/chromium.chromiumos.json
,
Oct 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/7c43b531296903d31c81b866ff8f57061296812d
commit 7c43b531296903d31c81b866ff8f57061296812d
Author: Jonathan <jonross@chromium.org>
Date: Thu Oct 05 18:36:35 2017
Skip flaking telemetry test on FYI bot
Another test is flaky on telemetry_unittests, skipping it in config.
TBR=sky@chromium.org
Bug: 758065
Change-Id: I49849ca1b271fcceb1976f86f8ed6225a22cc302
Reviewed-on: https://chromium-review.googlesource.com/700797
Reviewed-by: Jonathan Ross <jonross@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506803}
[modify] https://crrev.com/7c43b531296903d31c81b866ff8f57061296812d/testing/buildbot/chromium.fyi.json
,
Oct 5 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/ea3a87c66f04bfdcbb6b87701ce6024b8b835f9c
commit ea3a87c66f04bfdcbb6b87701ce6024b8b835f9c
Author: Jonathan <jonross@chromium.org>
Date: Thu Oct 05 23:32:36 2017
Another flaking telemetry test on FYI bot
Another test is flaky on telemetry_unittests, skipping it in config.
TBR=sky@chromium.org
Bug: 758065
Change-Id: I9d08df3109e43682b1edcff11b545bc684f06ab5
Reviewed-on: https://chromium-review.googlesource.com/703684
Reviewed-by: Jonathan Ross <jonross@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#506911}
[modify] https://crrev.com/ea3a87c66f04bfdcbb6b87701ce6024b8b835f9c/testing/buildbot/chromium.fyi.json
,
Oct 6 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/d72a01af0498ac26c4c2fdee06fabea212d9f009
commit d72a01af0498ac26c4c2fdee06fabea212d9f009
Author: Jonathan <jonross@chromium.org>
Date: Fri Oct 06 22:09:14 2017
Re-add test_chrome build config.
Currently telemetry cannot differentiate between linux and linux_chromiumos. In the interm we are using the test_chrome binary as a way to get linux chromiumous telemetry running on our FYI bot and the CQ. Re-adding the build config for test_chrome
Bug: 758065
Change-Id: I99b317737f34ab9c01f7d9559c4a7d2ae781f424
Reviewed-on: https://chromium-review.googlesource.com/705074
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Cr-Commit-Position: refs/heads/master@{#507189}
[modify] https://crrev.com/d72a01af0498ac26c4c2fdee06fabea212d9f009/chrome/BUILD.gn
,
Oct 11 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/bb3cf44ad4f4ffab528b215c1bb687ac2119ed2d
commit bb3cf44ad4f4ffab528b215c1bb687ac2119ed2d
Author: Jonathan <jonross@chromium.org>
Date: Wed Oct 11 19:33:13 2017
Re-add test_chrome deps to telemetry
When I re-added the test_chrome config I missed re-adding it to the deps for telemetry_unittests and telemetry_perf_unittests. Some of the telemetry tests also have a dependency on the 200% resource package existing. So I've re-added that for now.
Bug: 758065
Change-Id: If91cf568dd3d379928fef0c8857136aeb9297d57
Reviewed-on: https://chromium-review.googlesource.com/709463
Reviewed-by: Scott Violet <sky@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#508067}
[modify] https://crrev.com/bb3cf44ad4f4ffab528b215c1bb687ac2119ed2d/chrome/BUILD.gn
[modify] https://crrev.com/bb3cf44ad4f4ffab528b215c1bb687ac2119ed2d/chrome/test/BUILD.gn
,
Oct 12 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/4126fa96d6447c3754db5587fe9eb7802f239ba1
commit 4126fa96d6447c3754db5587fe9eb7802f239ba1
Author: Jonathan <jonross@chromium.org>
Date: Thu Oct 12 21:14:34 2017
Disable flaking test from Mojo FYI bot
Another telemetry_unittest is flaking on the Mojo FYI bot. Disabling on the bot.
TBR=sky@chromium.org
Bug: 758065
Change-Id: Ide5276e2f710150f402678c60cd1ae208d78716e
Reviewed-on: https://chromium-review.googlesource.com/716299
Reviewed-by: Jonathan Ross <jonross@chromium.org>
Reviewed-by: Scott Violet <sky@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#508469}
[modify] https://crrev.com/4126fa96d6447c3754db5587fe9eb7802f239ba1/testing/buildbot/chromium.fyi.json
,
Oct 13 2017
The following revision refers to this bug:
https://chromium.googlesource.com/chromium/src.git/+/cd63a7a77fc207cc2e0c61332c4e44b75d3eebda
commit cd63a7a77fc207cc2e0c61332c4e44b75d3eebda
Author: Jonathan <jonross@chromium.org>
Date: Fri Oct 13 20:14:45 2017
Readd telemetry unittests to linux chromium os bot.
Reading the telemetry_perf_unittests and telemetry_unittests to the linux chromium os bot. This contains filtering of the known flaking tests.
TBR=sky@chromium.org
Bug: 758065
Change-Id: I406af89973c5d7a94b634cce5095617a1b397270
Reviewed-on: https://chromium-review.googlesource.com/718980
Reviewed-by: Jonathan Ross <jonross@chromium.org>
Commit-Queue: Jonathan Ross <jonross@chromium.org>
Cr-Commit-Position: refs/heads/master@{#508793}
[modify] https://crrev.com/cd63a7a77fc207cc2e0c61332c4e44b75d3eebda/testing/buildbot/chromium.chromiumos.json
,
Nov 24 2017
,
Jan 23 2018
This is fixed, right?
,
Jan 24 2018
,
Jan 16
,
Jan 16
Comment 1 by jonr...@chromium.org, Aug 23 2017
Cc: nednguyen@chromium.org achuith@chromium.org