linux_site_isolation trybot should be always (non-optionally) included for CQ |
||||||||
Issue descriptionWe would like the CQ to always include the linux_site_isolation trybot. So far the bot was optional. There is a presubmit (*) adding the bot to a subset of CLs that touch //content/browser/frame_host. The extra coverage is desired because: - It will avoid a situation when a CL that doesn't require linux_site_isolation lands and breaks the bot, effectively blocking all subsequent CLs that do require the linux_site_isolation bot - We continue to ship more isolation modes (--isolate-extensions in M56, sign-in-isolation in M63, more to come) The bot seems fast-enough and stable-enough to be added to all CQ runs: - Sharding is enabled on the bot (including layout tests) - I am not sure how to check how much time the trybots takes on average. FWIW, looking at the FYI bot (**) that uses the same config, the runtime is usually around 30 minutes (lloking at builds 22730-22735). - The FYI bot (**) using the same config stays mostly green (in particular - the most recent redness starting at build #22758 is caused by issue 786539 - that issue would be prevented by CQ if it included linux_site_isolation) (*) https://chromium.googlesource.com/chromium/src/+/e95c0692055388f752fd7b8db460024944176534/content/browser/frame_host/PRESUBMIT.py (**) https://build.chromium.org/p/chromium.fyi/builders/Site%20Isolation%20Linux?numbuilds=200
,
Nov 17 2017
Raising priority since our existing CQ coverage is going away in issue 786555 . I agree that having CQ coverage at all should include it being monitored by sheriffs, which presumably means being on the main waterfall?
,
Nov 18 2017
So, fortunately, "Site Isolation Linux" (and the matching linux_site_isolation trybot) are actually the same build configuration as "Linux Builder" / "Linux Tests" (and linux_chromium_rel_ng). So, all we need to do is make sure that the tests that are running on "Site Isolation Linux" are moved over to "Linux Tests" (and that we have the capacity for this). To do that, you'd just copy the "Site Isolation Linux" entries in //testing/buildbot/chromium.fyi.json over to the "Linux Tests" entry in //testing/buildbot/chromium.linux.json. If you do that, you should be able to just make the change in a CL and run a linux_chromium_rel_ng try job and see it work. However, given that you're running browser_tests and the layout tests, I'd guess we probably don't actually have the capacity at the moment (kinda depends on how often the Cq-Include-Trybots thing was triggering, compared to how often analyze would trigger it). That said, next week should be light, traffic-wise, so we could probably land things now and see how it goes while we work on getting more capacity. (jbudorick, smut, fyi ...)
,
Nov 18 2017
I should have noted that once linux_site_isolation is part of the CQ, then some steps can be removed from linux_chromium_rel_ng (all the steps beginning with site_per_process - for example site_per_process_browser_tests, site_per_process_content_browsertests, site_per_process_content_unittests, etc.). OTOH, some site-per-process tests are currently unique to linux_site_isolation bot (e.g. layout tests, interactive_ui_tests, extensions_browsertests ,extensions_unittests). Not sure how much this helps lower the capacity requirements and if/how much this helps with capacity planning...
,
Nov 20 2017
@lukasza: curious if we need layout tests running with site isolation on CQ? The heavy test suites (browser tests based) are mostly run, so apart from layout tests this this bug is just asking for a few fast tests to be added.
,
Nov 20 2017
IMHO layout tests are right now the most common culprit for turning the Site Isolation Linux FYI bot red - adding CQ coverage for OOPIF support in layout tests would go a long way toward avoiding OOPIF-specific regresions. Please note that "OOPIF support" is both A) --site-per-process and B) a long list of WPT-specific origins to isolate - see the current config of the Site Isolation Linux FYI bot for details. Since dpranke@ asked this offline, let me also clarify that the Site Isolation Linux FYI bot runs *all* layout tests - this gives OOPIF coverage for 1) http/tests, 2) external/wpt, 3) any virtual test suites that include 1 or 2 and 4) any other layout tests that happen to trigger a different behavior in site-per-process mode (not sure how many tests fall into 4th category in practice / I don't remember specific examples.) creis@/alexmos@ - how much do we care about CQ coverage for other test suites? I know that some OOPIF-specific functionality (e.g. drag-and-drop) is for example only verified by interactive_ui_tests, but in practice I don't think we've seen failures from this test suite on the FYI bot, right? Same question for extension_browsertests/unittests, components_browsertests/unittests and other test suites covered by the FYI bot but not CQ. CQ currently covers site_per_process_browser_tests, site_per_process_unit_tests, site_per_process_content_browsertests, site_per_process_content_unittests (based on https://cs.chromium.org/search/?q=file:json+file:testing+site_per_process&sq=package:chromium&type=cs).
,
Nov 20 2017
#6: I think we've seen failures in interactive_ui_tests, but maybe not the other examples you brought up. Theoretically, anything that uses content/ could behave differently with --site-per-process, which is how we've determined the list of test targets originally. In practice, you're right that pretty much all of the failures we've ever seen were in content_unittests, content_browsertests, browser_tests, unit_tests, interactive_ui_tests, or layout tests. I agree that getting layout tests with --site-per-process added to the CQ is the highest priority. I'd also add interactive_ui_tests if possible, and maybe we can keep relying on our FYI bots for the rest.
,
Nov 21 2017
note: I think adding the extra browser test based harnesses you listed is pretty cheap compared to the existing ones that are running. About layout_tests, is it possible to use flagsexpectations for this? that way we avoid having to do the isolate/archive steps twice? It's not clear that we really care to run the other virtual test suites with this in practice. I'm just wondering if there's a lighter weight way of getting most of the benefit of running all of layout tests. Could we try a virtual test suite and see what the new rate of failures is on the FYI bot?
,
Nov 21 2017
FlagsExpectations is a specific mechanism separate from VirtualTestSuites. We could likely do either, or add an additional expectations file (like we use the filter files for gtests). I agree you'd want something where we could reuse the isolate.
,
Dec 5 2017
dpranke@: any updates on this? I think the highest priority for us is to get CQ coverage for layout tests running in --site-per-process (prior to r517627 we at least had it for content/browser/frame_host changes). A bit lower priority would be to also get coverage for the other test targets that linux_site_isolation runs with --site-per-process (everything other than content_unittests, unit_tests, content_browsertests, browser_tests, which do have site-per-process CQ coverage on the linux bots).
,
Dec 5 2017
Sorry for the delay; we have more capacity and should be able to add this now. I'll try to get to it tomorrow.
,
Dec 6 2017
Thanks! That will be very helpful to keep things working as we run trials of this mode, and help avoid problems like issue 786539 (which happened multiple times).
,
Dec 11 2017
I just noticed that adding "Cq-Include-Trybots: master.tryserver.chromium.linux:linux_site_isolation" to my CL description of https://chromium-review.googlesource.com/c/chromium/src/+/817661 didn't seem to kick off linux_site_isolation try jobs for the CQ. Could that be related to in-progress work here? (Or do I have a typo?) :) I can file a separate issue about Cq-Include-Trybots if needed, but finishing this one (i.e., getting --site-per-process coverage on the waterfall) would be sufficient for me as well. Thanks!
,
Dec 12 2017
CL out for review / testing now in https://crrev.com/c/821554; if it turns up a bunch of bugs I'll likely punt this over to the site isolation folks to debug and land. Hopefully this will make the thing in #c13 moot. Sorry for the delay :(.
,
Dec 14 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad commit 7feb880d77166ca4cadb5a7f27acf1f2e54f97ad Author: Dirk Pranke <dpranke@chromium.org> Date: Thu Dec 14 19:27:45 2017 Add remaining site_per_process test suites to main Linux bots. This CL adds - site_per_process_components_browsertests - site_per_process_components_unittests - site_per_process_content_browsertests - site_per_process_content_unittests - site_per_process_extensions_browsertests - site_per_process_extensions_unittests - site_per_process_interactive_ui_tests - site_per_process_sync_integration_tests to the Linux Tests and Linux Tests (dbg)(1) builders; this will also add them to linux_chromium_rel_ng in the CQ. This CL also removes these tests from the "Linux Site Isolation" builder on chromium.fyi. R=jam@chromium.org, creis@chromium.org, jbudorick@chromium.org BUG= 786554 Change-Id: I5b26655138a0120e17a88e1f6aee1ae73cbb8571 Reviewed-on: https://chromium-review.googlesource.com/821554 Reviewed-by: John Budorick <jbudorick@chromium.org> Reviewed-by: Dirk Pranke <dpranke@chromium.org> Reviewed-by: John Abd-El-Malek <jam@chromium.org> Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> Cr-Commit-Position: refs/heads/master@{#524141} [modify] https://crrev.com/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad/testing/buildbot/chromium.fyi.json [modify] https://crrev.com/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad/testing/buildbot/chromium.linux.json [modify] https://crrev.com/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad/testing/buildbot/test_suite_exceptions.pyl [modify] https://crrev.com/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad/testing/buildbot/test_suites.pyl [modify] https://crrev.com/7feb880d77166ca4cadb5a7f27acf1f2e54f97ad/testing/buildbot/waterfalls.pyl
,
Dec 15 2017
Thanks! This is fantastic to see. Do you know when the layout tests will move over as well? (We're continuing to see changes land that turn those red as in issue 795333 , so it'd be great to cover them in try jobs if possible.)
,
Dec 15 2017
I'm going to check how are capacity is doing over the past 24 hours and, if things look okay, add that in today (I expect we should have plenty of capacity given things getting quieter for the holidays).
,
Dec 16 2017
I've hit a complication, in that I didn't realize that the way the layout test steps are written, we shouldn't actually run two different variants of the layout tests in a single build on a single builder, because the `archive_webkit_test_results` step for the second build will overwrite the first one's results. I'm not sure how hard this will be to fix, as the assumption that there's only one `layout-test-results` directory per build may be fairly hard-coded in a few places. If I can't readily fix this, we can either go back to running this on another bot, or turn this into VirtualTestSuites.
,
Dec 17 2017
A virtual test suite might be a good option. Is there a way to do that on certain bots instead of all of them (to limit the extra workload)?
,
Dec 18 2017
trybots "mirror" waterfall bots, so, for example, linux_chromium_rel_ng runs the same tests the same way as "Linux Tests". So, there's no way to run things as a VirtualTestSuite on the trybot but run everything on *that* waterfall bot. It would be possible to run everything on a different waterfall bot (e.g., if we wanted to keep the FYI bots around), but it's not clear how useful that'd be. I'll work on this today and see what I can come up with; one way or another I should have something running by the end of the day.
,
Dec 18 2017
I think the main concern in #c19 was to only run --site-per-process+--isolate-origins=wpt-origins version of layout tests once (e.g. on linux_chromium_rel_ng) and avoid running it on all bots that run layout tests (e.g. Mac and Windows). I am not sure if this is possible via VirtualTestSuite mechanism (which I assume is platform agnostic). Additionally, I see that most VirtualTestSuites try to apply only to a subset of layout tests (e.g. via "base": "fast/dom/shadow" declaration). This might be possibly different from our requirements: - Today Site Isolation Linux bot runs *all* layout tests with --site-per-process+--isolate-origins=wpt-origins - alexmos@ and creis@ might know more, but I think running http/tests and external/wpt is P0 (and running other layout tests is either P1 or P2).
,
Dec 18 2017
> I think the main concern in #c19 was to only run > --site-per-process+--isolate-origins=wpt-origins version of layout > tests once (e.g. on linux_chromium_rel_ng) and avoid running it on all > bots that run layout tests (e.g. Mac and Windows). I am not sure if this > is possible via VirtualTestSuite mechanism (which I assume is platform agnostic). The VirtualTestSuites themselves are platform agnostic, but you can easily [ Skip ] them on some platforms in TestExpectations. > Additionally, I see that most VirtualTestSuites try to apply only to a subset of layout tests > (e.g. via "base": "fast/dom/shadow" declaration). This might be possibly different from our > requirements: > - Today Site Isolation Linux bot runs *all* layout tests with > --site-per-process+--isolate-origins=wpt-origins > - alexmos@ and creis@ might know more, but I think running http/tests and external/wpt is > P0 (and running other layout tests is either P1 or P2). It is true that VirtualTestSuites cannot (easily) be used to run *everything* again with a different flag, and it would be an abuse of the feature to try and do so. But, running just http/tests again would be fine and is exactly what it's for. I do understand the your requirements (from #c6 and #c20), and I do think that running the test step twice is the better way to do things if you really do want to run everything. As an aside, external/wpt is basically 50% of the tests at this point, and covers pretty much the whole platform, so saying that you need to run external/wpt is basically saying "I need to test the whole platform". It strikes me that that's probably not really true, but I expect that it's hard to say *which* subsets of external/wpt really matter.
,
Dec 19 2017
Update: it looks like the archive upload location isn't that hard-coded, and fixing this should be straightforward. I'm testing it now.
,
Dec 19 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/9dbcaaf60aa0960572a31ced8fdc92b0060d9595 commit 9dbcaaf60aa0960572a31ced8fdc92b0060d9595 Author: Dirk Pranke <dpranke@chromium.org> Date: Tue Dec 19 18:06:24 2017 Add custom step name to layout test archive uploads. Currently it's only possible to upload one set of layout test results per build. This is a problem if you want to run the tests twice, with different configurations. This CL changes things so that the step name will be included in the path to the upload if we're using a step name other than 'webkit_tests' or 'webkit_layout_tests' (ignoring the with/without patch part). R=jbudorick@chromium.org, qyearsley@chromium.org BUG= 786554 Change-Id: I198add586d8fedc9cd6f7ecb6572b6e9b3030476 Reviewed-on: https://chromium-review.googlesource.com/833092 Commit-Queue: Dirk Pranke <dpranke@chromium.org> Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> [modify] https://crrev.com/9dbcaaf60aa0960572a31ced8fdc92b0060d9595/scripts/slave/README.recipes.md [add] https://crrev.com/9dbcaaf60aa0960572a31ced8fdc92b0060d9595/scripts/slave/recipe_modules/chromium_tests/tests/steps/generate_isolated_script.expected/custom_webkit_tests_step_name.json [modify] https://crrev.com/9dbcaaf60aa0960572a31ced8fdc92b0060d9595/scripts/slave/chromium/archive_layout_test_results.py [modify] https://crrev.com/9dbcaaf60aa0960572a31ced8fdc92b0060d9595/scripts/slave/recipe_modules/chromium_tests/tests/steps/generate_isolated_script.py [modify] https://crrev.com/9dbcaaf60aa0960572a31ced8fdc92b0060d9595/scripts/slave/recipe_modules/chromium_tests/steps.py
,
Dec 19 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/b4ab1e4e1025f2f14dd86b5613ea867e6235c806 commit b4ab1e4e1025f2f14dd86b5613ea867e6235c806 Author: Dirk Pranke <dpranke@chromium.org> Date: Tue Dec 19 22:00:53 2017 Run the site_per_process_webkit_layout_test on linux_chromium_rel_ng. This adds the layout tests run with --site-per-process to the CQ. R=jbudorick@chromium.org, alexmos@chromium.org BUG= 786554 Change-Id: I79d5b5b6972350cb0af567e9623620294843ac80 Reviewed-on: https://chromium-review.googlesource.com/822987 Commit-Queue: Dirk Pranke <dpranke@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> Reviewed-by: Alex Moshchuk <alexmos@chromium.org> Cr-Commit-Position: refs/heads/master@{#525151} [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/testing/buildbot/chromium.fyi.json [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/testing/buildbot/chromium.linux.json [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/testing/buildbot/test_suite_exceptions.pyl [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/testing/buildbot/test_suites.pyl [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/testing/buildbot/waterfalls.pyl [modify] https://crrev.com/b4ab1e4e1025f2f14dd86b5613ea867e6235c806/third_party/WebKit/LayoutTests/FlagExpectations/site-per-process
,
Dec 20 2017
I think this is done, and we can remove the FYI bot.
,
Dec 21 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromium/tools/build/+/6ced07da6d703937f3d2418fbcb81cace5abe3fb commit 6ced07da6d703937f3d2418fbcb81cace5abe3fb Author: Dirk Pranke <dpranke@chromium.org> Date: Thu Dec 21 03:03:41 2017 Fix how we determine the custom step name for layout_tests-based-tests. crrev.com/c/833092 changed how we set the step name for archive_layout_test_results to better support running webkit_test-based tests multiple times in a build, for the site_per_process tests. This, however, affected rebaseline-cl, which is bad. This CL attempts to fix that by using a custom step name less often. R=robertma@chromium.org BUG= 786554 Change-Id: I9e30f09367b141d9e5b61e9f9dd0fa0f042a040d Reviewed-on: https://chromium-review.googlesource.com/837645 Reviewed-by: Quinten Yearsley <qyearsley@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> [modify] https://crrev.com/6ced07da6d703937f3d2418fbcb81cace5abe3fb/scripts/slave/recipe_modules/chromium_tests/steps.py
,
Dec 23 2017
Thank you!! This is a fantastic help for preventing regressions in Site Isolation modes. Looks like the Site Isolation Linux bot can be retired whenever you want. (It appears to still be building but not doing much else.)
,
Jan 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d946846ba23387d87af71292b5cf50e100404941 commit d946846ba23387d87af71292b5cf50e100404941 Author: Dirk Pranke <dpranke@chromium.org> Date: Fri Jan 12 02:37:52 2018 Run the layout tests under site-per-process on Linux debug. Previously we were only running them on Linux Tests (release), so this wouldn't catch DCHECKs and other debug-only failures, and we'd have tests failing on the CQ but not realize it :(. TBR=jbudorick@chromium.org, lukasza@chromium.org NOTRY=true BUG= 786554 Change-Id: I05259069fd2ab390a5372b6c330b1500231c98a3 Reviewed-on: https://chromium-review.googlesource.com/863163 Reviewed-by: Dirk Pranke <dpranke@chromium.org> Reviewed-by: John Budorick <jbudorick@chromium.org> Commit-Queue: Dirk Pranke <dpranke@chromium.org> Cr-Commit-Position: refs/heads/master@{#528858} [modify] https://crrev.com/d946846ba23387d87af71292b5cf50e100404941/testing/buildbot/chromium.linux.json [modify] https://crrev.com/d946846ba23387d87af71292b5cf50e100404941/testing/buildbot/test_suite_exceptions.pyl
,
Jan 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/948f1199acde5bcae335808d7624b143e4c9d4d8 commit 948f1199acde5bcae335808d7624b143e4c9d4d8 Author: Friedrich Horschig <fhorschig@chromium.org> Date: Fri Jan 12 11:54:05 2018 Revert "Run the layout tests under site-per-process on Linux debug." This reverts commit d946846ba23387d87af71292b5cf50e100404941. Reason for revert: tests apparently look for content_shell in out/Release (https://crbug.com/801478) so this would need to be configured as well when enabling this. Original change's description: > Run the layout tests under site-per-process on Linux debug. > > Previously we were only running them on Linux Tests (release), > so this wouldn't catch DCHECKs and other debug-only failures, > and we'd have tests failing on the CQ but not realize it :(. > > TBR=jbudorick@chromium.org, lukasza@chromium.org > NOTRY=true > BUG= 786554 > > Change-Id: I05259069fd2ab390a5372b6c330b1500231c98a3 > Reviewed-on: https://chromium-review.googlesource.com/863163 > Reviewed-by: Dirk Pranke <dpranke@chromium.org> > Reviewed-by: John Budorick <jbudorick@chromium.org> > Commit-Queue: Dirk Pranke <dpranke@chromium.org> > Cr-Commit-Position: refs/heads/master@{#528858} TBR=dpranke@chromium.org,lukasza@chromium.org,jbudorick@chromium.org Change-Id: I27fa9abe8faf10390f074cf5fcb976575e2e8e4c No-Presubmit: true No-Tree-Checks: true No-Try: true Bug: 786554 Reviewed-on: https://chromium-review.googlesource.com/864022 Reviewed-by: Friedrich Horschig <fhorschig@chromium.org> Commit-Queue: Friedrich Horschig <fhorschig@chromium.org> Cr-Commit-Position: refs/heads/master@{#528928} [modify] https://crrev.com/948f1199acde5bcae335808d7624b143e4c9d4d8/testing/buildbot/chromium.linux.json [modify] https://crrev.com/948f1199acde5bcae335808d7624b143e4c9d4d8/testing/buildbot/test_suite_exceptions.pyl
,
Feb 15 2018
,
Feb 15 2018
|
||||||||
►
Sign in to add a comment |
||||||||
Comment 1 by lukasza@chromium.org
, Nov 17 2017Status: Assigned (was: Untriaged)