New issue
Advanced search Search tips

Issue 686471 link

Starred by 7 users

Issue metadata

Status: Fixed
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug

Blocked on:
issue 702849

Blocking:
issue 707006



Sign in to add a comment

[WPT Export] Export is subject to head-of-line blocking

Project Member Reported by jeffcarp@chromium.org, Jan 28 2017

Issue description

The current in-flight PR has failed one of the Travis CI checks:
https://github.com/w3c/web-platform-tests/pull/4645

This means that we cannot continue the export process until the PR passes the Travis CI checks. But if a PR can't pass the Travis CI checks, it can never be exported. This means that if a Chromium commit contains changes that won't pass the Travis CI checks, it will block the export process indefinitely.

Options for how to resolve this:

1. Modify the PR out-of-band to get it to pass and then be landed.
2. Skip the Travis CI checks by committing directly to master.
 
Skipping checks seems dangerous, so we'll probably want to avoid that.

Modifying the PR to get it to pass seems OK as long as the change is minor and the author knows about it, but this doesn't seem very sustainable in the long-term.

In the long term, if an "exportable" commit is landed and there's actually something wrong with the commit that prevents it from being committed to wpt, then it seems like one course of action would be:

 - (Auto-)revert the Chromium commit, with explanation.
 - Change the export script to ignore commits that have reverts ( bug 679951 )

Does that seem like the right thing to do? Are there other choices?
Building in auto-reverting and revert-skipping sounds like a good option for keeping things moving, however it would create an unpleasant experience for Blink devs since it could significantly lengthen the feedback loop on their WPT changes. Any WPT CI failure would result in a revert, meaning that for every WPT change they'd have to wait for both the Chromium CQ (1-2hrs) and the WPT CI (1hr) to run.
More info: the current failure looks like it's due to an infra failure (even though the CI has been re-run once):

https://travis-ci.org/w3c/web-platform-tests/jobs/196075279
u'log' (u'warning', {'message': 'Connecting to Selenium failed:\nTraceback (most recent call last):\n  File "/home/travis/virtualenv/python2.7.12/lib/python2.7/site-packages/wptrunner/executors/executorselenium.py", line 59, in setup
...

We run any new WPT tests as part of the CQ, right? So theoretically if tests pass on the CQ they should also pass in the upstream WPT CI. Unless the change passes in Chrome but fails in Firefox.

As another data point, the current Firefox export process commits directly to master, I believe.
What do the wpt Travis checks check, anyway? It looks like there's a lint check, a "ci_built_diff.sh" check, and a CI stability check (checking for flaky tests). It seems like these checks don't completely overlap with our CQ checks.

We'll probably want to check with jgraham (and others?) to see if it's OK to commit directly to master; it would be annoying if an auto-exported Chromium commit failed a lint check upstream, since that would disrupt other wpt contributors. Maybe if our CQ does the same lint check, then that's better?


If the Travis CI checks are very useful and reliable, I think it would be OK to require them, and auto-revert if they don't pass. But if they're flaky, then auto-revert would be annoying and cause trouble.

Meanwhile, if the Travis CI checks are sometimes flaky but sometimes useful, then it would be nice to be able to react quickly to either direct-commit (in the case of flaky failure), or revert the Chromium commit (if there's something wrong with the Chromium commit). What do you think?
Blocking: -657117 684767
Labels: -Pri-0 Pri-1
Making this priority 0->1 since it is no longer an emergency. 

Our short term plan of action for dealing with Travis CI failures is to reach out to a maintainer for a manual force merge. 

Our long term plan is to integrate the WPT CI into Chromium CI so that commits that would have been stuck after passing the Chromium CQ are prevented from landing until resolved.

Comment 7 by foolip@chromium.org, Feb 21 2017

I believe we have two root issues to tackle here:
 - Travis checks more things than Chromium's CQ, which are genuine problems
 - Flakiness of the Travis setup itself, leading to false rejections

I think the main options are:
 - Run all of the same checks as an integrated part of Chromium CQ, and bypass Travis entirely for the WPT PR. We would have to link back to some log for the stability results.
 - Create a WPT PR while Chromium code review is still in progress, so that we know whether Travis will pass. When it doesn't and the cause is flakiness, have some NOTRAVIS=true option to bypass it.

Neither is trivial, but the second is a bit more like what is discussed in https://bugzilla.mozilla.org/show_bug.cgi?id=1336422

Thoughts?
I think that second sounds a bit better, because:

 - It seems like it would be difficult to make all of exactly the same checks (including the flakiness check on multiple browsers) run as part of Chromium CQ
 - But, before the original Chromium commit is made, we want to have feedback from the stability check run on Travis -- occasionally, this may find problems with tests that can be fixed before the original commit is made.
Quick thought:

The second option would involve setting up some system where uploading a new CL in Chromium triggers a new PR to be made, and uploading a new patch set triggers a new commit to be made on that PR. Not sure whether this would be best done via a post-upload hook, or what else...

AFAIK making a new PR or new commit triggers a new Travis job, and we would also want Travis job results to be posted on the Chromium issue if possible...
Cc: jeffcarp@chromium.org
 Issue 684767  has been merged into this issue.
Blocking: -684767 657117
Blockedon: 700092
Blockedon: -700092
Blockedon: 702849
Blocking: 707006
Blocking: -657117
Status: Fixed (was: Assigned)
https://chromium-review.googlesource.com/c/457023/ landed and has been running smoothly so far this week. I'm keeping an eye out for conflicts and anomalies but I think we can safely close this bug.
Components: Blink>Infra>Ecosystem
Components: -Blink>Infra>Predictability

Sign in to add a comment