New issue
Advanced search Search tips

Issue 752909 link

Starred by 1 user

Issue metadata

Status: Fixed
Owner:
Closed: Aug 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: ----



Sign in to add a comment

Gerrit CQ does not reuse Gerrit triggered jobs on LUCI.

Project Member Reported by serg...@chromium.org, Aug 7 2017

Issue description

Previously on Rietveld, CQ used to reuse presubmit if it was still running when CQ has started. However, in Gerrit we seem to be triggering new presubmit regardless.

Not sure if it's WAI, but decided to file this bug anyway, since we may be wasting some resources and developer time.
 
Owner: serg...@chromium.org
Status: Assigned (was: Available)
Proof of CL/PS attempts log OR it didn't happen :)
Cc: -tandrii@chromium.org serg...@chromium.org
Owner: tandrii@chromium.org
CL/PS: https://chromium-cq-status.appspot.com/v2/patch-status/chromium-review.googlesource.com/583148/11. I've triggered a presubmit just before clicking on "Submit to CQ" and still CQ has scheduled anyone one.

From raw JSON logs on status app (verifier_trigger event), CQ triggered presubmit at 5:25:03.135 PM UTC: https://luci-milo.appspot.com/swarming/task/37d68c07824b3310. The other build has started running at   2017-08-07 5:24 PM UTC and finished at 2017-08-07 5:26 PM UTC: https://luci-milo.appspot.com/swarming/task/37d68b7d11110910. So it appears it was running CQ scheduled another one.
OK, let's see, your CL has 4 presubmits attached:

1. https://luci-milo.appspot.com/swarming/task/37d63ffa1677eb10  CQ dry run
2. https://luci-milo.appspot.com/swarming/task/37d68b7d11110910  Gerrit UI
3. https://luci-milo.appspot.com/swarming/task/37d68c07824b3310  CQ full run
4. https://luci-milo.appspot.com/swarming/task/37d6914f82793510  CQ full run

The first bug I see is that 2 wasn't recognized by CQ because "master" filed was missing. I think it's time for CQ to ignore that.

However, I don't yet see why 3 wasn't re-used and 4 was triggered instead, which is what your bug is about.
Actually, I disagree with you that it worked differently in Rietveld. It didn't, at least that's how I read this code:

https://chrome-internal.googlesource.com/infra/infra_internal/+/3c3a6cc/infra_internal/services/cq/verification/try_job_utils.py#1006

  
   ...
    # Remove presubmit builder successes from previous attempts. See
    #  https://crbug.com/481074  for more details.
    for fresh_build ∈ fresh_builds[:]:
      if (base.is_presubmit_builder(fresh_build['builder'])
          ∧ fresh_build['job_state'].state ≡ Job.SUCCEEDED
          ∧ fresh_build['created_ts'] < start_time):
        fresh_builds.remove(fresh_build)

Hence, what you see is totally WAI.

Note, that while old presubmit is running (!=Job.Succeeded) CQ will be happily "using" re-using it, but as soon as that presubmit succeeds, CQ will need to re-schedule it. So, that condition on "Job.Succeeded" is 100% pointless.

Even more, I would say that it is in fact broken as intended. The 
 https://crbug.com/481074  issue was that CQ wasn't able to distinguish dry from non dry run, which was "fixed" by ignoring older presubmit runs. Since I fixed dry run properly last year, we can don't need this weirdness any more.
Summary: Gerrit CQ does not reuse Gerrit triggered Presubmit jobs on LUCI. (was: Gerrit CQ does not reuse running "presubmit")
That said, I actually think it is beneficial to not re-use 23h stale presubmit in chromium CQ and other CQs, particularly until we move away OWNER checks out of presubmit and into Gerrit plugin.

So, here are two CLs:
 Fix first bug I found:
  https://chrome-internal-review.googlesource.com/425810 
 Simplify presubmit skipping.
  https://chrome-internal-review.googlesource.com/426028
Thanks for the analysis. I though it worked like that in Rietveld, but looks like I misremembered. Maybe it worked like that for non-presumit builds, which would make sense since builds in Chromium CQ can take up to an hour.
Project Member

Comment 7 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal/+/6aef94cc035a58c448078ceed4e023905c3091df

commit 6aef94cc035a58c448078ceed4e023905c3091df
Author: Andrii Shyshkalov <tandrii@chromium.org>
Date: Tue Aug 08 07:48:04 2017

Status: Fixed (was: Assigned)
Project Member

Comment 9 by bugdroid1@chromium.org, Aug 8 2017

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/infra/infra_internal/+/99317a88c66dd05c03610e698e768c7517f203f0

commit 99317a88c66dd05c03610e698e768c7517f203f0
Author: Andrii Shyshkalov <tandrii@chromium.org>
Date: Tue Aug 08 13:41:57 2017

Summary: Gerrit CQ does not reuse Gerrit triggered jobs on LUCI. (was: Gerrit CQ does not reuse Gerrit triggered Presubmit jobs on LUCI.)

Sign in to add a comment