Issue 905313

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Investigate increase in median CQ duration

Project Member Reported by erikc...@chromium.org, Nov 14

Issue description

Over the last two months, there's been a steady increase in median CQ duration [of committed CQ attempts]. 

Potential causes: 
  * 'retry with patch' landed ~2 months ago.
 
If 'retry with patch' were solely responsible, I would expect a sudden spike in median CQ duration. 

Needs more investigation.
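
One way to check the "sudden spike vs. steady increase" reasoning above is to compare how well a step model and a straight-line model fit the same series of medians. The following is only a rough sketch under stated assumptions: the function names and the numbers in the usage note are hypothetical, and the index at which 'retry with patch' landed would have to be supplied by whoever runs it.

```python
from statistics import mean


def step_fit_error(values, change_idx):
    """Sum of squared residuals for a two-level step model.

    The model uses one constant level before change_idx and another
    from change_idx onward. change_idx must satisfy
    0 < change_idx < len(values) so that both sides are non-empty.
    """
    before, after = values[:change_idx], values[change_idx:]
    before_level, after_level = mean(before), mean(after)
    return (sum((v - before_level) ** 2 for v in before)
            + sum((v - after_level) ** 2 for v in after))


def ramp_fit_error(values):
    """Sum of squared residuals for a least-squares straight line.

    The x coordinate is simply the position in the series (0, 1, 2, ...).
    Needs at least two points.
    """
    xs = range(len(values))
    x_bar, y_bar = mean(xs), mean(values)
    slope = (sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, values))
             / sum((x - x_bar) ** 2 for x in xs))
    return sum((y - (y_bar + slope * (x - x_bar))) ** 2
               for x, y in zip(xs, values))
```

For example, with made-up weekly medians in minutes, `step_fit_error([60, 60, 60, 60, 90, 90, 90, 90], 4)` is zero while `ramp_fit_error` on the same list is not; for a steadily rising series the comparison goes the other way. Both models have two free parameters, so comparing their raw errors is a reasonable first pass, not a rigorous test.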
 
Screen Shot 2018-11-13 at 2.43.09 PM.png
Cc: tikuta@chromium.org
I recently created a dashboard, go/cq-per-build-stats, to see for myself which builders and which steps consume the most cycle time.

Also, I think we want to look at weekly (or monthly) stats rather than daily stats to see the long-term trend.
https://datastudio.google.com/c/u/0/reporting/1mciriXfm5rfdGvUM4GYteI31XZCSIaNA/page/X7Vc
(screenshot attached)

In any case, to investigate further, we need to drill down into the CQ.
Uv0M5ihyJrP.png
Components: -Infra Infra>Client>Chrome
Interesting. tikuta@, the graph you show implies that there's been no major change to weekly CQ attempt duration over the last few months. However, go/cq-slo-dash shows a clear upward trend. I wonder if this is due to a difference in the queries being used? [e.g. committed CLs vs all CLs]
There is no difference between the queries other than the timestamp truncation unit.


This is the query that produces the graph in the Data Studio dashboard.
```
SELECT
  TIMESTAMP_TRUNC(TIMESTAMP_MILLIS(CAST(attempt_start_msec AS int64)), WEEK) AS timestamp,
  APPROX_QUANTILES(click_to_patch_committed_sec, 100)[OFFSET(50)] AS p50_attempt_duration,
  APPROX_QUANTILES(click_to_patch_committed_sec, 100)[OFFSET(90)] AS p90_attempt_duration
FROM
  `chrome-infra-events.aggregated.cq_attempts`
WHERE
  REGEXP_CONTAINS(cq_name, r"chromium/chromium/src")
  AND committed
  AND NOT custom_trybots
GROUP BY
  timestamp
ORDER BY
  timestamp
```

This is the query that produces the graph in Viceroy, from http://shortn/_c5rH3lLAWJ
```
SELECT
  TIMESTAMP_TRUNC(TIMESTAMP_MILLIS(cast(attempt_start_msec as int64)), DAY) AS timestamp,
  APPROX_QUANTILES(click_to_patch_committed_sec, 100)[OFFSET({{ percentile }})] as attempt_duration
FROM `chrome-infra-events.aggregated.cq_attempts`
WHERE REGEXP_CONTAINS(cq_name, r"chromium/chromium/src") AND committed AND NOT custom_trybots
GROUP BY timestamp
ORDER BY timestamp
```

I think the weekly stats also show an upward trend.
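
To make the effect of the truncation unit concrete, here is a minimal Python sketch (not the dashboard code) that buckets attempts by day or by week and takes the median of each bucket. The function names and the sample data in the usage note are hypothetical. It computes an exact median rather than APPROX_QUANTILES, and it assumes weeks start on Sunday, as with BigQuery's TIMESTAMP_TRUNC(..., WEEK).

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone
from statistics import median


def bucket_start(ts, unit):
    """Truncate a UTC datetime to the start of its day or week.

    Weeks are assumed to start on Sunday.
    """
    day = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    if unit == "DAY":
        return day
    if unit == "WEEK":
        # weekday() is Monday=0 ... Sunday=6, so Sunday maps to an offset of 0.
        return day - timedelta(days=(day.weekday() + 1) % 7)
    raise ValueError(f"unsupported unit: {unit}")


def median_by_bucket(attempts, unit):
    """Return {bucket start: median duration}, ordered by bucket start.

    attempts is an iterable of (attempt_start_msec, duration_sec) pairs,
    mirroring the two columns used by the queries above.
    """
    buckets = defaultdict(list)
    for start_msec, duration_sec in attempts:
        ts = datetime.fromtimestamp(start_msec / 1000, tz=timezone.utc)
        buckets[bucket_start(ts, unit)].append(duration_sec)
    return {start: median(values) for start, values in sorted(buckets.items())}
```

Calling `median_by_bucket(attempts, "DAY")` and `median_by_bucket(attempts, "WEEK")` on the same list shows how the weekly view pools more attempts per point, which smooths day-to-day noise but does not change which attempts are counted.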
Thanks for posting the queries. I agree, there's an upward trend, though there is enough noise that we can't tell when it started.
