Issue 893727

Issue metadata

Status: Unconfirmed
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug

Blocked on:
issue 848960



Loading a builder page takes several seconds

Project Member Reported by martiniss@chromium.org, Oct 9

Issue description

I've noticed that loading a builder page on milo can take a few seconds, and at times a very long time.

I tried creating a trace report in the cloud project: https://pantheon.corp.google.com/traces/tasks/9015641381937676754?project=luci-milo

The 90th percentile looks really bad. That trace report also doesn't match my personal experience, although I could be very biased by the few times it's very slow.

Anecdotally, I've tried loading that page (sometimes with limit=200, to look at older builds) and had to wait for something like 10 seconds, which has been very annoying.

Is there any sort of SLO for page load times on a builder page, or on an individual build page?
 
Cc: estaab@chromium.org
Components: Infra>Platform>Buildbucket
+Buildbucket

The builder page just makes 4 calls to buildbucket, so it's just a matter of optimizing those search queries.
Blockedon: 848960
Components: -Infra>Platform>Buildbucket
From the report above, the 90th-percentile examples show that the buildbucket query finished in under 1s; the rest of the time is spent talking to logdog.

Why does it talk to logdog? Buildbucket returns everything the handler needs to render the page.

The buildbucket RPC can also be improved. Currently it returns things that the handler does not need, such as properties. The v2 API allows specifying the fields that the client needs. Bug 848960 is about switching to the v2 API.
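
To make the "only return the fields the client needs" idea concrete, here is a minimal sketch in plain Python. It is not the Buildbucket v2 API; `select_fields` and the sample record are hypothetical and only illustrate how trimming a response to a requested set of dotted field paths drops bulky members such as properties.

```python
from typing import Any, Dict, Iterable


def select_fields(record: Dict[str, Any], paths: Iterable[str]) -> Dict[str, Any]:
    """Return a copy of `record` containing only the requested dotted paths.

    Hypothetical helper for illustration; paths that do not exist are skipped.
    """
    result: Dict[str, Any] = {}
    for path in paths:
        keys = path.split(".")
        src: Any = record
        found = True
        # Walk down the source record along the dotted path.
        for key in keys:
            if isinstance(src, dict) and key in src:
                src = src[key]
            else:
                found = False
                break
        if not found:
            continue
        # Recreate the same nesting in the trimmed result.
        dst = result
        for key in keys[:-1]:
            dst = dst.setdefault(key, {})
        dst[keys[-1]] = src
    return result


# Made-up build record: the large "properties" member is not needed by the page.
build = {
    "id": 1,
    "status": "SUCCESS",
    "builder": {"bucket": "try", "name": "linux"},
    "properties": {"large_blob": "x" * 1000},
}
trimmed = select_fields(build, ["id", "status", "builder.name"])
```

With a mask like this, the handler would receive only the id, status, and builder name, so less data has to be serialized and transferred per build.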
Actually, the report is incorrect. It includes requests to build pages, which is why we see milo talking to logdog.

It appears that the 90th percentile for '/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng' is 4.16 sec:
https://pantheon.corp.google.com/bigquery?project=luci-milo&folder&organizationId=433637338589&queryFilter=%255B%257B_22k_22_3A_22Query%2520text_22_2C_22t_22_3A15_2C_22v_22_3A_22_5C_22milo_5C_22_22%257D%255D&j=bquxjob_3eaa45f5_1665e780728&page=queryresults

It also appears that, on average, buildbucket accounts for 30% of that latency:

https://pantheon.corp.google.com/bigquery?project=luci-milo&folder&organizationId=433637338589&j=bquxjob_27e67c11_1665e8b525b&page=queryresults

I still recommend switching to v2. It's unlikely that v1 will be optimized.
Note that in addition to the buildbucket RPCs, the milo builder view makes O(CL) requests to gerrit to load the emails of the CL authors.
Re #4: I can't repro the results from the query in the logs. Something seems off. For example:

Query: https://screenshot.googleplex.com/wBwVN4xi5UH
Log: https://screenshot.googleplex.com/OCpJH5xG3xB

1.8/2.2 = 0.81, but BQ shows 0.098
I see, it's because the logs print ms for lower-latency entries and s for higher-latency entries.
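
One way to avoid this kind of mismatch is to normalize every latency value to a single unit before computing ratios. A minimal sketch, assuming the log values look like "350ms" or "1.8s" (the exact log format is an assumption, and `latency_seconds` and `share` are hypothetical helpers):

```python
import re

# Assumed format: a decimal number followed by "ms" or "s", e.g. "350ms", "1.8s".
_LATENCY_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)\s*$")


def latency_seconds(text: str) -> float:
    """Convert a latency string such as '350ms' or '1.8s' to seconds."""
    match = _LATENCY_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized latency value: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value


def share(part: str, total: str) -> float:
    """Fraction of `total` latency spent in `part`, after unit normalization."""
    return latency_seconds(part) / latency_seconds(total)


# The example from the log: 1.8s of buildbucket time out of 2.2s total.
ratio = share("1.8s", "2.2s")
```

Stripping the suffix and dividing the raw numbers would give a ratio that is off by a factor of 1000 whenever one value is logged in ms and the other in s.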

From manually sampling the logs at random, slow builder page loads fall into two cases:
1. Buildbucket taking longer
2. Getting Gerrit CLs taking longer


1. This is what the blocked-on bug (issue 848960) is about.
2. This happens in the non-cached case, when an unlucky user hits the endpoint while a lot of entries are uncached. We could use a cron cache warmer that iterates through all of the known builder pages and keeps the caches warm.
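
A rough sketch of what such a cache warmer could look like, written in Python purely for illustration. `warm_builder_caches` and the `load_page_data` callback are hypothetical names; the real job would call whatever code path populates the builder page caches, on a cron schedule.

```python
from typing import Callable, Dict, Iterable


def warm_builder_caches(
    builder_ids: Iterable[str],
    load_page_data: Callable[[str], object],
) -> Dict[str, str]:
    """Load each builder page's data once so later user requests hit a warm cache.

    `load_page_data` is a stand-in for the code that fetches and caches the
    data behind a builder page. Returns a per-builder status string.
    """
    results: Dict[str, str] = {}
    for builder_id in builder_ids:
        try:
            load_page_data(builder_id)
            results[builder_id] = "ok"
        except Exception as err:  # one failing builder must not stop the run
            results[builder_id] = f"error: {err}"
    return results


# Usage with a fake loader that records which builders were warmed.
warmed = []
status = warm_builder_caches(
    ["luci.chromium.try/linux_chromium_rel_ng"],
    warmed.append,
)
```

Errors are collected per builder instead of aborting the run, so a single broken builder does not leave the rest of the caches cold.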