Loading a builder page takes several seconds
Issue description

I've noticed that loading a builder page on Milo can take a few seconds, and at times much longer. I tried creating a trace report on the cloud project: https://pantheon.corp.google.com/traces/tasks/9015641381937676754?project=luci-milo The 90th percentile looks really bad. That trace report also doesn't match my personal experience, although I could be biased by the few times it's very slow. Anecdotally, I've tried loading that page (sometimes with limit=200, to look at older builds) and had to wait something like 10 seconds, which has been very annoying. Is there any sort of SLO on page load times for a builder page, or for an individual build page?
,
Oct 9
+Buildbucket The builder page just makes 4 calls to buildbucket, so it's just a matter of optimizing those search queries.
,
Oct 10
From the report above, the 90th-percentile examples show that the Buildbucket query finished in under 1s; the rest of the time is spent talking to LogDog. Why does it talk to LogDog? Buildbucket returns everything the handler needs to render the page. The Buildbucket RPC can also be improved: currently it returns things the handler does not need, such as properties. The v2 API allows specifying the fields that the client needs. Bug 848960 is about switching to the v2 API.
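To illustrate what "specifying the fields that the client needs" buys, here is a minimal sketch of field-mask style pruning over a plain dictionary. It is not the Buildbucket v2 client: `apply_field_mask`, the dotted path syntax, and the sample build record are all assumptions made up for illustration.

```python
def apply_field_mask(message, paths):
    """Return a copy of `message` holding only the dotted field `paths`."""
    pruned = {}
    for path in paths:
        keys = path.split(".")
        # Walk the source first so a missing path leaves no empty stub behind.
        value = message
        found = True
        for key in keys:
            if isinstance(value, dict) and key in value:
                value = value[key]
            else:
                found = False
                break
        if not found:
            continue
        # Recreate the parent dictionaries, then copy the requested leaf.
        target = pruned
        for key in keys[:-1]:
            target = target.setdefault(key, {})
        target[keys[-1]] = value
    return pruned


# Hypothetical build record; "properties" stands in for the large payload
# that a builder page does not need.
build = {
    "id": 1,
    "status": "SUCCESS",
    "builder": {"project": "chromium", "builder": "linux_chromium_rel_ng"},
    "properties": {"blob": "x" * 10000},
}
slim = apply_field_mask(build, ["id", "status", "builder.builder"])
```

With a mask like this the response keeps `id`, `status` and the builder name, and the `properties` payload is never serialized or sent.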
,
Oct 10
Actually, the report is incorrect. It includes requests to builds, which is why we see Milo talking to LogDog. It appears that the 90th percentile for '/p/chromium/builders/luci.chromium.try/linux_chromium_rel_ng' is 4.16 sec https://pantheon.corp.google.com/bigquery?project=luci-milo&folder&organizationId=433637338589&queryFilter=%255B%257B_22k_22_3A_22Query%2520text_22_2C_22t_22_3A15_2C_22v_22_3A_22_5C_22milo_5C_22_22%257D%255D&j=bquxjob_3eaa45f5_1665e780728&page=queryresults Also, it appears that on average Buildbucket accounts for 30% of that latency https://pantheon.corp.google.com/bigquery?project=luci-milo&folder&organizationId=433637338589&j=bquxjob_27e67c11_1665e8b525b&page=queryresults I still recommend switching to v2. It's unlikely that v1 will be optimized.
,
Oct 10
Note that in addition to the Buildbucket RPCs, the Milo builder view makes O(CL) requests to Gerrit (one per CL) to load the emails of the CL authors.
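A rough sketch of why that cost scales with the number of CLs, and why it falls mostly on the first visitor if results are memoized per CL. Everything here is hypothetical: `AuthorEmailCache` and `fake_gerrit_lookup` are illustrative names, and the lookup is a stub, not a real Gerrit call.

```python
class AuthorEmailCache:
    """Memoizes CL -> author email lookups and counts remote calls."""

    def __init__(self, fetch_email):
        # fetch_email is a hypothetical callable standing in for one
        # Gerrit request per CL.
        self._fetch_email = fetch_email
        self._emails = {}
        self.remote_calls = 0

    def get(self, cl_number):
        # Only an uncached CL costs a remote round trip.
        if cl_number not in self._emails:
            self.remote_calls += 1
            self._emails[cl_number] = self._fetch_email(cl_number)
        return self._emails[cl_number]

    def emails_for(self, cl_numbers):
        return [self.get(cl) for cl in cl_numbers]


def fake_gerrit_lookup(cl_number):
    # Placeholder for a network round trip; returns a made-up address.
    return f"author-{cl_number}@example.com"


cache = AuthorEmailCache(fake_gerrit_lookup)
cache.emails_for([101, 102, 103])        # cold: one remote call per CL
cold_calls = cache.remote_calls
cache.emails_for([101, 102, 103, 104])   # warm: only CL 104 is fetched
warm_calls = cache.remote_calls - cold_calls
```

A page listing N builds with distinct CLs pays N round trips when cold and close to none when warm, which is consistent with load times that vary widely between visits.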
,
Oct 10
Re #4: I can't reproduce the query's results from the logs. Something seems off. For example: Query: https://screenshot.googleplex.com/wBwVN4xi5UH Log: https://screenshot.googleplex.com/OCpJH5xG3xB 1.8/2.2 ≈ 0.82, but BQ shows 0.098
,
Oct 10
I see, it's because the logs print ms for lower-latency entries and s for higher-latency entries. From manually sampling the logs at random, slow builder page loads fall into two cases:
1. Buildbucket taking longer. This is what the blocked-on bug is about.
2. Fetching Gerrit CLs taking longer. This happens in the non-cached case, for the unlucky person who hits the endpoint when there are a lot of uncached entries.
We could use a cron cache warmer to iterate through all of the known builder pages and keep the caches warm.
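On the unit mix-up at the start of this comment: one way to avoid it is to normalize every latency to seconds before computing ratios. A minimal sketch, assuming the log values look like "850ms" or "2.2s" (the exact log format is an assumption, and `latency_seconds` is a hypothetical helper):

```python
import re

# Divisors that convert a value in the given unit to seconds.
_UNIT_DIVISOR = {"ms": 1000.0, "s": 1.0}
_LATENCY_RE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)\s*$")


def latency_seconds(text):
    """Parse a latency such as '850ms' or '2.2s' and return seconds."""
    match = _LATENCY_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized latency: {text!r}")
    value, unit = match.groups()
    return float(value) / _UNIT_DIVISOR[unit]


# Share of a request spent in one backend, with both values in seconds.
share = latency_seconds("1.8s") / latency_seconds("2.2s")

# Mixed units compare correctly once normalized.
fast_is_shorter = latency_seconds("850ms") < latency_seconds("2.2s")
```

Treating the raw numbers as if they shared a unit would make an 850 ms entry look far slower than a 2.2 s one, so any ratio or percentile computed over unnormalized values is unreliable.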
Comment 1 by martiniss@chromium.org
, Oct 9