New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 649186 link

Starred by 2 users

Issue metadata

Status: WontFix
Owner:
Last visit > 30 days ago
Closed: Oct 2016
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 0
Type: Bug



Sign in to add a comment

PFQ: Build slaves are progressively slowing and timing out

Project Member Reported by steve...@chromium.org, Sep 21 2016

Issue description

From a recent PFQ lumpy build:

https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-chrome-pfq/builds/9140
InitSDK: +25 mins
BuildPackages: +37 mins

Looking through the logs of BuildPackages it is not clear why either stage took longer. The Load numbers looked fairly similar, but something was clearly slowing the builder down.

This has happened before. It is very frustrating. I am tempted to increase the master timeout since failing because we killed a slave is generally worse than delaying a PFQ build by an extra half an hour. Our margins are pretty thin right now.

 
Cc: ihf@chromium.org
Labels: -Pri-1 Pri-0
Owner: steve...@chromium.org
OK, this caused tricky, peppy, lumpy, and cyan to fail because they took too long.

I suspect that all builders may be taking longer than normal but most are just making it within the threshold. I am investigating that now.

Something strange is afoot with the buildslaves.

Starting with this build:

https://uberchromegw.corp.google.com/i/chromeos/builders/master-chromium-pfq/builds/3395

The following slaves got significantly slower (by about an hour):
x86-alex, tricky, falco, amd64-generic, peppy, cyan, lumpy

Starting with the following build, the remaining builders also slowed down by about an hour.

This can be visualized nicely but looking at the 'Slaves timeline' link from the Report section of the master-pfq (kudos to whoever added that!):

3394 (normal):
https://00e9e64baca863998ad3db9b7fd5839586cf9215a92cee211f-apidata.googleusercontent.com/download/storage/v1/b/chromeos-image-archive/o/master-chromium-pfq%2FR55-8822.0.0-rc2%2Ftimeline-slaves.html?qk=AD5uMEtRX1uw9qPeZIBIUKH7k0DZu7Hiji7K__t5lJu-WXnJGWZHEluga1hkZ8ckt4GHZFki4tPyYT0kaB-JRsRZmFsou44oENv9SvYHmatF4PI6FrAuWcDeZ5lteYRm2aNytUGBfdNl9auUdiaZqleRKzn4XlUX_lyozb2-j4MELLpPmVZl2RoV3Gh6haJfh5JGa4PVer88s9JXEW7NvuODIZhLAoo3wSgx4vY8Vk69xz-_3k6mPKSMXAKmrbTV9IPyAxh8flNsUXucv5a9jcQQYsUh41bIxCGSlLNXkOpNpSfkmCxXAuU002SX_i-FhMgyU5PH7t8CFjHmTwHL7weWgEIVN0Q72Avek4-p1Z41qZtqifpYzSWw0vASSSUTT9rimXBPyLrFHnjH8ods5Yuk84G9w91EjQhL-melXcpZU9HjHjDX9t1t2MOWGmFym3fcpCEy4bDNGWHwgplNgluKu9dU64JUzv1UQMflrd72dJhza9aEsmefrRK9w_QlQIlwZBsv4tDtn6H9SNzSwkJrMbNzZoQUQaN7S-0AKMu6loZXsH6FIL7uSmB_OkOLgUSKoeKCEHtyTssWcG9LFp1yWakM3c7Sv0JTb-ID3oQtVKYnBQRleKoq163bOD2YTvGMv9Y1qg6CS_QyVyAUuh4K-fyztRZD_aRzm7_fdiHMYBoKOQHjOTHzEboQyyWDwuzSoBWxM-925jfr6opEnP8bHSKs32P3Hbt3B4RLl_cuE1NkP5fGlK1nM20qRRUAuETTaLKKNOFHhTjdW7NreYuIp9Q5FomXU5Lc3vGe6JORToNYFa6v68cTrC1lpLknTGIbjY4PWJwHkXbgVnlSDmbfMrP4xpkb8Q

3394 (some slow):
https://00e9e64baca7f54b8a2052d9b8f59ce337968f57894e5d7247-apidata.googleusercontent.com/download/storage/v1/b/chromeos-image-archive/o/master-chromium-pfq%2FR55-8823.0.0-rc1%2Ftimeline-slaves.html?qk=AD5uMEsWy2uLIbzTgAVEqtd-xlwmTJYIXT0Gp5psKOpAKnRDXNoJ2FM2tiVmskSkkJsVfuZy_YHEXtdCaY7GBPba8napXGATtjUi_xFpNu3m3A54gxri4ba4xD2dBAJyW0oQnXRvpCDKHREiiRDc2_zvcppIiVSHDULCSuGL67HCM9afMgw4vFcVwTpJHTKYItaHDu53vvqe3OhmyDUd8V5d39pSnDVeTr6XzPxKBBQt1xKAxySZglO3X3kd4YjhGTE6XZEfIvuD0vgGdMxiPtdyWy2mY2CIuuY7ecNuRiosaWw2M14SjIAUVXGlnx45S2fojl_za5rd98-5K9ZguBJYztb2KLSrecTb8_iYIm3uhBJyd03kEzQaTQR651840l3FBtLA5ojiGKU5LoiR3bXw2N6YJIp4fWVq7QfZmoDHDe3s65YMRMBv3RuY1RxAbP1UUFbkh6WkM9AglJ_V2Nuh_pn9DnvglJJryDh5EN9TasbKK0oqhz2fINWTJ5ippJj212WhLwdQ4r-yB8tPGNkNEsLrexlI_7ISxCl2QKOaAq5Xr2XXfh9w1HjQKpoc8Jt3FbAsQQA5SSnrpYQO9IgoJzDb-Frs9fb3cwfrDXPzAXswZepBvgu4LGTEMXG6ABMlCTO-iMJbj741i_5SDhZDz_dW-Y-2MDOXesQwTHq7Eo4PuJjKpyikwvEJmLnVrxUiK1MK4PcfRYluwWRyhEwom5QdlftrIapkebnCrfoZ08eWAOo8M8TtG3twfnV1_zUbbwri2_2gr2QJk_9148f6g_nkdepOD-usccsJZem_osvPBVQwf5cY-lNmgedDWow8KPnQxaYmK70Kz-xeIrtNYmVLtQk9zA

3395 (all slow):
https://00e9e64bac40f12647aa504ce911d8ea9323b8d04e1e098f51-apidata.googleusercontent.com/download/storage/v1/b/chromeos-image-archive/o/master-chromium-pfq%2FR55-8824.0.0-rc2%2Ftimeline-slaves.html?qk=AD5uMEv2EVy75QiohfVfqjN0H0V554nyyIyctw-I6RcM2h3KB96qsmXUtqaa3xkpOPhT7GshShDdxATaVGeyx5t0JKnS87BGl8pcvXDt2Er-NpZYrsKgFB-3oPFguwWIlPACjjHFaLfoN5z28nLX73lgEhxQi1NdsxI_zBaMdQMhgkmEAqt3X1vx0JHqXLWcWHichPIIXophIku3oSdJWCtWkW-6YZpdOv43A7v3zESd811PVVmjn8Elfx-BB3k1c9uKYcS2afI5d5jLb5WKLvhtvNrmRjK1akmsR0ZH0AfE6ajw2TdWt0srd1KXtCCoSEvW3Rprw5XGhHRYRjQkgjpPlVNnCHGfaVy9zlXVgmm1hdm6P7KqsIxUmXseMOrWEt6BNrtUdk-Z4TLqxtdiZbjIbJDGNd9qEXyyL5IBJq-ogVbzZPzvIPDuZiOv4fJDz4Z9EpnXFAdrO3SwgetYUcBwzpXwMgzKEcEhXWDCqLh057DkHe7yXc4jBslE2TkN9UXM9_KOQS3A4Nqv5ixzRvfN68KrLwbsqDYUpmReibh1z7pVv8lgyCULHkvRuYnfjKESRVLBDzhuFAKfb02au5_Qn07I0kDVlWzLgMPCm16IN5grzcAomwq_WC-SWBStg9TS0asDrbwpCnxKqM-Q5mUgG38xuwIyza3sACt-W1uwCLoiRbAksqdIGwHJ-kc89oTRQ55QKtWFioYwEkpkNAdMC5fpOVauiZJDC7VNtctSYj3XNhhhW2PI0B_2G_c9ihtiYj3rGukgY7tJzabtqartsI6WLK1JECVsQ14LqmI-tmN3YwTMqPKWoQEH3cmhkm4c0E-2SDoORbADSrURZP-oRVmj-9pGcw

This behavior makes me suspect that something is slowing down the builders and this is not a cros or chrome change. (The fact that it also affects the InitSDK stage contributes to that suspiction).

Summary: PFQ: Build slaves are progressively slowing and timing out (was: PFQ: Sometimes slaves take too long to build)
Status: Sss (was: Untriaged)
Cc: shuqianz@chromium.org vpalatin@chromium.org apronin@chromium.org charliemooney@chromium.org
Owner: sbasi@chromium.org
Status: Assigned (was: Sss)
Assigning to sbasi@ since it seems unlikely that this is due to a chrome or chromeos change, but also +cros sheriffs in case this is also affecting other cros builds.



Building the Chrome package seems unusually slow (memory says that should be ! 45m, but that should be confirmed):

Completed chromeos-base/chromeos-chrome-55.0.2867.0_rc-r1 (in 83m35.2s)

Since it's the actual Chrome compile time, that could caused by a change in the Chrome code, or some bit of infrastructure Chrome uses to build (Goma? I'm ignorant here).

Comment 8 by sbasi@chromium.org, Sep 22 2016

Hey Steven, any idea if something changed on the Chrome build side?

That link in comment #1 took 83m35.2s to build Chrome.

Dgarrett is thinking this could be a change on their side: "That's compile time, so I'd look for something different in their build system.
Using Goma, not parallel.... or something."

In the peppy build for 3395 https://uberchromegw.corp.google.com/i/chromeos/builders/peppy-chrome-pfq/builds/2546/steps/BuildPackages/logs/stdio : 84m3.0s

Cyan: 81m42.6


Comment 9 by sbasi@chromium.org, Sep 22 2016

Cc: dpranke@chromium.org dgarr...@chromium.org
+dpranke,dgarrett
Oh... I checked a couple of recent lumpy-release build as a comparison:

Completed chromeos-base/chromeos-chrome-55.0.2866.0_rc-r1 (in 77m29.6s)

Completed chromeos-base/chromeos-chrome-55.0.2862.3_rc-r1 (in 78m16.5s)

That means 80 minutes isn't surprising.
My initial assumption was that it might be a chrome change, but that should
have affected all boards at the same time. Also I don't see how that would
affect InitSDK?
Cc: davidri...@chromium.org pho...@chromium.org
+davidriley, phobbs

Our new stats tools might be able to help narrow down when/where/what.
/google/data/ro/projects/speckle/mysql -S /cloudsql/cosmic-strategy-646:cidb-replica-gviz -u root

mysql> select builder_name, b.start_time, bs.status, bs.name, timediff(bs.finish_time, bs.start_time) from buildTable b join buildStageTable bs where builder_name like 'lumpy-chrome-pfq' and b.id = bs.build_id and bs.status != 'skipped' and bs.name = 'BuildPackages' and bs.last_updated > date_sub(current_date(), interval 14 day);
+------------------+---------------------+--------+---------------+-----------------------------------------+
| builder_name     | start_time          | status | name          | timediff(bs.finish_time, bs.start_time) |
+------------------+---------------------+--------+---------------+-----------------------------------------+
| lumpy-chrome-pfq | 2016-09-07 23:00:04 | pass   | BuildPackages | 01:37:28                                |
| lumpy-chrome-pfq | 2016-09-08 03:42:35 | pass   | BuildPackages | 01:37:55                                |
| lumpy-chrome-pfq | 2016-09-08 17:08:03 | fail   | BuildPackages | 00:02:32                                |
| lumpy-chrome-pfq | 2016-09-08 20:59:04 | fail   | BuildPackages | 00:02:31                                |
| lumpy-chrome-pfq | 2016-09-09 00:50:35 | pass   | BuildPackages | 01:35:51                                |
| lumpy-chrome-pfq | 2016-09-09 05:20:00 | pass   | BuildPackages | 01:35:47                                |
| lumpy-chrome-pfq | 2016-09-09 09:52:02 | pass   | BuildPackages | 01:35:28                                |
| lumpy-chrome-pfq | 2016-09-09 17:15:30 | pass   | BuildPackages | 01:35:43                                |
| lumpy-chrome-pfq | 2016-09-09 21:54:32 | pass   | BuildPackages | 01:35:29                                |
| lumpy-chrome-pfq | 2016-09-10 02:17:04 | pass   | BuildPackages | 01:26:20                                |
| lumpy-chrome-pfq | 2016-09-10 05:56:27 | pass   | BuildPackages | 01:36:00                                |
| lumpy-chrome-pfq | 2016-09-11 00:09:00 | pass   | BuildPackages | 01:35:59                                |
| lumpy-chrome-pfq | 2016-09-11 04:54:33 | pass   | BuildPackages | 01:36:09                                |
| lumpy-chrome-pfq | 2016-09-12 03:24:05 | pass   | BuildPackages | 01:36:06                                |
| lumpy-chrome-pfq | 2016-09-12 07:50:05 | pass   | BuildPackages | 01:36:11                                |
| lumpy-chrome-pfq | 2016-09-13 03:25:06 | pass   | BuildPackages | 01:39:05                                |
| lumpy-chrome-pfq | 2016-09-13 08:10:40 | pass   | BuildPackages | 01:39:14                                |
| lumpy-chrome-pfq | 2016-09-13 22:20:07 | pass   | BuildPackages | 01:39:50                                |
| lumpy-chrome-pfq | 2016-09-14 02:35:36 | pass   | BuildPackages | 01:38:03                                |
| lumpy-chrome-pfq | 2016-09-14 06:57:04 | pass   | BuildPackages | 01:40:42                                |
| lumpy-chrome-pfq | 2016-09-14 16:36:06 | pass   | BuildPackages | 01:38:57                                |
| lumpy-chrome-pfq | 2016-09-14 23:30:42 | pass   | BuildPackages | 01:44:08                                |
| lumpy-chrome-pfq | 2016-09-15 04:16:40 | pass   | BuildPackages | 01:44:38                                |
| lumpy-chrome-pfq | 2016-09-15 17:29:35 | pass   | BuildPackages | 01:43:58                                |
| lumpy-chrome-pfq | 2016-09-16 00:11:12 | pass   | BuildPackages | 01:39:21                                |
| lumpy-chrome-pfq | 2016-09-16 04:21:36 | pass   | BuildPackages | 01:39:57                                |
| lumpy-chrome-pfq | 2016-09-16 21:44:09 | pass   | BuildPackages | 01:39:11                                |
| lumpy-chrome-pfq | 2016-09-17 02:11:09 | pass   | BuildPackages | 01:37:59                                |
| lumpy-chrome-pfq | 2016-09-17 06:08:06 | pass   | BuildPackages | 01:39:56                                |
| lumpy-chrome-pfq | 2016-09-18 00:08:59 | pass   | BuildPackages | 01:39:54                                |
| lumpy-chrome-pfq | 2016-09-18 04:54:02 | pass   | BuildPackages | 01:39:54                                |
| lumpy-chrome-pfq | 2016-09-19 00:08:29 | pass   | BuildPackages | 01:39:40                                |
| lumpy-chrome-pfq | 2016-09-19 04:48:32 | pass   | BuildPackages | 01:38:59                                |
| lumpy-chrome-pfq | 2016-09-20 00:13:09 | pass   | BuildPackages | 01:39:13                                |
| lumpy-chrome-pfq | 2016-09-20 04:10:04 | pass   | BuildPackages | 01:40:35                                |
| lumpy-chrome-pfq | 2016-09-20 16:05:31 | pass   | BuildPackages | 01:40:16                                |
| lumpy-chrome-pfq | 2016-09-21 03:27:03 | pass   | BuildPackages | 01:42:05                                |
| lumpy-chrome-pfq | 2016-09-21 08:07:08 | pass   | BuildPackages | 01:40:03                                |
| lumpy-chrome-pfq | 2016-09-21 17:26:05 | pass   | BuildPackages | 02:17:07                                |
| lumpy-chrome-pfq | 2016-09-22 00:09:39 | pass   | BuildPackages | 02:17:01                                |
| lumpy-chrome-pfq | 2016-09-22 04:56:04 | pass   | BuildPackages | 02:17:29                                |
+------------------+---------------------+--------+---------------+-----------------------------------------+

Via Monarch: http://shortn/_aVDA81yD0s
Cc: vapier@chromium.org
this is due to this change:
  https://chromium-review.googlesource.com/386970

it's causing a lot of rebuilds due to metadata now being in there that wasn't before.  it should go away for the sdk once the next sdk bot run finishes with that CL and publishes an updated sdk.  it should go away for the board once we get updated binpkgs for target boards (so i guess CQ run finishes).

you can tell when it's the problem because you'll see this in the package line:
  ABI_X86="(64*) -32 (-x32)"
that * means the USE flag is new and portage needs to rebuild it.

unfortunately, this change is something we can't avoid.  it had to land at some point in order to stay in sync with Gentoo and to better support multilib.  the longer we waited the worse it got :(.  if we had pulled it in much sooner, it wouldn't have been nearly as painful.
looks like the chromiumos-sdk bot has recently finished a build w/the CL:
  https://uberchromegw.corp.google.com/i/chromiumos/builders/chromiumos-sdk/builds/7467

so hopefully the InitSDK times should go back down for all new builds.  BuildPackages times will go down once the CQ finishes and publishes new prebuilts.
Project Member

Comment 17 by sheriffbot@chromium.org, Oct 6 2016

Pri-0 bugs are critical regressions or serious emergencies, and this bug has not been updated in three days. Could you please provide an update, or adjust the priority to a more appropriate level if applicable?

If a fix is in active development, please set the status to Started.

Thanks for your time! To disable nags, add the Disable-Nags label.

For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
Status: WontFix (was: Assigned)
This was expected failure related to the toolchain update.

Sign in to add a comment