Status: Fixed
Owner:
Closed: Nov 28
Cc:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug



falco-release times out at BuildPackages
Project Member Reported by laszio@chromium.org, Nov 22
https://luci-milo.appspot.com/buildbot/chromeos/falco-release/3028

The build ran for ~7 hours before being killed. It looks to me like this is a problem uncovered by the chrome experiment we are doing [1]. I have a few successful tryjobs which took 14-15 hours in BuildPackages.

The build log is much larger than that of a successful build. It seems that the build process is restarted repeatedly. For example, I saw this 10 times:
> chromeos-chrome-64.0.3274.0_rc-r1: [13557/41631] CXX host/obj/base/base/process_handle_linux.o

I cannot reproduce it locally. However, I observed heavy load on my system when building; a sampled load showed more than 200 jobs, and the build ate up all the memory.

[1] We use different AutoFDO profiles on falco-release and peppy-release. Therefore, they may not have a prebuilt chrome package and may need to build it from source. It is strange that this started to show up after I updated a profile for falco manually. The tryjob I used to test the CL finished in ~1h. peppy-release is also fine. I'm trying to see if reverting the profile hides the problem.
 
Cc: gmx@chromium.org
Gabriel, I may need to revert the profile if the tryjob passed.
Update: the repeated log seems to be printed every 60m, so it is unlikely that the build was restarted repeatedly.

Another observation: the profile is ~4x larger than previous ones. I'm trying to measure the time spent in each process.
Since the previous tryjob didn't catch the problem, a tryjob that reverts the CL may not tell us anything either. Therefore, I reverted it on R63. Let's see if that fixes the problem.
Alright, the new profile makes compilation much slower and eats a lot of memory. For example, when compiling memory_region_map.cc in tcmalloc,

the old profile uses (elapsed time, user time, sys time, max rss in kb):
2.464410 : 1.944000 : 0.312000 : 334176

while the new profile uses
53.394735 : 4.712000 : 6.612000 : 1545784

I guess most of the elapsed time comes from thrashing, because my workstation becomes unusable when building chrome.
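
The four numbers quoted above (elapsed, user, sys, max RSS) can be collected for any single command with a small wrapper. This is only a minimal sketch, not the script used in this bug: `measure` is a hypothetical helper, it assumes a Unix host, and `ru_maxrss` is reported in KB on Linux (other systems may use different units).

```python
import os
import sys
import time


def measure(argv):
    """Run argv to completion; return (elapsed_s, user_s, sys_s, max_rss, status)."""
    start = time.monotonic()
    # Spawn the child without waiting so that wait4 can reap it below.
    pid = os.spawnvp(os.P_NOWAIT, argv[0], argv)
    # wait4 returns the resource usage of exactly this child process.
    _, status, usage = os.wait4(pid, 0)
    elapsed = time.monotonic() - start
    return elapsed, usage.ru_utime, usage.ru_stime, usage.ru_maxrss, status


# Stand-in command; replace it with the real compiler invocation.
elapsed, user, system, max_rss, status = measure([sys.executable, "-c", "pass"])
print(f"{elapsed:.6f} : {user:.6f} : {system:.6f} : {max_rss}")
```

A large gap between elapsed time and user + sys time, as in the new-profile numbers (about 53.4 s elapsed against about 11.3 s of CPU time), suggests the process spent most of its time waiting, which is consistent with swapping.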

I'm going to revert the profile on master, too.
Owner: laszio@chromium.org
Status: Started
Cc: kbleicher@chromium.org
I noticed that the profile was larger when I created it.
It is almost 5x larger than the previous one, but "only" 2.3x larger than the first one:

22884910 -> Oct 10 13:16 chrome_cwp_62.0.3202.43_solo_peppy.afdo
11769227 -> Nov  2 17:22 chrome_cwp_63.0.3239.20_celes.afdo
52325126 -> Nov 16 15:23 chrome_cwp_63.0.3239.42_peppy.afdo
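
As a quick check of the ratios, the sizes in the listing above can be compared directly. This is just arithmetic on the three byte counts shown; the variable names are illustrative.

```python
# File sizes in bytes, copied from the listing above.
sizes = {
    "chrome_cwp_62.0.3202.43_solo_peppy.afdo": 22884910,
    "chrome_cwp_63.0.3239.20_celes.afdo": 11769227,
    "chrome_cwp_63.0.3239.42_peppy.afdo": 52325126,
}

newest = sizes["chrome_cwp_63.0.3239.42_peppy.afdo"]
ratio_vs_previous = newest / sizes["chrome_cwp_63.0.3239.20_celes.afdo"]
ratio_vs_first = newest / sizes["chrome_cwp_62.0.3202.43_solo_peppy.afdo"]

print(f"vs previous: {ratio_vs_previous:.2f}x")  # 4.45x
print(f"vs first:    {ratio_vs_first:.2f}x")     # 2.29x
```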

Since the previous, smaller profile didn't perform that well (performance-wise), I had hoped that this one would do better.
But it is too large. The likely cause is the symbol aliasing introduced by ICF (b/38454265).

My feeling is that aliases are handled incorrectly by the autofdo profile creator. I am looking over the code and I think that I can fix at least the part that dumps the symbol information for each alias, which causes the size blowout.
> It seems that the build process is restarted repeatedly.

nothing was restarted.  emerge is configured to dump the build log every 60 minutes.  if you look at the parallel_emerge output that brackets the chrome build log:
=== Start output for job chromeos-chrome-64.0.3274.0_rc-r1 (60m3.6s) ===
...
=== Still running: job chromeos-chrome-64.0.3274.0_rc-r1 (60m3.6s) ===
...
=== Continue output for job chromeos-chrome-64.0.3274.0_rc-r1 (120m7.2s) ===
...
=== Still running: job chromeos-chrome-64.0.3274.0_rc-r1 (120m7.2s) ===
...
=== Continue output for job chromeos-chrome-64.0.3274.0_rc-r1 (180m10.8s) ===
...
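
One way to tell periodic log dumps from real restarts is to count the marker lines per job. The sketch below is illustrative only: `summarize` is a hypothetical helper, the regular expression is derived solely from the marker lines quoted above (including the `NmS.Ss` elapsed format), and it assumes that a genuinely restarted job would print a second "Start output" marker.

```python
import re

# Matches the three marker shapes quoted in this comment.
MARKER = re.compile(
    r"^=== (Start output for|Continue output for|Still running:) job (\S+) "
    r"\((\d+)m([\d.]+)s\) ===$"
)


def summarize(lines):
    """Return {job: {"starts": n, "dumps": n, "last_elapsed_s": seconds}}."""
    jobs = {}
    for line in lines:
        m = MARKER.match(line.strip())
        if not m:
            continue  # ordinary build output
        kind, job, minutes, seconds = m.groups()
        info = jobs.setdefault(
            job, {"starts": 0, "dumps": 0, "last_elapsed_s": 0.0}
        )
        if kind == "Start output for":
            info["starts"] += 1
        if kind != "Still running:":
            info["dumps"] += 1  # each Start/Continue opens one log dump
        elapsed = int(minutes) * 60 + float(seconds)
        info["last_elapsed_s"] = max(info["last_elapsed_s"], elapsed)
    return jobs


log = """\
=== Start output for job chromeos-chrome-64.0.3274.0_rc-r1 (60m3.6s) ===
...
=== Still running: job chromeos-chrome-64.0.3274.0_rc-r1 (60m3.6s) ===
...
=== Continue output for job chromeos-chrome-64.0.3274.0_rc-r1 (120m7.2s) ===
...
=== Still running: job chromeos-chrome-64.0.3274.0_rc-r1 (120m7.2s) ===
...
=== Continue output for job chromeos-chrome-64.0.3274.0_rc-r1 (180m10.8s) ===
"""

for job, info in summarize(log.splitlines()).items():
    print(job, info)
```

For the excerpt above this reports one start and three dumps for the chrome job, i.e. a single long-running build whose log was printed several times.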
Project Member Comment 9 by bugdroid1@chromium.org, Nov 25
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/a8889de841211103d105d397cd9f94a029227726

commit a8889de841211103d105d397cd9f94a029227726
Author: Ting-Yuan Huang <laszio@chromium.org>
Date: Sat Nov 25 07:18:48 2017

Revert "chrome: update experimental autofdo profiles to 3239.42"

This reverts commit fd92f4beba2f98bd29756a0a64fc922c60c66073.

Reason for revert: the CL seemed to cause BuildPackages timeout.

Original change's description:
> chrome: update experimental autofdo profiles to 3239.42
>
> BUG=b:37251947,  chromium:777730 
> TEST=USE=chrome_afdo_exp1 emerge-peppy chromeos-chrome
>      cros tryjob falco-release
>
> Change-Id: Ic549960c8db4b124bb0208faaba4198f2e749d28
> Reviewed-on: https://chromium-review.googlesource.com/777026
> Reviewed-by: Ting-Yuan Huang <laszio@chromium.org>
> Tested-by: Ting-Yuan Huang <laszio@chromium.org>

BUG=b:37251947,  chromium:777730 ,  chromium:788017 
TEST=USE=afdo_chrome_exp1 emerge-falco chromeos-chrome

Change-Id: I3f31f39fe4ce45c9097f935843c3a53fc21ccf21
Reviewed-on: https://chromium-review.googlesource.com/786826
Commit-Ready: Ting-Yuan Huang <laszio@chromium.org>
Tested-by: Ting-Yuan Huang <laszio@chromium.org>
Reviewed-by: Ting-Yuan Huang <laszio@chromium.org>

[modify] https://crrev.com/a8889de841211103d105d397cd9f94a029227726/chromeos-base/chromeos-chrome/chromeos-chrome-9999.ebuild
[modify] https://crrev.com/a8889de841211103d105d397cd9f94a029227726/chromeos-base/chromeos-chrome/Manifest

Cc: slavamn@chromium.org drinkcat@chromium.org athilenius@chromium.org
falco-release is still failing:

https://logs.chromium.org/v/?s=chromeos%2Fbb%2Fchromeos%2Ffalco-release%2F3039%2F%2B%2Frecipes%2Fsteps%2FBuildPackages__afdo_use_%2F0%2Fstdout

Probably because we are still trying to build chromeos-chrome-64.0.3274.0_rc-r1, which does not have the revert in #9.
Status: Fixed
Chrome just got uprevved.
Status: Started
We had to pin Chrome due to 788925, so not fixed yet.
Status: Fixed
The bad profile only affected 3274. Pinning Chrome to 3273 hid the problem, and now that Chrome has been unpinned, the problem is already solved.

Unless you want to pin to 3274, this can be marked fixed.