
Issue 620773

Starred by 1 user

Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




incremental-paladin build leaves old commits around during pre-patch build

Project Member Reported by akes...@chromium.org, Jun 16 2016

Issue description

https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-incremental-paladin/builds/9155

Note -- this builder runs through the build process twice, once before and once after applying test patches.

In the post-patch run of InitSdk, there was an ominous warning:

04:36:37: INFO: RunCommand: /b/cbuild/internal_master/chromite/bin/cros_sdk 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmptWIG6H' 'USE=chrome_internal' 'FEATURES=separatedebug' -- ./run_chroot_version_hooks in /b/cbuild/internal_master
ERROR   : Fatal: Missing upgrade hook for 141
ERROR   : Chroot version is too new. Consider running cros_sdk --replace
04:36:42: ERROR: 
return code: 1; command: /b/cbuild/internal_master/chromite/bin/cros_sdk 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmptWIG6H' 'USE=chrome_internal' 'FEATURES=separatedebug' -- ./run_chroot_version_hooks
cwd=/b/cbuild/internal_master, extra env={'PARALLEL_EMERGE_STATUS_FILE': '/tmp/tmptWIG6H', 'USE': 'chrome_internal', 'FEATURES': 'separatedebug'}

@@@STEP_TEXT@Replacing broken chroot@@@

@@@STEP_WARNINGS@@@

Then in SetupBoard, it failed:

04:36:43: INFO: RunCommand: /b/cbuild/internal_master/chromite/bin/cros_sdk 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmpHLLZpU' -- ./update_chroot --toolchain_boards lumpy in /b/cbuild/internal_master
ERROR   : Fatal: Missing upgrade hook for 141
ERROR   : Chroot version is too new. Consider running cros_sdk --replace
ERROR   : Thu Jun 16 04:36:48 PDT 2016
ERROR   :  PGID  PPID   PID     ELAPSED     TIME %CPU COMMAND
ERROR   : Arguments of 10: ./update_chroot '--toolchain_boards' 'lumpy'
ERROR   : Backtrace:  (most recent call is last)
ERROR   :  update_chroot:47:main(), called: die_err_trap  
ERROR   : 
ERROR   : Command failed:
ERROR   :   Command '${SCRIPTS_DIR}/run_chroot_version_hooks' exited with nonzero code: 1
ERROR   :   (Note bash sometimes misreports "command not found" as exit code 1 instead of 127)


I can't see any CLs that seem responsible for this, in that run.
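
For readers unfamiliar with the error above, here is a minimal Python sketch of the kind of check that would produce it. This is an assumption-laden model inferred only from the log messages, not the real run_chroot_version_hooks script (which is a shell script). The function names and the "<N>_<description>" hook naming are illustrative; only 141_unmerge_libchrome appears in this bug.

```python
import re


class ChrootVersionError(Exception):
    """Raised when the chroot is newer than any hook the checkout knows about."""


def latest_hook_version(hook_names):
    """Return the highest version among hooks named '<N>_<description>'."""
    versions = []
    for name in hook_names:
        match = re.match(r"(\d+)_", name)
        if match:
            versions.append(int(match.group(1)))
    return max(versions, default=0)


def hooks_to_run(chroot_version, hook_names):
    """Return the hook versions needed to upgrade the chroot (hypothetical model).

    If the chroot was already upgraded by a hook that the current checkout
    no longer contains, there is no way forward and the check fails.
    """
    latest = latest_hook_version(hook_names)
    if chroot_version > latest:
        raise ChrootVersionError(
            f"Missing upgrade hook for {chroot_version}: chroot version is too new"
        )
    return list(range(chroot_version + 1, latest + 1))
```

Under this model, a chroot left at version 141 by an earlier run fails against a checkout whose newest hook is 140, while a chroot at 139 against hooks up to 141 would simply run hooks 140 and 141.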
 

Comment 1 by vapier@chromium.org, Jun 16 2016

Did someone test a chroot upgrade hook to 142 and the CQ run failed, so the bot had to reset itself to be clean?
Hmmmm good point. Yeah, this was in the previous run. https://chromium-review.googlesource.com/#/c/349101/

But then the CQ failed. That's supposed to cause all the slave bots to clear their chroot. Maybe it didn't for this bot.

Comment 3 by vapier@chromium.org, Jun 16 2016

the chroot clearing should happen on the next run rather than on the failing run since it's done in the init sdk phase (iirc)
Cc: henryhsu@chromium.org
Labels: -Pri-2 Pri-1
Owner: nxia@chromium.org
nxia@ is working on a fix

Comment 7 by autumn@chromium.org, Jun 21 2016

Labels: -current-issue

Comment 8 by roc...@chromium.org, Jun 21 2016

Any updates on this? It's apparently blocking https://chromium-review.googlesource.com/#/c/349101/ and we'd really like to get that landed ASAP.

Comment 9 by nxia@chromium.org, Jun 21 2016

The fix has been merged to master at commit 5f125e28e6a9a32538e0458d1361e32b4b645fb5 (CL: https://chromium-review.googlesource.com/#/c/353772/). However, because of a bug in Gerrit/CQ, the CL was not marked as merged.

Please land your changes as the fix is already in master. 


Project Member

Comment 10 by bugdroid1@chromium.org, Jun 22 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/5f125e28e6a9a32538e0458d1361e32b4b645fb5

commit 5f125e28e6a9a32538e0458d1361e32b4b645fb5
Author: Ningning Xia <nxia@chromium.org>
Date: Fri Jun 17 20:37:00 2016

Clear chroot at CommitQueueSync stage for incremental builders.

Ensure and clear chroot at CommitQueueSync stages before PrePatchBuild
for incremental builders.

BUG=chromium:620773
TEST=run_tests

Change-Id: I4f55ab027fd2612c35e74d893aedb334b8ec8b8c
Reviewed-on: https://chromium-review.googlesource.com/353772
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Ningning Xia <nxia@chromium.org>

[modify] https://crrev.com/5f125e28e6a9a32538e0458d1361e32b4b645fb5/cbuildbot/stages/sync_stages.py

Comment 11 by nxia@chromium.org, Jun 22 2016

Status: Fixed (was: Untriaged)
Status: Available (was: Fixed)
This happened again. Not sure why.
https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-incremental-paladin/builds/9216

Again, this was on the run following a failed CQ in which we had a chroot-modifying CL.
Looking at 9216, it looks like the InitSDK (pre-Patch) claims to be building chroot version 141, even though the correct version is 140. Did the chroot replacement not happen correctly?

In the InitSDK logs on that build (https://uberchromegw.corp.google.com/i/chromeos/builders/lumpy-incremental-paladin/builds/9216/steps/InitSDK%20%28pre-Patch%29/logs/stdio), I see that it is running the 141_unmerge_libchrome upgrade hook, even though that CL is no longer even participating in this CQ run.

Are we accidentally keeping around CLs from the prior run??
We have some logic that David wrote to keep the old chroot around as an optimization. On success, it's supposed to stash a copy away, then on failure restore the last good one.

I don't know if that's relevant to these builds or not.
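
The stash-and-restore idea described above can be sketched roughly as follows. This is only an illustration of the described behavior, assuming a plain directory copy; how chromite actually stores the last good chroot is not stated in this bug, and finish_build and its parameters are hypothetical names.

```python
import shutil
from pathlib import Path


def finish_build(chroot: Path, stash: Path, succeeded: bool) -> None:
    """Stash the chroot after a good build; restore the stash after a bad one.

    Hypothetical helper: 'chroot' and 'stash' are ordinary directories here.
    """
    if succeeded:
        # Replace the previous stash with a copy of the known-good chroot.
        shutil.rmtree(stash, ignore_errors=True)
        shutil.copytree(chroot, stash)
    elif stash.exists():
        # Throw away the possibly-modified chroot and bring back the last good one.
        shutil.rmtree(chroot, ignore_errors=True)
        shutil.copytree(stash, chroot)
```

For example, if a successful build stashes a chroot at version 140 and a later failing build leaves the chroot at 141, calling finish_build with succeeded=False brings the chroot back to 140.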

Yes, that is known, and that was fixed by comment #10. But that isn't what's happening in #12. What appears to be happening in #12 is that, prior to finishing CommitQueueSync, we still have some of the temporary commits from the previous run lying around. So when we do our pre-patch build of the SDK, we use them.

I thought CleanUp was supposed to remove all these temporary commits, but it looks like it didn't.
Hum....

It looks like the CleanUp stage deletes local branches, but leaves the local TOT alone.

https://cs.corp.google.com/chromeos_public/chromite/cbuildbot/commands.py?rcl=61fb5ca8994ec10c131719e77d535c2ba6fc1d7a&l=162

However, that should be okay. The sync stage should adjust the state of the checkout to exactly match the new build's manifest before the build restarts inside the checkout.
>However, that should be okay. The sync stage should adjust the state of the checkout to exactly match the new build's manifest before the build restarts inside the checkout.

Not in the case of the incremental builder though. It does an InitSDK run before CommitQueueSync does much of anything.
Cc: -pho...@chromium.org davidjames@chromium.org
Summary: incremental-paladin build leaves old commits around during pre-patch build (was: update_chroot failed on lumpy-incremental-paladin for no apparent reason)
Re #17: what's the rationale for CleanUp not cleaning up ToT?
Cc: -dgarr...@chromium.org pho...@chromium.org
CleanUp's job is to get rid of left over stuff that might affect the build.

Sync's job is to fetch the source to use for the build.

So.... why is InitSdk running before Sync? To test incremental behavior even if the previous build failed? If so, the right behavior would be to sync to the supplied manifest before InitSDK, but ignoring the test CLs. Then apply the test CLs afterwards.
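
The ordering proposed above can be modeled in a few lines of Python. This is a toy sketch, not chromite code: the checkout is modeled as a set of commit labels, and sync_to_manifest, run_incremental_build, and the build callback are hypothetical names introduced only for illustration.

```python
def sync_to_manifest(checkout, manifest):
    """Make the checkout match the manifest exactly, dropping leftover commits."""
    checkout.clear()
    checkout.update(manifest)


def run_incremental_build(checkout, manifest, test_cls, build):
    """Sync first, build pre-patch, then apply the test CLs and build again.

    'build' is any callable that takes the set of commits it can see and
    returns a result; it stands in for InitSDK and the later build stages.
    """
    # Step 1: sync without the test CLs so nothing from a prior run survives.
    sync_to_manifest(checkout, manifest)
    pre_patch = build(frozenset(checkout))

    # Step 2: apply the CLs under test and build incrementally on top.
    checkout.update(test_cls)
    post_patch = build(frozenset(checkout))
    return pre_patch, post_patch
```

With a checkout that still holds a leftover commit from a failed run, the pre-patch build in this model sees only the manifest's commits, which is the property the failing builds lacked.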

Comment 22 by nxia@chromium.org, Jun 23 2016

This probably explains why the second InitSDK got its original version but the pre-patch InitSDK (before sync) got the 141 version.
Project Member

Comment 23 by bugdroid1@chromium.org, Jun 24 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/d210c08157e77970bb72d1cdc4e14842092f1b0e

commit d210c08157e77970bb72d1cdc4e14842092f1b0e
Author: Aviv Keshet <akeshet@chromium.org>
Date: Thu Jun 23 17:55:44 2016

CleanUpStage: add logging

BUG=chromium:620773
TEST=None

Change-Id: I580ece409349c5fd9ab20af4b3b7846c999be528
Reviewed-on: https://chromium-review.googlesource.com/355690
Commit-Ready: Aviv Keshet <akeshet@chromium.org>
Tested-by: Aviv Keshet <akeshet@chromium.org>
Reviewed-by: Don Garrett <dgarrett@chromium.org>

[modify] https://crrev.com/d210c08157e77970bb72d1cdc4e14842092f1b0e/cbuildbot/stages/build_stages.py

Project Member

Comment 24 by bugdroid1@chromium.org, Jul 1 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/chromite/+/a4303ef2ff27800b52e46ea4fb8dc5a2ccea8aa8

commit a4303ef2ff27800b52e46ea4fb8dc5a2ccea8aa8
Author: Ningning Xia <nxia@chromium.org>
Date: Wed Jun 29 23:17:17 2016

Repo sync before RunPrePatchBuild.

Repo sync before RunPrePatchBuild so that incremental-paladin can clear
the old commits.

BUG=chromium:620773
TEST=run_tests

Change-Id: I56a5604fa4ab2a3f19db4975575acaf468006ae1
Reviewed-on: https://chromium-review.googlesource.com/357371
Commit-Ready: Ningning Xia <nxia@chromium.org>
Tested-by: Ningning Xia <nxia@chromium.org>
Reviewed-by: Aviv Keshet <akeshet@chromium.org>

[modify] https://crrev.com/a4303ef2ff27800b52e46ea4fb8dc5a2ccea8aa8/cbuildbot/stages/sync_stages.py

Comment 25 by nxia@chromium.org, Jul 6 2016

Status: Fixed (was: Available)
Status: Verified (was: Fixed)
Closing. Please reopen if it's not fixed.
Cc: dshi@chromium.org
Status: Available (was: Verified)
Just hit this again:  https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/12617

From daisy_skate-paladin: https://uberchromegw.corp.google.com/i/chromeos/builders/daisy_skate-paladin/builds/7976/steps/SetupBoard/logs/stdio

19:25:17: INFO: RunCommand: /b/cbuild/internal_master/chromite/bin/cros_sdk 'PARALLEL_EMERGE_STATUS_FILE=/tmp/tmp2osg5B' -- ./update_chroot --toolchain_boards daisy_skate in /b/cbuild/internal_master
ERROR   : Fatal: Missing upgrade hook for 144
ERROR   : Chroot version is too new. Consider running cros_sdk --replace
ERROR   : Fri Oct 14 19:25:21 PDT 2016
ERROR   :  PGID  PPID   PID     ELAPSED     TIME %CPU COMMAND
ERROR   : Arguments of 10: ./update_chroot '--toolchain_boards' 'daisy_skate'
ERROR   : Backtrace:  (most recent call is last)
ERROR   :  update_chroot:47:main(), called: die_err_trap  
ERROR   : 
ERROR   : Command failed:
ERROR   :   Command '${SCRIPTS_DIR}/run_chroot_version_hooks' exited with nonzero code: 1
ERROR   :   (Note bash sometimes misreports "command not found" as exit code 1 instead of 127)
Cc: -davidjames@chromium.org

Comment 30 by nxia@chromium.org, Jun 20 2017

Labels: -Pri-1 Pri-2
The logs are gone, so I've downgraded this to P2 for now. Will debug it when it shows up again with logs.
Components: Infra>Client>ChromeOS>CI
Components: -Infra>Client>ChromeOS

Comment 33 by nxia@chromium.org, May 31 2018

Cc: -dshi@chromium.org -pho...@chromium.org
Owner: ----
