
Issue 876411


Issue metadata

Status: Fixed
Owner:
Closed: Sep 6
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 2
Type: Bug




chromite hasn't been uprevved

Project Member Reported by gu...@chromium.org, Aug 21

Issue description

In recent push-to-prod runs, there have never been any new commits in chromite. However, a lot of commits (138) are queued up in chromite; see:

d998ce5f1 4 days ago Xixuan Wu   (HEAD, m/master, cros/master) cbuildbot: Remove swarming_cli_cmd.
4b9089142 4 days ago Mike Frysinger      tree_status: remove unused modules
efcc86a68 4 days ago Mike Frysinger      pylintrc: document deprecated-modules field
065a4f374 4 days ago Mike Frysinger      pylintrc: ban exit & quit builtins
9b5a3d78d 12 days ago Xixuan Wu  cbuildbot: Let SkylabHWTestStage run any suite.
...
8dace2766 5 weeks ago Don Garrett        gen_luci_scheduler: Add branched build config support.
68844342e 5 weeks ago Don Garrett        chromeos_config: Remove many dead boards.
dad47a953 4 weeks ago Caroline Tice      Enable CFI on peach_pit and kevin release builders.
62e2a5e8c 4 weeks ago Mike Frysinger     cbuildbot: fuzzer: stop processing android stages
69c79e9bc 6 weeks ago Mike Frysinger     lint: fix not-an-iterable warnings
8960f7c7b 6 weeks ago Mike Frysinger     lint: fix consider-using-enumerate warnings
9e8edc1dd 6 weeks ago Mike Frysinger     lint: fix unsubscriptable-object warnings
b96253cab 5 weeks ago Amin Hassani       cros_payload: deprecate it
5389d34c8 4 weeks ago Bob Moragues       (cros/prod-next, cros/prod) config: Fix Build Break on firmware tryjobs on master branch



$ git l m/master...cros/prod-next|wc -l
138
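
The count above depends on a local `git l` alias. A minimal sketch of the same check without the alias, assuming `git rev-list --count` over the symmetric-difference range gives the equivalent number; the helper name `count_divergent_commits` and its injectable `run` parameter are hypothetical and exist only for illustration:

```python
import subprocess


def count_divergent_commits(repo_dir, ref_a, ref_b, run=subprocess.check_output):
    """Count commits reachable from exactly one of ref_a and ref_b.

    `run` is injectable (an assumption made for this sketch) so the function
    can be exercised without a real checkout; by default it shells out to git.
    """
    cmd = ["git", "-C", repo_dir, "rev-list", "--count", f"{ref_a}...{ref_b}"]
    return int(run(cmd).decode().strip())
```

Called as `count_divergent_commits(".", "m/master", "cros/prod-next")` inside a chromite checkout, this should report how far the two refs have drifted apart.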



 
Components: Infra>Client>ChromeOS>Test
Labels: -Pri-2 Pri-1
Summary: chromite hasn't been uprevved (was: chromte havn't got revved)
The last chromite change in production was merged on July 23rd. crrev.com/c/1147112

It is HEAD of both cros/prod and cros/prod-next. No test push has updated it since then.

Cc: jkop@chromium.org
Labels: Chase-Pending
I did some investigation. The change responsible is almost certainly in one of these two lists:

https://chrome-internal-review.googlesource.com/q/is:merged+before:2018-07-31++after:2018-07-23

Internal changes that landed the day of the last chromite push that made changes and the day of the first chromite push which did not.

https://chromium-review.googlesource.com/q/is:merged+before:2018-07-24++after:2018-07-23+projects:autotest

Autotest changes which were in the push itself. (I am willing to rule out the chromite changes being responsible; there were very few and none are plausible candidates.)
Status: Started (was: Untriaged)
The push master has local changes.  These should have been reset by Puppet.  Looking into it.
Oh, Puppet doesn't reset the chromiumos repo
pprabhu, shame (although the fault is in the system, not the developer)

WARN: 2018-08-22 15:00:03 -0700: Ignoring configured merge_behavior
WARN: 2018-08-22 15:00:03 -0700: Must have 'deep_merge' gem installed.
WARN: 2018-08-22 15:00:03 -0700: Cannot load backend eyaml: cannot load such file -- hiera/backend/eyaml_backend
2018-08-22 15:00:03,566 INFO| Running atomic task RunTestPushTask
2018-08-22 15:00:03,566 INFO| Starting test push
2018-08-22 15:00:03,566 INFO| Triggering /usr/local/autotest/site_utils/test_push.py
2018-08-22 15:00:03,570 INFO| Triggering /opt/infra-tools/usr/bin/skylab_test_push
[xkcd] Would have run: /usr/local/autotest/site_utils/test_push.py
[xkcd] Would have run: /opt/infra-tools/usr/bin/skylab_test_push
2018-08-22 15:00:03,575 INFO| [xkcd] Success!!!
2018-08-22 15:00:03,576 INFO| Completed test push
2018-08-22 15:00:03,576 INFO| RunTestPushTask succeed.
2018-08-22 15:00:03,576 INFO| Printing out task report.
{
  "sub_reports": [],
  "exception": null,
  "is_successful": true,
  "description": "RunTestPushTask succeed.",
  "arguments_used": {
    "service_account_json": "/creds/service_accounts/cipd-uprev-service.json"
  },
  "task_name": "RunTestPushTask"
}
afe_labels_sync keeps crashing
Okay, that means it needs to be dropped from test push too
Re #10 that happens automatically, so I'll go run test push again
Ironically, the CL in #9 broke test push in a different way.
Cc: gu...@chromium.org
Labels: Hotlist-Deputy
There were other problems affecting test push, but chromite still isn't being uprevved.
Oh curses, it's using site_utils/deploy_server.py to update the repos.  I have a bad feeling about this.
./utils/build_externals.py --use_chromite_master

isn't working
When I stepped through with a debugger, it started working...
> /usr/local/autotest/ExternalSource/utils/build_externals.py(180)build_and_install_packages()
(Pdb) l
[EOF]


DEBUG:root:[stdout] Updating server: chromeos-staging-shard2.cbf.corp.google.com
DEBUG:root:[stdout] Checking tree status:
DEBUG:root:[stdout] Tree status: clean
DEBUG:root:[stdout] Updating Repo.
DEBUG:root:[stdout] Updating push servers, checkout cros/master
DEBUG:root:[stdout] Removing .pyc files
DEBUG:root:[stdout] Updating ~chromeos-test/chromiumos
DEBUG:root:[stdout] Removing .pyc files
DEBUG:root:[stdout] Running update commands: build_externals
DEBUG:root:[stdout] Running: build_externals: ./utils/build_externals.py --use_chromite_master
DEBUG:root:[stdout] Restarting Services: apache2, gs_offloader, gs_offloader_s, host-scheduler, job_aborter, scheduler, shard-client, sysmon
DEBUG:root:[stdout] Restarting: apache2
DEBUG:root:[stdout] Restarting: gs_offloader
DEBUG:root:[stdout] Restarting: gs_offloader_s
DEBUG:root:[stdout] Restarting: host-scheduler
DEBUG:root:[stdout] Restarting: job_aborter
DEBUG:root:[stdout] Restarting: scheduler
DEBUG:root:[stdout] Restarting: shard-client
DEBUG:root:[stdout] Restarting: sysmon
DEBUG:root:[stdout] Changes:
DEBUG:root:[stdout] autotest:
DEBUG:root:[stdout] fc085e574 autotest: Modify provision expiration secs as 95% of the whole timeout.
DEBUG:root:[stdout] e1fd8bf78 autotest: urgent fix: remove unused @property
DEBUG:root:[stdout] 
DEBUG:root:[stdout] autotest/site_utils/autotest_private:
DEBUG:root:[stdout] No Change.
DEBUG:root:[stdout] 
DEBUG:root:[stdout] autotest/site_utils/devserver:
DEBUG:root:[stdout] No Change.
DEBUG:root:[stdout] 

Bah, it's probably working now.  Maybe stale pyc file or a race condition?  All of the code appears correct and it appears to be working now, stupid heisenbug.
Cc: ayatane@chromium.org
Labels: -Pri-1 -Chase-Pending Pri-2
Owner: jrbarnette@chromium.org
-> deputy who will be doing a push anyway.

Can check on deputy-view to see if chromite was updated.
Project Member

Comment 22 by bugdroid1@chromium.org, Aug 27

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/fb85484f887525242600c74a7742f9a81fd481a0

commit fb85484f887525242600c74a7742f9a81fd481a0
Author: Allen Li <ayatane@chromium.org>
Date: Mon Aug 27 19:35:13 2018

Apparently, our push testing no longer updates the cros/prod-next
branch in chromite:

: jrbarnette $REPOS/cros.base/chromite; git rev-parse cros/prod cros/prod-next
5389d34c833dae162248c0c8ddb1b71d31851924
5389d34c833dae162248c0c8ddb1b71d31851924
: jrbarnette $REPOS/cros.base/chromite; git log -1 cros/prod
commit 5389d34c833dae162248c0c8ddb1b71d31851924 (cros/prod-next, cros/prod)
Author: Bob Moragues <moragues@google.com>
Date:   Mon Jul 23 10:26:46 2018 -0700

    config: Fix Build Break on firmware tryjobs on master branch
    
    Set  upload_hw_test_artifacts to false
    Generate config_dump.json
    
    BUG=chromium:828953
    TEST=cros tryjob nocturne-firmware-tryjob
    
    Change-Id: I2168c742df1c717d934ecdb99e124793bedfb8f6
    Reviewed-on: https://chromium-review.googlesource.com/1147112
    Commit-Ready: Bob Moragues <moragues@chromium.org>
    Tested-by: Bob Moragues <moragues@chromium.org>
    Reviewed-by: Mike Frysinger <vapier@chromium.org>


> Apparently, our push testing no longer updates the cros/prod-next
> branch in chromite:

I guess we already knew that.  :-(

It's not yet proven whether the fix above worked, since the last successful
push test was just before the change landed.

Oh dear.

Re #23 by all appearances the code should still update chromite.  It worked as the code said it should work when I stepped through using pdb.  Maybe there's a race, since stepping through works.
> Re #23 by all appearances the code should still update chromite.
>  It worked as the code said it should work when I stepped through
> using pdb.  Maybe there's a race, since stepping through works.

We haven't successfully updated either the autotest prod or the chromite
prod branch since the fix was committed.  So, we don't yet know whether
this works or fails.

Re #26, are you talking about the "fix" in #22?  That does not address the root problem, merely a related problem I ran into while digging into the root problem.

The code path that uprevs chromite during test push hasn't been touched in ages, which makes this problem all the more infuriating.

Ultimately, the code winds up calling build_externals.py --use_chromite_master

I have checked both that the test push code ends up calling build_externals with the correct flag, and that build_externals.py, when called with said flag, checks out master in site-packages/chromite.
Owner: pprabhu@chromium.org
Status: Assigned (was: Started)
As expected, only the build_externals deploy version of chromite is stuck on an old ref. The ~/chromiumos/chromite version is latest:

chromeos-test@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite$ git log -1
commit 5389d34c833dae162248c0c8ddb1b71d31851924 (HEAD -> master, origin/prod-next, origin/prod)
Author: Bob Moragues <moragues@google.com>
Date:   Mon Jul 23 10:26:46 2018 -0700

    config: Fix Build Break on firmware tryjobs on master branch
    
    Set  upload_hw_test_artifacts to false
    Generate config_dump.json
    
    BUG=chromium:828953
    TEST=cros tryjob nocturne-firmware-tryjob
    
    Change-Id: I2168c742df1c717d934ecdb99e124793bedfb8f6
    Reviewed-on: https://chromium-review.googlesource.com/1147112
    Commit-Ready: Bob Moragues <moragues@chromium.org>
    Tested-by: Bob Moragues <moragues@chromium.org>
    Reviewed-by: Mike Frysinger <vapier@chromium.org>
chromeos-test@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite$ cd ~/chromiumos/chromite/
chromeos-test@chromeos-staging-master2:~/chromiumos/chromite$ git log -1
commit d15b2011d9c0ec0dcb683a0456360ae950889e1f (HEAD, m/master, cros/master)
Author: Manoj Gupta <manojgupta@google.com>
Date:   Fri Aug 31 14:31:41 2018 -0700

    chromeos_config: Add kevin64-full builder.
    
    We want to start testing AArch64 userspace in Chrome OS.
    So start testing kevin64-full builds.
    
    BUG=chromium:878565
    TEST=chromite unit tests pass
    
    Change-Id: Id424b7c349bed433e5984b3f73045f2cc6709dab
    Reviewed-on: https://chromium-review.googlesource.com/1200173
    Commit-Ready: Manoj Gupta <manojgupta@chromium.org>
    Tested-by: Manoj Gupta <manojgupta@chromium.org>
    Reviewed-by: Mattias Nissler <mnissler@chromium.org>
    Reviewed-by: Luis Lozano <llozano@chromium.org>
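
The manual comparison above (run `git log -1` in each checkout and compare the hashes) could be scripted. A minimal sketch, assuming plain `git rev-parse HEAD` is enough to identify each checkout; the function names `head_commit` and `stale_checkouts` and the injectable `run` parameter are hypothetical, not part of autotest or chromite:

```python
import subprocess


def head_commit(repo_dir, run=subprocess.check_output):
    """Return the commit hash that HEAD points to in repo_dir."""
    out = run(["git", "-C", repo_dir, "rev-parse", "HEAD"])
    return out.decode().strip()


def stale_checkouts(reference_dir, deployed_dirs, run=subprocess.check_output):
    """List the deployed checkouts whose HEAD differs from the reference.

    `run` is injectable (an assumption made for this sketch) so the logic can
    be tested without touching real repositories.
    """
    expected = head_commit(reference_dir, run)
    return [d for d in deployed_dirs if head_commit(d, run) != expected]
```

On the staging master, the reference would be `~/chromiumos/chromite` and the deployed directory `/usr/local/autotest/site-packages/chromite`; a non-empty result means the deployed copy is stuck on a different commit.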

Status: Started (was: Assigned)
Aha! My nasty little frenemy!


chromeos-test@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite$ /usr/local/autotest/utils/build_externals.py --use_chromite_master
...

chromeos-test@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite$ git log -1
commit d15b2011d9c0ec0dcb683a0456360ae950889e1f (HEAD -> master, origin/master, origin/HEAD)
Author: Manoj Gupta <manojgupta@google.com>
Date:   Fri Aug 31 14:31:41 2018 -0700

    chromeos_config: Add kevin64-full builder.
    
    We want to start testing AArch64 userspace in Chrome OS.
    So start testing kevin64-full builds.
    
    BUG=chromium:878565
    TEST=chromite unit tests pass
    
    Change-Id: Id424b7c349bed433e5984b3f73045f2cc6709dab
    Reviewed-on: https://chromium-review.googlesource.com/1200173
    Commit-Ready: Manoj Gupta <manojgupta@chromium.org>
    Tested-by: Manoj Gupta <manojgupta@chromium.org>
    Reviewed-by: Mattias Nissler <mnissler@chromium.org>
    Reviewed-by: Luis Lozano <llozano@chromium.org>

root@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite# /root/chromeos-admin/puppet/run_puppet
...

chromeos-test@chromeos-staging-master2:/usr/local/autotest/site-packages/chromite$ git log -1
commit 5389d34c833dae162248c0c8ddb1b71d31851924 (HEAD -> master, origin/prod-next, origin/prod)
Author: Bob Moragues <moragues@google.com>
Date:   Mon Jul 23 10:26:46 2018 -0700

    config: Fix Build Break on firmware tryjobs on master branch
    
    Set  upload_hw_test_artifacts to false
    Generate config_dump.json
    
    BUG=chromium:828953
    TEST=cros tryjob nocturne-firmware-tryjob
    
    Change-Id: I2168c742df1c717d934ecdb99e124793bedfb8f6
    Reviewed-on: https://chromium-review.googlesource.com/1147112
    Commit-Ready: Bob Moragues <moragues@chromium.org>
    Tested-by: Bob Moragues <moragues@chromium.org>
    Reviewed-by: Mike Frysinger <vapier@chromium.org>



because, as part of puppet run,

Notice: /Stage[main]/Lab::Compiled_autotest_repo/Exec[build_externals]/returns: executed successfully


I'm guessing that around July 31 is when I changed the order in which test_push updates the repos and runs puppet. Puppet uses some tools deployed in autotest, and when those tools break, the only way to get the staging servers back to a good state is to update the tools first and then run puppet. Running puppet first fails, and the servers get stuck.

So, since then each test_push has promptly downgraded chromite as part of setup and never pushed a new chromite.

Since build_externals is run as part of prod-push, it should not be run by puppet, except during setup. This is how the ~/chromiumos and /usr/local/autotest checkouts are handled as well.
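
The ordering problem described above can be shown with a toy model. This is only a sketch under stated assumptions: it assumes the puppet-driven build_externals run leaves site-packages/chromite on the prod commit (as the log above shows), and every function name here is hypothetical rather than taken from test_push or puppet:

```python
PROD = "5389d34c8"    # commit at cros/prod in the logs above
MASTER = "d15b2011d"  # commit at cros/master in the logs above


def build_externals(state, use_chromite_master):
    # Assumption: without the flag, the checkout ends up on the prod commit.
    state["site-packages/chromite"] = MASTER if use_chromite_master else PROD


def update_repos(state):
    # Models the test push step that runs build_externals --use_chromite_master.
    build_externals(state, use_chromite_master=True)


def run_puppet(state):
    # Models the puppet Exec[build_externals] resource (assumed to omit the flag).
    build_externals(state, use_chromite_master=False)


def simulate_test_push(steps):
    """Apply the steps in order and return the chromite commit left deployed."""
    state = {"site-packages/chromite": PROD}
    for step in steps:
        step(state)
    return state["site-packages/chromite"]


old_order = simulate_test_push([run_puppet, update_repos])
new_order = simulate_test_push([update_repos, run_puppet])
```

In this model the old order leaves chromite on the master commit, while the new order ends on the prod commit, because the last step to run decides what stays deployed.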
Project Member

Comment 34 by bugdroid1@chromium.org, Sep 5

The following revision refers to this bug:
  https://chrome-internal.googlesource.com/chromeos/chromeos-admin/+/c22ff88235abcdcdeb8ae12bb8058e59826ad32a

commit c22ff88235abcdcdeb8ae12bb8058e59826ad32a
Author: Prathmesh Prabhu <pprabhu@chromium.org>
Date: Wed Sep 05 22:08:57 2018

Status: Fixed (was: Started)

Verified that the staging lab has now updated the site-packages/chromite path for the next test_push.
