Issue 886575

Starred by 4 users

Issue metadata

Status: Fixed
Owner:
Closed: Oct 5
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug-Regression

Blocking:
issue 889005


Previous locations:
v8:8195



Buildbot repeatedly closes the tree for the same reason

Project Member Reported by clemensh@chromium.org, Sep 19

Issue description

https://v8-status.appspot.com/

E.g:
buildbot	Wed, 19 Sep 09:48	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Arm%20-%20debug/7868 from 1f3802a1e718e8315c826cc7cd43a100f896e1ec )
yangguo	Wed, 19 Sep 09:45	open (flakes are being investigated)
buildbot	Wed, 19 Sep 09:29	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Arm%20-%20debug/7868 from 1f3802a1e718e8315c826cc7cd43a100f896e1ec )
clemensh	Wed, 19 Sep 09:26	open
buildbot	Wed, 19 Sep 09:19	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Arm%20-%20debug/7868 from 1f3802a1e718e8315c826cc7cd43a100f896e1ec )
clemensh	Wed, 19 Sep 09:16	open
buildbot	Wed, 19 Sep 09:13	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Arm%20-%20debug/7868 from 1f3802a1e718e8315c826cc7cd43a100f896e1ec )
clemensh	Wed, 19 Sep 08:21	open


Or:
leszeks	Mon, 17 Sep 12:48	open (╯°□°)╯︵ pǝsolɔ
buildbot	Mon, 17 Sep 12:46	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/12583 from a4105a437dd016a5e2d8aa2bb0ff603a717db6d7 )
leszeks	Mon, 17 Sep 12:42	open (why do you hate me buildbot ಠ_ಠ)
buildbot	Mon, 17 Sep 12:40	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/12583 from a4105a437dd016a5e2d8aa2bb0ff603a717db6d7 )
leszeks	Mon, 17 Sep 12:36	open (omg buildbot stop)
buildbot	Mon, 17 Sep 12:35	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/12583 from a4105a437dd016a5e2d8aa2bb0ff603a717db6d7 )
leszeks	Mon, 17 Sep 12:31	open (stop it buildbot I've reverted already)
buildbot	Mon, 17 Sep 12:30	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/12583 from a4105a437dd016a5e2d8aa2bb0ff603a717db6d7 )
leszeks	Mon, 17 Sep 12:28	open
buildbot	Mon, 17 Sep 12:25	closed ( https://ci.chromium.org/p/v8/builders/luci.v8.ci/V8%20Linux%20-%20arm64%20-%20sim%20-%20gc%20stress/12583 from a4105a437dd016a5e2d8aa2bb0ff603a717db6d7 )
 
Cc: machenb...@chromium.org yangguo@chromium.org
Project: chromium
Moved issue v8:8195 to now be issue chromium:886575.
Components: Infra
Labels: -Type-Bug -Priority-1 Infra-Troopers Pri-1 Type-Bug-Regression
Status: Untriaged (was: Available)
Looks like gatekeeper is broken. See pic: http://shortn/_XX4uoNH5nw
Labels: -Infra-Troopers DevX-Troopers
Components: -Infra Infra>Sheriffing>Gatekeeper
Labels: Foundation-Troopers
Unfortunately Gatekeeper is currently essentially unowned. :( (seanmccullough@ and I are listed as owners, but only because we briefly touched it at one point.)

+ Foundation trooper in case the source of the problem here is related to the LUCI migration. 
I'm not sure whether it is related or not, but the v8 gatekeeper fails when sending emails: https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.gatekeeper%2FChromium_Gatekeeper%2F1275907%2F%2B%2Frecipes%2Fsteps%2Fgatekeeper%3A_v8-tree-closers%2F0%2Fstdout

Maybe it prevents it from saving the state correctly or something...
It fails because the 'results' field of some element in 'steps' is None (it is supposed to be an integer). I can't figure out from the logs exactly which step.
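A minimal sketch of how the offending step could be located, assuming the build data is a dict with a 'steps' list whose elements carry 'name' and 'results' keys. That shape is inferred only from the comment above; the helper name and the example data are hypothetical, not taken from gatekeeper itself.

```python
from typing import Any, Dict, List


def steps_with_missing_results(build: Dict[str, Any]) -> List[str]:
    """Return the names of steps whose 'results' value is None or absent."""
    missing = []
    for index, step in enumerate(build.get("steps", [])):
        if step.get("results") is None:
            # Fall back to the position when a step has no name.
            missing.append(step.get("name", "<step #%d>" % index))
    return missing


# Hypothetical build data, shaped only after the description in this thread.
example_build = {
    "steps": [
        {"name": "compile", "results": 0},
        {"name": "Check", "results": None},
        {"name": "upload"},
    ]
}

print(steps_with_missing_results(example_build))  # ['Check', 'upload']
```

Logging this list just before the notification code runs would show which step has no result.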
Cc: no...@chromium.org
Anyway, it is probably unrelated. The block that sends notifications is in a try/finally clause and shouldn't affect the rest.

The gatekeeper log also consistently has the following line:
INFO:root:revision properties have changed from {} to ['got_revision_cp']. clearing previous data.

As far as I understand, this causes it to "forget" that it already handled a revision, which would explain why it insists on closing the tree each time it runs (it thinks the failure it sees is a new one).

I haven't figured out yet why the "revision properties" change.

Also, if this theory is correct, it affects not only v8, but all projects.
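The theory above can be illustrated with a small sketch. Everything here is hypothetical: the `BuildDB` class, its fields and `should_close_tree` are stand-ins invented for illustration and do not reflect gatekeeper's actual code. Only the "clearing previous data" behaviour is taken from the log line.

```python
class BuildDB:
    """Hypothetical stand-in for the state gatekeeper keeps between runs."""

    def __init__(self):
        self.revision_properties = []
        self.handled = set()


def should_close_tree(db, revision_properties, failing_build):
    """Return True only for a failure that has not been handled before."""
    if db.revision_properties != revision_properties:
        # Mirrors the logged "clearing previous data": all memory is dropped.
        db.handled.clear()
        db.revision_properties = list(revision_properties)
    if failing_build in db.handled:
        return False
    db.handled.add(failing_build)
    return True


props = ["got_revision_cp"]
build = "V8 Arm - debug/7868"

# State kept between runs: the same failure closes the tree only once.
db = BuildDB()
print(should_close_tree(db, props, build))  # True
print(should_close_tree(db, props, build))  # False

# Empty state on every run: the same failure closes the tree every time.
print(should_close_tree(BuildDB(), props, build))  # True
print(should_close_tree(BuildDB(), props, build))  # True
```

Under this model, a run that starts from empty state always sees the properties "change" and therefore always treats an old failure as new.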
Cc: martiniss@chromium.org
I think this is because Gatekeeper was switched to remote_run: https://chromium.googlesource.com/chromium/tools/build/+/ea94fc1ee29de9662c4be2c48ead3352754a0856

Before the switch, the gatekeeper was always running from /b/build/slave/gatekeeper/build 

After the switch, it runs from a temp directory (e.g. /mnt/data/b/rr/tmpVYOCxJ/w).

As far as I can tell, gatekeeper stores the state into <cwd>/<project>.json. Since <cwd> is new each time, there's essentially no state preserved between gatekeeper runs.
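A minimal sketch of why a fresh working directory loses the state, assuming the state really is a JSON file named after the project inside the working directory, as described above. The `run_once` helper and the "runs" counter are made up for illustration.

```python
import json
import os
import tempfile


def run_once(state_dir, project="v8"):
    """Load <state_dir>/<project>.json, bump a run counter, and save it back."""
    path = os.path.join(state_dir, project + ".json")
    state = {"runs": 0}
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
    state["runs"] += 1
    with open(path, "w") as f:
        json.dump(state, f)
    return state["runs"]


# A new temp directory per run (as with remote_run): state never survives.
fresh = [run_once(tempfile.mkdtemp()) for _ in range(3)]

# One fixed directory reused across runs (as before the switch): state accumulates.
persistent_dir = tempfile.mkdtemp()
persistent = [run_once(persistent_dir) for _ in range(3)]

print(fresh)       # [1, 1, 1]
print(persistent)  # [1, 2, 3]
```

Every run in a fresh directory behaves like the first run ever, which matches the repeated tree closures reported above.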
Looking
This is almost definitely due to my change to remote_run. I'll make a fix.
Owner: martiniss@chromium.org
Status: Started (was: Untriaged)
Fixing
Project Member

Comment 14 by bugdroid1@chromium.org, Sep 20

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/8b4820d36d804f7ee3a0a601d06708e66be3be24

commit 8b4820d36d804f7ee3a0a601d06708e66be3be24
Author: Stephen Martinis <martiniss@chromium.org>
Date: Thu Sep 20 23:26:07 2018

gatekeeper: Put build db in a persistent location

TBR=nodir

Recipe-Nontrivial-Roll: build_limited_scripts_slave
Bug:  886575 
Change-Id: Id835b7d8f068185e27c043902d04a51204a86751
Reviewed-on: https://chromium-review.googlesource.com/1236544
Commit-Queue: Stephen Martinis <martiniss@chromium.org>
Reviewed-by: Stephen Martinis <martiniss@chromium.org>
Auto-Submit: Stephen Martinis <martiniss@chromium.org>

[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipes/gatekeeper.expected/whitelist_config.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipes/gatekeeper.expected/production_data.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipes/gatekeeper.expected/keep_going.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/basic.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipe_modules/gatekeeper/api.py
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/keep_going.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/whitelist_config.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipes/gatekeeper.expected/basic.json
[modify] https://crrev.com/8b4820d36d804f7ee3a0a601d06708e66be3be24/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/production_data.json

Status: Fixed (was: Started)
#14 should fix this. Re-open and assign to me if it ever breaks again.
Status: Assigned (was: Fixed)
We're still seeing this:
http://shortn/_Lf8UyqJj6Q
Project Member

Comment 19 by bugdroid1@chromium.org, Sep 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/cf2c2ecb004c78f579f95843e43e736c3a39a9b4

commit cf2c2ecb004c78f579f95843e43e736c3a39a9b4
Author: Michael Achenbach <machenbach@chromium.org>
Date: Fri Sep 21 13:19:55 2018

Revert "gatekeeper: Put build db in a persistent location"

This reverts commit 8b4820d36d804f7ee3a0a601d06708e66be3be24.

Reason for revert: Breaks gatekeeper:
https://logs.chromium.org/v/?s=chromium%2Fbb%2Fchromium.gatekeeper%2FChromium_Gatekeeper%2F1276138%2F%2B%2Frecipes%2Fsteps%2Fgatekeeper%3A_build%2F0%2Fstdout

Original change's description:
> gatekeeper: Put build db in a persistent location
> 
> TBR=nodir
> 
> Recipe-Nontrivial-Roll: build_limited_scripts_slave
> Bug:  886575 
> Change-Id: Id835b7d8f068185e27c043902d04a51204a86751
> Reviewed-on: https://chromium-review.googlesource.com/1236544
> Commit-Queue: Stephen Martinis <martiniss@chromium.org>
> Reviewed-by: Stephen Martinis <martiniss@chromium.org>
> Auto-Submit: Stephen Martinis <martiniss@chromium.org>

TBR=nodir@chromium.org,seanmccullough@chromium.org,martiniss@chromium.org

Change-Id: I1eb6c409c8b6f62e4f9979055990597ac9d141a8
No-Presubmit: true
No-Tree-Checks: true
No-Try: true
Bug:  886575 
Reviewed-on: https://chromium-review.googlesource.com/1238555
Reviewed-by: Michael Achenbach <machenbach@chromium.org>
Commit-Queue: Michael Achenbach <machenbach@chromium.org>

[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipes/gatekeeper.expected/whitelist_config.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipes/gatekeeper.expected/production_data.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipes/gatekeeper.expected/keep_going.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/basic.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipe_modules/gatekeeper/api.py
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/keep_going.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/whitelist_config.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipes/gatekeeper.expected/basic.json
[modify] https://crrev.com/cf2c2ecb004c78f579f95843e43e736c3a39a9b4/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/production_data.json

Project Member

Comment 20 by bugdroid1@chromium.org, Sep 21

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/ba7ea34bdd37502d30b52422d96282aeb75a6477

commit ba7ea34bdd37502d30b52422d96282aeb75a6477
Author: Stephen Martinis <martiniss@chromium.org>
Date: Fri Sep 21 18:50:16 2018

Reland "gatekeeper: Put build db in a persistent location"

This is a reland of 8b4820d36d804f7ee3a0a601d06708e66be3be24

Original change's description:
> gatekeeper: Put build db in a persistent location
>
> TBR=nodir
>
> Recipe-Nontrivial-Roll: build_limited_scripts_slave
> Bug:  886575 
> Change-Id: Id835b7d8f068185e27c043902d04a51204a86751
> Reviewed-on: https://chromium-review.googlesource.com/1236544
> Commit-Queue: Stephen Martinis <martiniss@chromium.org>
> Reviewed-by: Stephen Martinis <martiniss@chromium.org>
> Auto-Submit: Stephen Martinis <martiniss@chromium.org>

TBR=nodir

Recipe-Nontrivial-Roll: build_limited_scripts_slave
Bug:  886575 
Change-Id: I7be3722d0003a8eeb2e3f9e4a291bd24a10279df
Reviewed-on: https://chromium-review.googlesource.com/1239259
Commit-Queue: Stephen Martinis <martiniss@chromium.org>
Reviewed-by: Nodir Turakulov <nodir@chromium.org>
Reviewed-by: Stephen Martinis <martiniss@chromium.org>

[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipes/gatekeeper.expected/whitelist_config.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/whitelist_config.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipes/gatekeeper.expected/production_data.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipes/gatekeeper.expected/keep_going.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/basic.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/api.py
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/keep_going.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/README.recipes.md
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/__init__.py
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipes/gatekeeper.expected/basic.json
[modify] https://crrev.com/ba7ea34bdd37502d30b52422d96282aeb75a6477/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/production_data.json

Status: Fixed (was: Assigned)
https://ci.chromium.org/buildbot/chromium.gatekeeper/Chromium%20Gatekeeper/1276207 ran with this CL successfully. Hopefully this will fix everything for real this time. Re-open if it breaks again.
Blocking: 889005
Status: Assigned (was: Fixed)
This is still happening. See issue 889005 for more analysis. The symptom observed there is that some builds are also omitted and the tree is not closed. Both that issue and the present one have the same root cause: the db is not being saved in the cache.
Cc: zhangtiff@chromium.org seanmccullough@chromium.org
Issue 889005 has been merged into this issue.
I think the temporary plan to make sure gatekeeper isn't broken is to roll back the migration to remote run. I don't think we can easily migrate the bot to LUCI right now, and I don't want to be time pressured when doing the migration.

Sean, any thoughts?
Project Member

Comment 26 by bugdroid1@chromium.org, Oct 4

Project Member

Comment 27 by bugdroid1@chromium.org, Oct 5

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/build/+/5f2af917e61a1f7d008c7acbf1f9156303b0b08e

commit 5f2af917e61a1f7d008c7acbf1f9156303b0b08e
Author: Michael Achenbach <machenbach@chromium.org>
Date: Fri Oct 05 06:44:54 2018

[Gatekeeper] Switch gatekeeper to kitchen path config

Bug:  886575 
Change-Id: I28852b05e5519486e676155ae8c22c956685db74
Reviewed-on: https://chromium-review.googlesource.com/c/1260166
Commit-Queue: Michael Achenbach <machenbach@chromium.org>
Reviewed-by: Sergiy Byelozyorov <sergiyb@chromium.org>

[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipes/gatekeeper.expected/whitelist_config.json
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipe_modules/gatekeeper/tests/call.py
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipes/gatekeeper.expected/production_data.json
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/masters/master.chromium.gatekeeper/master.cfg
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipes/gatekeeper.expected/keep_going.json
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipe_modules/gatekeeper/api.py
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/README.recipes.md
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipe_modules/gatekeeper/__init__.py
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipes/gatekeeper.expected/basic.json
[modify] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipes/gatekeeper.py
[add] https://crrev.com/5f2af917e61a1f7d008c7acbf1f9156303b0b08e/scripts/slave/recipe_modules/gatekeeper/tests/call.expected/kitchen.json

Owner: machenb...@chromium.org
Restarted gatekeeper master manually for change above.
Status: Fixed (was: Assigned)
This seems fixed now. Observation: From the second run on (https://ci.chromium.org/buildbot/chromium.gatekeeper/Chromium%20Gatekeeper/1280212) the extra debug output now shows the 'db before' steps, proving the dbs exist on startup, i.e. they got cached.

It's a bit strange that there is no property in the UI stating path_config=kitchen, but it seems to work nonetheless.

I'll keep some of the debug output until all gatekeeper instances have been migrated to LUCI. There's some potential that it'll break again...
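A rough sketch of the kind of startup check the 'db before' output performs, assuming the dbs are JSON files in a single cache directory. The function name and directory layout are assumptions, not the actual debug code.

```python
import glob
import os
import tempfile


def dbs_before(cache_dir):
    """List the db files that already exist in cache_dir at startup."""
    pattern = os.path.join(cache_dir, "*.json")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


cache_dir = tempfile.mkdtemp()  # stands in for the persistent cache directory
print(dbs_before(cache_dir))  # [] on the very first run

# Simulate the first run saving its db.
with open(os.path.join(cache_dir, "v8.json"), "w") as f:
    f.write("{}")

print(dbs_before(cache_dir))  # ['v8.json'] from the second run on
```

An empty list on every run would mean the directory is not being preserved between runs.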
Cc: tandrii@chromium.org
+ tandrii FYI. You stated in a tree message that the infra tree doesn't get closed properly; maybe that's the same as issue 889005? I hope that is fixed now with the CL above. If not, please paste the build that should have closed the tree.
Thanks! We have a couple of long-time red builders, so we'll need to break yet another builder first. I'll keep an eye on things.
