Issue 659178

Issue metadata

Status: Fixed
Owner:
Closed: Oct 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Windows
Pri: 3
Type: Bug



Exception on win64 builder

Project Member Reported by katthomas@google.com, Oct 25 2016

Issue description

(See also https://groups.google.com/a/google.com/forum/#!topic/chrome-troopers/vn8DUlrCQ5E)

Win-x64-Builder has been failing since build #26217. Here is what I found.

At #26217, git.bat failed with the following message:
WARNING: subprocess '"git.bat" "fetch" "-v" "--progress" "origin" "+refs/heads/*:refs/heads/*"' in C:\b\c\git_cache\chromium.googlesource.com-chromium-deps-libsrtp failed; will retry after a short nap...

We failed gclient sync, lets delete the checkout and retry.
.git detected in checkout, deleting C:\b\c\b\win\.gclient... Marking for removal C:\b\c\b\win\.gclient => C:\b\c\b\build.dead\86d61fa41fe2472c9114d89b63617d00
done
.git detected in checkout, deleting C:\b\c\b\win\.gclient_entries... Marking for removal C:\b\c\b\win\.gclient_entries => C:\b\c\b\build.dead\63930a14eb274ceab95299ce3d239614
done
.git detected in checkout, deleting C:\b\c\b\win\src... Marking for removal C:\b\c\b\win\src => C:\b\c\b\build.dead\54c45235d3174f628fc3209b42a8e5b6

Please note that there is no "done" message for the last removal. The relevant code is ensure_no_checkout in bot_update.py:
https://cs.corp.google.com/eureka_internal/chromium/tools/depot_tools/recipe_modules/bot_update/resources/bot_update.py?q=def+ensure_no_checkout&dr=CSs&l=694

Because the removal of c:\b\c\b\win\src failed, all subsequent builds, from #26218 to #26285, failed with the following message:

fatal: Unable to create 'C:/b/c/b/win/src/.git/index.lock': File exists.
... If it still fails, a git process may have crashed in this repository earlier:
remove the file manually to continue.

The log of #26217 shows that the removal of the first two folders (.gclient and .gclient_entries) succeeded. Nevertheless, the folder c:\b\c\b\build.dead did not exist at 12:30 pm on Oct 24. (Where did it go?)
While #26286 was sleeping before another attempt, after four failed checkout attempts, I manually moved c:\b\c\b\win\src to C:\b\c\b\build.dead\54c45235d3174f628fc3209b42a8e5b6.
This caused the following error, and the machine rebooted.
===backing off, sleeping for 280 secs===
...
WindowsError: [Error 267] The directory name is invalid

The next build, #26287, passed the point at which all the previous builds failed. It is still running, and has run for 1.5 hours without an error.
Here are my questions.
It seems that os.rename(".../win/src", "..../build.dead/xxx_hash_xx") did not complete successfully, yet the log contains no message about the failure.
Any thoughts on why no log message appears for the unsuccessful os.rename() of win/src?
Since both folders are on the same partition, the rename should have completed almost immediately.
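
For illustration only: the real bot_update.py code is not quoted in this report, so the following is a minimal Python 3 sketch of how a move into build.dead could log both outcomes explicitly, so that a failed rename never passes silently. The function name move_aside and the log parameter are hypothetical and not taken from depot_tools.

```python
import os
import uuid


def move_aside(path, dead_dir, log=print):
    """Rename `path` into `dead_dir` under a random name, logging the outcome.

    Hypothetical helper, not the depot_tools implementation.
    Returns the destination path on success and None on failure.
    """
    os.makedirs(dead_dir, exist_ok=True)
    dest = os.path.join(dead_dir, uuid.uuid4().hex)
    log('Marking for removal %s => %s' % (path, dest))
    try:
        os.rename(path, dest)
    except OSError as e:
        # A rename that fails (for example because a file is still in use)
        # surfaces here, so the log always shows why "done" is missing.
        log('FAILED to move %s: %r' % (path, e))
        return None
    log('done')
    return dest
```

With a wrapper like this, every "Marking for removal" line is followed either by "done" or by a line starting with "FAILED", which would make a case like the win\src removal easier to diagnose from the log alone.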

The exception persisted for approximately 2 days and 6 hours, yet I could not find an alert mail for it.
I wonder whether I missed the alert mail, or whether we should create a new alert for a single builder that fails with an exception for a certain period of time.
I found that http://o/e/m24b3c04d88000047 and http://o/e/m24b3c54248000058 were created soon after the initial failure, but they only check the overall failure rate,
so the alerts stopped soon after the overall build failure rate dropped below 0.15.
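
As a rough sketch of the per-builder alert suggested above (not an existing monitoring rule), the check below fires when a single builder has been failing without interruption for longer than a threshold, regardless of the overall failure rate. The function names, the (finished_at, succeeded) input shape, and the 6-hour default are all assumptions made for the example.

```python
from datetime import datetime, timedelta


def failing_since(builds):
    """Return when the current unbroken run of failures began, or None.

    `builds` is a list of (finished_at, succeeded) tuples for one builder,
    ordered oldest first. Hypothetical input shape.
    """
    start = None
    for finished_at, succeeded in builds:
        if succeeded:
            start = None          # a success resets the streak
        elif start is None:
            start = finished_at   # first failure of a new streak
    return start


def should_alert(builds, now, threshold=timedelta(hours=6)):
    """True if the builder has failed continuously for at least `threshold`."""
    start = failing_since(builds)
    return start is not None and now - start >= threshold
```

Because the check looks at one builder's streak rather than a fleet-wide ratio, it would keep firing while that builder stays broken, even after the overall failure rate recovers.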

The log from the first failure shows that the removals of c:\b\c\b\win\.gclient_entries and C:\b\c\b\win\.gclient succeeded.
Nevertheless, c:\b\c\b\build.dead did not exist when I started looking into the builder. Is build.dead cleaned up regularly by a script?
 
Status: Fixed (was: Assigned)
Project Member

Comment 4 by bugdroid1@chromium.org, Oct 31 2016

The following revision refers to this bug:
  https://chromium.googlesource.com/chromium/tools/depot_tools.git/+/86fe47d1a076563a664adaaf2ae9e4d8dbe86932

commit 86fe47d1a076563a664adaaf2ae9e4d8dbe86932
Author: katthomas <katthomas@google.com>
Date: Mon Oct 31 20:49:14 2016

Make bot_update on win more resilient

Sometimes Windows has trouble deleting files. This can cause problems
when lockfiles are left in .git directories.

R=agable@google.com
BUG= 659178 

Review-Url: https://codereview.chromium.org/2454463002

[modify] https://crrev.com/86fe47d1a076563a664adaaf2ae9e4d8dbe86932/recipe_modules/bot_update/resources/bot_update.py
[modify] https://crrev.com/86fe47d1a076563a664adaaf2ae9e4d8dbe86932/tests/bot_update_coverage_test.py
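
The diff for this commit is not reproduced in this thread. Purely as an illustration of the problem the commit message describes (lockfiles left behind in .git directories), here is a minimal Python 3 sketch that removes stale index.lock files before retrying a checkout. The function name remove_stale_git_locks is hypothetical, and the approach assumes no git process is still running against the checkout.

```python
import os


def remove_stale_git_locks(checkout_root, log=print):
    """Delete leftover index.lock files in .git directories under checkout_root.

    Hypothetical helper; only safe when no git process is using the checkout.
    Returns the list of lock files that were removed.
    """
    removed = []
    for dirpath, dirnames, filenames in os.walk(checkout_root):
        if os.path.basename(dirpath) != '.git':
            continue
        dirnames[:] = []  # do not descend into the .git directory itself
        if 'index.lock' in filenames:
            lock = os.path.join(dirpath, 'index.lock')
            try:
                os.remove(lock)
            except OSError as e:
                # Log and carry on; the caller can decide whether to retry.
                log('Could not remove %s: %r' % (lock, e))
            else:
                log('Removed stale lock %s' % lock)
                removed.append(lock)
    return removed
```

Running something like this before git is invoked again would address the "Unable to create ... index.lock: File exists" failure reported above, at the cost of having to be sure that no live git process owns the lock.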
