Chrome PFQ failing in cheets_CTS_N tests ("No such file or directory: ... .#copy_images.sh")
Issue description
+ARC constables (risan, levarum)
+Those who landed CTS autotest changes over the weekend (rohitbm, ihf)
https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_minnie-chrome-pfq/builds/2134
Unhandled OSError: [Errno 2] No such file or directory: '/usr/local/autotest/results/shared/cache/cache/XXXXXXXXXX/android-cts-media-1.3/android-cts-media-1.3/.#copy_images.sh'
,
Dec 11 2017
Hmmm, but the same failure is seen on versions before CL 523762 (the _clea**r**_download_cache_if_needed() stacktrace below is from the old rev.)
https://stainless.corp.google.com/search?exclude_retried=true&first_date=20171112&master_builder_name=&builder_name_number=&shard=&exclude_acts=true&builder_name=&master_builder_name_number=&owner=&retry=&exclude_cts=false&exclude_non_production=true&hostname=&board=%5Easuka%24&test=cheets_GTS.5&exclude_not_run=false&build=%5ER65%5C-10193%5C.0%5C.0%24&status=FAIL&status=ERROR&status=ABORT&reason=&waterfall=&suite=&last_date=20171211&exclude_non_release=true&exclude_au=true&view=list
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/160889759-chromeos-test/chromeos6-row2-rack22-host6/debug/
12/10 00:09:14.900 ERROR| logging_manager:0626| tko parser: update RUNNING reason: Unhandled OSError: [Errno 2] No such file or directory: '/usr/local/autotest/results/shared/cache/cache/3a05d334ad72728dd6b03104185219b8/android-cts-media-1.3/android-cts-media-1.3/.#copy_images.sh'
12/10 00:09:14.901 ERROR| logging_manager:0626| tko parser: The following lines were ignored:
12/10 00:09:14.901 ERROR| logging_manager:0626| tko parser: Traceback (most recent call last):
12/10 00:09:14.901 ERROR| logging_manager:0626|
12/10 00:09:14.902 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 598, in _exec
12/10 00:09:14.902 ERROR| logging_manager:0626|
12/10 00:09:14.903 ERROR| logging_manager:0626| tko parser: _cherry_pick_call(self.initialize, *args, **dargs)
12/10 00:09:14.903 ERROR| logging_manager:0626|
12/10 00:09:14.904 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/client/common_lib/test.py", line 746, in _cherry_pick_call
12/10 00:09:14.904 ERROR| logging_manager:0626|
12/10 00:09:14.905 ERROR| logging_manager:0626| tko parser: return func(*p_args, **p_dargs)
12/10 00:09:14.905 ERROR| logging_manager:0626|
12/10 00:09:14.905 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/tradefed_test.py", line 455, in initialize
12/10 00:09:14.906 ERROR| logging_manager:0626|
12/10 00:09:14.906 ERROR| logging_manager:0626| tko parser: self._clear_download_cache_if_needed()
12/10 00:09:14.907 ERROR| logging_manager:0626|
12/10 00:09:14.907 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/tradefed_test.py", line 653, in _clear_download_cache_if_needed
12/10 00:09:14.908 ERROR| logging_manager:0626|
12/10 00:09:14.908 ERROR| logging_manager:0626| tko parser: size = self._dir_size(self._tradefed_cache)
12/10 00:09:14.908 ERROR| logging_manager:0626|
12/10 00:09:14.909 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/tradefed_test.py", line 644, in _dir_size
12/10 00:09:14.909 ERROR| logging_manager:0626|
12/10 00:09:14.910 ERROR| logging_manager:0626| tko parser: os.path.getsize(os.path.join(root, name)) for name in files)
12/10 00:09:14.911 ERROR| logging_manager:0626|
12/10 00:09:14.911 ERROR| logging_manager:0626| tko parser: File "/usr/local/autotest/server/cros/tradefed_test.py", line 644, in <genexpr>
12/10 00:09:14.912 ERROR| logging_manager:0626|
12/10 00:09:14.913 ERROR| logging_manager:0626| tko parser: os.path.getsize(os.path.join(root, name)) for name in files)
12/10 00:09:14.913 ERROR| logging_manager:0626|
12/10 00:09:14.914 ERROR| logging_manager:0626| tko parser: File "/usr/lib/python2.7/genericpath.py", line 49, in getsize
12/10 00:09:14.915 ERROR| logging_manager:0626|
12/10 00:09:14.916 ERROR| logging_manager:0626| tko parser: return os.stat(filename).st_size
12/10 00:09:14.916 ERROR| logging_manager:0626|
,
Dec 11 2017
+ChromeOS sheriffs
,
Dec 11 2017
Downloaded and extracted the file locally: https://source.android.com/compatibility/cts/downloads#cts-media-files
kinaba: ~/Desktop> unzip android-cts-media-1.3.zip
Archive: android-cts-media-1.3.zip
creating: android-cts-media-1.3/
inflating: android-cts-media-1.3/copy_media.sh
inflating: android-cts-media-1.3/make_zip.sh
linking: android-cts-media-1.3/.#copy_images.sh -> jinpark@jinpark.seo.corp.google.com.153778:1471402662
inflating: android-cts-media-1.3/README.txt
...
So .#copy_images.sh is a presumably unintended symlink that may be causing the weird outcome. I'm not sure, though, why it wasn't causing the problem until now.
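A quick way to spot such entries without extracting the archive is to look at the Unix mode bits stored in the zip directory. The following is only a sketch: `find_symlinks` is a hypothetical helper name, not something from autotest, and it assumes the archive was created on a Unix system so that the mode bits are recorded.

```python
import stat
import zipfile


def find_symlinks(zip_path):
    """Return (entry name, link target) pairs for entries stored as symlinks."""
    links = []
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            # For archives created on Unix (create_system == 3), the file
            # mode is stored in the high 16 bits of external_attr.
            mode = info.external_attr >> 16
            if info.create_system == 3 and stat.S_ISLNK(mode):
                # The "content" of a symlink entry is the link target.
                target = zf.read(info).decode("utf-8", "replace")
                links.append((info.filename, target))
    return links
```

Running it against a downloaded copy of the media zip, e.g. `find_symlinks("android-cts-media-1.3.zip")`, should list the `.#copy_images.sh` entry if it is stored the way the unzip output above suggests.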
,
Dec 11 2017
OK, I think I got the whole picture now.
(1) The broken symlink has always been included in the CTS-media zip.
(2) We didn't hit the issue until https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/523762, because we only cached and shared the zip file; each job decompressed the zip file and simply deleted the decompressed tree each time.
(3) Now, after 523762, the decompressed zip tree is also put into the shared cache dir, which is subject to the recursive getsize() that the occasional cache cleanup runs to keep the server disk from filling up.
(2) is the reason why we recently started seeing the failure, and (3) is the reason why we see the failure even with older versions. (A job scheduled after the change in (2) will fail regardless of whether 523762 is used or not, because the cleanup code for (3) has been there since the older versions.)
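The failure mode in (3) can be reproduced outside autotest. The sketch below is an illustration only: `dir_size` mimics the os.walk() plus os.path.getsize() pattern visible in the stack traces, and `demo_failure`, the temporary directory, and the link target name are made up for the example.

```python
import errno
import os
import shutil
import tempfile


def dir_size(directory):
    """Naive recursive size, summing os.path.getsize() over every file."""
    total = 0
    for root, _, files in os.walk(directory):
        # os.walk() lists a dangling symlink among the files, and
        # getsize() follows the link, so os.stat() raises ENOENT.
        total += sum(os.path.getsize(os.path.join(root, name)) for name in files)
    return total


def demo_failure():
    """Build a tiny cache dir with a dangling symlink; return the errno hit."""
    cache = tempfile.mkdtemp()
    try:
        with open(os.path.join(cache, "README.txt"), "w") as f:
            f.write("hello")
        # Dangling link: the target does not exist anywhere.
        os.symlink("no-such-target", os.path.join(cache, ".#copy_images.sh"))
        try:
            dir_size(cache)
        except OSError as e:
            return e.errno
        return None
    finally:
        shutil.rmtree(cache)
```

`demo_failure()` returns `errno.ENOENT`, the same "[Errno 2] No such file or directory" seen in the logs, even though the directory listing itself succeeds.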
,
Dec 11 2017
The server-side directory is already tainted by the broken symlink, so the fix has to deal with it at least. Just reverting the mentioned change does not fix the issue. As the quickest fix I'll ignore the error at (3). Here's the CL: https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/818670/2/server/cros/tradefed_test.py CTS Media 1.4 (https://source.android.com/compatibility/cts/downloads#cts-media-files) does not contain the problematic .# link. So a longer term fix may be to switch to it.
,
Dec 11 2017
,
Dec 11 2017
,
Dec 11 2017
Locally confirmed that https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/818670/2/server/cros/tradefed_test.py lets the tests survive even when bad symlinks exist. The "directly affected" CTS tests (i.e., CtsMediaStressTestCases and cheets_CTS_N.all, which use the media files) still fail because shutil.copytree() fails on the file, but since they are not part of the CQ, we should be able to deal with them separately.
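One possible way to handle the remaining shutil.copytree() failure, shown purely as a sketch and not as what the thread ended up doing, is to skip the offending entry while copying. `copy_media_tree` is a hypothetical name, and the `.#*` pattern assumes the only bad entries are lock-file-style links like `.#copy_images.sh` (the name and target resemble an editor lock file, but that is a guess).

```python
import shutil


def copy_media_tree(src, dst):
    """Copy src to dst, leaving out entries whose names start with '.#'."""
    # ignore_patterns() filters by name before copytree() touches the
    # entry, so the dangling symlink is never followed or copied.
    shutil.copytree(src, dst, ignore=shutil.ignore_patterns(".#*"))
```

Both `ignore=` and `shutil.ignore_patterns()` are also available in Python 2.7, which is what the traces in this bug show. Switching to CTS Media 1.4, as suggested earlier, would avoid the need for any such filter.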
,
Dec 11 2017
,
Dec 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/6fd2f1e240b9e4064a595d1be8535b752abe7b2d commit 6fd2f1e240b9e4064a595d1be8535b752abe7b2d Author: Kazuhiro Inaba <kinaba@chromium.org> Date: Mon Dec 11 03:53:28 2017 cheets_CTS: workaround for broken symlink file in CTS media zip. For some reason the archive contains a broken symlink which causes our recursive stat to fail. As a very quick workaround, catch the exception and assume the file to be size 0. BUG= chromium:793696 TEST=trybot Change-Id: I25e259ca9de77bf4adb36cf1960a75ecbd5e861c Reviewed-on: https://chromium-review.googlesource.com/818670 Tested-by: Kazuhiro Inaba <kinaba@chromium.org> Trybot-Ready: Kazuhiro Inaba <kinaba@chromium.org> Reviewed-by: Yuichiro Hanada <yhanada@chromium.org> Reviewed-by: Shuo-Peng Liao <deanliao@chromium.org> [modify] https://crrev.com/6fd2f1e240b9e4064a595d1be8535b752abe7b2d/server/cros/tradefed_test.py
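The workaround described in the commit message (catch the exception and treat the file as size 0) might look roughly like the following. This is a sketch based only on that description, not the actual diff; `dir_size_tolerant` is a made-up name.

```python
import os


def dir_size_tolerant(directory):
    """Recursive size that counts unreadable entries as 0 bytes."""
    total = 0
    for root, _, files in os.walk(directory):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # Broken symlink (or a file that vanished meanwhile):
                # assume size 0 and keep going.
                pass
    return total
```

Catching OSError per file, rather than around the whole walk, keeps the total meaningful for the cache-cleanup threshold while still ignoring the one bad entry.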
,
Dec 11 2017
Landed to M65 (ToT). This should at least unblock the Chrome PFQ. CTS failures will still remain on M63 and M64 until we cherry-pick the CL above, but let's wait a moment until we confirm the M65 status.
,
Dec 11 2017
I'm not familiar with how each bot picks up the version to run, but 10204.0.0-rc2 paladin runs include my autotest fix: https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_minnie-paladin/builds/4729 so CQs/PFQs using 10204.0.0-rc2 or above should pass. (In other words, 10204.0.0-rc1 will fail.)
,
Dec 11 2017
Let's see if ongoing master-paladin passes: https://chromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/17165
,
Dec 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/a146c0e9aa187a9920cf81d44aa9a3f1015817a6 commit a146c0e9aa187a9920cf81d44aa9a3f1015817a6 Author: Kazuhiro Inaba <kinaba@chromium.org> Date: Mon Dec 11 11:51:16 2017 cheets_CTS: workaround for broken symlink file in CTS media zip. For some reason the archive contains a broken symlink which causes our recursive stat to fail. As a very quick workaround, catch the exception and assume the file to be size 0. BUG= chromium:793696 TEST=trybot Change-Id: I25e259ca9de77bf4adb36cf1960a75ecbd5e861c Reviewed-on: https://chromium-review.googlesource.com/818670 Tested-by: Kazuhiro Inaba <kinaba@chromium.org> Trybot-Ready: Kazuhiro Inaba <kinaba@chromium.org> Reviewed-by: Yuichiro Hanada <yhanada@chromium.org> Reviewed-by: Shuo-Peng Liao <deanliao@chromium.org> (cherry picked from commit 6fd2f1e240b9e4064a595d1be8535b752abe7b2d) Reviewed-on: https://chromium-review.googlesource.com/819071 Reviewed-by: Kazuhiro Inaba <kinaba@chromium.org> [modify] https://crrev.com/a146c0e9aa187a9920cf81d44aa9a3f1015817a6/server/cros/tradefed_test.py
,
Dec 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/e2b5536702b07f8ed365e18702c96dc1e92868f7 commit e2b5536702b07f8ed365e18702c96dc1e92868f7 Author: Kazuhiro Inaba <kinaba@chromium.org> Date: Mon Dec 11 11:51:34 2017 cheets_CTS: workaround for broken symlink file in CTS media zip. For some reason the archive contains a broken symlink which causes our recursive stat to fail. As a very quick workaround, catch the exception and assume the file to be size 0. BUG= chromium:793696 TEST=trybot Change-Id: I25e259ca9de77bf4adb36cf1960a75ecbd5e861c Reviewed-on: https://chromium-review.googlesource.com/818670 Tested-by: Kazuhiro Inaba <kinaba@chromium.org> Trybot-Ready: Kazuhiro Inaba <kinaba@chromium.org> Reviewed-by: Yuichiro Hanada <yhanada@chromium.org> Reviewed-by: Shuo-Peng Liao <deanliao@chromium.org> (cherry picked from commit 6fd2f1e240b9e4064a595d1be8535b752abe7b2d) Reviewed-on: https://chromium-review.googlesource.com/819070 Reviewed-by: Kazuhiro Inaba <kinaba@chromium.org> [modify] https://crrev.com/e2b5536702b07f8ed365e18702c96dc1e92868f7/server/cros/tradefed_test.py
,
Dec 11 2017
master-paladin went green, as did all the other PFQ bots that were dead due to this failure. Things should be all set now.
,
Dec 11 2017
I still don't fully understand how this slowly fell over, but thank you for fixing!
,
Dec 11 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/third_party/autotest/+/63d0c669f5f9af01f69318fb0649e028ff013c53 commit 63d0c669f5f9af01f69318fb0649e028ff013c53 Author: Kazuhiro Inaba <kinaba@chromium.org> Date: Mon Dec 11 23:44:13 2017 cheets_CTS: workaround for broken symlink file in CTS media zip. For some reason the archive contains a broken symlink which causes our recursive stat to fail. As a very quick workaround, catch the exception and assume the file to be size 0. BUG= chromium:793696 TEST=trybot Change-Id: I25e259ca9de77bf4adb36cf1960a75ecbd5e861c Reviewed-on: https://chromium-review.googlesource.com/818670 Tested-by: Kazuhiro Inaba <kinaba@chromium.org> Trybot-Ready: Kazuhiro Inaba <kinaba@chromium.org> Reviewed-by: Yuichiro Hanada <yhanada@chromium.org> Reviewed-by: Shuo-Peng Liao <deanliao@chromium.org> (cherry picked from commit 6fd2f1e240b9e4064a595d1be8535b752abe7b2d) Reviewed-on: https://chromium-review.googlesource.com/819071 Reviewed-by: Kazuhiro Inaba <kinaba@chromium.org> (cherry picked from commit a146c0e9aa187a9920cf81d44aa9a3f1015817a6) Reviewed-on: https://chromium-review.googlesource.com/820685 Reviewed-by: Grace Kihumba <gkihumba@chromium.org> Commit-Queue: Grace Kihumba <gkihumba@chromium.org> Tested-by: Grace Kihumba <gkihumba@chromium.org> [modify] https://crrev.com/63d0c669f5f9af01f69318fb0649e028ff013c53/server/cros/tradefed_test.py
,
Dec 12 2017
Re #19: the media file archive is expanded only when cheets_CTS_N.CtsMediaStressTestCases is run. Each time a MediaStress job was scheduled to a different instance, it broke that instance's environment, so the tests were killed off gradually.
Comment 1 by kinaba@chromium.org
, Dec 11 2017
12/10 13:18:20.494 WARNI| test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 598, in _exec
    _cherry_pick_call(self.initialize, *args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 746, in _cherry_pick_call
    return func(*p_args, **p_dargs)
  File "/usr/local/autotest/server/cros/tradefed_test.py", line 472, in initialize
    self._clean_download_cache_if_needed()
  File "/usr/local/autotest/server/cros/tradefed_test.py", line 708, in _clean_download_cache_if_needed
    size = self._dir_size(self._tradefed_cache)
  File "/usr/local/autotest/server/cros/tradefed_test.py", line 681, in _dir_size
    os.path.getsize(os.path.join(root, name)) for name in files)
  File "/usr/local/autotest/server/cros/tradefed_test.py", line 681, in <genexpr>
    os.path.getsize(os.path.join(root, name)) for name in files)
  File "/usr/lib/python2.7/genericpath.py", line 49, in getsize
    return os.stat(filename).st_size
OSError: [Errno 2] No such file or directory: '/usr/local/autotest/results/shared/cache/cache/3a05d334ad72728dd6b03104185219b8/android-cts-media-1.3/android-cts-media-1.3/.#copy_images.sh'

Not yet sure what's causing this failure, but the failing code path is from https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/523762, so I think we should revert it for now as a quick recovery.