Error in self._Finish not properly handled. |
|||||||||
Issue descriptionWe saw the same error on several builders, which all displayed it incorrectly. https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/gcc_toolchain/builds/163 https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/llvm_next_toolchain/builds/53 https://uberchromegw.corp.google.com/i/chromiumos.tryserver/builders/llvm_next_toolchain/builds/52 It appears that the failure was raised in the stages's _Finish() method, but that GenericStage.Run() and/or ._RunStage() didn't catch and handle it appropriately. This meant that no stage was blamed for the failure, and our waterfall incorrectly displayed success for whatever stage was at fault. Very confusing. Exception thrown, but all stages marked successful. This is an internal error, because the stage that threw the exception should be marked as failing. cbuildbot: Unhandled exception: Traceback (most recent call last): File "chromite/bin/cbuildbot", line 168, in <module> DoMain() File "chromite/bin/cbuildbot", line 164, in DoMain commandline.ScriptWrapperMain(FindTarget) File "/b/cbuild/internal_master/chromite/lib/commandline.py", line 834, in ScriptWrapperMain ret = target(argv[1:]) File "/b/cbuild/internal_master/chromite/scripts/cbuildbot.py", line 1317, in main logging.info('One stage exited early: %s', ex) File "/b/cbuild/internal_master/chromite/scripts/cbuildbot.py", line 1314, in main _RunBuildStagesWrapper(options, site_config, build_config) File "/b/cbuild/internal_master/chromite/scripts/cbuildbot.py", line 248, in _RunBuildStagesWrapper if not builder.Run(): File "/b/cbuild/internal_master/chromite/cbuildbot/builders/generic_builders.py", line 316, in Run self.RunStages() File "/b/cbuild/internal_master/chromite/cbuildbot/builders/simple_builders.py", line 345, in RunStages self._RunDefaultTypeBuild() File "/b/cbuild/internal_master/chromite/cbuildbot/builders/simple_builders.py", line 330, in _RunDefaultTypeBuild self.RunEarlySyncAndSetupStages() File "/b/cbuild/internal_master/chromite/cbuildbot/builders/simple_builders.py", line 264, in RunEarlySyncAndSetupStages self._RunStage(android_stages.AndroidMetadataStage) File "/b/cbuild/internal_master/chromite/cbuildbot/builders/generic_builders.py", line 107, in _RunStage return stage_instance.Run() File "/b/cbuild/internal_master/chromite/cbuildbot/stages/generic_stages.py", line 648, in Run self._Finish() File "/b/cbuild/internal_master/chromite/cbuildbot/stages/android_stages.py", line 144, in _Finish self._WriteAndroidVersionToMetadata() File "/b/cbuild/internal_master/chromite/cbuildbot/stages/android_stages.py", line 137, in _WriteAndroidVersionToMetadata self.UploadMetadata(filename=constants.PARTIAL_METADATA_JSON) File "/b/cbuild/internal_master/chromite/lib/failures_lib.py", line 172, in wrapped_functor return functor(*args, **kwargs) File "/b/cbuild/internal_master/chromite/cbuildbot/stages/generic_stages.py", line 1056, in UploadMetadata self.UploadArtifact(filename, archive=False) File "/b/cbuild/internal_master/chromite/lib/failures_lib.py", line 172, in wrapped_functor return functor(*args, **kwargs) File "/b/cbuild/internal_master/chromite/cbuildbot/stages/generic_stages.py", line 992, in UploadArtifact update_list=True, acl=self.acl) File "/b/cbuild/internal_master/chromite/lib/failures_lib.py", line 172, in wrapped_functor return functor(*args, **kwargs) File "/b/cbuild/internal_master/chromite/cbuildbot/commands.py", line 1843, in UploadArchivedFile _UploadPathToGS(file_path, upload_urls, debug, timeout, acl=acl) File "/b/cbuild/internal_master/chromite/lib/failures_lib.py", line 187, in wrapped_functor raise self.category_exception(exc_infos=exc_infos) chromite.lib.failures_lib.GSUploadFailure: <class 'chromite.lib.gs.GSCommandError'>: return code: 1; command: /b/cbuild/internal_master/.cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' -m cp -v -- /b/cbuild/internal_master/trybot_archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json gs://chromeos-image-archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json Copying file:///b/cbuild/internal_master/trybot_archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json [Content-Type=text/plain]... Uploading ...toolchain/R57-9170.0.0-b52/partial-metadata.json: 0 B/1.79 KiB Uploading ...toolchain/R57-9170.0.0-b52/partial-metadata.json: 1.79 KiB/1.79 KiB ServiceException: 503 Backend Error CommandException: 1 file/object could not be transferred. cmd=['/b/cbuild/internal_master/.cache/common/gsutil_4.19.tar.gz/gsutil/gsutil', '-o', 'Boto:num_retries=10', '-m', 'cp', '-v', '--', '/b/cbuild/internal_master/trybot_archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json', 'gs://chromeos-image-archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json'], extra env={'BOTO_CONFIG': '/b/build/site_config/.boto'} Traceback (most recent call last): File "/b/cbuild/internal_master/chromite/lib/failures_lib.py", line 172, in wrapped_functor return functor(*args, **kwargs) File "/b/cbuild/internal_master/chromite/cbuildbot/commands.py", line 1824, in _UploadPathToGS recursive=True) File "/b/cbuild/internal_master/chromite/lib/gs.py", line 584, in CopyInto **kwargs) File "/b/cbuild/internal_master/chromite/lib/gs.py", line 874, in Copy raise File "/b/cbuild/internal_master/chromite/lib/gs.py", line 851, in Copy result = self.DoCommand(cmd, **kwargs) File "/b/cbuild/internal_master/chromite/lib/gs.py", line 782, in DoCommand raise GSCommandError(e.msg, e.result, e.exception) GSCommandError: return code: 1; command: /b/cbuild/internal_master/.cache/common/gsutil_4.19.tar.gz/gsutil/gsutil -o 'Boto:num_retries=10' -m cp -v -- /b/cbuild/internal_master/trybot_archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json gs://chromeos-image-archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json Copying file:///b/cbuild/internal_master/trybot_archive/trybot-squawks-llvm-next-toolchain/R57-9170.0.0-b52/partial-metadata.json [Content-Type=text/plain]... Uploading ...toolchain/R57-9170.0.0-b52/partial-metadata.json: 0 B/1.79 KiB Uploading ...toolchain/R57-9170.0.0-b52/partial-metadata.json: 1.79 KiB/1.79 KiB ServiceException: 503 Backend Error CommandException: 1 file/object could not be transferred.
,
Jan 12 2017
I found that _Finish() declares that it can be overridden, and two stages do so. "SyncChromeStage" and "AndroidMetadataStage". However, we don't invoke this method until after error handling is over, and the default does nothing except log that the stage is finished. Both Chrome/Android stages use _Finish to ensure that metadata is updated, even if the stage is failing.
,
Jan 14 2017
The following revision refers to this bug: https://chromium.googlesource.com/chromiumos/chromite/+/c5c5f25f35f4c3cdf034b69f4077ce8ac8b988a0 commit c5c5f25f35f4c3cdf034b69f4077ce8ac8b988a0 Author: Don Garrett <dgarrett@google.com> Date: Thu Jan 12 01:36:34 2017 SyncChromeStage, AndroidMetadataStage: Make _Finish error safe. The _Finish method is available for stages to override, but errors in it are not associated with the stage in question. This leads to very bad waterfall reporting, confusion, and finally, cats and dogs living together. Stop the only two users of this method from raising errors if GS is behaving badly. BUG= chromium:680248 TEST=Unittests Change-Id: Id3b0696e057aae7fe4c5483d629b056f43c42485 Reviewed-on: https://chromium-review.googlesource.com/427384 Commit-Ready: Don Garrett <dgarrett@chromium.org> Tested-by: Don Garrett <dgarrett@chromium.org> Reviewed-by: Aviv Keshet <akeshet@chromium.org> [modify] https://crrev.com/c5c5f25f35f4c3cdf034b69f4077ce8ac8b988a0/cbuildbot/stages/android_stages.py [modify] https://crrev.com/c5c5f25f35f4c3cdf034b69f4077ce8ac8b988a0/cbuildbot/stages/chrome_stages.py
,
Jan 17 2017
,
Jan 17 2017
Short term fix is in place, but a more general cleanup would be better. In short, either ensure that we properly handle errors in _Finish, or (better) just get rid of the method.
,
Jan 31 2017
,
Feb 2 2017
,
Apr 17 2017
,
May 30 2017
,
Aug 1 2017
,
Oct 14 2017
|
|||||||||
►
Sign in to add a comment |
|||||||||
Comment 1 by dgarr...@chromium.org
, Jan 12 2017Status: Started (was: Untriaged)