
Issue 893677


Issue metadata

Status: Available
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Crash in stage causes builder class crash, which prevents build result reporting.

Project Member Reported by ahass...@chromium.org, Oct 9

Issue description

The CL crrev.com/c/1238974 is stuck in the pre-cq. The failure seems to be caused by a missing libbrotlidec.so:

16:15:48: INFO: RunCommand: delta_generator '--major_version=1' '--out_file=/tmp/generate_payloads3iT1iX/update.gz' '--private_key=' '--out_metadata_size_file=' '--new_image=/tmp/cros_generate_update_payload7080V4' '--new_kernel=/tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin' '--new_channel=' '--new_board=' '--new_version=' '--new_key=' '--new_build_channel=' '--new_build_version=' '--rootfs_partition_size=2516582400'
delta_generator: error while loading shared libraries: libbrotlidec.so.1.0.1: cannot open shared object file: No such file or directory
cros_generate_update_payload: Unhandled exception:
Traceback (most recent call last):
  File "/mnt/host/source/chromite/bin/cros_generate_update_payload", line 169, in <module>
    DoMain()
  File "/mnt/host/source/chromite/bin/cros_generate_update_payload", line 165, in DoMain
    commandline.ScriptWrapperMain(FindTarget)
  File "/mnt/host/source/chromite/lib/commandline.py", line 912, in ScriptWrapperMain
    ret = target(argv[1:])
  File "/mnt/host/source/chromite/scripts/cros_generate_update_payload.py", line 350, in main
    return GenerateUpdatePayload(opts)
  File "/mnt/host/source/chromite/scripts/cros_generate_update_payload.py", line 228, in GenerateUpdatePayload
    cros_build_lib.RunCommand([_DELTA_GENERATOR] + generator_args)
  File "/mnt/host/source/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)

This prevents the artifacts from being uploaded.

I think the solution is to add a version blocker for bsdiff.
An example build that demonstrates the failure:

https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?buildbucketId=8933195794831733152


I believe the failure that breaks the reporting is somewhere in the log below.

The problem is that some of those backtraces are OUTSIDE of stages, and so are not captured and reported correctly. We do not properly handle or report crashes in the builder classes, nor can we reasonably do so with our current stage-based reporting structure.


************************************************************
** Finished Stage Archive - Mon, 08 Oct 2018 16:15:02 -0700 (PDT)
************************************************************
16:27:47: ERROR: BaseException in _RunParallelStages <class 'chromite.lib.cros_build_lib.RunCommandError'>: return code: 1; command: cros_sdk -- cros_generate_update_payload --image /mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin --output /tmp/generate_payloads3iT1iX/update.gz --kern_path /tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin --root_pretruncate_path /tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin
cmd=['cros_sdk', '--', 'cros_generate_update_payload', '--image', u'/mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin', '--output', '/tmp/generate_payloads3iT1iX/update.gz', '--kern_path', '/tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin', '--root_pretruncate_path', '/tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin'], cwd=/b/swarming/w/ir/cache/cbuild/repository/src/scripts
Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/artifact_stages.py", line 734, in BuildUpdatePayloads
    self._GeneratePayloads(image_name, full=True, stateful=True, delta=True)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/artifact_stages.py", line 707, in _GeneratePayloads
    **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/commands.py", line 2943, in GeneratePayloads
    cros_build_lib.RunCommand(cmd_full, enter_chroot=True, cwd=cwd)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 1; command: cros_sdk -- cros_generate_update_payload --image /mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin --output /tmp/generate_payloads3iT1iX/update.gz --kern_path /tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin --root_pretruncate_path /tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin
cmd=['cros_sdk', '--', 'cros_generate_update_payload', '--image', u'/mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin', '--output', '/tmp/generate_payloads3iT1iX/update.gz', '--kern_path', '/tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin', '--root_pretruncate_path', '/tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin'], cwd=/b/swarming/w/ir/cache/cbuild/repository/src/scripts
Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/builders/generic_builders.py", line 120, in _RunParallelStages
    parallel.RunParallelSteps(steps)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 678, in RunParallelSteps
    return [queue.get_nowait() for queue in queues]
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 675, in RunParallelSteps
    pass
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 561, in ParallelTasks
    raise BackgroundFailure(exc_infos=errors)
BackgroundFailure: <class 'chromite.lib.cros_build_lib.RunCommandError'>: return code: 1; command: cros_sdk -- cros_generate_update_payload --image /mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin --output /tmp/generate_payloads3iT1iX/update.gz --kern_path /tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin --root_pretruncate_path /tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin
cmd=['cros_sdk', '--', 'cros_generate_update_payload', '--image', u'/mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin', '--output', '/tmp/generate_payloads3iT1iX/update.gz', '--kern_path', '/tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin', '--root_pretruncate_path', '/tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin'], cwd=/b/swarming/w/ir/cache/cbuild/repository/src/scripts
Traceback (most recent call last):
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/parallel.py", line 441, in _Run
    self._task(*self._task_args, **self._task_kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/artifact_stages.py", line 734, in BuildUpdatePayloads
    self._GeneratePayloads(image_name, full=True, stateful=True, delta=True)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/stages/artifact_stages.py", line 707, in _GeneratePayloads
    **kwargs)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/cbuildbot/commands.py", line 2943, in GeneratePayloads
    cros_build_lib.RunCommand(cmd_full, enter_chroot=True, cwd=cwd)
  File "/b/swarming/w/ir/cache/cbuild/repository/chromite/lib/cros_build_lib.py", line 647, in RunCommand
    raise RunCommandError(msg, cmd_result)
RunCommandError: return code: 1; command: cros_sdk -- cros_generate_update_payload --image /mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin --output /tmp/generate_payloads3iT1iX/update.gz --kern_path /tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin --root_pretruncate_path /tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin
cmd=['cros_sdk', '--', 'cros_generate_update_payload', '--image', u'/mnt/host/source/src/build/images/eve/R71-11139.0.2018_10_08_1606-a1/chromiumos_test_image.bin', '--output', '/tmp/generate_payloads3iT1iX/update.gz', '--kern_path', '/tmp/generate_payloads3iT1iX/full_dev_part_KERN.bin', '--root_pretruncate_path', '/tmp/generate_payloads3iT1iX/full_dev_part_ROOT.bin'], cwd=/b/swarming/w/ir/cache/cbuild/repository/src/scripts

16:27:47: INFO: Created cidb engine bot@130.211.191.11 for pid 10516
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8f610; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8f850; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8f110; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6c1390; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6c1f90; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6c10d0; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6c1f10; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd250; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd490; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd510; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd610; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd690; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd790; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd810; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd910; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd990; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fda90; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fd750; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6fdc50; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8f710; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8fb50; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8f210; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22ca8fad0; Select object>
16:27:47: INFO: Running cidb query on pid 10516, repr(query) starts with <sqlalchemy.sql.expression.Select at 0x7fb22d6c1490; Select object>
16:15:02: INFO: Created cidb engine bot@130.211.191.11 for pid 10568
16:15:02: INFO: Running cidb query on pid 10568, repr(query) starts with <sqlalchemy.sql.expression.Insert object at 0x7fb22ca8fb90>
16:15:02: INFO: Running cidb query on pid 10568, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7fb22ca8f2d0>
16:15:02: INFO: Running cidb query on pid 10568, repr(query) starts with <sqlalchemy.sql.expression.Update object at 0x7fb22d644d10>
************************************************************
** Finished Stage VMTest (attempt 1) - Mon, 08 Oct 2018 16:15:02 -0700 (PDT)
************************************************************
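
To illustrate the reporting gap described above, here is a minimal sketch of a builder-level safety net: parallel steps run outside any stage, yet a final status is still reported exactly once even if one of them raises. This is not chromite code. `run_steps_and_report`, `steps`, and `report` are hypothetical names, and the standard-library thread pool stands in for whatever chromite's parallel module actually does.

```python
import traceback
from concurrent.futures import ThreadPoolExecutor


def run_steps_and_report(steps, report):
    """Run callables in parallel and always call report() exactly once.

    `steps` is a list of zero-argument callables and `report` is a
    callable taking (status, details). Both are hypothetical stand-ins
    for a builder's parallel steps and its result-reporting hook.
    """
    status, details = "passed", []
    try:
        # Leaving the `with` block waits for every submitted step to finish.
        with ThreadPoolExecutor(max_workers=max(1, len(steps))) as pool:
            futures = [pool.submit(step) for step in steps]
        for future in futures:
            exc = future.exception()
            if exc is not None:
                status = "failed"
                summary = traceback.format_exception_only(type(exc), exc)
                details.append("".join(summary).strip())
    except BaseException as exc:
        # Anything that escapes the runner itself is still recorded,
        # then re-raised so the caller sees the original error.
        status = "failed"
        details.append(repr(exc))
        raise
    finally:
        # Reporting happens on every path, including the crash path.
        report(status, details)
    return status
```

The point of the `finally` clause is that the build result is written on every path, so a crash in a background step degrades to a reported failure rather than a build with no result.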



Summary: Crash in stage causes builder class crash, which prevents build result reporting. (was: cl got stuck in pre-cq)
Project Member

Comment 4 by bugdroid1@chromium.org, Oct 15

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/965e1d26f2fa46598dde694e12b1adb13fb67618

commit 965e1d26f2fa46598dde694e12b1adb13fb67618
Author: Amin Hassani <ahassani@chromium.org>
Date: Mon Oct 15 23:01:28 2018

bsdiff: block brotli-1.0.1

brotli-1.0.1 has API incompatibility with bsdiff. This patch adds a version
dependency to use only brotli version 1.0.6 and newer.

BUG=chromium:893677
BUG= chromium:878728 
TEST=sudo FEATURES=test bsdiff
CQ-DEPEND=CL:1238974

Change-Id: Ia216005a56bd6c3921641a82cbbfbf1da512daad
Reviewed-on: https://chromium-review.googlesource.com/1272156
Commit-Ready: Amin Hassani <ahassani@chromium.org>
Tested-by: Amin Hassani <ahassani@chromium.org>
Reviewed-by: Chirantan Ekbote <chirantan@chromium.org>

[modify] https://crrev.com/965e1d26f2fa46598dde694e12b1adb13fb67618/dev-util/bsdiff/bsdiff-9999.ebuild
[rename] https://crrev.com/965e1d26f2fa46598dde694e12b1adb13fb67618/dev-util/bsdiff/bsdiff-4.3.1-r18.ebuild

Components: Infra>Platform>Buildbot
Status: Available (was: Untriaged)
Just an FYI: the cause of the crash is fixed by #4, but the reporting is still not optimal AFAIK.
Components: -Infra>Platform>Buildbot Infra>Client>ChromeOS>CI
A crash in a stage is always supposed to be captured and handled, which is how most of our reporting works.

A crash escaping the stage is a bug in cbuildbot that needs to be addressed.
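
As a rough sketch of what "captured and handled" could look like at the stage level, the wrapper below records a result whether the stage body returns or raises. It is an illustration under assumptions, not cbuildbot's implementation: `StageResult` and `run_stage` are hypothetical names, and only `Exception` subclasses are caught here.

```python
import traceback


class StageResult:
    """Hypothetical record of one stage's outcome."""

    def __init__(self, name, passed, error=None):
        self.name = name
        self.passed = passed
        self.error = error  # Formatted traceback text, or None on success.


def run_stage(name, body, results):
    """Run body() and append a StageResult whether it succeeds or raises.

    Exceptions are converted into a failed result instead of escaping,
    so the caller can keep going and report every stage.
    """
    try:
        body()
    except Exception:
        results.append(StageResult(name, False, traceback.format_exc()))
        return False
    results.append(StageResult(name, True))
    return True
```

With a wrapper like this, a failing command inside a stage shows up as a failed entry in `results` along with its traceback, and later stages can still run and be reported.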
Labels: -Pri-1 Pri-2
