
Issue 879206

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Closed: Aug 30
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug-Regression

Blocked on:
issue 776369
issue 849137




grunt-paladin:2669 failed in UnitTest for quipper, bsdiff and libbrillo

Project Member Reported by sheriff-...@appspot.gserviceaccount.com, Aug 30

Issue description

Tests are consistently failing during UnitTest: 

quipper-0.0.1-r2413: [==========] Running 7 tests from 2 test cases.
quipper-0.0.1-r2413: [----------] Global test environment set-up.
quipper-0.0.1-r2413: [----------] 6 tests from PerfRecorderTest
quipper-0.0.1-r2413: [ RUN      ] PerfRecorderTest.RecordToProtobuf
quipper-0.0.1-r2413: [0830/144659:INFO:perf_reader.cc(798)] Number of events stored: 107
quipper-0.0.1-r2413: [0830/144659:INFO:perf_parser.cc(262)] Parser processed: 82 MMAP/MMAP2 events, 3 COMM events, 0 FORK events, 1 EXIT events, 19 SAMPLE events, 19 of these were mapped
quipper-0.0.1-r2413: [libprotobuf FATAL ../../protobuf-3.3.0/src/google/protobuf/message_lite.cc:71] CHECK failed: (bytes_produced_by_serialization) == (byte_size_before_serialization): Byte size calculation and serialization were inconsistent.  This may indicate a bug in protocol buffers or it may be caused by concurrent modification of quipper.PerfDataProto.
quipper-0.0.1-r2413: Error: /var/cache/portage/chromeos-base/quipper/out/Default/perf_recorder_test: failed with signal SIGIOT|SIGABRT(6)
quipper-0.0.1-r2413:  * ERROR: chromeos-base/quipper-0.0.1-r2413::chromiumos failed (test phase):
quipper-0.0.1-r2413:  *   (no error message)
quipper-0.0.1-r2413:  * 
quipper-0.0.1-r2413:  * Call stack:
quipper-0.0.1-r2413:  *     ebuild.sh, line  133:  Called src_test
quipper-0.0.1-r2413:  *   environment, line 3791:  Called platform_src_test
quipper-0.0.1-r2413:  *   environment, line 3375:  Called platform_pkg_test
quipper-0.0.1-r2413:  *   environment, line 3356:  Called platform_test 'run' '/build/grunt/var/cache/portage/chromeos-base/quipper/out/Default/perf_recorder_test' '1'
quipper-0.0.1-r2413:  *   environment, line 3408:  Called die
quipper-0.0.1-r2413:  * The specific snippet of code:
quipper-0.0.1-r2413:  *       "${cmd[@]}" || die
Cc: pprabhu@chromium.org djkurtz@chromium.org sque@chromium.org
Components: OS>Packages
Summary: grunt-paladin:2669 failed in UnitTest for quipper-0.0.1 in perf_recorder_test (was: grunt-paladin:2669 failed)
Cc: vapier@chromium.org
Labels: OS-Chrome Type-Bug-Regression
We've seen flaky quipper unittests before: crbug.com/812425.

Is quipper even maintained any more?
Summary: grunt-paladin:2669 failed in UnitTest for quipper, bsdiff and libbrillo (was: grunt-paladin:2669 failed in UnitTest for quipper-0.0.1 in perf_recorder_test)
Actually build 2669 (and 2668 & 2670) all had three unittest failures:
Packages failed:
	chromeos-base/quipper-0.0.1-r2413
	dev-util/bsdiff-4.3.1-r16
	chromeos-base/libbrillo-0.0.1-r1361


bsdiff-4.3.1-r16: cwd: /tmp/portage/dev-util/bsdiff-4.3.1-r16/work/bsdiff-4.3.1/platform2/bsdiff
bsdiff-4.3.1-r16: cmd: {/var/cache/portage/dev-util/bsdiff/out/Default/bsdiff_unittest} '/var/cache/portage/dev-util/bsdiff/out/Default/bsdiff_unittest'
bsdiff-4.3.1-r16: [==========] Running 66 tests from 14 test cases.
bsdiff-4.3.1-r16: [----------] Global test environment set-up.
bsdiff-4.3.1-r16: [----------] 2 tests from BrotliCompressorTest
bsdiff-4.3.1-r16: [ RUN      ] BrotliCompressorTest.BrotliCompressorSmoke
bsdiff-4.3.1-r16: ERROR 08-30 10:51:58 ../../../../../../../tmp/portage/dev-util/bsdiff-4.3.1-r16/work/bsdiff-4.3.1/platform2/bsdiff/brotli_decompressor.cc:49: Decompressor reached EOF while reading from input stream.
bsdiff-4.3.1-r16: ../../../../../../../tmp/portage/dev-util/bsdiff-4.3.1-r16/work/bsdiff-4.3.1/platform2/bsdiff/brotli_compressor_unittest.cc:33: Failure
bsdiff-4.3.1-r16: Value of: brotli_decompressor.Read(decompressed_data.data(), sizeof(kHelloWorld))
bsdiff-4.3.1-r16:   Actual: false
bsdiff-4.3.1-r16: Expected: true
bsdiff-4.3.1-r16: terminating with uncaught exception of type testing::internal::GoogleTestFailureException: ../../../../../../../tmp/portage/dev-util/bsdiff-4.3.1-r16/work/bsdiff-4.3.1/platform2/bsdiff/brotli_compressor_unittest.cc:33: Failure
bsdiff-4.3.1-r16: Value of: brotli_decompressor.Read(decompressed_data.data(), sizeof(kHelloWorld))
bsdiff-4.3.1-r16:   Actual: false
bsdiff-4.3.1-r16: Expected: true
bsdiff-4.3.1-r16: Error: /var/cache/portage/dev-util/bsdiff/out/Default/bsdiff_unittest: failed with signal SIGIOT|SIGABRT(6)
bsdiff-4.3.1-r16:  * ERROR: dev-util/bsdiff-4.3.1-r16::chromiumos failed (test phase):



libbrillo-0.0.1-r1361: [ RUN      ] DBusUtils.Protobuf
libbrillo-0.0.1-r1361: [ERROR:message.cc(907)] Failed to parse protocol buffer from array
libbrillo-0.0.1-r1361: ../../../../../../../tmp/portage/chromeos-base/libbrillo-0.0.1-r1361/work/libbrillo-0.0.1/libbrillo/brillo/dbus/data_serialization_unittest.cc:785: Failure
libbrillo-0.0.1-r1361: Value of: PopValueFromReader(&reader, &test_message_out)
libbrillo-0.0.1-r1361:   Actual: false
libbrillo-0.0.1-r1361: Expected: true
libbrillo-0.0.1-r1361: [  FAILED  ] DBusUtils.Protobuf (0 ms)
libbrillo-0.0.1-r1361: [----------] 33 tests from DBusUtils (78 ms total)
libbrillo-0.0.1-r1361: 
libbrillo-0.0.1-r1361: [----------] 4 tests from DBusMethodInvokerTest
libbrillo-0.0.1-r1361: [ RUN      ] DBusMethodInvokerTest.TestSuccess
libbrillo-0.0.1-r1361: [       OK ] DBusMethodInvokerTest.TestSuccess (1 ms)
libbrillo-0.0.1-r1361: [ RUN      ] DBusMethodInvokerTest.TestFailure
libbrillo-0.0.1-r1361: [ERROR:dbus_method_invoker.h(112)] CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.MyError, Message=My error message
libbrillo-0.0.1-r1361: [       OK ] DBusMethodInvokerTest.TestFailure (0 ms)
libbrillo-0.0.1-r1361: [ RUN      ] DBusMethodInvokerTest.TestProtobuf
libbrillo-0.0.1-r1361: [ERROR:message.cc(907)] Failed to parse protocol buffer from array
libbrillo-0.0.1-r1361: [ERROR:dbus_method_invoker_unittest.cc(134)] Unexpected method call: message_type: MESSAGE_METHOD_CALL
libbrillo-0.0.1-r1361: interface: org.test.Object.TestInterface
libbrillo-0.0.1-r1361: member: TestMethod3
libbrillo-0.0.1-r1361: signature: ay
libbrillo-0.0.1-r1361: 
libbrillo-0.0.1-r1361: array [
libbrillo-0.0.1-r1361:   byte 8
libbrillo-0.0.1-r1361:   byte 123
libbrillo-0.0.1-r1361:   byte 18
libbrillo-0.0.1-r1361:   byte 3
libbrillo-0.0.1-r1361:   byte 98
libbrillo-0.0.1-r1361:   byte 97
libbrillo-0.0.1-r1361:   byte 114
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361:   byte 0
libbrillo-0.0.1-r1361: ]
libbrillo-0.0.1-r1361: 
libbrillo-0.0.1-r1361: [ERROR:dbus_method_invoker.h(118)] CallMethodAndBlockWithTimeout(...): Domain=dbus, Code=org.freedesktop.DBus.Error.Failed, Message=Failed to call D-Bus method: org.test.Object.TestInterface.TestMethod3
libbrillo-0.0.1-r1361: ../../../../../../../tmp/portage/chromeos-base/libbrillo-0.0.1-r1361/work/libbrillo-0.0.1/libbrillo/brillo/dbus/dbus_method_invoker_unittest.cc:156: Failure
libbrillo-0.0.1-r1361: Expected: (nullptr) != (response.get()), actual: 8-byte object <00-00 00-00 00-00 00-00> vs NULL
libbrillo-0.0.1-r1361: [FATAL:dbus_method_invoker.h(232)] Check failed: message. Unable to extract parameters from a NULL message.
libbrillo-0.0.1-r1361: /usr/lib64/libbase-core-395517.so(_ZN4base5debug10StackTraceC1Ev+0x13) [0x7f8bed29cea3]
libbrillo-0.0.1-r1361: 
libbrillo-0.0.1-r1361: Error: /var/cache/portage/chromeos-base/libbrillo/out/Default/libbrillo-395517_unittests: failed with signal SIGIOT|SIGABRT(6)
libbrillo-0.0.1-r1361:  * ERROR: chromeos-base/libbrillo-0.0.1-r1361::chromiumos failed (test phase):
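
For reference, the byte array dumped in the DBusMethodInvokerTest.TestProtobuf log above can be decoded by hand using the protobuf wire format. This is a minimal sketch, not libbrillo code: `read_varint` and `decode_fields` are hypothetical helper names, and only the two wire types that occur in this payload (varint and length-delimited) are handled. It shows that the first 7 bytes are two well-formed fields and the remaining 7 bytes are zeros. A tag byte of 0 implies field number 0, which the wire format does not allow, so that is where a parser would be expected to give up.

```python
# Hand decoder for the payload in the log; helper names are illustrative.

def read_varint(buf, pos):
    """Read a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result, pos
        shift += 7


def decode_fields(buf):
    """Return (fields, bad_offset); bad_offset is None if all bytes parse."""
    fields = []
    pos = 0
    while pos < len(buf):
        start = pos
        tag, pos = read_varint(buf, pos)
        field_number, wire_type = tag >> 3, tag & 0x7
        if field_number == 0:
            # Field number 0 is not valid on the wire; stop here.
            return fields, start
        if wire_type == 0:  # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:  # length-delimited
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = bytes(buf[pos:pos + length])
            pos += length
        else:
            raise ValueError(f"wire type {wire_type} not handled in this sketch")
        fields.append((field_number, wire_type, value))
    return fields, None


# Bytes copied from the "array [ ... ]" dump in the log.
PAYLOAD = bytes([8, 123, 18, 3, 98, 97, 114, 0, 0, 0, 0, 0, 0, 0])

fields, bad_offset = decode_fields(PAYLOAD)
print(fields)
print(bad_offset)
```

This only pins down which part of the payload is malformed; it does not by itself explain why the trailing zero bytes were emitted.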

Cc: athilenius@chromium.org jclinton@chromium.org
Immediately before these failures was build 2668 which was aborted during BuildPackages due to an exception [0].  Prior to the aborted run, these unittests were all passing.  Is it possible that this exception left the builder in a bad state such that these unittests now fail?

Can we restart the builder or something to see if it recovers?

[0] https://cros-goldeneye.corp.google.com/chromeos/healthmonitoring/buildDetails?id=2892010
Owner: mikenichols@chromium.org
Mike volunteered to restart the builder.
A reboot is harmless, but we shouldn't be reimaging builders: there's no immediately obvious technical way for an abort to corrupt the chroot. We do not persist an aborted build's disk state into the next build in any way. Reimaging, on the other hand, has very negative impacts on the CQ and our bandwidth utilization.

The only way a failure could persist from one build to the next is if the kernel was affected, or if there was a chroot escape and a process is somehow still running (that hasn't happened in a long time).


The issue appears to be related to this being an AMD board, which has had mixed results when executing on the GOLO physical machines. The past successful builds were all executed on cros-beefy-29-c2, a GCE instance, which suddenly went offline, causing the GOLO builder to take over.

I'm pushing this back to GCE to resolve it, and we're going to see about removing GOLO completely from the Grunt mix to prevent a recurrence until buildbot is completely shut down.

-- Mike
This is issue 849137 again.
Blockedon: 776369
Blockedon: 849137
Status: Fixed (was: Available)
I think we've addressed this now by shifting the builder back to a CPU that's compatible with the ISA used by grunt/stonyridge. General follow-ups are tracked in issue 856686.
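
One way to catch this kind of host/board mismatch before running unit tests is to compare the builder's CPU feature flags against the set the board's build expects. The following is a rough sketch under stated assumptions, not anything the builders actually run: `parse_cpu_flags` and `missing_flags` are hypothetical helpers, and `REQUIRED_FLAGS` is a placeholder, since this thread does not say which instructions grunt/stonyridge binaries depend on. It assumes an x86 Linux host where /proc/cpuinfo has a `flags` line.

```python
# Sketch: report CPU feature flags that a host lacks.
# REQUIRED_FLAGS is a placeholder set, NOT the real grunt requirement.

REQUIRED_FLAGS = {"sse4a", "abm"}  # assumption, for illustration only


def parse_cpu_flags(cpuinfo_text):
    """Return the flags from the first 'flags' line of cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, required):
    """Return a sorted list of required flags absent from the host."""
    return sorted(set(required) - parse_cpu_flags(cpuinfo_text))


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            text = f.read()
    except OSError:
        text = ""  # not a Linux host; every required flag is reported missing
    print("missing flags:", missing_flags(text, REQUIRED_FLAGS))
```

An empty result means the host advertises every flag in the placeholder set; a non-empty one would be a reason to skip running that board's tests natively on the host.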
We are waiting for the next CQ run to validate, but generally I agree that this should be resolved now with the GCE instance back as primary. 

-- Mike
