
Issue 910079


Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 3
Type: Bug




Flaky AsyncGrpcClientServerTest.ExcessivelyBigRpcResponse test

Project Member Reported by lamzin@google.com, Nov 29

Issue description

I found a flaky test in the grpc_async_adapter library. It has failed twice on my machine.

I do not know how to reproduce it reliably.

Command to run tests:
cros_workon_make --board sarien diagnostics --test

Here is the test output:

/build/sarien/tmp/portage/chromeos-base/diagnostics-9999/work/diagnostics-9999/common-mk/platform2_test.py --action=run --sysroot=/build/sarien -- /build/sarien/var/cache/portage/chromeos-base/diagnostics/out/Default/libgrpc_async_adapter_test --vmodule=*diag*=2
ERROR: ld.so: object 'libsandbox.so' from LD_PRELOAD cannot be preloaded (cannot open shared object file): ignored.
chroot: /build/sarien
cwd: /mnt/host/source/src/platform2/diagnostics
cmd: {/var/cache/portage/chromeos-base/diagnostics/out/Default/libgrpc_async_adapter_test} '/var/cache/portage/chromeos-base/diagnostics/out/Default/libgrpc_async_adapter_test' '--vmodule=*diag*=2'
[==========] Running 19 tests from 3 test cases.
[----------] Global test environment set-up.
[----------] 10 tests from AsyncGrpcClientServerTest
[ RUN      ] AsyncGrpcClientServerTest.NoRpcs
[       OK ] AsyncGrpcClientServerTest.NoRpcs (4 ms)
[ RUN      ] AsyncGrpcClientServerTest.OneRpcWithResponse
[       OK ] AsyncGrpcClientServerTest.OneRpcWithResponse (2 ms)
[ RUN      ] AsyncGrpcClientServerTest.MultipleRpcTypes
[       OK ] AsyncGrpcClientServerTest.MultipleRpcTypes (2 ms)
[ RUN      ] AsyncGrpcClientServerTest.OneRpcExplicitCancellation
[       OK ] AsyncGrpcClientServerTest.OneRpcExplicitCancellation (2 ms)
[ RUN      ] AsyncGrpcClientServerTest.ShutdownWhileRpcIsPending
[       OK ] AsyncGrpcClientServerTest.ShutdownWhileRpcIsPending (1 ms)
[ RUN      ] AsyncGrpcClientServerTest.SendResponseAfterInitiatingShutdown
[       OK ] AsyncGrpcClientServerTest.SendResponseAfterInitiatingShutdown (2 ms)
[ RUN      ] AsyncGrpcClientServerTest.ManyRpcs
[       OK ] AsyncGrpcClientServerTest.ManyRpcs (2 ms)
[ RUN      ] AsyncGrpcClientServerTest.HeavyRpcData
[       OK ] AsyncGrpcClientServerTest.HeavyRpcData (37 ms)
[ RUN      ] AsyncGrpcClientServerTest.ExcessivelyBigRpcRequest
[       OK ] AsyncGrpcClientServerTest.ExcessivelyBigRpcRequest (13 ms)
[ RUN      ] AsyncGrpcClientServerTest.ExcessivelyBigRpcResponse
E1129 11:17:17.108993579      36 chttp2_transport.c:705]     server stream 1 still included in list 0
Error: /var/cache/portage/chromeos-base/diagnostics/out/Default/libgrpc_async_adapter_test: failed with signal SIGIOT|SIGABRT(6)
 * ERROR: chromeos-base/diagnostics-9999::chromiumos failed (test phase):
 *   (no error message)
 * 
 * Call stack:
 *     ebuild.sh, line  133:  Called src_test
 *   environment, line 4301:  Called platform_src_test
 *   environment, line 3878:  Called platform_pkg_test
 *   environment, line 3859:  Called platform_test 'run' '/build/sarien/var/cache/portage/chromeos-base/diagnostics/out/Default/libgrpc_async_adapter_test'
 *   environment, line 3911:  Called die
 * The specific snippet of code:
 *       "${cmd[@]}" || die
 * 
 * If you need support, post the output of `emerge --info '=chromeos-base/diagnostics-9999::chromiumos'`,
 * the complete build log and the output of `emerge -pqv '=chromeos-base/diagnostics-9999::chromiumos'`.
 * The complete build log is located at '/build/sarien/tmp/portage/logs/chromeos-base:diagnostics-9999:20181129-111703.log'.
 * For convenience, a symlink to the build log is located at '/build/sarien/tmp/portage/chromeos-base/diagnostics-9999/temp/build.log'.
 * The ebuild environment file is located at '/build/sarien/tmp/portage/chromeos-base/diagnostics-9999/temp/environment'.
 * Working directory: '/build/sarien/tmp/portage/chromeos-base/diagnostics-9999/work/diagnostics-9999/diagnostics'
 * S: '/build/sarien/tmp/portage/chromeos-base/diagnostics-9999/work/diagnostics-9999/diagnostics'

 
Components: OS>Systems Enterprise
Labels: Enterprise-Triaged
Thanks for filing this. I can reproduce it as well (the flakiness rate is about 5% in my case).
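
To put the roughly 5% rate in perspective, here is a rough sketch of how many repeated runs it takes before a failure is likely to show up. It assumes each run fails independently with the same probability, which may not hold for a teardown race; the function names are made up for illustration and are not part of the diagnostics code.

```cpp
#include <cassert>
#include <cmath>

// Smallest number of runs n for which the chance of seeing at least one
// failure, 1 - (1 - p)^n, reaches `confidence`.
// Assumption: runs are independent and each fails with probability p.
int RunsToObserveFailure(double p, double confidence) {
  assert(p > 0.0 && p < 1.0);
  assert(confidence > 0.0 && confidence < 1.0);
  return static_cast<int>(
      std::ceil(std::log(1.0 - confidence) / std::log(1.0 - p)));
}

// Probability that n consecutive runs all pass, under the same assumption.
double ProbabilityAllPass(double p, int n) {
  return std::pow(1.0 - p, n);
}
```

Under these assumptions, RunsToObserveFailure(0.05, 0.95) is 59 and RunsToObserveFailure(0.05, 0.99) is 90. So when running the test binary directly, a repeat count in that range (for example with gtest's --gtest_repeat and --gtest_filter flags) should usually hit the crash at least once.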

The crash is an abort inside gRPC internals (destroy_stream_locked in chttp2_transport.c), raised on the completion queue monitoring thread, presumably during client/server teardown.
Either it's a bug in gRPC (Chrome OS uses a fairly old version, 1.3.0, while the latest is 1.17.0) or a bug in how we tear down gRPC.
For reference, the complete stack trace:

#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x00007ffff75b4d61 in __GI_abort () at abort.c:79
#2  0x00007ffff6c61656 in destroy_stream_locked (exec_ctx=0x7ffff62418f0, sp=0x7ffff0010b88, error=0x0) at src/core/ext/transport/chttp2/transport/chttp2_transport.c:707
#3  0x00007ffff7eedb6b in grpc_combiner_continue_exec_ctx (exec_ctx=0x7ffff62418f0) at src/core/lib/iomgr/combiner.c:325
#4  0x00007ffff7efa428 in grpc_exec_ctx_flush (exec_ctx=0x7ffff62418f0) at src/core/lib/iomgr/exec_ctx.c:83
#5  0x00007ffff7ef2742 in pollset_work (exec_ctx=0x7ffff62418f0, pollset=0x555555aa0080, worker_hdl=0x0, now=..., deadline=...) at src/core/lib/iomgr/ev_epoll_linux.c:1573
#6  0x00007ffff7ef9d9f in grpc_pollset_work (exec_ctx=0x7ffff62418f0, pollset=0x555555aa0080, worker=0x0, now=..., deadline=...) at src/core/lib/iomgr/ev_posix.c:207
#7  0x00007ffff7f1f7fa in grpc_completion_queue_next (cc=0x555555a9ff90, deadline=..., reserved=0x0) at src/core/lib/surface/completion_queue.c:428
#8  0x00007ffff7ed0551 in grpc::CompletionQueue::AsyncNextInternal (this=0x5555555a48d0, tag=0x7ffff6241a30, ok=0x7ffff6241a2f, deadline=...) at src/cpp/common/completion_queue_cc.cc:71
#9  0x0000555555577a9c in grpc::CompletionQueue::Next (this=0x5555555a48d0, tag=0x7ffff000b1a0, ok=0x7ffff000b1a001) at ../../../../../../../usr/include/grpc++/impl/codegen/completion_queue.h:152
#10 diagnostics::internal::MonitoringThreadDelegate::Run (this=0x5555555a28e0) at ../../../../../../../../../mnt/host/source/src/platform2/diagnostics/grpc_async_adapter/grpc_completion_queue_dispatcher.cc:49
#11 0x00007ffff7d6ad4d in base::DelegateSimpleThread::Run (this=0x7ffff0031860) at base/threading/simple_thread.cc:92
#12 0x00007ffff7d6aa68 in base::SimpleThread::ThreadMain (this=0x7ffff0031860) at base/threading/simple_thread.cc:68
#13 0x00007ffff7d6385f in base::(anonymous namespace)::ThreadFunc (params=0x555555a692f0) at base/threading/platform_thread_posix.cc:71
#14 0x00007ffff73664fe in start_thread (arg=0x7ffff6242700) at pthread_create.c:463
#15 0x00007ffff767bbef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Project Member Comment 5 by bugdroid1@chromium.org, Nov 30

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/fdd4e481c798c0e49c84b901533f835b6be24dba

commit fdd4e481c798c0e49c84b901533f835b6be24dba
Author: Oleh Lamzin <lamzin@google.com>
Date: Fri Nov 30 03:31:21 2018

diagnostics: disable flaky test in grpc adapter

Disable AsyncGrpcClientServerTest.ExcessivelyBigRpcResponse flaky test
in grpc_async_adapter.

BUG=chromium:910079
TEST=existing unit tests

Change-Id: Ic128c1253e174a5df571608c0fcbd4c0473e1a67
Reviewed-on: https://chromium-review.googlesource.com/1355220
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Oleh Lamzin <lamzin@google.com>
Reviewed-by: Oleh Lamzin <lamzin@google.com>
Reviewed-by: Maksim Ivanov <emaxx@chromium.org>

[modify] https://crrev.com/fdd4e481c798c0e49c84b901533f835b6be24dba/diagnostics/grpc_async_adapter/async_grpc_client_server_test.cc

Labels: -Pri-2 Pri-3
