New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 845967 link

Starred by 2 users

Issue metadata

Status: Fixed
Owner:
Last visit > 30 days ago
Closed: Jul 10
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: ----
Pri: 1
Type: Bug



Sign in to add a comment

cheets_PlayStoreTest failure - "adb is not ready in 60 seconds."

Reported by jrbarnette@chromium.org, May 23 2018

Issue description

This CQ run failed:
    https://uberchromegw.corp.google.com/i/chromeos/builders/master-paladin/builds/18684

The failure was because of quawks-paladin:
    https://luci-milo.appspot.com/buildbot/chromeos/quawks-paladin/3040

That failure was due to this test failure:
    cheets_PlayStoreTest: retry_count: 2, FAIL: adb is not ready in 60 seconds.

The logs are here:
    http://cautotest-prod/tko/retrieve_logs.cgi?job=/results/202333679-chromeos-test/

The same symptom also occurred with this quawks build:
    https://luci-milo.appspot.com/buildbot/chromeos/quawks-paladin/3027

P1 - if this happens too much, we'll have to mark quawks experimental.


 
I should add (lest anyone wonder):  I checked the blamelist on both
failures.  There's no CL that could be responsible for the failure in
the more recent failure, no CL in common with the prior failure, and
no CL in the prior CL that seems likely to have caused the failure.

Thus:  Bug in ToT, not a bad CL.

Cc: levarum@chromium.org
Owner: phshah@chromium.org

Comment 3 by phshah@chromium.org, May 23 2018

I've run into this previously during development. The root cause was due to leaking mount points http://b/77494303

Comment 4 by phshah@chromium.org, May 23 2018

Owner: lhchavez@chromium.org
pshah@ - The leaking mount points bug b/77494303 is marked fixed.  Have
you confirmed that the underlying cause of the error message is the same?
Could you fill in the details of what you learned in debug?

Comment 6 by phshah@chromium.org, May 24 2018

Owner: nya@chromium.org

Comment 7 by nya@chromium.org, May 24 2018

Hmm, yes, something looks wrong on quawks.
https://stainless.corp.google.com/search?view=list&first_date=2018-04-27&last_date=2018-05-24&master_builder_name=master-paladin&test=%5Echeets_PlayStoreTest%24&status=FAIL&status=ERROR&status=ABORT&exclude_cts=true&exclude_not_run=false&exclude_non_release=false&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false

In the last 2 cases I see crashes of arc-networkd. Below is the stack trace of job 202436153.

=== arc-network.log
[FATAL:ip_helper.cc(88)] Check failed: len > 0 (-1 vs. 0)
/usr/lib64/libbase-core-395517.so(_ZN4base5debug10StackTraceC1Ev+0x13) [0x7b685afdb5b3]

=== arc_networkd.20180523.192053.1448.dmp
Crash reason:  SIGABRT
Crash address: 0x0
Process uptime: not available

Thread 0 (crashed)
 0  libc-2.23.so!raise [raise.c : 54 + 0x10]
    rax = 0x0000000000000000   rdx = 0x0000000000000006
    rcx = 0x00007b685a7bfdd2   rbx = 0x0000000000000000
    rsi = 0x00000000000005a8   rdi = 0x00000000000005a8
    rbp = 0x00007fff82ccb110   rsp = 0x00007fff82ccafe8
     r8 = 0x00007fff82ccb043    r9 = 0x00007fff82ccb039
    r10 = 0x0000000000000008   r11 = 0x0000000000000202
    r12 = 0x00007fff82ccb5a8   r13 = 0x00007fff82ccb148
    r14 = 0x00007fff82ccb5b8   r15 = 0x00007fff82ccb5b0
    rip = 0x00007b685a7bfdd2
    Found by: given as instruction pointer in context
 1  libc-2.23.so!abort [abort.c : 89 + 0xa]
    rbx = 0x0000000000000000   rbp = 0x00007fff82ccb110
    rsp = 0x00007fff82ccaff0   r12 = 0x00007fff82ccb5a8
    r13 = 0x00007fff82ccb148   r14 = 0x00007fff82ccb5b8
    r15 = 0x00007fff82ccb5b0   rip = 0x00007b685a7c1bf6
    Found by: call frame info
 2  libbase-core-395517.so!base::debug::BreakDebugger() [debugger_posix.cc : 219 + 0x5]
    rbx = 0x0000000000000000   rbp = 0x00007fff82ccb120
    rsp = 0x00007fff82ccb120   r12 = 0x00007fff82ccb5a8
    r13 = 0x00007fff82ccb148   r14 = 0x00007fff82ccb5b8
    r15 = 0x00007fff82ccb5b0   rip = 0x00007b685afd9e41
    Found by: call frame info
 3  libbase-core-395517.so!logging::LogMessage::~LogMessage() [logging.cc : 755 + 0x5]
    rbx = 0x0000000000000000   rbp = 0x00007fff82ccb590
    rsp = 0x00007fff82ccb130   r12 = 0x00007fff82ccb5a8
    r13 = 0x00007fff82ccb148   r14 = 0x00007fff82ccb5b8
    r15 = 0x00007fff82ccb5b0   rip = 0x00007b685affd754
    Found by: call frame info
 4  arc-networkd!arc_networkd::IpHelper::OnFileCanReadWithoutBlocking(int) [ip_helper.cc : 88 + 0x26]
    rbx = 0x00007fff82ccb5a8   rbp = 0x00007fff82ccbaf0
    rsp = 0x00007fff82ccb5a0   r12 = 0x00007fff82ccc420
    r13 = 0x00005b5c1538a270   r14 = 0x00007fff82ccc1e0
    r15 = 0x0000000000000002   rip = 0x00005b5c151b33c6
    Found by: call frame info
 5  libbase-core-395517.so!base::MessagePumpLibevent::OnLibeventNotification(int, short, void*) [message_pump_libevent.cc : 102 + 0x8]
    rbx = 0x00007b685b0c3240   rbp = 0x00007fff82ccbc90
    rsp = 0x00007fff82ccbb00   r12 = 0x00007fff82ccc420
    r13 = 0x00005b5c1538a270   r14 = 0x0000000000000004
    r15 = 0x0000000000000002   rip = 0x00007b685b00c55f
    Found by: call frame info
 6  libevent_core-2.1.so.6.0.2!event_process_active_single_queue [event.c : 1580 + 0x9]
    rbx = 0x0000000000000082   rbp = 0x00007fff82ccbd20
    rsp = 0x00007fff82ccbca0   r12 = 0x00005b5c1538a3c8
    r13 = 0x00005b5c1538a270   r14 = 0x00005b5c1538a270
    r15 = 0x00005b5c153952b0   rip = 0x00007b685a155dc9
    Found by: call frame info
 7  libevent_core-2.1.so.6.0.2!event_base_loop [event.c : 0 + 0x5]
    rbx = 0x0000000000000000   rbp = 0x00007fff82ccbdc0
    rsp = 0x00007fff82ccbd30   r12 = 0x00005b5c1538a380
    r13 = 0x0000000000000000   r14 = 0x00005b5c1538a270
    r15 = 0x0000000000000000   rip = 0x00007b685a15359e
    Found by: call frame info
 8  libbase-core-395517.so!base::MessagePumpLibevent::Run(base::MessagePump::Delegate*) [message_pump_libevent.cc : 251 + 0xa]
    rbx = 0x00005b5c1538a1c0   rbp = 0x00007fff82ccbe30
    rsp = 0x00007fff82ccbdd0   r12 = 0x00005b5c1538a208
    r13 = 0x0000000000000000   r14 = 0x00005b5c15389f40
    r15 = 0x0000000000000000   rip = 0x00007b685b00c88f
    Found by: call frame info
 9  libbase-core-395517.so!base::MessageLoop::RunHandler() [message_loop.cc : 442 + 0x9]
    rbx = 0x00005b5c15389f40   rbp = 0x00007fff82ccbf90
    rsp = 0x00007fff82ccbe40   r12 = 0x00007fff82ccc068
    r13 = 0x00007fff82ccc610   r14 = 0x00007fff82ccbfa8
    r15 = 0x00007fff82ccc058   rip = 0x00007b685b0064a1
    Found by: call frame info
10  libbase-core-395517.so!base::RunLoop::Run() [run_loop.cc : 35 + 0x5]
    rbx = 0x00007fff82ccc000   rbp = 0x00007fff82ccbff0
    rsp = 0x00007fff82ccbfa0   r12 = 0x00007fff82ccc068
    r13 = 0x00007fff82ccc610   r14 = 0x00007fff82ccbfa8
    r15 = 0x00007fff82ccc058   rip = 0x00007b685b0383e0
    Found by: call frame info
11  libbrillo-core-395517.so!brillo::BaseMessageLoop::Run() [base_message_loop.cc : 212 + 0x8]
    rbx = 0x00007fff82ccc258   rbp = 0x00007fff82ccc040
    rsp = 0x00007fff82ccc000   r12 = 0x00007fff82ccc068
    r13 = 0x00007fff82ccc610   r14 = 0x00007fff82ccc000
    r15 = 0x00007fff82ccc058   rip = 0x00007b685b0ef4c3
    Found by: call frame info
12  libbrillo-core-395517.so!brillo::Daemon::Run() [daemon.cc : 31 + 0x8]
    rbx = 0x00007fff82ccc258   rbp = 0x00007fff82ccc0b0
    rsp = 0x00007fff82ccc050   r12 = 0x00007fff82ccc068
    r13 = 0x00007fff82ccc610   r14 = 0x00007fff82ccc1e0
    r15 = 0x00007fff82ccc058   rip = 0x00007b685b0e53f4
    Found by: call frame info
13  arc-networkd!main [main.cc : 49 + 0x5]
    rbx = 0x00007fff82ccc1c0   rbp = 0x00007fff82ccc520
    rsp = 0x00007fff82ccc0c0   r12 = 0x00005b5c15384e50
    r13 = 0x00007fff82ccc610   r14 = 0x00007fff82ccc618
    r15 = 0x0000000000000004   rip = 0x00005b5c151b4ba3
    Found by: call frame info
14  libc-2.23.so!__libc_start_main [libc-start.c : 289 + 0x1a]
    rbx = 0x0000000000000000   rbp = 0x00007fff82ccc5f0
    rsp = 0x00007fff82ccc530   r12 = 0x00005b5c151da040
    r13 = 0x00007fff82ccc610   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00007b685a7ac736
    Found by: call frame info
15  arc-networkd!_start + 0x29
    rbx = 0x0000000000000000   rbp = 0x0000000000000000
    rsp = 0x00007fff82ccc600   r12 = 0x00005b5c151a67e0
    r13 = 0x00007fff82ccc610   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00005b5c151a6809
    Found by: call frame info
16  0x7fff82ccc608
    rbx = 0x0000000000000000   rbp = 0x0000000000000000
    rsp = 0x00007fff82ccc608   r12 = 0x00005b5c151a67e0
    r13 = 0x00007fff82ccc610   r14 = 0x0000000000000000
    r15 = 0x0000000000000000   rip = 0x00007fff82ccc608
    Found by: call frame info
17  arc-networkd!_init + 0xd98
    rsp = 0x00007fff82ccc738   rip = 0x00005b5c151a67e0
    Found by: stack scanning

Comment 8 by nya@chromium.org, May 24 2018

Cc: abhishekbh@chromium.org matthewmwang@chromium.org
Owner: cernekee@chromium.org
So this looks a problem in arc-networkd, possibly related to Issue 825365.
Reassigning to the owner of the other bug.

cernekee@, could you take a look please?

Uploaded CL:1086116 to resolve this issue. Would appreciate it if someone could take a look. Thanks!
Project Member

Comment 10 by bugdroid1@chromium.org, Jun 15 2018

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/e080482760b221df7ea9372a25d2f74c9f7a9c01

commit e080482760b221df7ea9372a25d2f74c9f7a9c01
Author: Matthew Wang <matthewmwang@chromium.org>
Date: Fri Jun 15 17:57:49 2018

arc: Exit ip_helper normally when read fails with errno ECONNRESET

In CL:1065120, ip_helper was modified to send a NO-OP message upon setup to
it's parent process to confirm that the socket connection is still alive.
However, when the parent process exits later on (and closes the socket), the
child, which normally would get a 0 upon calling read() and exit normally,
now gets -1 with errno ECONNRESET. This happens when one side of a socket
connection writes, the other side exits without reading, and the first side
attempts to read.

This change has ip_helper exit normally when the read fails with ECONNRESET.

BUG= chromium:845967 
TEST=Put a log statement in the if clause. Log statement was reached and
     arc-networkd stopped crashing.

Change-Id: I44b201761e29645abf23a10a08a630b079fc341d
Reviewed-on: https://chromium-review.googlesource.com/1086116
Commit-Ready: Matthew Wang <matthewmwang@chromium.org>
Tested-by: Matthew Wang <matthewmwang@chromium.org>
Reviewed-by: Matthew Wang <matthewmwang@chromium.org>
Reviewed-by: Abhishek Bhardwaj <abhishekbh@chromium.org>

[modify] https://crrev.com/e080482760b221df7ea9372a25d2f74c9f7a9c01/arc/network/ip_helper.cc

Status: Fixed (was: Assigned)
Fixed, according to the comments above.
Project Member

Comment 12 by bugdroid1@chromium.org, Jul 18

Labels: merge-merged-release-R68-10718.B
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform2/+/aa7187a1676a125f2da67daa2ad3a098a8f0c668

commit aa7187a1676a125f2da67daa2ad3a098a8f0c668
Author: Matthew Wang <matthewmwang@chromium.org>
Date: Wed Jul 18 00:27:56 2018

arc: Exit ip_helper normally when read fails with errno ECONNRESET

In CL:1065120, ip_helper was modified to send a NO-OP message upon setup to
it's parent process to confirm that the socket connection is still alive.
However, when the parent process exits later on (and closes the socket), the
child, which normally would get a 0 upon calling read() and exit normally,
now gets -1 with errno ECONNRESET. This happens when one side of a socket
connection writes, the other side exits without reading, and the first side
attempts to read.

This change has ip_helper exit normally when the read fails with ECONNRESET.

BUG= chromium:845967 
TEST=Put a log statement in the if clause. Log statement was reached and
     arc-networkd stopped crashing.

Change-Id: I44b201761e29645abf23a10a08a630b079fc341d
Reviewed-on: https://chromium-review.googlesource.com/1086116
Commit-Ready: Matthew Wang <matthewmwang@chromium.org>
Tested-by: Matthew Wang <matthewmwang@chromium.org>
Reviewed-by: Matthew Wang <matthewmwang@chromium.org>
Reviewed-by: Abhishek Bhardwaj <abhishekbh@chromium.org>
(cherry picked from commit e080482760b221df7ea9372a25d2f74c9f7a9c01)
Reviewed-on: https://chromium-review.googlesource.com/1138933
Reviewed-by: Hugo Benichi <hugobenichi@google.com>
Commit-Queue: Matthew Wang <matthewmwang@chromium.org>

[modify] https://crrev.com/aa7187a1676a125f2da67daa2ad3a098a8f0c668/arc/network/ip_helper.cc

Sign in to add a comment