
Issue 722809

Starred by 1 user

Issue metadata

Status: Archived
Owner:
Closed: May 2017
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




shill connection_diagnostics crash on NAT64 network

Project Member Reported by cernekee@chromium.org, May 16 2017

Issue description

When connected to a pure IPv6 (NAT64) network, connection_diagnostics crashes every few seconds with:

2017-05-16T19:08:15.188265+09:00 ERR shill[1489]: [ERROR:icmp.cc(75)] Not implemented reached in virtual bool shill::Icmp::TransmitEchoRequest(const shill::IPAddress &, uint16_t, uint16_t)Only IPv4 destination addresses are implemented.
2017-05-16T19:08:15.933118+09:00 INFO shill[1489]: [INFO:wifi.cc(323)] Scan on wlan0 from RequestScan
2017-05-16T19:08:16.189569+09:00 ERR shill[1489]: [ERROR:icmp.cc(75)] Not implemented reached in virtual bool shill::Icmp::TransmitEchoRequest(const shill::IPAddress &, uint16_t, uint16_t)Only IPv4 destination addresses are implemented.
2017-05-16T19:08:17.188009+09:00 CRIT shill[1489]: [FATAL:message_loop_task_runner.cc(29)] Check failed: !task.is_null(). FindNeighborTableEntry@../../../../../../../tmp/portage/chromeos-base/shill-9999/work/shill-9999/aosp/system/connectivity/shill/connection_diagnostics.cc:533#012
2017-05-16T19:08:17.190756+09:00 ERR shill[1497]: [FATAL:message_loop_task_runner.cc(29)] Check failed: !task.is_null(). FindNeighborTableEntry@../../../../../../../tmp/portage/chromeos-base/shill-9999/work/shill-9999/aosp/system/connectivity/shill/connection_diagnostics.cc:533
 
More problems found after fixing the crash:

1) Portal detection times out on our NAT64 network.

2) connection_diagnostics does not support IPv6.

The service gets stuck in Portal state due to (1), and it repeatedly triggers connection_diagnostics.  This does not appear to be fatal or user-visible, but it is still unwanted behavior.

2017-05-17T13:50:46.195554+09:00 INFO dhcpcd[15995]: wlan0: sending DISCOVER (xid 0x41c288ac), next in 64.6 seconds
2017-05-17T13:50:48.065359+09:00 ERR shill[15858]: [ERROR:http_request.cc(188)] Could not resolve hostname www.gstatic.com: The network connection was timed out
2017-05-17T13:50:48.067377+09:00 INFO shill[15858]: [INFO:portal_detector.cc(130)] Portal detection completed attempt 1 with phase==DNS, status==Timeout, failures in content==0
2017-05-17T13:50:52.484335+09:00 ERR shill[15858]: [ERROR:http_request.cc(188)] Could not resolve hostname www.gstatic.com: The network connection was timed out
2017-05-17T13:50:52.486189+09:00 INFO shill[15858]: [INFO:portal_detector.cc(130)] Portal detection completed attempt 2 with phase==DNS, status==Timeout, failures in content==0
2017-05-17T13:50:57.196204+09:00 ERR shill[15858]: [ERROR:http_request.cc(188)] Could not resolve hostname www.gstatic.com: The network connection was timed out
2017-05-17T13:50:57.198051+09:00 INFO shill[15858]: [INFO:portal_detector.cc(130)] Portal detection completed attempt 3 with phase==DNS, status==Timeout, failures in content==0
2017-05-17T13:50:57.205571+09:00 ERR shill[15858]: [ERROR:icmp.cc(75)] Not implemented reached in virtual bool shill::Icmp::TransmitEchoRequest(const shill::IPAddress &, uint16_t, uint16_t)Only IPv4 destination addresses are implemented.
2017-05-17T13:51:00.225637+09:00 ERR shill[15858]: message repeated 7 times: [ [ERROR:icmp.cc(75)] Not implemented reached in virtual bool shill::Icmp::TransmitEchoRequest(const shill::IPAddress &, uint16_t, uint16_t)Only IPv4 destination addresses are implemented.]
2017-05-17T13:51:01.403580+09:00 INFO shill[15858]: [INFO:wifi_service.cc(785)] Representative endpoint updated for service 1. [SSID=?????????], bssid: 30:b5:c2:33:da:bf, signal: -44, security: rsn, frequency: 5745
2017-05-17T13:51:02.211304+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(277)] Connection diagnostics events:
2017-05-17T13:51:02.211664+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(279)]   #0: Event: Portal detection          Phase: End (DNS)        Result: Timeout
2017-05-17T13:51:02.211943+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(279)]   #1: Event: Ping DNS servers          Phase: Start            Result: Success
2017-05-17T13:51:02.212466+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(279)]   #2: Event: Ping DNS servers          Phase: End              Result: Failure   Msg: No DNS servers responded to pings. Pinging first DNS server at 2001:4860:4860::6464
2017-05-17T13:51:02.212777+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(279)]   #3: Event: Find route                Phase: Start            Result: Success   Msg: Requesting route to 2001:4860:4860::6464
2017-05-17T13:51:02.213053+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(279)]   #4: Event: Find route                Phase: End              Result: Failure
2017-05-17T13:51:02.213289+09:00 INFO shill[15858]: [INFO:connection_diagnostics.cc(282)] Connection diagnostics completed. Connection issue: Routing problem detected.
Components: OS>Systems>Network
The lack of an IPv4 address on wlan0 results in log spam from arc-networkd:

2017-05-19T11:49:37.167746+09:00 ERR arc-networkd[2926]: [ERROR:multicast_socket.cc(44)] SIOCGIFADDR failed
2017-05-19T11:49:37.167857+09:00 ERR arc-networkd[2926]: [ERROR:multicast_socket.cc(44)] SIOCGIFADDR failed
2017-05-19T11:49:38.186174+09:00 ERR arc-networkd[2926]: [ERROR:multicast_socket.cc(44)] SIOCGIFADDR failed
2017-05-19T11:49:38.186853+09:00 ERR arc-networkd[2926]: [ERROR:multicast_socket.cc(44)] SIOCGIFADDR failed
Project Member

Comment 3 by bugdroid1@chromium.org, May 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/connectivity/shill/+/59a3305626d164a56b5c5753bde65732e2ce2c81

commit 59a3305626d164a56b5c5753bde65732e2ce2c81
Author: Kevin Cernekee <cernekee@chromium.org>
Date: Fri May 19 20:57:36 2017

shill: Fix incorrect callback usage

shill tries to schedule a callback to route_query_timeout_callback_
(which doesn't get initialized) instead of
neighbor_request_timeout_callback_ (which does get initialized).
This results in a crash.  Use the correct variable so this works
correctly.

BUG= chromium:722809 
TEST=manually connect to an affected network

Change-Id: I77d20533b5fed5df0bc35de1f23bede235952aea
Reviewed-on: https://chromium-review.googlesource.com/507070
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/59a3305626d164a56b5c5753bde65732e2ce2c81/connection_diagnostics.cc
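
For readers unfamiliar with this failure mode, here is a minimal sketch of what the commit message above describes: a task that was never initialized is handed to the message loop, which rejects it. This is not the shill code. TinyLoop, Diagnostics, and the use_wrong_member flag are hypothetical, and std::function stands in for the real callback type. Only the two member names mirror the commit message.

```cpp
#include <functional>
#include <queue>
#include <stdexcept>

// Hypothetical stand-in for a message loop. Like the CHECK in the log
// ("Check failed: !task.is_null()"), it refuses to queue an empty task.
// It throws instead of aborting so the behavior is easy to observe.
class TinyLoop {
 public:
  void PostTask(const std::function<void()>& task) {
    if (!task) throw std::logic_error("Check failed: !task.is_null()");
    tasks_.push(task);
  }
  void RunAll() {
    while (!tasks_.empty()) {
      std::function<void()> task = tasks_.front();
      tasks_.pop();
      task();
    }
  }

 private:
  std::queue<std::function<void()>> tasks_;
};

// Hypothetical model of the two members named in the commit message.
struct Diagnostics {
  std::function<void()> route_query_timeout_callback;       // never set here
  std::function<void()> neighbor_request_timeout_callback;  // set before use
  bool timed_out = false;

  void FindNeighborTableEntry(TinyLoop* loop, bool use_wrong_member) {
    neighbor_request_timeout_callback = [this] { timed_out = true; };
    // Posting the other, still-empty member is the bug being modeled.
    loop->PostTask(use_wrong_member ? route_query_timeout_callback
                                    : neighbor_request_timeout_callback);
  }
};
```

With use_wrong_member set, PostTask rejects the empty task immediately. With the initialized member, the timeout task is queued and runs normally.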

Project Member

Comment 4 by bugdroid1@chromium.org, May 19 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/connectivity/shill/+/3992ba5fa99819dc022126246d5f0bb96cb6b22d

commit 3992ba5fa99819dc022126246d5f0bb96cb6b22d
Author: Kevin Cernekee <cernekee@chromium.org>
Date: Fri May 19 20:57:36 2017

shill: Remove workaround for old c-ares resolver bug

Per https://github.com/c-ares/c-ares/pull/11 the c-ares library used
to require callers to specify IPv6 DNS server addresses in the format:

    2001:4860:4860::6464:53
    (the actual host is [2001:4860:4860::6464])

shill did this, but then the API changed.  As a result, c-ares
now parses the above string as a literal IPv6 host address (without a
port), causing portal detection to time out because it's using incorrect
DNS server IPs.  On a pure IPv6 network this causes portal detection to
keep rerunning continuously.

Fix this by reverting the workaround.

BUG= chromium:722809 
TEST=manually connect to a NAT64 network
TEST=run unit tests

Change-Id: I64b429255274ca45d4fe1332e4eb98b8a4f59c5e
Reviewed-on: https://chromium-review.googlesource.com/509268
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/3992ba5fa99819dc022126246d5f0bb96cb6b22d/dns_client.cc
[modify] https://crrev.com/3992ba5fa99819dc022126246d5f0bb96cb6b22d/dns_client.h
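
To illustrate why the old format is ambiguous, the sketch below shows that the string "2001:4860:4860::6464:53" is itself a syntactically valid IPv6 literal, and that it names a different host than the intended one. It uses POSIX inet_pton() rather than c-ares's own parser, so it only demonstrates the ambiguity, not c-ares's exact code path. ParseIPv6 and SameAddress are helper names made up for this example.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>

#include <cstring>
#include <string>

// Returns true if |text| is accepted as a bare IPv6 literal (no port).
bool ParseIPv6(const std::string& text, in6_addr* out) {
  return inet_pton(AF_INET6, text.c_str(), out) == 1;
}

// Byte-wise comparison of two parsed addresses.
bool SameAddress(const in6_addr& a, const in6_addr& b) {
  return std::memcmp(&a, &b, sizeof(in6_addr)) == 0;
}
```

Parsed this way, "2001:4860:4860::6464:53" expands to 2001:4860:4860:0:0:0:6464:53, which is not the DNS server at 2001:4860:4860::6464. The trailing ":53" is read as one more 16-bit group rather than as a port.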

Project Member

Comment 5 by bugdroid1@chromium.org, May 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/connectivity/shill/+/705744a02beb727a88270e50d9440f0f2554ca4c

commit 705744a02beb727a88270e50d9440f0f2554ca4c
Author: Kevin Cernekee <cernekee@chromium.org>
Date: Sun May 28 03:43:39 2017

shill: Plumb up Icmp::Start() arguments to support IPv6

ICMPv6 support will require two API changes:

 - Pass in |destination_address| before the socket is created, because
   the socket() parameters are different between IPv4 and IPv6.  This
   is done by moving the |destination| argument from
   TransmitEchoRequest() into Start().  This is safe because callers
   do not reuse a single Icmp object to ping different hosts.

 - Propagate |interface_index|, because pinging a link-local address
   (such as the default gateway) requires specifying the scope ID.

Implement the new APIs, and adjust all callers / test cases accordingly.
This is mostly boilerplate, so the next commit will add IPv6 support
itself.

BUG= chromium:722809 
TEST=unit tests

Change-Id: Ic28c792df08bad5e2b1ae04befd502542499d333
Reviewed-on: https://chromium-review.googlesource.com/516731
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp_session.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp_unittest.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp_session_unittest.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/connection_diagnostics_unittest.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/device.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/mock_icmp_session.h
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/mock_icmp.h
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/connection_diagnostics.cc
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp.h
[modify] https://crrev.com/705744a02beb727a88270e50d9440f0f2554ca4c/icmp_session.h
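
The two API changes above can be sketched as follows. This is an illustration under stated assumptions, not the actual Icmp::Start() implementation: SocketParams, PingSocketParams, and MakeIPv6Destination are hypothetical names, and the choice of SOCK_RAW is an assumption. The point is that the socket() arguments depend on the address family, and that a link-local destination needs the interface index as its scope ID.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

#include <cstring>
#include <string>

// Hypothetical holder for the three socket() arguments.
struct SocketParams {
  int domain;
  int type;
  int protocol;
};

// Picks socket() arguments from the destination's address family.
// Returns false for families other than IPv4 and IPv6.
bool PingSocketParams(int family, SocketParams* out) {
  switch (family) {
    case AF_INET:
      *out = {AF_INET, SOCK_RAW, IPPROTO_ICMP};
      return true;
    case AF_INET6:
      *out = {AF_INET6, SOCK_RAW, IPPROTO_ICMPV6};
      return true;
    default:
      return false;
  }
}

// Fills in an IPv6 destination. For link-local addresses (fe80::/10) the
// scope ID is set to |interface_index| so the kernel knows which link to use.
bool MakeIPv6Destination(const std::string& address,
                         unsigned interface_index,
                         sockaddr_in6* out) {
  std::memset(out, 0, sizeof(*out));
  out->sin6_family = AF_INET6;
  if (inet_pton(AF_INET6, address.c_str(), &out->sin6_addr) != 1)
    return false;
  if (IN6_IS_ADDR_LINKLOCAL(&out->sin6_addr))
    out->sin6_scope_id = interface_index;
  return true;
}
```

Because the family has to be known before socket() is called, the destination must be available when the socket is created, which matches the reasoning given for moving |destination| into Start().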

Project Member

Comment 6 by bugdroid1@chromium.org, May 28 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/overlays/chromiumos-overlay/+/75634cfb1c423ce6c60bc378f03cd1e0d3a36131

commit 75634cfb1c423ce6c60bc378f03cd1e0d3a36131
Author: Kevin Cernekee <cernekee@chromium.org>
Date: Sun May 28 03:43:38 2017

net-dns/c-ares: Delete redundant ebuild

For a short time, this ebuild carried a local security patch and it
lived in chromiumos-overlay.  But now we have the latest upstream version
in portage-stable, which includes the security fix.  Delete the old
version.

BUG= chromium:722809 
TEST=buildbots
TEST=verify that OS images were already using the new version (1.12.0)
     prior to nuking this ebuild

Change-Id: I78eccd6db340bb984c7a1a9d7c635a7dd058ae39
Reviewed-on: https://chromium-review.googlesource.com/509270
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Mattias Nissler <mnissler@chromium.org>
Reviewed-by: Mike Frysinger <vapier@chromium.org>

[delete] https://crrev.com/d2d757dd4f8e22fdac78b49b2f868e0d2788413e/net-dns/c-ares/c-ares-1.7.5-r2.ebuild
[delete] https://crrev.com/d2d757dd4f8e22fdac78b49b2f868e0d2788413e/net-dns/c-ares/metadata.xml
[delete] https://crrev.com/d2d757dd4f8e22fdac78b49b2f868e0d2788413e/net-dns/c-ares/c-ares-1.7.5.ebuild
[delete] https://crrev.com/d2d757dd4f8e22fdac78b49b2f868e0d2788413e/net-dns/c-ares/Manifest
[delete] https://crrev.com/d2d757dd4f8e22fdac78b49b2f868e0d2788413e/net-dns/c-ares/files/c-ares-1.7.5-mkquery-heap-overflow.patch

Project Member

Comment 7 by bugdroid1@chromium.org, May 29 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/aosp/platform/system/connectivity/shill/+/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5

commit ce4c6eae9dbf21c1c8277e34e467243e2652c3e5
Author: Kevin Cernekee <cernekee@chromium.org>
Date: Mon May 29 21:30:06 2017

shill: Add ICMPv6 support to connection diagnostics

Handle ICMPv6 ping requests and replies if the DNS server or gateway
is an IPv6 host.

BUG= chromium:722809 
TEST=manually test on broken IPv4-only and IPv6-only networks
TEST=unit tests
TEST=verify correct sendto() parameters using strace
TEST=verify correct ICMPv6 checksums using tcpdump/wireshark

Change-Id: Ib483dca195db17a3830774c0b5cfe9108a935fb1
Reviewed-on: https://chromium-review.googlesource.com/516732
Commit-Ready: Kevin Cernekee <cernekee@chromium.org>
Tested-by: Kevin Cernekee <cernekee@chromium.org>
Reviewed-by: Ben Chan <benchan@chromium.org>

[modify] https://crrev.com/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5/icmp_session.cc
[modify] https://crrev.com/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5/icmp.cc
[modify] https://crrev.com/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5/icmp_session_unittest.cc
[modify] https://crrev.com/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5/icmp_session.h
[modify] https://crrev.com/ce4c6eae9dbf21c1c8277e34e467243e2652c3e5/icmp.h
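
For anyone repeating the checksum verification mentioned in the TEST lines, here is a sketch of how an ICMPv6 checksum is defined: the standard Internet checksum (RFC 1071) computed over an IPv6 pseudo-header followed by the ICMPv6 message. The function names are hypothetical, and the commit message does not say whether shill computes this value itself or leaves it to the kernel, so treat this as a reference calculation to compare against a packet capture.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// RFC 1071 Internet checksum: one's-complement sum of 16-bit big-endian
// words, folded to 16 bits and inverted. Intended for packet-sized inputs.
uint16_t InternetChecksum(const std::vector<uint8_t>& data) {
  uint32_t sum = 0;
  for (size_t i = 0; i + 1 < data.size(); i += 2)
    sum += (static_cast<uint32_t>(data[i]) << 8) | data[i + 1];
  if (data.size() % 2)  // odd length: pad the last byte with a zero byte
    sum += static_cast<uint32_t>(data.back()) << 8;
  while (sum >> 16) sum = (sum & 0xffff) + (sum >> 16);
  return static_cast<uint16_t>(~sum);
}

// ICMPv6 checksum: pseudo-header (source, destination, 32-bit upper-layer
// length, three zero bytes, next header 58) followed by the message.
uint16_t Icmpv6Checksum(const uint8_t src[16], const uint8_t dst[16],
                        const std::vector<uint8_t>& message) {
  std::vector<uint8_t> buf(src, src + 16);
  buf.insert(buf.end(), dst, dst + 16);
  const uint32_t len = static_cast<uint32_t>(message.size());
  for (int shift = 24; shift >= 0; shift -= 8)
    buf.push_back(static_cast<uint8_t>(len >> shift));
  buf.insert(buf.end(), {0, 0, 0, 58});
  buf.insert(buf.end(), message.begin(), message.end());
  return InternetChecksum(buf);
}

// Builds an 8-byte Echo Request (type 128, code 0) with the checksum set.
std::vector<uint8_t> BuildEchoRequest(const uint8_t src[16],
                                      const uint8_t dst[16],
                                      uint16_t id, uint16_t seq) {
  std::vector<uint8_t> msg(8, 0);
  msg[0] = 128;
  msg[4] = static_cast<uint8_t>(id >> 8);
  msg[5] = static_cast<uint8_t>(id & 0xff);
  msg[6] = static_cast<uint8_t>(seq >> 8);
  msg[7] = static_cast<uint8_t>(seq & 0xff);
  const uint16_t csum = Icmpv6Checksum(src, dst, msg);
  msg[2] = static_cast<uint8_t>(csum >> 8);
  msg[3] = static_cast<uint8_t>(csum & 0xff);
  return msg;
}
```

A quick self-check: recomputing Icmpv6Checksum over a message whose checksum field is already filled in yields 0. The same property is what a capture tool relies on when it reports a checksum as correct.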

Status: Fixed (was: Untriaged)
The shill fixes have all landed.  The arc-networkd issues will be addressed separately as part of an upcoming refactor that provides better IPv6 support in ARC++.

Comment 9 by dchan@chromium.org, Aug 1 2017

Labels: VerifyIn-61

Comment 10 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)
