New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.
Starred by 2 users

Issue metadata

Status: Started
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocked on:
issue 758781

Blocking:
issue 671760



Sign in to add a comment

eth0 smsc95xx issues in DUTs cause daily test failures

Project Member Reported by semenzato@chromium.org, Dec 29 2016 Back to list

Issue description

We see some reboots of devices in the test lab where ethernet connectivity is not re-established for 5 minutes, which makes the autotest server believe that the device is not coming back up from reboot.

If there is a failure to connect eth0 after a reboot for whatever reason, maybe we should be a bit more aggressive and retry after a minute or two the first time, instead of 5 minutes.  This would prevent an autotest timeout and failure.  It also seems reasonable for general use.  (We could do this on the test image only, but it's always better to keep the images as similar as possible.)

https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_jaq-release/builds/712

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/93314222-chromeos-test/chromeos4-row4-rack7-host2/debug/



 
messages
400 KB View Download
net.log
459 KB View Download
autoserv.DEBUG.txt
5.5 KB View Download
Blocking: 671760
Here's another one, I think.

https://uberchromegw.corp.google.com/i/chromeos/builders/daisy_spring-release/builds/2057

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/94151597-chromeos-test/chromeos6-row2-rack16-host2/

from autotest.DEBUG:

01/03 05:56:36.316 DEBUG|          ssh_host:0177| Running (ssh) 'if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi'
01/03 05:56:36.512 DEBUG|        base_utils:0280| [stdout] 548eff36-58f6-4c63-998a-3f234ca6bc13
01/03 05:56:36.514 DEBUG|          ssh_host:0177| Running (ssh) '( sleep 1; reboot & sleep 10; reboot -f ) </dev/null >/dev/null 2>&1 & echo -n $!'
01/03 05:56:36.709 DEBUG|        base_utils:0299| [stdout] 9766
01/03 05:56:36.711 DEBUG|      abstract_ssh:0643| Host chromeos6-row2-rack16-host2 pre-shutdown boot_id is 548eff36-58f6-4c63-998a-3f234ca6bc13
01/03 05:56:36.711 DEBUG|          ssh_host:0177| Running (ssh) 'if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi'
01/03 05:56:36.919 DEBUG|        base_utils:0280| [stdout] 548eff36-58f6-4c63-998a-3f234ca6bc13
01/03 05:56:37.976 DEBUG|          ssh_host:0177| Running (ssh) 'if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi'
01/03 05:57:29.045 DEBUG|          monitors:0191| ts_mon: sending 14 metrics to PubSub
01/03 05:58:37.501 WARNI|        base_utils:0912| run process timeout (119) fired on: /usr/bin/ssh -a -x    -o ControlPath=/tmp/_autotmp_a92FHSssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row2-rack16-host2 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::wait_for_restart|wait_down|get_boot_id] -> ssh_run(if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi)\";fi; if [ -f '/proc/sys/kernel/random/boot_id' ]; then cat '/proc/sys/kernel/random/boot_id'; else echo 'no boot_id available'; fi"
01/03 05:58:38.505 DEBUG|      abstract_ssh:0664| Host chromeos6-row2-rack16-host2 is now unreachable over ssh, is down
01/03 05:58:38.505 DEBUG|          ssh_host:0177| Running (ssh) 'true'
01/03 06:02:39.432 WARNI|        base_utils:0912| run process timeout (240) fired on: /usr/bin/ssh -a -x    -o ControlPath=/tmp/_autotmp_a92FHSssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row2-rack16-host2 "export LIBC_FATAL_STDERR_=1; if type \"logger\" > /dev/null 2>&1; then logger -tag \"autotest\" \"server[stack::wait_up|is_up|ssh_ping] -> ssh_run(true)\";fi; true"
01/03 06:02:41.442 DEBUG|      abstract_ssh:0589| Host chromeos6-row2-rack16-host2 is still down after waiting 243 seconds
01/03 06:02:41.442 INFO |        server_job:0153| 		ABORT	----	reboot.verify	timestamp=1483452161	localtime=Jan 03 06:02:41	Host did not return from reboot
01/03 06:02:41.443 INFO |        server_job:0153| 	END FAIL	----	reboot	timestamp=1483452161	localtime=Jan 03 06:02:41	Host did not return from reboot
  Traceback (most recent call last):
    File "/usr/local/autotest/server/server_job.py", line 917, in run_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 150, in reboot
      **dargs)
    File "/usr/local/autotest/server/hosts/remote.py", line 219, in wait_for_restart
      self.log_op(self.OP_REBOOT, op_func)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 548, in log_op
      op_func()
    File "/usr/local/autotest/server/hosts/remote.py", line 218, in op_func
      super(RemoteHost, self).wait_for_restart(timeout=timeout, **dargs)
    File "/usr/local/autotest/client/common_lib/hosts/base_classes.py", line 309, in wait_for_restart
      raise error.AutoservRebootError("Host did not return from reboot")
  AutoservRebootError: Host did not return from reboot
01/03 06:02:41.445 DEBUG|          monitors:0191| ts_mon: sending 15 metrics to PubSub
01/03 06:02:41.453 DEBUG|          ssh_host:0177| Running (ssh) 'ls "/tmp/sysinfo/autoserv-WwY3F3" > /dev/null'
01/03 06:02:41.627 DEBUG|        base_utils:0280| [stderr] mux_client_request_session: read from master failed: Broken pipe
01/03 06:02:43.001 DEBUG|        base_utils:0280| [stderr] ls: cannot access /tmp/sysinfo/autoserv-WwY3F3: No such file or directory
01/03 06:02:43.004 DEBUG|          ssh_host:0177| Running (ssh) 'true'
01/03 06:02:43.012 INFO |      abstract_ssh:0774| Master ssh connection to chromeos6-row2-rack16-host2 is down.
01/03 06:02:43.012 DEBUG|      abstract_ssh:0735| Nuking master_ssh_job.
01/03 06:02:43.015 DEBUG|      abstract_ssh:0741| Cleaning master_ssh_tempdir.
01/03 06:02:43.016 INFO |      abstract_ssh:0788| Starting master ssh connection '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_zENtWbssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row2-rack16-host2'
01/03 06:02:43.016 DEBUG|        base_utils:0185| Running '/usr/bin/ssh -a -x   -N -o ControlMaster=yes -o ControlPath=/tmp/_autotmp_zENtWbssh-master/socket -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o BatchMode=yes -o ConnectTimeout=30 -o ServerAliveInterval=900 -o ServerAliveCountMax=3 -o ConnectionAttempts=4 -o Protocol=2 -l root -p 22 chromeos6-row2-rack16-host2'
01/03 06:02:44.062 DEBUG|      abstract_ssh:0577| Host chromeos6-row2-rack16-host2 is now up


From /var/log/messages:

2017-01-03T05:56:38.401779-08:00 INFO session_manager[8749]: [INFO:policy_service.cc(202)] Persisted policy to disk.
2017-01-03T05:56:38.667991-08:00 WARNING chapsd[8551]: SRK does not exist - this is normal when the TPM is not yet owned.
2017-01-03T05:56:45.002210-08:00 INFO laptop-mode[495]: disabled, not active
2017-01-03T05:56:45.030089-08:00 INFO kernel: [    0.000000] Booting Linux on physical CPU 0x0
2017-01-03T05:56:45.030114-08:00 INFO kernel: [    0.000000] Initializing cgroup subsys cpu
2017-01-03T05:56:45.030121-08:00 NOTICE kernel: [    0.000000] Linux version 3.8.11 (chrome-bot@cros-beefy27-c2) (gcc version 4.9.x 20150123 (prerelease) (4.9.2_cos_gg_4.9.2-r145-41f3e25635616c067b9ee272304e6f86ac8ee9db_4.9.2-r145) ) #1 SMP Tue Jan 3 03:00:05 PST 2017

...

2017-01-03T05:56:46.679570-08:00 ERR kernel: [    4.173773] usb 3-2.3.4: device not accepting address 8, error -32
2017-01-03T05:56:46.679597-08:00 ERR kernel: [    4.173963] hub 3-2.3:1.0: unable to enumerate USB device on port 4
2017-01-03T05:56:46.749527-08:00 INFO kernel: [    4.243934] usb 3-2.3.1: reset high-speed USB device number 4 using s5p-ehci
2017-01-03T05:56:46.839616-08:00 INFO kernel: [    4.335454] r8152 3-2.3.1:1.0 (unregistered net_device): Unknown version 0x6010
2017-01-03T05:56:46.849613-08:00 INFO kernel: [    4.345448] r8152 3-2.3.1:1.0 eth0: v1.06.0 (2014/03/03)
2017-01-03T05:56:46.849665-08:00 INFO kernel: [    4.345628] usbcore: registered new interface driver r8152
2017-01-03T05:56:46.864551-08:00 INFO kernel: [    4.362072] usbcore: registered new interface driver cdc_ether

...

2017-01-03T05:56:48.544550-08:00 INFO kernel: [    6.038966] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

...

2017-01-03T05:56:52.189558-08:00 INFO kernel: [    9.687858] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

...

2017-01-03T05:56:57.351966-08:00 ERR dhcpcd[1734]: eth0: DHCP lease expired

...

2017-01-03T05:57:22.186136-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T05:57:52.230163-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T05:58:22.262622-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T05:58:52.303549-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T05:59:22.339082-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T05:59:52.383226-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T06:00:22.421990-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T06:00:54.426554-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T06:01:54.427462-08:00 ERR shill[1495]: [ERROR:dhcp_config.cc(298)] Timed out waiting for DHCP lease on eth0 (after 30 seconds).
2017-01-03T06:01:56.480628-08:00 NOTICE simple-rotate[4590]: rotating /var/log/tlsdate.log
2017-01-03T06:02:00.333536-08:00 NOTICE check_ethernet.hook[4607]: Attempting recovery method "ensure_connection_manager_is_running"
2017-01-03T06:02:00.347933-08:00 NOTICE check_ethernet.hook[4611]: Attempting recovery method "toggle_ethernet_interfaces"
2017-01-03T06:02:00.457477-08:00 INFO kernel: [  317.999039] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-03T06:02:03.682546-08:00 INFO kernel: [  321.225578] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
2017-01-03T06:02:25.391217-08:00 INFO avahi-daemon[1869]: Joining mDNS multicast group on interface eth0.IPv4 with address 100.115.130.200.
2017-01-03T06:02:25.391328-08:00 INFO avahi-daemon[1869]: New relevant interface eth0.IPv4 for mDNS.
2017-01-03T06:02:25.391368-08:00 INFO avahi-daemon[1869]: Registering new address record for 100.115.130.200 on eth0.IPv4.
2017-01-03T06:02:38.523964-08:00 NOTICE check_ethernet.hook[4722]: Recovery method "toggle_ethernet_interfaces" successful

I am also attaching net.log, but it looks like maybe the DHCP server wasn't working?



messages
812 KB View Download
net.log
236 KB View Download
This keeps happening on the canaries.

https://uberchromegw.corp.google.com/i/chromeos/builders/nyan_blaze-release/builds/753/steps/PaygenTest/logs/stdio

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/95655799-chromeos-test/chromeos4-row7-rack9-host1/

The sysinfo.tgz in the bucket shows that eth0 wasn't up for about 5 minutes, and that was too long for the reboot timeout (about 4m)

2017-01-11T06:19:55.447817-08:00 WARNING kernel: [  178.704842] host1x 50000000.host1x: stop latency exceeded, new value 57083 ns
2017-01-11T06:20:53.367008-08:00 INFO kernel: [  236.400073] mwifiex_sdio mmc1:0001:1: info: scan: num_probes = 4
2017-01-11T06:21:56.917852-08:00 INFO kernel: [  299.694119] mwifiex_sdio mmc1:0001:1: info: scan: num_probes = 4
2017-01-11T06:22:04.617251-08:00 WARNING kernel: [  307.361277] gk20a 57000000.gk20a: stop latency exceeded, new value 175667 ns
2017-01-11T06:22:05.340795-08:00 NOTICE check_ethernet.hook[3979]: Attempting recovery method "ensure_connection_manager_is_running"
2017-01-11T06:22:05.351474-08:00 NOTICE check_ethernet.hook[3983]: Attempting recovery method "toggle_ethernet_interfaces"
2017-01-11T06:22:05.448398-08:00 INFO kernel: [  308.191760] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-11T06:22:05.476938-08:00 INFO kernel: [  308.216879] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
2017-01-11T06:22:05.540904-08:00 INFO kernel: [  308.277320] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
2017-01-11T06:22:05.540997-08:00 INFO kernel: [  308.278857] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
2017-01-11T06:22:07.907241-08:00 ERR kernel: [  310.635409] smsc75xx 1-2.2:1.0 eth0: kevent 4 may have been dropped
2017-01-11T06:22:07.907370-08:00 INFO kernel: [  310.640172] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
2017-01-11T06:22:07.907407-08:00 WARNING kernel: [  310.643281] smsc75xx 1-2.2:1.0 eth0: device_set_wakeup_enable error -22
2017-01-11T06:22:07.926957-08:00 INFO kernel: [  310.656802] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-01-11T06:22:07.937090-08:00 INFO kernel: [  310.664486] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-01-11T06:22:13.670674-08:00 ERR dhcpcd[4002]: eth0: DHCP lease expired
2017-01-11T06:22:18.369080-08:00 INFO avahi-daemon[1659]: Joining mDNS multicast group on interface eth0.IPv4 with address 100.115.198.89.
2017-01-11T06:22:18.369440-08:00 INFO avahi-daemon[1659]: New relevant interface eth0.IPv4 for mDNS.
2017-01-11T06:22:18.369464-08:00 INFO avahi-daemon[1659]: Registering new address record for 100.115.198.89 on eth0.IPv4.
2017-01-11T06:22:22.182372-08:00 WARNING kernel: [  324.060633] gk20a 57000000.gk20a: start latency exceeded, new value 208500 ns
2017-01-11T06:22:44.435360-08:00 NOTICE check_ethernet.hook[4131]: Recovery method "toggle_ethernet_interfaces" successful
2017-01-11T06:22:58.623925-08:00 INFO sshd[4132]: Accepted publickey for root from 172.25.90.25 port 42187 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-01-11T06:22:58.806623-08:00 NOTICE ag[4135]: autotest server[stack::wrapper|_install|path_exists] -> ssh_run(ls "/tmp/sysinfo/autoserv-325SwS" > /dev/null)
2017-01-11T06:22:58.848001-08:00 INFO sshd[4132]: Received disconnect from 172.25.90.25 port 42187:11: disconnected by user
2017-01-11T06:22:58.848172-08:00 INFO sshd[4132]: Disconnected from 172.25.90.25 port 42187
2017-01-11T06:22:59.418937-08:00 INFO sshd[4137]: Accepted publickey for root from 172.25.90.25 port 42195 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-01-11T06:22:59.727940-08:00 NOTICE ag[4140]: autotest server[stack::wait_up|is_up|ssh_ping] -> ssh_run(true)
Also see  issue 681198 .  I am not duplicating the bug because that issue did not happen at reboot, but it is a similar malfunction.
Interesting, both this bug and the linked one involved the smsc95xx driver.

We could enable netdev_dbg() for this driver.  Maybe we can use CONFIG_DYNAMIC_DEBUG and enable it only on test images.  (I don't think we can compile the kernel differently for test images.)

Comment 7 by snanda@chromium.org, Jan 14 2017

In c#3 logs the link does seem to come up reliably. Possible but unlikely scenario is that the DHCP server was out to lunch. That's why I was thinking of adding some ethtool stats (ethtool -i and perhaps ethtool -S) to check_ethernet.hook to see whether any packets were being received on the ethernet interface or not.  If packets were being received the issue would be at higher layers (DHCP) but if none were being received issue could be at ethernet interface or switch level.

Comment 8 by snanda@chromium.org, Jan 14 2017

Cc: cernekee@chromium.org
Oh and about enabling CONFIG_DYNAMIC_DEBUG, I (and dianders IIRC) had looked at that in the past. The kernel size increase wasn't too much so it could be something we could think about enabling for production kernels too.

Enabling netdev_dbg sounds interesting too although I am not sure how much debug spew we will get from it.
> 2017-01-03T05:56:57.351966-08:00 ERR dhcpcd[1734]: eth0: DHCP lease expired

Is this normal?  12 seconds after boot?

I did see these on my DUTs but they corresponded to the timestamp when my LAN port got disabled: http://t/24819174

> Oh and about enabling CONFIG_DYNAMIC_DEBUG

I think you will need to modify the OS image or kernel command line to enable verbose debug prints from that driver, regardless?

Seeing the driver/ethtool stats would be interesting.
Owner: semenzato@chromium.org
Status: Assigned
#7 yes I had forgotten that the initial report looks like a DHCP server failure.  For that one I was just advocating a faster retry, at least initially.

I may be able to go back and look at more cases and see if they all happen on the same type of device.  Does ethtool -i give us information that I cannot find in the kernel log at boot?

Ethtool -S will give us packet counts.  What would we compare those to?

#9: those drivers are all loaded as modules.  Maybe I can enable debugging with a module parameter?  But at this point I am not even sure that that information will be useful.





With dynamic debug you can make the output more verbose through debugfs, if you can execute shell commands (e.g. via ssh).  But if the bug happens at startup, there's a race, and it sounds like the system has no connectivity.  Which means you'll have to tweak the image either way.

I think dynamic debug settings can be changed from the kernel command line too.
c#10: regarding ethtool -S, since broadcast packets will continue to be received if the rx packets count is non-zero that would indicate that the interface came up since it's able to receive packets at least.

I haven't looked at ethtool -i recently but IIRC it gives you the link speed too.  I guess you could get that from kernel log too.
#12 ok I am not familiar with check_ethernet.hook, but I guess we would collect multiple samples of ethtool -S, and then we can compare them, right?

I'll look at this after the week-end, it's not urgent.  Thanks!

Yeah, 0 RX packets (or lots of RX errors) would be a red flag.

Multiple samples might help identify whether the counters suddenly stopped incrementing.
As long as we get the ethtool -S stats before bouncing the interface we should be good.


Yes, after weekend is absolutely fine.
Another failed reboot here:

https://uberchromegw.corp.google.com/i/chromeos/builders/veyron_speedy-release/builds/785


https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/97526304-chromeos-test/chromeos4-row4-rack10-host20/

However this is not just 5 minutes.  The DUT went down at 6:32 and there were many failed attempts to reconnect up until 7:03.

By looking at the DUT jobs I found one with the start time of about 7:15am, whose console-ramoops appears to show some trouble with eth0: 


[   11.877836] audit: type=1326 audit(1485183233.640:4): auid=4294967295 uid=226 gid=226 ses=4294967295 subj=u:r:chromeos:s0 pid=1449 comm="mtpd" exe="/opt/google/mtpd/mtpd" sig=31 syscall=384 compat=0 ip=0xa7007202 code=0x0
[   12.206877] max98090 2-0010: DMIC Mux: put_dmic_mux disable DMIC
[   12.523869] uvcvideo: Found UVC 1.00 device USB2.0 HD UVC WebCam (04f2:b53a)
[   12.527465] usbcore: registered new interface driver uvcvideo
[   12.527483] USB Video Class driver (1.1.1)
[   13.361212] init: report-power-metrics main process (2904) terminated with status 2
[  313.880020] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  313.941576] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  313.941990] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  344.077768] usbcore: deregistering interface driver smsc95xx
[  344.077946] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  344.078003] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  345.145229] usb 2-1.1: authorized to connect
[  345.157936] smsc95xx v1.0.4
[  345.223218] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  345.223468] usbcore: registered new interface driver smsc95xx
[  345.311587] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  345.312094] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  375.399848] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  375.661548] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  375.661755] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  435.724493] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  435.791602] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  435.792188] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  465.908216] usbcore: deregistering interface driver smsc95xx
[  465.908307] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  465.908356] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  466.968636] usb 2-1.1: authorized to connect
[  466.985627] smsc95xx v1.0.4
[  467.052535] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  467.052680] usbcore: registered new interface driver smsc95xx
[  467.131590] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  467.132016] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  484.826433] rockchip-vop ff940000.vop: Detached from iommu domain
[  497.311097] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  497.581587] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  497.581999] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  557.627457] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  557.691578] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  557.692078] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  587.848623] usbcore: deregistering interface driver smsc95xx
[  587.848771] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  587.848849] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  588.920913] usb 2-1.1: authorized to connect
[  588.940717] smsc95xx v1.0.4
[  589.006276] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  589.007239] usbcore: registered new interface driver smsc95xx
[  589.101707] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  589.102191] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  619.239371] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  619.491558] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  619.491911] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  679.550184] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  679.611573] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  679.611931] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  709.761228] usbcore: deregistering interface driver smsc95xx
[  709.761364] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  709.761430] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  710.821815] usb 2-1.1: authorized to connect
[  710.839412] smsc95xx v1.0.4
[  710.903240] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  710.903523] usbcore: registered new interface driver smsc95xx
[  710.991708] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  710.991924] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  741.160299] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  741.411547] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  741.411863] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  801.460547] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  801.521578] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  801.521774] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  831.667302] usbcore: deregistering interface driver smsc95xx
[  831.667417] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  831.667475] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  832.738946] usb 2-1.1: authorized to connect
[  832.756785] smsc95xx v1.0.4
[  832.822691] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  832.822894] usbcore: registered new interface driver smsc95xx
[  832.901611] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  832.902065] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  863.050217] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  863.331547] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  863.331755] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  923.413866] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  923.481582] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  923.482049] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  953.630587] usbcore: deregistering interface driver smsc95xx
[  953.630702] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[  953.630761] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  954.732646] usb 2-1.1: authorized to connect
[  954.751029] smsc95xx v1.0.4
[  954.822706] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[  954.822909] usbcore: registered new interface driver smsc95xx
[  954.911597] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  954.912059] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[  985.080092] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  985.341548] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[  985.341828] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1045.383311] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1045.441836] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1045.442493] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1075.572264] usbcore: deregistering interface driver smsc95xx
[ 1075.572385] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[ 1075.572445] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1076.649351] usb 2-1.1: authorized to connect
[ 1076.666720] smsc95xx v1.0.4
[ 1076.735516] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[ 1076.735836] usbcore: registered new interface driver smsc95xx
[ 1076.831696] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1076.832038] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1106.988987] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1107.241563] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1107.241845] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1167.331937] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1167.391577] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1167.391959] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1197.519726] usbcore: deregistering interface driver smsc95xx
[ 1197.519851] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
[ 1197.519921] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1198.607005] usb 2-1.1: authorized to connect
[ 1198.616633] smsc95xx v1.0.4
[ 1198.683121] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
[ 1198.683476] usbcore: registered new interface driver smsc95xx
[ 1198.761591] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1198.762029] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1228.929856] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1229.201553] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1229.201840] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 1264.324540] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
[ 1264.358008] init: debugd main process (1112) killed by TERM signal
[ 1265.562514] tpm_i2c_infineon 1-0020: command 0xba (size 18) returned code 0x0
[ 1265.566727] init: console-ttyS2 main process (633) killed by TERM signal
[ 1265.567773] init: chapsd main process (1121) killed by TERM signal
[ 1265.569010] init: crash-sender main process (1425) killed by TERM signal
[ 1265.569756] init: cros-machine-id-regen-periodic main process (1436) killed by TERM signal
[ 1265.570350] init: mtpd main process (1440) killed by TERM signal
[ 1265.570965] init: log-rotate main process (1451) killed by TERM signal
[ 1265.571587] init: bluetoothd main process (1487) killed by TERM signal
[ 1265.572083] init: cras main process (1497) killed by TERM signal
[ 1265.578935] init: powerd main process (1084) killed by TERM signal
[ 1265.580401] init: warn-collector main process (2863) killed by TERM signal
[ 1265.581020] init: activate_date main process (2876) killed by TERM signal



Summary: eth0 smsc95xx issues in DUTs cause daily test failures (was: eth0 took 5 minutes to connect after reboot)
I am renaming this bug to better capture the root cause.
Cc: abhishekbh@chromium.org
+abhishekbh as FYI although this seems unrelated to the CFM ethernet bug he is chasing.
https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/97531302-chromeos-test/chromeos4-row4-rack10-host20/sysinfo/var/log/

/var/log/messages confirms the times (+8 hours because of GMT):

2017-01-23T14:32:43.614629+00:00 INFO kernel: [   24.844340] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
2017-01-23T14:32:43.614656+00:00 INFO kernel: [   24.844444] usbcore: registered new interface driver smsc95xx
2017-01-23T14:32:43.623694+00:00 INFO laptop-mode[2863]: Laptop mode 
2017-01-23T14:32:43.625604+00:00 INFO laptop-mode[2864]: enabled, not active
2017-01-23T14:32:43.664283+00:00 ERR laptop-mode[2892]: Couldn't acquire lock. Retrying.... PID is 2862\n
2017-01-23T14:32:43.697961+00:00 INFO laptop-mode[2906]: Laptop mode 
2017-01-23T14:32:43.699674+00:00 INFO laptop-mode[2907]: enabled, not active
2017-01-23T14:32:43.846848+00:00 INFO laptop-mode[2947]: Laptop mode 
2017-01-23T14:32:43.849209+00:00 INFO laptop-mode[2948]: enabled, not active
2017-01-23T14:32:43.874588+00:00 INFO kernel: [   25.101701] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:32:43.874617+00:00 INFO kernel: [   25.101931] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:32:44.913477+00:00 INFO sshd[2973]: Server listening on 0.0.0.0 port 2222.
2017-01-23T14:32:44.913779+00:00 INFO sshd[2973]: Server listening on :: port 2222.
2017-01-23T14:32:45.128843+00:00 NOTICE veyron-emmc-upgrade[3027]: NOOP: System is in debug mode
2017-01-23T14:32:45.174610+00:00 WARNING kernel: [   26.400189] init: report-power-metrics main process (3043) terminated with status 2
2017-01-23T14:32:45.186301+00:00 INFO minijail0[2999]: libminijail[2999]: not running init loop, exiting immediately
2017-01-23T14:32:45.190881+00:00 INFO minijail0[3059]: libminijail[3059]: not running init loop, exiting immediately
2017-01-23T14:32:45.194594+00:00 DEBUG kernel: [   26.419231] SELinux: initialized (dev proc, type proc), uses genfs_contexts
2017-01-23T14:32:45.194885+00:00 NOTICE /usr/share/cros/init/crx-import.sh[3076]: CRX Source not present. No import performed.
2017-01-23T14:32:45.033543+00:00 INFO arc_camera_service[3081]: Starting ARC camera service provider
2017-01-23T14:32:45.054215+00:00 DEBUG kernel: [   26.520652] SELinux: initialized (dev loop1, type squashfs), uses xattr
2017-01-23T14:32:45.114540+00:00 DEBUG kernel: [   26.581386] SELinux: initialized (dev loop2, type squashfs), uses xattr
2017-01-23T14:32:45.134536+00:00 DEBUG kernel: [   26.598666] SELinux: initialized (dev loop3, type squashfs), uses xattr
2017-01-23T14:32:45.164231+00:00 DEBUG kernel: [   26.628666] SELinux: initialized (dev loop4, type squashfs), uses xattr
2017-01-23T14:32:45.184583+00:00 DEBUG kernel: [   26.654979] SELinux: initialized (dev loop5, type squashfs), uses xattr
2017-01-23T14:32:45.232675+00:00 INFO session_manager[1056]: [INFO:server_backed_state_key_generator.cc(131)] Stable device secret missing!
2017-01-23T14:37:40.990655+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.DailyUseTime for reading: No such file or directory
2017-01-23T14:37:40.991307+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashInterval for reading: No such file or directory
2017-01-23T14:37:40.991608+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashInterval for reading: No such file or directory
2017-01-23T14:37:40.992374+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/daily.cycle for reading: No such file or directory
2017-01-23T14:37:40.994071+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.AnyCrashesDaily for reading: No such file or directory
2017-01-23T14:37:40.994680+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashesDaily for reading: No such file or directory
2017-01-23T14:37:40.995644+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashesDaily for reading: No such file or directory
2017-01-23T14:37:40.996382+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UncleanShutdownsDaily for reading: No such file or directory
2017-01-23T14:37:40.997620+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/weekly.cycle for reading: No such file or directory
2017-01-23T14:37:40.998307+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.AnyCrashesWeekly for reading: No such file or directory
2017-01-23T14:37:40.998951+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashesWeekly for reading: No such file or directory
2017-01-23T14:37:40.999548+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashesWeekly for reading: No such file or directory
2017-01-23T14:37:41.000164+00:00 WARNING metrics_daemon[1598]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UncleanShutdownsWeekly for reading: No such file or directory
2017-01-23T14:37:45.766411+00:00 NOTICE check_ethernet.hook[4024]: Attempting recovery method "ensure_connection_manager_is_running"
2017-01-23T14:37:45.783621+00:00 NOTICE check_ethernet.hook[4028]: Attempting recovery method "toggle_ethernet_interfaces"
2017-01-23T14:37:45.814221+00:00 INFO kernel: [  327.279189] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:37:45.874318+00:00 INFO kernel: [  327.342960] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:37:45.874366+00:00 INFO kernel: [  327.343358] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:38:15.952294+00:00 NOTICE check_ethernet.hook[4102]: Attempting recovery method "reload_usb_ethernet_devices"
2017-01-23T14:38:15.990099+00:00 NOTICE /usr/sbin/reload_network_device[4111]: Restarting network interface eth0
2017-01-23T14:38:16.015858+00:00 INFO kernel: [  357.485392] usbcore: deregistering interface driver smsc95xx
2017-01-23T14:38:16.015899+00:00 INFO kernel: [  357.485491] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
2017-01-23T14:38:16.015906+00:00 INFO kernel: [  357.485546] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:38:16.028829+00:00 ERR shill[1189]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:38:16.029044+00:00 ERR shill[1189]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:38:17.084346+00:00 INFO kernel: [  358.552350] usb 2-1.1: authorized to connect
2017-01-23T14:38:17.094278+00:00 INFO kernel: [  358.563082] smsc95xx v1.0.4
2017-01-23T14:38:17.154319+00:00 INFO kernel: [  358.623358] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
2017-01-23T14:38:17.154412+00:00 INFO kernel: [  358.623623] usbcore: registered new interface driver smsc95xx
2017-01-23T14:38:17.225709+00:00 INFO laptop-mode[4154]: Laptop mode 
2017-01-23T14:38:17.228558+00:00 INFO laptop-mode[4155]: enabled, not active
2017-01-23T14:38:17.255995+00:00 INFO kernel: [  358.721841] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:38:17.256026+00:00 INFO kernel: [  358.722202] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:38:17.350795+00:00 INFO laptop-mode[4195]: Laptop mode 
2017-01-23T14:38:17.353090+00:00 INFO laptop-mode[4196]: enabled, not active
2017-01-23T14:38:47.175805+00:00 NOTICE check_ethernet.hook[4249]: Attempting recovery method "restart_connection_manager"
2017-01-23T14:38:47.434666+00:00 ERR wpa_supplicant[722]: rmdir[ctrl_interface=/var/run/wpa_supplicant]: Permission denied
2017-01-23T14:38:47.444222+00:00 INFO kernel: [  388.909534] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:38:47.588431+00:00 ERR shill[4265]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-01-23T14:38:47.607596+00:00 ERR shill[4265]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-01-23T14:38:47.608907+00:00 ERR shill[4265]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-01-23T14:38:47.694205+00:00 INFO kernel: [  389.161700] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:38:47.694229+00:00 INFO kernel: [  389.161939] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:38:47.794794+00:00 INFO avahi-daemon[1635]: Server startup complete. Host name is dc50baf0143d083f492fd06b58861577.local. Local service cookie is 121004034.
2017-01-23T14:38:47.794823+00:00 INFO avahi-daemon[1635]: Changing host name to 'dc50baf0143d083f492fd06b58861577'.
2017-01-23T14:38:47.803225+00:00 NOTICE cros-machine-id-regen[4289]: Regenerated /var/lib/dbus/machine-id (reason: network).
2017-01-23T14:39:17.622325+00:00 NOTICE check_ethernet.hook[4335]: Restart failed; will retry recovery for ~15 minutes
2017-01-23T14:39:47.719362+00:00 NOTICE check_ethernet.hook[4390]: Attempting recovery method "ensure_connection_manager_is_running"
2017-01-23T14:39:47.735460+00:00 NOTICE check_ethernet.hook[4394]: Attempting recovery method "toggle_ethernet_interfaces"
2017-01-23T14:39:47.765065+00:00 INFO kernel: [  449.234673] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:39:47.824268+00:00 INFO kernel: [  449.291860] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:39:47.824311+00:00 INFO kernel: [  449.292509] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:40:17.910142+00:00 NOTICE check_ethernet.hook[4448]: Attempting recovery method "reload_usb_ethernet_devices"
2017-01-23T14:40:17.935096+00:00 NOTICE /usr/sbin/reload_network_device[4457]: Restarting network interface eth0
2017-01-23T14:40:17.966133+00:00 INFO kernel: [  479.430374] usbcore: deregistering interface driver smsc95xx
2017-01-23T14:40:17.966150+00:00 INFO kernel: [  479.430469] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
2017-01-23T14:40:17.966155+00:00 INFO kernel: [  479.430524] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:40:17.979103+00:00 ERR shill[4265]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:40:17.979289+00:00 ERR shill[4265]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:40:19.034276+00:00 INFO kernel: [  480.505862] usb 2-1.1: authorized to connect
2017-01-23T14:40:19.054327+00:00 INFO kernel: [  480.521694] smsc95xx v1.0.4
2017-01-23T14:40:19.114306+00:00 INFO kernel: [  480.584849] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
2017-01-23T14:40:19.114366+00:00 INFO kernel: [  480.585351] usbcore: registered new interface driver smsc95xx
2017-01-23T14:40:19.204263+00:00 INFO kernel: [  480.671864] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:40:19.204303+00:00 INFO kernel: [  480.672301] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

etc. etc.
There are also shill errors:

2017-01-23T14:52:29.524270+00:00 INFO kernel: [ 1210.992959] usbcore: deregistering interface driver smsc95xx
2017-01-23T14:52:29.524322+00:00 INFO kernel: [ 1210.993075] smsc95xx 2-1.1:1.0 eth0: unregister 'smsc95xx' usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet
2017-01-23T14:52:29.524329+00:00 INFO kernel: [ 1210.993135] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:52:29.535769+00:00 ERR shill[6327]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:52:29.536026+00:00 ERR shill[6327]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:52:30.604341+00:00 INFO kernel: [ 1212.068249] usb 2-1.1: authorized to connect
2017-01-23T14:52:30.614348+00:00 INFO kernel: [ 1212.081724] smsc95xx v1.0.4
2017-01-23T14:52:30.674333+00:00 INFO kernel: [ 1212.146130] smsc95xx 2-1.1:1.0 eth0: register 'smsc95xx' at usb-ff540000.usb-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:10:02:28
2017-01-23T14:52:30.674408+00:00 INFO kernel: [ 1212.146666] usbcore: registered new interface driver smsc95xx
2017-01-23T14:52:30.774286+00:00 INFO kernel: [ 1212.241864] smsc95xx 2-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-01-23T14:52:30.774321+00:00 INFO kernel: [ 1212.242295] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-01-23T14:52:30.783889+00:00 INFO laptop-mode[6551]: Laptop mode 
2017-01-23T14:52:30.788190+00:00 INFO laptop-mode[6552]: enabled, not active
2017-01-23T14:52:30.902658+00:00 INFO laptop-mode[6592]: Laptop mode 
2017-01-23T14:52:30.905767+00:00 INFO laptop-mode[6593]: enabled, not active
2017-01-23T14:53:00.736980+00:00 NOTICE check_ethernet.hook[6648]: Attempting recovery method "restart_connection_manager"
2017-01-23T14:53:00.944530+00:00 ERR wpa_supplicant[722]: rmdir[ctrl_interface=/var/run/wpa_supplicant]: Permission denied

Comment 21 by kirtika@google.com, Jan 23 2017

The shill error could just be because that directory doesn't exist in sysfs since eth0 isn't up and hence it can't write the file. Ignore?


2017-01-23T14:52:29.535769+00:00 ERR shill[6327]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-01-23T14:52:29.536026+00:00 ERR shill[6327]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
Should it be reported as ERROR or ERR if it's not an error?  Should we file a separate bug?
Cc: englab-sys-cros@google.com dgarr...@chromium.org xixuan@chromium.org haoweiw@chromium.org rspangler@chromium.org semenzato@chromium.org kinaba@chromium.org
 Issue 681198  has been merged into this issue.
What about the second error:

2017-01-23T14:53:00.944530+00:00 ERR wpa_supplicant[722]: rmdir[ctrl_interface=/var/run/wpa_supplicant]: Permission denied

Is it worth filing a bug for it?
A quick scan for kernel versions shows that this happened on 3.14 and older kernels There are instances on 3.14, 3.10, and 3.8.  There are no instances on 3.18 and 4.4.
FYR - it happened before on a dongle attached to a veryron_minnie - Issue 642654. And we requested for a replacement of the dongle.

Comment 27 by kirtika@google.com, Jan 25 2017

Re: #c25, semenzato@ may have skinned this cat. 
Diff-ing 3.14 and 3.18, this is a probably a patch we want: 
https://patchwork.kernel.org/patch/4520541/

Thank you Kirtika.  That is indeed the only difference between 3.14 and 3.18 and it won't hurt to backport it.  I am not too hopeful though.  The commit message says:

    usbnet: smsc95xx: add reset_resume function with reset operation
    
    The smsc95xx needs to resume with reset operation. Otherwise it causes
    system hang by network error like below after resume. This case appears
    on odroid u3 board.
    
    [    9.727600] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [    9.727648] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [    9.727689] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [    9.727728] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [    9.729486] PM: resume of devices complete after 2011.219 msecs
    [   10.117609] Restarting tasks ... done.
    [   11.725099] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [   13.480846] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    [   13.481361] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
    ...

but I don't think we're seeing this problem at suspend/resume.

There are two more patches in 4.4, but neither seems applicable either (one is just lint, the other removes a null pointer dereference).

I looked at Linus's tree (4.10) and there is one more patch, but I doubt we disable autonegotiation.

commit d69d169493463e6b1da9a1965d35126e479aa27f
Author:     Christoph Fritz <chf.fritz@googlemail.com>
AuthorDate: Thu May 26 04:06:47 2016 +0200
Commit:     David S. Miller <davem@davemloft.net>
CommitDate: Tue May 31 14:22:23 2016 -0700

    usbnet: smsc95xx: fix link detection for disabled autonegotiation
    
    To detect link status up/down for connections where autonegotiation is
    explicitly disabled, we don't get an irq but need to poll the status
    register for link up/down detection.
    This patch adds a workqueue to poll for link status.
    
    Signed-off-by: Christoph Fritz <chf.fritz@googlemail.com>
    Signed-off-by: David S. Miller <davem@davemloft.net>


However, the problem could be in some other part of the stack.
Project Member

Comment 29 by bugdroid1@chromium.org, Jan 27 2017

Labels: merge-merged-chromeos-3.10
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3006e5f0359477be845b6074ded6e9ef8ad9df63

commit 3006e5f0359477be845b6074ded6e9ef8ad9df63
Author: Joonyoung Shim <jy0922.shim@samsung.com>
Date: Thu Jul 10 02:49:42 2014

UPSTREAM: usbnet: smsc95xx: add reset_resume function with reset operation

The smsc95xx needs to resume with reset operation. Otherwise it causes
system hang by network error like below after resume. This case appears
on odroid u3 board.

[    9.727600] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727648] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727689] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727728] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.729486] PM: resume of devices complete after 2011.219 msecs
[   10.117609] Restarting tasks ... done.
[   11.725099] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.480846] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.481361] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
...

Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b4df480f68ae03b5dd4ab0db56536fbcec741705)
BUG=chromium:677572
TEST=None
Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org>

Change-Id: I0fc3f398f4b8993b5c14436c3fa235024a93e672
Reviewed-on: https://chromium-review.googlesource.com/433082
Commit-Ready: Kirtika Ruchandani <kirtika@chromium.org>
Tested-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-by: Luigi Semenzato <semenzato@chromium.org>
Reviewed-by: Dmitry Torokhov <dtor@chromium.org>

[modify] https://crrev.com/3006e5f0359477be845b6074ded6e9ef8ad9df63/drivers/net/usb/smsc95xx.c

Project Member

Comment 30 by bugdroid1@chromium.org, Jan 27 2017

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/c34b8aeed8a4715eef34fb3314e1be90681d45fb

commit c34b8aeed8a4715eef34fb3314e1be90681d45fb
Author: Joonyoung Shim <jy0922.shim@samsung.com>
Date: Thu Jul 10 02:49:42 2014

UPSTREAM: usbnet: smsc95xx: add reset_resume function with reset operation

The smsc95xx needs to resume with reset operation. Otherwise it causes
system hang by network error like below after resume. This case appears
on odroid u3 board.

[    9.727600] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727648] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727689] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727728] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.729486] PM: resume of devices complete after 2011.219 msecs
[   10.117609] Restarting tasks ... done.
[   11.725099] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.480846] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.481361] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
...

Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b4df480f68ae03b5dd4ab0db56536fbcec741705)
BUG=chromium:677572
TEST=None
Change-Id: I45f33805718d47843a03d5c7260abbcee7f74453
Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/433200
Reviewed-by: Dmitry Torokhov <dtor@chromium.org>

[modify] https://crrev.com/c34b8aeed8a4715eef34fb3314e1be90681d45fb/drivers/net/usb/smsc95xx.c

Project Member

Comment 31 by bugdroid1@chromium.org, Jan 27 2017

Labels: merge-merged-chromeos-3.8
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/f09b42f61e3588eca466c293f26f84e06947ff10

commit f09b42f61e3588eca466c293f26f84e06947ff10
Author: Joonyoung Shim <jy0922.shim@samsung.com>
Date: Thu Jul 10 02:49:42 2014

UPSTREAM: usbnet: smsc95xx: add reset_resume function with reset operation

The smsc95xx needs to resume with reset operation. Otherwise it causes
system hang by network error like below after resume. This case appears
on odroid u3 board.

[    9.727600] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727648] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727689] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.727728] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[    9.729486] PM: resume of devices complete after 2011.219 msecs
[   10.117609] Restarting tasks ... done.
[   11.725099] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.480846] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
[   13.481361] smsc95xx 1-2:1.0 eth0: kevent 2 may have been dropped
...

Signed-off-by: Joonyoung Shim <jy0922.shim@samsung.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
(cherry picked from commit b4df480f68ae03b5dd4ab0db56536fbcec741705)
BUG=chromium:677572
TEST=None
Signed-off-by: Kirtika Ruchandani <kirtika@chromium.org>
Change-Id: I4975c53e2a2ba72580e2ce245f0ca5a3aeb9ab15
Reviewed-on: https://chromium-review.googlesource.com/433218
Reviewed-by: Dmitry Torokhov <dtor@chromium.org>

[modify] https://crrev.com/f09b42f61e3588eca466c293f26f84e06947ff10/drivers/net/usb/smsc95xx.c

Here's another case on a samus.  The autoserv requested a reboot, and the host rebooted, but did not have an ethernet connection for over 6 minutes, so autoserv gave up.

I found the kernel log in the repair bucket, running "dut-status -f chromeos2-row3-rack11-host9" and finding the first repair after the failed test.

    2017-02-03 01:56:49  OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row3-rack11-host9/1240876-provision/
    2017-02-03 01:37:44  OK http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row3-rack11-host9/1240743-repair/
    2017-02-03 01:35:10  -- http://cautotest/tko/retrieve_logs.cgi?job=/results/hosts/chromeos2-row3-rack11-host9/1240721-reset/
    2017-02-03 00:56:59  -- http://cautotest/tko/retrieve_logs.cgi?job=/results/99463171-chromeos-test/


https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos2-row3-rack11-host9/1240743-repair/20170302013740/

look in the "after_reset" log collection:

  after_reset.chromeos2-row3-rack11-host9.tgz

Looks like check_ethernet.hook tries to bring the interface up multiple times, but keeps failing.

There are several reboots (not included in snippet), then a block of 0's in the syslog (maybe a reset?) then eth1 finally works (after almost 1h).

I wonder if the dongle was in a bad state and it took the reset to power-cycle it.


2017-02-03T00:58:23.051484-08:00 INFO session_manager[1314]: [INFO:server_backed_state_key_generator.cc(131)] Stable device secret missing!
2017-02-03T00:58:23.059854-08:00 INFO kernel: [   16.284795] usb 1-2.1: new high-speed USB device number 6 using xhci_hcd
2017-02-03T00:58:23.143843-08:00 INFO kernel: [   16.368935] usb 1-2.1: New USB device found, idVendor=0424, idProduct=ec00
2017-02-03T00:58:23.143851-08:00 INFO kernel: [   16.368942] usb 1-2.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
2017-02-03T00:58:24.220843-08:00 INFO kernel: [   17.447075] smsc95xx v1.0.4
2017-02-03T00:58:24.258843-08:00 INFO kernel: [   17.485235] smsc95xx 1-2.1:1.0 eth0: register 'smsc95xx' at usb-0000:00:14.0-2.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:15:88:24
2017-02-03T00:58:24.258849-08:00 INFO kernel: [   17.485275] usbcore: registered new interface driver smsc95xx
2017-02-03T00:58:24.262908-08:00 DEBUG kernel: [   17.488792] SELinux: initialized (dev loop3, type squashfs), uses xattr
2017-02-03T00:58:24.266701-08:00 INFO laptop-mode[3059]: Warning: Configuration file /etc/laptop-mode/conf.d/board-specific/*.conf is not readable, skipping.
2017-02-03T00:58:24.276879-08:00 DEBUG kernel: [   17.503272] SELinux: initialized (dev loop4, type squashfs), uses xattr
2017-02-03T00:58:24.279847-08:00 DEBUG kernel: [   17.506209] SELinux: initialized (dev loop5, type squashfs), uses xattr
2017-02-03T00:58:24.285005-08:00 DEBUG kernel: [   17.511173] SELinux: initialized (dev loop6, type squashfs), uses xattr
2017-02-03T00:58:24.310541-08:00 INFO laptop-mode[3090]: Laptop mode 
2017-02-03T00:58:24.311844-08:00 INFO kernel: [   17.538729] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T00:58:24.311854-08:00 INFO kernel: [   17.538771] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T00:58:24.314707-08:00 INFO laptop-mode[3091]: enabled, not active
2017-02-03T00:58:25.860169-08:00 ERR mtpd[1815]: LIBMTP_Detect_Raw_Devices failed with 5
2017-02-03T01:03:24.360537-08:00 NOTICE check_ethernet.hook[4033]: Attempting recovery method "ensure_connection_manager_is_running"
2017-02-03T01:03:24.367035-08:00 NOTICE check_ethernet.hook[4037]: Attempting recovery method "toggle_ethernet_interfaces"
2017-02-03T01:03:24.378869-08:00 INFO kernel: [  317.848535] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:03:24.417892-08:00 INFO kernel: [  317.887067] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:03:24.417924-08:00 INFO kernel: [  317.887184] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T01:03:54.451511-08:00 NOTICE check_ethernet.hook[4104]: Attempting recovery method "reload_usb_ethernet_devices"
2017-02-03T01:03:54.468301-08:00 NOTICE /usr/sbin/reload_network_device[4113]: Restarting network interface eth0
2017-02-03T01:03:54.483872-08:00 INFO kernel: [  347.977889] usbcore: deregistering interface driver smsc95xx
2017-02-03T01:03:54.483897-08:00 INFO kernel: [  347.977986] smsc95xx 1-2.1:1.0 eth0: unregister 'smsc95xx' usb-0000:00:14.0-2.1, smsc95xx USB 2.0 Ethernet
2017-02-03T01:03:54.483902-08:00 INFO kernel: [  347.978033] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:03:54.496186-08:00 ERR shill[1612]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-03T01:03:54.496345-08:00 ERR shill[1612]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-03T01:03:55.506880-08:00 INFO kernel: [  349.001783] usb 1-2.1: authorized to connect
2017-02-03T01:03:55.511887-08:00 INFO kernel: [  349.006025] smsc95xx v1.0.4
2017-02-03T01:03:55.550977-08:00 INFO kernel: [  349.045357] smsc95xx 1-2.1:1.0 eth0: register 'smsc95xx' at usb-0000:00:14.0-2.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:15:88:24
2017-02-03T01:03:55.551008-08:00 INFO kernel: [  349.045508] usbcore: registered new interface driver smsc95xx
2017-02-03T01:03:55.567159-08:00 INFO laptop-mode[4136]: Warning: Configuration file /etc/laptop-mode/conf.d/board-specific/*.conf is not readable, skipping.
2017-02-03T01:03:55.602883-08:00 INFO kernel: [  349.097216] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:03:55.602905-08:00 INFO kernel: [  349.097322] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T01:03:55.625975-08:00 INFO laptop-mode[4160]: Laptop mode 
2017-02-03T01:03:55.627602-08:00 INFO laptop-mode[4161]: enabled, not active
2017-02-03T01:03:56.859679-08:00 ERR mtpd[1815]: LIBMTP_Detect_Raw_Devices failed with 5
2017-02-03T01:04:25.535333-08:00 NOTICE check_ethernet.hook[4212]: Attempting recovery method "restart_connection_manager"
2017-02-03T01:04:25.614971-08:00 ERR wpa_supplicant[710]: rmdir[ctrl_interface=/var/run/wpa_supplicant]: Permission denied
2017-02-03T01:04:25.615875-08:00 INFO kernel: [  379.134824] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:04:25.781878-08:00 INFO kernel: [  379.300714] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:04:25.781897-08:00 INFO kernel: [  379.300831] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T01:04:25.797278-08:00 ERR shill[4227]: [ERROR:attribute_list.cc(244)] Trying to re-add attribute: 3
2017-02-03T01:04:25.799245-08:00 ERR shill[4227]: message repeated 26 times: [ [ERROR:attribute_list.cc(244)] Trying to re-add attribute: 3]
2017-02-03T01:04:25.811863-08:00 INFO kernel: [  379.331049] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-03T01:04:25.812851-08:00 INFO kernel: [  379.331824] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-03T01:04:25.838662-08:00 INFO avahi-daemon[1871]: Server startup complete. Host name is 6498f56730113673f739af9158944799.local. Local service cookie is 247169854.
2017-02-03T01:04:25.838693-08:00 INFO avahi-daemon[1871]: Changing host name to '6498f56730113673f739af9158944799'.
2017-02-03T01:04:25.844777-08:00 NOTICE cros-machine-id-regen[4251]: Regenerated /var/lib/dbus/machine-id (reason: network).
2017-02-03T01:04:25.981007-08:00 INFO kernel: [  379.500399] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-03T01:04:25.981897-08:00 INFO kernel: [  379.501026] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-03T01:04:26.010867-08:00 INFO kernel: [  379.529593] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
2017-02-03T01:04:55.716773-08:00 NOTICE check_ethernet.hook[4295]: Restart failed; will retry recovery for ~15 minutes
2017-02-03T01:05:25.748070-08:00 NOTICE check_ethernet.hook[4338]: Attempting recovery method "ensure_connection_manager_is_running"
2017-02-03T01:05:25.752857-08:00 NOTICE check_ethernet.hook[4342]: Attempting recovery method "toggle_ethernet_interfaces"
2017-02-03T01:05:25.763863-08:00 INFO kernel: [  439.331667] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:05:25.802890-08:00 INFO kernel: [  439.370159] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:05:25.802920-08:00 INFO kernel: [  439.370273] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T01:05:55.831681-08:00 NOTICE check_ethernet.hook[4393]: Attempting recovery method "reload_usb_ethernet_devices"
2017-02-03T01:05:55.837685-08:00 NOTICE /usr/sbin/reload_network_device[4402]: Restarting network interface eth0
2017-02-03T01:05:55.844996-08:00 INFO kernel: [  469.437217] usbcore: deregistering interface driver smsc95xx
2017-02-03T01:05:55.845009-08:00 INFO kernel: [  469.437270] smsc95xx 1-2.1:1.0 eth0: unregister 'smsc95xx' usb-0000:00:14.0-2.1, smsc95xx USB 2.0 Ethernet
2017-02-03T01:05:55.845011-08:00 INFO kernel: [  469.437297] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:05:55.855631-08:00 ERR shill[4227]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-03T01:05:55.855791-08:00 ERR shill[4227]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-03T01:05:56.865880-08:00 INFO kernel: [  470.458268] usb 1-2.1: authorized to connect
2017-02-03T01:05:56.869870-08:00 INFO kernel: [  470.462726] smsc95xx v1.0.4
2017-02-03T01:05:56.908707-08:00 INFO kernel: [  470.501157] smsc95xx 1-2.1:1.0 eth0: register 'smsc95xx' at usb-0000:00:14.0-2.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:15:88:24
2017-02-03T01:05:56.908727-08:00 INFO kernel: [  470.501349] usbcore: registered new interface driver smsc95xx
2017-02-03T01:05:56.924239-08:00 INFO laptop-mode[4425]: Warning: Configuration file /etc/laptop-mode/conf.d/board-specific/*.conf is not readable, skipping.
2017-02-03T01:05:56.960918-08:00 INFO kernel: [  470.553740] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:05:56.960927-08:00 INFO kernel: [  470.553853] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-03T01:05:56.986416-08:00 INFO laptop-mode[4449]: Laptop mode 
2017-02-03T01:05:56.988640-08:00 INFO laptop-mode[4450]: enabled, not active
2017-02-03T01:05:57.800323-08:00 ERR mtpd[1815]: LIBMTP_Detect_Raw_Devices failed with 5
2017-02-03T01:06:26.894219-08:00 NOTICE check_ethernet.hook[4503]: Attempting recovery method "restart_connection_manager"
2017-02-03T01:06:26.970964-08:00 ERR wpa_supplicant[710]: rmdir[ctrl_interface=/var/run/wpa_supplicant]: Permission denied
2017-02-03T01:06:26.971905-08:00 INFO kernel: [  500.588912] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:06:27.140964-08:00 INFO kernel: [  500.757855] smsc95xx 1-2.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-03T01:06:27.140988-08:00 INFO kernel: [  500.757967] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready

Cc: diand...@chromium.org
Doug has noticed this happen in a reboot stress test.  He said it happens once every 5000 reboots or so.

#33 happens on a Kevin, with the dongle on a USB Type C port.  Also smsc95xx driver.
@33: Let me get more logs next time it happens.  IIRC "lsusb" doesn't even show the smsc95xx on kevin.
Another one, on a samus.

https://pantheon.corp.google.com/storage/browser/chromeos-autotest-results/hosts/chromeos2-row3-rack10-host1/1357984-provision/20171702071849/sysinfo/var/log/

messages:

2017-02-17T14:27:49.236020+00:00 INFO kernel: [   56.289769] usb 1-1.1: new high-speed USB device number 6 using xhci_hcd
2017-02-17T14:27:49.320030+00:00 INFO kernel: [   56.373918] usb 1-1.1: New USB device found, idVendor=0424, idProduct=ec00
2017-02-17T14:27:49.320043+00:00 INFO kernel: [   56.373929] usb 1-1.1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
2017-02-17T14:27:49.337035+00:00 INFO kernel: [   56.390943] smsc95xx v1.0.4
2017-02-17T14:27:49.376601+00:00 INFO kernel: [   56.429316] smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-0000:00:14.0-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:15:86:c8
2017-02-17T14:27:49.376612+00:00 INFO kernel: [   56.429358] usbcore: registered new interface driver smsc95xx
2017-02-17T14:27:49.425029+00:00 INFO kernel: [   56.479047] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:27:49.425038+00:00 INFO kernel: [   56.479098] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
...
2017-02-17T14:32:48.504905+00:00 NOTICE check_ethernet.hook[4112]: Attempting recovery method "ensure_connection_manager_is_running"
2017-02-17T14:32:48.508907+00:00 NOTICE check_ethernet.hook[4116]: Attempting recovery method "toggle_ethernet_interfaces"
2017-02-17T14:32:48.516135+00:00 INFO kernel: [  356.676481] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:32:48.554143+00:00 INFO kernel: [  356.714344] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:32:48.554165+00:00 INFO kernel: [  356.714387] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-17T14:32:48.903968+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.DailyUseTime for reading: No such file or directory
2017-02-17T14:32:48.904454+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashInterval for reading: No such file or directory
2017-02-17T14:32:48.904692+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashInterval for reading: No such file or directory
2017-02-17T14:32:48.905231+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/daily.cycle for reading: No such file or directory
2017-02-17T14:32:48.905899+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.AnyCrashesDaily for reading: No such file or directory
2017-02-17T14:32:48.906235+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashesDaily for reading: No such file or directory
2017-02-17T14:32:48.906514+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashesDaily for reading: No such file or directory
2017-02-17T14:32:48.906778+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UncleanShutdownsDaily for reading: No such file or directory
2017-02-17T14:32:48.907201+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/weekly.cycle for reading: No such file or directory
2017-02-17T14:32:48.907429+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.AnyCrashesWeekly for reading: No such file or directory
2017-02-17T14:32:48.907694+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UserCrashesWeekly for reading: No such file or directory
2017-02-17T14:32:48.907957+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.KernelCrashesWeekly for reading: No such file or directory
2017-02-17T14:32:48.908254+00:00 WARNING metrics_daemon[2420]: [WARNING:persistent_integer.cc(79)] cannot open /var/lib/metrics/Platform.UncleanShutdownsWeekly for reading: No such file or directory
2017-02-17T14:33:18.580780+00:00 NOTICE check_ethernet.hook[4190]: Attempting recovery method "reload_usb_ethernet_devices"
2017-02-17T14:33:18.626051+00:00 NOTICE /usr/sbin/reload_network_device[4199]: Restarting network interface eth0
2017-02-17T14:33:18.644223+00:00 INFO kernel: [  386.827758] usbcore: deregistering interface driver smsc95xx
2017-02-17T14:33:18.644248+00:00 INFO kernel: [  386.827860] smsc95xx 1-1.1:1.0 eth0: unregister 'smsc95xx' usb-0000:00:14.0-1.1, smsc95xx USB 2.0 Ethernet
2017-02-17T14:33:18.644253+00:00 INFO kernel: [  386.827910] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:33:18.657895+00:00 ERR shill[1477]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-17T14:33:18.658058+00:00 ERR shill[1477]: [ERROR:device.cc(1274)] IP flag write failed: 1 to /proc/sys/net/ipv6/conf/eth0/disable_ipv6
2017-02-17T14:33:19.672154+00:00 INFO kernel: [  387.857506] usb 1-1.1: authorized to connect
2017-02-17T14:33:19.676158+00:00 INFO kernel: [  387.861734] smsc95xx v1.0.4
2017-02-17T14:33:19.714488+00:00 INFO kernel: [  387.900099] smsc95xx 1-1.1:1.0 eth0: register 'smsc95xx' at usb-0000:00:14.0-1.1, smsc95xx USB 2.0 Ethernet, 94:eb:2c:15:86:c8
2017-02-17T14:33:19.714515+00:00 INFO kernel: [  387.900304] usbcore: registered new interface driver smsc95xx
2017-02-17T14:33:19.798159+00:00 INFO kernel: [  387.983469] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:33:19.798183+00:00 INFO kernel: [  387.983523] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-17T14:33:21.198715+00:00 ERR mtpd[2383]: LIBMTP_Detect_Raw_Devices failed with 5
2017-02-17T14:33:49.700854+00:00 NOTICE check_ethernet.hook[4260]: Attempting recovery method "restart_connection_manager"

etc. etc.

finally:

2017-02-17T14:47:59.592121+00:00 INFO kernel: [ 1268.487796] smsc95xx 1-1.1:1.0 eth0: hardware isn't capable of remote wakeup
2017-02-17T14:47:59.592128+00:00 INFO kernel: [ 1268.487838] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-02-17T14:47:59.601660+00:00 ERR shill[6467]: [ERROR:attribute_list.cc(244)] Trying to re-add attribute: 3
2017-02-17T14:47:59.602232+00:00 ERR shill[6467]: message repeated 26 times: [ [ERROR:attribute_list.cc(244)] Trying to re-add attribute: 3]
2017-02-17T14:47:59.609163+00:00 INFO kernel: [ 1268.505376] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-17T14:47:59.610126+00:00 INFO kernel: [ 1268.506149] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-17T14:47:59.641086+00:00 INFO avahi-daemon[2465]: Server startup complete. Host name is 449056ae9b9f2b2007bd8e4258a70d1f.local. Local service cookie is 3771879074.
2017-02-17T14:47:59.641159+00:00 INFO avahi-daemon[2465]: Changing host name to '449056ae9b9f2b2007bd8e4258a70d1f'.
2017-02-17T14:47:59.647925+00:00 NOTICE cros-machine-id-regen[6491]: Regenerated /var/lib/dbus/machine-id (reason: network).
2017-02-17T14:47:59.793198+00:00 INFO kernel: [ 1268.689665] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-17T14:47:59.794137+00:00 INFO kernel: [ 1268.690473] iwlwifi 0000:01:00.0: L1 Enabled - LTR Enabled
2017-02-17T14:47:59.822151+00:00 INFO kernel: [ 1268.718115] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
2017-02-17T14:48:29.547655+00:00 NOTICE check_ethernet.hook[6555]: All efforts to recover ethernet have been exhausted. Rebooting.

That took a looong time though.


@33: So in my reboot stress tests (on kevin), I'm pretty sure that the problem was not the same as the one discussed here.  In my case it appears that an EC comms failure caused the extcon code to fail which then caused the USB port not to enumerate.

Said another way: smsc95xx seems to be working fine for me on my kevin reboots (kernel 4.4) so far.
Note that I have confirmed that my USB Ethernet issues on Kevin were not related to smsc95xx.  I've done tens of thousands of reboots on a farm of many machines and not seen any smsc95xx problems on kernel 4.4.
#38 thanks!  Also see comment #25: I haven't seen this happen on 3.18 or 4.4.
Since we aren't using smsc95xx in the test lab,  can we close this bug as obsolete? or is there still something necessary here?

If something is still necessary, please re-assign this bug to me with some instructions on what still needs to be done.
If we got rid of them then yes, it's fine to close it, thanks!
I believe the smsc95xx is what servo v2/v3 has/uses.  However it is not used in the lab for network connectivity.

A similiarish one is used the smsc7500 not sure if relevant.
Owner: ----
Status: Started
It was my impression that the dongles in a number of systems used the smsc95xx driver.  At least that's what the logs show.  (For instance see comment #4.)  It appears that those dongles get in a bad state too frequently, sometime even requiring a full power cycle to work normally.  Several earlier comments show exactly which DUTs were using those dongles.  If all those dongles are gone, we can close the bug, otherwise, I don't think so.

I am not working on it right now so I will leave it unassigned.
Latest occurrence was on a Caroline testlab DUT, which was down from 7/12 to 7/30.
Logs: https://storage.googleapis.com/chromiumos-test-logs/Caroline_Fail/debug-logs_20170731-135605.tgz

I see repeated shill restarts and a spew of device_set_wakeup_enable error. 

2017-07-31T06:02:18.540564-07:00 WARNING crash_reporter[10336]: Could not load the device policy file.
2017-07-31T06:02:18.540779-07:00 WARNING crash_reporter[10336]: [user] Received crash notification for shill[10309] sig 6, user 0 (developer build - not testing - always dumping)
2017-07-31T06:02:18.542481-07:00 INFO crash_reporter[10336]: State of crashed process [10309]: S (sleeping)
2017-07-31T06:02:18.543193-07:00 WARNING crash_reporter[10336]: Crash directory /var/spool/crash already full with 32 pending reports
2017-07-31T06:02:18.543250-07:00 ERR crash_reporter[10336]: Could not create crash directory
2017-07-31T06:02:18.543288-07:00 ERR crash_reporter[10336]: Unable to find/create process-specific crash path
2017-07-31T06:02:18.545886-07:00 WARNING kernel: [  563.815587] init: shill main process (10309) killed by ABRT signal
2017-07-31T06:02:18.545902-07:00 WARNING kernel: [  563.815616] init: shill respawning too fast, stopped
2017-07-31T06:02:18.645094-07:00 INFO kernel: [  563.915222] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
2017-07-31T06:02:18.645923-07:00 INFO kernel: [  563.916022] smsc75xx 1-2.3:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-07-31T06:02:27.863880-07:00 ERR shill[10363]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-07-31T06:02:27.900219-07:00 ERR shill[10363]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-07-31T06:02:27.901209-07:00 ERR shill[10363]: [ERROR:crypto_des_cbc.cc(103)] Unable to load key matter from /var/lib/whitelist/owner.key
2017-07-31T06:02:27.919888-07:00 WARNING kernel: [  573.189659] smsc75xx 1-2.3:1.0 eth0: device_set_wakeup_enable error -22


John, wasn't there another bug open to replace all Startech devices?
(Those are smsc75xx right?)

The Caroline mentioned in comment #44 is not likely due to a problem with smsc75xx *driver* since the following recover attempts should have cleared any driver issues:
grundler <1590>fgrep "All efforts to recover ethernet have been exhausted. Rebooting." messages
2017-07-31T06:15:22.173795-07:00 NOTICE check_ethernet.hook[23008]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T06:37:57.574826-07:00 NOTICE check_ethernet.hook[23326]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T07:00:23.477640-07:00 NOTICE check_ethernet.hook[22761]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T07:22:59.397459-07:00 NOTICE check_ethernet.hook[23360]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T07:45:34.194010-07:00 NOTICE check_ethernet.hook[22790]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T08:08:09.354806-07:00 NOTICE check_ethernet.hook[22888]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T08:30:44.415445-07:00 NOTICE check_ethernet.hook[23333]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T08:52:38.951401-07:00 NOTICE check_ethernet.hook[21843]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T09:15:14.440089-07:00 NOTICE check_ethernet.hook[23546]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T09:37:39.644493-07:00 NOTICE check_ethernet.hook[22430]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T10:00:15.471319-07:00 NOTICE check_ethernet.hook[23333]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T10:22:41.430158-07:00 NOTICE check_ethernet.hook[22444]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T10:45:07.325786-07:00 NOTICE check_ethernet.hook[22415]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T11:07:44.559204-07:00 NOTICE check_ethernet.hook[23000]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T11:30:10.417667-07:00 NOTICE check_ethernet.hook[22380]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T11:52:36.615073-07:00 NOTICE check_ethernet.hook[22285]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T12:14:00.982927-07:00 NOTICE check_ethernet.hook[21545]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T12:49:07.809473-07:00 NOTICE check_ethernet.hook[22781]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T13:11:33.101491-07:00 NOTICE check_ethernet.hook[22350]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T13:33:58.813817-07:00 NOTICE check_ethernet.hook[22277]: All efforts to recover ethernet have been exhausted. Rebooting.
2017-07-31T13:55:26.921225-07:00 NOTICE check_ethernet.hook[25233]: All efforts to recover ethernet have been exhausted. Rebooting.


I don't think the device firmware crashed either since each recovery method appears to get the smsc USB dongle to believe it's link is up:
2017-07-31T12:55:42.280172-07:00 INFO kernel: [  386.870795] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
2017-07-31T12:55:45.353379-07:00 INFO kernel: [  389.944397] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
2017-07-31T12:55:45.354209-07:00 INFO kernel: [  389.945048] smsc75xx 1-2.3:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-07-31T12:55:45.357183-07:00 INFO kernel: [  389.947926] smsc75xx 1-2.3:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-07-31T12:57:28.216181-07:00 INFO kernel: [  492.807274] smsc75xx 1-2.3:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1

I suspect something is hosed in the switch in this case and it might be better to reboot the switches or diagnose problems there (I can't : no access to them).


I have some changes pending (waiting lab testing) to check_ethernet.hook before I add one more method (use suspect/resume to drop power on USB ethernet) that should handle "USB FW crashed" case.

FTR, the change I hope to land soon:
    https://chromium-review.googlesource.com/c/chromiumos/platform/crostestutils/+/608612

Components: -Infra>Client>Chrome
No.  The only NICs we are replacing are the Plugable (ASIX) ones.

Which DUT have this issue?  I can take a poke at the switch but I suspect there's not much that can be done if it is a LB4 switch.
Project Member

Comment 48 by bugdroid1@chromium.org, Sep 16 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crostestutils/+/8845a1b6b120ad2850989f0fe4139a8761b06e00

commit 8845a1b6b120ad2850989f0fe4139a8761b06e00
Author: Grant Grundler <grundler@chromium.org>
Date: Fri Sep 15 23:59:23 2017

check_ethernet: add toggle_usb_interface() method

Add "toggle USB interface" method which uses "authorized" field which
controls if a USB port can have a driver bind to it. On a few platforms,
this also results in the port dropping power while "0".

BUG=chromium:677572
TEST=TBD

Change-Id: I7a27199539a7f6c94dfbc4f6af821ed76ba4d614
Reviewed-on: https://chromium-review.googlesource.com/608612
Commit-Ready: Grant Grundler <grundler@chromium.org>
Tested-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/8845a1b6b120ad2850989f0fe4139a8761b06e00/recover_duts/hooks/check_ethernet.hook

Project Member

Comment 49 by bugdroid1@chromium.org, Sep 27 2017

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crostestutils/+/59e3a67f3483e9354347649093f564e87e144817

commit 59e3a67f3483e9354347649093f564e87e144817
Author: Grant Grundler <grundler@google.com>
Date: Wed Sep 27 01:24:07 2017

check_ethernet: fix up nits in toggle_usb_interface() method

Recently added "toggle USB interface" method wasn't consistently
using that same pattern to find/test ethernet devices.

Add quotes around use of the complete path for "authorized" field.

These changes were supposed to land with patchset 9 but the CQ picked up
patchset 8 instead. :/
   https://chromium-review.googlesource.com/c/chromiumos/platform/crostestutils/+/608612/

BUG=chromium:677572
TEST=on local machine from console:
    touch /mnt/stateful_partition/.labmachine
    # disconnect ethernet cable
    bash -x /usr/local/bin/hooks/check_ethernet.hook
    # review output for all methods
    rm /mnt/stateful_partition/.labmachine

Change-Id: I4837c7e55676cb90b7a6d51dcefca1f51e5a243b
Reviewed-on: https://chromium-review.googlesource.com/674327
Commit-Ready: Grant Grundler <grundler@chromium.org>
Tested-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/59e3a67f3483e9354347649093f564e87e144817/recover_duts/hooks/check_ethernet.hook

Blockedon: 758781
This is still happening.  I noticed this when testing my quick provision code and was provisioning a nyan_blaze chromeos4-row7-rack9-host13 running nyan_blaze-release/R64-10145.0.0.

dmesg:
[  966.224874] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
[  966.282324] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
[  966.283179] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
[  966.350408] smsc75xx 1-2.2:1.0 eth0: device_set_wakeup_enable error -22
[  969.349458] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
[  969.361689] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
 
uptime -s and quick-provision logs:
localhost log # uptime -s
2017-11-23 19:07:46
localhost log # tail /var/log/quick-provision.log
  Server: CherryPy/3.2.2
  Last-Modified: Thu, 23 Nov 2017 01:31:52 GMT
  Date: Thu, 23 Nov 2017 19:21:41 GMT
  Content-Type: application/x-gtar-compressed
Length: 386227674 (368M) [application/x-gtar-compressed]
Saving to: 'STDOUT'

     0K ........ ........ ........ ........  8% 15.1M 22s
 32768K ........ ........ ........ ........ 17% 12.4M 22s
 65536K ........ ....localhost log #

Transfer appears to stop at:
localhost log # stat /var/log/quick-provision.log
  File: '/var/log/quick-provision.log'
  Size: 4719      	Blocks: 16         IO Block: 4096   regular file
Device: fe01h/65025d	Inode: 201         Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2017-11-23 19:20:59.229999821 +0000
Modify: 2017-11-23 19:21:47.059999809 +0000
Change: 2017-11-23 19:21:47.059999809 +0000

messages -- it's not clear to me why Attempting recovery method "ensure_connection_manager_is_running" is triggering.
2017-11-23T19:21:41.077764+00:00 NOTICE quick-provision[4646]: Updated status: DUT: Stateful update
2017-11-23T19:22:39.296791+00:00 INFO kernel: [  890.098542] mwifiex_sdio mmc1:0001:1: info: scan: num_probes = 4
2017-11-23T19:22:42.703177+00:00 ERR wpa_supplicant[854]: dbus: wpa_dbus_get_object_properties: failed to get object properties: (org.freedesktop.DBus.Error.Failed) failed to parse RSN IE
2017-11-23T19:22:42.703187+00:00 ERR wpa_supplicant[854]: dbus: Failed to construct signal
2017-11-23T19:23:42.716057+00:00 INFO kernel: [  953.268775] mwifiex_sdio mmc1:0001:1: info: scan: num_probes = 4
2017-11-23T19:23:55.590826+00:00 NOTICE check_ethernet.hook[4845]: Attempting recovery method "ensure_connection_manager_is_running"
2017-11-23T19:23:55.601782+00:00 NOTICE check_ethernet.hook[4849]: Attempting recovery method "toggle_ethernet_interfaces"
2017-11-23T19:23:55.633080+00:00 INFO avahi-daemon[1867]: Interface eth0.IPv4 no longer relevant for mDNS.
2017-11-23T19:23:55.633099+00:00 INFO avahi-daemon[1867]: Leaving mDNS multicast group on interface eth0.IPv4 with address 100.115.198.95.
2017-11-23T19:23:55.646193+00:00 ERR dhcpcd[1684]: eth0: dhcp if_readrawpacket: Network is down
2017-11-23T19:23:55.646604+00:00 INFO avahi-daemon[1867]: Withdrawing address record for 100.115.198.95 on eth0.
2017-11-23T19:23:55.719285+00:00 INFO avahi-daemon[1867]: Joining mDNS multicast group on interface eth0.IPv4 with address 100.115.198.95.
2017-11-23T19:23:55.719314+00:00 INFO avahi-daemon[1867]: New relevant interface eth0.IPv4 for mDNS.
2017-11-23T19:23:55.719328+00:00 INFO avahi-daemon[1867]: Registering new address record for 100.115.198.95 on eth0.IPv4.
2017-11-23T19:23:55.725918+00:00 INFO kernel: [  966.224874] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
2017-11-23T19:23:55.785941+00:00 INFO kernel: [  966.282324] smsc95xx 1-1.1:1.0 eth1: hardware isn't capable of remote wakeup
2017-11-23T19:23:55.785967+00:00 INFO kernel: [  966.283179] IPv6: ADDRCONF(NETDEV_UP): eth1: link is not ready
2017-11-23T19:23:55.786126+00:00 INFO avahi-daemon[1867]: Withdrawing address record for 100.115.198.95 on eth0.
2017-11-23T19:23:55.786258+00:00 INFO avahi-daemon[1867]: Leaving mDNS multicast group on interface eth0.IPv4 with address 100.115.198.95.
2017-11-23T19:23:55.786394+00:00 INFO avahi-daemon[1867]: Interface eth0.IPv4 no longer relevant for mDNS.
2017-11-23T19:23:55.856136+00:00 WARNING kernel: [  966.350408] smsc75xx 1-2.2:1.0 eth0: device_set_wakeup_enable error -22
2017-11-23T19:23:56.135522+00:00 INFO avahi-daemon[1867]: Server startup complete. Host name is 197c0681f527639c7502e0425a17204b.local. Local service cookie is 4054530940.
2017-11-23T19:23:56.135544+00:00 INFO avahi-daemon[1867]: Changing host name to '197c0681f527639c7502e0425a17204b'.
2017-11-23T19:23:56.141949+00:00 NOTICE cros-machine-id-regen[4891]: Regenerated /var/lib/dbus/machine-id (reason: network).
2017-11-23T19:23:58.866535+00:00 INFO kernel: [  969.349458] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-11-23T19:23:58.876170+00:00 INFO kernel: [  969.361689] smsc75xx 1-2.2:1.0 eth0: link up, 1000Mbps, full-duplex, lpa 0xC1E1
2017-11-23T19:24:08.504959+00:00 INFO avahi-daemon[1867]: Joining mDNS multicast group on interface eth0.IPv4 with address 100.115.198.95.
2017-11-23T19:24:08.507129+00:00 INFO avahi-daemon[1867]: New relevant interface eth0.IPv4 for mDNS.
2017-11-23T19:24:08.507152+00:00 INFO avahi-daemon[1867]: Registering new address record for 100.115.198.95 on eth0.IPv4.
2017-11-23T19:24:14.423803+00:00 INFO sshd[4913]: Accepted publickey for root from 100.115.245.201 port 44208 ssh2: RSA SHA256:Fp1qWjFLyK1cTpiI5rdk7iEJwoK9lcnYAgbQtGC3jzU
2017-11-23T19:24:14.544530+00:00 INFO sshd[4913]: Received disconnect from 100.115.245.201 port 44208:11: disconnected by user
2017-11-23T19:24:14.544600+00:00 INFO sshd[4913]: Disconnected from user root 100.115.245.201 port 44208s

net.log:
2017-11-23T19:21:35.856434+00:00 INFO shill[1354]: [INFO:wifi.cc(324)] Scan on mlan0 from ScanTimerHandler
2017-11-23T19:21:39.269629+00:00 INFO shill[1354]: [INFO:wifi.cc(391)] ScanDone
2017-11-23T19:22:39.280804+00:00 INFO shill[1354]: [INFO:wifi.cc(324)] Scan on mlan0 from ScanTimerHandler
2017-11-23T19:22:42.703160+00:00 NOTICE wpa_supplicant[854]: dbus: fill_dict_with_properties dbus_interface=fi.w1.wpa_supplicant1.BSS dbus_property=RSN getter failed
2017-11-23T19:22:42.703177+00:00 ERR wpa_supplicant[854]: dbus: wpa_dbus_get_object_properties: failed to get object properties: (org.freedesktop.DBus.Error.Failed) failed to parse RSN IE
2017-11-23T19:22:42.703187+00:00 ERR wpa_supplicant[854]: dbus: Failed to construct signal
2017-11-23T19:22:42.705059+00:00 INFO shill[1354]: [INFO:wifi.cc(391)] ScanDone
2017-11-23T19:23:42.706615+00:00 INFO shill[1354]: [INFO:wifi.cc(324)] Scan on mlan0 from ScanTimerHandler
2017-11-23T19:23:46.143866+00:00 INFO shill[1354]: [INFO:wifi.cc(391)] ScanDone
2017-11-23T19:23:46.145212+00:00 INFO shill[1354]: [INFO:wifi_service.cc(783)] Rep ep updated for 9. [SSID=GIN-2g], bssid: 94:b4:0f:ff:d1:d3, sig: -80, sec: rsn, freq: 5660
2017-11-23T19:23:46.150586+00:00 INFO shill[1354]: [INFO:service.cc(296)] Service 32 destroyed.
2017-11-23T19:23:46.157583+00:00 INFO shill[1354]: [INFO:service.cc(312)] Suppressed autoconnect to service 8 (no carrier)
2017-11-23T19:23:55.646193+00:00 ERR dhcpcd[1684]: eth0: dhcp if_readrawpacket: Network is down
2017-11-23T19:23:55.721694+00:00 INFO dhcpcd[1684]: eth0: releasing lease of 100.115.198.95
2017-11-23T19:23:55.721772+00:00 INFO dhcpcd[1684]: eth0: sending RELEASE with xid 0xe51c5795
2017-11-23T19:23:55.722290+00:00 INFO dhcpcd[1684]: status changed to Release
2017-11-23T19:23:55.723013+00:00 INFO shill[1354]: [INFO:dhcp_config.cc(210)] Stopping 1684 (ReleaseIP)
2017-11-23T19:23:55.723110+00:00 INFO dhcpcd[1684]: received SIGTERM, stopping
2017-11-23T19:23:55.723125+00:00 INFO dhcpcd[1684]: eth0: removing interface
2017-11-23T19:23:55.756012+00:00 INFO dhcpcd[1684]: status changed to Release
2017-11-23T19:23:55.756401+00:00 INFO dhcpcd[1684]: dhcpcd exited
2017-11-23T19:23:55.785409+00:00 INFO shill[1354]: [INFO:service.cc(398)] Service 1: state Online -> Idle
2017-11-23T19:23:55.785529+00:00 INFO shill[1354]: [INFO:manager.cc(1459)] Service 1 updated; state: Idle failure Unknown
2017-11-23T19:23:55.796304+00:00 INFO shill[1354]: [INFO:manager.cc(1459)] Service 1 updated; state: Idle failure Unknown
2017-11-23T19:23:55.847497+00:00 INFO shill[1354]: [INFO:manager.cc(1459)] Service 1 updated; state: Idle failure Unknown
2017-11-23T19:23:55.848213+00:00 WARNING shill[1354]: [WARNING:ethernet.cc(467)] Failed to enable wake-on-lan: Invalid argument.
2017-11-23T19:23:55.921375+00:00 INFO shill[1354]: [INFO:manager.cc(1703)] Default physical service: 1 (not connected)
2017-11-23T19:23:55.922571+00:00 INFO shill[1354]: [INFO:service.cc(303)] Auto-connecting to service 1
2017-11-23T19:23:55.922944+00:00 INFO shill[1354]: [INFO:service.cc(319)] Connect to service 1: AutoConnect
2017-11-23T19:23:55.934640+00:00 INFO shill[1354]: [INFO:dhcp_config.cc(203)] Spawned /sbin/dhcpcd with pid: 4869
2017-11-23T19:23:55.935247+00:00 INFO shill[1354]: [INFO:service.cc(398)] Service 1: state Idle -> Configuring
2017-11-23T19:23:55.935667+00:00 INFO shill[1354]: [INFO:manager.cc(1459)] Service 1 updated; state: Configuring failure Unknown
2017-11-23T19:23:55.936906+00:00 INFO shill[1354]: [INFO:service.cc(312)] Suppressed autoconnect to service 8 (no carrier)
2017-11-23T19:23:55.940119+00:00 INFO shill[1354]: [INFO:service.cc(312)] Suppressed autoconnect to service 1 (connecting)
2017-11-23T19:23:55.940240+00:00 INFO shill[1354]: [INFO:service.cc(312)] Suppressed autoconnect to service 8 (no carrier)
2017-11-23T19:23:55.944620+00:00 NOTICE dhcpcd[4869]: dhcpcd is running with reduced privileges
2017-11-23T19:23:55.950895+00:00 INFO dhcpcd[4869]: status changed to Init
2017-11-23T19:23:55.951305+00:00 INFO shill[1354]: [INFO:dhcp_config.cc(145)] Init DHCP Proxy: eth0 at :1.36
2017-11-23T19:23:56.207165+00:00 INFO dhcpcd[4869]: status changed to Discover
2017-11-23T19:23:56.207303+00:00 INFO dhcpcd[4869]: eth0: soliciting a DHCP lease
2017-11-23T19:23:56.207373+00:00 INFO dhcpcd[4869]: eth0: sending DISCOVER (xid 0xc1c679e3), next in 4.5 seconds
2017-11-23T19:24:00.713390+00:00 INFO dhcpcd[4869]: eth0: sending DISCOVER (xid 0xc1c679e3), next in 7.8 seconds
2017-11-23T19:24:08.495320+00:00 INFO dhcpcd[4869]: eth0: sending DISCOVER (xid 0xc1c679e3), next in 15.8 seconds
2017-11-23T19:24:08.497582+00:00 INFO dhcpcd[4869]: eth0: received OFFER with xid 0xc1c679e3
2017-11-23T19:24:08.497673+00:00 INFO dhcpcd[4869]: eth0: offered 100.115.198.95 from 172.24.98.18
2017-11-23T19:24:08.497719+00:00 INFO dhcpcd[4869]: eth0: requesting lease of 100.115.198.95
2017-11-23T19:24:08.497756+00:00 INFO dhcpcd[4869]: status changed to Request


Project Member

Comment 52 by bugdroid1@chromium.org, Dec 8

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/6d0797fc1b00ac1de59c77f58b7dda92be9cec46

commit 6d0797fc1b00ac1de59c77f58b7dda92be9cec46
Author: Grant Grundler <grundler@google.com>
Date: Fri Dec 08 02:43:35 2017

autotest: sys_power: flock PAUSE_ETHERNETHOOK_FILE

PAUSE_ETHERNETHOOK_FILE is created and locked so check_ethernet.hook
knows callers of do_suspend will be bringing down the network link.

The idea is to "pause" check_ethernet.hook from taking any action on
the DUT until the flock is released. This code depends on the flock
being released when when the process exits.

BUG=chromium:677572
TEST=no change in behavior of this test.

Change-Id: I9e2643f5821a9779fe6c89307def44dc7b5d62c3
Reviewed-on: https://chromium-review.googlesource.com/653661
Commit-Ready: ChromeOS CL Exonerator Bot <chromiumos-cl-exonerator@appspot.gserviceaccount.com>
Tested-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Ravi Chandra Sadineni <ravisadineni@chromium.org>

[modify] https://crrev.com/6d0797fc1b00ac1de59c77f58b7dda92be9cec46/client/cros/sys_power.py

Looks like power_SuspendStress.bare started to fail after #52.

Maybe the assignment "pause_ethernet_fd = open..." in do_suspend made python think it's a local var? We might need to declare "global pause_ethernet_fd".
https://docs.python.org/2.7/faq/programming.html#what-are-the-rules-for-local-and-global-variables-in-python

12/11 17:44:06.651 DEBUG|             utils:0212| Running 'logger "autotest finished iteration /usr/local/autotest/results/default/power_SuspendStress.bare/sysinfo/iteration.1"'
12/11 17:44:06.655 DEBUG|              test:0420| after_iteration_hooks completed
12/11 17:44:06.656 WARNI|              test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/common_lib/test.py", line 631, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/common_lib/test.py", line 837, in _call_test_function
    raise error.UnhandledTestFail(e)
UnhandledTestFail: Unhandled UnboundLocalError: local variable 'pause_ethernet_fd' referenced before assignment
Traceback (most recent call last):
  File "/usr/local/autotest/common_lib/test.py", line 831, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/common_lib/test.py", line 495, in execute
    dargs)
  File "/usr/local/autotest/common_lib/test.py", line 362, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/common_lib/test.py", line 400, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/tests/power_SuspendStress/power_SuspendStress.py", line 92, in run_once
    self._suspender.suspend(random.randint(0, 3) + self._min_suspend)
  File "/usr/local/autotest/cros/power_suspend.py", line 520, in suspend
    alarm = self._suspend(duration + board_delay)
  File "/usr/local/autotest/cros/sys_power.py", line 147, in do_suspend
    if pause_ethernet_fd == 0:
UnboundLocalError: local variable 'pause_ethernet_fd' referenced before assignment

Project Member

Comment 55 by bugdroid1@chromium.org, Dec 11

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crostestutils/+/a27c77e841450b5863797fedbd36e4c1a527cc6c

commit a27c77e841450b5863797fedbd36e4c1a527cc6c
Author: Grant Grundler <grundler@google.com>
Date: Mon Dec 11 23:16:23 2017

check_ethernet: add PAUSE_FILE/TIME check

power_SuspendStress test will bring down the lab ethernet repeatedly
and often. This allows power_SuspendStress to communicate how long
it thinks it will be bouncing the link and thus can tell this script
when the test has completed (and should continue checking link state).

CQ-DEPEND=CL:653661
BUG=chromium:677572
TEST=manually invoke on local machine with "bash -x" to review flow.

Change-Id: I572074a4b5affd4a0079995356983b42a68c005c
Reviewed-on: https://chromium-review.googlesource.com/653662
Commit-Ready: Grant Grundler <grundler@chromium.org>
Tested-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/a27c77e841450b5863797fedbd36e4c1a527cc6c/recover_duts/hooks/check_ethernet.hook

Project Member

Comment 56 by bugdroid1@chromium.org, Dec 13

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/abab39832a1550ab1294bdcad58c40b7f0ab622b

commit abab39832a1550ab1294bdcad58c40b7f0ab622b
Author: Ben Zhang <benzh@chromium.org>
Date: Wed Dec 13 00:21:43 2017

autotest: sys_power: define pause_ethernet_fd as global

The assignment "pause_ethernet_fd = open..." in do_suspend made python
think it's a local var, and causing:
UnboundLocalError: local variable 'pause_ethernet_fd' referenced before assignment

BUG=chromium:677572
TEST=Run power_SuspendStress.bare on Fizz without error

Change-Id: I4c7c3b859bf44788dac1efc41b0ff0744ceb009b
Signed-off-by: Ben Zhang <benzh@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/820895
Commit-Ready: Daniel Kurtz <djkurtz@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/abab39832a1550ab1294bdcad58c40b7f0ab622b/client/cros/power/sys_power.py

Project Member

Comment 57 by bugdroid1@chromium.org, Jan 9

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/platform/crostestutils/+/1623cb9b768e260825f4ca0cd3c2e0033911311f

commit 1623cb9b768e260825f4ca0cd3c2e0033911311f
Author: Grant Grundler <grundler@google.com>
Date: Tue Jan 09 22:14:27 2018

recover_duts: increase polling frequency

This is an architectural change: the "hook" will check
it's state and run exactly once if it's allowed.
Thus we need to check more often.

90 seconds is about 2x longer than it should ever take to have a
working network (wired ethernet).  My experience is a USB NIC will
be up and working within 20 seconds of power on. But on a bad day it
might take 15-20 seconds to load the NIC driver module and then dhcpcd
could timeout after 30 seconds. I don't know if shill is retrying
anything and I don't expect that is matters since dhcpcd retries.

BUG=chromium:677572
TEST=Copy recovery_duts to DUT and reboot.
     Manually verify hook is invoked every minute after
     LONG_REBOOT_DELAY has expired.

Change-Id: I3da58c2378269ff1e10fb7e024dd2d4ec559eeb3
Reviewed-on: https://chromium-review.googlesource.com/704256
Commit-Ready: Grant Grundler <grundler@chromium.org>
Tested-by: Grant Grundler <grundler@chromium.org>
Reviewed-by: Richard Barnette <jrbarnette@google.com>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/1623cb9b768e260825f4ca0cd3c2e0033911311f/recover_duts/recover_duts.py

Sign in to add a comment