
Issue 866181 link

Starred by 1 user

Issue metadata

Status: Assigned
Owner:
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 3
Type: Bug

Blocked on:
issue 870042




network_WiFi_MaskedBSSID.wifi_masked_bssid test failing with "SSID CrOS_Masked{0,1} is not in scan results" error

Project Member Reported by jmuppala@chromium.org, Jul 20

Issue description

Logs:
https://stainless.corp.google.com/search?view=matrix&row=build&col=model&first_date=2018-05-31&last_date=2018-06-27&test=network_WiFi_MaskedBSSID.wifi_masked_bssid&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false

Sample failure:
06/24 21:11:12.621 WARNI|              test:0637| The test failed with the following exception
Traceback (most recent call last):
  File "/usr/local/autotest/client/common_lib/test.py", line 631, in _exec
    _call_test_function(self.execute, *p_args, **p_dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 831, in _call_test_function
    return func(*args, **dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 495, in execute
    dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 362, in _call_run_once_with_retry
    postprocess_profiled_run, args, dargs)
  File "/usr/local/autotest/client/common_lib/test.py", line 400, in _call_run_once
    self.run_once(*args, **dargs)
  File "/usr/local/autotest/server/site_tests/network_WiFi_MaskedBSSID/network_WiFi_MaskedBSSID.py", line 42, in run_once
    [config.ssid for config in configurations])
  File "/usr/local/autotest/server/cros/network/wifi_client.py", line 586, in scan
    self.assert_bsses_include_ssids(bss_list, ssids)
  File "/usr/local/autotest/server/cros/network/wifi_client.py", line 280, in assert_bsses_include_ssids
    (ssid, found_bsses))
TestFail: SSID CrOS_Masked1 is not in scan results: [IwBss(bss='00:11:22:33:44:55', frequency=2412, ssid='CrOS_Masked0', security='open', ht=None, signal=-35.0)]
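
For readers unfamiliar with this check, here is a minimal sketch of what the failing assertion appears to do, judging only from the traceback and the message above. The function name and the IwBss field names come from the log; the function body and the TestFail stand-in class are assumptions, not the actual autotest code.

```python
import collections

# Field names copied from the IwBss(...) repr in the failure message.
IwBss = collections.namedtuple(
    'IwBss', ['bss', 'frequency', 'ssid', 'security', 'ht', 'signal'])


class TestFail(Exception):
    """Stand-in for autotest's TestFail, defined here to stay self-contained."""


def assert_bsses_include_ssids(found_bsses, expected_ssids):
    """Raise TestFail if any expected SSID is absent from the scan results.

    Assumed logic: the real implementation may differ in detail.
    """
    found_ssids = {bss.ssid for bss in found_bsses}
    for ssid in expected_ssids:
        if ssid not in found_ssids:
            raise TestFail('SSID %s is not in scan results: %r' %
                           (ssid, found_bsses))


# The single BSS reported in the sample failure.
scan_results = [IwBss(bss='00:11:22:33:44:55', frequency=2412,
                      ssid='CrOS_Masked0', security='open', ht=None,
                      signal=-35.0)]

try:
    assert_bsses_include_ssids(scan_results,
                               ['CrOS_Masked0', 'CrOS_Masked1'])
except TestFail as e:
    print(e)
```

Because both BSSes share the BSSID 00:11:22:33:44:55, the check has to key on the SSID rather than the BSSID, which is why a missing second probe response shows up as a missing SSID.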


Devices seeing this issue:
The following devices always fail this test case:
astronaut -> chromeos15-row1-rack2-host6
gandof -> chromeos15-row1-rack8-host1 
lulu -> chromeos15-row1-rack5-host4 

and some devices fail intermittently:
edgar -> chromeos15-row1-rack10-host2
orco -> chromeos15-row1-rack9-host2
wizpig -> chromeos15-row1-rack6-host2


 
Cc: briannorris@chromium.org kirtika@chromium.org
Here is the difference I see in the hostapd logs from astronaut where it is failing vs. asuka where this test is passing. Does this mean that the AP is not being configured properly?


ASTRONAUT: https://storage.cloud.google.com/chromeos-autotest-results/212296961-chromeos-test/chromeos15-row1-rack2-host6/network_WiFi_MaskedBSSID/debug/hostapd_router_1_managed1.log

530145667.554605: nl80211: Ignored event (cmd=60) for foreign interface (ifindex 3614 wdev 0x0)
1530145683.864660: nl80211: Event message available
1530145683.864714: nl80211: Ignored event (cmd=60) for foreign interface (ifindex 3614 wdev 0x0)
1530145683.892513: VLAN: RTM_NEWLINK: ifi_index=3614 ifname=managed0 ifi_family=0 ifi_flags=0x1003 ([UP])
1530145683.892552: VLAN: vlan_newlink(managed0)
1530145683.892589: nl80211: Ignore RTM_NEWLINK event for foreign ifindex 3614
1530145683.987518: VLAN: RTM_NEWLINK: ifi_index=3614 ifname=managed0 ifi_family=0 ifi_flags=0x1002 ()
1530145683.987594: VLAN: vlan_newlink(managed0)
1530145683.987750: nl80211: Ignore RTM_NEWLINK event for foreign ifindex 3614


DEBUG logs:
06/27 17:27:47.607 INFO | site_linux_router:0480| AP configured.
06/27 17:27:47.612 DEBUG|          ssh_host:0301| Running (ssh) '(time -p /usr/sbin/iw dev wlan0 scan freq 2412 ssid "CrOS_Masked0" "CrOS_Masked1") 2>&1' from 'poll_for_condition|<lambda>|scan|timed_scan|run|run_very_slowly'
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] BSS 00:11:22:33:44:55(on wlan0)
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	last seen: 14716.517s [boottime]
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	TSF: 12260245 usec (0d, 00:00:12)
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	freq: 2412
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	beacon interval: 100 TUs
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	capability: ESS (0x0001)
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	signal: -37.00 dBm
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	last seen: 76 ms ago
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	Information elements from Probe Response frame:
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	SSID: CrOS_Masked0
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	Supported rates: 1.0* 2.0* 5.5 11.0 
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	DS Parameter set: channel 1
06/27 17:27:48.073 DEBUG|             utils:0286| [stdout] 	Extended capabilities: Extended Channel Switching, SSID List, 6
06/27 17:27:48.115 DEBUG|             utils:0286| [stdout] real 0.11
06/27 17:27:48.115 DEBUG|             utils:0286| [stdout] user 0.00
06/27 17:27:48.115 DEBUG|             utils:0286| [stdout] sys 0.00
06/27 17:27:48.117 DEBUG|              test:0410| Test failed due to SSID CrOS_Masked1 is not in scan results: [IwBss(bss='00:11:22:33:44:55', frequency=2412, ssid='CrOS_Masked0', security='open', ht=None, signal=-37.0)]. Exception log follows the after_iteration_hooks.







ASUKA: https://storage.cloud.google.com/chromeos-autotest-results/212294834-chromeos-test/chromeos15-row1-rack8-host5/network_WiFi_MaskedBSSID/debug/hostapd_router_1_managed1.log

1530128355.327664: nl80211: BSS Event 59 (NL80211_CMD_FRAME) received for managed1
1530128355.327719: nl80211: MLME event 59 (NL80211_CMD_FRAME) on managed1(00:11:22:33:44:55) A1=ff:ff:ff:ff:ff:ff A2=00:16:eb:4c:2e:99
1530128355.327768: nl80211: MLME event frame - hexdump(len=82): 40 00 00 00 ff ff ff ff ff ff 00 16 eb 4c 2e 99 ff ff ff ff ff ff e0 93 00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 30 01 08 02 04 0b 16 0c 12 18 24 32 04 30 48 60 6c 2d 1a ef 11 17 ff ff 00 00 00 00 00 00 00 00 2c 01 01 00 00 00 00 00 00 00 00 00 00
1530128355.327927: nl80211: Frame event
1530128355.327956: nl80211: RX frame sa=00:16:eb:4c:2e:99 freq=2412 ssi_signal=-82 fc=0x40 seq_ctrl=0x93e0 stype=4 (WLAN_FC_STYPE_PROBE_REQ) len=82
1530128355.330602: nl80211: Event message available
1530128355.330676: nl80211: BSS Event 59 (NL80211_CMD_FRAME) received for managed1
1530128355.330709: nl80211: MLME event 59 (NL80211_CMD_FRAME) on managed1(00:11:22:33:44:55) A1=ff:ff:ff:ff:ff:ff A2=00:16:eb:4c:2e:99
1530128355.330751: nl80211: MLME event frame - hexdump(len=82): 40 00 00 00 ff ff ff ff ff ff 00 16 eb 4c 2e 99 ff ff ff ff ff ff f0 93 00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 31 01 08 02 04 0b 16 0c 12 18 24 32 04 30 48 60 6c 2d 1a ef 11 17 ff ff 00 00 00 00 00 00 00 00 2c 01 01 00 00 00 00 00 00 00 00 00 00
1530128355.330904: nl80211: Frame event
1530128355.330931: nl80211: RX frame sa=00:16:eb:4c:2e:99 freq=2412 ssi_signal=-80 fc=0x40 seq_ctrl=0x93f0 stype=4 (WLAN_FC_STYPE_PROBE_REQ) len=82
1530128355.331037: 1530128355.331044: managed1: STA 00:16:eb:4c:2e:99 IEEE 802.11: received uni-cast proberequest for SSID:CrOS_Masked1 with frame_rssi:-80 BSSID:ff:ff:ff:ff:ff:ff mode:ht
1530128355.331201: nl80211: send_mlme - da= 00:16:eb:4c:2e:99 noack=0 freq=0 no_cck=0 offchanok=0 wait_time=0 fc=0x50 (WLAN_FC_STYPE_PROBE_RESP) nlmode=3
1530128355.331245: nl80211: send_mlme -> send_frame
1530128355.331269: nl80211: send_frame - Use bss->freq=2412
1530128355.331294: nl80211: send_frame -> send_frame_cmd
1530128355.331318: nl80211: CMD_FRAME freq=2412 wait=0 no_cck=0 no_ack=0 offchanok=0
1530128355.331345: CMD_FRAME - hexdump(len=69): 50 00 00 00 00 16 eb 4c 2e 99 00 11 22 33 44 55 00 11 22 33 44 55 00 00 00 00 00 00 00 00 00 00 64 00 01 00 00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 31 01 04 82 84 0b 16 03 01 01 7f 08 04 00 00 02 00 00 00 40
1530128355.331828: nl80211: Frame TX command accepted; cookie 0xf3
1530128355.331871: client monitor: update failed for station 00:16:eb:4c:2e:99
1530128355.331954: nl80211: Event message available
1530128355.332010: nl80211: Ignored event (cmd=60) for foreign interface (ifindex 8181 wdev 0x0)
1530128355.334511: nl80211: Event message available
1530128355.334624: nl80211: Drv Event 60 (NL80211_CMD_FRAME_TX_STATUS) received for managed1
1530128355.334676: nl80211: MLME event 60 (NL80211_CMD_FRAME_TX_STATUS) on managed1(00:11:22:33:44:55) A1=00:16:eb:4c:2e:99 A2=00:11:22:33:44:55
1530128355.334743: nl80211: MLME event frame - hexdump(len=69): 50 00 00 00 00 16 eb 4c 2e 99 00 11 22 33 44 55 00 11 22 33 44 55 00 00 00 00 00 00 00 00 00 00 64 00 01 00 00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 31 01 04 82 84 0b 16 03 01 01 7f 08 04 00 00 02 00 00 00 40
1530128355.334959: nl80211: Frame TX status event
1530128355.335018: managed1: Event TX_STATUS (17) received
1530128355.968603: nl80211: Event message available
1530128355.968670: nl80211: Ignored event (cmd=60) for foreign interface (ifindex 8181 wdev 0x0)
1530128355.994514: VLAN: RTM_NEWLINK: ifi_index=8181 ifname=managed0 ifi_family=0 ifi_flags=0x1003 ([UP])
1530128355.994550: VLAN: vlan_newlink(managed0)
1530128355.994580: nl80211: Ignore RTM_NEWLINK event for foreign ifindex 8181
1530128356.121503: VLAN: RTM_NEWLINK: ifi_index=8181 ifname=managed0 ifi_family=0 ifi_flags=0x1002 ()
1530128356.121582: VLAN: vlan_newlink(managed0)
1530128356.121644: nl80211: Ignore RTM_NEWLINK event for foreign ifindex 8181
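
The hexdumps in the hostapd log can be decoded by hand to confirm which SSID each probe request asks for. The following is a rough sketch, not part of autotest or hostapd; the helper name is hypothetical, and it handles only probe requests (probe responses carry 12 bytes of fixed fields before the information elements). The input strings are the first 38 bytes of the two probe request hexdumps above.

```python
def parse_probe_request_ssid(frame_hex):
    """Return the SSID carried in a probe request hexdump, or None."""
    frame = bytes.fromhex(frame_hex)
    # The subtype is the upper nibble of the first frame-control byte;
    # 4 means probe request (0x40 in the log).
    if (frame[0] >> 4) != 4:
        raise ValueError('not a probe request')
    # Management header: frame control (2), duration (2),
    # three addresses (6 each), sequence control (2) = 24 bytes.
    pos = 24
    # Information elements follow as (id, length, body) triples.
    while pos + 2 <= len(frame):
        element_id, length = frame[pos], frame[pos + 1]
        body = frame[pos + 2:pos + 2 + length]
        if element_id == 0:  # Element ID 0 is the SSID element.
            return body.decode('utf-8', errors='replace')
        pos += 2 + length
    return None


PROBE_REQ_A = ('40 00 00 00 ff ff ff ff ff ff 00 16 eb 4c 2e 99 '
               'ff ff ff ff ff ff e0 93 '
               '00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 30')
PROBE_REQ_B = ('40 00 00 00 ff ff ff ff ff ff 00 16 eb 4c 2e 99 '
               'ff ff ff ff ff ff f0 93 '
               '00 0c 43 72 4f 53 5f 4d 61 73 6b 65 64 31')

print(parse_probe_request_ssid(PROBE_REQ_A))  # CrOS_Masked0
print(parse_probe_request_ssid(PROBE_REQ_B))  # CrOS_Masked1
```

In the passing log, the second request (for CrOS_Masked1) is the one managed1 answers with a probe response.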


DEBUG logs:
06/27 12:39:15.466 INFO | site_linux_router:0480| AP configured.
06/27 12:39:15.479 DEBUG|          ssh_host:0301| Running (ssh) '(time -p /usr/sbin/iw dev wlan0 scan freq 2412 ssid "CrOS_Masked0" "CrOS_Masked1") 2>&1' from 'poll_for_condition|<lambda>|scan|timed_scan|run|run_very_slowly'
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] BSS 00:11:22:33:44:55(on wlan0)
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	TSF: 13527101 usec (0d, 00:00:13)
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	freq: 2412
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	beacon interval: 100 TUs
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	capability: ESS (0x0001)
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	signal: -54.00 dBm
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	last seen: 11 ms ago
06/27 12:39:15.939 DEBUG|             utils:0286| [stdout] 	Information elements from Probe Response frame:
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	SSID: CrOS_Masked0
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	Supported rates: 1.0* 2.0* 5.5 11.0 
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	DS Parameter set: channel 1
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	Extended capabilities: Extended Channel Switching, SSID List, 6
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] BSS 00:11:22:33:44:55(on wlan0)
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	TSF: 4415178 usec (0d, 00:00:04)
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	freq: 2412
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	beacon interval: 100 TUs
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	capability: ESS (0x0001)
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	signal: -81.00 dBm
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	last seen: 11 ms ago
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	Information elements from Probe Response frame:
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	SSID: CrOS_Masked1
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	Supported rates: 1.0* 2.0* 5.5 11.0 
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	DS Parameter set: channel 1
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] 	Extended capabilities: Extended Channel Switching, SSID List, 6
06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] real 0.11
06/27 12:39:15.941 DEBUG|             utils:0286| [stdout] user 0.00
06/27 12:39:15.941 DEBUG|             utils:0286| [stdout] sys 0.00
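
To compare the two DEBUG logs mechanically instead of by eye, something like the following could pull the SSIDs out of the logged `iw` scan output. This is a hedged sketch: the helper names are hypothetical, and the regular expression assumes the `[stdout]` line layout shown in the excerpts above.

```python
import re

# Matches the "SSID: <name>" line of the scan output as wrapped by the
# DEBUG log. The "SSID List" text on the capabilities line does not match.
SSID_LINE = re.compile(r'\[stdout\]\s+SSID: (.*)$')


def ssids_in_debug_log(log_text):
    """Return the SSIDs reported in a DEBUG log, in order of appearance."""
    ssids = []
    for line in log_text.splitlines():
        match = SSID_LINE.search(line)
        if match:
            ssids.append(match.group(1).strip())
    return ssids


def missing_ssids(found, expected):
    """Return the expected SSIDs that the scan did not report."""
    return [ssid for ssid in expected if ssid not in found]


PREFIX = '06/27 12:39:15.940 DEBUG|             utils:0286| [stdout] '
SAMPLE_LOG = '\n'.join([
    PREFIX + 'BSS 00:11:22:33:44:55(on wlan0)',
    PREFIX + '\tSSID: CrOS_Masked0',
    PREFIX + '\tExtended capabilities: Extended Channel Switching, '
             'SSID List, 6',
    PREFIX + 'BSS 00:11:22:33:44:55(on wlan0)',
    PREFIX + '\tSSID: CrOS_Masked1',
])

found = ssids_in_debug_log(SAMPLE_LOG)
print(found)
print(missing_ssids(found, ['CrOS_Masked0', 'CrOS_Masked1']))
```

Run against the ASTRONAUT excerpt, the same helpers would report CrOS_Masked1 as missing, since only one BSS block appears there.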





Cc: grundler@chromium.org
I'm not sure the highlighted snippets capture what's really going wrong.

It's pretty suspicious that the problem is consistently with the 2nd SSID, on phy2. I suspect that phy2 is not actually a good radio to use on the Whirlwind. I'm seeing some related problems on Gale, which doesn't have that 3rd phy (it only has a 2.4GHz phy0 and a 5GHz phy1). Once I rewrite the test to avoid using the 3rd radio (which I already did locally for Gale), it passes consistently on that astronaut.

BTW, I believe network_WiFi_VerifyRouter isn't actually verifying phy2 on Whirlwind -- it seems to only test the first two radios (despite the comments that suggest otherwise).

If I'm correct, we need to take a closer look at our lab verification tests, as well as our use of phy2 on Whirlwind. I have a few changes related to testing out Gale.
Cc: kevinhayes@google.com
Brian,

I am suspicious of any TX use of phy2. Historically (currently?), phy2 was only used to scan and measure noise of other channels while phy0/phy1 were in normal use. The idea was a third "ear" would allow the cloud to figure out the best channels (one in each band) for a given whirlwind to be using.
Owner: harpreet@chromium.org
Status: Assigned (was: Unconfirmed)
(This was mostly written before comment #3)

So I'm pretty sure it's just that phy2 isn't very useful on these routers. I'm not sure if it's a defective assembly, an inherent issue with the 3rd radio on Whirlwind, or a little of both. But if I do the following in network_WiFi_VerifyRouter, then it will actually verify the 3rd radio in AP mode again:

diff --git a/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py b/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py
index 07a09bdabe69..60a34aeb0c1a 100644
--- a/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py
+++ b/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py
@@ -66,7 +66,7 @@ class network_WiFi_VerifyRouter(wifi_cell_test_base.WiFiCellTestBase):
         # Setup two APs on |channel|. configure() will spread these across
         # radios.
         n_mode = hostap_config.HostapConfig.MODE_11N_MIXED
-        ap_config = hostap_config.HostapConfig(channel=channel, mode=n_mode)
+        ap_config = hostap_config.HostapConfig(channel=channel, mode=n_mode, min_streams=1)
         self.context.configure(ap_config)
         self.context.configure(ap_config, multi_interface=True)
         failures = []

and it passes on this device (where $subject was already passing):

chromeos15-row4-rack12-host4

but it fails on this, where $subject always fails:

chromeos15-row1-rack2-host6

---

EDIT, after comment #3:
I suppose it makes sense to avoid this radio as an AP then. We should still probably validate it as a monitor though, since we might end up using it as such (especially if we don't install a separate pcap device). I'm not sure the best way to do that though...maybe just keep the above change, to validate in AP mode?

I'm going to assign to Harpreet for now, to audit the lab for badly-installed Whirlwinds, in case this is partially a lab issue.
Blockedon: 870042
Owner: briannorris@chromium.org
Status: Started (was: Assigned)
I've got some work in progress for utilizing the radios differently on this test, so I'll take $subject.

I filed bug 870042 to track lab verification, and assigned *that* to Harpreet.
I think this is another symptom of the same problem:

https://stainless.corp.google.com/search?view=matrix&row=build&col=hostname&first_date=2018-07-26&last_date=2018-08-01&test=%5Enetwork%5C_WiFi%5C_BgscanBackoff%5C.&reason=%5EBackground+scans+should+detect+new+BSSeswithin+an+associated+ESS.&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false

All of these network_WiFi_BgscanBackoff runs use phy2 on Whirlwind, and the test is failing with "Background scans should detect new BSSeswithin an associated ESS". That's probably because phy2 is not properly (?) broadcasting the new BSS.
Re #4, chromeos15-row4-rack12-host4 is an OTA setup, where the Whirlwind in use is not taken apart, whereas chromeos15-row1-rack2-host6 is a conductive setup, where the Whirlwind was disassembled and the antennas are connected as shown in the picture at the link below. We only connect the 2.4GHz (phy0) and 5GHz (phy1) antennas and do not connect the aux radio, which is what phy2 seems to be.

https://screenshot.googleplex.com/1fJgPBpNQV2

Here are more details about the conductive setup 
https://docs.google.com/document/d/1-mI6OIUgZhCcaprc9HFYwa5UzMA0twNFk9ccWoK4GJI/edit#


Given the above, the network_WiFi_MaskedBSSID test still passes on approximately half of the conductive setups (anything in chromeos15-row1 racks 1 to 11; see the stainless link below). Does that mean it is able to get some signal on phy2 over the air, or that it may be using a different (phy0 or phy1) interface in those cases?

https://stainless.corp.google.com/search?view=matrix&row=hostname&col=build&test=network_WiFi_MaskedBSSID.wifi_masked_bssid&hostname=%5Echromeos15-row1-&status=GOOD&status=WARN&status=FAIL&status=ERROR&exclude_cts=false&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=false&days=15
Project Member

Comment 9 by bugdroid1@chromium.org, Aug 2

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/042f9b2be03c04e57959a94aa64d1e059086191f

commit 042f9b2be03c04e57959a94aa64d1e059086191f
Author: Brian Norris <briannorris@chromium.org>
Date: Thu Aug 02 20:53:47 2018

network_WiFi_VerifyRouter: verify all router radios

Whirlwind has a 3rd radio (phy2) with a single antenna. The current test
skips this radio, because it can only support a single spatial stream,
and our defaults look for a minimum of 2. Lower the minimum for this
test, so we pick up the radio still.

BUG=chromium:866181
TEST=network_WiFi_VerifyRouter -- fails on routers where phy2 isn't
     working properly for whatever reason

Change-Id: Ib24336986e8bb49698050d8987aa69093ec316a8
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1158796
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/042f9b2be03c04e57959a94aa64d1e059086191f/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py
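
The selection behavior described in this commit message can be illustrated with a small sketch. All names here are hypothetical, and the stream counts for phy0 and phy1 are assumed for illustration; the thread only states that phy2 supports a single spatial stream and that the default minimum is 2.

```python
import collections

Radio = collections.namedtuple('Radio', ['name', 'max_streams'])

# Assumed radio inventory. Only phy2's single stream is stated in the
# commit message; the counts for phy0 and phy1 are placeholders >= 2.
WHIRLWIND_RADIOS = [Radio('phy0', 2), Radio('phy1', 2), Radio('phy2', 1)]


def usable_radios(radios, min_streams=2):
    """Return names of radios that meet the minimum spatial stream count."""
    return [radio.name for radio in radios
            if radio.max_streams >= min_streams]


# With the default minimum of 2, the 1x1 radio is silently skipped.
print(usable_radios(WHIRLWIND_RADIOS))
# Lowering the minimum to 1 makes phy2 eligible as well.
print(usable_radios(WHIRLWIND_RADIOS, min_streams=1))
```

This is why passing min_streams=1 was enough to make the verification test exercise the third radio.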

> Does that mean it is able to get some signal on phy2 over-the-air or that it maybe using a different (phy0 or phy1) interface in those cases? 

It's just getting an extremely weak signal over the air, but it's enough to at least register a scan. (That's all this is looking for -- it doesn't need to associate.)

See one log:

07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] BSS 00:11:22:33:44:55(on wlan0)
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	TSF: 4505984 usec (0d, 00:00:04)
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	freq: 2412
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	beacon interval: 100 TUs
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	capability: ESS (0x0001)
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	signal: -98.00 dBm
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	last seen: 101 ms ago
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	Information elements from Probe Response frame:
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	SSID: CrOS_Masked1
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	Supported rates: 1.0* 2.0* 5.5 11.0 
07/28 16:40:29.314 DEBUG|             utils:0287| [stdout] 	DS Parameter set: channel 1
07/28 16:40:29.315 DEBUG|             utils:0287| [stdout] 	TIM: DTIM Count 0 DTIM Period 2 Bitmap Control 0x0 Bitmap[0] 0x0
07/28 16:40:29.315 DEBUG|             utils:0287| [stdout] 	Extended capabilities: Extended Channel Switching, SSID List, 6

And the hostapd log is clearly showing it on phy2:

1532821223.101726: nl80211: interface managed1 in phy phy2


https://stainless.corp.google.com/browse/chromeos-autotest-results/221669331-chromeos-test/
https://storage.cloud.google.com/chromeos-autotest-results/221669331-chromeos-test/chromeos15-row1-rack1-host3/network_WiFi_MaskedBSSID/debug/hostapd_router_1_managed1.log
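
To put "extremely weak" in perspective, the -98 dBm reading above can be compared with the -37 dBm reported for CrOS_Masked0 in one of the failing runs earlier in this thread. The two readings come from different runs and hosts, so this is only an illustrative calculation using the standard dBm-to-milliwatt conversion.

```python
def dbm_to_milliwatts(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10.0)


wired_signal_dbm = -37.0   # CrOS_Masked0 in an earlier scan log
leaked_signal_dbm = -98.0  # CrOS_Masked1 in the scan log above

difference_db = wired_signal_dbm - leaked_signal_dbm
power_ratio = (dbm_to_milliwatts(wired_signal_dbm) /
               dbm_to_milliwatts(leaked_signal_dbm))

print('%.0f dB weaker, roughly %.1e times less power'
      % (difference_db, power_ratio))
```

A gap of about 61 dB corresponds to more than a millionfold difference in received power, which fits the explanation that the scan only barely registers the BSS on phy2.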


> We only connect 2.4Ghz (phy0) and 5Ghz (phy1) antennas and do not connect the aux-radio which is what phy2 seems to be.

OK. Then I don't know what's up with stuff like this:

https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/321917
https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/283621

We should probably rip out any of that stuff that allows using the 1x1 AUX radio. I've partially done that, but we should do that 100% if we don't expect this radio to be hooked up. (And I should probably also just revert comment #9 too.)
Project Member

Comment 11 by bugdroid1@chromium.org, Aug 3

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/1e1bc01176164206941956a673a740343b8e7b89

commit 1e1bc01176164206941956a673a740343b8e7b89
Author: Brian Norris <briannorris@chromium.org>
Date: Fri Aug 03 04:50:00 2018

[autotest] network_WiFi_MaskedBSSID: stop requiring MULTI_AP_SAME_BAND

This test has some strange requirements: it wants to set up an illegal
configuration, with 2 BSS's using the same BSSID, to imitate some broken
routers in the field. The Linux mac80211 framework doesn't accept this,
returning -ENOTUNIQ instead, so this doesn't work when you run both of
these BSS's on the same radio.

This all worked OK on APs that had more than 1 radio for each band (so
you work around Linux's per-interface BSSID restriction), but it doesn't
work on Gale, where we force the 2 BSS's onto the same radio.

We can work around this by just switching this test to put the two
incompatible BSS's on separate bands (2.4GHz / 5GHz), and then drop the
CAPABILITY_MULTI_AP_SAME_BAND requirement.

This should also fix some issues seen on some Whirlwind routers, where
the 3rd radio (phy2) wasn't operating reliably. We may try to avoid
using this radio entirely in the future, so this is a good start.

BUG=chromium:774808, chromium:866181
TEST=network_WiFi_MaskedBSSID.wifi_masked_bssid on Gale

Change-Id: Idab9c56f42ad426a0f6b323e49539699679cd2d4
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1159461
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/1e1bc01176164206941956a673a740343b8e7b89/server/site_tests/network_WiFi_MaskedBSSID/network_WiFi_MaskedBSSID.py
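
The constraint this commit works around can be modeled in a few lines. This is a toy model only: the class and exception names are hypothetical, and the exception merely stands in for the -ENOTUNIQ error the commit message attributes to mac80211.

```python
class BssidNotUnique(Exception):
    """Stand-in for the -ENOTUNIQ error described in the commit message."""


class Radio(object):
    """Toy radio that refuses two BSSes with the same BSSID."""

    def __init__(self, name):
        self.name = name
        self.bsses = {}  # Maps BSSID to SSID.

    def add_bss(self, ssid, bssid):
        if bssid in self.bsses:
            raise BssidNotUnique('%s already has BSSID %s' %
                                 (self.name, bssid))
        self.bsses[bssid] = ssid


MASKED_BSSID = '00:11:22:33:44:55'

# Same radio: the second BSS is rejected.
single = Radio('phy0')
single.add_bss('CrOS_Masked0', MASKED_BSSID)
try:
    single.add_bss('CrOS_Masked1', MASKED_BSSID)
except BssidNotUnique as e:
    print('rejected:', e)

# Separate bands: each radio sees the BSSID only once, so both succeed.
radio_2g, radio_5g = Radio('phy0'), Radio('phy1')
radio_2g.add_bss('CrOS_Masked0', MASKED_BSSID)
radio_5g.add_bss('CrOS_Masked1', MASKED_BSSID)
print('configured on', radio_2g.name, 'and', radio_5g.name)
```

Placing the two BSSes on the 2.4GHz and 5GHz radios keeps the deliberately illegal shared BSSID while staying within the per-radio restriction.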

Project Member

Comment 12 by bugdroid1@chromium.org, Aug 3

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/1a7644e393fbb03980c495524b0aaee9afdb1d6d

commit 1a7644e393fbb03980c495524b0aaee9afdb1d6d
Author: Brian Norris <briannorris@chromium.org>
Date: Fri Aug 03 04:50:01 2018

[autotest] network_WiFi_BgscanBackoff: straighten out router requirements

As written today, this test requires CAPABILITY_MULTI_AP_SAME_BAND but
doesn't declare it. The test starts up multiple BSS's at distinct
frequencies, which can't be served by a single radio.

For the 'wifi_bgscan_backoff' test, this isn't really required; the test
can just as well be run on separate 2G vs. 5G bands.

For the '5760noise_check' variant, we explicitly wanted to test two
5GHz channels. This isn't possible on Gale, so let's add a capability
check so this test gets a TEST_NA result.

As a related effect, this also should move the .wifi_bgscan_backoff
variant to avoid running on phy2 on Whirlwind, which can help avoid some
flakiness. Whirlwind's phy2 is not known to be a reliable transmitter,
and we may stop using it entirely soon.

BUG=chromium:774808, chromium:866181
TEST=network_WiFi_BgscanBackoff.wifi_bgscan_backoff
     and network_WiFi_BgscanBackoff.5760_noise_check with gale;
     the former now passes, and the latter gets TEST_NA

Change-Id: I9f6d7ea0dba86d84aaa8cbc8ca236baf8fbdf92b
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1159462
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/1a7644e393fbb03980c495524b0aaee9afdb1d6d/server/site_tests/network_WiFi_BgscanBackoff/network_WiFi_BgscanBackoff.py
[modify] https://crrev.com/1a7644e393fbb03980c495524b0aaee9afdb1d6d/server/site_tests/network_WiFi_BgscanBackoff/control.5760noise_check
[modify] https://crrev.com/1a7644e393fbb03980c495524b0aaee9afdb1d6d/server/site_tests/network_WiFi_BgscanBackoff/control.wifi_bgscan_backoff
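
The capability check mentioned in this commit message could look roughly like the sketch below. Only the CAPABILITY_MULTI_AP_SAME_BAND name and the TEST_NA outcome come from the thread; the constant's value, the helper, the exception class, and the per-router capability lists are assumptions made for illustration.

```python
# Name taken from the commit message; the string value is assumed.
CAPABILITY_MULTI_AP_SAME_BAND = 'multi_ap_same_band'


class TestNAError(Exception):
    """Stand-in for an error that yields a TEST_NA result."""


def require_capabilities(router_capabilities, required):
    """Raise TestNAError when the router lacks a required capability."""
    missing = [cap for cap in required if cap not in router_capabilities]
    if missing:
        raise TestNAError('AP does not support: %s' % ', '.join(missing))


# Assumed capability sets: per the thread, Gale cannot serve two
# channels in the same band, while Whirlwind historically could.
GALE_CAPABILITIES = []
WHIRLWIND_CAPABILITIES = [CAPABILITY_MULTI_AP_SAME_BAND]

require_capabilities(WHIRLWIND_CAPABILITIES,
                     [CAPABILITY_MULTI_AP_SAME_BAND])
try:
    require_capabilities(GALE_CAPABILITIES,
                         [CAPABILITY_MULTI_AP_SAME_BAND])
except TestNAError as e:
    print('TEST_NA:', e)
```

Declaring the requirement up front lets an unsupported router skip the variant cleanly instead of failing partway through AP setup.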

Project Member

Comment 13 by bugdroid1@chromium.org, Aug 7

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/66c7d49d0456ba37ad68963872a80f64a65ece2e

commit 66c7d49d0456ba37ad68963872a80f64a65ece2e
Author: Brian Norris <briannorris@chromium.org>
Date: Tue Aug 07 08:50:35 2018

Revert "network_WiFi_VerifyRouter: verify all router radios"

This reverts commit 042f9b2be03c04e57959a94aa64d1e059086191f and adjusts
some related comments (that were previously incorrect).

It turns out we *don't* want to use the 3rd Whirlwind radio as an AP,
and it's often not even connected in conductive setups. So don't try to
verify it.

BUG=chromium:866181
TEST=network_WiFi_VerifyRouter -- see that it doesn't pick up phy2 on
     whirlwind

Change-Id: I5f1ce02be49c192f422b8f2a2dc19a59547358fc
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1162982

[modify] https://crrev.com/66c7d49d0456ba37ad68963872a80f64a65ece2e/server/site_tests/network_WiFi_VerifyRouter/network_WiFi_VerifyRouter.py

Project Member

Comment 14 by bugdroid1@chromium.org, Aug 14

The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/autotest/+/e942af85c9cd6d05b468e41be9bea1fd26eddd97

commit e942af85c9cd6d05b468e41be9bea1fd26eddd97
Author: Brian Norris <briannorris@chromium.org>
Date: Tue Aug 14 23:05:20 2018

[autotest] network_WiFi_BgscanBackoff: stop using Whirlwind's phy2

The 5760noise_check variant of this test was requesting 1 spatial
stream, so that it could run on Whirlwind's phy2. Don't do this, because
phy2 is not normally used in production as a transmitter, and because
our lab conductive setups don't usually wire up its antennas.

As an effect of this, we can't support 2 simultaneous channels on the 5
GHz band. Just use different channels from the 2.4 and 5 GHz bands.

Per Kirtika's suggestion, I make the 5760noise_check variant roughly
comparable to the wifi_bgscan_backoff variant, so that we can see
whether channel 153 (a known noisy channel) behaves significantly
differently than a known less-noisy 5GHz channel.

BUG=chromium:774808, chromium:866181
TEST=network_WiFi_BgscanBackoff.wifi_bgscan_backoff
     and network_WiFi_BgscanBackoff.5760_noise_check with gale;
     both now pass; run w/ whirlwind, and see we avoid phy2

Change-Id: I22227cee072d362ceb00bda39876717a374a1d44
Signed-off-by: Brian Norris <briannorris@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/1162983
Reviewed-by: Kirtika Ruchandani <kirtika@chromium.org>
Reviewed-by: Grant Grundler <grundler@chromium.org>

[modify] https://crrev.com/e942af85c9cd6d05b468e41be9bea1fd26eddd97/server/site_tests/network_WiFi_BgscanBackoff/network_WiFi_BgscanBackoff.py
[modify] https://crrev.com/e942af85c9cd6d05b468e41be9bea1fd26eddd97/server/site_tests/network_WiFi_BgscanBackoff/control.5760noise_check
[modify] https://crrev.com/e942af85c9cd6d05b468e41be9bea1fd26eddd97/server/site_tests/network_WiFi_BgscanBackoff/control.wifi_bgscan_backoff

Status: Fixed (was: Started)
OK, I think this should all be fixed. Stainless shows that there's very little red here now.
Summary: network_WiFi_MaskedBSSID.wifi_masked_bssid test failing with "SSID CrOS_Masked{0,1} is not in scan results" error (was: network_WiFi_MaskedBSSID.wifi_masked_bssid test failing on Astronaut/Gandof devices with "SSID CrOS_Masked1 is not in scan results" error)
Well, the previous bug was specifically about Masked1, which was caused by the way we incorrectly set up the 2nd BSS. But I'll rename the bug and maybe look at it.

BTW, anyone know why network_WiFi_VerifyRouter is barely running at all in the lab even though we scheduled it? There's a whole lot of NOT_RUN.
> BTW, anyone know why network_WiFi_VerifyRouter is barely running at all in the lab even though we scheduled it? There's a whole lot of NOT_RUN.

I suspect it is due to high load. We run wifi_matfunc (on beta and stable) and the wifi_end_to_end suite (on beta and stable), along with nightly ToT runs for the wifi_matfunc, perf, and end-to-end suites, on day 3 of the week.
wifi_update_router runs on day 4 of the week, so its jobs are plausibly timing out.

https://stainless.corp.google.com/search?view=matrix&row=build&col=queued_date&first_date=2018-08-18&last_date=2018-10-15&suite=wifi_update_router&test=network_WiFi_VerifyRouter&status=GOOD&status=WARN&status=FAIL&status=ERROR&status=ABORT&status=ALERT&status=RUNNING&status=TEST_NA&status=NOSTATUS&status=NOT_RUN&exclude_cts=true&exclude_not_run=false&exclude_non_release=true&exclude_au=true&exclude_acts=true&exclude_retried=true&exclude_non_production=true

Suite scheduler config:
https://cs.corp.google.com/chromeos_public/infra/suite_scheduler/configs/suite_scheduler.ini?q=suite_sched&g=0&l=1
