
Issue 599595

Starred by 6 users

Issue metadata

Status: Verified
Owner: ----
Closed: Aug 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




802.1x wifi broken in M51

Reported by eric.kuf...@kohls.com, Mar 31 2016

Issue description

UserAgent: Mozilla/5.0 (X11; CrOS x86_64 8104.2.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2688.0 Safari/537.36
Platform: Platform 8104.2.0 (Official Build) dev-channel lulu

Example URL:

Steps to reproduce the problem:
1. Upgrade from M50 to M51
2. Attempt to connect to EAP-TLS wifi SSID
3. Failure to connect. Bouncing wifi icon.

Rebooting and toggling wifi do not make a difference.

What is the expected behavior?
EAP-TLS wifi connects automatically without issue.

What went wrong?
Upgraded to M51 and it broke.

Did this work before? N/A 

Chrome version: 51.0.2688.0  Channel: dev
OS Version: 8104.2.0
Flash Version: Shockwave Flash 21.0 r0

Let me know what you need to help troubleshoot this.
 
Components: -Internals>Network Internals>Network>Connectivity
Cc: jleong@chromium.org cernekee@chromium.org
+Jian +Kevin
Cc: grundler@chromium.org ejcaruso@chromium.org
A feedback report (alt-shift-i) taken right after the failure might help.

Can our testers reproduce this on their setup or is the problem only seen at one site?

Also, wpa_supplicant was recently updated in b/26058138
I flashed my cyan with the latest canary channel (8138.0.0 / 51.0.2695.1) and ran the following tests:

1) Corp enroll, obtain cert, and connect to the autoprovisioned Google-A network.  No problems.

2) Manually connect to an EAP-TLS test network using a Linksys WRT1900ACS + FreeRADIUS server.  Set Identity to "identity" and verify from the FreeRADIUS logs that the correct identity is being set.  No problems; the wifi connection to the AP was successful.

3) Same as (2), but use Identity "identity@foobar.com".  Still no problems.

This suggests that it might be an interop problem, rather than a general 802.1x wifi breakage.  AP logs would help, as would access to a network that is showing the problem.
I will try a wipe & re-enroll. Admittedly I did not try that. I will also capture some logs.
The other thing that occurred to me is that maybe the problem only affects certain configurations, e.g. "ONC-provisioned networks with an empty Identity field."  Would you mind sharing your configuration so I can try to replicate the same thing here?  Private email is fine if you don't want to post it.
Cc: tienchang@chromium.org
Two other successful experiments I ran today:

4) Set up a managed EAP-TLS network from admin.google.com with a blank identity (Username) field.  Verify that the Chromebook prompts the user to fill in the identity, then sends the correct identity to the AP / FreeRADIUS.

5) Same as (4), but use ${LOGIN_ID} to prepopulate the identity field.
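
To make experiments (4) and (5) concrete, here is a rough Python sketch of how the identity placeholder is expected to expand. It assumes ${LOGIN_ID} becomes the part of the signed-in user's email before the "@" and ${LOGIN_EMAIL} becomes the full address, which is my understanding of the ONC placeholders. The function name is made up for illustration and is not Chrome OS code.

```python
def expand_identity(template: str, login_email: str) -> str:
    """Expand ONC-style identity placeholders (illustrative helper only)."""
    # Assumption: ${LOGIN_ID} is the local part of the login email.
    login_id = login_email.split("@", 1)[0]
    return (template
            .replace("${LOGIN_ID}", login_id)
            .replace("${LOGIN_EMAIL}", login_email))


# Example using the test identity from experiment (3).
expanded = expand_identity("${LOGIN_ID}", "identity@foobar.com")
print(expanded)
```

A template without placeholders is returned unchanged, which corresponds to the manually typed identity in experiments (2) and (3).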
ALT+SHIFT+I - submitted feedback.

Problem still happens in latest build:

Version 51.0.2694.1 dev (64-bit)
Platform 8134.0.0 (Official Build) dev-channel lulu
Firmware Google_Lulu.6301.136.39
I have also tried removing my cert and re-enrolling. No luck.

Next, I will try a wipe.
1. Create recovery USB media using Chrome Recovery Utility
2. Esc+refresh+power
3. Wipe
4. Version # 48.0.2564.116
5. Update to stable # 49.0.2623.95
6. Update to dev # 51.0.2694.1
7. Enroll a wifi user certificate
8. Attempt to connect to wifi

No change in result. Still fails to connect.
Components: -Internals>Network>Connectivity OS>Systems>Network
We have repro'd this on 4 other users' devices.

All are:
Version 51.0.2694.1 dev (64-bit)
Platform 8134.0.0 (Official Build) dev-channel lulu
Firmware Google_Lulu.6301.136.39
I'm having trouble finding your feedback report from comment #9 -- can you file another one when this happens next and put "@ejcaruso" or "@cernekee" in the message so we can search for it easily?
Feedback sent. I tagged you both.
OK. I'm looking at the one in c#15 and it looks like there was a timeout during the 4-way handshake. We spent 200 ms waiting for an EAP packet that never came, and in earlier iterations the packet had arrived within about 5 ms, so we should have seen it before the timeout.

You don't see this happening on the stable M49 build you listed in c#11, do you? If it's only in new images, it might have something to do with the wpa_supplicant uprev to 2.5, but that would probably mean 802.1X is widely broken and cernekee's experiments seem to rule that out.

I also have a device on M50 which is working perfectly.

M50 and below = working fine.


The problem begins @ M51.
OK, grundler@ found something interesting: http://lists.shmoo.com/pipermail/hostap/2015-July/033312.html

We have the first patch that causes us to disconnect on a PMKID mismatch, and there's a workaround patch in this thread which never made it into upstream wpa_supplicant and so we don't have that one. What versions of FreeRADIUS and OpenSSL are you using?
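
For anyone following along, here is a rough Python sketch of why an incorrect MSK derivation on the server side would surface as a PMKID mismatch. It assumes WPA2 with a SHA-1 based AKM, where the PMK is the first 32 bytes of the MSK and the PMKID is the first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AA || SPA). The MAC addresses and key material are made up for illustration and do not come from the feedback report.

```python
import hashlib
import hmac


def pmk_from_msk(msk: bytes) -> bytes:
    """For 802.1X key management, the PMK is the first 32 bytes of the MSK."""
    return msk[:32]


def pmkid(pmk: bytes, aa: bytes, spa: bytes) -> bytes:
    """PMKID = first 128 bits of HMAC-SHA1(PMK, "PMK Name" || AA || SPA)."""
    digest = hmac.new(pmk, b"PMK Name" + aa + spa, hashlib.sha1).digest()
    return digest[:16]


# Hypothetical addresses and keys, for illustration only.
AA = bytes.fromhex("001122334455")    # authenticator (AP) MAC address
SPA = bytes.fromhex("66778899aabb")   # supplicant (client) MAC address
client_msk = bytes(range(64))
server_msk = bytes(reversed(range(64)))  # stands in for a wrongly derived MSK

client_pmkid = pmkid(pmk_from_msk(client_msk), AA, SPA)
server_pmkid = pmkid(pmk_from_msk(server_msk), AA, SPA)
pmkid_matches = hmac.compare_digest(client_pmkid, server_pmkid)
print("PMKID match:", pmkid_matches)
```

If the two sides disagree on the MSK, they disagree on the PMK and therefore on the PMKID, which is the condition the newer wpa_supplicant code treats as a reason to disconnect.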
Cc: aashuto...@chromium.org
aashutoshk@, have we run into this issue against our wifi rack setup yet? I don't think so, but let's do a check with Lulu on M50, autoupdate, and try on M51.
We are using Cisco ISE 2.0
It looks like the thread in c#19 (which cernekee@ found, not grundler@, sorry) does mention Cisco ISE as a culprit here.

Should we include this workaround? It's unlikely it'll get fixed in the RADIUS server...
FTR, I didn't "find" it - just reshared what cernekee@ had included in offlist email. Credit goes to Kevin for digging that up in the email archives.

Let me reshare what has been happening with wpa_supplicant and why there is a "completely new" version in R51:

o Eric and I have been working on moving to a common hostap source code since January - this includes lots of testing for both OnHub and Chromebooks (everything based on ChromeOS). This includes both hostapd (OnHub only) and wpa_supplicant (all ChromeOS devices including OnHub).

o the main change for ChromeOS is moving its local D-Bus patches from hostap_2_3 based code to hostap_2_5 - that includes the one change Kevin pointed out from the hostap mailing list but not the workaround Jouni Malinen proposed to handle the "incorrect MSK derivation". [Seems like the "right fix" is to update the SW on those servers.]

o Access team added patches to support mesh networking (the primary reason Access team wanted to use hostap_2_5 code base.)

o The code was all in place by beginning of March and missed the M50 Feature Complete branch date. So it ended up in M51.

o The actual "use new code" step was in the third_party/chromiumos-overlay ebuild commits:
    https://chromium-review.googlesource.com/#/q/project:chromiumos/overlays/chromiumos-overlay+owner:%22Grant+Grundler+%253Cgrundler%2540chromium.org%253E%22

The commit that made the switch for ChromeOS, was on March 11:
    https://chromium-review.googlesource.com/#/c/323380/
    (landed in build 8053.0.0)

The commit that made the switch for Jetstream was on March 24:
    https://chromium-review.googlesource.com/#/c/334436/

Kevin, can you make 8052.0.0 and 8053.0.0 test images available to Eric to try out? That would confirm we are on the right track. (ISTR Eric is testing with "peppy" - Acer C720).
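
As a rough way to tell which of the builds mentioned in this thread include the switch, platform versions can be compared numerically against 8053.0.0. A small Python sketch follows. The helper names are made up, and it assumes the dotted three-part platform version format seen above and that the builds come from the main tree, which may not hold for branch builds.

```python
def parse_platform(version: str) -> tuple:
    """Turn a platform version such as '8104.2.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# First build reported above to contain the hostap_2_5 based wpa_supplicant.
FIRST_BUILD_WITH_SWITCH = parse_platform("8053.0.0")


def has_new_supplicant(version: str) -> bool:
    """True if the platform version is at or after the switch (assumed main tree)."""
    return parse_platform(version) >= FIRST_BUILD_WITH_SWITCH


# Platform versions taken from this thread.
for v in ["8052.0.0", "8053.0.0", "8104.2.0", "8134.0.0", "8138.0.0"]:
    print(v, has_new_supplicant(v))
```

Comparing as integer tuples avoids the string-comparison trap where "10000.0.0" would sort before "8053.0.0".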

The source code was "in tree" by beginning of March but not fully tested and missed the M50 Feature Complete branch date. So it ended up in M51.

Kudos to Eric for trying out the dev releases and helping track down this issue. 

It's possible the "right fix" is to update the SW on the AP (assuming the AP has a broken OpenSSL version that was fixed later).
I'm not finding the OpenSSL version used in Cisco ISE 2.0. Eric Kufrin, can you ask your networking folks? I expect the open source SW version info is advertised _someplace_ in the product output. It would be good to know since we might be barking up the wrong tree here.
I will find out tomorrow AM.

I was looking at the Cisco release notes and thought this was worth mentioning: https://bst.cloudapps.cisco.com/bugsearch/bug/CSCuw88770

We moved from 1.x to 2.0 about 2 weeks ago. (That did not cause any issues)

I do not believe we are on 2.0.1, since it was released less than 2 weeks ago.
Thanks!

I don't have an account with Cisco and thus no access to the release notes. They mention something about "OpenSSL updates" ? :)
Attachment: Screen Shot 2016-04-06 at 9.09.22 PM.png (147 KB)
Labels: Hotlist-Enterprise
Labels: Needs-Feedback
@Eric: Do you still see this issue? If not, can you close this as won't fix?
p.s. I no longer see this issue on my setup.
I think this is why my EAP-PEAP setup stopped working. I'm using a self-managed PKI (Active Directory Certificate Services) and pushed my organization's root CA via the Google Admin Console. For some reason, the latest wpa_supplicant isn't following the EAP server certificate back to the root and kicks back an error. Basically, wpa_supplicant is no longer checking imported CAs when verifying EAP server certs.
Our issue is resolved.

The root cause was an outdated WPA_Supplicant on our RADIUS environment.

The fixes that Google implemented would have resolved our problem as well, had we not been able to patch RADIUS.

From my perspective, this bug should be marked Verified/closed.
Status: Verified (was: Unconfirmed)
Thanks Eric! Closing as verified as per #31.
