Chrome overrides user-selected wifi network |
|||||||||||
Issue descriptionChrome OS 62.0.3202.8 / 9901.5.0 dev channel: On a corp-enrolled device: 1) Open system menu 2) Select Network 3) Currently connected to Google-A 4) Select GoogleGuest 5) Observe that Chromebook connects to GoogleGuest (good) 6) Observe that Chromebook scans for networks again (good) 7) Observe that Chromebook automatically switches back to Google-A without prompting (bad) chrome://device-log: https://drive.google.com/open?id=0ByglX3exPaCrNl85by1yLVVPYXM I don't think this behavior is new. We saw it several months ago when debugging captive portals. I believe this is triggered by the ConnectToBestServices call at 16:51:45 in the log. Google-A does outrank the other networks (by design), but shill should not override the user selection unless it is explicitly requested. Maybe related: bug 764000
,
Sep 15 2017
Oops, looks like I already filed bug 709664 back in April regarding this issue.
,
Jan 10 2018
Hi Kevin, In my testing I found that under network preferences, "Automatically connect to this network" is enabled for "Google-A" and also for "GoogleGuest". When I disabled that option on both the networks, they stay connected based on user selection. I guess this is enterprise policy on the network which enables this option and makes the chromebook autoconnect to this network when present.
,
Jan 10 2018
dsunkara@: You're correct - this is enforced by the @google.com domain. We realize that this may be confusing for users, so we are adding a notification to alert users when this occurs. See issue 764000 for more details.
,
Jan 11 2018
I think what we want is to prevent the ConnectToBestServices call on rescan, because that tells shill to forget about the last user selection and instead apply the sorting algorithm to pick the active SSID.
,
Jan 11 2018
cernekee@ - It is weird (and probably a bug) if we are calling ConnectToBestServices after Google-A is connectable. Maybe the policy updated after you switched to GoogleGuest? khorimoto@ - Could you follow up your recent change with a reason in the event log for ConnectToBestServices? (i.e. login, policy load, cert load)? The reason for the complexity in the behavior is to ensure the following: 1. User logs in. 2. User explicitly connects to GoogleGuest. 3. Certificate and/or policy loads. 4. User should automatically switch to Google-A. 5. (new) User should be notified. Usually (i.e. pretty much always outside of testing) a user wants to switch from an insecure network to a secure one when the secure network becomes available. This is the use case that the code was optimized for. I'm fine with fixing edge cases as long as we don't break the primary use case.
,
Jan 11 2018
This is readily reproducible on M62 eve using the steps in the OP.
,
Jan 11 2018
It ought to be, since it's WAI :) (Unless you can repro this after being logged in for ~10 seconds, i.e. long enough for policy and certs to have loaded).
,
Jan 11 2018
I was logged in for about 4 hours. I don't think it's being caused by a policy refresh. It happens every time I try it.
,
Jan 11 2018
Looking at the log you provided, there is a gap from [11:08:34] to [16:51:04] which I assume was a resume after suspend, which is presumably triggering login, policy, and/or cert changes. Please feel free to examine your logs for supurious 'ConnectToBestServices' events not associated with login or resume, but I suspect those are the scenarios where you are seeing this issue. In the log I see: NetworkUser[16:51:26] ConnectToNetwork: /service/2 (GoogleGuest) followed by: NetworkEvent[16:51:26] ConnectToBestServices i.e. a certificate or policy load triggered the ConnectToBestServices call about the same time you explicitly connected to GoogleGuest, about 22 seconds after resume (which is a little longer than I might have expected, but may be typical after a resume). So again, this is WAI. It would certainly be *nice* to identify any of the following: * Google-A was already fully configured * Google-A had already been auto connected and was explicitly disconnected to switch to GoogleGuest However, Chrome doesn't know what the "Best" network is, Shill decides that, so in order to robustly fix this edge case, we really need to improve the logic in Shill. The "ConnectToBestServices" signal from Chgrome -> Shill is basically just "something may have changed that may improve Shill's ability to connect to a more secure network".
,
Jan 11 2018
Steven, please ping me when you're in MTV and I'll show you the issue in person.
,
Jan 11 2018
I won't be in Mountain View until Jan 32, but is the issue different than: 1. Wake up device. 2. (Device maybe auto-conencts to Google-A). 3. User connects to GoogleGuest because they want to be connected to GoogleGuest and not Google-A 4. 20-30 seconds later device auto-connects to Google-A ? Becuase: a) I can see how that can be annoying, especially if you're doing it a lot. I would recommend trying to set up a non enterprise enrolled account for testing (but am aware that setting up a VPN is a pita). b) We can't easily differntiate that from the same scenario after login except that in step 3. the user connects to GoogleGuest because they need connectivity while they wait for certs or policy to load, but wants to connect to Google-A (and their company policy wants them connected to Google-A) as soon as it is available. Once more with feeling, I am happy to support changes to improve (b), but if it were trivial we would have already.
,
Jan 11 2018
In this case, the auto-connect to Google-A happens as a result of the scan completion (because I'm leaving the system menu open to the Network page). It's not happening because of a policy push.
,
Jan 11 2018
Could you include a log that shows that? The device log from comment #1 occured < 30s after the device work up from sleep and was probably triggered by certificates loading. Could you also check 'Debug' on the device log? That will tell us exactly what triggered the ConnectToBestNetwork request.
,
Jan 12 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/08841991b38df5188f6ab5290db1fe50216cae6b commit 08841991b38df5188f6ab5290db1fe50216cae6b Author: Kyle Horimoto <khorimoto@google.com> Date: Thu Jan 11 23:59:43 2018 Add network log when auto-connect is initiated. Bug: 765489 Change-Id: Ia500908197e15197456bf5c915979b8ab78be1ef Reviewed-on: https://chromium-review.googlesource.com/861339 Commit-Queue: Kyle Horimoto <khorimoto@chromium.org> Reviewed-by: Steven Bennetts <stevenjb@chromium.org> Cr-Commit-Position: refs/heads/master@{#528822} [modify] https://crrev.com/08841991b38df5188f6ab5290db1fe50216cae6b/chromeos/network/auto_connect_handler.cc
,
Jan 12 2018
OK, I was just able to reproduce what you describe. I wish this had been written up a bit more clearly, this is definitely a regression. Looking at chrome://device-log with 'Debug' checked it seems that the reason for the request is "Client certificate patterns resolved."
,
Jan 12 2018
To be clear, the simplest repro is: 0. While logged in for > 30 seconds wifh wifi enabled and connected to a network requiring a certificate (e.g. Google-A): 1. Open the system menu > network, connect to an insecure network (e.g. GoogleGuest) 2. Leave the system menu open, wait for a network scan Observe: Chrome OS auto-connects back to the secure network (e.g. Google-A). This is a P1 not a P3. I suspect that this change or another semi-recent change to certificate resolution affected the frequency of calls to ResolveRequestCompleted: https://chromium-review.googlesource.com/c/chromium/src/+/753371 +pmarko@ The right fix is probably to only call RequestBestConnection() from ResolveRequestCompleted() if client_certs_resolved_ was false. pmarko@ - Would ResolveRequestCompleted() ever get callbed before a user is logged in? If not, then I think my suggestion is valid. Otherwise we may need to also check that we are logged in first. Do you know why connecting to GoogleGuest would trigger NotifyResolveRequestCompleted? khorimoto@ - If you're taking a stab at this that would be great, but if not please feel free to assign this to me.
,
Jan 12 2018
FYI, Yoshi and I saw this last April (bug 709664). So I don't think it is related to pmarko's cert changes.
,
Jan 12 2018
,
Jan 12 2018
Ah, I think I see what is going on. This bug may be older than the recent changes, but nobody noticed or reported it in a way that got our attention. ClientCertResolver::NetworkListChanged() looks for new networks to resolve and replaces the list with the current list of networks. The probelm is Shill aggressively forgets visible networks so the "old" gets cleared out and we re-resolve the same networks over and over again. There are a couple of pontential fixes here: 1. Only resolve "remembered" (configured) networks. 2. Do not clear the list of resolved networks. 3. Leave ClientCertResolver as-is and just make the suggested fix to AutoConnectHandler. #1 seems like the best approach to me, but it's late and I may be missing something obvious. pmarko@ - WDYT?
,
Jan 12 2018
,
Jan 12 2018
Re Comment #20: Oh, interesting, thanks for the analysis! Please excuse the following wall-of-text. So, is the current hypothesis that: (1) ClientCertResolver keeps getting a modified list of networks on NetworkListChanged() (2) tries to resolve the delta between each two calls (which happens to be non-empty on each call), (3) the delta contains networks with client certificate patterns configured in policy (otherwise they shouldn't be resolved according to [1]). (4) This makes ClientCertResolver call ResolveRequestCompleted(network_properties_changed = true) in [2] (5) This makes AutoConnectHandler::ResolveRequestCompleted call "RequestBestConnection" in [3] The original plan is that if networks keep appearing which don't have certificate patterns in policy, ClientCertResolver is supposed to call ResolveRequestCompleted(network_properties_changed = false). Unless this doesn't work somehow (i.e. ResolveRequestCompleted(true) is called even if non-policy-cert-pattern networks appeard only, or AutoConnectHandler misbehaves on ResolveRequestCompleted(false) calls), this would mean that shill keeps forgetting and rediscovering a policy-cert-pattern network (e.g. Google-A). So if that's indeed the case, I'm not sure what exactly we can do. What's the expected behavior if the user manually connects to a random_network, then walks out of range of Google-A, and walks back? Re: Options: Option 1: IIUC that's what we already do here: https://cs.chromium.org/chromium/src/chromeos/network/client_cert_resolver.cc?rcl=b0e4b4840bff841de8ae65d01e1b3b1482da8c77&l=506 Option 2: I think we can't do this as suggested, because when the network re-appears, ClientCertResolver must give shill the client certificate slot + id again. What we could do is have an additional set of "networks ever seen in this run", and give shill the cert data (so the user can connect manually) for re-appearing networks, but don't make re-appearing networks cause a ResolveRequestCompleted(true) notification. Option 3: This is similar in behavior to the variation of Option 2 described above, but would only cause auto-connect if a policy-cert-pattern network was visible on the first client cert resolving run, and not if a policy-cert-pattern network appeared later. Summary: We should be sure what the target behavior is for a policy-cert-pattern network disappearing and re-appearing. What happens for regular Auto-Connect networks without policy-cert-patterns? We should be sure that this is what actually happens (i.e. Google-A or another policy-cert-pattern network disappears and re-appears) and understand why this happens if the chromebook was not physically moved around. [1] https://cs.chromium.org/chromium/src/chromeos/network/client_cert_resolver.cc?rcl=b0e4b4840bff841de8ae65d01e1b3b1482da8c77&l=514 and https://cs.chromium.org/chromium/src/chromeos/network/client_cert_resolver.cc?rcl=b0e4b4840bff841de8ae65d01e1b3b1482da8c77&l=527 [2] https://cs.chromium.org/chromium/src/chromeos/network/client_cert_resolver.cc?rcl=b0e4b4840bff841de8ae65d01e1b3b1482da8c77&l=627 [3] https://cs.chromium.org/chromium/src/chromeos/network/auto_connect_handler.cc?rcl=b0e4b4840bff841de8ae65d01e1b3b1482da8c77&l=188
,
Jan 12 2018
You're right, it does not seem like ResolveRequestCompleted() should be getting called repeatedly with network_properties_changed == true. I will investigate that first.
,
Jan 12 2018
Unfortunatley I don't have access to a network requiring a certificate here in Boulder. However, looking at the code however I strongly suspect this change: https://codereview.chromium.org/1443043002 This code: https://cs.chromium.org/chromium/src/chromeos/network/client_cert_resolver.cc?l=451 always calls ResolveNetworks() with a disconnected network which it looks like will always lead to a call to ResolveRequestCompleted(true) if the disconnected network is policy configured with a certificate (e.g. Google-A). I put up a CL to add some logging to confirm this, but it looks like we will need to either: a) Make the cert resolver code smarter and identify when an actual cert change occurs (which may be tricky) b) fix this in the auto connect handler as I originally proposed I'm going to take a stab at (b) since it seems simpler.
,
Jan 13 2018
I'll try to repro this on Monday. Regarding your CL: Please note that I think what I said above regarding Option 2 still holds - If user policy has a network with a certificate pattern, and the network becomes visible after the first certificate resolving run finished, we won't call RequestBestConnection anymore. Is that fine?
,
Jan 16 2018
Sorry, I didn't have the time to attempt a repro yet, but I'm confident I'll have the time on Wed.
,
Jan 16 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/2e22f7d75cf332b5bdc09438fcbb86097475ce34 commit 2e22f7d75cf332b5bdc09438fcbb86097475ce34 Author: Steven Bennetts <stevenjb@chromium.org> Date: Tue Jan 16 18:37:56 2018 Add device_event_log events for ClientCertResolver Bug: 765489 Change-Id: Ie688eab57c63aef45b26c6ad1508f575587a40af Reviewed-on: https://chromium-review.googlesource.com/865474 Reviewed-by: Pavol Marko <pmarko@chromium.org> Commit-Queue: Steven Bennetts <stevenjb@chromium.org> Cr-Commit-Position: refs/heads/master@{#529466} [modify] https://crrev.com/2e22f7d75cf332b5bdc09438fcbb86097475ce34/chromeos/network/client_cert_resolver.cc
,
Jan 17 2018
I've managed to repro this and indeed the issue is what you suspected in Comment #24: - When disconnecting from the EAP-TLS network, ClientCertResolver::NetworkConnectionStateChanged is called with the |network|. - It triggers a ResolveNetworks run. - If the network has a policy-set client cert pattern, it continues into FindCertificateMatches. This will find the same cert that's already configured for the network - We proceed into ConfigureCertificates. As a match has been found, this updates shill and calls Obsever::ResolveRequestCompleted(true /* network_properties_changed */ ). So, we definitely have the issue that client_cert_resolver.cc misbehaves: it's setting network_properties_changed = true even when no network properties effectively changed. I'll create a CL to change that (i.e. do what you proposed as (a)). We can still discuss behavior of auto_connect_handler.cc (i.e. when it should call RequestBestConnection?), but I think we should definitely fix the root cause in this case.
,
Jan 17 2018
OK, sgtm. If that eliminates the problem then it might be better to leave auto_connect_handler as-is. We've lived with this issue for quite a while and it won't affect most users so it's not super urgent, but we should definitely fix it sooner than later.
,
Jan 24 2018
Any progress on fixing client_cert_resolver.cc? I realize this is an edge case that doesn't affect a lot of folks and that we have lived with for a while, but I would still prefer to get a fix in sooner than later. If the client_cert_resolver.cc fix is complicated I can update the auto_connect_handler.cc workaround.
,
Jan 25 2018
Sorry, I didn't manage to do this change before I was OOO due to other high-priority issues. I'm back now and will create a CL asap.
,
Jan 26 2018
,
Feb 1 2018
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/d69d4303994541f440aa0348dfb868bdabc8021c commit d69d4303994541f440aa0348dfb868bdabc8021c Author: Pavol Marko <pmarko@chromium.org> Date: Thu Feb 01 10:08:41 2018 ClientCertResolver: Track last resolved client cert Make ClientCertResolver track the last certificate that has been resolved for each network. ClientCertResolver now only notifies Observers with |network_properties_changed|=true if the network has not been resolved before, or if the resolved client certificate (or EAP identity string) actually changed. Bug: 765489 Test: chromeos_unittests --gtest_filter=ClientCertResolver* Change-Id: Ief557c95492f5b4906445f840fc29d6def73097a Reviewed-on: https://chromium-review.googlesource.com/888558 Commit-Queue: Pavol Marko <pmarko@chromium.org> Reviewed-by: Steven Bennetts <stevenjb@chromium.org> Reviewed-by: Maksim Ivanov <emaxx@chromium.org> Cr-Commit-Position: refs/heads/master@{#533617} [modify] https://crrev.com/d69d4303994541f440aa0348dfb868bdabc8021c/chromeos/network/client_cert_resolver.cc [modify] https://crrev.com/d69d4303994541f440aa0348dfb868bdabc8021c/chromeos/network/client_cert_resolver.h [modify] https://crrev.com/d69d4303994541f440aa0348dfb868bdabc8021c/chromeos/network/client_cert_resolver_unittest.cc
,
Feb 19 2018
As the CL landed in 66.0.3337.0, this should not happen there anymore.
,
Oct 10
Verified on Eve - R70-11021.23.0 , 70.0.3538.28. |
|||||||||||
►
Sign in to add a comment |
|||||||||||
Comment 1 by steve...@chromium.org
, Sep 15 2017Owner: khorimoto@chromium.org
Status: Assigned (was: Untriaged)