Possible to bypass forced OOBE update |
||||||||||||||||||
Issue descriptionI wrote an autotest which interrupts forced OOBE updates with reboots, disable/reenable network, suspend/resume: autoupdate_ForcedOOBEUpdate.interrupt The test has been showing that sometimes rebooting during the update moves to the sign in screen after reboot. Example run: https://stainless.corp.google.com/browse/chromeos-autotest-results/225769435-chromeos-test/ It has screenshots before and after the reboot (attached). Failing result: Update engine logs before reboot: https://storage.cloud.google.com/chromeos-autotest-results/225769435-chromeos-test/chromeos4-row11-rack8-host5/autoupdate_ForcedOOBEUpdate/results/update_engine.20180810-085907 Failing result: Update engine logs after reboot: https://storage.cloud.google.com/chromeos-autotest-results/225769435-chromeos-test/chromeos4-row11-rack8-host5/autoupdate_ForcedOOBEUpdate/results/update_engine.20180810-090211 Passing result: update engine logs before reboot: Before update: https://storage.cloud.google.com/chromeos-autotest-results/225792349-chromeos-test/chromeos4-row8-rack8-host14/autoupdate_ForcedOOBEUpdate/sysinfo/var/log_diff/update_engine/update_engine.20180810-095431 Passing result: update engine logs after reboot: https://storage.cloud.google.com/chromeos-autotest-results/225792349-chromeos-test/chromeos4-row8-rack8-host14/autoupdate_ForcedOOBEUpdate/sysinfo/var/log_diff/update_engine/update_engine.20180810-095932 I'm not sure where else to be looking to find out why OOBE was bypassed. Full logs here: https://stainless.corp.google.com/browse/chromeos-autotest-results/225769435-chromeos-test/
,
Aug 10
,
Aug 10
I think the update engine is behaving as it is. It is the OOBE screens that probably do not behave properly after reboot. Maybe the way the state of the update is communicated between chrome and update engine is broken. Update engine reports its state as like: checking for update, downloading update, finalizing update, etc. I'm guessing the code that is supposed to check for these states in Chrome might have a race condition somewhere. But I'm not familiar with the code in chrome at all. Someone familiar with OOBE screens in Chrome maybe can help. Who would be a good contact? Adding derat@ maybe he knows who would know about this.
,
Aug 10
Alexander (already cc-ed) is a good guess. Xiyuan may also know. I don't know anything about how Chrome learns that there's a blocking update that needs to be applied before going through OOBE. Does it make a D-Bus method call to update_engine? Is the theory that it might be trying to make those calls before update_engine is running? If the user hasn't clicked through OOBE yet, shouldn't Chrome be showing OOBE again after rebooting regardless of the forced-update state?
,
Aug 10
I think that's how it supposed to work. Since this test is failing some times, I smell of a race condition somewhere. For example the dbus call fails, then chrome ignores it, what else it can do? It probably can't wait too far until the UE comes up. We probably need chrome logs but I don't think the logs in posted link have more informative chrome logs. Or is there anywhere else to see the failed dbus logs?
,
Aug 10
UpdateScreen contains the code check whether we have critical update. If there is one, stay on UpdateScreen until update finishes then reboot the device [1]. On reboot, think ShouldShowSigninScreen() [2] decides whether to resume oobe or go to sign-in. Somehow, I don't see chorme logs after the reboot in the attached log. So could not tell why sign-in screen is shown. I suspect some code calls StartupUtils::MarkOobeCompleted prematurely to cause this. [1] https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/screens/update_screen.cc?sq=package:chromium&g=0&l=249,307 [2] https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/ui/login_display_host_webui.cc?rcl=b7c3730ffbbe625a91c222b1ed6af348bdd7b904&l=183
,
Aug 11
This sounds like a dup of issue 803921
,
Aug 13
This is happening on a few boards on every build. So seems it is much easier than we thought to bypass forced oobe
,
Aug 13
The logs posted in the bug do not contain full chrome logs. Without chrome logs we cannot tell what is the problem unless we can specifically reproduce them locally. Can you give links to the boards or tests that are failing?
,
Aug 14
This might have the same cause as crbug.com/838746 where the .oobe_completed file reapers.
,
Aug 14
Here are a few more runs of it happening: https://stainless.corp.google.com/browse/chromeos-autotest-results/227055593-chromeos-test/ https://stainless.corp.google.com/browse/chromeos-autotest-results/227052380-chromeos-test/ https://stainless.corp.google.com/browse/chromeos-autotest-results/227051476-chromeos-test/
,
Aug 14
Looked at the first log: https://stainless.corp.google.com/browse/chromeos-autotest-results/227055593-chromeos-test/ Before the reboot ==== chrome_20180813-153728 is the run that detects critical update and attempts to apply it: [3701:3701:0813/153753.811375:VERBOSE1:update_screen.cc(554)] Initiate update check [3701:3701:0813/153753.818666:VERBOSE1:update_screen.cc(98)] Callback from RequestUpdateCheck, result 0 ... [3701:3701:0813/153759.743451:VERBOSE1:update_screen.cc(253)] Critical update available: 999999.0.0 [3701:3701:0813/153800.246313:VERBOSE1:update_screen.cc(276)] Critical update available: 999999.0.0 Two "Critical update available" lines, one for UPDATE_STATUS_UPDATE_AVAILABLE and one for UPDATE_STATUS_DOWNLOADING. After the update interruption and reboot, ==== chrome_20180813-153956, we do get back the Oobe update screen. But this time, chrome thinks there is no update. [1246:1246:0813/154017.131076:VERBOSE1:update_screen.cc(554)] Initiate update check ... [1246:1246:0813/154021.795112:VERBOSE1:update_screen.cc(98)] Callback from RequestUpdateCheck, result 0 [1246:1246:0813/154021.795276:VERBOSE1:wizard_controller.cc(1447)] Wizard screen exit code: UPDATE_NOUPDATE The difference is the first check runs after update engine runs (first log line timestamp 0813/153638) but the 2nd one starts before update engine runs (0813/154021). So ahassani@'s assessment in #3 might be true that this is caused by some racing, when chrome goes directly to the update screen. My suspect is that chrome receives a status update signal of STATUS_IDLE from update engine before it sets the ignore idle state flag in its StartUpdateCallback [1]. This could cause the update screen to think there is no update. [1] https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/screens/update_screen.cc?rcl=b390e82b77504a1d4a2c1a4308332e0f89ff0646&l=101
,
Aug 14
One possible fix is to defer observing update engining status [1] until StartUpdateCheckCallback is finished. [1]: https://cs.chromium.org/chromium/src/chrome/browser/chromeos/login/screens/update_screen.cc?rcl=b390e82b77504a1d4a2c1a4308332e0f89ff0646&l=553
,
Aug 14
Issue 803921 has been merged into this issue.
,
Aug 14
Yeah, I think that might be the root cause.
,
Aug 15
Thanks for investigation... We should get this before next stable as there are several blocking AU FSIs that rely on this functionality. Marking as RBS for M70 and M69 (if merge/time permits)
,
Aug 15
rkc@, could you help to find an owner from you team? I am overloaded and could not get to this quickly. Thanks. I suspect that the issue could be fixed by moving GetUpdateEngineClient()->AddObserver() after StartUpdateCallback happens. But verifying the fix might take some time.
,
Aug 15
,
Aug 15
qnnguyen@ can you take a look?
,
Aug 31
Please prioritize this fix, it is blocking stable with no updates in a couple of weeks. Thanks.
,
Aug 31
This issue reproduced on all Nami devices (Akali, Akali360, Pantheon, Sona, Vayne) during FSI testing (M68 FSI build (10718.88.2, 68.0.3440.118)). After suspend/resume or reboot of device during forced AU at OOBE, auto update screen skips to sign in screen, while the update continues in the background. Attached debug logs from Akali360.
,
Sep 5
I've put together a CL to fix this here: https://chromium-review.googlesource.com/c/chromium/src/+/1208138 It's still pending an okay from platform (basically just to make sure the assumptions about update engine are correct). Could we have someone patch to this CL and try reproducing the issue with the relevant devices? Thanks, Quan
,
Sep 7
Has this been reviewed and ready for merge?
,
Sep 7
It has been LGTM'd by xiyuan@, so technically ready to land. I can submit today. However, I was hoping to get someone to test it out on the Nami device(s) where we saw the issue originally. This change involves a fairly core part of the OOBE process, so despite it being a fairly small change, I would like to have some confirmation before merging into M69.
,
Sep 7
The test team is not setup to patch to CLs and build. This bug was found by this autotest: autoupdate_ForcedOOBEUpdate.interrupt.full Can you run that a few times against your build? Or provide your build to the test team on a USB stick and we can try it out.
,
Sep 7
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0270f6321a30f5f46b59b7d0385e6527a34da914 commit 0270f6321a30f5f46b59b7d0385e6527a34da914 Author: Quan Nguyen <qnnguyen@chromium.org> Date: Fri Sep 07 17:33:58 2018 cros: Don't skip update screen when resuming forced update in OOBE. There are two important bits of nuance behind removing the call to SetIgnoreIdleStatus(false). 1. After calling RequestUpdateCheck(), we will eventually receive a status update with UPDATE_STATUS_CHECKING_FOR_UPDATE. UpdateStatusChanged will see this and no longer ignore UPDATE_STATUS_IDLE, so we don't need to SetIgnoreIdleStatus(false) manually. 2. It is possible to receive an UPDATE_STATUS_IDLE after calling RequestUpdateCheck() and before receiving the UPDATE_STATUS_CHECKING_FOR_UPDATE. This seems to happen a lot when resuming a previous forced-autoupdate. The UpdateScreen should ignore any UPDATE_STATUS_IDLE until it receives either an UPDATE_STATUS_CHECKING_FOR_UPDATE or an UPDATE_STATUS_REPORTING_ERROR_EVENT. Bug: 873270 Change-Id: I3d7d0d05e7abf60579a80a62345e4e755e633894 Reviewed-on: https://chromium-review.googlesource.com/1208138 Reviewed-by: Xiyuan Xia <xiyuan@chromium.org> Commit-Queue: Quan Nguyen <qnnguyen@chromium.org> Cr-Commit-Position: refs/heads/master@{#589566} [modify] https://crrev.com/0270f6321a30f5f46b59b7d0385e6527a34da914/chrome/browser/chromeos/login/screens/update_screen.cc
,
Sep 7
Spoke with dhaddock@ offline. The CL is landed but in order to autotest it, it will need to be included in a revision. The plan is to wait for uprev and test it out on Monday.
,
Sep 10
Have you been able to validate?
,
Sep 11
AU team, Could you please look into the issue.?
,
Sep 11
To clarify, what's needed is that the autotest in #25 be run on a recent build of Chrome with the following CL in it. https://chromium-review.googlesource.com/c/chromium/src/+/1208138 Looks like 71.0.3548.0 includes it, for example. I haven't been able to figure out how to run this autotest, since I don't believe I have access to the testing lab anyway. I figure if someone on the test team already has access for running various autotests, it might be most time efficient to have them run this one. Thanks!
,
Sep 11
+chchakrapani on the testing team who was going to take a look.
,
Sep 11
Unfortunately the test was disabled last week by the infra team ( crbug.com/877107 ). I am getting it enabled again with these two CLs: https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/1213765 https://chromium-review.googlesource.com/c/chromiumos/third_party/autotest/+/1219549
,
Sep 12
friendly ping, please provide an update.
,
Sep 12
The changes above are in the CQ now
,
Sep 12
Please go ahead and merge to the branch.
,
Sep 12
,
Sep 12
This bug requires manual review: Request affecting a post-stable build Please contact the milestone owner if you have questions. Owners: amineer@(Android), kariahda@(iOS), cindyb@(ChromeOS), govind@(Desktop) For more details visit https://www.chromium.org/issue-tracking/autotriage - Your friendly Sheriffbot
,
Sep 13
Cindy approved this in comment 35. How often does this actually happen? We have a potential blocking AU FSI on 69 that may need to be respun on this, if it is a common failure.
,
Sep 13
It is happening on 5-10 boards on every build.
,
Sep 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/04fdf02791560fe4e3d96f0cfd7adf6110bc7fe1 commit 04fdf02791560fe4e3d96f0cfd7adf6110bc7fe1 Author: Quan Nguyen <qnnguyen@chromium.org> Date: Thu Sep 13 16:08:42 2018 cros: Don't skip update screen when resuming forced update in OOBE. There are two important bits of nuance behind removing the call to SetIgnoreIdleStatus(false). 1. After calling RequestUpdateCheck(), we will eventually receive a status update with UPDATE_STATUS_CHECKING_FOR_UPDATE. UpdateStatusChanged will see this and no longer ignore UPDATE_STATUS_IDLE, so we don't need to SetIgnoreIdleStatus(false) manually. 2. It is possible to receive an UPDATE_STATUS_IDLE after calling RequestUpdateCheck() and before receiving the UPDATE_STATUS_CHECKING_FOR_UPDATE. This seems to happen a lot when resuming a previous forced-autoupdate. The UpdateScreen should ignore any UPDATE_STATUS_IDLE until it receives either an UPDATE_STATUS_CHECKING_FOR_UPDATE or an UPDATE_STATUS_REPORTING_ERROR_EVENT. Bug: 873270 Change-Id: I3d7d0d05e7abf60579a80a62345e4e755e633894 Reviewed-on: https://chromium-review.googlesource.com/1208138 Reviewed-by: Xiyuan Xia <xiyuan@chromium.org> Commit-Queue: Quan Nguyen <qnnguyen@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#589566}(cherry picked from commit 0270f6321a30f5f46b59b7d0385e6527a34da914) Reviewed-on: https://chromium-review.googlesource.com/1224970 Reviewed-by: Bernie Thompson <bhthompson@chromium.org> Cr-Commit-Position: refs/branch-heads/3497@{#932} Cr-Branched-From: 271eaf50594eb818c9295dc78d364aea18c82ea8-refs/heads/master@{#576753} [modify] https://crrev.com/04fdf02791560fe4e3d96f0cfd7adf6110bc7fe1/chrome/browser/chromeos/login/screens/update_screen.cc
,
Sep 13
It looks like this was not merged to the branch, but we built our RC last night so this missed the boat. I just merged it back and submitted it so we will catch the next one, given this only impacts blocking AU if we don't plan on using that capability on this release, this should not really block, and given it is merged I think we can drop the RBS label.
,
Sep 13
Approving merge to 70 also, as it looks like it was not landed there either.
,
Sep 13
The following revision refers to this bug: https://chromium.googlesource.com/chromium/src.git/+/0a205ba72b8aef81c61819705e3daec4e252cef6 commit 0a205ba72b8aef81c61819705e3daec4e252cef6 Author: Quan Nguyen <qnnguyen@chromium.org> Date: Thu Sep 13 16:11:08 2018 cros: Don't skip update screen when resuming forced update in OOBE. There are two important bits of nuance behind removing the call to SetIgnoreIdleStatus(false). 1. After calling RequestUpdateCheck(), we will eventually receive a status update with UPDATE_STATUS_CHECKING_FOR_UPDATE. UpdateStatusChanged will see this and no longer ignore UPDATE_STATUS_IDLE, so we don't need to SetIgnoreIdleStatus(false) manually. 2. It is possible to receive an UPDATE_STATUS_IDLE after calling RequestUpdateCheck() and before receiving the UPDATE_STATUS_CHECKING_FOR_UPDATE. This seems to happen a lot when resuming a previous forced-autoupdate. The UpdateScreen should ignore any UPDATE_STATUS_IDLE until it receives either an UPDATE_STATUS_CHECKING_FOR_UPDATE or an UPDATE_STATUS_REPORTING_ERROR_EVENT. Bug: 873270 Change-Id: I3d7d0d05e7abf60579a80a62345e4e755e633894 Reviewed-on: https://chromium-review.googlesource.com/1208138 Reviewed-by: Xiyuan Xia <xiyuan@chromium.org> Commit-Queue: Quan Nguyen <qnnguyen@chromium.org> Cr-Original-Commit-Position: refs/heads/master@{#589566}(cherry picked from commit 0270f6321a30f5f46b59b7d0385e6527a34da914) Reviewed-on: https://chromium-review.googlesource.com/1224464 Reviewed-by: Bernie Thompson <bhthompson@chromium.org> Cr-Commit-Position: refs/branch-heads/3538@{#369} Cr-Branched-From: 79f7c91a2b2a2932cd447fa6f865cb6662fa8fa6-refs/heads/master@{#587811} [modify] https://crrev.com/0a205ba72b8aef81c61819705e3daec4e252cef6/chrome/browser/chromeos/login/screens/update_screen.cc
,
Sep 13
Merges complete, marking the bug as fixed.
,
Sep 19
I have not seen the failure once in the last 10 M71 builds. Nice! |
||||||||||||||||||
►
Sign in to add a comment |
||||||||||||||||||
Comment 1 by dhadd...@chromium.org
, Aug 10