
Issue 915776

Starred by 4 users

Issue metadata

Status: Untriaged
Owner: ----
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug




Samsung Chromebook 3 devices are losing enrollment

Project Member Reported by vkasatkin@google.com, Dec 17

Issue description

ChromeOS version: 70.0.3538.76
ChromeOS device model: Samsung Chromebook 3 
Case#: 17753176

Description:
Devices are losing their WiFi policies and spontaneously losing enrollment.

Steps to reproduce: 
The customer reports that some of their Chromebooks lose WiFi and return to the enterprise enrollment screen. They suspected the latest Chrome OS update to version 70, but turning off auto-updates did not help.

Current Behavior / Reproduction: 
No specific steps; the issue happens sporadically.

Expected Behavior: 
Devices stay connected and do not un-enroll

Drive link to logs:
https://drive.google.com/open?id=1LFh_Z-SLVZTrNx2buu7Rzf4XK6fapjdT
 

Comment 2 Deleted

The customer provided a fresh debug log, https://drive.google.com/open?id=1GTAhNut9ZsvYkSd7hJ_An5MrFXkAmrtY, stating that the issue happened again this morning (12/20/2018).
From the log files, it looks like the issue occurred around 10:50 am.

Forced enrollment was triggered at 10:52:

53:[1162:1162:1219/105215.263064:VERBOSE1:wizard_controller.cc(614)] Showing EULA screen.
54:[1162:1162:1219/105215.263186:VERBOSE1:wizard_controller.cc(1363)] SetCurrentScreenSmooth: eula
57:[1162:1162:1219/105224.021710:VERBOSE1:wizard_controller.cc(1535)] Wizard screen exit code: EULA_ACCEPTED
58:[1162:1162:1219/105224.022888:VERBOSE1:auto_enrollment_controller.cc(546)] Auto-enrollment required: flag in VPD.
59:[1162:1162:1219/105224.023034:VERBOSE1:auto_enrollment_controller.cc(555)] Proceeding with FRE
60:[1162:1162:1219/105224.023124:VERBOSE1:auto_enrollment_controller.cc(732)] New auto-enrollment state: 1
62:[1162:1162:1219/105224.023347:VERBOSE1:wizard_controller.cc(1275)] StartOOBEUpdate
63:[1162:1162:1219/105224.023417:VERBOSE1:wizard_controller.cc(1363)] SetCurrentScreenSmooth: update
77:[1162:1162:1219/105232.783284:VERBOSE1:wizard_controller.cc(1693)] Hiding error screen.
78:[1162:1162:1219/105232.783322:VERBOSE1:wizard_controller.cc(1363)] SetCurrentScreenSmooth: update
83:[1162:1162:1219/105237.936291:VERBOSE1:wizard_controller.cc(1535)] Wizard screen exit code: UPDATE_ERROR_UPDATING
84:[1162:1162:1219/105237.936440:VERBOSE1:wizard_controller.cc(746)] Showing Auto-enrollment check screen.
85:[1162:1162:1219/105237.936531:VERBOSE1:wizard_controller.cc(1363)] SetCurrentScreenSmooth: auto-enrollment-check
86:[1162:1162:1219/105250.488051:VERBOSE1:auto_enrollment_controller.cc(685)] Starting auto-enrollment client for FRE.
87:[1162:1162:1219/105250.489027:VERBOSE1:auto_enrollment_controller.cc(732)] New auto-enrollment state: 1
88:[1162:1162:1219/105250.564658:VERBOSE1:auto_enrollment_controller.cc(732)] New auto-enrollment state: 1
89:[1162:1162:1219/105250.649777:VERBOSE1:auto_enrollment_controller.cc(732)] New auto-enrollment state: 1
91:[1162:1162:1219/105251.165345:VERBOSE1:auto_enrollment_controller.cc(732)] New auto-enrollment state: 4
92:[1162:1162:1219/105251.165441:VERBOSE1:wizard_controller.cc(1535)] Wizard screen exit code: ENTERPRISE_AUTO_ENROLLMENT_CHECK_COMPLETED
93:[1162:1162:1219/105251.165534:VERBOSE1:wizard_controller.cc(2073)] Showing enrollment screen. Forcing interactive enrollment: 0.
94:[1162:1162:1219/105251.165609:VERBOSE1:wizard_controller.cc(1363)] SetCurrentScreenSmooth: oauth-enrollment
155:[1162:1162:1219/105341.100173:VERBOSE1:wizard_controller.cc(1535)] Wizard screen exit code: ENTERPRISE_ENROLLMENT_COMPLETED
156:[1162:1162:1219/105341.100935:VERBOSE1:wizard_controller.cc(579)] Showing login screen.


Also, around this time there was a failed attempt to update from 70.0.3538.76 to 70.0.3538.110:

[1219/105237:ERROR:update_attempter.cc(1367)] Update failed.
[1219/105237:INFO:payload_state.cc(257)] Updating payload state for error code: 49 (ErrorCode::kNonCriticalUpdateInOOBE)
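
For triaging further logs of this kind, the screen sequence quoted above can be extracted mechanically. The following is a minimal Python sketch, not an official tool. The regular expression is inferred only from the Chrome log lines quoted in this comment, the names `parse_chrome_line` and `screen_transitions` are hypothetical, and the year is an assumption because the log prefix carries only month and day. The update_engine lines use a different prefix and are skipped.

```python
import re
from datetime import datetime

# Prefix as seen in the excerpt: [pid:tid:MMDD/HHMMSS.micros:LEVEL:file(line)] message
# A leading "N:" (grep-style line number) is tolerated and ignored.
CHROME_LINE = re.compile(
    r"^(?:\d+:)?\[(?P<pid>\d+):(?P<tid>\d+):"
    r"(?P<date>\d{4})/(?P<time>\d{6}\.\d+):"
    r"(?P<level>[A-Z0-9]+):(?P<src>[\w.]+)\((?P<line>\d+)\)\]\s*(?P<msg>.*)$"
)


def parse_chrome_line(text, year=2018):
    """Return (timestamp, source_file, message), or None if the line does not match.

    The year is an assumption supplied by the caller; the log itself omits it.
    """
    m = CHROME_LINE.match(text.strip())
    if m is None:
        return None
    stamp = datetime.strptime(f"{year}{m['date']}/{m['time']}", "%Y%m%d/%H%M%S.%f")
    return stamp, m["src"], m["msg"]


def screen_transitions(lines):
    """List (HH:MM:SS, screen) for every SetCurrentScreenSmooth entry, in log order."""
    marker = "SetCurrentScreenSmooth: "
    out = []
    for text in lines:
        parsed = parse_chrome_line(text)
        if parsed is None:
            continue
        stamp, src, msg = parsed
        if src == "wizard_controller.cc" and msg.startswith(marker):
            out.append((stamp.strftime("%H:%M:%S"), msg[len(marker):]))
    return out
```

Feeding the excerpt above to `screen_transitions` gives the order in which OOBE screens were shown, which makes it easier to compare several affected devices.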

  

Labels: Impacts-Enterprise
Cc: poromov@chromium.org apronin@chromium.org emaxx@chromium.org
Andrey/Maksim - is this disk corruption causing us to lose the stateful partition? If we think there's a bug here, this should be a higher priority.

Comment 6 by emaxx@chromium.org, Jan 17 (5 days ago)

After taking a look at the logs from comment 3, it seems that the device had already been in a bad state for a while. Each boot, starting from the very first one present in the logs, has messages about the TPM not being ready:

2018-12-16T09:23:15.216120+00:00 INFO cryptohomed[1093]: TPM error 0x2020 (Key not found in persistent storage): LoadKeyByUuid: failed LoadKeyByUUID
2018-12-16T09:23:15.216184+00:00 WARNING cryptohomed[1093]: Canceled creating cryptohome key - TPM is not ready.
2018-12-16T09:23:15.216328+00:00 WARNING cryptohomed[1093]: Could not load the device policy file.
2018-12-16T09:23:15.216617+00:00 ERR cryptohomed[1093]: Creating new salt at /home/.shadow/salt (0, 0)
2018-12-16T09:23:15.231710+00:00 WARNING chapsd[1036]: SRK does not exist - this is normal when the TPM is not yet owned.
2018-12-16T09:23:15.235271+00:00 WARNING chapsd[1036]: SRK does not exist - this is normal when the TPM is not yet owned.
2018-12-16T09:23:15.262651+00:00 ERR cryptohomed[1093]: stat() of /mnt/stateful_partition/unencrypted/preserve/attestation.epb failed.: No such file or directory
2018-12-16T09:23:15.262731+00:00 ERR cryptohomed[1093]: Failed to read db.: No such file or directory
2018-12-16T09:23:15.262758+00:00 INFO cryptohomed[1093]: Attestation: Attestation data not found.

The logs from comment 0 look similar.

It's likely that the stateful partition had indeed been corrupted before these logs were collected. I couldn't find any trace in the logs about the root cause.
Who could be the right person/team for investigating disk corruption and related issues?
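
To check other devices' logs for the same symptom, a filter along these lines may help. It is a rough Python sketch under stated assumptions: the line format is inferred from the syslog excerpt above, the marker substrings are copied from that excerpt and are illustrative rather than exhaustive, and `tpm_warnings` is a hypothetical helper name.

```python
import re

# Format as seen in the excerpt: <ISO timestamp> <LEVEL> <process>[<pid>]: <message>
SYSLOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<level>[A-Z]+)\s+(?P<proc>[\w.-]+)\[(?P<pid>\d+)\]:\s*(?P<msg>.*)$"
)

# Substrings taken from the messages quoted in this issue (assumed, not exhaustive).
TPM_MARKERS = (
    "TPM is not ready",
    "SRK does not exist",
    "Key not found in persistent storage",
)


def tpm_warnings(lines):
    """Return (timestamp, process, message) for each line that carries a TPM marker."""
    hits = []
    for text in lines:
        m = SYSLOG_LINE.match(text.strip())
        if m is None:
            continue
        if any(marker in m["msg"] for marker in TPM_MARKERS):
            hits.append((m["ts"], m["proc"], m["msg"]))
    return hits
```

A match only shows that the TPM was not owned or ready at that boot; it does not by itself identify what cleared the device.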

Comment 7 by apronin@chromium.org, Jan 17 (5 days ago)

Cc: gwendal@chromium.org
Looks like this is where the stateful partition was cleared:

2018/12/16 09:22:57 UTC Self-repair incoherent stateful partition: var and home. History: /home/chronos /var /home 
2018/12/16 09:22:57 UTC (preserve log): /sbin/clobber-state fast keepimg
2018/12/16 09:23:00 UTC (restore log): /sbin/clobber-state

And I don't see any kernel crashes or other issues in the event.log

+gwendal, does this look similar to issue 878595 or other recent disk-related issues? I'm not sure which of those issues affected which devices. And if a disk was just full (inodes or otherwise), we'd have some other traces in the logs, right?
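
For comparing wipe times across devices, the self-repair and clobber entries can be picked out with a short script. This is a minimal Python sketch, assuming the `YYYY/MM/DD HH:MM:SS UTC` line format of the three entries quoted above; `stateful_wipes` is a hypothetical helper name, and the two keywords it matches are taken from those entries only.

```python
import re
from datetime import datetime, timezone

# Format as seen in the quoted entries: "YYYY/MM/DD HH:MM:SS UTC <message>"
CLOBBER_LINE = re.compile(
    r"^(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) UTC (?P<msg>.*)$"
)

# Keywords copied from the quoted entries (assumed sufficient for this sketch).
WIPE_KEYWORDS = ("Self-repair", "clobber-state")


def stateful_wipes(lines):
    """Return (utc_datetime, message) for each self-repair or clobber-state entry."""
    events = []
    for text in lines:
        m = CLOBBER_LINE.match(text.strip())
        if m is None:
            continue
        msg = m["msg"]
        if any(keyword in msg for keyword in WIPE_KEYWORDS):
            stamp = datetime.strptime(m["ts"], "%Y/%m/%d %H:%M:%S").replace(
                tzinfo=timezone.utc
            )
            events.append((stamp, msg))
    return events
```

In this log the self-repair entry at 09:22:57 UTC comes 18 seconds before the first TPM-not-ready message at 09:23:15 quoted in comment 6, which is consistent with the wipe preceding the bad state.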

Comment 8 by ryutas@google.com, Jan 18 (5 days ago)

Another enterprise customer has reported a similar case.
ChromeOS version: 71.0.3578.94
ChromeOS device model: banon Acer Chromebook 15 (CB3-532)
Case#: 18055655

Note:
Chromebooks lose enrollment after a reboot and do not automatically re-enroll. Auto-enrollment fails whether devices are wiped manually or lose enrollment due to this issue. Devices can still be re-enrolled manually.

The customer shared only one device log, but explained that more devices are affected.

Drive link to logs: 
https://drive.google.com/open?id=1UvR4FzT5K3Pxv8JlEbh6ng7VstbHkwEz

2018-06-20T12:25:54.959094+00:00 INFO cryptohomed[1220]: TPM error 0x2020 (Key not found in persistent storage): LoadKeyByUuid: failed LoadKeyByUUID
2018-06-20T12:25:54.959130+00:00 WARNING cryptohomed[1220]: Canceled creating cryptohome key - TPM is not ready.
2018-06-20T12:25:54.959245+00:00 WARNING cryptohomed[1220]: Could not load the device policy file.
2018-06-20T12:25:54.964480+00:00 WARNING chapsd[918]: SRK does not exist - this is normal when the TPM is not yet owned.
2018-06-20T12:25:54.965314+00:00 WARNING chapsd[918]: SRK does not exist - this is normal when the TPM is not yet owned.
2018-06-20T12:25:55.060328+00:00 INFO cryptohomed[1220]: TPM error 0x2020 (Key not found in persistent storage): Unseal: Failed to load SRK.
2018-06-20T12:25:55.060765+00:00 ERR cryptohomed[1220]: Cannot unseal aes key.
2018-06-20T12:25:55.060798+00:00 ERR cryptohomed[1220]: Attestation: Could not unseal decryption key.
2018-06-20T12:25:55.060820+00:00 WARNING cryptohomed[1220]: Attestation: Attestation data invalid.  This is normal if the TPM has been cleared.
2018-06-20T12:25:55.147232+00:00 INFO cryptohomed[1220]: Cannot read boot lockbox files.
2018-06-20T12:25:55.147328+00:00 INFO cryptohomed[1220]: The TPM chip does not support GetAlertsData. Stop UploadAlertsData task.

Comment 9 by ryutas@google.com, Jan 18 (5 days ago)

Cc: ryutas@chromium.org

Comment 10 by ryutas@google.com, Jan 21 (2 days ago)

Components: OS>Kernel>TPM

Comment 11 by atwilson@google.com, Today (19 hours ago)

Components: -Enterprise
Labels: -Impacts-Enterprise Enterprise-Triaged
Removing the Enterprise component as this is just stateful-partition corruption which is not enterprise-specific. Keeping on Hotlist-Enterprise so it can be tracked.

I believe auto-re-enrollment is not currently enabled, so it's WAI (working as intended) that you're not seeing re-enrollment. FRE should force you to enroll manually, though.

I'm not clear exactly how we want to deal with these kinds of corrupted-disk issues. apronin/gwendal, should one of you own this? Also, those TPM errors in the logs are weird, but they are from June, so I'm not sure how relevant they are.
