
Issue 699389

Starred by 5 users

Issue metadata

Status: Archived
Owner:
Closed: Apr 2017
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 1
Type: Bug




cryptohome: fails to remove old cryptohome in ext4 dircrypt case

Project Member Reported by apronin@chromium.org, Mar 8 2017

Issue description

In the ext4 dircrypt case on 9347.0.0, when cryptohomed attempts to re-create the user home (e.g. because it cannot decrypt it), it fails to remove the old cryptohome.

1) Boot, login, logout.
2) rm /home/.shadow/cryptohome.key*
3) reboot 
4) Upon boot, attempt to login again. The error is seen.

Sample log:
2017-03-07T21:05:10.824931-08:00 ERR cryptohomed[2693]: Failed to decrypt any keysets for f5bda69cdfc94659061f6e8bd36d8dd5be83b158
2017-03-07T21:05:10.824987-08:00 ERR cryptohomed[2693]: Error, cryptohome must be re-created because of fatal error.
2017-03-07T21:05:10.828390-08:00 ERR kernel: [   56.650645] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/fs/dcache.c:754
2017-03-07T21:05:10.828430-08:00 ERR kernel: [   56.650663] in_atomic(): 0, irqs_disabled(): 0, pid: 2922, name: MountThread
2017-03-07T21:05:10.828434-08:00 WARNING kernel: [   56.650674] CPU: 2 PID: 2922 Comm: MountThread Tainted: G     U          4.4.44 #3
2017-03-07T21:05:10.828437-08:00 WARNING kernel: [   56.650680] Hardware name: Google Pyro/Pyro, BIOS Google_Pyro.9042.52.0 02/17/2017
2017-03-07T21:05:10.828439-08:00 WARNING kernel: [   56.650688]  0000000000000286 00000000e818fd8a ffff880068bc7be0 ffffffff89092765
2017-03-07T21:05:10.828441-08:00 WARNING kernel: [   56.650705]  0000000000000b6a ffff88016813b800 ffff880068bc7c00 ffffffff88e8b52d
2017-03-07T21:05:10.828444-08:00 WARNING kernel: [   56.650720]  ffffffff897c00f8 00000000000002f2 ffff880068bc7c28 ffffffff88e8b5d1
2017-03-07T21:05:10.828446-08:00 WARNING kernel: [   56.650735] Call Trace:
2017-03-07T21:05:10.828449-08:00 WARNING kernel: [   56.650753]  [<ffffffff89092765>] dump_stack+0x4d/0x63
2017-03-07T21:05:10.828452-08:00 WARNING kernel: [   56.650766]  [<ffffffff88e8b52d>] ___might_sleep+0x149/0x14e
2017-03-07T21:05:10.828454-08:00 WARNING kernel: [   56.650775]  [<ffffffff88e8b5d1>] __might_sleep+0x9f/0xa6
2017-03-07T21:05:10.828457-08:00 WARNING kernel: [   56.650787]  [<ffffffff88f77574>] dput+0x2f/0x206
2017-03-07T21:05:10.828460-08:00 WARNING kernel: [   56.650798]  [<ffffffff890094f9>] ext4_d_revalidate+0x6e/0x97
2017-03-07T21:05:10.828462-08:00 WARNING kernel: [   56.650806]  [<ffffffff88f6f25c>] lookup_fast+0xb5/0x296
2017-03-07T21:05:10.828464-08:00 WARNING kernel: [   56.650816]  [<ffffffff88f7037e>] path_openat+0x2b0/0xc56
2017-03-07T21:05:10.828467-08:00 WARNING kernel: [   56.650825]  [<ffffffff8900a020>] ? ext4_free_crypt_info+0x3b/0x3e
2017-03-07T21:05:10.828469-08:00 WARNING kernel: [   56.650835]  [<ffffffff8900a431>] ? _ext4_get_encryption_info+0x3d7/0x416
2017-03-07T21:05:10.828471-08:00 WARNING kernel: [   56.650844]  [<ffffffff88f72a99>] do_filp_open+0x5c/0xc6
2017-03-07T21:05:10.828473-08:00 WARNING kernel: [   56.650856]  [<ffffffff88f56bf7>] ? slab_pre_alloc_hook+0x29/0x2f
2017-03-07T21:05:10.828476-08:00 WARNING kernel: [   56.650868]  [<ffffffff894fedac>] ? _raw_spin_unlock+0xe/0x20
2017-03-07T21:05:10.828478-08:00 WARNING kernel: [   56.650880]  [<ffffffff88f63cb1>] do_sys_open+0x86/0x198
2017-03-07T21:05:10.828480-08:00 WARNING kernel: [   56.650889]  [<ffffffff88f63cb1>] ? do_sys_open+0x86/0x198
2017-03-07T21:05:10.828482-08:00 WARNING kernel: [   56.650899]  [<ffffffff88f63de1>] SyS_open+0x1e/0x20
2017-03-07T21:05:10.828485-08:00 WARNING kernel: [   56.650909]  [<ffffffff894ff1a1>] entry_SYSCALL_64_fastpath+0x1c/0x74
2017-03-07T21:05:11.828356-08:00 ERR kernel: [   57.650177] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/fs/dcache.c:754
2017-03-07T21:05:11.828383-08:00 ERR kernel: [   57.650187] in_atomic(): 0, irqs_disabled(): 0, pid: 2922, name: MountThread
2017-03-07T21:05:11.828385-08:00 WARNING kernel: [   57.650192] CPU: 3 PID: 2922 Comm: MountThread Tainted: G     U          4.4.44 #3
2017-03-07T21:05:11.828386-08:00 WARNING kernel: [   57.650195] Hardware name: Google Pyro/Pyro, BIOS Google_Pyro.9042.52.0 02/17/2017
2017-03-07T21:05:11.828387-08:00 WARNING kernel: [   57.650199]  0000000000000286 00000000e818fd8a ffff880068bc7b18 ffffffff89092765
2017-03-07T21:05:11.828388-08:00 WARNING kernel: [   57.650206]  0000000000000b6a ffff88016813b800 ffff880068bc7b38 ffffffff88e8b52d
2017-03-07T21:05:11.828418-08:00 WARNING kernel: [   57.650212]  ffffffff897c00f8 00000000000002f2 ffff880068bc7b60 ffffffff88e8b5d1
2017-03-07T21:05:11.828421-08:00 WARNING kernel: [   57.650219] Call Trace:
2017-03-07T21:05:11.828422-08:00 WARNING kernel: [   57.650230]  [<ffffffff89092765>] dump_stack+0x4d/0x63
2017-03-07T21:05:11.828423-08:00 WARNING kernel: [   57.650236]  [<ffffffff88e8b52d>] ___might_sleep+0x149/0x14e
2017-03-07T21:05:11.828424-08:00 WARNING kernel: [   57.650240]  [<ffffffff88e8b5d1>] __might_sleep+0x9f/0xa6
2017-03-07T21:05:11.828425-08:00 WARNING kernel: [   57.650246]  [<ffffffff88f77574>] dput+0x2f/0x206
2017-03-07T21:05:11.828426-08:00 WARNING kernel: [   57.650251]  [<ffffffff890094f9>] ext4_d_revalidate+0x6e/0x97
2017-03-07T21:05:11.828426-08:00 WARNING kernel: [   57.650254]  [<ffffffff88f6f25c>] lookup_fast+0xb5/0x296
2017-03-07T21:05:11.828427-08:00 WARNING kernel: [   57.650258]  [<ffffffff88f6f813>] walk_component+0x5f/0x179
2017-03-07T21:05:11.828429-08:00 WARNING kernel: [   57.650263]  [<ffffffff88f6db5d>] ? __inode_permission+0x78/0x9c
2017-03-07T21:05:11.828430-08:00 WARNING kernel: [   57.650266]  [<ffffffff88f6fad1>] link_path_walk+0x1a4/0x46b
2017-03-07T21:05:11.828431-08:00 WARNING kernel: [   57.650270]  [<ffffffff88f6ee79>] ? path_init+0x10f/0x2bf
2017-03-07T21:05:11.828432-08:00 WARNING kernel: [   57.650273]  [<ffffffff88f6fe3d>] path_lookupat+0x31/0x103
2017-03-07T21:05:11.828432-08:00 WARNING kernel: [   57.650276]  [<ffffffff88f716ca>] filename_lookup+0x8c/0x11d
2017-03-07T21:05:11.828433-08:00 WARNING kernel: [   57.650282]  [<ffffffff88f56bf7>] ? slab_pre_alloc_hook+0x29/0x2f
2017-03-07T21:05:11.828434-08:00 WARNING kernel: [   57.650286]  [<ffffffff88f5884d>] ? kmem_cache_alloc+0x24/0x123
2017-03-07T21:05:11.828435-08:00 WARNING kernel: [   57.650289]  [<ffffffff88f713ab>] ? getname_flags+0x3d/0x194
2017-03-07T21:05:11.828436-08:00 WARNING kernel: [   57.650293]  [<ffffffff88f714e7>] ? getname_flags+0x179/0x194
2017-03-07T21:05:11.828437-08:00 WARNING kernel: [   57.650296]  [<ffffffff88f71829>] user_path_at_empty+0x37/0x3d
2017-03-07T21:05:11.828438-08:00 WARNING kernel: [   57.650299]  [<ffffffff88f71829>] ? user_path_at_empty+0x37/0x3d
2017-03-07T21:05:11.828439-08:00 WARNING kernel: [   57.650303]  [<ffffffff88f68db1>] vfs_fstatat+0x60/0xaf
2017-03-07T21:05:11.828440-08:00 WARNING kernel: [   57.650306]  [<ffffffff88f68edd>] vfs_lstat+0x1e/0x20
2017-03-07T21:05:11.828441-08:00 WARNING kernel: [   57.650310]  [<ffffffff88f68f54>] SYSC_newlstat+0x24/0x51
2017-03-07T21:05:11.828442-08:00 WARNING kernel: [   57.650315]  [<ffffffff88f405ea>] ? __might_fault+0x35/0x37
2017-03-07T21:05:11.828443-08:00 WARNING kernel: [   57.650318]  [<ffffffff88f74aa1>] ? SyS_getdents+0xeb/0x117
2017-03-07T21:05:11.828444-08:00 WARNING kernel: [   57.650322]  [<ffffffff88f74803>] ? iterate_dir+0x115/0x115
2017-03-07T21:05:11.828444-08:00 WARNING kernel: [   57.650325]  [<ffffffff88f69020>] SyS_newlstat+0xe/0x10
2017-03-07T21:05:11.828445-08:00 WARNING kernel: [   57.650331]  [<ffffffff894ff1a1>] entry_SYSCALL_64_fastpath+0x1c/0x74
2017-03-07T21:05:11.978495-08:00 ERR cryptohomed[2693]: Fatal decryption error, but unable to remove cryptohome.

 
The recently merged https://chromium-review.googlesource.com/c/440747/ didn't seem to address this issue, as it is already present in 9347.0.0.
I tried to reproduce this with ToT (Platform 9348) reef, but I couldn't.
After a number of login attempts, cryptohome successfully recreated the homedir.

The pasted log contains "Fatal decryption error, but unable to remove cryptohome" which can be seen only when HomeDirs::Remove() fails, which means DeleteFile() failed for /home/.shadow/<user hash>, /home/user/<user hash>, or /home/root/<user hash>.
Please note that DeleteFile() fails only when the specified path exists and cannot be deleted; it returns true even when the specified path doesn't exist.
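
For illustration only, the DeleteFile() semantics described above can be sketched in plain C. This is not the cryptohome implementation: delete_tree() and remove_entry() are hypothetical names, and the sketch assumes a recursive delete built on POSIX nftw(). The point it shows is that a missing path is reported as success, while any other failure (including a failure deep inside the tree) is reported as an error.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* needed for nftw() and lstat() */
#endif
#include <errno.h>
#include <ftw.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>

/* nftw() callback: remove one entry (a file or an already-emptied directory).
 * A non-zero return value stops the walk and is propagated by nftw(). */
static int remove_entry(const char *path, const struct stat *sb,
                        int typeflag, struct FTW *ftwbuf)
{
    (void)sb;
    (void)typeflag;
    (void)ftwbuf;
    return remove(path);
}

/* Hypothetical stand-in for a recursive DeleteFile(path).
 * Returns true if the path is gone afterwards, including the case
 * where it never existed; returns false if anything could not be removed. */
static bool delete_tree(const char *path)
{
    struct stat st;

    if (lstat(path, &st) != 0)
        return errno == ENOENT;   /* a missing path counts as success */

    /* FTW_DEPTH: visit children before their parent directory.
     * FTW_PHYS: do not follow symbolic links. */
    return nftw(path, remove_entry, 16, FTW_DEPTH | FTW_PHYS) == 0;
}
```

With these semantics, a false return from the delete step means that something that exists on disk could not be removed, which is why the failing directory is the useful thing to identify.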

Could you provide more information about your setup?
For example:
- Device name
- Whether removing "--direncryption" from /etc/init/cryptohomed.conf to use eCryptfs affects the result or not
- Which directory is causing the HomeDirs::Remove() failure
Re #2:

1) Device name: pyro

2) After removing --direncryption I don't see that error. It also fails, but with:
2017-03-08T10:57:57.175637-08:00 ERR cryptohomed[2614]: TPM public key hash mismatch.
2017-03-08T10:57:57.176600-08:00 ERR cryptohomed[2614]: Failed to decrypt any keysets for f5bda69cdfc94659061f6e8bd36d8dd5be83b158
2017-03-08T10:57:57.177330-08:00 ERR cryptohomed[2614]: Error, cryptohome must be re-created because of fatal error.
2017-03-08T10:57:57.296060-08:00 ERR cryptohomed[2614]: Asked to mount nonexistent user

3) I instrumented platform::DeleteFile. It gave me
Platform::DeleteFile failed for /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158

I believe that's the very first thing HomeDirs::Remove() does, so removal of the other 2 dirs was not attempted.
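
A rough sketch of that short-circuit behavior, assuming the three directories are removed in the order listed earlier in this thread and that the first failure aborts the rest. The function remove_home_dirs(), the delete_fn callback type, and the two stub callbacks are hypothetical and only illustrate the control flow; they are not taken from the cryptohome sources.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical callback type standing in for a DeleteFile()-like helper. */
typedef bool (*delete_fn)(const char *path);

/* Illustrative stubs: one that always fails and one that always succeeds. */
static bool always_fail(const char *path) { (void)path; return false; }
static bool always_ok(const char *path)   { (void)path; return true; }

/* Tries to remove the three per-user directories in order and stops at the
 * first failure. Returns how many paths were attempted; *ok receives the
 * overall result. */
static int remove_home_dirs(const char *user_hash, delete_fn del, bool *ok)
{
    static const char *const bases[] = {
        "/home/.shadow", "/home/user", "/home/root"
    };
    char path[256];
    int attempted = 0;

    *ok = true;
    for (size_t i = 0; i < sizeof(bases) / sizeof(bases[0]); i++) {
        snprintf(path, sizeof(path), "%s/%s", bases[i], user_hash);
        attempted++;
        if (!del(path)) {
            *ok = false;   /* first failure: later directories are skipped */
            break;
        }
    }
    return attempted;
}
```

Under these assumptions, a failure on the /home/.shadow entry means only one path is ever attempted, which matches seeing a single instrumented failure message.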

I checked, the directory indeed exists and is not empty:
# ls -ld /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/
drwx------. 3 root root 4096 Mar  8 11:08 /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/
# ls -l /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/
total 8
drwx------. 4 root root 4096 Mar  8 11:06 mount
# ls -l /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/mount/
total 16
drwxrwx--T.  6 root    daemon-store   4096 Mar  8 11:06 ZxEePmWaJ9SL1eH84RaOoB
drwx--x---. 25 chronos chronos-access 4096 Mar  8 11:08 jIj,z4Q2OvgglqzDsUrt5C

Calling 'rm -rf' on it produces the same 'BUG: sleeping function' in the logs and fails:
# rm -rf /home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/
rm: cannot remove '/home/.shadow/f5bda69cdfc94659061f6e8bd36d8dd5be83b158/mount/jIj,z4Q2OvgglqzDsUrt5C/7jdpDECSCNghzor8WKU4AC/dfj2PS6Eiv5w+2vryqyLpaatmKA8lwcLJ7X,6qlrfNB/3hyR+DlmAut1F0ZCcTTPZD/_eyde5wQ,XQxApp,2S7uyyGRD7mvesR4+': Structure needs cleaning

To better define the test I'm running, here are all the steps:
1) crossystem clear_tpm_owner_request=1; reboot
2) Go through OOBE, create a user w/o doing corp enrollment.
3) Sign out.
4) rm /home/.shadow/cryptohome.key*
5) reboot
6) Attempt to sign in as that user.
I believe restarting cryptohomed at step 5 is also sufficient, I'm going through reboot just to make a cleaner test and avoid potential side effects.
Other things that can be different: I build kernel for my test image from sources, but I'm on master branch w/o any changes there. I'll try sync-ing to ToT again and going with the pre-built kernel.
With the change from #5 (repo sync to ToT at ~9349.0 + cros_workon stop kernel-4_4), I don't see the error anymore.

The 'BUG: sleeping function called from invalid context' is still there, but only one instead of two, and it doesn't lead to 'unable to remove cryptohome', and after several retries login succeeds:

2017-03-08T12:22:42.546968-08:00 ERR cryptohomed[2563]: Error, cryptohome must be re-created because of fatal error.
2017-03-08T12:22:42.549082-08:00 ERR kernel: [   57.585294] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/fs/dcache.c:754
2017-03-08T12:22:42.549104-08:00 ERR kernel: [   57.585303] in_atomic(): 0, irqs_disabled(): 0, pid: 2891, name: MountThread
2017-03-08T12:22:42.549106-08:00 WARNING kernel: [   57.585309] CPU: 2 PID: 2891 Comm: MountThread Tainted: G     U          4.4.44 #3
2017-03-08T12:22:42.549107-08:00 WARNING kernel: [   57.585312] Hardware name: Google Pyro/Pyro, BIOS Google_Pyro.9042.52.0 02/17/2017
2017-03-08T12:22:42.549108-08:00 WARNING kernel: [   57.585315]  0000000000000286 0000000051d3e667 ffff88006dccfbe0 ffffffffa4492765
2017-03-08T12:22:42.549109-08:00 WARNING kernel: [   57.585322]  0000000000000b4b ffff880067ba0e00 ffff88006dccfc00 ffffffffa428b52d
2017-03-08T12:22:42.549110-08:00 WARNING kernel: [   57.585329]  ffffffffa4bc00f8 00000000000002f2 ffff88006dccfc28 ffffffffa428b5d1
2017-03-08T12:22:42.549111-08:00 WARNING kernel: [   57.585335] Call Trace:
2017-03-08T12:22:42.549113-08:00 WARNING kernel: [   57.585346]  [<ffffffffa4492765>] dump_stack+0x4d/0x63
2017-03-08T12:22:42.549114-08:00 WARNING kernel: [   57.585352]  [<ffffffffa428b52d>] ___might_sleep+0x149/0x14e
2017-03-08T12:22:42.549115-08:00 WARNING kernel: [   57.585356]  [<ffffffffa428b5d1>] __might_sleep+0x9f/0xa6
2017-03-08T12:22:42.549116-08:00 WARNING kernel: [   57.585362]  [<ffffffffa4377574>] dput+0x2f/0x206
2017-03-08T12:22:42.549117-08:00 WARNING kernel: [   57.585367]  [<ffffffffa44094f9>] ext4_d_revalidate+0x6e/0x97
2017-03-08T12:22:42.549118-08:00 WARNING kernel: [   57.585370]  [<ffffffffa436f25c>] lookup_fast+0xb5/0x296
2017-03-08T12:22:42.549119-08:00 WARNING kernel: [   57.585374]  [<ffffffffa437037e>] path_openat+0x2b0/0xc56
2017-03-08T12:22:42.549120-08:00 WARNING kernel: [   57.585378]  [<ffffffffa440a020>] ? ext4_free_crypt_info+0x3b/0x3e
2017-03-08T12:22:42.549120-08:00 WARNING kernel: [   57.585381]  [<ffffffffa440a431>] ? _ext4_get_encryption_info+0x3d7/0x416
2017-03-08T12:22:42.549121-08:00 WARNING kernel: [   57.585385]  [<ffffffffa4372a99>] do_filp_open+0x5c/0xc6
2017-03-08T12:22:42.549122-08:00 WARNING kernel: [   57.585391]  [<ffffffffa4356bf7>] ? slab_pre_alloc_hook+0x29/0x2f
2017-03-08T12:22:42.549123-08:00 WARNING kernel: [   57.585396]  [<ffffffffa48fedac>] ? _raw_spin_unlock+0xe/0x20
2017-03-08T12:22:42.549124-08:00 WARNING kernel: [   57.585401]  [<ffffffffa4363cb1>] do_sys_open+0x86/0x198
2017-03-08T12:22:42.549125-08:00 WARNING kernel: [   57.585405]  [<ffffffffa4363cb1>] ? do_sys_open+0x86/0x198
2017-03-08T12:22:42.549126-08:00 WARNING kernel: [   57.585409]  [<ffffffffa4363de1>] SyS_open+0x1e/0x20
2017-03-08T12:22:42.549126-08:00 WARNING kernel: [   57.585413]  [<ffffffffa48ff1a1>] entry_SYSCALL_64_fastpath+0x1c/0x74
2017-03-08T12:22:42.621799-08:00 INFO sshd[3581]: Did not receive identification string from 127.0.0.1 port 60618
2017-03-08T12:22:42.689482-08:00 ERR cryptohomed[2563]: Asked to mount nonexistent user

Going through 'crossystem clear_tpm_owner_request=1' doesn't change anything - still no error from the bug description.

Were there any kernel changes relevant for ext4 dircrypt support recently that could be the reason? Or was it something about the specifics of the mounted directory structure?
Owner: gwendal@chromium.org
Status: Assigned (was: Untriaged)
Same stack as:

https://buganizer.corp.google.com/issues/35775060
Looks like those "unable to remove cryptohome" errors still happen, and are just more rare, or depend on the contents of the mounted dir.

While experimenting with simulated transient TPM errors, I bumped into it again. After enough errors happen in a row, Chrome triggers re-creation of the user account because of too many failed login attempts. After that, those errors are seen. I will send a feedback report, but here's an excerpt from the log:

2017-03-09T14:57:57.781647-08:00 ERR kernel: [ 4197.637623] BUG: sleeping function called from invalid context at /mnt/host/source/src/third_party/kernel/v4.4/fs/dcache.c:754
2017-03-09T14:57:57.781691-08:00 ERR kernel: [ 4197.637641] in_atomic(): 0, irqs_disabled(): 0, pid: 25141, name: MountThread
2017-03-09T14:57:57.781695-08:00 WARNING kernel: [ 4197.637652] CPU: 3 PID: 25141 Comm: MountThread Tainted: G     U          4.4.44 #3
2017-03-09T14:57:57.781697-08:00 WARNING kernel: [ 4197.637658] Hardware name: Google Pyro/Pyro, BIOS Google_Pyro.9042.52.0 02/17/2017
2017-03-09T14:57:57.781699-08:00 WARNING kernel: [ 4197.637666]  0000000000000286 00000000e7f5b3b4 ffff880063bbbb78 ffffffff8ca92765
2017-03-09T14:57:57.781702-08:00 WARNING kernel: [ 4197.637682]  0000000000006235 ffff880064d5d400 ffff880063bbbb98 ffffffff8c88b52d
2017-03-09T14:57:57.781704-08:00 WARNING kernel: [ 4197.637697]  ffffffff8d1c00f8 00000000000002f2 ffff880063bbbbc0 ffffffff8c88b5d1
2017-03-09T14:57:57.781707-08:00 WARNING kernel: [ 4197.637711] Call Trace:
2017-03-09T14:57:57.781746-08:00 WARNING kernel: [ 4197.637730]  [<ffffffff8ca92765>] dump_stack+0x4d/0x63
2017-03-09T14:57:57.781751-08:00 WARNING kernel: [ 4197.637742]  [<ffffffff8c88b52d>] ___might_sleep+0x149/0x14e
2017-03-09T14:57:57.781753-08:00 WARNING kernel: [ 4197.637751]  [<ffffffff8c88b5d1>] __might_sleep+0x9f/0xa6
2017-03-09T14:57:57.781755-08:00 WARNING kernel: [ 4197.637763]  [<ffffffff8c977574>] dput+0x2f/0x206
2017-03-09T14:57:57.781758-08:00 WARNING kernel: [ 4197.637774]  [<ffffffff8ca094f9>] ext4_d_revalidate+0x6e/0x97
2017-03-09T14:57:57.781760-08:00 WARNING kernel: [ 4197.637783]  [<ffffffff8c96f25c>] lookup_fast+0xb5/0x296
2017-03-09T14:57:57.781762-08:00 WARNING kernel: [ 4197.637791]  [<ffffffff8c96f813>] walk_component+0x5f/0x179
2017-03-09T14:57:57.781765-08:00 WARNING kernel: [ 4197.637800]  [<ffffffff8c96fe8e>] path_lookupat+0x82/0x103
2017-03-09T14:57:57.781768-08:00 WARNING kernel: [ 4197.637808]  [<ffffffff8c9716ca>] filename_lookup+0x8c/0x11d
2017-03-09T14:57:57.781770-08:00 WARNING kernel: [ 4197.637821]  [<ffffffff8c956bf7>] ? slab_pre_alloc_hook+0x29/0x2f
2017-03-09T14:57:57.781773-08:00 WARNING kernel: [ 4197.637831]  [<ffffffff8c95884d>] ? kmem_cache_alloc+0x24/0x123
2017-03-09T14:57:57.781775-08:00 WARNING kernel: [ 4197.637839]  [<ffffffff8c9713ab>] ? getname_flags+0x3d/0x194
2017-03-09T14:57:57.781777-08:00 WARNING kernel: [ 4197.637847]  [<ffffffff8c9714e7>] ? getname_flags+0x179/0x194
2017-03-09T14:57:57.781779-08:00 WARNING kernel: [ 4197.637856]  [<ffffffff8c971829>] user_path_at_empty+0x37/0x3d
2017-03-09T14:57:57.781782-08:00 WARNING kernel: [ 4197.637864]  [<ffffffff8c971829>] ? user_path_at_empty+0x37/0x3d
2017-03-09T14:57:57.781784-08:00 WARNING kernel: [ 4197.637873]  [<ffffffff8c968db1>] vfs_fstatat+0x60/0xaf
2017-03-09T14:57:57.781787-08:00 WARNING kernel: [ 4197.637882]  [<ffffffff8c968edd>] vfs_lstat+0x1e/0x20
2017-03-09T14:57:57.781790-08:00 WARNING kernel: [ 4197.637890]  [<ffffffff8c968f54>] SYSC_newlstat+0x24/0x51
2017-03-09T14:57:57.781793-08:00 WARNING kernel: [ 4197.637901]  [<ffffffff8c969020>] SyS_newlstat+0xe/0x10
2017-03-09T14:57:57.781795-08:00 WARNING kernel: [ 4197.637913]  [<ffffffff8ceff1a1>] entry_SYSCALL_64_fastpath+0x1c/0x74
2017-03-09T14:57:57.780149-08:00 WARNING cryptohomed[25132]: No valid keysets on disk for 35cf97bbf8ac5940b925b6d6c9fcd703ebc17561
2017-03-09T14:57:57.780214-08:00 ERR cryptohomed[25132]: Failed to decrypt any keysets for 35cf97bbf8ac5940b925b6d6c9fcd703ebc17561
2017-03-09T14:57:57.780255-08:00 ERR cryptohomed[25132]: Error, cryptohome must be re-created because of fatal error.
2017-03-09T14:57:57.889569-08:00 ERR cryptohomed[25132]: Fatal decryption error, but unable to remove cryptohome.

Cc: edjee@google.com tytso@google.com
More pointers in https://buganizer.corp.google.com/issues/35775060#comment12

I can repro the issue with xfstests generic/068.
The test passes on 4.4.44, but dmesg shows the same stack above.

Looking at the disassembly, we enter lookup_fast with the LOOKUP_RCU flag set.

One easy fix is to remove might_sleep() in dput, but there is a code section that does sleep (when the dentry must be destroyed).

Attachment: dmesg.gz (7.2 KB)
We are missing a patch: 
commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1

    ext4/fscrypto: avoid RCU lookup in d_revalidate

Re #10: In my experiments, "BUG: sleeping function called from invalid context" disappeared after applying this patch.
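
As a userspace illustration of the idea behind that patch (not the kernel code itself), the sketch below mocks a d_revalidate-style callback that refuses to run in RCU-walk mode. The names mock_d_revalidate() and mock_lookup() are hypothetical, and the LOOKUP_RCU value is defined locally for the example. The assumption being modeled is that returning -ECHILD when LOOKUP_RCU is set makes the caller retry the lookup in ref-walk mode, where calls that may sleep (such as dput()) are allowed.

```c
#include <errno.h>

/* Defined locally for this mock; in the kernel the flag comes from
 * include/linux/namei.h. */
#define LOOKUP_RCU 0x0040

/* Mock of a d_revalidate-style callback.
 * In RCU-walk mode it bails out with -ECHILD before touching anything
 * that could sleep. *slept records whether the "may sleep" section ran. */
static int mock_d_revalidate(unsigned int flags, int *slept)
{
    if (flags & LOOKUP_RCU)
        return -ECHILD;

    *slept = 1;   /* stands in for dget_parent()/dput(), which may sleep */
    return 1;     /* dentry is still valid */
}

/* Mock caller: try the fast RCU-walk first and fall back to ref-walk
 * when the callback asks for it with -ECHILD. */
static int mock_lookup(int *slept)
{
    int ret = mock_d_revalidate(LOOKUP_RCU, slept);

    if (ret == -ECHILD)
        ret = mock_d_revalidate(0, slept);
    return ret;
}
```

The design choice here is to keep the might_sleep() check in dput() intact and instead avoid reaching it from a context where sleeping is not allowed.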
Project Member

Comment 12 by bugdroid1@chromium.org, Mar 13 2017

Labels: merge-merged-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf

commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Mon Mar 13 07:49:23 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>

[modify] https://crrev.com/3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf/fs/ext4/crypto.c

Comment 13 by tfiga@chromium.org, Mar 13 2017

I could reproduce it reliably by a simple "stop ui" over SSH on a reef, which would trigger the crash and also kill the SSH connection. I can confirm that it disappeared after compiling ToT kernel. I guess next build should be fine.
Cc: uekawa@chromium.org
Issue 700789 has been merged into this issue.
Project Member

Comment 15 by bugdroid1@chromium.org, Mar 13 2017

Labels: merge-merged-release-R58-9334.B-chromeos-4.4
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/24d44369cf5a6ab6d694f4eaa276a922f8c182e0

commit 24d44369cf5a6ab6d694f4eaa276a922f8c182e0
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Mon Mar 13 16:55:43 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf)
Reviewed-on: https://chromium-review.googlesource.com/453406

[modify] https://crrev.com/24d44369cf5a6ab6d694f4eaa276a922f8c182e0/fs/ext4/crypto.c

Comment 16 by edjee@google.com, Mar 20 2017

Has the change "ext4/fscrypto: avoid RCU lookup in d_revalidate" been cherrypicked to R57 as well?

Comment 17 by gwendal@google.com, Mar 20 2017

Not needed, ext4 crypto is not enabled in R57.
Project Member

Comment 18 by bugdroid1@chromium.org, Mar 25 2017

Labels: merge-merged-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/b1e94c7e2bfd75938a3e4ae5eb5d1df17412bb23

commit b1e94c7e2bfd75938a3e4ae5eb5d1df17412bb23
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Sat Mar 25 02:38:17 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf)
Reviewed-on: https://chromium-review.googlesource.com/453404

[modify] https://crrev.com/b1e94c7e2bfd75938a3e4ae5eb5d1df17412bb23/fs/ext4/crypto.c

Project Member

Comment 20 by bugdroid1@chromium.org, Mar 25 2017

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/7e9ffa19f1c8b5fe45611bd9135b9cc88bf50a9f

commit 7e9ffa19f1c8b5fe45611bd9135b9cc88bf50a9f
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Sat Mar 25 02:38:09 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf)
Reviewed-on: https://chromium-review.googlesource.com/453405

[modify] https://crrev.com/7e9ffa19f1c8b5fe45611bd9135b9cc88bf50a9f/fs/ext4/crypto.c

Project Member

Comment 21 by bugdroid1@chromium.org, Mar 27 2017

Labels: merge-merged-release-R58-9334.B-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/256445c7a188e29220d22be854711e216f7af1e0

commit 256445c7a188e29220d22be854711e216f7af1e0
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Mon Mar 27 18:12:43 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf)
Reviewed-on: https://chromium-review.googlesource.com/453405
(cherry picked from commit 7e9ffa19f1c8b5fe45611bd9135b9cc88bf50a9f)
Reviewed-on: https://chromium-review.googlesource.com/461220

[modify] https://crrev.com/256445c7a188e29220d22be854711e216f7af1e0/fs/ext4/crypto.c

Project Member

Comment 22 by bugdroid1@chromium.org, Mar 27 2017

Labels: merge-merged-release-R58-9334.B-chromeos-3.18
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/456788eef795544bffdf8dca7040bcb22112fcd3

commit 456788eef795544bffdf8dca7040bcb22112fcd3
Author: Jaegeuk Kim <jaegeuk@kernel.org>
Date: Mon Mar 27 18:15:12 2017

BACKPORT: ext4/fscrypto: avoid RCU lookup in d_revalidate

As Al pointed, d_revalidate should return RCU lookup before using d_inode.
This was originally introduced by:
commit 34286d666230 ("fs: rcu-walk aware d_revalidate method").

Reported-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Jaegeuk Kim <jaegeuk@kernel.org>
Cc: Theodore Ts'o <tytso@mit.edu>
Cc: stable <stable@vger.kernel.org>
(cherry picked from commit 03a8bb0e53d9562276045bdfcf2b5de2e4cff5a1)

BUG= chromium:699389 
TEST=Check on 4.4.44 that with this patch, xfstests generic/068 passes.
Check that recreating user directory does not trigger error.

Change-Id: I349a25ec172297588e0f9f746289999db3f1e303
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/452705
Reviewed-by: Guenter Roeck <groeck@chromium.org>
(cherry picked from commit 3f4b2b78bb4a5191008519d69c7c1d81de6fcfcf)
Reviewed-on: https://chromium-review.googlesource.com/453404
(cherry picked from commit b1e94c7e2bfd75938a3e4ae5eb5d1df17412bb23)
Reviewed-on: https://chromium-review.googlesource.com/461221

[modify] https://crrev.com/456788eef795544bffdf8dca7040bcb22112fcd3/fs/ext4/crypto.c

Status: Fixed (was: Assigned)

Comment 24 by dchan@google.com, May 30 2017

Labels: VerifyIn-60
Labels: VerifyIn-61

Comment 26 by dchan@chromium.org, Jan 22 2018

Status: Archived (was: Fixed)
