New issue
Advanced search Search tips
Note: Color blocks (like or ) mean that a user may not be available. Tooltip shows the reason.

Issue 613758 link

Starred by 2 users

Issue metadata

Status: Verified
Owner:
Closed: Jul 2016
Cc:
Components:
EstimatedDays: ----
NextAction: ----
OS: Chrome
Pri: 2
Type: Bug

Blocking:
issue 347322



Sign in to add a comment

ext4 : crash in xfstest generic/038 - 036

Project Member Reported by gwendal@chromium.org, May 20 2016

Issue description

From the test:
# Stress btrfs' block group allocation and deallocation while running fstrim in
# parallel. Part of the goal is also to get data block groups deallocated so
# that new metadata block groups, using the same physical device space ranges,
# get allocated while fstrim is running. This caused several issues ranging
# from invalid memory accesses, kernel crashes, metadata or data corruption,
# free space cache inconsistencies, free space leaks and memory leaks.



We crash in  ext4_ext_direct_IO(),  BUG_ON(iocb->private == NULL);


 
console-ramoops
127 KB View Download
Summary: ext4 : crash in xfstest generic/038 - 036 (was: ext4 crypto: crash in xfstest generic/038)
Not limited to ext4-crypto. Plain 3.15 kernel fails at the same place.

We failed in test 036 though, not 038.
Stack is similar.

console-ramoops
127 KB View Download
Labels: Arch-x86_64 Kernel-3.14
Reproducing the crash is easy:
On 3.14:

/usr/local/xfs-test/src/aio-dio-regress/aio-dio-fcntl-race \ /mnt/stateful_partition/unencrypted/aio-testfile

It does not happen on 3.18.


Cc: tytso@google.com

Comment 4 by tytso@google.com, May 31 2016

Is this with any of the 3.14.y bugfix patches?  If ChromeOS is using plain 3.14, you're probably missing out on a lot of security and stability patches....  :-)

Comment 5 by tytso@google.com, May 31 2016

You probably want commit 07110343605, introduced in 3.14.33:

commit 07110343605adc3f8ddfc0dde38d29d2e0e210a2
Author: Dmitry Monakhov <dmonakhov@openvz.org>
Date:   Thu Oct 30 10:53:16 2014 -0400

    ext4: prevent bugon on race between write/fcntl
    
    commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.
    
    O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
    twice inside ext4_file_write_iter() and __generic_file_write() which
    result in BUG_ON inside ext4_direct_IO.
    
    Let's initialize iocb->private unconditionally.
    
    TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/
    
    #TYPICAL STACK TRACE:
    kernel BUG at fs/ext4/inode.c:2960!
    invalid opcode: 0000 [#1] SMP
    Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
    CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
    Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
    task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
    RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
    RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
    RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
    RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
    RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
    R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
    FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
    Stack:
     ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
     0000000000000200 0000000000000200 0000000000000001 0000000000000200
     ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
    Call Trace:
     [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
     [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
     [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
     [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
     [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
     [<ffffffff810990e5>] ? local_clock+0x25/0x30
     [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
     [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
     [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
     [<ffffffff810990e5>] ? local_clock+0x25/0x30
     [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
     [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
     [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
     [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
     [<ffffffff811be030>] SyS_io_submit+0x10/0x20
     [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
    Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
    24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
    RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
     RSP <ffff88080f90bb58>
    
    Reported-by: Sasha Levin <sasha.levin@oracle.com>
    Signed-off-by: Theodore Ts'o <tytso@mit.edu>
    Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
    [hujianyang: Backported to 3.10
     - Move initialization of iocb->private to ext4_file_write() as we don't
       have ext4_file_write_iter(), which is introduced by commit 9b884164.
     - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
       as ext4_file_dio_write() is not move into ext4_file_write()]
    Signed-off-by: hujianyang <hujianyang@huawei.com>
    Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


The patch addresses the issue.
Owner: gwendal@chromium.org
Status: Started (was: Untriaged)
Project Member

Comment 8 by bugdroid1@chromium.org, Jul 2 2016

Labels: merge-merged-chromeos-3.14
The following revision refers to this bug:
  https://chromium.googlesource.com/chromiumos/third_party/kernel/+/31747c2a88c898ade986d7b724351d38180d464c

commit 31747c2a88c898ade986d7b724351d38180d464c
Author: Dmitry Monakhov <dmonakhov@openvz.org>
Date: Thu Oct 30 14:53:16 2014

STABLE: ext4: prevent bugon on race between write/fcntl

commit a41537e69b4aa43f0fea02498c2595a81267383b upstream.

O_DIRECT flags can be toggeled via fcntl(F_SETFL). But this value checked
twice inside ext4_file_write_iter() and __generic_file_write() which
result in BUG_ON inside ext4_direct_IO.

Let's initialize iocb->private unconditionally.

TESTCASE: xfstest:generic/036  https://patchwork.ozlabs.org/patch/402445/

kernel BUG at fs/ext4/inode.c:2960!
invalid opcode: 0000 [#1] SMP
Modules linked in: brd iTCO_wdt lpc_ich mfd_core igb ptp dm_mirror dm_region_hash dm_log dm_mod
CPU: 6 PID: 5505 Comm: aio-dio-fcntl-r Not tainted 3.17.0-rc2-00176-gff5c017 #161
Hardware name: Intel Corporation W2600CR/W2600CR, BIOS SE5C600.86B.99.99.x028.061320111235 06/13/2011
task: ffff88080e95a7c0 ti: ffff88080f908000 task.ti: ffff88080f908000
RIP: 0010:[<ffffffff811fabf2>]  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
RSP: 0018:ffff88080f90bb58  EFLAGS: 00010246
RAX: 0000000000000400 RBX: ffff88080fdb2a28 RCX: 00000000a802c818
RDX: 0000040000080000 RSI: ffff88080d8aeb80 RDI: 0000000000000001
RBP: ffff88080f90bbc8 R08: 0000000000000000 R09: 0000000000001581
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88080d8aeb80
R13: ffff88080f90bbf8 R14: ffff88080fdb28c8 R15: ffff88080fdb2a28
FS:  00007f23b2055700(0000) GS:ffff880818400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f23b2045000 CR3: 000000080cedf000 CR4: 00000000000407e0
Stack:
 ffff88080f90bb98 0000000000000000 7ffffffffffffffe ffff88080fdb2c30
 0000000000000200 0000000000000200 0000000000000001 0000000000000200
 ffff88080f90bbc8 ffff88080fdb2c30 ffff88080f90be08 0000000000000200
Call Trace:
 [<ffffffff8112ca9d>] generic_file_direct_write+0xed/0x180
 [<ffffffff8112f2b2>] __generic_file_write_iter+0x222/0x370
 [<ffffffff811f495b>] ext4_file_write_iter+0x34b/0x400
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff811bd709>] ? aio_run_iocb+0x239/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810abd94>] ? __lock_acquire+0x274/0x700
 [<ffffffff811f4610>] ? ext4_unwritten_wait+0xb0/0xb0
 [<ffffffff811bd756>] aio_run_iocb+0x286/0x410
 [<ffffffff810990e5>] ? local_clock+0x25/0x30
 [<ffffffff810ac359>] ? lock_release_holdtime+0x29/0x190
 [<ffffffff811bc05b>] ? lookup_ioctx+0x4b/0xf0
 [<ffffffff811bde3b>] do_io_submit+0x55b/0x740
 [<ffffffff811bdcaa>] ? do_io_submit+0x3ca/0x740
 [<ffffffff811be030>] SyS_io_submit+0x10/0x20
 [<ffffffff815ce192>] system_call_fastpath+0x16/0x1b
Code: 01 48 8b 80 f0 01 00 00 48 8b 18 49 8b 45 10 0f 85 f1 01 00 00 48 03 45 c8 48 3b 43 48 0f 8f e3 01 00 00 49 83 7c
24 18 00 75 04 <0f> 0b eb fe f0 ff 83 ec 01 00 00 49 8b 44 24 18 8b 00 85 c0 89
RIP  [<ffffffff811fabf2>] ext4_direct_IO+0x162/0x3d0
 RSP <ffff88080f90bb58>

Reported-by: Sasha Levin <sasha.levin@oracle.com>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
Signed-off-by: Dmitry Monakhov <dmonakhov@openvz.org>
[hujianyang: Backported to 3.10
 - Move initialization of iocb->private to ext4_file_write() as we don't
   have ext4_file_write_iter(), which is introduced by commit 9b884164.
 - Adjust context to make 'overwrite' changes apply to ext4_file_dio_write()
   as ext4_file_dio_write() is not move into ext4_file_write()]
Signed-off-by: hujianyang <hujianyang@huawei.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

(cherry picked from commit 07110343605adc3f8ddfc0dde38d29d2e0e210a2)

BUG= chromium:613758 

Change-Id: I0c8b3473dad5d5d6deb15a2788f1169b5eae0227
Signed-off-by: Gwendal Grignou <gwendal@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/349321
Reviewed-by: Sonny Rao <sonnyrao@chromium.org>

[modify] https://crrev.com/31747c2a88c898ade986d7b724351d38180d464c/fs/ext4/file.c

Status: Fixed (was: Started)
Labels: VerifyIn-54
Labels: Bulk-Verified
Status: Verified (was: Fixed)

Sign in to add a comment