1. 23 Oct, 2015 1 commit
  2. 22 Oct, 2015 1 commit
  3. 21 Oct, 2015 1 commit
    • Qu Wenruo's avatar
      btrfs: Avoid truncate tailing page if fallocate range doesn't exceed inode size · 0f6925fa
      Qu Wenruo authored
      
      Current code will always truncate tailing page if its alloc_start is
      smaller than inode size.
      
      For example, the file extent layout is like:
      0	4K	8K	16K	32K
      |<-----Extent A---------------->|
      |<--Inode size: 18K---------->|
      
      But if calling fallocate even for range [0,4K), it will cause btrfs to
      re-truncate the range [16,32K), causing COW and a new extent.
      
      0	4K	8K	16K	32K
      |///////|	<- Fallocate call range
      |<-----Extent A-------->|<--B-->|
      
      The cause is quite easy, just a careless btrfs_truncate_inode() in a
      else branch without extra judgment.
      Fix it by add judgment on whether the fallocate range is beyond isize.
      Signed-off-by: default avatarQu Wenruo <quwenruo@cn.fujitsu.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      0f6925fa
  4. 16 Oct, 2015 2 commits
    • Ross Zwisler's avatar
      mm, dax: fix DAX deadlocks · 0f90cc66
      Ross Zwisler authored
      The following two locking commits in the DAX code:
      
      commit 84317297 ("dax: fix race between simultaneous faults")
      commit 46c043ed ("mm: take i_mmap_lock in unmap_mapping_range() for DAX")
      
      introduced a number of deadlocks and other issues which need to be fixed
      for the v4.3 kernel.  The list of issues in DAX after these commits
      (some newly introduced by the commits, some preexisting) can be found
      here:
      
        https://lkml.org/lkml/2015/9/25/602
      
       (Subject: "Re: [PATCH] dax: fix deadlock in __dax_fault").
      
      This undoes most of the changes introduced by those two commits,
      essentially returning us to the DAX locking scheme that was used in
      v4.2.
      Signed-off-by: default avatarRoss Zwisler <ross.zwisler@linux.intel.com>
      Cc: Alexander Viro <viro@zeniv.linux.org.uk>
      Cc: Dan Williams <dan.j.williams@intel.com>
      Tested-by: default avatarDave Chinner <dchinner@redhat.com>
      Cc: Jan Kara <jack@suse.com>
      Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
      Cc: Matthew Wilcox <matthew.r.wilcox@intel.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      0f90cc66
    • Michal Hocko's avatar
      mm, fs: obey gfp_mapping for add_to_page_cache() · 063d99b4
      Michal Hocko authored
      Commit 6afdb859 ("mm: do not ignore mapping_gfp_mask in page cache
      allocation paths") has caught some users of hardcoded GFP_KERNEL used in
      the page cache allocation paths.  This, however, wasn't complete and
      there were others which went unnoticed.
      
      Dave Chinner has reported the following deadlock for xfs on loop device:
      : With the recent merge of the loop device changes, I'm now seeing
      : XFS deadlock on my single CPU, 1GB RAM VM running xfs/073.
      :
      : The deadlocked is as follows:
      :
      : kloopd1: loop_queue_read_work
      :       xfs_file_iter_read
      :       lock XFS inode XFS_IOLOCK_SHARED (on image file)
      :       page cache read (GFP_KERNEL)
      :       radix tree alloc
      :       memory reclaim
      :       reclaim XFS inodes
      :       log force to unpin inodes
      :       <wait for log IO completion>
      :
      : xfs-cil/loop1: <does log force IO work>
      :       xlog_cil_push
      :       xlog_write
      :       <loop issuing log writes>
      :               xlog_state_get_iclog_space()
      :               <blocks due to all log buffers under write io>
      :               <waits for IO completion>
      :
      : kloopd1: loop_queue_write_work
      :       xfs_file_write_iter
      :       lock XFS inode XFS_IOLOCK_EXCL (on image file)
      :       <wait for inode to be unlocked>
      :
      : i.e. the kloopd, with it's split read and write work queues, has
      : introduced a dependency through memory reclaim. i.e. that writes
      : need to be able to progress for reads make progress.
      :
      : The problem, fundamentally, is that mpage_readpages() does a
      : GFP_KERNEL allocation, rather than paying attention to the inode's
      : mapping gfp mask, which is set to GFP_NOFS.
      :
      : The didn't used to happen, because the loop device used to issue
      : reads through the splice path and that does:
      :
      :       error = add_to_page_cache_lru(page, mapping, index,
      :                       GFP_KERNEL & mapping_gfp_mask(mapping));
      
      This has changed by commit aa4d8616
      
       ("block: loop: switch to VFS
      ITER_BVEC").
      
      This patch changes mpage_readpage{s} to follow gfp mask set for the
      mapping.  There are, however, other places which are doing basically the
      same.
      
      lustre:ll_dir_filler is doing GFP_KERNEL from the function which
      apparently uses GFP_NOFS for other allocations so let's make this
      consistent.
      
      cifs:readpages_get_pages is called from cifs_readpages and
      __cifs_readpages_from_fscache called from the same path obeys mapping
      gfp.
      
      ramfs_nommu_expand_for_mapping is hardcoding GFP_KERNEL as well
      regardless it uses mapping_gfp_mask for the page allocation.
      
      ext4_mpage_readpages is the called from the page cache allocation path
      same as read_pages and read_cache_pages
      
      As I've noticed in my previous post I cannot say I would be happy about
      sprinkling mapping_gfp_mask all over the place and it sounds like we
      should drop gfp_mask argument altogether and use it internally in
      __add_to_page_cache_locked that would require all the filesystems to use
      mapping gfp consistently which I am not sure is the case here.  From a
      quick glance it seems that some file system use it all the time while
      others are selective.
      Signed-off-by: default avatarMichal Hocko <mhocko@suse.com>
      Reported-by: default avatarDave Chinner <david@fromorbit.com>
      Cc: "Theodore Ts'o" <tytso@mit.edu>
      Cc: Ming Lei <ming.lei@canonical.com>
      Cc: Andreas Dilger <andreas.dilger@intel.com>
      Cc: Oleg Drokin <oleg.drokin@intel.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: Christoph Hellwig <hch@lst.de>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      063d99b4
  5. 14 Oct, 2015 2 commits
    • Chris Mason's avatar
      btrfs: fix use after free iterating extrefs · dc6c5fb3
      Chris Mason authored
      
      The code for btrfs inode-resolve has never worked properly for
      files with enough hard links to trigger extrefs.  It was trying to
      get the leaf out of a path after freeing the path:
      
      	btrfs_release_path(path);
      	leaf = path->nodes[0];
      	item_size = btrfs_item_size_nr(leaf, slot);
      
      The fix here is to use the extent buffer we cloned just a little higher
      up to avoid deadlocks caused by using the leaf in the path.
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      cc: stable@vger.kernel.org # v3.7+
      cc: Mark Fasheh <mfasheh@suse.de>
      Reviewed-by: default avatarFilipe Manana <fdmanana@suse.com>
      Reviewed-by: default avatarMark Fasheh <mfasheh@suse.de>
      dc6c5fb3
    • David Sterba's avatar
      btrfs: check unsupported filters in balance arguments · 8eb93459
      David Sterba authored
      
      We don't verify that all the balance filter arguments supplemented by
      the flags are actually known to the kernel. Thus we let it silently pass
      and do nothing.
      
      At the moment this means only the 'limit' filter, but we're going to add
      a few more soon so it's better to have that fixed. Also in older stable
      kernels so that it works with newer userspace tools.
      
      Cc: stable@vger.kernel.org # 3.16+
      Signed-off-by: default avatarDavid Sterba <dsterba@suse.com>
      Signed-off-by: default avatarChris Mason <clm@fb.com>
      8eb93459
  6. 12 Oct, 2015 7 commits
    • Tejun Heo's avatar
      writeback: bdi_writeback iteration must not skip dying ones · b817525a
      Tejun Heo authored
      
      bdi_for_each_wb() is used in several places to wake up or issue
      writeback work items to all wb's (bdi_writeback's) on a given bdi.
      The iteration is performed by walking bdi->cgwb_tree; however, the
      tree only indexes wb's which are currently active.
      
      For example, when a memcg gets associated with a different blkcg, the
      old wb is removed from the tree so that the new one can be indexed.
      The old wb starts dying from then on but will linger till all its
      inodes are drained.  As these dying wb's may still host dirty inodes,
      writeback operations which affect all wb's must include them.
      bdi_for_each_wb() skipping dying wb's led to sync(2) missing and
      failing to sync the inodes belonging to those wb's.
      
      This patch adds a RCU protected @bdi->wb_list which lists all wb's
      beloinging to that bdi.  wb's are added on creation and removed on
      release rather than on the start of destruction.  bdi_for_each_wb()
      usages are replaced with list_for_each[_continue]_rcu() iterations
      over @bdi->wb_list and bdi_for_each_wb() and its helpers are removed.
      
      v2: Updated as per Jan.  last_wb ref leak in bdi_split_work_to_wbs()
          fixed and unnecessary list head severing in cgwb_bdi_destroy()
          removed.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-and-tested-by: default avatarArtem Bityutskiy <dedekind1@gmail.com>
      Fixes: ebe41ab0 ("writeback: implement bdi_for_each_wb()")
      Link: http://lkml.kernel.org/g/1443012552.19983.209.camel@gmail.com
      
      
      Cc: Jan Kara <jack@suse.cz>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      b817525a
    • Tejun Heo's avatar
      writeback: fix bdi_writeback iteration in wakeup_dirtytime_writeback() · 6fdf860f
      Tejun Heo authored
      
      wakeup_dirtytime_writeback() walks and wakes up all wb's of all bdi's;
      unfortunately, it was always waking up bdi->wb instead of the wb being
      walked.  Fix it.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Fixes: 001fe6f6
      
       ("writeback: make wakeup_dirtytime_writeback() handle multiple bdi_writeback's")
      Reviewed-by: default avatarJan Kara <jack@suse.com>
      Signed-off-by: default avatarJens Axboe <axboe@fb.com>
      6fdf860f
    • Konstantin Khlebnikov's avatar
      ovl: free lower_mnt array in ovl_put_super · 5ffdbe8b
      Konstantin Khlebnikov authored
      
      This fixes memory leak after umount.
      
      Kmemleak report:
      
      unreferenced object 0xffff8800ba791010 (size 8):
        comm "mount", pid 2394, jiffies 4294996294 (age 53.920s)
        hex dump (first 8 bytes):
          20 1c 13 02 00 88 ff ff                           .......
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa0152bfc>] ovl_fill_super+0x55c/0x9b0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa0152118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Fixes: dd662667 ("ovl: add mutli-layer infrastructure")
      Cc: <stable@vger.kernel.org> # v4.0+
      5ffdbe8b
    • Konstantin Khlebnikov's avatar
      ovl: free stack of paths in ovl_fill_super · 0f95502a
      Konstantin Khlebnikov authored
      
      This fixes small memory leak after mount.
      
      Kmemleak report:
      
      unreferenced object 0xffff88003683fe00 (size 16):
        comm "mount", pid 2029, jiffies 4294909563 (age 33.380s)
        hex dump (first 16 bytes):
          20 27 1f bb 00 88 ff ff 40 4b 0f 36 02 88 ff ff   '......@K.6....
        backtrace:
          [<ffffffff811f8cd4>] create_object+0x124/0x2c0
          [<ffffffff817a059b>] kmemleak_alloc+0x7b/0xc0
          [<ffffffff811dffe6>] __kmalloc+0x106/0x340
          [<ffffffffa01b7a29>] ovl_fill_super+0x389/0x9a0 [overlay]
          [<ffffffff81200ac4>] mount_nodev+0x54/0xa0
          [<ffffffffa01b7118>] ovl_mount+0x18/0x20 [overlay]
          [<ffffffff81201ab3>] mount_fs+0x43/0x170
          [<ffffffff81220d34>] vfs_kern_mount+0x74/0x170
          [<ffffffff812233ad>] do_mount+0x22d/0xdf0
          [<ffffffff812242cb>] SyS_mount+0x7b/0xc0
          [<ffffffff817b6bee>] entry_SYSCALL_64_fastpath+0x12/0x76
          [<ffffffffffffffff>] 0xffffffffffffffff
      Signed-off-by: default avatarKonstantin Khlebnikov <khlebnikov@yandex-team.ru>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Fixes: a78d9f0d ("ovl: support multiple lower layers")
      Cc: <stable@vger.kernel.org> # v4.0+
      0f95502a
    • Miklos Szeredi's avatar
      ovl: fix open in stacked overlay · 1c8a47df
      Miklos Szeredi authored
      
      If two overlayfs filesystems are stacked on top of each other, then we need
      recursion in ovl_d_select_inode().
      
      I guess d_backing_inode() is supposed to do that.  But currently it doesn't
      and that functionality is open coded in vfs_open().  This is now copied
      into ovl_d_select_inode() to fix this regression.
      Reported-by: default avatarAlban Crequy <alban.crequy@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Fixes: 4bacc9c9 ("overlayfs: Make f_path always point to the overlay...")
      Cc: David Howells <dhowells@redhat.com>
      Cc: <stable@vger.kernel.org> # v4.2+
      1c8a47df
    • David Howells's avatar
      ovl: fix dentry reference leak · ab79efab
      David Howells authored
      
      In ovl_copy_up_locked(), newdentry is leaked if the function exits through
      out_cleanup as this just to out after calling ovl_cleanup() - which doesn't
      actually release the ref on newdentry.
      
      The out_cleanup segment should instead exit through out2 as certainly
      newdentry leaks - and possibly upper does also, though this isn't caught
      given the catch of newdentry.
      
      Without this fix, something like the following is seen:
      
      	BUG: Dentry ffff880023e9eb20{i=f861,n=#ffff880023e82d90} still in use (1) [unmount of tmpfs tmpfs]
      	BUG: Dentry ffff880023ece640{i=0,n=bigfile}  still in use (1) [unmount of tmpfs tmpfs]
      
      when unmounting the upper layer after an error occurred in copyup.
      
      An error can be induced by creating a big file in a lower layer with
      something like:
      
      	dd if=/dev/zero of=/lower/a/bigfile bs=65536 count=1 seek=$((0xf000))
      
      to create a large file (4.1G).  Overlay an upper layer that is too small
      (on tmpfs might do) and then induce a copy up by opening it writably.
      Reported-by: default avatarUlrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # v3.18+
      ab79efab
    • David Howells's avatar
      ovl: use O_LARGEFILE in ovl_copy_up() · 0480334f
      David Howells authored
      
      Open the lower file with O_LARGEFILE in ovl_copy_up().
      
      Pass O_LARGEFILE unconditionally in ovl_copy_up_data() as it's purely for
      catching 32-bit userspace dealing with a file large enough that it'll be
      mishandled if the application isn't aware that there might be an integer
      overflow.  Inside the kernel, there shouldn't be any problems.
      Reported-by: default avatarUlrich Obergfell <uobergfe@redhat.com>
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarMiklos Szeredi <miklos@szeredi.hu>
      Cc: <stable@vger.kernel.org> # v3.18+
      0480334f
  7. 10 Oct, 2015 1 commit
    • Trond Myklebust's avatar
      namei: results of d_is_negative() should be checked after dentry revalidation · daf3761c
      Trond Myklebust authored
      Leandro Awa writes:
       "After switching to version 4.1.6, our parallelized and distributed
        workflows now fail consistently with errors of the form:
      
        T34: ./regex.c:39:22: error: config.h: No such file or directory
      
        From our 'git bisect' testing, the following commit appears to be the
        possible cause of the behavior we've been seeing: commit 766c4cbf"
      
      Al Viro says:
       "What happens is that 766c4cbf got the things subtly wrong.
      
        We used to treat d_is_negative() after lookup_fast() as "fall with
        ENOENT".  That was wrong - checking ->d_flags outside of ->d_seq
        protection is unreliable and failing with hard error on what should've
        fallen back to non-RCU pathname resolution is a bug.
      
        Unfortunately, we'd pulled the test too far up and ran afoul of
        another kind of staleness.  The dentry might have been absolutely
        stable from the RCU point of view (and we might be on UP, etc), but
        stale from the remote fs point of view.  If ->d_revalidate() returns
        "it's actually stale", dentry gets thrown away and the original code
        wouldn't even have looked at its ->d_flags.
      
        What we need is to check ->d_flags where 766c4cbf
      
       does (prior to
        ->d_seq validation) but only use the result in cases where we do not
        discard this dentry outright"
      Reported-by: default avatarLeandro Awa <lawa@nvidia.com>
      Link: https://bugzilla.kernel.org/show_bug.cgi?id=104911
      Fixes: 766c4cbf
      
       ("namei: d_is_negative() should be checked...")
      Tested-by: default avatarLeandro Awa <lawa@nvidia.com>
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      Acked-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      daf3761c
  8. 09 Oct, 2015 1 commit
  9. 06 Oct, 2015 3 commits
  10. 05 Oct, 2015 3 commits
    • Filipe Manana's avatar
      Btrfs: fix deadlock when finalizing block group creation · d9a0540a
      Filipe Manana authored
      
      Josef ran into a deadlock while a transaction handle was finalizing the
      creation of its block groups, which produced the following trace:
      
        [260445.593112] fio             D ffff88022a9df468     0  8924   4518 0x00000084
        [260445.593119]  ffff88022a9df468 ffffffff81c134c0 ffff880429693c00 ffff88022a9df488
        [260445.593126]  ffff88022a9e0000 ffff8803490d7b00 ffff8803490d7b18 ffff88022a9df4b0
        [260445.593132]  ffff8803490d7af8 ffff88022a9df488 ffffffff8175a437 ffff8803490d7b00
        [260445.593137] Call Trace:
        [260445.593145]  [<ffffffff8175a437>] schedule+0x37/0x80
        [260445.593189]  [<ffffffffa0850f37>] btrfs_tree_lock+0xa7/0x1f0 [btrfs]
        [260445.593197]  [<ffffffff810db7c0>] ? prepare_to_wait_event+0xf0/0xf0
        [260445.593225]  [<ffffffffa07eac44>] btrfs_lock_root_node+0x34/0x50 [btrfs]
        [260445.593253]  [<ffffffffa07eff6b>] btrfs_search_slot+0x88b/0xa00 [btrfs]
        [260445.593295]  [<ffffffffa08389df>] ? free_extent_buffer+0x4f/0x90 [btrfs]
        [260445.593324]  [<ffffffffa07f1a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs]
        [260445.593351]  [<ffffffffa07ea94a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
        [260445.593394]  [<ffffffffa08403b9>] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs]
        [260445.593427]  [<ffffffffa08002ab>] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs]
        [260445.593459]  [<ffffffffa0800964>] do_chunk_alloc+0x2a4/0x2e0 [btrfs]
        [260445.593491]  [<ffffffffa0803815>] find_free_extent+0xa55/0xd90 [btrfs]
        [260445.593524]  [<ffffffffa0803c22>] btrfs_reserve_extent+0xd2/0x220 [btrfs]
        [260445.593532]  [<ffffffff8119fe5d>] ? account_page_dirtied+0xdd/0x170
        [260445.593564]  [<ffffffffa0803e78>] btrfs_alloc_tree_block+0x108/0x4a0 [btrfs]
        [260445.593597]  [<ffffffffa080c9de>] ? btree_set_page_dirty+0xe/0x10 [btrfs]
        [260445.593626]  [<ffffffffa07eb5cd>] __btrfs_cow_block+0x12d/0x5b0 [btrfs]
        [260445.593654]  [<ffffffffa07ebbff>] btrfs_cow_block+0x11f/0x1c0 [btrfs]
        [260445.593682]  [<ffffffffa07ef8c7>] btrfs_search_slot+0x1e7/0xa00 [btrfs]
        [260445.593724]  [<ffffffffa08389df>] ? free_extent_buffer+0x4f/0x90 [btrfs]
        [260445.593752]  [<ffffffffa07f1a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs]
        [260445.593830]  [<ffffffffa07ea94a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
        [260445.593905]  [<ffffffffa08403b9>] btrfs_finish_chunk_alloc+0x1c9/0x570 [btrfs]
        [260445.593946]  [<ffffffffa08002ab>] btrfs_create_pending_block_groups+0x11b/0x200 [btrfs]
        [260445.593990]  [<ffffffffa0815798>] btrfs_commit_transaction+0xa8/0xb40 [btrfs]
        [260445.594042]  [<ffffffffa085abcd>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs]
        [260445.594089]  [<ffffffffa082bc84>] btrfs_sync_file+0x294/0x350 [btrfs]
        [260445.594115]  [<ffffffff8123e29b>] vfs_fsync_range+0x3b/0xa0
        [260445.594133]  [<ffffffff81023891>] ? syscall_trace_enter_phase1+0x131/0x180
        [260445.594149]  [<ffffffff8123e35d>] do_fsync+0x3d/0x70
        [260445.594169]  [<ffffffff81023bb8>] ? syscall_trace_leave+0xb8/0x110
        [260445.594187]  [<ffffffff8123e600>] SyS_fsync+0x10/0x20
        [260445.594204]  [<ffffffff8175de6e>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      This happened because the same transaction handle created a large number
      of block groups and while finalizing their creation (inserting new items
      and updating existing items in the chunk and device trees) a new metadata
      extent had to be allocated and no free space was found in the current
      metadata block groups, which made find_free_extent() attempt to allocate
      a new block group via do_chunk_alloc(). However at do_chunk_alloc() we
      ended up allocating a new system chunk too and exceeded the threshold
      of 2Mb of reserved chunk bytes, which makes do_chunk_alloc() enter the
      final part of block group creation again (at
      btrfs_create_pending_block_groups()) and attempt to lock again the root
      of the chunk tree when it's already write locked by the same task.
      
      Similarly we can deadlock on extent tree nodes/leafs if while we are
      running delayed references we end up creating a new metadata block group
      in order to allocate a new node/leaf for the extent tree (as part of
      a CoW operation or growing the tree), as btrfs_create_pending_block_groups
      inserts items into the extent tree as well. In this case we get the
      following trace:
      
        [14242.773581] fio             D ffff880428ca3418     0  3615   3100 0x00000084
        [14242.773588]  ffff880428ca3418 ffff88042d66b000 ffff88042a03c800 ffff880428ca3438
        [14242.773594]  ffff880428ca4000 ffff8803e4b20190 ffff8803e4b201a8 ffff880428ca3460
        [14242.773600]  ffff8803e4b20188 ffff880428ca3438 ffffffff8175a437 ffff8803e4b20190
        [14242.773606] Call Trace:
        [14242.773613]  [<ffffffff8175a437>] schedule+0x37/0x80
        [14242.773656]  [<ffffffffa057ff07>] btrfs_tree_lock+0xa7/0x1f0 [btrfs]
        [14242.773664]  [<ffffffff810db7c0>] ? prepare_to_wait_event+0xf0/0xf0
        [14242.773692]  [<ffffffffa0519c44>] btrfs_lock_root_node+0x34/0x50 [btrfs]
        [14242.773720]  [<ffffffffa051ef6b>] btrfs_search_slot+0x88b/0xa00 [btrfs]
        [14242.773750]  [<ffffffffa0520a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs]
        [14242.773758]  [<ffffffff811ef4a2>] ? kmem_cache_alloc+0x1d2/0x200
        [14242.773786]  [<ffffffffa0520ad1>] btrfs_insert_item+0x71/0xf0 [btrfs]
        [14242.773818]  [<ffffffffa052f292>] btrfs_create_pending_block_groups+0x102/0x200 [btrfs]
        [14242.773850]  [<ffffffffa052f96e>] do_chunk_alloc+0x2ae/0x2f0 [btrfs]
        [14242.773934]  [<ffffffffa0532825>] find_free_extent+0xa55/0xd90 [btrfs]
        [14242.773998]  [<ffffffffa0532c22>] btrfs_reserve_extent+0xc2/0x1d0 [btrfs]
        [14242.774041]  [<ffffffffa0532e38>] btrfs_alloc_tree_block+0x108/0x4a0 [btrfs]
        [14242.774078]  [<ffffffffa051a5cd>] __btrfs_cow_block+0x12d/0x5b0 [btrfs]
        [14242.774118]  [<ffffffffa051abff>] btrfs_cow_block+0x11f/0x1c0 [btrfs]
        [14242.774155]  [<ffffffffa051e8c7>] btrfs_search_slot+0x1e7/0xa00 [btrfs]
        [14242.774194]  [<ffffffffa0528021>] ? __btrfs_free_extent.isra.70+0x2e1/0xcb0 [btrfs]
        [14242.774235]  [<ffffffffa0520a06>] btrfs_insert_empty_items+0x66/0xc0 [btrfs]
        [14242.774274]  [<ffffffffa051994a>] ? btrfs_alloc_path+0x1a/0x20 [btrfs]
        [14242.774318]  [<ffffffffa052c433>] __btrfs_run_delayed_refs+0xbb3/0x1020 [btrfs]
        [14242.774358]  [<ffffffffa052f404>] btrfs_run_delayed_refs.part.78+0x74/0x280 [btrfs]
        [14242.774391]  [<ffffffffa052f627>] btrfs_run_delayed_refs+0x17/0x20 [btrfs]
        [14242.774432]  [<ffffffffa05be236>] commit_cowonly_roots+0x8d/0x2bd [btrfs]
        [14242.774474]  [<ffffffffa059d07f>] ? __btrfs_run_delayed_items+0x1cf/0x210 [btrfs]
        [14242.774516]  [<ffffffffa05adac3>] ? btrfs_qgroup_account_extents+0x83/0x130 [btrfs]
        [14242.774558]  [<ffffffffa0544c40>] btrfs_commit_transaction+0x590/0xb40 [btrfs]
        [14242.774599]  [<ffffffffa0589b9d>] ? btrfs_log_dentry_safe+0x6d/0x80 [btrfs]
        [14242.774642]  [<ffffffffa055ac54>] btrfs_sync_file+0x294/0x350 [btrfs]
        [14242.774650]  [<ffffffff8123e29b>] vfs_fsync_range+0x3b/0xa0
        [14242.774657]  [<ffffffff81023891>] ? syscall_trace_enter_phase1+0x131/0x180
        [14242.774663]  [<ffffffff8123e35d>] do_fsync+0x3d/0x70
        [14242.774669]  [<ffffffff81023bb8>] ? syscall_trace_leave+0xb8/0x110
        [14242.774675]  [<ffffffff8123e600>] SyS_fsync+0x10/0x20
        [14242.774681]  [<ffffffff8175de6e>] entry_SYSCALL_64_fastpath+0x12/0x71
      
      Fix this by never recursing into the finalization phase of block group
      creation and making sure we never trigger the finalization of block group
      creation while running delayed references.
      Reported-by: default avatarJosef Bacik <jbacik@fb.com>
      Fixes: 00d80e34
      
       ("Btrfs: fix quick exhaustion of the system array in the superblock")
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      d9a0540a
    • Filipe Manana's avatar
      Btrfs: update fix for read corruption of compressed and shared extents · 808f80b4
      Filipe Manana authored
      My previous fix in commit 005efedf
      
       ("Btrfs: fix read corruption of
      compressed and shared extents") was effective only if the compressed
      extents cover a file range with a length that is not a multiple of 16
      pages. That's because the detection of when we reached a different range
      of the file that shares the same compressed extent as the previously
      processed range was done at extent_io.c:__do_contiguous_readpages(),
      which covers subranges with a length up to 16 pages, because
      extent_readpages() groups the pages in clusters no larger than 16 pages.
      So fix this by tracking the start of the previously processed file
      range's extent map at extent_readpages().
      
      The following test case for fstests reproduces the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _need_to_be_root
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _require_cloner
      
        rm -f $seqres.full
      
        test_clone_and_read_compressed_extent()
        {
            local mount_opts=$1
      
            _scratch_mkfs >>$seqres.full 2>&1
            _scratch_mount $mount_opts
      
            # Create our test file with a single extent of 64Kb that is going to
            # be compressed no matter which compression algo is used (zlib/lzo).
            $XFS_IO_PROG -f -c "pwrite -S 0xaa 0K 64K" \
                $SCRATCH_MNT/foo | _filter_xfs_io
      
            # Now clone the compressed extent into an adjacent file offset.
            $CLONER_PROG -s 0 -d $((64 * 1024)) -l $((64 * 1024)) \
                $SCRATCH_MNT/foo $SCRATCH_MNT/foo
      
            echo "File digest before unmount:"
            md5sum $SCRATCH_MNT/foo | _filter_scratch
      
            # Remount the fs or clear the page cache to trigger the bug in
            # btrfs. Because the extent has an uncompressed length that is a
            # multiple of 16 pages, all the pages belonging to the second range
            # of the file (64K to 128K), which points to the same extent as the
            # first range (0K to 64K), had their contents full of zeroes instead
            # of the byte 0xaa. This was a bug exclusively in the read path of
            # compressed extents, the correct data was stored on disk, btrfs
            # just failed to fill in the pages correctly.
            _scratch_remount
      
            echo "File digest after remount:"
            # Must match the digest we got before.
            md5sum $SCRATCH_MNT/foo | _filter_scratch
        }
      
        echo -e "\nTesting with zlib compression..."
        test_clone_and_read_compressed_extent "-o compress=zlib"
      
        _scratch_unmount
      
        echo -e "\nTesting with lzo compression..."
        test_clone_and_read_compressed_extent "-o compress=lzo"
      
        status=0
        exit
      
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      Tested-by: default avatarTimofey Titovets <nefelim4ag@gmail.com>
      808f80b4
    • Filipe Manana's avatar
      Btrfs: send, fix corner case for reference overwrite detection · b786f16a
      Filipe Manana authored
      
      When the inode given to did_overwrite_ref() matches the current progress
      and has a reference that collides with the reference of other inode that
      has the same number as the current progress, we were always telling our
      caller that the inode's reference was overwritten, which is incorrect
      because the other inode might be a new inode (different generation number)
      in which case we must return false from did_overwrite_ref() so that its
      callers don't use an orphanized path for the inode (as it will never be
      orphanized, instead it will be unlinked and the new inode created later).
      
      The following test case for fstests reproduces the issue:
      
        seq=`basename $0`
        seqres=$RESULT_DIR/$seq
        echo "QA output created by $seq"
      
        tmp=/tmp/$$
        status=1	# failure is the default!
        trap "_cleanup; exit \$status" 0 1 2 3 15
      
        _cleanup()
        {
            rm -fr $send_files_dir
            rm -f $tmp.*
        }
      
        # get standard environment, filters and checks
        . ./common/rc
        . ./common/filter
      
        # real QA test starts here
        _supported_fs btrfs
        _supported_os Linux
        _require_scratch
        _need_to_be_root
      
        send_files_dir=$TEST_DIR/btrfs-test-$seq
      
        rm -f $seqres.full
        rm -fr $send_files_dir
        mkdir $send_files_dir
      
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        # Create our test file with a single extent of 64K.
        mkdir -p $SCRATCH_MNT/foo
        $XFS_IO_PROG -f -c "pwrite -S 0xaa 0 64K" $SCRATCH_MNT/foo/bar \
            | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap1
        _run_btrfs_util_prog subvolume snapshot $SCRATCH_MNT \
            $SCRATCH_MNT/mysnap2
      
        echo "File digest before being replaced:"
        md5sum $SCRATCH_MNT/mysnap1/foo/bar | _filter_scratch
      
        # Remove the file and then create a new one in the same location with
        # the same name but with different content. This new file ends up
        # getting the same inode number as the previous one, because that inode
        # number was the highest inode number used by the snapshot's root and
        # therefore when attempting to find the a new inode number for the new
        # file, we end up reusing the same inode number. This happens because
        # currently btrfs uses the highest inode number summed by 1 for the
        # first inode created once a snapshot's root is loaded (done at
        # fs/btrfs/inode-map.c:btrfs_find_free_objectid in the linux kernel
        # tree).
        # Having these two different files in the snapshots with the same inode
        # number (but different generation numbers) caused the btrfs send code
        # to emit an incorrect path for the file when issuing an unlink
        # operation because it failed to realize they were different files.
        rm -f $SCRATCH_MNT/mysnap2/foo/bar
        $XFS_IO_PROG -f -c "pwrite -S 0xbb 0 96K" \
            $SCRATCH_MNT/mysnap2/foo/bar | _filter_xfs_io
      
        _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT/mysnap2 \
            $SCRATCH_MNT/mysnap2_ro
      
        _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $send_files_dir/1.snap
        _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 \
            $SCRATCH_MNT/mysnap2_ro -f $send_files_dir/2.snap
      
        echo "File digest in the original filesystem after being replaced:"
        md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch
      
        # Now recreate the filesystem by receiving both send streams and verify
        # we get the same file contents that the original filesystem had.
        _scratch_unmount
        _scratch_mkfs >>$seqres.full 2>&1
        _scratch_mount
      
        _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/1.snap
        _run_btrfs_util_prog receive -vv $SCRATCH_MNT -f $send_files_dir/2.snap
      
        echo "File digest in the new filesystem:"
        # Must match the digest from the new file.
        md5sum $SCRATCH_MNT/mysnap2_ro/foo/bar | _filter_scratch
      
        status=0
        exit
      Reported-by: default avatarMartin Raiber <martin@urbackup.org>
      Fixes: 8b191a68
      
       ("Btrfs: incremental send, check if orphanized dir inode needs delayed rename")
      Signed-off-by: default avatarFilipe Manana <fdmanana@suse.com>
      b786f16a
  11. 03 Oct, 2015 1 commit
  12. 02 Oct, 2015 8 commits
  13. 29 Sep, 2015 1 commit
    • Richard Weinberger's avatar
      UBIFS: Kill unneeded locking in ubifs_init_security · cf6f54e3
      Richard Weinberger authored
      
      Fixes the following lockdep splat:
      [    1.244527] =============================================
      [    1.245193] [ INFO: possible recursive locking detected ]
      [    1.245193] 4.2.0-rc1+ #37 Not tainted
      [    1.245193] ---------------------------------------------
      [    1.245193] cp/742 is trying to acquire lock:
      [    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
      [    1.245193]
      [    1.245193] but task is already holding lock:
      [    1.245193]  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
      [    1.245193]
      [    1.245193] other info that might help us debug this:
      [    1.245193]  Possible unsafe locking scenario:
      [    1.245193]
      [    1.245193]        CPU0
      [    1.245193]        ----
      [    1.245193]   lock(&sb->s_type->i_mutex_key#9);
      [    1.245193]   lock(&sb->s_type->i_mutex_key#9);
      [    1.245193]
      [    1.245193]  *** DEADLOCK ***
      [    1.245193]
      [    1.245193]  May be due to missing lock nesting notation
      [    1.245193]
      [    1.245193] 2 locks held by cp/742:
      [    1.245193]  #0:  (sb_writers#5){.+.+.+}, at: [<ffffffff811ad37f>] mnt_want_write+0x1f/0x50
      [    1.245193]  #1:  (&sb->s_type->i_mutex_key#9){+.+.+.}, at: [<ffffffff81198e7f>] path_openat+0x3af/0x1280
      [    1.245193]
      [    1.245193] stack backtrace:
      [    1.245193] CPU: 2 PID: 742 Comm: cp Not tainted 4.2.0-rc1+ #37
      [    1.245193] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140816_022509-build35 04/01/2014
      [    1.245193]  ffffffff8252d530 ffff88007b023a38 ffffffff814f6f49 ffffffff810b56c5
      [    1.245193]  ffff88007c30cc80 ffff88007b023af8 ffffffff810a150d ffff88007b023a68
      [    1.245193]  000000008101302a ffff880000000000 00000008f447e23f ffffffff8252d500
      [    1.245193] Call Trace:
      [    1.245193]  [<ffffffff814f6f49>] dump_stack+0x4c/0x65
      [    1.245193]  [<ffffffff810b56c5>] ? console_unlock+0x1c5/0x510
      [    1.245193]  [<ffffffff810a150d>] __lock_acquire+0x1a6d/0x1ea0
      [    1.245193]  [<ffffffff8109fa78>] ? __lock_is_held+0x58/0x80
      [    1.245193]  [<ffffffff810a1a93>] lock_acquire+0xd3/0x270
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff814fc83b>] mutex_lock_nested+0x6b/0x3a0
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff812b3f69>] ? ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff812b3f69>] ubifs_init_security+0x29/0xb0
      [    1.245193]  [<ffffffff8128e286>] ubifs_create+0xa6/0x1f0
      [    1.245193]  [<ffffffff81198e7f>] ? path_openat+0x3af/0x1280
      [    1.245193]  [<ffffffff81195d15>] vfs_create+0x95/0xc0
      [    1.245193]  [<ffffffff8119929c>] path_openat+0x7cc/0x1280
      [    1.245193]  [<ffffffff8109ffe3>] ? __lock_acquire+0x543/0x1ea0
      [    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
      [    1.245193]  [<ffffffff81088c00>] ? calc_global_load_tick+0x60/0x90
      [    1.245193]  [<ffffffff81088f20>] ? sched_clock_cpu+0x90/0xc0
      [    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
      [    1.245193]  [<ffffffff8119ac55>] do_filp_open+0x75/0xd0
      [    1.245193]  [<ffffffff814ffd86>] ? _raw_spin_unlock+0x26/0x40
      [    1.245193]  [<ffffffff811a9cef>] ? __alloc_fd+0xaf/0x180
      [    1.245193]  [<ffffffff81189bd9>] do_sys_open+0x129/0x200
      [    1.245193]  [<ffffffff81189cc9>] SyS_open+0x19/0x20
      [    1.245193]  [<ffffffff81500717>] entry_SYSCALL_64_fastpath+0x12/0x6f
      
      While the lockdep splat is a false positive, becuase path_openat holds i_mutex
      of the parent directory and ubifs_init_security() tries to acquire i_mutex
      of a new inode, it reveals that taking i_mutex in ubifs_init_security() is
      in vain because it is only being called in the inode allocation path
      and therefore nobody else can see the inode yet.
      
      Cc: stable@vger.kernel.org # 3.20-
      Reported-and-tested-by: default avatarBoris Brezillon <boris.brezillon@free-electrons.com>
      Reviewed-and-tested-by: default avatarDongsheng Yang <yangds.fnst@cn.fujitsu.com>
      Signed-off-by: default avatarRichard Weinberger <richard@nod.at>
      Signed-off-by: dedekind1@gmail.com
      cf6f54e3
  14. 26 Sep, 2015 1 commit
  15. 24 Sep, 2015 3 commits
  16. 23 Sep, 2015 1 commit
  17. 22 Sep, 2015 3 commits
    • Joseph Qi's avatar
      ocfs2/dlm: fix deadlock when dispatch assert master · 012572d4
      Joseph Qi authored
      
      The order of the following three spinlocks should be:
      dlm_domain_lock < dlm_ctxt->spinlock < dlm_lock_resource->spinlock
      
      But dlm_dispatch_assert_master() is called while holding
      dlm_ctxt->spinlock and dlm_lock_resource->spinlock, and then it calls
      dlm_grab() which will take dlm_domain_lock.
      
      Once another thread (for example, dlm_query_join_handler) has already
      taken dlm_domain_lock, and tries to take dlm_ctxt->spinlock deadlock
      happens.
      Signed-off-by: default avatarJoseph Qi <joseph.qi@huawei.com>
      Cc: Joel Becker <jlbec@evilplan.org>
      Cc: Mark Fasheh <mfasheh@suse.com>
      Cc: "Junxiao Bi" <junxiao.bi@oracle.com>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      012572d4
    • Andrea Arcangeli's avatar
      userfaultfd: revert "userfaultfd: waitqueue: add nr wake parameter to __wake_up_locked_key" · ac5be6b4
      Andrea Arcangeli authored
      This reverts commit 51360155
      
       and adapts
      fs/userfaultfd.c to use the old version of that function.
      
      It didn't look robust to call __wake_up_common with "nr == 1" when we
      absolutely require wakeall semantics, but we've full control of what we
      insert in the two waitqueue heads of the blocked userfaults.  No
      exclusive waitqueue risks to be inserted into those two waitqueue heads
      so we can as well stick to "nr == 1" of the old code and we can rely
      purely on the fact no waitqueue inserted in one of the two waitqueue
      heads we must enforce as wakeall, has wait->flags WQ_FLAG_EXCLUSIVE set.
      Signed-off-by: default avatarAndrea Arcangeli <aarcange@redhat.com>
      Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
      Cc: Michael Ellerman <mpe@ellerman.id.au>
      Cc: Shuah Khan <shuahkh@osg.samsung.com>
      Cc: Thierry Reding <treding@nvidia.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      ac5be6b4
    • Kinglong Mee's avatar
      NFS: Skip checking ds_cinfo.buckets when lseg's commit_through_mds is set · 834e465b
      Kinglong Mee authored
      
      When lseg's commit_through_mds is set, pnfs client always WARN once
      in nfs_direct_select_verf after checking ds_cinfo.nbuckets.
      
      nfs should use the DS verf except commit_through_mds is set for
      layout segment where nbuckets is zero.
      
      [17844.666094] ------------[ cut here ]------------
      [17844.667071] WARNING: CPU: 0 PID: 21758 at /root/source/linux-pnfs/fs/nfs/direct.c:174 nfs_direct_select_verf+0x5a/0x70 [nfs]()
      [17844.668650] Modules linked in: nfs_layout_nfsv41_files(OE) nfsv4(OE) nfs(OE) fscache(E) nfsd(OE) xfs libcrc32c btrfs ppdev coretemp crct10dif_pclmul auth_rpcgss crc32_pclmul crc32c_intel nfs_acl ghash_clmulni_intel lockd vmw_balloon xor vmw_vmci grace raid6_pq shpchp sunrpc parport_pc i2c_piix4 parport vmwgfx drm_kms_helper ttm drm serio_raw mptspi e1000 scsi_transport_spi mptscsih mptbase ata_generic pata_acpi [last unloaded: fscache]
      [17844.686676] CPU: 0 PID: 21758 Comm: kworker/0:1 Tainted: G        W  OE   4.3.0-rc1-pnfs+ #245
      [17844.687352] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 05/20/2014
      [17844.698502] Workqueue: nfsiod rpc_async_release [sunrpc]
      [17844.699212]  0000000000000009 0000000043e58010 ffff8800454fbc10 ffffffff813680c4
      [17844.699990]  ffff8800454fbc48 ffffffff8108b49d ffff88004eb20000 ffff88004eb20000
      [17844.700844]  ffff880062e26000 0000000000000000 0000000000000001 ffff8800454fbc58
      [17844.701637] Call Trace:
      [17844.725252]  [<ffffffff813680c4>] dump_stack+0x19/0x25
      [17844.732693]  [<ffffffff8108b49d>] warn_slowpath_common+0x7d/0xb0
      [17844.733855]  [<ffffffff8108b5da>] warn_slowpath_null+0x1a/0x20
      [17844.735015]  [<ffffffffa04a27ca>] nfs_direct_select_verf+0x5a/0x70 [nfs]
      [17844.735999]  [<ffffffffa04a2b83>] nfs_direct_set_hdr_verf+0x23/0x90 [nfs]
      [17844.736846]  [<ffffffffa04a2e17>] nfs_direct_write_completion+0x227/0x260 [nfs]
      [17844.737782]  [<ffffffffa04a433c>] nfs_pgio_release+0x1c/0x20 [nfs]
      [17844.738597]  [<ffffffffa0502df3>] pnfs_generic_rw_release+0x23/0x30 [nfsv4]
      [17844.739486]  [<ffffffffa01cbbea>] rpc_free_task+0x2a/0x70 [sunrpc]
      [17844.740326]  [<ffffffffa01cbcd5>] rpc_async_release+0x15/0x20 [sunrpc]
      [17844.741173]  [<ffffffff810a387c>] process_one_work+0x21c/0x4c0
      [17844.741984]  [<ffffffff810a37cd>] ? process_one_work+0x16d/0x4c0
      [17844.742837]  [<ffffffff810a3b6a>] worker_thread+0x4a/0x440
      [17844.743639]  [<ffffffff810a3b20>] ? process_one_work+0x4c0/0x4c0
      [17844.744399]  [<ffffffff810a3b20>] ? process_one_work+0x4c0/0x4c0
      [17844.745176]  [<ffffffff810a8d75>] kthread+0xf5/0x110
      [17844.745927]  [<ffffffff810a8c80>] ? kthread_create_on_node+0x240/0x240
      [17844.747105]  [<ffffffff8172ce1f>] ret_from_fork+0x3f/0x70
      [17844.747856]  [<ffffffff810a8c80>] ? kthread_create_on_node+0x240/0x240
      [17844.748642] ---[ end trace 336a2845d42b83f0 ]---
      Signed-off-by: default avatarKinglong Mee <kinglongmee@gmail.com>
      Signed-off-by: default avatarTrond Myklebust <trond.myklebust@primarydata.com>
      834e465b