1. 21 Nov, 2018 5 commits
  2. 05 Sep, 2018 4 commits
  3. 03 Jul, 2018 1 commit
    • Tejun Heo's avatar
      fuse: fix congested state leak on aborted connections · 02832578
      Tejun Heo authored
      commit 8a301eb1 upstream.
      
      If a connection gets aborted while congested, FUSE can leave
      nr_wb_congested[] stuck until reboot causing wait_iff_congested() to
      wait spuriously which can lead to severe performance degradation.
      
      The leak is caused by gating congestion state clearing with
      fc->connected test in request_end().  This was added way back in 2009
      by 26c36791
      
       ("fuse: destroy bdi on umount").  While the commit
      description doesn't explain why the test was added, it most likely was
      to avoid dereferencing bdi after it got destroyed.
      
      Since then, bdi lifetime rules have changed many times and now we're
      always guaranteed to have access to the bdi while the superblock is
      alive (fc->sb).
      
      Drop fc->connected conditional to avoid leaking congestion states.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Reported-by: default avatarJoshua Miller <joshmiller@fb.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: stable@vger.kernel.org # v2.6.29+
      Acked-by: default avatarJan Kara <jack@suse.cz>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
      02832578
  4. 12 Sep, 2017 1 commit
    • Miklos Szeredi's avatar
      fuse: allow server to run in different pid_ns · 5d6d3a30
      Miklos Szeredi authored
      Commit 0b6e9ea0
      
       ("fuse: Add support for pid namespaces") broke
      Sandstorm.io development tools, which have been sending FUSE file
      descriptors across PID namespace boundaries since early 2014.
      
      The above patch added a check that prevented I/O on the fuse device file
      descriptor if the pid namespace of the reader/writer was different from the
      pid namespace of the mounter.  With this change passing the device file
      descriptor to a different pid namespace simply doesn't work.  The check was
      added because pids are transferred to/from the fuse userspace server in the
      namespace registered at mount time.
      
      To fix this regression, remove the checks and do the following:
      
      1) the pid in the request header (the pid of the task that initiated the
      filesystem operation) is translated to the reader's pid namespace.  If a
      mapping doesn't exist for this pid, then a zero pid is used.  Note: even if
      a mapping would exist between the initiator task's pid namespace and the
      reader's pid namespace the pid will be zero if either mapping from
      initator's to mounter's namespace or mapping from mounter's to reader's
      namespace doesn't exist.
      
      2) The lk.pid value in setlk/setlkw requests and getlk reply is left alone.
      Userspace should not interpret this value anyway.  Also allow the
      setlk/setlkw operations if the pid of the task cannot be represented in the
      mounter's namespace (pid being zero in that case).
      Reported-by: default avatarKenton Varda <kenton@sandstorm.io>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: 0b6e9ea0 ("fuse: Add support for pid namespaces")
      Cc: <stable@vger.kernel.org> # v4.12+
      Cc: Eric W. Biederman <ebiederm@xmission.com>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      5d6d3a30
  5. 20 Apr, 2017 2 commits
  6. 18 Apr, 2017 2 commits
  7. 02 Mar, 2017 1 commit
  8. 16 Feb, 2017 1 commit
  9. 15 Feb, 2017 1 commit
    • Sahitya Tummala's avatar
      fuse: fix use after free issue in fuse_dev_do_read() · 6ba4d272
      Sahitya Tummala authored
      
      
      There is a potential race between fuse_dev_do_write()
      and request_wait_answer() contexts as shown below:
      
      TASK 1:
      __fuse_request_send():
        |--spin_lock(&fiq->waitq.lock);
        |--queue_request();
        |--spin_unlock(&fiq->waitq.lock);
        |--request_wait_answer():
             |--if (test_bit(FR_SENT, &req->flags))
             <gets pre-empted after it is validated true>
                                         TASK 2:
                                         fuse_dev_do_write():
                                           |--clears bit FR_SENT,
                                           |--request_end():
                                              |--sets bit FR_FINISHED
                                              |--spin_lock(&fiq->waitq.lock);
                                              |--list_del_init(&req->intr_entry);
                                              |--spin_unlock(&fiq->waitq.lock);
                                              |--fuse_put_request();
             |--queue_interrupt();
             <request gets queued to interrupts list>
                  |--wake_up_locked(&fiq->waitq);
             |--wait_event_freezable();
             <as FR_FINISHED is set, it returns and then
             the caller frees this request>
      
      Now, the next fuse_dev_do_read(), see interrupts list is not empty
      and then calls fuse_read_interrupt() which tries to access the request
      which is already free'd and gets the below crash:
      
      [11432.401266] Unable to handle kernel paging request at virtual address
      6b6b6b6b6b6b6b6b
      ...
      [11432.418518] Kernel BUG at ffffff80083720e0
      [11432.456168] PC is at __list_del_entry+0x6c/0xc4
      [11432.463573] LR is at fuse_dev_do_read+0x1ac/0x474
      ...
      [11432.679999] [<ffffff80083720e0>] __list_del_entry+0x6c/0xc4
      [11432.687794] [<ffffff80082c65e0>] fuse_dev_do_read+0x1ac/0x474
      [11432.693180] [<ffffff80082c6b14>] fuse_dev_read+0x6c/0x78
      [11432.699082] [<ffffff80081d5638>] __vfs_read+0xc0/0xe8
      [11432.704459] [<ffffff80081d5efc>] vfs_read+0x90/0x108
      [11432.709406] [<ffffff80081d67f0>] SyS_read+0x58/0x94
      
      As FR_FINISHED bit is set before deleting the intr_entry with input
      queue lock in request completion path, do the testing of this flag and
      queueing atomically with the same lock in queue_interrupt().
      Signed-off-by: default avatarSahitya Tummala <stummala@codeaurora.org>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Fixes: fd22d62e ("fuse: no fc->lock for iqueue parts")
      Cc: <stable@vger.kernel.org> # 4.2+
      6ba4d272
  10. 13 Jan, 2017 1 commit
    • Tahsin Erdogan's avatar
      fuse: clear FR_PENDING flag when moving requests out of pending queue · a8a86d78
      Tahsin Erdogan authored
      fuse_abort_conn() moves requests from pending list to a temporary list
      before canceling them. This operation races with request_wait_answer()
      which also tries to remove the request after it gets a fatal signal. It
      checks FR_PENDING flag to determine whether the request is still in the
      pending list.
      
      Make fuse_abort_conn() clear FR_PENDING flag so that request_wait_answer()
      does not remove the request from temporary list.
      
      This bug causes an Oops when trying to delete an already deleted list entry
      in end_requests().
      
      Fixes: ee314a87
      
       ("fuse: abort: no fc->lock needed for request ending")
      Signed-off-by: default avatarTahsin Erdogan <tahsin@google.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      Cc: <stable@vger.kernel.org> # 4.2+
      a8a86d78
  11. 05 Oct, 2016 4 commits
  12. 04 Oct, 2016 2 commits
    • Al Viro's avatar
      fuse_dev_splice_read(): switch to add_to_pipe() · d82718e3
      Al Viro authored
      
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      d82718e3
    • Al Viro's avatar
      splice: lift pipe_lock out of splice_to_pipe() · 8924feff
      Al Viro authored
      
      
      * splice_to_pipe() stops at pipe overflow and does *not* take pipe_lock
      * ->splice_read() instances do the same
      * vmsplice_to_pipe() and do_splice() (ultimate callers of splice_to_pipe())
        arrange for waiting, looping, etc. themselves.
      
      That should make pipe_lock the outermost one.
      
      Unfortunately, existing rules for the amount passed by vmsplice_to_pipe()
      and do_splice() are quite ugly _and_ userland code can be easily broken
      by changing those.  It's not even "no more than the maximal capacity of
      this pipe" - it's "once we'd fed pipe->nr_buffers pages into the pipe,
      leave instead of waiting".
      
      Considering how poorly these rules are documented, let's try "wait for some
      space to appear, unless given SPLICE_F_NONBLOCK, then push into pipe
      and if we run into overflow, we are done".
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      8924feff
  13. 01 Oct, 2016 1 commit
  14. 19 Jul, 2016 1 commit
  15. 11 Jun, 2016 1 commit
    • Linus Torvalds's avatar
      vfs: make the string hashes salt the hash · 8387ff25
      Linus Torvalds authored
      
      
      We always mixed in the parent pointer into the dentry name hash, but we
      did it late at lookup time.  It turns out that we can simplify that
      lookup-time action by salting the hash with the parent pointer early
      instead of late.
      
      A few other users of our string hashes also wanted to mix in their own
      pointers into the hash, and those are updated to use the same mechanism.
      
      Hash users that don't have any particular initial salt can just use the
      NULL pointer as a no-salt.
      
      Cc: Vegard Nossum <vegard.nossum@oracle.com>
      Cc: George Spelvin <linux@sciencehorizons.net>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      8387ff25
  16. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      
      
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  17. 16 Aug, 2015 1 commit
  18. 01 Jul, 2015 10 commits