1. 01 Oct, 2018 4 commits
    • fuse: use mtime for readdir cache verification · 7118883b
      Miklos Szeredi authored
      Store the modification time of the directory in the cache, obtained before
      starting to fill the cache.
      When reading the cache, verify that the directory hasn't changed, by
      checking if the current modification time is the same as the one stored
      in the cache.
      This only needs to be done when the current file position is at the
      beginning of the directory, as mandated by POSIX.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
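The check described in this commit can be sketched in user-space C. This is a minimal illustration, not the kernel code: `struct dir_cache`, `same_time` and `dir_cache_usable` are hypothetical names, and resetting the cache on a mismatch is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical, simplified stand-in for the per-directory cache state. */
struct dir_cache {
    struct timespec mtime; /* directory mtime sampled before filling began */
    bool valid;            /* false once the cache has been reset */
};

bool same_time(const struct timespec *a, const struct timespec *b)
{
    return a->tv_sec == b->tv_sec && a->tv_nsec == b->tv_nsec;
}

/*
 * Return true if the cache may serve a read starting at `pos`.
 * The mtime comparison happens only at position 0; on a mismatch the
 * cache is marked invalid (an assumption of this sketch), so later
 * reads bypass it as well.
 */
bool dir_cache_usable(struct dir_cache *c, int64_t pos,
                      const struct timespec *cur_mtime)
{
    assert(c && cur_mtime);
    if (!c->valid)
        return false;
    if (pos == 0 && !same_time(&c->mtime, cur_mtime)) {
        c->valid = false;
        return false;
    }
    return true;
}
```

A reader that is already in the middle of the directory skips the comparison, which matches the statement that verification is only needed at the beginning of the directory.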
    • fuse: add readdir cache version · 3494927e
      Miklos Szeredi authored
      Allow the cache to be invalidated when page(s) have gone missing.  In this
      case increment the version of the cache and reset to an empty state.
      Add a version number to the directory stream in struct fuse_file as well,
      indicating the version of the cache it's supposed to be reading.  If the
      cache version doesn't match the stream's version, then reset the stream to
      the beginning of the cache.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
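The versioning scheme can be illustrated with a small user-space sketch. The struct and function names below are hypothetical; the real state lives in the fuse inode and in struct fuse_file, whose layout is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cache state: one version counter per directory cache. */
struct dir_cache {
    uint64_t version; /* bumped every time the cache is reset */
    size_t size;      /* bytes of cached entries */
};

/* Hypothetical per-open-directory stream state. */
struct dir_stream {
    uint64_t version; /* cache version this stream believes it is reading */
    size_t cache_off; /* offset of the stream within the cache */
};

/* Invalidate the cache, e.g. after a page went missing. */
void dir_cache_reset(struct dir_cache *c)
{
    c->version++;
    c->size = 0;
}

/*
 * Bring the stream in line with the cache. Returns 1 if the stream
 * had to be rewound to the beginning of the cache, 0 otherwise.
 */
int dir_stream_sync(struct dir_stream *s, const struct dir_cache *c)
{
    assert(s && c);
    if (s->version == c->version)
        return 0;
    s->version = c->version;
    s->cache_off = 0;
    return 1;
}
```

The counter lets each stream detect a reset lazily on its next read, without the cache having to track which streams are open.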
    • fuse: allow using readdir cache · 5d7bc7e8
      Miklos Szeredi authored
      The cache is only used if it's completed, not while it's still being
      filled; this constraint could be lifted later, if it turns out to be
      useful.
      Introduce state in struct fuse_file that indicates the position within the
      cache.  After a seek, reset the position to the beginning of the cache and
      search the cache for the current position.  If the current position is not
      found in the cache, then fall back to uncached readdir.
      It can also happen that page(s) disappear from the cache, in which case we
      must also fall back to uncached readdir.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
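The search after a seek can be sketched as a linear scan. This is an illustration under simplifying assumptions: the cache is modeled as a flat array rather than pages, and `struct cached_dirent` and `cache_find_pos` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flattened view of one cached entry. */
struct cached_dirent {
    uint64_t off; /* stream position of the entry that follows this one */
};

/*
 * Find where to resume reading for stream position `pos`.
 * Returns the index of the first entry to emit, `n` if `pos` is the
 * end of the cached directory, or -1 if `pos` is not found and the
 * caller should fall back to uncached readdir.
 */
long cache_find_pos(const struct cached_dirent *ents, size_t n, uint64_t pos)
{
    uint64_t start = 0; /* stream position at which ents[i] begins */
    size_t i;

    assert(ents || n == 0);
    for (i = 0; i < n; i++) {
        if (start == pos)
            return (long)i;
        start = ents[i].off;
    }
    return start == pos ? (long)n : -1;
}
```

The scan starts at the beginning of the cache because entry positions are opaque values chosen by the filesystem, so they cannot be used to index into the cache directly.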
    • fuse: allow caching readdir · 69e34551
      Miklos Szeredi authored
      This patch just adds the cache filling functions, which are invoked if
      FOPEN_CACHE_DIR flag is set in the OPENDIR reply.
      Cache reading and cache invalidation are added by subsequent patches.
      The directory cache uses the page cache.  Directory entries are packed into
      a page in the same format as in the READDIR reply.  A page only contains
      whole entries, the space at the end of the page is cleared.  The page is
      locked while being modified.
      Multiple parallel readdirs on the same directory can fill the cache; the
      only constraint is that continuity must be maintained (d_off of last entry
      points to position of current entry).
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
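The packing rule (whole entries only, cleared tail) can be sketched as follows. The record layout is an assumption modeled on a fixed header followed by the name and padded to 8 bytes; `struct dirent_hdr`, `rec_size` and `page_add_entry` are hypothetical names, and a plain byte buffer stands in for a page-cache page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PAGE_SZ 4096u

/* Assumed fixed header of one record; the name bytes follow it. */
struct dirent_hdr {
    uint64_t ino;
    uint64_t off;     /* position of the next entry in the stream */
    uint32_t namelen;
    uint32_t type;
};

/* Size of one record: header plus name, rounded up to 8 bytes. */
size_t rec_size(size_t namelen)
{
    size_t raw = sizeof(struct dirent_hdr) + namelen;
    return (raw + 7) & ~(size_t)7;
}

/*
 * Append one entry at offset *used. Returns 1 on success. If the
 * entry does not fit as a whole, the rest of the page is cleared and
 * 0 is returned; the caller would continue on a fresh page.
 */
int page_add_entry(unsigned char *page, size_t *used, uint64_t ino,
                   uint64_t off, uint32_t type, const char *name)
{
    size_t namelen = strlen(name);
    size_t sz = rec_size(namelen);
    struct dirent_hdr hdr = { ino, off, (uint32_t)namelen, type };

    assert(*used <= PAGE_SZ);
    if (sz > PAGE_SZ - *used) {
        memset(page + *used, 0, PAGE_SZ - *used);
        return 0;
    }
    memset(page + *used, 0, sz); /* zero the padding bytes too */
    memcpy(page + *used, &hdr, sizeof(hdr));
    memcpy(page + *used + sizeof(hdr), name, namelen);
    *used += sz;
    return 1;
}
```

Because a page never holds a partial record, a reader can parse each page independently of its neighbours.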
  2. 28 Sep, 2018 6 commits
  3. 26 Jul, 2018 2 commits
  4. 31 May, 2018 1 commit
  5. 20 Mar, 2018 2 commits
    • fuse: Support fuse filesystems outside of init_user_ns · 8cb08329
      Eric W. Biederman authored
      In order to support mounts from namespaces other than init_user_ns, fuse
      must translate uids and gids to/from the userns of the process servicing
      requests on /dev/fuse. This patch does that, with a couple of restrictions
      on the namespace:
       - The userns for the fuse connection is fixed to the namespace
         from which /dev/fuse is opened.
       - The namespace must be the same as s_user_ns.
      These restrictions simplify the implementation by avoiding the need to pass
      around userns references and by allowing fuse to rely on the checks in
      setattr_prepare for ownership changes.  Either restriction could be relaxed
      in the future if needed.
      For cuse the userns used is the opener of /dev/cuse.  Semantically the cuse
      support does not appear safe for unprivileged users.  Practically the
      permissions on /dev/cuse only make it accessible to the global root user.
      If something slips through the cracks in a user namespace the only users
      who will be able to use the cuse device are those users mapped into the
      user namespace.
      Translation in the posix acl is updated to use the user namespace of the
      filesystem.  Avoiding cases which might bypass this translation is handled
      in a following change.
      This change is strongly based on a similar change from Seth Forshee and
      Dongsu Park.
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Dongsu Park <dongsu@kinvolk.io>
      Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
    • fuse: return -ECONNABORTED on /dev/fuse read after abort · 3b7008b2
      Szymon Lukasz authored
      Currently the userspace has no way of knowing whether the fuse
      connection ended because of umount or abort via sysfs. It makes it hard
      for filesystems to free the mountpoint after abort without worrying
      about removing some new mount.
      The patch fixes it by returning different errors when userspace reads
      from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).
      Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
      gone because of sysfs abort, reading from the device will return
      -ECONNABORTED.
      Signed-off-by: Szymon Lukasz <noh4hss@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
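A userspace server could act on the two error codes roughly as below. This is a sketch: `enum conn_end` and `classify_read_error` are hypothetical names, and the caller is assumed to pass the `errno` value left by a failed read from /dev/fuse.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical classification of why a read from the device failed. */
enum conn_end {
    CONN_ALIVE,      /* no error, keep serving requests */
    CONN_UNMOUNTED,  /* ENODEV: the filesystem was unmounted */
    CONN_ABORTED,    /* ECONNABORTED: the connection was aborted */
    CONN_OTHER_ERROR /* anything else, handled by the caller */
};

enum conn_end classify_read_error(int err)
{
    assert(err >= 0); /* expects a positive errno value or 0 */
    switch (err) {
    case 0:
        return CONN_ALIVE;
    case ENODEV:
        return CONN_UNMOUNTED;
    case ECONNABORTED:
        return CONN_ABORTED;
    default:
        return CONN_OTHER_ERROR;
    }
}
```

On `CONN_ABORTED` the server knows the mountpoint may still be occupied by its own mount and can release it; on `CONN_UNMOUNTED` it should leave the mountpoint alone, since something else may already be mounted there.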
  6. 27 Nov, 2017 1 commit
  7. 12 Sep, 2017 2 commits
  8. 03 Aug, 2017 1 commit
    • fuse: Dont call set_page_dirty_lock() for ITER_BVEC pages for async_dio · 61c12b49
      Ashish Samant authored
      Commit 8fba54ae ("fuse: direct-io: don't dirty ITER_BVEC pages") fixes
      the ITER_BVEC page deadlock for direct io in fuse by checking in
      fuse_direct_io(), whether the page is a bvec page or not, before locking
      it.  However, this check is missed when the "async_dio" mount option is
      enabled.  In this case, set_page_dirty_lock() is called from the req->end
      callback in request_end(), when the fuse thread is returning from userspace
      to respond to the read request.  This will cause the same deadlock because
      the bvec condition is not checked in this path.
      Here is the stack of the deadlocked thread, while returning from userspace:
      [13706.656686] INFO: task glusterfs:3006 blocked for more than 120 seconds.
      [13706.657808] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
      [13706.658788] glusterfs       D ffffffff816c80f0     0  3006      1
      [13706.658797]  ffff8800d6713a58 0000000000000086 ffff8800d9ad7000
      [13706.658799]  ffff88011ffd5cc0 ffff8800d6710008 ffff88011fd176c0
      [13706.658801]  0000000000000002 ffffffff816c80f0 ffff8800d6713a78
      [13706.658803] Call Trace:
      [13706.658809]  [<ffffffff816c80f0>] ? bit_wait_io_timeout+0x80/0x80
      [13706.658811]  [<ffffffff816c790e>] schedule+0x3e/0x90
      [13706.658813]  [<ffffffff816ca7e5>] schedule_timeout+0x1b5/0x210
      [13706.658816]  [<ffffffff81073ffb>] ? gup_pud_range+0x1db/0x1f0
      [13706.658817]  [<ffffffff810668fe>] ? kvm_clock_read+0x1e/0x20
      [13706.658819]  [<ffffffff81066909>] ? kvm_clock_get_cycles+0x9/0x10
      [13706.658822]  [<ffffffff810f5792>] ? ktime_get+0x52/0xc0
      [13706.658824]  [<ffffffff816c6f04>] io_schedule_timeout+0xa4/0x110
      [13706.658826]  [<ffffffff816c8126>] bit_wait_io+0x36/0x50
      [13706.658828]  [<ffffffff816c7d06>] __wait_on_bit_lock+0x76/0xb0
      [13706.658831]  [<ffffffffa0545636>] ? lock_request+0x46/0x70 [fuse]
      [13706.658834]  [<ffffffff8118800a>] __lock_page+0xaa/0xb0
      [13706.658836]  [<ffffffff810c8500>] ? wake_atomic_t_function+0x40/0x40
      [13706.658838]  [<ffffffff81194d08>] set_page_dirty_lock+0x58/0x60
      [13706.658841]  [<ffffffffa054d968>] fuse_release_user_pages+0x58/0x70 [fuse]
      [13706.658844]  [<ffffffffa0551430>] ? fuse_aio_complete+0x190/0x190 [fuse]
      [13706.658847]  [<ffffffffa0551459>] fuse_aio_complete_req+0x29/0x90 [fuse]
      [13706.658849]  [<ffffffffa05471e9>] request_end+0xd9/0x190 [fuse]
      [13706.658852]  [<ffffffffa0549126>] fuse_dev_do_write+0x336/0x490 [fuse]
      [13706.658854]  [<ffffffffa054963e>] fuse_dev_write+0x6e/0xa0 [fuse]
      [13706.658857]  [<ffffffff812a9ef3>] ? security_file_permission+0x23/0x90
      [13706.658859]  [<ffffffff81205300>] do_iter_readv_writev+0x60/0x90
      [13706.658862]  [<ffffffffa05495d0>] ? fuse_dev_splice_write+0x350/0x350
      [13706.658863]  [<ffffffff812062a1>] do_readv_writev+0x171/0x1f0
      [13706.658866]  [<ffffffff810b3d00>] ? try_to_wake_up+0x210/0x210
      [13706.658868]  [<ffffffff81206361>] vfs_writev+0x41/0x50
      [13706.658870]  [<ffffffff81206496>] SyS_writev+0x56/0xf0
      [13706.658872]  [<ffffffff810257a1>] ? syscall_trace_leave+0xf1/0x160
      [13706.658874]  [<ffffffff816cbb2e>] system_call_fastpath+0x12/0x71
      Fix this by making should_dirty a fuse_io_priv parameter that can be
      checked in fuse_aio_complete_req().
      Reported-by: Tiger Yang <tiger.yang@oracle.com>
      Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
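The shape of the fix, carrying the decision in the I/O descriptor so that the completion path can consult it, can be sketched like this. All names are hypothetical stand-ins, and the rule used to compute the flag (a read whose pages are not bvec pages) is an assumption of the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct page_stub { bool dirty; };

/* Hypothetical stand-in for the per-I/O private state. */
struct io_priv_stub {
    bool should_dirty; /* decided once, when the I/O is set up */
};

/* Assumed rule: only reads into non-bvec pages are dirtied. */
bool compute_should_dirty(bool is_read, bool is_bvec)
{
    return is_read && !is_bvec;
}

void release_pages_stub(struct page_stub *pages, size_t n, bool should_dirty)
{
    for (size_t i = 0; i < n; i++)
        if (should_dirty)
            pages[i].dirty = true; /* stands in for lock-and-dirty */
}

/* Completion path: reads the flag instead of dirtying unconditionally. */
void aio_complete_stub(const struct io_priv_stub *io,
                       struct page_stub *pages, size_t n)
{
    assert(io && (pages || n == 0));
    release_pages_stub(pages, n, io->should_dirty);
}
```

Storing the flag in the descriptor means the synchronous and asynchronous completion paths apply the same decision, which is the gap the commit message describes.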
  9. 20 Apr, 2017 2 commits
  10. 18 Apr, 2017 4 commits
  11. 22 Feb, 2017 1 commit
  12. 14 Jan, 2017 1 commit
    • locking/atomic, kref: Add KREF_INIT() · 1e24edca
      Peter Zijlstra authored
      Since we need to change the implementation, stop exposing internals.
      Provide KREF_INIT() to allow static initialization of struct kref.
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: linux-kernel@vger.kernel.org
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
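The idea of hiding the internals behind an initializer macro can be shown with a user-space analog built on C11 atomics. `struct kref_stub` and `KREF_STUB_INIT` are hypothetical; the actual definition of KREF_INIT() is not reproduced here.

```c
#include <assert.h>
#include <stdatomic.h>

/* User-space analog of a reference counter with hidden internals. */
struct kref_stub {
    atomic_int refcount;
};

/* Static initializer: callers never name the field themselves. */
#define KREF_STUB_INIT(n) { .refcount = (n) }

void kref_stub_get(struct kref_stub *k)
{
    atomic_fetch_add(&k->refcount, 1);
}

/* Drop one reference; returns 1 if it was the last one. */
int kref_stub_put(struct kref_stub *k)
{
    int old = atomic_fetch_sub(&k->refcount, 1);

    assert(old > 0);
    return old == 1;
}

int kref_stub_read(struct kref_stub *k)
{
    return atomic_load(&k->refcount);
}
```

A statically allocated object is then declared as `static struct kref_stub r = KREF_STUB_INIT(1);`, and the representation of the counter can change later without touching such declarations.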
  13. 18 Oct, 2016 1 commit
  14. 01 Oct, 2016 4 commits
  15. 22 Sep, 2016 1 commit
  16. 30 Jul, 2016 1 commit
  17. 30 Jun, 2016 2 commits
    • fuse: improve aio directIO write performance for size extending writes · 7879c4e5
      Ashish Sangwan authored
      While sending the blocking directIO in fuse, the write request is broken
      into sub-requests, each of default size 128k and all the requests are sent
      in non-blocking background mode if async_dio mode is supported by libfuse.
      The process which issues the write waits for the completion of all the
      sub-requests. Sending multiple requests in parallel gives a chance to perform
      parallel writes in the user space fuse implementation if it is
      multi-threaded and hence improves the performance.
      When there is a size extending aio dio write, we switch to blocking mode so
      that we can properly update the size of the file after completion of the
      writes. However, in this situation all the sub-requests are sent in a
      serialized manner, where the next request is sent only after receiving the
      reply of the current request. Hence the multi-threaded user space
      implementation is not utilized properly.
      This patch changes the size extending aio dio behavior to exactly follow
      blocking dio. For a multi-threaded fuse implementation with 10 threads,
      using a buffer size of 64MB to perform async directIO, we are getting double
      the speed.
      Signed-off-by: Ashish Sangwan <ashishsangwan2@gmail.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
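The splitting of one write into sub-requests can be sketched as below. `struct sub_req` and `split_write` are hypothetical names; the 128k chunk size and the 64MB buffer come from the commit message, taken here as 128 KiB and 64 MiB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one sub-request. */
struct sub_req {
    uint64_t offset; /* file offset of this chunk */
    uint32_t len;    /* chunk length in bytes */
};

/*
 * Split [offset, offset+len) into chunks of at most max_req bytes.
 * Fills up to `cap` entries of `out` (which may be NULL) and returns
 * the total number of sub-requests required.
 */
size_t split_write(uint64_t offset, uint64_t len, uint32_t max_req,
                   struct sub_req *out, size_t cap)
{
    size_t n = 0;

    assert(max_req > 0);
    while (len > 0) {
        uint32_t chunk = len < max_req ? (uint32_t)len : max_req;

        if (out && n < cap) {
            out[n].offset = offset;
            out[n].len = chunk;
        }
        offset += chunk;
        len -= chunk;
        n++;
    }
    return n;
}
```

With these numbers a 64 MiB write becomes 512 sub-requests of 128 KiB each, which is why sending them one at a time leaves a multi-threaded server mostly idle.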
    • fuse: serialize dirops by default · 5c672ab3
      Miklos Szeredi authored
      Negotiate with userspace filesystems whether they support parallel readdir
      and lookup.  Disable parallelism by default for fear of breaking fuse
      filesystems.
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
      Fixes: 9902af79 ("parallel lookups: actual switch to rwsem")
      Fixes: d9b3dbdc ("fuse: switch to ->iterate_shared()")
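The opt-in negotiation can be sketched as a capability bit in the server's reply. The flag name and bit value below are hypothetical and chosen for illustration only; `struct conn_stub` stands in for the connection state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bit; not the real protocol value. */
#define DEMO_PARALLEL_DIROPS (1u << 0)

/* Hypothetical connection state; zero-initialized means serialized. */
struct conn_stub {
    bool parallel_dirops;
};

/* Parallel lookup/readdir is enabled only if the server asks for it. */
void process_init_reply(struct conn_stub *c, uint32_t reply_flags)
{
    assert(c);
    c->parallel_dirops = (reply_flags & DEMO_PARALLEL_DIROPS) != 0;
}
```

A server that predates the flag never sets the bit, so it keeps the serialized behavior it was written against.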
  18. 14 Mar, 2016 1 commit
    • fuse: Add reference counting for fuse_io_priv · 744742d6
      Seth Forshee authored
      The 'reqs' member of fuse_io_priv serves two purposes. First is to track
      the number of outstanding async requests to the server and to signal that
      the io request is completed. The second is to be a reference count on the
      structure to know when it can be freed.
      For sync io requests these purposes can be at odds.  fuse_direct_IO() wants
      to block until the request is done, and since the signal is sent when
      'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to
      use the object after the userspace server has completed processing
      requests. This leads to some handshaking and special casing that is
      needlessly complicated and responsible for at least one race condition.
      It's much cleaner and safer to maintain a separate reference count for the
      object lifecycle and to let 'reqs' just be a count of outstanding requests
      to the userspace server. Then we can know for sure when it is safe to free
      the object without any handshaking or special cases.
      The catch here is that most of the time these objects are stack allocated
      and should not be freed. Initializing these objects with a single reference
      that is never released prevents accidental attempts to free the objects.
      Fixes: 9d5722b7 ("fuse: handle synchronous iocbs internally")
      Cc: stable@vger.kernel.org # v4.1+
      Signed-off-by: Seth Forshee <seth.forshee@canonical.com>
      Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
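The separation of the two counters can be sketched in single-threaded user-space C. The names are hypothetical and the plain `int` counters are a simplification; real code would need atomic operations or locking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the I/O descriptor discussed above. */
struct io_priv_stub {
    int refs;      /* lifetime: the object may be released at 0 */
    int reqs;      /* outstanding requests to the userspace server */
    int completed; /* set once reqs drops back to 0 */
    void (*release)(struct io_priv_stub *); /* NULL for stack objects */
};

int demo_released; /* counts release calls, for demonstration only */

void demo_release(struct io_priv_stub *io)
{
    (void)io;
    demo_released++;
}

void io_put(struct io_priv_stub *io)
{
    assert(io->refs > 0);
    if (--io->refs == 0 && io->release)
        io->release(io);
}

/* Each in-flight request holds its own reference on the object. */
void io_req_start(struct io_priv_stub *io)
{
    io->refs++;
    io->reqs++;
}

void io_req_end(struct io_priv_stub *io)
{
    assert(io->reqs > 0);
    if (--io->reqs == 0)
        io->completed = 1; /* completion signal, independent of lifetime */
    io_put(io);
}
```

The submitter keeps its initial reference while it waits, so it can still inspect the object after `completed` is set and drops the reference only when it is finished; a stack-allocated object simply never drops that initial reference.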
  19. 10 Nov, 2015 1 commit
  20. 01 Jul, 2015 2 commits