- 14 Mar, 2016 1 commit
-
-
Seth Forshee authored
The 'reqs' member of fuse_io_priv serves two purposes. The first is to track the number of outstanding async requests to the server and to signal that the io request is completed. The second is to be a reference count on the structure to know when it can be freed.

For sync io requests these purposes can be at odds. fuse_direct_IO() wants to block until the request is done, and since the signal is sent when 'reqs' reaches 0 it cannot keep a reference to the object. Yet it needs to use the object after the userspace server has completed processing requests. This leads to some handshaking and special casing that is needlessly complicated and responsible for at least one race condition.

It's much cleaner and safer to maintain a separate reference count for the object lifecycle and to let 'reqs' just be a count of outstanding requests to the userspace server. Then we can know for sure when it is safe to free the object without any handshaking or special cases.

The catch here is that most of the time these objects are stack allocated and should not be freed. Initializing these objects with a single reference that is never released prevents accidental attempts to free the objects.

Fixes: 9d5722b7 ("fuse: handle synchronous iocbs internally")
Cc: stable@vger.kernel.org # v4.1+
Signed-off-by:
Seth Forshee <seth.forshee@canonical.com> Signed-off-by:
Miklos Szeredi <mszeredi@redhat.com>
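
The split described in this commit can be sketched in user-space C. This is a minimal single-threaded model, not the kernel code: `struct io_priv`, its fields and the `io_*` helpers are hypothetical names, the `freed` flag stands in for actually freeing the object, and real code would need atomic counters or locking.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of an I/O descriptor; not the kernel's fuse_io_priv. */
struct io_priv {
    int refcnt;      /* object lifetime: dropping the last reference frees it */
    int reqs;        /* requests still outstanding at the userspace server */
    bool completed;  /* set once 'reqs' has dropped back to zero */
    bool freed;      /* stands in for kfree() so the sketch stays testable */
};

/* Start with one reference held by the creator. For a stack-allocated
 * object the creator simply never drops it, so it is never "freed". */
static void io_init(struct io_priv *io)
{
    io->refcnt = 1;
    io->reqs = 0;
    io->completed = false;
    io->freed = false;
}

static void io_get(struct io_priv *io)
{
    io->refcnt++;
}

static void io_put(struct io_priv *io)
{
    if (--io->refcnt == 0)
        io->freed = true;
}

/* Submitting a request pins the object and bumps the outstanding count. */
static void io_submit(struct io_priv *io)
{
    io_get(io);
    io->reqs++;
}

/* Completion signals when the last request is done, then drops only the
 * reference taken at submit time. The waiter still holds its own. */
static void io_request_done(struct io_priv *io)
{
    if (--io->reqs == 0)
        io->completed = true;
    io_put(io);
}
```

With the two counters separated, a synchronous caller can wait for `completed` and keep using the object afterwards, because its own reference is untouched by request completion.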
-
- 10 Nov, 2015 1 commit
-
-
Ravishankar N authored
A useful performance improvement for accessing virtual machine images via FUSE mount. See https://bugzilla.redhat.com/show_bug.cgi?id=1220173 for a use-case for glusterFS. Signed-off-by:
Ravishankar N <ravishankar@redhat.com> Signed-off-by:
Miklos Szeredi <miklos@szeredi.hu>
-
- 01 Jul, 2015 12 commits
-
-
Miklos Szeredi authored
Make each fuse device clone refer to a separate processing queue. The only constraint on userspace code is that the request answer must be written to the same device clone as it was read off. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Miklos Szeredi authored
Allow fuse device clones to be distinguished. This patch just adds the infrastructure by associating a separate "struct fuse_dev" with each clone. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
When an unlocked request is aborted, it is moved from fpq->io to a private list. Then, after unlocking fpq->lock, the private list is processed and the requests are finished off. To protect the private list, we need to mark the request with a flag, so if in the meantime the request is unlocked the list is not corrupted. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
Add a fpq->lock for protecting members of struct fuse_pqueue and the FR_LOCKED request flag. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
This will allow checking ->connected just with the processing queue lock. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
This is just two fields: fc->io and fc->processing. This patch just rearranges the fields, no functional change. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
This will allow checking ->connected just with the input queue lock. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
The input queue contains normal requests (fc->pending), forgets (fc->forget_*) and interrupts (fc->interrupts). There's also fc->waitq and fc->fasync for waking up the readers of the fuse device when a request is available. The fc->reqctr is also moved to the input queue (assigned to the request when the request is added to the input queue). This patch just rearranges the fields, no functional change. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
Use flags for representing the state in fuse_req. This is needed since req->list will be protected by different locks in different states, hence we'll want the state itself to be split into distinct bits, each protected with the relevant lock in that state. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Miklos Szeredi authored
FUSE_REQ_INIT is actually the same state as FUSE_REQ_PENDING, and FUSE_REQ_READING and FUSE_REQ_WRITING can be merged into a common FUSE_REQ_IO state. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
Reuse req->waitq.lock for protecting FR_ABORTED and FR_LOCKED flags. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
-
Miklos Szeredi authored
Finer grained locking will mean there's no single lock to protect modification of bitfields in fuse_req. So move to using bitops. Can use the non-atomic variants for those which happen while the request definitely has only one reference. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Reviewed-by:
Ashish Samant <ashish.samant@oracle.com>
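
The move from bitfields to bit operations can be illustrated with C11 atomics standing in for the kernel's bitops. This is a sketch under assumptions: `struct req`, the `req_*` helpers and the bit numbers are made up for the example; only the flag names FR_LOCKED and FR_ABORTED come from the commit messages in this series.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Bit numbers are illustrative; only the flag names appear in the series. */
enum req_flag {
    FR_LOCKED = 0,
    FR_ABORTED = 1
};

/* Hypothetical request with all state bits packed in one atomic word. */
struct req {
    atomic_ulong flags;
};

/* Atomically set one bit without disturbing its neighbours. */
static void req_set_flag(struct req *r, enum req_flag bit)
{
    atomic_fetch_or(&r->flags, 1UL << bit);
}

/* Atomically clear one bit. */
static void req_clear_flag(struct req *r, enum req_flag bit)
{
    atomic_fetch_and(&r->flags, ~(1UL << bit));
}

/* Read one bit. */
static bool req_test_flag(struct req *r, enum req_flag bit)
{
    return (atomic_load(&r->flags) & (1UL << bit)) != 0;
}

/* Set one bit and report whether it was already set. */
static bool req_test_and_set_flag(struct req *r, enum req_flag bit)
{
    unsigned long old = atomic_fetch_or(&r->flags, 1UL << bit);

    return (old & (1UL << bit)) != 0;
}
```

Unlike adjacent C bitfields, which share a storage unit and are updated by read-modify-write of the whole unit, each helper here changes exactly one bit atomically, so two flags can be guarded by different locks without corrupting each other.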
-
- 13 Mar, 2015 1 commit
-
-
Christoph Hellwig authored
Based on a patch from Maxim Patlasov <MPatlasov@parallels.com>. Signed-off-by:
Christoph Hellwig <hch@lst.de> Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 06 Jan, 2015 1 commit
-
-
Miklos Szeredi authored
Theoretically we need to order setting of various fields in fc with fc->initialized. No known bug reports related to this yet. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 12 Dec, 2014 4 commits
-
-
Miklos Szeredi authored
The following pattern is repeated many times:

    req = fuse_get_req_nopages(fc);
    /* Initialize req->(in|out).args */
    fuse_request_send(fc, req);
    err = req->out.h.error;
    fuse_put_request(req);

Create a new replacement helper:

    /* Initialize args */
    err = fuse_simple_request(fc, &args);

In addition to reducing the code size, this will ease moving from the complex arg-based to a simpler page-based I/O on the fuse device. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
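
The shape of such a helper can be shown with a small user-space model. Everything here is hypothetical (`struct conn`, `struct args`, `get_req()`, `simple_request()`, the `echo_server()` stub); it only mirrors the allocate, send, read error, release sequence quoted in the commit message and is not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical argument block filled in by the caller. */
struct args {
    int opcode;
    int in_value;
    int out_value;
};

/* Hypothetical request object wrapping the arguments and the reply error. */
struct req {
    struct args args;
    int error;
};

/* Hypothetical connection: a handler standing in for the userspace server. */
struct conn {
    int (*handle)(struct args *args);
    int live_reqs;   /* requests allocated but not yet released */
};

static struct req *get_req(struct conn *c)
{
    struct req *r = calloc(1, sizeof(*r));

    if (r)
        c->live_reqs++;
    return r;
}

static void request_send(struct conn *c, struct req *r)
{
    r->error = c->handle(&r->args);
}

static void put_request(struct conn *c, struct req *r)
{
    c->live_reqs--;
    free(r);
}

/* The helper: callers only fill in 'args' and check the return value. */
static int simple_request(struct conn *c, struct args *args)
{
    struct req *r = get_req(c);
    int err;

    if (!r)
        return -ENOMEM;
    r->args = *args;
    request_send(c, r);
    err = r->error;
    if (!err)
        args->out_value = r->args.out_value;
    put_request(c, r);
    return err;
}

/* Stub server: supports opcode 1 only and doubles the input value. */
static int echo_server(struct args *args)
{
    if (args->opcode != 1)
        return -ENOSYS;
    args->out_value = args->in_value * 2;
    return 0;
}
```

Because the request is allocated and released inside the helper, a caller cannot forget the release step on an error path, which is where the repeated open-coded pattern tends to go wrong.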
-
Miklos Szeredi authored
The third out-arg is never actually used. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Miklos Szeredi authored
path_put() in release could trigger a DESTROY request in fuseblk. The possible deadlock was worked around by doing the path_put() with schedule_work(). This complexity isn't needed if we just hold the inode instead of the path. Since we now flush all requests before destroying the super block we can be sure that all held inodes will be dropped. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Miklos Szeredi authored
Use fuse_abort_conn() instead of fuse_conn_kill() in fuse_put_super(). This flushes and aborts requests still on any queues. But since we've already reset fc->connected, those requests would not be useful anyway and would be flushed when the fuse device is closed. Next patches will rely on requests being flushed before the superblock is destroyed. Use fuse_abort_conn() in cuse_process_init_reply() too, since it makes no difference there, and we can get rid of fuse_conn_kill(). Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 06 May, 2014 1 commit
-
-
Al Viro authored
... to fuse_direct_{read,write}(). ->direct_IO() path uses the iov_iter passed by the caller instead. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
- 28 Apr, 2014 4 commits
-
-
Miklos Szeredi authored
Support RENAME_EXCHANGE and RENAME_NOREPLACE flags on the userspace ABI. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Maxim Patlasov authored
The patch extends fuse_setattr_in, and extends the flush procedure (fuse_flush_times()) called on ->write_inode() to send the ctime as well as mtime. Signed-off-by:
Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Miklos Szeredi authored
...and flush mtime from this. This allows us to use the kernel infrastructure for writing out dirty metadata (mtime at this point, but ctime in the next patches and also maybe atime). Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Fabian Frederick authored
fuse_ctl_cleanup is only called by __exit fuse_exit Signed-off-by:
Fabian Frederick <fabf@skynet.be> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 02 Apr, 2014 3 commits
-
-
Pavel Emelyanov authored
The problem is:

 1. write cached data to a file
 2. read directly from the same file (via another fd)

The 2nd operation may read stale data, i.e. the data that was in the file before the 1st op. The problem is in how fuse manages writeback.

When a direct op occurs, the core kernel code calls filemap_write_and_wait() to flush all the cached ops in flight. But fuse acks the writeback right after the ->writepages callback exits w/o waiting for the real write to happen. Thus the subsequent direct op proceeds while the real writeback is still in flight. This is a problem for backends that reorder operations.

Fix this by making the fuse direct IO callback explicitly wait on the in-flight writeback to finish. Signed-off-by:
Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Maxim Patlasov authored
Let the kernel maintain i_mtime locally:

 - clear S_NOCMTIME
 - implement i_op->update_time()
 - flush mtime on fsync and last close
 - update i_mtime explicitly on truncate and fallocate

Fuse inode flag FUSE_I_MTIME_DIRTY serves as indication that local i_mtime should be flushed to the server eventually. Signed-off-by:
Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Pavel Emelyanov authored
Off (0) by default. Will be used in the next patches and will be turned on at the very end. Signed-off-by:
Maxim Patlasov <MPatlasov@parallels.com> Signed-off-by:
Pavel Emelyanov <xemul@openvz.org> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 22 Jan, 2014 2 commits
-
-
Andrew Gallagher authored
open/release operations require userspace transitions to keep track of the open count and to perform any FS-specific setup. However, for some purely read-only FSs which don't need to perform any setup at open/release time, we can avoid the performance overhead of calling into userspace for open/release calls. This patch adds the necessary support to the fuse kernel modules to prevent open/release operations from hitting userspace. When the client returns ENOSYS, we avoid sending the subsequent release to userspace, and also remember this so that future opens also don't trigger a userspace operation. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
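
The ENOSYS handling described above amounts to a per-connection "don't ask again" flag. The following is a rough model only: `struct conn`, `struct file`, `do_open()`, `do_release()` and the `enosys_server()` stub are names assumed for the example, and the counters exist purely to make the behaviour observable.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical connection state. */
struct conn {
    bool no_open;            /* server answered ENOSYS to an earlier open */
    int (*send_open)(void);  /* stands in for the round trip to userspace */
    int open_calls;          /* how many opens actually reached userspace */
    int release_calls;       /* how many releases actually reached userspace */
};

/* Hypothetical per-file state. */
struct file {
    bool sent_open;          /* userspace knows about this open */
};

static int do_open(struct conn *c, struct file *f)
{
    f->sent_open = false;
    if (!c->no_open) {
        int err;

        c->open_calls++;
        err = c->send_open();
        if (err == -ENOSYS) {
            /* Remember the answer; the open itself still succeeds. */
            c->no_open = true;
            return 0;
        }
        if (err)
            return err;
        f->sent_open = true;
    }
    return 0;
}

static void do_release(struct conn *c, struct file *f)
{
    /* Only tell userspace about files it was told were opened. */
    if (f->sent_open)
        c->release_calls++;
}

/* Stub server that does not implement open. */
static int enosys_server(void)
{
    return -ENOSYS;
}
```

After the first ENOSYS reply, later opens and all matching releases complete without any userspace transition, which is the saving the commit is after.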
-
Andrew Gallagher authored
Various read operations (e.g. readlink, readdir) invalidate the cached attrs for atime changes. This patch adds a new function 'fuse_invalidate_atime', which checks for a read-only super block and avoids the attr invalidation in that case. Signed-off-by:
Andrew Gallagher <andrewjcg@fb.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 25 Oct, 2013 2 commits
-
-
Al Viro authored
makes ->permission() and ->d_revalidate() safety in RCU mode independent from vfsmount_lock. Signed-off-by:
Al Viro <viro@zeniv.linux.org.uk>
-
Miklos Szeredi authored
...which just returns -EBUSY if a directory alias would be created. This is to be used by fuse mkdir to make sure that a buggy or malicious userspace filesystem doesn't do anything nasty. Previously fuse used a private mutex for this purpose, which can now go away. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 01 Oct, 2013 2 commits
-
-
Miklos Szeredi authored
As Maxim Patlasov pointed out, it's possible to get a dirty page while its copy is still under writeback, despite fuse_page_mkwrite() doing its thing (direct IO). This could result in two concurrent write requests for the same offset, with data corruption if they get mixed up.

To prevent this, fuse needs to check and delay such writes. This implementation does this by:

 1. check if page is still under writeout, if so create a new, single page secondary request for it
 2. chain this secondary request onto the in-flight request
 2/a. if a secondary request for the same offset was already chained to the in-flight request, then just copy the contents of the page and discard the new secondary request. This makes sure that each page will have at most two requests associated with it
 3. when the in-flight request has finished, send off all secondary requests chained onto it

Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
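
Steps 1 to 3 above can be modelled with a singly linked chain hanging off the in-flight request. This is a simplified sketch, not the kernel data structures: `struct wreq`, `chain_secondary()`, `finish_inflight()` and the tiny `PAGE_SZ` are assumptions for illustration, and `record_send()` is a stub standing in for real submission.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SZ 8   /* tiny "page" so the example stays readable */

/* Hypothetical write request covering one page at 'offset'. */
struct wreq {
    long offset;
    char data[PAGE_SZ];
    struct wreq *next;   /* secondary requests chained onto this one */
};

static int sent_count;   /* bumped by the stub below */

/* Stub standing in for sending a request to the server. */
static void record_send(const struct wreq *r)
{
    (void)r;
    sent_count++;
}

/*
 * Queue a rewrite of a page whose earlier copy is still in flight.
 * Returns 1 if a new secondary request was chained, 0 if the contents were
 * merged into an existing secondary for the same offset, -1 on allocation
 * failure.
 */
static int chain_secondary(struct wreq *inflight, long offset, const char *page)
{
    struct wreq *s;

    for (s = inflight->next; s; s = s->next) {
        if (s->offset == offset) {
            /* Step 2/a: reuse the queued request, keep the newest data. */
            memcpy(s->data, page, PAGE_SZ);
            return 0;
        }
    }
    s = malloc(sizeof(*s));
    if (!s)
        return -1;
    s->offset = offset;
    memcpy(s->data, page, PAGE_SZ);
    s->next = inflight->next;
    inflight->next = s;
    return 1;
}

/* Step 3: the in-flight request finished, so send everything chained on it. */
static int finish_inflight(struct wreq *inflight,
                           void (*send)(const struct wreq *))
{
    struct wreq *s = inflight->next;
    int n = 0;

    inflight->next = NULL;
    while (s) {
        struct wreq *next = s->next;

        send(s);
        free(s);
        n++;
        s = next;
    }
    return n;
}
```

Merging repeated rewrites into the already queued secondary is what bounds the work to at most two requests per page: the one in flight and the one waiting.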
-
Miklos Szeredi authored
Doing dput(parent) is not valid in RCU walk mode. In RCU mode it would probably be okay to update the parent flags, but it's actually not necessary most of the time... So only set the FUSE_I_ADVISE_RDPLUS flag on the parent when the entry was recently initialized by READDIRPLUS. This is achieved by setting FUSE_I_INIT_RDPLUS on entries added by READDIRPLUS and only dropping out of RCU mode if this flag is set. FUSE_I_INIT_RDPLUS is cleared once the FUSE_I_ADVISE_RDPLUS flag is set in the parent. Reported-by:
Al Viro <viro@zeniv.linux.org.uk> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
-
- 03 Sep, 2013 1 commit
-
-
Maxim Patlasov authored
The way fuse calls truncate_pagecache() from fuse_change_attributes() is completely wrong. Without i_mutex held, we are never sure whether 'oldsize' and 'attr->size' are still valid by the time truncate_pagecache(inode, oldsize, attr->size) executes. In fact, as soon as we release fc->lock in the middle of fuse_change_attributes(), we completely lose control of what may happen to the given inode until we reach truncate_pagecache(). The list of potentially dangerous actions includes mmap-ed reads and writes, ftruncate(2) and write(2) extending the file size.

The typical outcome of doing truncate_pagecache() with outdated arguments is data corruption from the user's point of view. This is (in some sense) acceptable in cases when the issue is triggered by a change of the file on the server (i.e. externally wrt fuse operation), but it is absolutely intolerable in scenarios when a single fuse client modifies a file without any external intervention. A real life case I discovered by fsx-linux looked like this:

 1. Shrinking ftruncate(2) comes to fuse_do_setattr(). The latter sends FUSE_SETATTR to the server synchronously, but before getting fc->lock ...
 2. fuse_dentry_revalidate() is asynchronously called. It sends FUSE_LOOKUP to the server synchronously, then calls fuse_change_attributes(). The latter updates i_size, releases fc->lock, but before comparing oldsize vs attr->size ...
 3. fuse_do_setattr() from the first step proceeds by acquiring fc->lock and updating attributes and i_size, but now oldsize is equal to outarg.attr.size because i_size has just been updated (step 2). Hence, fuse_do_setattr() returns w/o calling truncate_pagecache().
 4. As soon as ftruncate(2) completes, the user extends the file size by write(2), making a hole in the middle of the file, then reads data from the hole either by read(2) or mmap-ed read. The user expects to get zero data from the hole, but gets stale data because truncate_pagecache() has not been executed yet.

The scenario above illustrates one side of the problem: not truncating the page cache even though we should. The other side corresponds to truncating the page cache too late, when the state of the inode has changed significantly. Theoretically, the following is possible:

 1. As in the previous scenario, fuse_dentry_revalidate() discovered that i_size changed (due to our own fuse_do_setattr()) and is going to call truncate_pagecache() for some 'new_size' it believes valid right now. But by the time that particular truncate_pagecache() is called ...
 2. fuse_do_setattr() returns (either having called truncate_pagecache() or not -- it doesn't matter).
 3. The file is extended either by write(2) or ftruncate(2) or fallocate(2).
 4. mmap-ed write makes a page in the extended region dirty.

The result will be the loss of the data the user wrote in the fourth step.

The patch is a hotfix resolving the issue in a simplistic way: skip the dangerous i_size update and truncate_pagecache() if an operation changing the file size is in progress. This simplistic approach looks correct for the cases w/o external changes. To handle external changes properly, more sophisticated and intrusive techniques (e.g. an NFS-like one) would be required. I'd like to postpone that until the issue is well discussed on the mailing list(s).

Changed in v2: - improved patch description to cover both sides of the issue. Signed-off-by:
Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz> Cc: stable@vger.kernel.org
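
The hotfix idea, skipping the size update while a size-changing operation is in progress, can be reduced to a small single-threaded model. The names below (`struct inode_model`, `size_unstable`, `setattr_begin()`, `setattr_end()`, `change_attributes()`) are assumptions for the sketch, and `truncate_calls` merely counts where the page cache would have been truncated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, heavily reduced inode state. */
struct inode_model {
    long i_size;
    bool size_unstable;   /* a size-changing operation is in progress */
    int truncate_calls;   /* counts would-be truncate_pagecache() calls */
};

/* Attribute refresh, e.g. from a racing lookup or revalidate. */
static void change_attributes(struct inode_model *in, long new_size)
{
    long old_size;

    /* The hotfix: leave i_size and the page cache alone for now. */
    if (in->size_unstable)
        return;
    old_size = in->i_size;
    in->i_size = new_size;
    if (old_size != new_size)
        in->truncate_calls++;
}

/* A size-changing setattr marks the size unstable before asking the server. */
static void setattr_begin(struct inode_model *in)
{
    in->size_unstable = true;
}

/* Once the reply arrives, compare against an i_size nobody else touched. */
static void setattr_end(struct inode_model *in, long new_size)
{
    long old_size = in->i_size;

    in->i_size = new_size;
    if (old_size != new_size)
        in->truncate_calls++;
    in->size_unstable = false;
}
```

In the first scenario above, the racing refresh would have set i_size to the new value first, so the setattr path saw equal old and new sizes and skipped the truncation. With the flag set, the refresh is ignored and the comparison in `setattr_end()` still sees the pre-truncate size.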
-
- 01 May, 2013 1 commit
-
-
Miklos Szeredi authored
Without async DIO, write requests to a single file were always serialized. With async DIO that's no longer the case. So don't turn on async DIO by default for fear of breaking backward compatibility. Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 18 Apr, 2013 1 commit
-
-
Maxim Patlasov authored
The patch improves error handling in fuse_direct_IO(): if we successfully submitted several fuse requests on behalf of a synchronous direct write that extends the file and some of them failed, let's do our best to clean up. Changed in v2: reuse fuse_do_setattr(). Thanks to Brian for the suggestion. Signed-off-by:
Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
- 17 Apr, 2013 3 commits
-
-
Maxim Patlasov authored
The patch implements passing "struct fuse_io_priv *io" down the stack to fuse_send_read/write, where it is used to submit requests asynchronously. io->async==0 designates synchronous processing. The non-trivial part of the patch is the changes in fuse_direct_io(): resources like fuse requests and user pages cannot be released immediately in the async case. Signed-off-by:
Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Maxim Patlasov authored
The patch implements a framework to process an IO request asynchronously. The idea is to associate several fuse requests with a single kiocb by means of the fuse_io_priv structure. The structure plays the same role for FUSE as 'struct dio' for direct-io.c.

The framework is supposed to be used like this:

 - someone (who wants to process an IO asynchronously) allocates fuse_io_priv and initializes it, setting the 'async' field to a non-zero value.
 - as soon as a fuse request is filled, it can be submitted (in a non-blocking way) by fuse_async_req_send()
 - when all submitted requests are ACKed by userspace, io->reqs drops to zero, triggering aio_complete()

In case of IO initiated by libaio, aio_complete() will finish processing the same way as in case of dio_complete() calling aio_complete(). But the framework may also be used internally by FUSE when the initial IO request was synchronous (from the user's perspective), but it's beneficial to process it asynchronously. Then the caller should wait on the kiocb explicitly and aio_complete() will wake the caller up. Signed-off-by:
Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-
Maxim Patlasov authored
The existing flag fc->blocked is used to suspend request allocation both when many background requests have been submitted and during the period before init_reply arrives from userspace. The next patch will skip blocking allocations of synchronous requests (disregarding fc->blocked). This is mostly OK, but we still need to suspend allocations if init_reply has not arrived yet. The patch introduces the flag fc->initialized, which will serve this purpose. Signed-off-by:
Maxim Patlasov <mpatlasov@parallels.com> Signed-off-by:
Miklos Szeredi <mszeredi@suse.cz>
-