1. 22 Nov, 2018 1 commit
    • Myungho Jung's avatar
      fuse: Add bad inode check in fuse_destroy_inode() · 4fc4bb79
      Myungho Jung authored
      make_bad_inode() sets inode->i_mode to S_IFREG if I/O error is detected
      in fuse_do_getattr()/fuse_do_setattr(). If the inode is not a regular
      file, write_files and queued_writes in fuse_inode are not initialized
      and have NULL or invalid pointers written by other members in a union.
      So, list_empty() returns false in fuse_destroy_inode(). Add
      is_bad_inode() to check if make_bad_inode() was called.
      
      Reported-by: syzbot+b9c89b84423073226299@syzkaller.appspotmail.com
      Fixes: ab2257e9
      
       ("fuse: reduce size of struct fuse_inode")
      Signed-off-by: default avatarMyungho Jung <mhjungk@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      4fc4bb79
  2. 15 Oct, 2018 2 commits
    • Dan Schatzberg's avatar
      fuse: enable caching of symlinks · 5571f1e6
      Dan Schatzberg authored
      
      
      FUSE file reads are cached in the page cache, but symlink reads are
      not. This patch enables FUSE READLINK operations to be cached which
      can improve performance of some FUSE workloads.
      
      In particular, I'm working on a FUSE filesystem for access to source
      code and discovered that about a 10% improvement to build times is
      achieved with this patch (there are a lot of symlinks in the source
      tree).
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      5571f1e6
    • Miklos Szeredi's avatar
      fuse: allow fine grained attr cache invaldation · 2f1e8196
      Miklos Szeredi authored
      
      
      This patch adds the infrastructure for more fine grained attribute
      invalidation.  Currently only 'atime' is invalidated separately.
      
      The use of this infrastructure is extended to the statx(2) interface, which
      for now means that if only 'atime' is invalid and STATX_ATIME is not
      specified in the mask argument, then no GETATTR request will be generated.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      2f1e8196
  3. 01 Oct, 2018 4 commits
    • Constantine Shulyupin's avatar
      fuse: add max_pages to init_out · 5da784cc
      Constantine Shulyupin authored
      Replace FUSE_MAX_PAGES_PER_REQ with the configurable parameter max_pages to
      improve performance.
      
      Old RFC with detailed description of the problem and many fixes by Mitsuo
      Hayasaka (mitsuo.hayasaka.hu@hitachi.com):
       - https://lkml.org/lkml/2012/7/5/136
      
      We've encountered performance degradation and fixed it on a big and complex
      virtual environment.
      
      Environment to reproduce degradation and improvement:
      
      1. Add lag to user mode FUSE
      Add nanosleep(&(struct timespec){ 0, 1000 }, NULL); to xmp_write_buf in
      passthrough_fh.c
      
      2. patch UM fuse with configurable max_pages parameter. The patch will be
      provided latter.
      
      3. run test script and perform test on tmpfs
      fuse_test()
      {
      
             cd /tmp
             mkdir -p fusemnt
             passthrough_fh -o max_pages=$1 /tmp/fusemnt
             grep fuse /proc/self/mounts
             dd conv=fdatasync oflag=dsync if=/dev/zero of=fusemnt/tmp/tmp \
      		count=1K bs=1M 2>&1 | grep -v records
             rm fusemnt/tmp/tmp
             killall passthrough_fh
      }
      
      Test results:
      
      passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
      	rw,nosuid,nodev,relatime,user_id=0,group_id=0 0 0
      1073741824 bytes (1.1 GB) copied, 1.73867 s, 618 MB/s
      
      passthrough_fh /tmp/fusemnt fuse.passthrough_fh \
      	rw,nosuid,nodev,relatime,user_id=0,group_id=0,max_pages=256 0 0
      1073741824 bytes (1.1 GB) copied, 1.15643 s, 928 MB/s
      
      Obviously with bigger lag the difference between 'before' and 'after'
      will be more significant.
      
      Mitsuo Hayasaka, in 2012 (https://lkml.org/lkml/2012/7/5/136
      
      ),
      observed improvement from 400-550 to 520-740.
      Signed-off-by: default avatarConstantine Shulyupin <const@MakeLinux.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      5da784cc
    • Miklos Szeredi's avatar
      fuse: reduce size of struct fuse_inode · ab2257e9
      Miklos Szeredi authored
      
      
      Do this by grouping fields used for cached writes and putting them into a
      union with fileds used for cached readdir (with obviously no overlap, since
      we don't have hybrid objects).
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      ab2257e9
    • Miklos Szeredi's avatar
      fuse: add readdir cache version · 3494927e
      Miklos Szeredi authored
      
      
      Allow the cache to be invalidated when page(s) have gone missing.  In this
      case increment the version of the cache and reset to an empty state.
      
      Add a version number to the directory stream in struct fuse_file as well,
      indicating the version of the cache it's supposed to be reading.  If the
      cache version doesn't match the stream's version, then reset the stream to
      the beginning of the cache.
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      3494927e
    • Miklos Szeredi's avatar
      fuse: allow caching readdir · 69e34551
      Miklos Szeredi authored
      
      
      This patch just adds the cache filling functions, which are invoked if
      FOPEN_CACHE_DIR flag is set in the OPENDIR reply.
      
      Cache reading and cache invalidation are added by subsequent patches.
      
      The directory cache uses the page cache.  Directory entries are packed into
      a page in the same format as in the READDIR reply.  A page only contains
      whole entries, the space at the end of the page is cleared.  The page is
      locked while being modified.
      
      Multiple parallel readdirs on the same directory can fill the cache; the
      only constraint is that continuity must be maintained (d_off of last entry
      points to position of current entry).
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      69e34551
  4. 28 Sep, 2018 2 commits
    • Kirill Tkhai's avatar
      fuse: Use hash table to link processing request · be2ff42c
      Kirill Tkhai authored
      
      
      We noticed the performance bottleneck in FUSE running our Virtuozzo storage
      over rdma. On some types of workload we observe 20% of times spent in
      request_find() in profiler.  This function is iterating over long requests
      list, and it scales bad.
      
      The patch introduces hash table to reduce the number of iterations, we do
      in this function. Hash generating algorithm is taken from hash_add()
      function, while 256 lines table is used to store pending requests.  This
      fixes problem and improves the performance.
      Reported-by: default avatarAlexey Kuznetsov <kuznet@virtuozzo.com>
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      be2ff42c
    • Kirill Tkhai's avatar
      fuse: introduce fc->bg_lock · ae2dffa3
      Kirill Tkhai authored
      
      
      To reduce contention of fc->lock, this patch introduces bg_lock for
      protection of fields related to background queue. These are:
      max_background, congestion_threshold, num_background, active_background,
      bg_queue and blocked.
      
      This allows next patch to make async reads not requiring fc->lock, so async
      reads and writes will have better performance executed in parallel.
      Signed-off-by: default avatarKirill Tkhai <ktkhai@virtuozzo.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      ae2dffa3
  5. 26 Jul, 2018 4 commits
  6. 05 Jun, 2018 1 commit
    • Deepa Dinamani's avatar
      vfs: change inode times to use struct timespec64 · 95582b00
      Deepa Dinamani authored
      
      
      struct timespec is not y2038 safe. Transition vfs to use
      y2038 safe struct timespec64 instead.
      
      The change was made with the help of the following cocinelle
      script. This catches about 80% of the changes.
      All the header file and logic changes are included in the
      first 5 rules. The rest are trivial substitutions.
      I avoid changing any of the function signatures or any other
      filesystem specific data structures to keep the patch simple
      for review.
      
      The script can be a little shorter by combining different cases.
      But, this version was sufficient for my usecase.
      
      virtual patch
      
      @ depends on patch @
      identifier now;
      @@
      - struct timespec
      + struct timespec64
        current_time ( ... )
        {
      - struct timespec now = current_kernel_time();
      + struct timespec64 now = current_kernel_time64();
        ...
      - return timespec_trunc(
      + return timespec64_trunc(
        ... );
        }
      
      @ depends on patch @
      identifier xtime;
      @@
       struct \( iattr \| inode \| kstat \) {
       ...
      -       struct timespec xtime;
      +       struct timespec64 xtime;
       ...
       }
      
      @ depends on patch @
      identifier t;
      @@
       struct inode_operations {
       ...
      int (*update_time) (...,
      -       struct timespec t,
      +       struct timespec64 t,
      ...);
       ...
       }
      
      @ depends on patch @
      identifier t;
      identifier fn_update_time =~ "update_time$";
      @@
       fn_update_time (...,
      - struct timespec *t,
      + struct timespec64 *t,
       ...) { ... }
      
      @ depends on patch @
      identifier t;
      @@
      lease_get_mtime( ... ,
      - struct timespec *t
      + struct timespec64 *t
        ) { ... }
      
      @te depends on patch forall@
      identifier ts;
      local idexpression struct inode *inode_node;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn_update_time =~ "update_time$";
      identifier fn;
      expression e, E3;
      local idexpression struct inode *node1;
      local idexpression struct inode *node2;
      local idexpression struct iattr *attr1;
      local idexpression struct iattr *attr2;
      local idexpression struct iattr attr;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      @@
      (
      (
      - struct timespec ts;
      + struct timespec64 ts;
      |
      - struct timespec ts = current_time(inode_node);
      + struct timespec64 ts = current_time(inode_node);
      )
      
      <+... when != ts
      (
      - timespec_equal(&inode_node->i_xtime, &ts)
      + timespec64_equal(&inode_node->i_xtime, &ts)
      |
      - timespec_equal(&ts, &inode_node->i_xtime)
      + timespec64_equal(&ts, &inode_node->i_xtime)
      |
      - timespec_compare(&inode_node->i_xtime, &ts)
      + timespec64_compare(&inode_node->i_xtime, &ts)
      |
      - timespec_compare(&ts, &inode_node->i_xtime)
      + timespec64_compare(&ts, &inode_node->i_xtime)
      |
      ts = current_time(e)
      |
      fn_update_time(..., &ts,...)
      |
      inode_node->i_xtime = ts
      |
      node1->i_xtime = ts
      |
      ts = inode_node->i_xtime
      |
      <+... attr1->ia_xtime ...+> = ts
      |
      ts = attr1->ia_xtime
      |
      ts.tv_sec
      |
      ts.tv_nsec
      |
      btrfs_set_stack_timespec_sec(..., ts.tv_sec)
      |
      btrfs_set_stack_timespec_nsec(..., ts.tv_nsec)
      |
      - ts = timespec64_to_timespec(
      + ts =
      ...
      -)
      |
      - ts = ktime_to_timespec(
      + ts = ktime_to_timespec64(
      ...)
      |
      - ts = E3
      + ts = timespec_to_timespec64(E3)
      |
      - ktime_get_real_ts(&ts)
      + ktime_get_real_ts64(&ts)
      |
      fn(...,
      - ts
      + timespec64_to_timespec(ts)
      ,...)
      )
      ...+>
      (
      <... when != ts
      - return ts;
      + return timespec64_to_timespec(ts);
      ...>
      )
      |
      - timespec_equal(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_equal(&node1->i_xtime2, &node2->i_xtime2)
      |
      - timespec_equal(&node1->i_xtime1, &attr2->ia_xtime2)
      + timespec64_equal(&node1->i_xtime2, &attr2->ia_xtime2)
      |
      - timespec_compare(&node1->i_xtime1, &node2->i_xtime2)
      + timespec64_compare(&node1->i_xtime1, &node2->i_xtime2)
      |
      node1->i_xtime1 =
      - timespec_trunc(attr1->ia_xtime1,
      + timespec64_trunc(attr1->ia_xtime1,
      ...)
      |
      - attr1->ia_xtime1 = timespec_trunc(attr2->ia_xtime2,
      + attr1->ia_xtime1 =  timespec64_trunc(attr2->ia_xtime2,
      ...)
      |
      - ktime_get_real_ts(&attr1->ia_xtime1)
      + ktime_get_real_ts64(&attr1->ia_xtime1)
      |
      - ktime_get_real_ts(&attr.ia_xtime1)
      + ktime_get_real_ts64(&attr.ia_xtime1)
      )
      
      @ depends on patch @
      struct inode *node;
      struct iattr *attr;
      identifier fn;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      expression e;
      @@
      (
      - fn(node->i_xtime);
      + fn(timespec64_to_timespec(node->i_xtime));
      |
       fn(...,
      - node->i_xtime);
      + timespec64_to_timespec(node->i_xtime));
      |
      - e = fn(attr->ia_xtime);
      + e = fn(timespec64_to_timespec(attr->ia_xtime));
      )
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      identifier i_xtime =~ "^i_[acm]time$";
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier fn;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      )
      ...+>
      }
      
      @ depends on patch forall @
      struct inode *node;
      struct iattr *attr;
      struct kstat *stat;
      identifier ia_xtime =~ "^ia_[acm]time$";
      identifier i_xtime =~ "^i_[acm]time$";
      identifier xtime =~ "^[acm]time$";
      identifier fn, ret;
      @@
      {
      + struct timespec ts;
      <+...
      (
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(node->i_xtime);
      ret = fn (...,
      - &node->i_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime,
      + &ts,
      ...);
      |
      + ts = timespec64_to_timespec(attr->ia_xtime);
      ret = fn (...,
      - &attr->ia_xtime);
      + &ts);
      |
      + ts = timespec64_to_timespec(stat->xtime);
      ret = fn (...,
      - &stat->xtime);
      + &ts);
      )
      ...+>
      }
      
      @ depends on patch @
      struct inode *node;
      struct inode *node2;
      identifier i_xtime1 =~ "^i_[acm]time$";
      identifier i_xtime2 =~ "^i_[acm]time$";
      identifier i_xtime3 =~ "^i_[acm]time$";
      struct iattr *attrp;
      struct iattr *attrp2;
      struct iattr attr ;
      identifier ia_xtime1 =~ "^ia_[acm]time$";
      identifier ia_xtime2 =~ "^ia_[acm]time$";
      struct kstat *stat;
      struct kstat stat1;
      struct timespec64 ts;
      identifier xtime =~ "^[acmb]time$";
      expression e;
      @@
      (
      ( node->i_xtime2 \| attrp->ia_xtime2 \| attr.ia_xtime2 \) = node->i_xtime1  ;
      |
       node->i_xtime2 = \( node2->i_xtime1 \| timespec64_trunc(...) \);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       node->i_xtime1 = node->i_xtime3 = \(ts \| current_time(...) \);
      |
       stat->xtime = node2->i_xtime1;
      |
       stat1.xtime = node2->i_xtime1;
      |
      ( node->i_xtime2 \| attrp->ia_xtime2 \) = attrp->ia_xtime1  ;
      |
      ( attrp->ia_xtime1 \| attr.ia_xtime1 \) = attrp2->ia_xtime2;
      |
      - e = node->i_xtime1;
      + e = timespec64_to_timespec( node->i_xtime1 );
      |
      - e = attrp->ia_xtime1;
      + e = timespec64_to_timespec( attrp->ia_xtime1 );
      |
      node->i_xtime1 = current_time(...);
      |
       node->i_xtime2 = node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
       node->i_xtime1 = node->i_xtime3 =
      - e;
      + timespec_to_timespec64(e);
      |
      - node->i_xtime1 = e;
      + node->i_xtime1 = timespec_to_timespec64(e);
      )
      Signed-off-by: default avatarDeepa Dinamani <deepa.kernel@gmail.com>
      Cc: <anton@tuxera.com>
      Cc: <balbi@kernel.org>
      Cc: <bfields@fieldses.org>
      Cc: <darrick.wong@oracle.com>
      Cc: <dhowells@redhat.com>
      Cc: <dsterba@suse.com>
      Cc: <dwmw2@infradead.org>
      Cc: <hch@lst.de>
      Cc: <hirofumi@mail.parknet.co.jp>
      Cc: <hubcap@omnibond.com>
      Cc: <jack@suse.com>
      Cc: <jaegeuk@kernel.org>
      Cc: <jaharkes@cs.cmu.edu>
      Cc: <jslaby@suse.com>
      Cc: <keescook@chromium.org>
      Cc: <mark@fasheh.com>
      Cc: <miklos@szeredi.hu>
      Cc: <nico@linaro.org>
      Cc: <reiserfs-devel@vger.kernel.org>
      Cc: <richard@nod.at>
      Cc: <sage@redhat.com>
      Cc: <sfrench@samba.org>
      Cc: <swhiteho@redhat.com>
      Cc: <tj@kernel.org>
      Cc: <trond.myklebust@primarydata.com>
      Cc: <tytso@mit.edu>
      Cc: <viro@zeniv.linux.org.uk>
      95582b00
  7. 31 May, 2018 3 commits
  8. 23 Mar, 2018 1 commit
    • Mimi Zohar's avatar
      fuse: define the filesystem as untrusted · 0834136a
      Mimi Zohar authored
      
      
      Files on FUSE can change at any point in time without IMA being able
      to detect it.  The file data read for the file signature verification
      could be totally different from what is subsequently read, making the
      signature verification useless.
      
      FUSE can be mounted by unprivileged users either today with fusermount
      installed with setuid, or soon with the upcoming patches to allow FUSE
      mounts in a non-init user namespace.
      
      This patch sets the SB_I_IMA_UNVERIFIABLE_SIGNATURE flag and when
      appropriate sets the SB_I_UNTRUSTED_MOUNTER flag.
      Signed-off-by: default avatarMimi Zohar <zohar@linux.vnet.ibm.com>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Dongsu Park <dongsu@kinvolk.io>
      Cc: Alban Crequy <alban@kinvolk.io>
      Acked-by: default avatarSerge Hallyn <serge@hallyn.com>
      Acked-by: default avatar"Eric W. Biederman" <ebiederm@xmission.com>
      0834136a
  9. 20 Mar, 2018 2 commits
    • Eric W. Biederman's avatar
      fuse: Support fuse filesystems outside of init_user_ns · 8cb08329
      Eric W. Biederman authored
      
      
      In order to support mounts from namespaces other than init_user_ns, fuse
      must translate uids and gids to/from the userns of the process servicing
      requests on /dev/fuse. This patch does that, with a couple of restrictions
      on the namespace:
      
       - The userns for the fuse connection is fixed to the namespace
         from which /dev/fuse is opened.
      
       - The namespace must be the same as s_user_ns.
      
      These restrictions simplify the implementation by avoiding the need to pass
      around userns references and by allowing fuse to rely on the checks in
      setattr_prepare for ownership changes.  Either restriction could be relaxed
      in the future if needed.
      
      For cuse the userns used is the opener of /dev/cuse.  Semantically the cuse
      support does not appear safe for unprivileged users.  Practically the
      permissions on /dev/cuse only make it accessible to the global root user.
      If something slips through the cracks in a user namespace the only users
      who will be able to use the cuse device are those users mapped into the
      user namespace.
      
      Translation in the posix acl is updated to use the uuser namespace of the
      filesystem.  Avoiding cases which might bypass this translation is handled
      in a following change.
      
      This change is stronlgy based on a similar change from Seth Forshee and
      Dongsu Park.
      
      Cc: Seth Forshee <seth.forshee@canonical.com>
      Cc: Dongsu Park <dongsu@kinvolk.io>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      8cb08329
    • Szymon Lukasz's avatar
      fuse: return -ECONNABORTED on /dev/fuse read after abort · 3b7008b2
      Szymon Lukasz authored
      
      
      Currently the userspace has no way of knowing whether the fuse
      connection ended because of umount or abort via sysfs. It makes it hard
      for filesystems to free the mountpoint after abort without worrying
      about removing some new mount.
      
      The patch fixes it by returning different errors when userspace reads
      from /dev/fuse (-ENODEV for umount and -ECONNABORTED for abort).
      
      Add a new capability flag FUSE_ABORT_ERROR. If set and the connection is
      gone because of sysfs abort, reading from the device will return
      -ECONNABORTED.
      Signed-off-by: default avatarSzymon Lukasz <noh4hss@gmail.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@redhat.com>
      3b7008b2
  10. 27 Nov, 2017 1 commit
    • Linus Torvalds's avatar
      Rename superblock flags (MS_xyz -> SB_xyz) · 1751e8a6
      Linus Torvalds authored
      
      
      This is a pure automated search-and-replace of the internal kernel
      superblock flags.
      
      The s_flags are now called SB_*, with the names and the values for the
      moment mirroring the MS_* flags that they're equivalent to.
      
      Note how the MS_xyz flags are the ones passed to the mount system call,
      while the SB_xyz flags are what we then use in sb->s_flags.
      
      The script to do this was:
      
          # places to look in; re security/*: it generally should *not* be
          # touched (that stuff parses mount(2) arguments directly), but
          # there are two places where we really deal with superblock flags.
          FILES="drivers/mtd drivers/staging/lustre fs ipc mm \
                  include/linux/fs.h include/uapi/linux/bfs_fs.h \
                  security/apparmor/apparmorfs.c security/apparmor/include/lib.h"
          # the list of MS_... constants
          SYMS="RDONLY NOSUID NODEV NOEXEC SYNCHRONOUS REMOUNT MANDLOCK \
                DIRSYNC NOATIME NODIRATIME BIND MOVE REC VERBOSE SILENT \
                POSIXACL UNBINDABLE PRIVATE SLAVE SHARED RELATIME KERNMOUNT \
                I_VERSION STRICTATIME LAZYTIME SUBMOUNT NOREMOTELOCK NOSEC BORN \
                ACTIVE NOUSER"
      
          SED_PROG=
          for i in $SYMS; do SED_PROG="$SED_PROG -e s/MS_$i/SB_$i/g"; done
      
          # we want files that contain at least one of MS_...,
          # with fs/namespace.c and fs/pnode.c excluded.
          L=$(for i in $SYMS; do git grep -w -l MS_$i $FILES; done| sort|uniq|grep -v '^fs/namespace.c'|grep -v '^fs/pnode.c')
      
          for f in $L; do sed -i $f $SED_PROG; done
      Requested-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      1751e8a6
  11. 16 Nov, 2017 1 commit
  12. 31 Oct, 2017 1 commit
    • Kees Cook's avatar
      treewide: Fix function prototypes for module_param_call() · e4dca7b7
      Kees Cook authored
      
      
      Several function prototypes for the set/get functions defined by
      module_param_call() have a slightly wrong argument types. This fixes
      those in an effort to clean up the calls when running under type-enforced
      compiler instrumentation for CFI. This is the result of running the
      following semantic patch:
      
      @match_module_param_call_function@
      declarer name module_param_call;
      identifier _name, _set_func, _get_func;
      expression _arg, _mode;
      @@
      
       module_param_call(_name, _set_func, _get_func, _arg, _mode);
      
      @fix_set_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._set_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _set_func(
      -_val_type _val
      +const char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      @fix_get_prototype
       depends on match_module_param_call_function@
      identifier match_module_param_call_function._get_func;
      identifier _val, _param;
      type _val_type, _param_type;
      @@
      
       int _get_func(
      -_val_type _val
      +char * _val
       ,
      -_param_type _param
      +const struct kernel_param * _param
       ) { ... }
      
      Two additional by-hand changes are included for places where the above
      Coccinelle script didn't notice them:
      
      	drivers/platform/x86/thinkpad_acpi.c
      	fs/lockd/svc.c
      Signed-off-by: default avatarKees Cook <keescook@chromium.org>
      Signed-off-by: default avatarJessica Yu <jeyu@kernel.org>
      e4dca7b7
  13. 18 Oct, 2017 1 commit
  14. 17 May, 2017 1 commit
  15. 20 Apr, 2017 2 commits
  16. 18 Apr, 2017 2 commits
  17. 18 Oct, 2016 1 commit
  18. 01 Oct, 2016 4 commits
  19. 30 Jul, 2016 1 commit
  20. 29 Jul, 2016 1 commit
  21. 30 Jun, 2016 1 commit
  22. 04 Apr, 2016 1 commit
    • Kirill A. Shutemov's avatar
      mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros · 09cbfeaf
      Kirill A. Shutemov authored
      
      
      PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} macros were introduced *long* time
      ago with promise that one day it will be possible to implement page
      cache with bigger chunks than PAGE_SIZE.
      
      This promise never materialized.  And unlikely will.
      
      We have many places where PAGE_CACHE_SIZE assumed to be equal to
      PAGE_SIZE.  And it's constant source of confusion on whether
      PAGE_CACHE_* or PAGE_* constant should be used in a particular case,
      especially on the border between fs and mm.
      
      Global switching to PAGE_CACHE_SIZE != PAGE_SIZE would cause to much
      breakage to be doable.
      
      Let's stop pretending that pages in page cache are special.  They are
      not.
      
      The changes are pretty straight-forward:
      
       - <foo> << (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - <foo> >> (PAGE_CACHE_SHIFT - PAGE_SHIFT) -> <foo>;
      
       - PAGE_CACHE_{SIZE,SHIFT,MASK,ALIGN} -> PAGE_{SIZE,SHIFT,MASK,ALIGN};
      
       - page_cache_get() -> get_page();
      
       - page_cache_release() -> put_page();
      
      This patch contains automated changes generated with coccinelle using
      script below.  For some reason, coccinelle doesn't patch header files.
      I've called spatch for them manually.
      
      The only adjustment after coccinelle is revert of changes to
      PAGE_CAHCE_ALIGN definition: we are going to drop it later.
      
      There are few places in the code where coccinelle didn't reach.  I'll
      fix them manually in a separate patch.  Comments and documentation also
      will be addressed with the separate patch.
      
      virtual patch
      
      @@
      expression E;
      @@
      - E << (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      expression E;
      @@
      - E >> (PAGE_CACHE_SHIFT - PAGE_SHIFT)
      + E
      
      @@
      @@
      - PAGE_CACHE_SHIFT
      + PAGE_SHIFT
      
      @@
      @@
      - PAGE_CACHE_SIZE
      + PAGE_SIZE
      
      @@
      @@
      - PAGE_CACHE_MASK
      + PAGE_MASK
      
      @@
      expression E;
      @@
      - PAGE_CACHE_ALIGN(E)
      + PAGE_ALIGN(E)
      
      @@
      expression E;
      @@
      - page_cache_get(E)
      + get_page(E)
      
      @@
      expression E;
      @@
      - page_cache_release(E)
      + put_page(E)
      Signed-off-by: default avatarKirill A. Shutemov <kirill.shutemov@linux.intel.com>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      09cbfeaf
  23. 15 Jan, 2016 1 commit
    • Vladimir Davydov's avatar
      kmemcg: account certain kmem allocations to memcg · 5d097056
      Vladimir Davydov authored
      
      
      Mark those kmem allocations that are known to be easily triggered from
      userspace as __GFP_ACCOUNT/SLAB_ACCOUNT, which makes them accounted to
      memcg.  For the list, see below:
      
       - threadinfo
       - task_struct
       - task_delay_info
       - pid
       - cred
       - mm_struct
       - vm_area_struct and vm_region (nommu)
       - anon_vma and anon_vma_chain
       - signal_struct
       - sighand_struct
       - fs_struct
       - files_struct
       - fdtable and fdtable->full_fds_bits
       - dentry and external_name
       - inode for all filesystems. This is the most tedious part, because
         most filesystems overwrite the alloc_inode method.
      
      The list is far from complete, so feel free to add more objects.
      Nevertheless, it should be close to "account everything" approach and
      keep most workloads within bounds.  Malevolent users will be able to
      breach the limit, but this was possible even with the former "account
      everything" approach (simply because it did not account everything in
      fact).
      
      [akpm@linux-foundation.org: coding-style fixes]
      Signed-off-by: default avatarVladimir Davydov <vdavydov@virtuozzo.com>
      Acked-by: default avatarJohannes Weiner <hannes@cmpxchg.org>
      Acked-by: default avatarMichal Hocko <mhocko@suse.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Greg Thelen <gthelen@google.com>
      Cc: Christoph Lameter <cl@linux.com>
      Cc: Pekka Enberg <penberg@kernel.org>
      Cc: David Rientjes <rientjes@google.com>
      Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5d097056
  24. 01 Jul, 2015 1 commit