1. 29 Jun, 2017 1 commit
  2. 11 Oct, 2016 5 commits
  3. 15 Mar, 2016 5 commits
  4. 22 Feb, 2015 1 commit
  5. 14 Oct, 2014 1 commit
    • NeilBrown's avatar
      autofs4: allow RCU-walk to walk through autofs4 · 23bfc2a2
      NeilBrown authored
      
      
      This series teaches autofs about RCU-walk so that we don't drop straight
      into REF-walk when we hit an autofs directory, and so that we avoid
      spinlocks as much as possible when performing an RCU-walk.
      
      This is needed so that the benefits of the recent NFS support for
      RCU-walk are fully available when NFS filesystems are automounted.
      
      Patches have been carefully reviewed and tested both with test suites
      and in production - thanks a lot to Ian Kent for his support there.
      
      This patch (of 6):
      
      Any attempt to look up a pathname that passes though an autofs4 mount is
      currently forced out of RCU-walk into REF-walk.
      
      This can significantly hurt performance of many-thread work loads on
      many-core systems, especially if the automounted filesystem supports
      RCU-walk but doesn't get to benefit from it.
      
      So if autofs4_d_manage is called with rcu_walk set, only fail with -ECHILD
      if it is necessary to wait longer than a spinlock.
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
      Reviewed-by: default avatarIan Kent <raven@themaw.net>
      Tested-by: default avatarIan Kent <raven@themaw.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      23bfc2a2
  6. 04 Jun, 2014 1 commit
  7. 08 Apr, 2014 1 commit
  8. 24 Jan, 2014 1 commit
    • Sukadev Bhattiprolu's avatar
      autofs4: allow autofs to work outside the initial PID namespace · 6eaba35b
      Sukadev Bhattiprolu authored
      
      
      Enable autofs4 to work in a "container".  oz_pgrp is converted from
      pid_t to struct pid and this is stored at mount time based on the
      "pgrp=" option or if the option is missing then the current pgrp.
      
      The "pgrp=" option is interpreted in the PID namespace of the current
      process.  This option is flawed in that it doesn't carry the namespace
      information, so it should be deprecated.  AFAICS the autofs daemon
      always sends the current pgrp, which is the default anyway.
      
      The oz_pgrp is also set from the AUTOFS_DEV_IOCTL_SETPIPEFD_CMD ioctl.
      This ioctl sets oz_pgrp to the current pgrp.  It is not allowed to
      change the pid namespace.
      
      oz_pgrp is used mainly to determine whether the process traversing the
      autofs mount tree is the autofs daemon itself or not.  This function now
      compares the pid pointers instead of the pid_t values.
      
      One other use of oz_pgrp is in autofs4_show_options.  There is shows the
      virtual pid number (i.e.  the one that is valid inside the PID namespace
      of the calling process)
      
      For debugging printk convert oz_pgrp to the value in the initial pid
      namespace.
      Signed-off-by: default avatarSukadev Bhattiprolu <sukadev@us.ibm.com>
      Signed-off-by: default avatarMiklos Szeredi <mszeredi@suse.cz>
      Acked-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Cc: Eric Biederman <ebiederm@xmission.com>
      Acked-by: default avatarIan Kent <raven@themaw.net>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      6eaba35b
  9. 25 Oct, 2013 1 commit
  10. 09 Sep, 2013 1 commit
    • Ian Kent's avatar
      autofs4 - fix device ioctl mount lookup · ac838719
      Ian Kent authored
      
      
      When reconnecting to automounts at startup an autofs ioctl is used
      to find the device and inode of existing mounts so they can be used
      to open a file descriptor of possibly covered mounts.
      
      At this time the the caller might not yet "own" the mount so it can
      trigger calling ->d_automount(). This causes automount to hang when
      trying to reconnect to direct or offset mount types.
      
      Consequently kern_path() can't be used but kern_path_mountpoint() can be.
      Signed-off-by: default avatarIan Kent <raven@themaw.net>
      Cc: Jeff Layton <jlayton@redhat.com>
      Cc: Al Viro <viro@ZenIV.linux.org.uk>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      ac838719
  11. 23 Feb, 2013 1 commit
  12. 15 Nov, 2012 1 commit
    • Eric W. Biederman's avatar
      userns: Support autofs4 interacing with multiple user namespaces · 45634cd8
      Eric W. Biederman authored
      
      
      Use kuid_t and kgid_t in struct autofs_info and struct autofs_wait_queue.
      
      When creating directories and symlinks default the uid and gid of
      the mount requester to the global root uid and gid.  autofs4_wait
      will update these fields when a mount is requested.
      
      When generating autofsv5 packets report the uid and gid of the mount
      requestor in user namespace of the process that opened the pipe,
      reporting unmapped uids and gids as overflowuid and overflowgid.
      
      In autofs_dev_ioctl_requester return the uid and gid of the last mount
      requester converted into the calling processes user namespace.  When the
      uid or gid don't map return overflowuid and overflowgid as appropriate,
      allowing failure to find a mount requester to be distinguished from
      failure to map a mount requester.
      
      The uid and gid mount options specifying the user and group of the
      root autofs inode are converted into kuid and kgid as they are parsed
      defaulting to the current uid and current gid of the process that
      mounts autofs.
      
      Mounting of autofs for the present remains confined to processes in
      the initial user namespace.
      
      Cc: Ian Kent <raven@themaw.net>
      Acked-by: default avatarSerge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: default avatarEric W. Biederman <ebiederm@xmission.com>
      45634cd8
  13. 27 Sep, 2012 1 commit
  14. 22 Jul, 2012 1 commit
  15. 29 Apr, 2012 1 commit
    • Linus Torvalds's avatar
      autofs: make the autofsv5 packet file descriptor use a packetized pipe · 64f371bc
      Linus Torvalds authored
      The autofs packet size has had a very unfortunate size problem on x86:
      because the alignment of 'u64' differs in 32-bit and 64-bit modes, and
      because the packet data was not 8-byte aligned, the size of the autofsv5
      packet structure differed between 32-bit and 64-bit modes despite
      looking otherwise identical (300 vs 304 bytes respectively).
      
      We first fixed that up by making the 64-bit compat mode know about this
      problem in commit a32744d4 ("autofs: work around unhappy compat
      problem on x86-64"), and that made a 32-bit 'systemd' work happily on a
      64-bit kernel because everything then worked the same way as on a 32-bit
      kernel.
      
      But it turned out that 'automount' had actually known and worked around
      this problem in user space, so fixing the kernel to do the proper 32-bit
      compatibility handling actually *broke* 32-bit automount on a 64-bit
      kernel, because it knew that the packet sizes were wrong and expected
      those incorrect sizes.
      
      As a result, we ended up reverting that compatibility mode fix, and
      thus breaking systemd again, in commit fcbf94b9
      
      .
      
      With both automount and systemd doing a single read() system call, and
      verifying that they get *exactly* the size they expect but using
      different sizes, it seemed that fixing one of them inevitably seemed to
      break the other.  At one point, a patch I seriously considered applying
      from Michael Tokarev did a "strcmp()" to see if it was automount that
      was doing the operation.  Ugly, ugly.
      
      However, a prettier solution exists now thanks to the packetized pipe
      mode.  By marking the communication pipe as being packetized (by simply
      setting the O_DIRECT flag), we can always just write the bigger packet
      size, and if user-space does a smaller read, it will just get that
      partial end result and the extra alignment padding will simply be thrown
      away.
      
      This makes both automount and systemd happy, since they now get the size
      they asked for, and the kernel side of autofs simply no longer needs to
      care - it could pad out the packet arbitrarily.
      
      Of course, if there is some *other* user of autofs (please, please,
      please tell me it ain't so - and we haven't heard of any) that tries to
      read the packets with multiple writes, that other user will now be
      broken - the whole point of the packetized mode is that one system call
      gets exactly one packet, and you cannot read a packet in pieces.
      Tested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
      Cc: David Miller <davem@davemloft.net>
      Cc: Ian Kent <raven@themaw.net>
      Cc: Thomas Meyer <thomas@m3y3r.de>
      Cc: stable@kernel.org
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      64f371bc
  16. 28 Apr, 2012 1 commit
    • Linus Torvalds's avatar
      Revert "autofs: work around unhappy compat problem on x86-64" · fcbf94b9
      Linus Torvalds authored
      This reverts commit a32744d4
      
      .
      
      While that commit was technically the right thing to do, and made the
      x86-64 compat mode work identically to native 32-bit mode (and thus
      fixing the problem with a 32-bit systemd install on a 64-bit kernel), it
      turns out that the automount binaries had workarounds for this compat
      problem.
      
      Now, the workarounds are disgusting: doing an "uname()" to find out the
      architecture of the kernel, and then comparing it for the 64-bit cases
      and fixing up the size of the read() in automount for those.  And they
      were confused: it's not actually a generic 64-bit issue at all, it's
      very much tied to just x86-64, which has different alignment for an
      'u64' in 64-bit mode than in 32-bit mode.
      
      But the end result is that fixing the compat layer actually breaks the
      case of a 32-bit automount on a x86-64 kernel.
      
      There are various approaches to fix this (including just doing a
      "strcmp()" on current->comm and comparing it to "automount"), but I
      think that I will do the one that teaches pipes about a special "packet
      mode", which will allow user space to not have to care too deeply about
      the padding at the end of the autofs packet.
      
      That change will make the compat workaround unnecessary, so let's revert
      it first, and get automount working again in compat mode.  The
      packetized pipes will then fix autofs for systemd.
      Reported-and-requested-by: default avatarMichael Tokarev <mjt@tls.msk.ru>
      Cc: Ian Kent <raven@themaw.net>
      Cc: stable@kernel.org # for 3.3
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      fcbf94b9
  17. 25 Feb, 2012 1 commit
    • Ian Kent's avatar
      autofs: work around unhappy compat problem on x86-64 · a32744d4
      Ian Kent authored
      When the autofs protocol version 5 packet type was added in commit
      5c0a32fc
      
       ("autofs4: add new packet type for v5 communications"), it
      obvously tried quite hard to be word-size agnostic, and uses explicitly
      sized fields that are all correctly aligned.
      
      However, with the final "char name[NAME_MAX+1]" array at the end, the
      actual size of the structure ends up being not very well defined:
      because the struct isn't marked 'packed', doing a "sizeof()" on it will
      align the size of the struct up to the biggest alignment of the members
      it has.
      
      And despite all the members being the same, the alignment of them is
      different: a "__u64" has 4-byte alignment on x86-32, but native 8-byte
      alignment on x86-64.  And while 'NAME_MAX+1' ends up being a nice round
      number (256), the name[] array starts out a 4-byte aligned.
      
      End result: the "packed" size of the structure is 300 bytes: 4-byte, but
      not 8-byte aligned.
      
      As a result, despite all the fields being in the same place on all
      architectures, sizeof() will round up that size to 304 bytes on
      architectures that have 8-byte alignment for u64.
      
      Note that this is *not* a problem for 32-bit compat mode on POWER, since
      there __u64 is 8-byte aligned even in 32-bit mode.  But on x86, 32-bit
      and 64-bit alignment is different for 64-bit entities, and as a result
      the structure that has exactly the same layout has different sizes.
      
      So on x86-64, but no other architecture, we will just subtract 4 from
      the size of the structure when running in a compat task.  That way we
      will write the properly sized packet that user mode expects.
      
      Not pretty.  Sadly, this very subtle, and unnecessary, size difference
      has been encoded in user space that wants to read packets of *exactly*
      the right size, and will refuse to touch anything else.
      Reported-and-tested-by: default avatarThomas Meyer <thomas@m3y3r.de>
      Signed-off-by: default avatarIan Kent <raven@themaw.net>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      a32744d4
  18. 19 Feb, 2012 1 commit
    • David Howells's avatar
      Wrap accesses to the fd_sets in struct fdtable · 1dce27c5
      David Howells authored
      
      
      Wrap accesses to the fd_sets in struct fdtable (for recording open files and
      close-on-exec flags) so that we can move away from using fd_sets since we
      abuse the fd_set structs by not allocating the full-sized structure under
      normal circumstances and by non-core code looking at the internals of the
      fd_sets.
      
      The first abuse means that use of FD_ZERO() on these fd_sets is not permitted,
      since that cannot be told about their abnormal lengths.
      
      This introduces six wrapper functions for setting, clearing and testing
      close-on-exec flags and fd-is-open flags:
      
      	void __set_close_on_exec(int fd, struct fdtable *fdt);
      	void __clear_close_on_exec(int fd, struct fdtable *fdt);
      	bool close_on_exec(int fd, const struct fdtable *fdt);
      	void __set_open_fd(int fd, struct fdtable *fdt);
      	void __clear_open_fd(int fd, struct fdtable *fdt);
      	bool fd_is_open(int fd, const struct fdtable *fdt);
      
      Note that I've prepended '__' to the names of the set/clear functions because
      they require the caller to hold a lock to use them.
      
      Note also that I haven't added wrappers for looking behind the scenes at the
      the array.  Possibly that should exist too.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Link: http://lkml.kernel.org/r/20120216174942.23314.1364.stgit@warthog.procyon.org.uk
      
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      1dce27c5
  19. 07 Jan, 2012 1 commit
  20. 24 Mar, 2011 1 commit
  21. 16 Jan, 2011 1 commit
    • David Howells's avatar
      Add a dentry op to allow processes to be held during pathwalk transit · cc53ce53
      David Howells authored
      
      
      Add a dentry op (d_manage) to permit a filesystem to hold a process and make it
      sleep when it tries to transit away from one of that filesystem's directories
      during a pathwalk.  The operation is keyed off a new dentry flag
      (DCACHE_MANAGE_TRANSIT).
      
      The filesystem is allowed to be selective about which processes it holds and
      which it permits to continue on or prohibits from transiting from each flagged
      directory.  This will allow autofs to hold up client processes whilst letting
      its userspace daemon through to maintain the directory or the stuff behind it
      or mounted upon it.
      
      The ->d_manage() dentry operation:
      
      	int (*d_manage)(struct path *path, bool mounting_here);
      
      takes a pointer to the directory about to be transited away from and a flag
      indicating whether the transit is undertaken by do_add_mount() or
      do_move_mount() skipping through a pile of filesystems mounted on a mountpoint.
      
      It should return 0 if successful and to let the process continue on its way;
      -EISDIR to prohibit the caller from skipping to overmounted filesystems or
      automounting, and to use this directory; or some other error code to return to
      the user.
      
      ->d_manage() is called with namespace_sem writelocked if mounting_here is true
      and no other locks held, so it may sleep.  However, if mounting_here is true,
      it may not initiate or wait for a mount or unmount upon the parameter
      directory, even if the act is actually performed by userspace.
      
      Within fs/namei.c, follow_managed() is extended to check with d_manage() first
      on each managed directory, before transiting away from it or attempting to
      automount upon it.
      
      follow_down() is renamed follow_down_one() and should only be used where the
      filesystem deliberately intends to avoid management steps (e.g. autofs).
      
      A new follow_down() is added that incorporates the loop done by all other
      callers of follow_down() (do_add/move_mount(), autofs and NFSD; whilst AFS, NFS
      and CIFS do use it, their use is removed by converting them to use
      d_automount()).  The new follow_down() calls d_manage() as appropriate.  It
      also takes an extra parameter to indicate if it is being called from mount code
      (with namespace_sem writelocked) which it passes to d_manage().  follow_down()
      ignores automount points so that it can be used to mount on them.
      
      __follow_mount_rcu() is made to abort rcu-walk mode if it hits a directory with
      DCACHE_MANAGE_TRANSIT set on the basis that we're probably going to have to
      sleep.  It would be possible to enter d_manage() in rcu-walk mode too, and have
      that determine whether to abort or not itself.  That would allow the autofs
      daemon to continue on in rcu-walk mode.
      
      Note that DCACHE_MANAGE_TRANSIT on a directory should be cleared when it isn't
      required as every tranist from that directory will cause d_manage() to be
      invoked.  It can always be set again when necessary.
      
      ==========================
      WHAT THIS MEANS FOR AUTOFS
      ==========================
      
      Autofs currently uses the lookup() inode op and the d_revalidate() dentry op to
      trigger the automounting of indirect mounts, and both of these can be called
      with i_mutex held.
      
      autofs knows that the i_mutex will be held by the caller in lookup(), and so
      can drop it before invoking the daemon - but this isn't so for d_revalidate(),
      since the lock is only held on _some_ of the code paths that call it.  This
      means that autofs can't risk dropping i_mutex from its d_revalidate() function
      before it calls the daemon.
      
      The bug could manifest itself as, for example, a process that's trying to
      validate an automount dentry that gets made to wait because that dentry is
      expired and needs cleaning up:
      
      	mkdir         S ffffffff8014e05a     0 32580  24956
      	Call Trace:
      	 [<ffffffff885371fd>] :autofs4:autofs4_wait+0x674/0x897
      	 [<ffffffff80127f7d>] avc_has_perm+0x46/0x58
      	 [<ffffffff8009fdcf>] autoremove_wake_function+0x0/0x2e
      	 [<ffffffff88537be6>] :autofs4:autofs4_expire_wait+0x41/0x6b
      	 [<ffffffff88535cfc>] :autofs4:autofs4_revalidate+0x91/0x149
      	 [<ffffffff80036d96>] __lookup_hash+0xa0/0x12f
      	 [<ffffffff80057a2f>] lookup_create+0x46/0x80
      	 [<ffffffff800e6e31>] sys_mkdirat+0x56/0xe4
      
      versus the automount daemon which wants to remove that dentry, but can't
      because the normal process is holding the i_mutex lock:
      
      	automount     D ffffffff8014e05a     0 32581      1              32561
      	Call Trace:
      	 [<ffffffff80063c3f>] __mutex_lock_slowpath+0x60/0x9b
      	 [<ffffffff8000ccf1>] do_path_lookup+0x2ca/0x2f1
      	 [<ffffffff80063c89>] .text.lock.mutex+0xf/0x14
      	 [<ffffffff800e6d55>] do_rmdir+0x77/0xde
      	 [<ffffffff8005d229>] tracesys+0x71/0xe0
      	 [<ffffffff8005d28d>] tracesys+0xd5/0xe0
      
      which means that the system is deadlocked.
      
      This patch allows autofs to hold up normal processes whilst the daemon goes
      ahead and does things to the dentry tree behind the automouter point without
      risking a deadlock as almost no locks are held in d_manage() and none in
      d_automount().
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
      Was-Acked-by: default avatarIan Kent <raven@themaw.net>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
      cc53ce53
  22. 15 Oct, 2010 1 commit
    • Arnd Bergmann's avatar
      llseek: automatically add .llseek fop · 6038f373
      Arnd Bergmann authored
      
      
      All file_operations should get a .llseek operation so we can make
      nonseekable_open the default for future file operations without a
      .llseek pointer.
      
      The three cases that we can automatically detect are no_llseek, seq_lseek
      and default_llseek. For cases where we can we can automatically prove that
      the file offset is always ignored, we use noop_llseek, which maintains
      the current behavior of not returning an error from a seek.
      
      New drivers should normally not use noop_llseek but instead use no_llseek
      and call nonseekable_open at open time.  Existing drivers can be converted
      to do the same when the maintainer knows for certain that no user code
      relies on calling seek on the device file.
      
      The generated code is often incorrectly indented and right now contains
      comments that clarify for each added line why a specific variant was
      chosen. In the version that gets submitted upstream, the comments will
      be gone and I will manually fix the indentation, because there does not
      seem to be a way to do that using coccinelle.
      
      Some amount of new code is currently sitting in linux-next that should get
      the same modifications, which I will do at the end of the merge window.
      
      Many thanks to Julia Lawall for helping me learn to write a semantic
      patch that does all this.
      
      ===== begin semantic patch =====
      // This adds an llseek= method to all file operations,
      // as a preparation for making no_llseek the default.
      //
      // The rules are
      // - use no_llseek explicitly if we do nonseekable_open
      // - use seq_lseek for sequential files
      // - use default_llseek if we know we access f_pos
      // - use noop_llseek if we know we don't access f_pos,
      //   but we still want to allow users to call lseek
      //
      @ open1 exists @
      identifier nested_open;
      @@
      nested_open(...)
      {
      <+...
      nonseekable_open(...)
      ...+>
      }
      
      @ open exists@
      identifier open_f;
      identifier i, f;
      identifier open1.nested_open;
      @@
      int open_f(struct inode *i, struct file *f)
      {
      <+...
      (
      nonseekable_open(...)
      |
      nested_open(...)
      )
      ...+>
      }
      
      @ read disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      <+...
      (
         *off = E
      |
         *off += E
      |
         func(..., off, ...)
      |
         E = *off
      )
      ...+>
      }
      
      @ read_no_fpos disable optional_qualifier exists @
      identifier read_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t read_f(struct file *f, char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ write @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      expression E;
      identifier func;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      <+...
      (
        *off = E
      |
        *off += E
      |
        func(..., off, ...)
      |
        E = *off
      )
      ...+>
      }
      
      @ write_no_fpos @
      identifier write_f;
      identifier f, p, s, off;
      type ssize_t, size_t, loff_t;
      @@
      ssize_t write_f(struct file *f, const char *p, size_t s, loff_t *off)
      {
      ... when != off
      }
      
      @ fops0 @
      identifier fops;
      @@
      struct file_operations fops = {
       ...
      };
      
      @ has_llseek depends on fops0 @
      identifier fops0.fops;
      identifier llseek_f;
      @@
      struct file_operations fops = {
      ...
       .llseek = llseek_f,
      ...
      };
      
      @ has_read depends on fops0 @
      identifier fops0.fops;
      identifier read_f;
      @@
      struct file_operations fops = {
      ...
       .read = read_f,
      ...
      };
      
      @ has_write depends on fops0 @
      identifier fops0.fops;
      identifier write_f;
      @@
      struct file_operations fops = {
      ...
       .write = write_f,
      ...
      };
      
      @ has_open depends on fops0 @
      identifier fops0.fops;
      identifier open_f;
      @@
      struct file_operations fops = {
      ...
       .open = open_f,
      ...
      };
      
      // use no_llseek if we call nonseekable_open
      ////////////////////////////////////////////
      @ nonseekable1 depends on !has_llseek && has_open @
      identifier fops0.fops;
      identifier nso ~= "nonseekable_open";
      @@
      struct file_operations fops = {
      ...  .open = nso, ...
      +.llseek = no_llseek, /* nonseekable */
      };
      
      @ nonseekable2 depends on !has_llseek @
      identifier fops0.fops;
      identifier open.open_f;
      @@
      struct file_operations fops = {
      ...  .open = open_f, ...
      +.llseek = no_llseek, /* open uses nonseekable */
      };
      
      // use seq_lseek for sequential files
      /////////////////////////////////////
      @ seq depends on !has_llseek @
      identifier fops0.fops;
      identifier sr ~= "seq_read";
      @@
      struct file_operations fops = {
      ...  .read = sr, ...
      +.llseek = seq_lseek, /* we have seq_read */
      };
      
      // use default_llseek if there is a readdir
      ///////////////////////////////////////////
      @ fops1 depends on !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier readdir_e;
      @@
      // any other fop is used that changes pos
      struct file_operations fops = {
      ... .readdir = readdir_e, ...
      +.llseek = default_llseek, /* readdir is present */
      };
      
      // use default_llseek if at least one of read/write touches f_pos
      /////////////////////////////////////////////////////////////////
      @ fops2 depends on !fops1 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read.read_f;
      @@
      // read fops use offset
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = default_llseek, /* read accesses f_pos */
      };
      
      @ fops3 depends on !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ... .write = write_f, ...
      +	.llseek = default_llseek, /* write accesses f_pos */
      };
      
      // Use noop_llseek if neither read nor write accesses f_pos
      ///////////////////////////////////////////////////////////
      
      @ fops4 depends on !fops1 && !fops2 && !fops3 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      identifier write_no_fpos.write_f;
      @@
      // write fops use offset
      struct file_operations fops = {
      ...
       .write = write_f,
       .read = read_f,
      ...
      +.llseek = noop_llseek, /* read and write both use no f_pos */
      };
      
      @ depends on has_write && !has_read && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier write_no_fpos.write_f;
      @@
      struct file_operations fops = {
      ... .write = write_f, ...
      +.llseek = noop_llseek, /* write uses no f_pos */
      };
      
      @ depends on has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      identifier read_no_fpos.read_f;
      @@
      struct file_operations fops = {
      ... .read = read_f, ...
      +.llseek = noop_llseek, /* read uses no f_pos */
      };
      
      @ depends on !has_read && !has_write && !fops1 && !fops2 && !has_llseek && !nonseekable1 && !nonseekable2 && !seq @
      identifier fops0.fops;
      @@
      struct file_operations fops = {
      ...
      +.llseek = noop_llseek, /* no read or write fn */
      };
      ===== End semantic patch =====
      Signed-off-by: default avatarArnd Bergmann <arnd@arndb.de>
      Cc: Julia Lawall <julia@diku.dk>
      Cc: Christoph Hellwig <hch@infradead.org>
      6038f373
  23. 27 May, 2010 1 commit
    • Julia Lawall's avatar
      fs/autofs4: use memdup_user · 7ca5ca60
      Julia Lawall authored
      Use memdup_user when user data is immediately copied into the allocated
      region.  Elimination of the variable ads, which is no longer useful.
      
      The semantic patch that makes this change is as follows:
      (http://coccinelle.lip6.fr/
      
      )
      
      // <smpl>
      @@
      expression from,to,size,flag;
      position p;
      identifier l1,l2;
      @@
      
      -  to = \(kmalloc@p\|kzalloc@p\)(size,flag);
      +  to = memdup_user(from,size);
         if (
      -      to==NULL
      +      IS_ERR(to)
                       || ...) {
         <+... when != goto l1;
      -  -ENOMEM
      +  PTR_ERR(to)
         ...+>
         }
      -  if (copy_from_user(to, from, size) != 0) {
      -    <+... when != goto l2;
      -    -EFAULT
      -    ...+>
      -  }
      // </smpl>
      Signed-off-by: default avatarJulia Lawall <julia@diku.dk>
      Cc: Ian Kent <raven@themaw.net>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      7ca5ca60
  24. 25 May, 2010 1 commit
    • Kay Sievers's avatar
      driver core: add devname module aliases to allow module on-demand auto-loading · 578454ff
      Kay Sievers authored
      This adds:
        alias: devname:<name>
      to some common kernel modules, which will allow the on-demand loading
      of the kernel module when the device node is accessed.
      
      Ideally all these modules would be compiled-in, but distros seems too
      much in love with their modularization that we need to cover the common
      cases with this new facility. It will allow us to remove a bunch of pretty
      useless init scripts and modprobes from init scripts.
      
      The static device node aliases will be carried in the module itself. The
      program depmod will extract this information to a file in the module directory:
        $ cat /lib/modules/2.6.34-00650-g537b60d1
      
      -dirty/modules.devname
        # Device nodes to trigger on-demand module loading.
        microcode cpu/microcode c10:184
        fuse fuse c10:229
        ppp_generic ppp c108:0
        tun net/tun c10:200
        dm_mod mapper/control c10:235
      
      Udev will pick up the depmod created file on startup and create all the
      static device nodes which the kernel modules specify, so that these modules
      get automatically loaded when the device node is accessed:
        $ /sbin/udevd --debug
        ...
        static_dev_create_from_modules: mknod '/dev/cpu/microcode' c10:184
        static_dev_create_from_modules: mknod '/dev/fuse' c10:229
        static_dev_create_from_modules: mknod '/dev/ppp' c108:0
        static_dev_create_from_modules: mknod '/dev/net/tun' c10:200
        static_dev_create_from_modules: mknod '/dev/mapper/control' c10:235
        udev_rules_apply_static_dev_perms: chmod '/dev/net/tun' 0666
        udev_rules_apply_static_dev_perms: chmod '/dev/fuse' 0666
      
      A few device nodes are switched to statically allocated numbers, to allow
      the static nodes to work. This might also useful for systems which still run
      a plain static /dev, which is completely unsafe to use with any dynamic minor
      numbers.
      
      Note:
      The devname aliases must be limited to the *common* and *single*instance*
      device nodes, like the misc devices, and never be used for conceptually limited
      systems like the loop devices, which should rather get fixed properly and get a
      control node for losetup to talk to, instead of creating a random number of
      device nodes in advance, regardless if they are ever used.
      
      This facility is to hide the mess distros are creating with too modualized
      kernels, and just to hide that these modules are not compiled-in, and not to
      paper-over broken concepts. Thanks! :)
      
      Cc: Greg Kroah-Hartman <gregkh@suse.de>
      Cc: David S. Miller <davem@davemloft.net>
      Cc: Miklos Szeredi <miklos@szeredi.hu>
      Cc: Chris Mason <chris.mason@oracle.com>
      Cc: Alasdair G Kergon <agk@redhat.com>
      Cc: Tigran Aivazian <tigran@aivazian.fsnet.co.uk>
      Cc: Ian Kent <raven@themaw.net>
      Signed-Off-By: default avatarKay Sievers <kay.sievers@vrfy.org>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@suse.de>
      578454ff
  25. 30 Mar, 2010 1 commit
    • Tejun Heo's avatar
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking... · 5a0e3ad6
      Tejun Heo authored
      include cleanup: Update gfp.h and slab.h includes to prepare for breaking implicit slab.h inclusion from percpu.h
      
      percpu.h is included by sched.h and module.h and thus ends up being
      included when building most .c files.  percpu.h includes slab.h which
      in turn includes gfp.h making everything defined by the two files
      universally available and complicating inclusion dependencies.
      
      percpu.h -> slab.h dependency is about to be removed.  Prepare for
      this change by updating users of gfp and slab facilities include those
      headers directly instead of assuming availability.  As this conversion
      needs to touch large number of source files, the following script is
      used as the basis of conversion.
      
        http://userweb.kernel.org/~tj/misc/slabh-sweep.py
      
      
      
      The script does the followings.
      
      * Scan files for gfp and slab usages and update includes such that
        only the necessary includes are there.  ie. if only gfp is used,
        gfp.h, if slab is used, slab.h.
      
      * When the script inserts a new include, it looks at the include
        blocks and try to put the new include such that its order conforms
        to its surrounding.  It's put in the include block which contains
        core kernel includes, in the same order that the rest are ordered -
        alphabetical, Christmas tree, rev-Xmas-tree or at the end if there
        doesn't seem to be any matching order.
      
      * If the script can't find a place to put a new include (mostly
        because the file doesn't have fitting include block), it prints out
        an error message indicating which .h file needs to be added to the
        file.
      
      The conversion was done in the following steps.
      
      1. The initial automatic conversion of all .c files updated slightly
         over 4000 files, deleting around 700 includes and adding ~480 gfp.h
         and ~3000 slab.h inclusions.  The script emitted errors for ~400
         files.
      
      2. Each error was manually checked.  Some didn't need the inclusion,
         some needed manual addition while adding it to implementation .h or
         embedding .c file was more appropriate for others.  This step added
         inclusions to around 150 files.
      
      3. The script was run again and the output was compared to the edits
         from #2 to make sure no file was left behind.
      
      4. Several build tests were done and a couple of problems were fixed.
         e.g. lib/decompress_*.c used malloc/free() wrappers around slab
         APIs requiring slab.h to be added manually.
      
      5. The script was run on all .h files but without automatically
         editing them as sprinkling gfp.h and slab.h inclusions around .h
         files could easily lead to inclusion dependency hell.  Most gfp.h
         inclusion directives were ignored as stuff from gfp.h was usually
         wildly available and often used in preprocessor macros.  Each
         slab.h inclusion directive was examined and added manually as
         necessary.
      
      6. percpu.h was updated not to include slab.h.
      
      7. Build test were done on the following configurations and failures
         were fixed.  CONFIG_GCOV_KERNEL was turned off for all tests (as my
         distributed build env didn't work with gcov compiles) and a few
         more options had to be turned off depending on archs to make things
         build (like ipr on powerpc/64 which failed due to missing writeq).
      
         * x86 and x86_64 UP and SMP allmodconfig and a custom test config.
         * powerpc and powerpc64 SMP allmodconfig
         * sparc and sparc64 SMP allmodconfig
         * ia64 SMP allmodconfig
         * s390 SMP allmodconfig
         * alpha SMP allmodconfig
         * um on x86_64 SMP allmodconfig
      
      8. percpu.h modifications were reverted so that it could be applied as
         a separate patch and serve as bisection point.
      
      Given the fact that I had only a couple of failures from tests on step
      6, I'm fairly confident about the coverage of this conversion patch.
      If there is a breakage, it's likely to be something in one of the arch
      headers which should be easily discoverable easily on most builds of
      the specific arch.
      Signed-off-by: default avatarTejun Heo <tj@kernel.org>
      Guess-its-ok-by: default avatarChristoph Lameter <cl@linux-foundation.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Lee Schermerhorn <Lee.Schermerhorn@hp.com>
      5a0e3ad6
  26. 03 Mar, 2010 1 commit
  27. 12 Jul, 2009 1 commit
  28. 12 Jun, 2009 3 commits
  29. 21 Apr, 2009 2 commits