1. 24 Jan, 2013 1 commit
    • Jonathan Brassow's avatar
      DM-RAID: Fix RAID10's check for sufficient redundancy · 55ebbb59
      Jonathan Brassow authored
      Before attempting to activate a RAID array, it is checked for sufficient
      redundancy.  That is, we make sure that there are not too many failed
      devices - or devices specified for rebuild - to undermine our ability to
      activate the array.  The current code performs this check twice - once to
      ensure there were not too many devices specified for rebuild by the user
      ('validate_rebuild_devices') and again after possibly experiencing a failure
      to read the superblock ('analyse_superblocks').  Neither of these checks are
      sufficient.  The first check is done properly but with insufficient
      information about the possible failure state of the devices to make a good
      determination if the array can be activated.  The second check is simply
      done wrong in the case of RAID10 because it doesn't account for the
      independence of the stripes (i.e. mirror sets).  The solution is to use the
      properly written check ('validate_rebuild_devices'), but perform the check
      after the superblocks have been read and we know which devices have failed.
      This gives us one check instead of two and performs it in a location where
      it can be done right.
      Only RAID10 was affected and it was affected in the following ways:
      - the code did not properly catch the condition where a user specified
        a device for rebuild that already had a failed device in the same mirror
        set.  (This condition would, however, be caught at a deeper level in MD.)
      - the code triggers a false positive and denies activation when devices in
        independent mirror sets have failed - counting the failures as though they
        were all in the same set.
      The most likely place this error was introduced (or this patch should have
      been included) is in commit 4ec1e369
       - first introduced in v3.7-rc1.
      Consequently this fix should also go in v3.7.y, however there is a
      small conflict on the .version in raid_target, so I'll submit a
      separate patch to -stable.
      Cc: stable@vger.kernel.org
      Signed-off-by: default avatarJonathan Brassow <jbrassow@redhat.com>
      Signed-off-by: default avatarNeilBrown <neilb@suse.de>
  2. 23 Jan, 2013 1 commit
  3. 06 Jan, 2013 1 commit
  4. 05 Jan, 2013 3 commits
    • Rafael J. Wysocki's avatar
      PM: Move disabling/enabling runtime PM to late suspend/early resume · 9f6d8f6a
      Rafael J. Wysocki authored
      Currently, the PM core disables runtime PM for all devices right
      after executing subsystem/driver .suspend() callbacks for them
      and re-enables it right before executing subsystem/driver .resume()
      callbacks for them.  This may lead to problems when there are
      two devices such that the .suspend() callback executed for one of
      them depends on runtime PM working for the other.  In that case,
      if runtime PM has already been disabled for the second device,
      the first one's .suspend() won't work correctly (and analogously
      for resume).
      To make those issues go away, make the PM core disable runtime PM
      for devices right before executing subsystem/driver .suspend_late()
      callbacks for them and enable runtime PM for them right after
      executing subsystem/driver .resume_early() callbacks for them.  This
      way the potential conflitcs between .suspend_late()/.resume_early()
      and their runtime PM counterparts are still prevented from happening,
      but the subtle ordering issues related to disabling/enabling runtime
      PM for devices during system suspend/resume are much easier to avoid.
      Reported-and-tested-by: default avatarJan-Matthias Braun <jan_braun@gmx.net>
      Signed-off-by: default avatarRafael J. Wysocki <rafael.j.wysocki@intel.com>
      Reviewed-by: default avatarUlf Hansson <ulf.hansson@linaro.org>
      Reviewed-by: default avatarKevin Hilman <khilman@deeprootsystems.com>
      Cc: 3.4+ <stable@vger.kernel.org>
    • Carlos Alberto Lopez Perez's avatar
      Documentation/sysctl/kernel.txt: document /proc/sys/shmall · 358e419f
      Carlos Alberto Lopez Perez authored
      Signed-off-by: default avatarCarlos Alberto Lopez Perez <clopez@igalia.com>
      Cc: Rob Landley <rob@landley.net>
      Cc: Larry Finger <Larry.Finger@lwfinger.net>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: Mitsuo Hayasaka <mitsuo.hayasaka.hu@hitachi.com>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    • Stanislav Kinsbursky's avatar
      ipc: add sysctl to specify desired next object id · 03f59566
      Stanislav Kinsbursky authored
      Add 3 new variables and sysctls to tune them (by one "next_id" variable
      for messages, semaphores and shared memory respectively).  This variable
      can be used to set desired id for next allocated IPC object.  By default
      it's equal to -1 and old behaviour is preserved.  If this variable is
      non-negative, then desired idr will be extracted from it and used as a
      start value to search for free IDR slot.
      1) this patch doesn't guarantee that the new object will have desired
         id.  So it's up to user space how to handle new object with wrong id.
      2) After a sucessful id allocation attempt, "next_id" will be set back
         to -1 (if it was non-negative).
      [akpm@linux-foundation.org: checkpatch fixes]
      Signed-off-by: default avatarStanislav Kinsbursky <skinsbursky@parallels.com>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Pavel Emelyanov <xemul@parallels.com>
      Cc: Al Viro <viro@zeniv.linux.org.uk>
      Cc: KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>
      Cc: Michael Kerrisk <mtk.manpages@gmail.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
  5. 04 Jan, 2013 4 commits
  6. 03 Jan, 2013 1 commit
    • Greg Kroah-Hartman's avatar
      Documentation: remove __dev* attributes. · 63a29f74
      Greg Kroah-Hartman authored
      CONFIG_HOTPLUG is going away as an option.  As a result, the __dev*
      markings need to be removed.
      This change removes the use of __devinit, __devexit_p, __devinitdata,
      __devinitconst, and __devexit from the kernel documentation.
      Based on patches originally written by Bill Pemberton, but redone by me
      in order to handle some of the coding style issues better, by hand.
      Cc: Bill Pemberton <wfp5p@virginia.edu>
      Signed-off-by: default avatarGreg Kroah-Hartman <gregkh@linuxfoundation.org>
  7. 02 Jan, 2013 1 commit
  8. 26 Dec, 2012 2 commits
  9. 21 Dec, 2012 2 commits
  10. 20 Dec, 2012 3 commits
    • Marco Stornelli's avatar
      documentation: drop vmtruncate · b9f61c3c
      Marco Stornelli authored
      Removed vmtruncate
      Signed-off-by: default avatarMarco Stornelli <marco.stornelli@gmail.com>
      Signed-off-by: default avatarAl Viro <viro@zeniv.linux.org.uk>
    • David Howells's avatar
      FS-Cache: Provide proper invalidation · ef778e7a
      David Howells authored
      Provide a proper invalidation method rather than relying on the netfs retiring
      the cookie it has and getting a new one.  The problem with this is that isn't
      easy for the netfs to make sure that it has completed/cancelled all its
      outstanding storage and retrieval operations on the cookie it is retiring.
      Instead, have the cache provide an invalidation method that will cancel or wait
      for all currently outstanding operations before invalidating the cache, and
      will cause new operations to queue up behind that.  Whilst invalidation is in
      progress, some requests will be rejected until the cache can stack a barrier on
      the operation queue to cause new operations to be deferred behind it.
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
    • David Howells's avatar
      FS-Cache: Fix operation state management and accounting · 9f10523f
      David Howells authored
      Fix the state management of internal fscache operations and the accounting of
      what operations are in what states.
      This is done by:
       (1) Give struct fscache_operation a enum variable that directly represents the
           state it's currently in, rather than spreading this knowledge over a bunch
           of flags, who's processing the operation at the moment and whether it is
           queued or not.
           This makes it easier to write assertions to check the state at various
           points and to prevent invalid state transitions.
       (2) Add an 'operation complete' state and supply a function to indicate the
           completion of an operation (fscache_op_complete()) and make things call
           it.  The final call to fscache_put_operation() can then check that an op
           in the appropriate state (complete or cancelled).
       (3) Adjust the use of object->n_ops, ->n_in_progress, ->n_exclusive to better
           govern the state of an object:
      	(a) The ->n_ops is now the number of extant operations on the object
      	    and is now decremented by fscache_put_operation() only.
      	(b) The ->n_in_progress is simply the number of objects that have been
      	    taken off of the object's pending queue for the purposes of being
      	    run.  This is decremented by fscache_op_complete() only.
      	(c) The ->n_exclusive is the number of exclusive ops that have been
      	    submitted and queued or are in progress.  It is decremented by
      	    fscache_op_complete() and by fscache_cancel_op().
           fscache_put_operation() and fscache_operation_gc() now no longer try to
           clean up ->n_exclusive and ->n_in_progress.  That was leading to double
           decrements against fscache_cancel_op().
           fscache_cancel_op() now no longer decrements ->n_ops.  That was leading to
           double decrements against fscache_put_operation().
           fscache_submit_exclusive_op() now decides whether it has to queue an op
           based on ->n_in_progress being > 0 rather than ->n_ops > 0 as the latter
           will persist in being true even after all preceding operations have been
           cancelled or completed.  Furthermore, if an object is active and there are
           runnable ops against it, there must be at least one op running.
       (4) Add a remaining-pages counter (n_pages) to struct fscache_retrieval and
           provide a function to record completion of the pages as they complete.
           When n_pages reaches 0, the operation is deemed to be complete and
           fscache_op_complete() is called.
           Add calls to fscache_retrieval_complete() anywhere we've finished with a
           page we've been given to read or allocate for.  This includes places where
           we just return pages to the netfs for reading from the server and where
           accessing the cache fails and we discard the proposed netfs page.
      The bugs in the unfixed state management manifest themselves as oopses like the
      following where the operation completion gets out of sync with return of the
      cookie by the netfs.  This is possible because the cache unlocks and returns
      all the netfs pages before recording its completion - which means that there's
      nothing to stop the netfs discarding them and returning the cookie.
      FS-Cache: Cookie 'NFS.fh' still has outstanding reads
      ------------[ cut here ]------------
      kernel BUG at fs/fscache/cookie.c:519!
      invalid opcode: 0000 [#1] SMP
      CPU 1
      Modules linked in: cachefiles nfs fscache auth_rpcgss nfs_acl lockd sunrpc
      Pid: 400, comm: kswapd0 Not tainted 3.1.0-rc7-fsdevel+ #1090                  /DG965RY
      RIP: 0010:[<ffffffffa007050a>]  [<ffffffffa007050a>] __fscache_relinquish_cookie+0x170/0x343 [fscache]
      RSP: 0018:ffff8800368cfb00  EFLAGS: 00010282
      RAX: 000000000000003c RBX: ffff880023cc8790 RCX: 0000000000000000
      RDX: 0000000000002f2e RSI: 0000000000000001 RDI: ffffffff813ab86c
      RBP: ffff8800368cfb50 R08: 0000000000000002 R09: 0000000000000000
      R10: ffff88003a1b7890 R11: ffff88001df6e488 R12: ffff880023d8ed98
      R13: ffff880023cc8798 R14: 0000000000000004 R15: ffff88003b8bf370
      FS:  0000000000000000(0000) GS:ffff88003bd00000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      CR2: 00000000008ba008 CR3: 0000000023d93000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process kswapd0 (pid: 400, threadinfo ffff8800368ce000, task ffff88003b8bf040)
       ffff88003b8bf040 ffff88001df6e528 ffff88001df6e528 ffffffffa00b46b0
       ffff88003b8bf040 ffff88001df6e488 ffff88001df6e620 ffffffffa00b46b0
       ffff88001ebd04c8 0000000000000004 ffff8800368cfb70 ffffffffa00b2c91
      Call Trace:
       [<ffffffffa00b2c91>] nfs_fscache_release_inode_cookie+0x3b/0x47 [nfs]
       [<ffffffffa008f25f>] nfs_clear_inode+0x3c/0x41 [nfs]
       [<ffffffffa0090df1>] nfs4_evict_inode+0x2f/0x33 [nfs]
       [<ffffffff810d8d47>] evict+0xa1/0x15c
       [<ffffffff810d8e2e>] dispose_list+0x2c/0x38
       [<ffffffff810d9ebd>] prune_icache_sb+0x28c/0x29b
       [<ffffffff810c56b7>] prune_super+0xd5/0x140
       [<ffffffff8109b615>] shrink_slab+0x102/0x1ab
       [<ffffffff8109d690>] balance_pgdat+0x2f2/0x595
       [<ffffffff8103e009>] ? process_timeout+0xb/0xb
       [<ffffffff8109dba3>] kswapd+0x270/0x289
       [<ffffffff8104c5ea>] ? __init_waitqueue_head+0x46/0x46
       [<ffffffff8109d933>] ? balance_pgdat+0x595/0x595
       [<ffffffff8104bf7a>] kthread+0x7f/0x87
       [<ffffffff813ad6b4>] kernel_thread_helper+0x4/0x10
       [<ffffffff81026b98>] ? finish_task_switch+0x45/0xc0
       [<ffffffff813abcdd>] ? retint_restore_args+0xe/0xe
       [<ffffffff8104befb>] ? __init_kthread_worker+0x53/0x53
       [<ffffffff813ad6b0>] ? gs_change+0xb/0xb
      Signed-off-by: default avatarDavid Howells <dhowells@redhat.com>
  11. 19 Dec, 2012 5 commits
  12. 18 Dec, 2012 16 commits