  08 May, 2015 (4 commits)
    • locking/qspinlock: Optimize for smaller NR_CPUS · 69f9cae9
      Peter Zijlstra (Intel) authored
      
      
      When we allow for a max NR_CPUS < 2^14, we can optimize the pending
      wait-acquire and the xchg_tail() operations.
      
      By growing the pending bit to a byte, we reduce the tail to 16 bits.
      This means we can use xchg16 for the tail part and do away with all
      the repeated cmpxchg() operations.
      
      This in turn allows us to unconditionally acquire; the locked state
      as observed by the wait loops cannot change. And because both locked
      and pending are now full bytes, we can use simple stores for the
      state transition, obviating one atomic operation entirely.
      
      This optimization is needed to make the qspinlock achieve performance
      parity with ticket spinlock at light load.
      
      All this is horribly broken on Alpha pre EV56 (and any other arch that
      cannot do single-copy atomic byte stores).
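
      As a hedged illustration (userspace C11 atomics, little-endian byte
      order assumed; a sketch, not the kernel source), the lock-word
      layout and the two operations it enables look roughly like this:

      #include <stdatomic.h>
      #include <stdint.h>

      /* 32-bit lock word for NR_CPUS < 2^14 (little-endian):
       *   bits  0- 7: locked byte
       *   bits  8-15: pending byte
       *   bits 16-31: tail (2-bit node index + 14-bit CPU number)
       */
      struct qspinlock {
              union {
                      _Atomic uint32_t val;
                      struct {
                              _Atomic uint16_t locked_pending;
                              _Atomic uint16_t tail;
                      };
              };
      };

      /* One xchg on the tail half-word replaces the cmpxchg() loop. */
      static inline uint16_t xchg_tail(struct qspinlock *lock, uint16_t tail)
      {
              return atomic_exchange_explicit(&lock->tail, tail,
                                              memory_order_acq_rel);
      }

      /* 0,1,0 -> 0,0,1: clear pending and set locked with one plain
       * half-word store instead of an atomic RMW.  This store is what
       * breaks on pre-EV56 Alpha, which lacks single-copy atomic
       * sub-word stores.
       */
      static inline void clear_pending_set_locked(struct qspinlock *lock)
      {
              atomic_store_explicit(&lock->locked_pending, 1,
                                    memory_order_relaxed);
      }
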
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Daniel J Blueman <daniel@numascale.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1429901803-29771-6-git-send-email-Waiman.Long@hp.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/qspinlock: Extract out code snippets for the next patch · 6403bd7d
      Waiman Long authored
      
      
      This is a preparatory patch that extracts the following two code
      snippets ahead of the next performance optimization patch:
      
       1) the logic for the exchange of new and previous tail code words
          into a new xchg_tail() function.
       2) the logic for clearing the pending bit and setting the locked bit
          into a new clear_pending_set_locked() function.
      
      This patch also simplifies the trylock operation before queuing by
      calling queued_spin_trylock() directly.
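
      For reference, a sketch of the two extracted helpers in this
      generic form (illustrative userspace C11 code, not the patch
      itself; the constants follow the layout at this point in the
      series, where pending is still a single bit and the tail starts
      at bit 9):

      #include <stdatomic.h>
      #include <stdint.h>

      #define _Q_LOCKED_VAL   1U                      /* locked byte */
      #define _Q_PENDING_VAL  (1U << 8)               /* pending bit */
      #define _Q_TAIL_OFFSET  9
      #define _Q_TAIL_MASK    (~0U << _Q_TAIL_OFFSET) /* tail word   */

      /* Swap in the new tail code word with a cmpxchg loop over the
       * whole 32-bit word; the next patch shrinks this to one xchg16.
       */
      static inline uint32_t xchg_tail(_Atomic uint32_t *lock, uint32_t tail)
      {
              uint32_t old = atomic_load_explicit(lock, memory_order_relaxed);

              while (!atomic_compare_exchange_weak_explicit(lock, &old,
                              (old & ~_Q_TAIL_MASK) | tail,
                              memory_order_acq_rel, memory_order_relaxed))
                      ;       /* 'old' is refreshed on each failure */
              return old & _Q_TAIL_MASK;
      }

      /* *,1,0 -> *,0,1: one atomic add clears the pending bit and sets
       * the locked byte at once (unsigned wrap-around does the subtract).
       */
      static inline void clear_pending_set_locked(_Atomic uint32_t *lock)
      {
              atomic_fetch_add_explicit(lock,
                              -_Q_PENDING_VAL + _Q_LOCKED_VAL,
                              memory_order_relaxed);
      }
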
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Daniel J Blueman <daniel@numascale.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1429901803-29771-5-git-send-email-Waiman.Long@hp.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/qspinlock: Add pending bit · c1fb159d
      Peter Zijlstra (Intel) authored
      
      
      Because the qspinlock needs to touch a second cacheline (the per-cpu
      mcs_nodes[]), add a pending bit and allow a single in-word spinner
      before we punt to the second cacheline.
      
      It is possible to observe the pending bit without the locked bit when
      the last owner has just released but the pending owner has not yet
      taken ownership.
      
      We would normally queue in that case, because the pending bit is
      already taken. However, the pending bit is guaranteed to be released
      'soon', so wait for it instead of queueing.
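
      The shape of that path, as a hedged sketch (userspace C11 code;
      pend_spin_trylock() is a name made up here, and a return of 0
      means the caller must fall back to the MCS queue):

      #include <stdatomic.h>
      #include <stdint.h>

      #define _Q_LOCKED_VAL   1U              /* locked byte */
      #define _Q_PENDING_VAL  (1U << 8)       /* pending bit */

      /* Transitions are noted as (tail,pending,locked) triples. */
      static int pend_spin_trylock(_Atomic uint32_t *lock)
      {
              uint32_t val = atomic_load_explicit(lock, memory_order_relaxed);
              uint32_t new;

              /* 0,1,0: the old owner released but the pending waiter
               * has not yet taken over; that completes 'soon', so wait
               * for it instead of queueing.
               */
              while (val == _Q_PENDING_VAL)
                      val = atomic_load_explicit(lock, memory_order_relaxed);

              for (;;) {
                      /* tail or pending already claimed: go queue */
                      if (val & ~_Q_LOCKED_VAL)
                              return 0;

                      /* 0,0,0 -> 0,0,1 (lock) or 0,0,1 -> 0,1,1 (pend) */
                      new = val ? (val | _Q_PENDING_VAL) : _Q_LOCKED_VAL;
                      if (atomic_compare_exchange_weak_explicit(lock, &val,
                                      new, memory_order_acquire,
                                      memory_order_relaxed))
                              break;
                      /* cmpxchg failed; 'val' was refreshed, retry */
              }

              if (new == _Q_LOCKED_VAL)
                      return 1;               /* lock taken directly */

              /* 0,1,1 -> 0,1,0: we are the single in-word spinner */
              while (atomic_load_explicit(lock, memory_order_acquire) &
                     _Q_LOCKED_VAL)
                      ;

              /* 0,1,0 -> 0,0,1: take ownership, clear pending */
              atomic_fetch_add_explicit(lock,
                              -_Q_PENDING_VAL + _Q_LOCKED_VAL,
                              memory_order_relaxed);
              return 1;
      }
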
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Daniel J Blueman <daniel@numascale.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1429901803-29771-4-git-send-email-Waiman.Long@hp.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>
    • locking/qspinlock: Introduce a simple generic 4-byte queued spinlock · a33fda35
      Waiman Long authored
      
      
      This patch introduces a new generic queued spinlock implementation
      that can serve as an alternative to the default ticket spinlock. The
      queued spinlock should be almost as fair as the ticket spinlock, has
      about the same speed single-threaded, and can be much faster under
      high contention, especially when the spinlock is embedded within the
      data structure it protects.
      
      Only under light to moderate contention, where the average queue
      depth is around 1-3, might this queued spinlock be a bit slower,
      due to its higher slowpath overhead.
      
      This queued spinlock is especially suited to NUMA machines with a
      large number of cores, as the chance of spinlock contention is much
      higher on such machines. The cost of contention is also higher
      because of slower inter-node memory traffic.
      
      Because spinlocks are acquired with preemption disabled, a process
      will not be migrated to another CPU while it is trying to get a
      spinlock. Ignoring interrupt handling, a CPU can therefore be
      contending on only one spinlock at any one time. Counting soft IRQ,
      hard IRQ and NMI contexts, a CPU can have at most 4 concurrent
      lock-waiting activities.  By allocating a set of per-cpu queue nodes
      and using them to form a waiting queue, we can encode the queue node
      address into a much smaller 24-bit value (comprising the CPU number
      and a queue node index), leaving one byte for the lock.
      
      Please note that the queue node is only needed while waiting for the
      lock. Once the lock is acquired, the queue node can be released and
      reused later.
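
      A hedged sketch of the 24-bit tail encoding described above, in
      plain C (the constants and the 64-CPU bound are stand-ins for this
      sketch; the kernel derives them from NR_CPUS):

      #include <stdint.h>

      #define MAX_NODES           4   /* task, softirq, hardirq, NMI   */
      #define NR_CPUS             64  /* stand-in bound for the sketch */
      #define _Q_TAIL_IDX_OFFSET  8   /* tail sits above the lock byte */
      #define _Q_TAIL_IDX_BITS    2
      #define _Q_TAIL_IDX_MASK    (((1U << _Q_TAIL_IDX_BITS) - 1) << \
                                   _Q_TAIL_IDX_OFFSET)
      #define _Q_TAIL_CPU_OFFSET  (_Q_TAIL_IDX_OFFSET + _Q_TAIL_IDX_BITS)

      struct mcs_spinlock {
              struct mcs_spinlock *next;
              int locked;
              int count;      /* nesting level in use on this CPU */
      };

      /* per-cpu queue nodes, one per context that can spin */
      static struct mcs_spinlock mcs_nodes[NR_CPUS][MAX_NODES];

      /* CPU number + node index -> tail code word; the +1 lets a tail
       * of 0 mean "no queue".
       */
      static uint32_t encode_tail(int cpu, int idx)
      {
              return ((uint32_t)(cpu + 1) << _Q_TAIL_CPU_OFFSET) |
                     ((uint32_t)idx << _Q_TAIL_IDX_OFFSET);
      }

      static struct mcs_spinlock *decode_tail(uint32_t tail)
      {
              int cpu = (int)(tail >> _Q_TAIL_CPU_OFFSET) - 1;
              int idx = (int)((tail & _Q_TAIL_IDX_MASK) >>
                              _Q_TAIL_IDX_OFFSET);

              return &mcs_nodes[cpu][idx];
      }
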
      Signed-off-by: Waiman Long <Waiman.Long@hp.com>
      Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
      Cc: Borislav Petkov <bp@alien8.de>
      Cc: Daniel J Blueman <daniel@numascale.com>
      Cc: David Vrabel <david.vrabel@citrix.com>
      Cc: Douglas Hatch <doug.hatch@hp.com>
      Cc: H. Peter Anvin <hpa@zytor.com>
      Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
      Cc: Linus Torvalds <torvalds@linux-foundation.org>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Paolo Bonzini <paolo.bonzini@gmail.com>
      Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com>
      Cc: Rik van Riel <riel@redhat.com>
      Cc: Scott J Norton <scott.norton@hp.com>
      Cc: Thomas Gleixner <tglx@linutronix.de>
      Cc: virtualization@lists.linux-foundation.org
      Cc: xen-devel@lists.xenproject.org
      Link: http://lkml.kernel.org/r/1429901803-29771-2-git-send-email-Waiman.Long@hp.com
      
      Signed-off-by: Ingo Molnar <mingo@kernel.org>