  • powerpc: Use lwsync for acquire barrier if CPU supports it · 5a0e9b57
    Anton Blanchard authored
    
    
    Nick Piggin discovered that lwsync barriers around locks were faster than isync
    on 970. That was a long time ago and I completely dropped the ball in testing
    his patches across other ppc64 processors.
    
    Turns out the idea helps on other chips. Using a microbenchmark that
    uses a lot of threads to contend on a global pthread mutex (and therefore a
    global futex), POWER6 improves 8% and POWER7 improves 2%. I checked POWER5
    and while I couldn't measure an improvement, there was no regression.
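
A minimal sketch of the kind of microbenchmark described above, assuming a
simple shared counter as the work done under the lock. The names
`run_contention`, `worker`, and `shared_counter` are hypothetical; this is not
the benchmark that produced the numbers quoted here.

```c
#include <pthread.h>
#include <stdlib.h>

/* One global mutex that every thread fights over (hypothetical names). */
static pthread_mutex_t global_lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter;

struct worker_arg {
    long iters;   /* lock/unlock pairs each thread performs */
};

static void *worker(void *p)
{
    long iters = ((struct worker_arg *)p)->iters;

    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&global_lock);    /* acquire barrier on this path */
        shared_counter++;
        pthread_mutex_unlock(&global_lock);
    }
    return NULL;
}

/*
 * Start nthreads workers contending on global_lock and wait for them.
 * Returns the final counter value, or -1 if setup failed.
 */
static long run_contention(int nthreads, long iters)
{
    struct worker_arg arg = { iters };
    pthread_t *tids;
    int started = 0;

    if (nthreads <= 0)
        return -1;
    tids = malloc(sizeof(*tids) * (size_t)nthreads);
    if (!tids)
        return -1;

    shared_counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    free(tids);
    return started == nthreads ? shared_counter : -1;
}
```

To compare barrier variants, one would time a call such as
`run_contention(64, 1000000)` with `clock_gettime(CLOCK_MONOTONIC, ...)` on
kernels with and without the patch; the thread and iteration counts here are
illustrative only.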
    
    This patch uses the lwsync patching code to replace the isyncs with lwsyncs
    on CPUs that support the instruction. We were marking POWER3 and RS64 as lwsync
    capable but in reality they treat it as a full sync (i.e. slow). Remove the
    CPU_FTR_LWSYNC bit from these CPUs so they continue to use the faster isync
    method.
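
The patching step can be sketched roughly as follows. This is a simplified
user-space model, not the kernel's fixup code: `patch_acquire_barriers`, the
`sites` table, and the `FTR_LWSYNC` bit value are assumptions made for
illustration. Only the two instruction encodings are real PowerPC opcodes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* PowerPC instruction encodings (one 32-bit word each). */
#define INST_ISYNC  0x4c00012cu
#define INST_LWSYNC 0x7c2004acu

/* Hypothetical feature bit; the kernel's CPU_FTR_LWSYNC has its own value. */
#define FTR_LWSYNC (1ul << 0)

/*
 * The code is built with isync at every acquire-barrier site. At boot, if
 * the CPU advertises a real lwsync, overwrite each recorded site with
 * lwsync. CPUs without the feature bit keep the isync they were built with.
 *
 * text:   instruction words of the (modelled) kernel text
 * sites:  indices into text where an acquire barrier lives
 * Returns the number of sites rewritten.
 */
static size_t patch_acquire_barriers(unsigned long cpu_features,
                                     uint32_t *text,
                                     const size_t *sites, size_t nsites)
{
    size_t patched = 0;

    assert(text != NULL);

    if (!(cpu_features & FTR_LWSYNC))
        return 0;

    for (size_t i = 0; i < nsites; i++) {
        text[sites[i]] = INST_LWSYNC;
        patched++;
    }
    return patched;
}
```

Under this model, dropping the feature bit for POWER3 and RS64 simply means the
early return is taken on those CPUs, so their acquire barriers stay as isync.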
    
    Signed-off-by: Anton Blanchard <anton@samba.org>
    Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>