1. 08 Mar, 2012 1 commit
  2. 05 Mar, 2012 1 commit
    • x86/numa: Improve internode cache alignment · 901b0445
      Alex Shi authored
      Currently cache alignment among nodes in the kernel is still 128
      bytes on x86 NUMA machines - we got that X86_INTERNODE_CACHE_SHIFT
      default from old P4 processors.
      But now most modern x86 CPUs use the same size: 64 bytes from L1 to
      last level L3. So let's remove the incorrect setting, and directly
      use the L1 cache size to do SMP cache line alignment.
      This patch saves some memory space on kernel data, and it also
      improves the cache locality of kernel data.
      The System.map is quite different with/without this change:
      	before patch			after patch
        000000000000b000 d tlb_vector_|  000000000000b000 d tlb_vector
        000000000000b080 d cpu_loops_p|  000000000000b040 d cpu_loops_
      Signed-off-by: Alex Shi <alex.shi@intel.com>
      Cc: asit.k.mallick@intel.com
      Link: http://lkml.kernel.org/r/1330774047-18597-1-git-send-email-alex.shi@intel.com
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
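      The System.map difference in this commit comes down to alignment
      arithmetic, which can be sketched as below. This is not kernel code:
      align_up() and next_object_offset() are hypothetical helpers, and the
      8-byte object size in the usage note is an assumption chosen only for
      illustration.

```c
#include <stdint.h>

/* Round addr up to the next multiple of (1 << shift). */
static inline uint64_t align_up(uint64_t addr, unsigned int shift)
{
    uint64_t mask = ((uint64_t)1 << shift) - 1;
    return (addr + mask) & ~mask;
}

/*
 * Hypothetical helper: offset at which the next aligned object starts,
 * given an object of 'size' bytes placed at 'start'.
 */
static inline uint64_t next_object_offset(uint64_t start, uint64_t size,
                                          unsigned int shift)
{
    return align_up(start + size, shift);
}
```

      Under these assumptions, next_object_offset(0xb000, 8, 7) yields 0xb080
      (128-byte alignment) and next_object_offset(0xb000, 8, 6) yields 0xb040
      (64-byte alignment), which matches the second row of the System.map
      excerpt.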
  3. 13 Jan, 2012 2 commits
  4. 25 Jun, 2011 1 commit
    • x86: Add support for cmpxchg_double · 3824abd1
      Christoph Lameter authored
      A simple implementation that only supports the word size and does not
      have a fallback mode (would require a spinlock).
      Add 32 and 64 bit support for cmpxchg_double. cmpxchg double uses
      the cmpxchg8b or cmpxchg16b instruction on x86 processors to compare
      and swap 2 machine words. This allows lockless algorithms to move more
      context information through critical sections.
      Set a flag CONFIG_CMPXCHG_DOUBLE to signal that support for double word
      cmpxchg detection has been built into the kernel. Note that each subsystem
      using cmpxchg_double has to implement a fall back mechanism as long as
      we offer support for processors that do not implement cmpxchg_double.
      Reviewed-by: H. Peter Anvin <hpa@zytor.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Link: http://lkml.kernel.org/r/20110601172614.173427964@linux.com
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
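      The idea of comparing and swapping two machine words in one atomic step
      can be sketched in portable C11. This is not the kernel's
      cmpxchg_double implementation: it is a user-space analogy that packs two
      32-bit words into one 64-bit atomic (the 32-bit cmpxchg8b case), and
      pack() and cmpxchg_double_sketch() are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Pack two 32-bit words into one 64-bit value (hypothetical helper). */
static inline uint64_t pack(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | lo;
}

/*
 * Atomically replace both words only if both still hold the expected
 * values. Returns true on success; on failure the slot is left unchanged.
 */
static inline bool cmpxchg_double_sketch(_Atomic uint64_t *slot,
                                         uint32_t old_lo, uint32_t old_hi,
                                         uint32_t new_lo, uint32_t new_hi)
{
    uint64_t expected = pack(old_lo, old_hi);
    return atomic_compare_exchange_strong(slot, &expected,
                                          pack(new_lo, new_hi));
}
```

      A caller would typically keep a pointer in one word and a counter or tag
      in the other, so that a stale pointer is rejected when the counter has
      moved on. As the commit message notes, real users still need a fallback
      for processors without the instruction.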
  5. 08 Apr, 2011 1 commit
  6. 18 Mar, 2011 1 commit
  7. 17 Mar, 2011 1 commit
  8. 09 Mar, 2011 1 commit
  9. 21 Jan, 2011 1 commit
    • kconfig: rename CONFIG_EMBEDDED to CONFIG_EXPERT · 6a108a14
      David Rientjes authored
      The meaning of CONFIG_EMBEDDED has long since been obsoleted; the option
      is used to configure any non-standard kernel with a much larger scope than
      only small devices.
      This patch renames the option to CONFIG_EXPERT in init/Kconfig and fixes
      references to the option throughout the kernel.  A new CONFIG_EMBEDDED
      option is added that automatically selects CONFIG_EXPERT when enabled and
      can be used in the future to isolate options that should only be
      considered for embedded systems (RISC architectures, SLOB, etc).
      Calling the option "EXPERT" more accurately represents its intention: only
      expert users who understand the impact of the configuration changes they
      are making should enable it.
      Reviewed-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: David Woodhouse <david.woodhouse@intel.com>
      Signed-off-by: David Rientjes <rientjes@google.com>
      Cc: Greg KH <gregkh@suse.de>
      Cc: "David S. Miller" <davem@davemloft.net>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Arnd Bergmann <arnd@arndb.de>
      Cc: Robin Holt <holt@sgi.com>
      Cc: <linux-arch@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
  10. 18 Dec, 2010 1 commit
    • x86: this_cpu_cmpxchg and this_cpu_xchg operations · 7296e08a
      Christoph Lameter authored
      Provide support as far as the hardware capabilities of the x86 cpus allow.
      Define CONFIG_CMPXCHG_LOCAL in Kconfig.cpu to allow core code to test for
      fast cpuops implementations.
      	- Take out the definition for this_cpu_cmpxchg_8 and move it into
      	  a separate patch.
      tj: - Reordered ops to better follow this_cpu_* organization.
          - Renamed macro temp variables similar to their existing
      Signed-off-by: Christoph Lameter <cl@linux.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  11. 03 May, 2010 1 commit
    • x86-32: Rework cache flush denied handler · 40d2e763
      Brian Gerst authored
      The cache flush denied error is an erratum on some AMD 486 clones.  If an invd
      instruction is executed in userspace, the processor calls exception 19 (13 hex)
      instead of #GP (13 decimal).  On cpus where XMM is not supported, redirect
      exception 19 to do_general_protection().  Also, remove die_if_kernel(), since
      this was the last user.
      Signed-off-by: Brian Gerst <brgerst@gmail.com>
      LKML-Reference: <1269176446-2489-2-git-send-email-brgerst@gmail.com>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  12. 26 Mar, 2010 1 commit
    • x86, perf, bts, mm: Delete the never used BTS-ptrace code · faa4602e
      Peter Zijlstra authored
      Support for the PMU's BTS features has been upstreamed in
      v2.6.32, but we still have the old and disabled ptrace-BTS,
      as Linus noticed it not so long ago.
      It's buggy: TIF_DEBUGCTLMSR is trampling all over that MSR without
      regard for other uses (perf) and doesn't provide the flexibility
      needed for perf either.
      Its users are ptrace-block-step and ptrace-bts, since ptrace-bts
      was never used and ptrace-block-step can be implemented using a
      much simpler approach.
      So axe all 3000 lines of it. That includes the *locked_memory*()
      APIs in mm/mlock.c as well.
      Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Roland McGrath <roland@redhat.com>
      Cc: Oleg Nesterov <oleg@redhat.com>
      Cc: Markus Metzger <markus.t.metzger@intel.com>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      Cc: Andrew Morton <akpm@linux-foundation.org>
      LKML-Reference: <20100325135413.938004390@chello.nl>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  13. 14 Jan, 2010 1 commit
    • x86-64: support native xadd rwsem implementation · bafaecd1
      Linus Torvalds authored
      This one is much faster than the spinlock based fallback rwsem code,
      with certain artificial benchmarks having shown 300%+ improvement on
      threaded page faults etc.
      Again, note the 32767-thread limit here. So this really does need that
      whole "make rwsem_count_t be 64-bit and fix the BIAS values to match"
      extension on top of it, but that is conceptually a totally independent
      NOT TESTED! The original patch that this all was based on were tested by
      KAMEZAWA Hiroyuki, but maybe I screwed up something when I created the
      cleaned-up series, so caveat emptor..
      Also note that it _may_ be a good idea to mark some more registers
      clobbered on x86-64 in the inline asms instead of saving/restoring them.
      They are inline functions, but they are only used in places where there
      are not a lot of live registers _anyway_, so doing for example the
      clobbers of %r8-%r11 in the asm wouldn't make the fast-path code any
      worse, and would make the slow-path code smaller.
      (Not that the slow-path really matters to that degree. Saving a few
      unnecessary registers is the _least_ of our problems when we hit the slow
      path. The instruction/cycle counting really only matters in the fast
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      LKML-Reference: <alpine.LFD.2.00.1001121810410.17145@localhost.localdomain>
      Signed-off-by: H. Peter Anvin <hpa@zytor.com>
  14. 06 Jan, 2010 1 commit
  15. 05 Jan, 2010 1 commit
  16. 19 Nov, 2009 1 commit
    • x86: Eliminate redundant/contradicting cache line size config options · 350f8f56
      Jan Beulich authored
      Rather than having X86_L1_CACHE_BYTES and X86_L1_CACHE_SHIFT
      (with inconsistent defaults), just having the latter suffices as
      the former can be easily calculated from it.
      To be consistent, also change X86_INTERNODE_CACHE_BYTES to
      X86_INTERNODE_CACHE_SHIFT, and set it to 7 (128 bytes) for NUMA
      to account for last level cache line size (which here matters
      more than L1 cache line size).
      Finally, make sure the default value for X86_L1_CACHE_SHIFT,
      when X86_GENERIC is selected, is being seen before that for the
      individual CPU model options (other than on x86-64, where
      GENERIC_CPU is part of the choice construct, X86_GENERIC is a
      separate option on ix86).
      Signed-off-by: Jan Beulich <jbeulich@novell.com>
      Acked-by: Ravikiran Thirumalai <kiran@scalex86.org>
      Acked-by: Nick Piggin <npiggin@suse.de>
      LKML-Reference: <4AFD5710020000780001F8F0@vpn.id2.novell.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
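      The reason a SHIFT option alone suffices is that the byte count is a
      power of two derived from it. A minimal sketch, with a hypothetical
      helper name that does not appear in the commit:

```c
/* Cache line size in bytes for a given shift (hypothetical helper). */
static inline unsigned int cache_bytes_from_shift(unsigned int shift)
{
    return 1u << shift;
}
```

      A shift of 6 gives 64 bytes, and the shift of 7 chosen here for NUMA
      gives 128 bytes.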
  17. 26 Oct, 2009 1 commit
  18. 02 Oct, 2009 1 commit
  19. 01 Oct, 2009 1 commit
    • x86: Optimize cmpxchg64() at build-time some more · 982d007a
      Linus Torvalds authored
      Try to avoid the 'alternates()' code when we can statically
      determine that cmpxchg8b is fine. We already have that
      CONFIG_X86_CMPXCHG64 (enabled by PAE support), and we could easily
      also enable it for some of the CPU cases.
      Note, this patch only adds CMPXCHG8B for the obvious Intel CPU's,
      not for others. (There was something really messy about cmpxchg8b
      and clone CPU's, so if you enable it on other CPUs later, do it
      If we avoid that asm-alternative thing when we can assume the
      instruction exists, we'll generate less support crud, and we'll
      avoid the whole issue with that extra 'nop' for padding instruction
      sizes etc.
      LKML-Reference: <alpine.LFD.2.01.0909301743150.6996@localhost.localdomain>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  20. 23 Aug, 2009 1 commit
  21. 10 Jun, 2009 1 commit
  22. 24 Apr, 2009 1 commit
  23. 15 Apr, 2009 1 commit
    • x86: disable X86_PTRACE_BTS for now · d45b41ae
      Ingo Molnar authored
      Oleg Nesterov found a couple of races in the ptrace-bts code
      and fixes are queued up for it but they did not get ready in time
      for the merge window. We'll merge them in v2.6.31 - until then
      mark the feature as CONFIG_BROKEN. There's no user-space yet
      making use of this so it's not a big issue.
      Cc: <stable@kernel.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  24. 14 Mar, 2009 1 commit
  25. 05 Feb, 2009 1 commit
  26. 04 Feb, 2009 1 commit
  27. 21 Jan, 2009 1 commit
    • x86: make x86_32 use tlb_64.c, build fix, clean up X86_L1_CACHE_BYTES · ace6c6c8
      Ingo Molnar authored
        arch/x86/mm/tlb.c:47: error: ‘CONFIG_X86_INTERNODE_CACHE_BYTES’ undeclared here (not in a function)
      The CONFIG_X86_INTERNODE_CACHE_BYTES symbol is only defined on 64-bit,
      because vsmp support is 64-bit only. Define it on 32-bit too - where it
      will always be equal to X86_L1_CACHE_BYTES.
      Also move the default of X86_L1_CACHE_BYTES (which is separate from the
      more commonly used L1_CACHE_SHIFT kconfig symbol) from 128 bytes to
      64 bytes.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  28. 14 Jan, 2009 1 commit
    • x86: change the default cache size to 64 bytes · 0a2a18b7
      Ingo Molnar authored
      Right now the generic cacheline size is 128 bytes - that is wasteful
      when structures are aligned, as all modern x86 CPUs have an (effective)
      cacheline size of 64 bytes.
      It was set to 128 bytes due to some cacheline aliasing problems on
      older P4 systems, but those are many years old and we don't optimize
      for them anymore. (They'll still get the 128 bytes cacheline size if
      the kernel is specifically built for Pentium 4)
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Acked-by: Arjan van de Ven <arjan@linux.intel.com>
  29. 06 Jan, 2009 1 commit
  30. 24 Dec, 2008 2 commits
    • Revert "x86: disable X86_PTRACE_BTS" · 67be403d
      Ingo Molnar authored
      This reverts commit 40f15ad8.
      The CONFIG_X86_PTRACE_BTS bugs have been fixed via:
       c5dee617: x86, bts: memory accounting
       bf53de90: x86, bts: add fork and exit handling
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
    • x86: disable X86_PTRACE_BTS · 40f15ad8
      Ingo Molnar authored
      there's a new ptrace arch level feature in .28:
        config X86_PTRACE_BTS
        bool "Branch Trace Store"
      it has broken fork() handling: the old DS area gets copied over into
      a new task without clearing it.
      Fixes exist but they came too late:
        c5dee617: x86, bts: memory accounting
        bf53de90: x86, bts: add fork and exit handling
      and are queued up for v2.6.29. This shows that the facility is still not
      tested well enough to release into a stable kernel - disable it for now and
      reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
      facilities too - hopefully resulting in better code.
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  31. 25 Nov, 2008 1 commit
  32. 28 Oct, 2008 1 commit
  33. 13 Oct, 2008 2 commits
  34. 12 Oct, 2008 2 commits
  35. 10 Sep, 2008 1 commit
  36. 08 Sep, 2008 1 commit
    • x86: disable static NOPLs on 32 bits · 14469a8d
      Linus Torvalds authored
      On 32-bit, at least the generic nops are fairly reasonable, but the
      default nops for 64-bit really look pretty sad, and the P6 nops really do
      look better.
      So I would suggest perhaps moving the static P6 nop selection into the
      CONFIG_X86_64 thing.
      The alternative is to just get rid of that static nop selection, and just
      have two cases: 32-bit and 64-bit, and just pick obviously safe cases for
      Signed-off-by: default avatarH. Peter Anvin <hpa@zytor.com>