1. 12 Jan, 2006 9 commits
  2. 06 Jan, 2006 2 commits
  3. 29 Dec, 2005 1 commit
  4. 15 Dec, 2005 1 commit
  5. 13 Dec, 2005 2 commits
  6. 15 Nov, 2005 11 commits
    • [PATCH] x86_64: Fix sparse mem · d3ee871e
      Bob Picco authored
      
      
      Fix up booting with sparse mem enabled. Otherwise it would just
      cause an early PANIC at boot.
      Signed-off-by: Bob Picco <bob.picco@hp.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Remove CONFIG_CHECKING and add command line option for pagefault tracing · 9e43e1b7
      Andi Kleen authored
      
      
      CONFIG_CHECKING covered some debugging code used in the early times
      of the port. But it wasn't even SMP safe for quite some time
      and the bugs it checked for seem to be gone.
      
      This patch removes all the code to verify GS at kernel entry. There
      haven't been any new bugs in this area for a long time.
      
      Previously it also covered the sysctl for the page fault tracing.
      That didn't make much sense because that code was unconditionally
      compiled in. I made that a boot option now because it is typically
      only useful at boot.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Make node boundaries consistent · ffd10a2b
      Magnus Damm authored
      
      
      The current x86_64 NUMA memory code is inconsistent when it comes to node
      memory ranges. The exact behaviour varies depending on which config option
      is used.
      
      setup_node_bootmem() has start and end as arguments and these are used to
      calculate the size of the node like this: (end - start). This is all fine
      if end is pointing to the first non-available byte. The problem is that the
      current x86_64 code sometimes treats it as the last present byte and sometimes
      as the first non-available byte. The result is that some configurations might
      lose a page at the end of the range.
      
      This patch tries to fix CONFIG_ACPI_NUMA, CONFIG_K8_NUMA and CONFIG_NUMA_EMU
      so they all treat the end variable as the first non-available byte. This is
      the same way as the single node code.
      
      The patch is boot tested on dual x86_64 hardware with the above configurations,
      but maybe the removed code is needed as some workaround?
      Signed-off-by: Magnus Damm <magnus@valinux.co.jp>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
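The off-by-one described in the node boundaries commit can be illustrated with a small sketch. This is not code from the patch: the helper names and the 4 KiB page size are assumptions made for illustration. It shows why (end - start) only yields the full page count when end is the first non-available byte.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT_SKETCH 12 /* assumption: 4 KiB pages */

/* Hypothetical helper: whole pages in [start, end), where `end` is the
 * first non-available byte (exclusive end). */
static uint64_t node_pages_exclusive(uint64_t start, uint64_t end)
{
    assert(end >= start);
    return (end - start) >> PAGE_SHIFT_SKETCH;
}

/* Hypothetical helper: the same (end - start) arithmetic, but fed the last
 * present byte instead. The difference is one byte short of a page multiple,
 * so the shift drops the final page. */
static uint64_t node_pages_if_end_is_last_byte(uint64_t start, uint64_t last)
{
    assert(last >= start);
    return (last - start) >> PAGE_SHIFT_SKETCH;
}
```

For an 8 GiB node starting at 0, the exclusive form gives 0x200000 pages, while passing the last present byte (0x1ffffffff) gives 0x1fffff, which is the lost page the commit message mentions.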
    • [PATCH] x86_64: Optimize NUMA node hash function · 529a3404
      Eric Dumazet authored
      
      
      Compute the highest possible value for memnode_shift, in order to reduce
      footprint of memnodemap[] to the minimum, thus making all users
      (phys_to_nid(), kfree()), more cache friendly.
      
      Before the patch :
      
       Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
       Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
       Using 23 for the hash shift. Max adder is 3ffffffff
      
      After the patch :
      
       Node 0 MemBase 0000000000000000 Limit 00000001ffffffff
       Node 1 MemBase 0000000200000000 Limit 00000003ffffffff
       Using 33 for the hash shift.
      
      In this case, only 2 bytes of memnodemap[] are used, instead of 2048.
      Signed-off-by: Eric Dumazet <dada1@cosmosbay.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
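The numbers in the hash shift commit can be reproduced with a short sketch. It is not claimed to be how the patch itself computes memnode_shift. It uses one simple approach, taking the lowest set bit across all node boundaries, which is the largest shift at which every boundary is aligned to a slot. The struct and function names are hypothetical, and node ends are treated as exclusive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical node description; `end` is exclusive. */
struct node_range {
    uint64_t start;
    uint64_t end;
};

/* Largest shift such that every node boundary is a multiple of
 * (1 << shift), so no hash slot straddles two nodes. */
static unsigned max_hash_shift(const struct node_range *nodes, size_t n)
{
    uint64_t bits = 0;
    unsigned shift = 0;
    size_t i;

    for (i = 0; i < n; i++)
        bits |= nodes[i].start | nodes[i].end;
    assert(bits != 0); /* at least one non-zero boundary is required */

    while (!(bits & 1)) { /* index of the lowest set bit */
        bits >>= 1;
        shift++;
    }
    return shift;
}

/* Number of one-byte map entries needed to cover addresses up to max_addr. */
static uint64_t memnodemap_entries(uint64_t max_addr, unsigned shift)
{
    return (max_addr >> shift) + 1;
}
```

With the two nodes from the log above, the sketch yields a shift of 33 and 2 map entries, against 2048 entries for a shift of 23.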
    • [PATCH] x86_64: Replace swiotlb extern with include · 59170891
      Andi Kleen authored
      
      
      Minor victory on the continuous quest against all stray extern.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Replace cpu_pda extern with include · 4d74dbd7
      Andi Kleen authored
      
      
      Minor cleanup - remove obsolete extern
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Only use asm/sections.h to declare section symbols · 2bc0414e
      Andi Kleen authored
      
      
      Add __initdata_* to asm-generic/sections.h.
      This replaces a lot of open-coded externs in arch/x86_64/*.
      I had to change __bss_end to __bss_stop to match the other architectures.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Unmap NULL during early bootup · f6c2e333
      Siddha, Suresh B authored
      
      
      We should zap the low mappings, as soon as possible, so that we can catch
      kernel bugs more effectively. Previously early boot had NULL mapped
      and didn't trap on NULL references.
      
      This patch introduces boot_level4_pgt, which will always have low identity
      addresses mapped.  During boot, all the processors will use this as their
      level4 pgt.  On BP, we will switch to init_level4_pgt as soon as we enter C
      code and zap the low mappings as soon as we are done with the usage of
      identity low mapped addresses.  On APs we will zap the low mappings as
      soon as we jump to C code.
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ashok Raj <ashok.raj@intel.com>
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: Speed up numa_node_id by putting it directly into the PDA · 69d81fcd
      Andi Kleen authored
      
      
      Do not go from the CPU number through a mapping array.
      The node number is often used now in fast paths.
      
      This also adds a generic numa_node_id to all the topology includes
      
      Suggested by Eric Dumazet
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
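The difference between the two lookups in the numa_node_id commit can be sketched in plain C. The struct layout, field names, and table contents below are hypothetical stand-ins, not the kernel's actual PDA definition. The point is only that the direct form needs one load from per-CPU data instead of a load of the CPU number followed by an array lookup.

```c
#include <assert.h>

#define NR_CPUS_SKETCH 4 /* assumption for this sketch */

/* Hypothetical stand-in for the per-processor data area. */
struct pda_sketch {
    int cpunumber;
    int nodenumber; /* cached node id, filled in when the CPU is set up */
};

/* Hypothetical CPU-to-node table: CPUs 0-1 on node 0, CPUs 2-3 on node 1. */
static const unsigned char cpu_to_node_sketch[NR_CPUS_SKETCH] = { 0, 0, 1, 1 };

/* Old style: read the CPU number, then index a mapping array. */
static int node_id_via_array(const struct pda_sketch *pda)
{
    assert(pda->cpunumber >= 0 && pda->cpunumber < NR_CPUS_SKETCH);
    return cpu_to_node_sketch[pda->cpunumber];
}

/* New style: the node number is stored in the per-CPU data itself. */
static int node_id_via_pda(const struct pda_sketch *pda)
{
    return pda->nodenumber;
}
```

Both functions return the same value as long as the cached field is kept in sync with the table, which is the invariant the direct lookup relies on.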
    • [PATCH] x86_64: Account mem_map in VM holes accounting · e18c6874
      Andi Kleen authored
      
      
      The VM needs to know about lost memory in zones to accurately
      balance dirty pages. This patch accounts mem_map in there too,
      which fixes a constant error of a few percent. Also some
      other misc mappings and the kernel text itself are accounted
      too.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
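A rough sketch of the arithmetic behind the mem_map accounting follows. The helper names, the 4 KiB page size, and the 56-byte per-page descriptor used in the usage note are all assumptions for illustration; the commit message does not give these figures.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_SKETCH 4096ULL /* assumption: 4 KiB pages */

/* Pages consumed by a mem_map array holding one descriptor per page. */
static uint64_t mem_map_pages(uint64_t zone_pages, uint64_t desc_bytes)
{
    uint64_t bytes = zone_pages * desc_bytes;

    return (bytes + PAGE_SIZE_SKETCH - 1) / PAGE_SIZE_SKETCH; /* round up */
}

/* Pages left for the VM once mem_map is counted as a hole. */
static uint64_t usable_pages(uint64_t zone_pages, uint64_t desc_bytes)
{
    uint64_t lost = mem_map_pages(zone_pages, desc_bytes);

    assert(lost <= zone_pages);
    return zone_pages - lost;
}
```

Under these assumed numbers, a 4 GiB zone (1048576 pages) loses 14336 pages to mem_map, about 1.4% of the zone. An error of that size stays constant unless it is subtracted from the zone's usable total.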
    • [PATCH] x86_64: Add 4GB DMA32 zone · a2f1b424
      Andi Kleen authored
      
      
      Add a new 4GB GFP_DMA32 zone between the GFP_DMA and GFP_NORMAL zones.
      
      As a bit of historical background: when the x86-64 port
      was originally designed we had some discussion if we should
      use a 16MB DMA zone like i386 or a 4GB DMA zone like IA64 or
      both. Both were ruled out at that point because it was early
      2.4, when the VM was still quite shaky and had trouble even
      dealing with one DMA zone.  We settled on the 16MB DMA zone mainly
      because we worried about older soundcards and the floppy.
      
      But this has always caused problems since then because
      device drivers had trouble getting enough DMA-able memory. These days
      the VM works much better and the wide use of NUMA has proven
      it can deal with many zones successfully.
      
      So this patch adds both zones.
      
      This helps drivers that need a lot of memory below 4GB because
      their hardware cannot address more (graphic drivers - proprietary
      and free ones, video frame buffer drivers, sound drivers etc.).
      Previously they could only use IOMMU+16MB GFP_DMA, which
      was not enough memory.
      
      Another common problem is that hardware that has full memory
      addressing for >4GB lacks it for some control structures in memory
      (like transmit rings or other metadata).  Such drivers tended to allocate memory
      in the 16MB GFP_DMA or the IOMMU/swiotlb using pci_alloc_consistent,
      but that can tie up a lot of precious 16MB GFP_DMA/IOMMU/swiotlb memory
      (even on AMD systems the IOMMU tends to be quite small) especially if you have
      many devices.  With the new zone pci_alloc_consistent can just put
      this stuff into memory below 4GB which works better.
      
      One argument was still if the zone should be 4GB or 2GB. The main
      motivation for 2GB would be an unnamed not so unpopular hardware
      raid controller (mostly found in older machines from a particular four letter
      company) that has a strange 2GB restriction in firmware. But
      that one works ok with swiotlb/IOMMU anyways, so it doesn't really
      need GFP_DMA32. I chose 4GB to be compatible with IA64 and because
      it seems to be the most common restriction.
      
      The new zone is so far added only for x86-64.
      
      For other architectures that don't set up this
      new zone nothing changes. Architectures can set a compatibility
      define in Kconfig, CONFIG_DMA_IS_DMA32, that will define GFP_DMA32
      as GFP_DMA. Otherwise it's a nop, because on 32bit architectures
      it's normally not needed: GFP_NORMAL (=0) is DMA-able
      enough.
      
      One problem is still that GFP_DMA means different things on different
      architectures. e.g. some drivers used to have #ifdef ia64  use GFP_DMA
      (trusting it to be 4GB) #elif __x86_64__ (use other hacks like
      the swiotlb because 16MB is not enough) ... . This was quite
      ugly and is now obsolete.
      
      These should now be converted to use GFP_DMA32 unconditionally. I haven't done
      this yet. Or, better, use only pci_alloc_consistent/dma_alloc_coherent,
      which will use GFP_DMA32 transparently.
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
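The zone layout described in the DMA32 commit (16MB DMA zone, then a zone reaching up to 4GB, then everything else) can be sketched as a simple classification by physical address. The enum, the function names, and the mask-based helper are hypothetical illustrations, not kernel interfaces.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical zone identifiers mirroring the layout in the commit text. */
enum zone_sketch { ZONE_SKETCH_DMA, ZONE_SKETCH_DMA32, ZONE_SKETCH_NORMAL };

#define DMA_LIMIT_SKETCH   (16ULL << 20) /* 16MB */
#define DMA32_LIMIT_SKETCH (4ULL << 30)  /* 4GB */

/* Which zone a physical address falls into. */
static enum zone_sketch zone_for_phys(uint64_t phys)
{
    if (phys < DMA_LIMIT_SKETCH)
        return ZONE_SKETCH_DMA;
    if (phys < DMA32_LIMIT_SKETCH)
        return ZONE_SKETCH_DMA32;
    return ZONE_SKETCH_NORMAL;
}

/* Highest zone whose every address a device with this DMA mask can reach.
 * Simplification: only a full 64-bit mask is treated as covering the
 * normal zone. */
static enum zone_sketch highest_zone_for_mask(uint64_t dma_mask)
{
    assert(dma_mask >= DMA_LIMIT_SKETCH - 1); /* must cover the 16MB zone */
    if (dma_mask == UINT64_MAX)
        return ZONE_SKETCH_NORMAL;
    if (dma_mask >= DMA32_LIMIT_SKETCH - 1)
        return ZONE_SKETCH_DMA32;
    return ZONE_SKETCH_DMA;
}
```

In this sketch a device with a 32-bit mask can draw from the whole region below 4GB rather than only the 16MB zone, while a device limited to 2GB still falls back to the 16MB zone, which matches the commit's remark that such hardware does not benefit from GFP_DMA32.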
  7. 30 Oct, 2005 1 commit
    • [PATCH] mm: init_mm without ptlock · 872fec16
      Hugh Dickins authored
      
      
      First step in pushing down the page_table_lock.  init_mm.page_table_lock has
      been used throughout the architectures (usually for ioremap): not to serialize
      kernel address space allocation (that's usually vmlist_lock), but because
      pud_alloc,pmd_alloc,pte_alloc_kernel expect the caller to hold it.
      
      Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
      architectures; instead rely on pud_alloc,pmd_alloc,pte_alloc_kernel to take
      and drop it when allocating a new one, to check lest a racing task already
      did.  Similarly no page_table_lock in vmalloc's map_vm_area.
      
      Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
      user mms, which are converted only by a later patch, for now they have to lock
      differently according to whether or not it's init_mm.
      
      If sources get muddled, there's a danger that an arch source taking
      init_mm.page_table_lock will be mixed with common source also taking it (or
      neither take it).  So break the rules and make another change, which should
      break the build for such a mismatch: remove the redundant mm arg from
      pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).
      
      Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
      used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
      pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
      map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
      took page_table_lock for no good reason.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
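The locking pattern described in the init_mm commit (the allocator takes and drops the lock itself, and rechecks in case a racing task already populated the entry) can be sketched in userspace C11. Everything here is a stand-in: the struct, the function names, the single-entry "table", and the spinlock built on atomic_flag are illustrative assumptions, not the kernel's types or its page_table_lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for an mm with one upper-level table entry. */
struct mm_sketch {
    atomic_flag page_table_lock;
    void *_Atomic pmd; /* NULL means "not populated yet" */
};

static void ptl_lock(atomic_flag *l)
{
    while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
        ; /* spin until the lock is free */
}

static void ptl_unlock(atomic_flag *l)
{
    atomic_flag_clear_explicit(l, memory_order_release);
}

/* The caller holds no lock. Allocate outside the lock, then take it only
 * to install the new table, rechecking for a racing allocation first. */
static void *pmd_alloc_sketch(struct mm_sketch *mm)
{
    void *cur = atomic_load(&mm->pmd);
    void *new_table;

    if (cur)
        return cur; /* already populated, nothing to allocate */

    new_table = calloc(1, 4096);
    if (!new_table)
        return NULL;

    ptl_lock(&mm->page_table_lock);
    cur = atomic_load(&mm->pmd);
    if (cur) {
        free(new_table); /* a racing task already installed one */
    } else {
        atomic_store(&mm->pmd, new_table);
        cur = new_table;
    }
    ptl_unlock(&mm->page_table_lock);
    return cur;
}
```

Because the lock is taken inside the allocator, callers need no locking of their own, which is the reversal the commit describes.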
  8. 10 Oct, 2005 1 commit
  9. 30 Sep, 2005 2 commits
    • [PATCH] x86_64 early numa init fix · 85cc5135
      Ravikiran G Thirumalai authored
      
      
      The tests Alok carried out on Petr's box confirmed that cpu_to_node[BP] is
      not set up early enough by numa_init_array due to the x86_64 changes in
      2.6.14-rc*, and is unfortunately set wrongly by the workaround code in
      numa_init_array().  cpu_to_node[0] gets set to 1 early and later gets set
      properly to 0 during identify_cpu() when all cpus are brought up, but
      this confuses the numa slab in the process.
      
      Here is a quick fix for this.  The right fix obviously is to have
      cpu_to_node[bsp] set up early for numa_init_array().  The following patch
      will fix the problem now, and the code can stay on even when
      cpu_to_node[BP] gets fixed early correctly.
      
      Thanks to Petr for access to his box.
      
      Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Alok N Kataria <alokk@calsoftinc.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] x86_64: fix the BP node_to_cpumask · e6a045a5
      Ravikiran G Thirumalai authored
      
      
      Fix the BP node_to_cpumask.  2.6.14-rc* broke the boot cpu bit as the
      cpu_to_node(0) is now not set up early enough for numa_init_array.
      cpu_to_node[] is set up much later at srat_detect_node on acpi srat based
      em64t machines.  This seems like a problem on amd machines too; tested on
      em64t though.  /sys/devices/system/node/node0/cpumap shows up sanely after
      this patch.
      
      Signed off by: Ravikiran Thirumalai <kiran@scalex86.org>
      Signed-off-by: Shai Fultheim <shai@scalex86.org>
      Cc: Andi Kleen <ak@muc.de>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 12 Sep, 2005 10 commits