1. 28 Apr, 2008 2 commits
    • Yasunori Goto's avatar
      memory hotplug: make alloc_bootmem_section() · e70260aa
      Yasunori Goto authored
      
      
      alloc_bootmem_section() can allocate specified section's area.  This is used
      for usemap to keep same section with pgdat by later patch.
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      e70260aa
    • Yasunori Goto's avatar
      memory hotplug: register section/node id to free · 04753278
      Yasunori Goto authored
      
      
      This patch set is to free pages which is allocated by bootmem for
      memory-hotremove.  Some structures of memory management are allocated by
      bootmem.  ex) memmap, etc.
      
      To remove memory physically, some of them must be freed according to
      circumstance.  This patch set makes basis to free those pages, and free
      memmaps.
      
      Basic my idea is using remain members of struct page to remember information
      of users of bootmem (section number or node id).  When the section is
      removing, kernel can confirm it.  By this information, some issues can be
      solved.
      
        1) When the memmap of removing section is allocated on other
           section by bootmem, it should/can be free.
        2) When the memmap of removing section is allocated on the
           same section, it shouldn't be freed. Because the section has to be
           logical memory offlined already and all pages must be isolated against
           page allocater. If it is freed, page allocator may use it which will
           be removed physically soon.
        3) When removing section has other section's memmap,
           kernel will be able to show easily which section should be removed
           before it for user. (Not implemented yet)
        4) When the above case 2), the page isolation will be able to check and skip
           memmap's page when logical memory offline (offline_pages()).
           Current page isolation code fails in this case because this page is
           just reserved page and it can't distinguish this pages can be
           removed or not. But, it will be able to do by this patch.
           (Not implemented yet.)
        5) The node information like pgdat has similar issues. But, this
           will be able to be solved too by this.
           (Not implemented yet, but, remembering node id in the pages.)
      
      Fortunately, current bootmem allocator just keeps PageReserved flags,
      and doesn't use any other members of page struct. The users of
      bootmem doesn't use them too.
      
      This patch:
      
      This is to register information which is node or section's id.  Kernel can
      distinguish which node/section uses the pages allcated by bootmem.  This is
      basis for hot-remove sections or nodes.
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Badari Pulavarty <pbadari@us.ibm.com>
      Cc: Yinghai Lu <yhlu.kernel@gmail.com>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      04753278
  2. 26 Apr, 2008 3 commits
  3. 25 Mar, 2008 1 commit
    • Yinghai Lu's avatar
      mm: fix boundary checking in free_bootmem_core · 5a982cbc
      Yinghai Lu authored
      
      
      With numa enabled, some callers could have a range of memory on one node
      but try to free that on other node.  This can cause some pages to be
      freed wrongly.
      
      For example: when we try to allocate 128g boot ram early for
      gart/swiotlb, and free that range later so gart/swiotlb can get some
      range afterwards.
      
      With this patch, we don't need to care which node holds the range, just
      loop to call free_bootmem_node for all online nodes.
      
      This patch makes free_bootmem_core() more robust by trimming the sidx
      and eidx according the ram range that the node has.
      
      And make the free_bootmem_core handle this out of range case.  We could
      use bdata_list to make sure the range can be freed for sure.  So next
      time, we don't need to loop online nodes and could use free_bootmem
      directly.
      Signed-off-by: default avatarYinghai Lu <yhlu.kernel@gmail.com>
      Cc: Andi Kleen <ak@suse.de>
      Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Acked-by: default avatarIngo Molnar <mingo@elte.hu>
      Tested-by: default avatarIngo Molnar <mingo@elte.hu>
      Cc: Christoph Lameter <clameter@sgi.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      5a982cbc
  4. 07 Feb, 2008 1 commit
    • Bernhard Walle's avatar
      Introduce flags for reserve_bootmem() · 72a7fe39
      Bernhard Walle authored
      
      
      This patchset adds a flags variable to reserve_bootmem() and uses the
      BOOTMEM_EXCLUSIVE flag in crashkernel reservation code to detect collisions
      between crashkernel area and already used memory.
      
      This patch:
      
      Change the reserve_bootmem() function to accept a new flag BOOTMEM_EXCLUSIVE.
      If that flag is set, the function returns with -EBUSY if the memory already
      has been reserved in the past.  This is to avoid conflicts.
      
      Because that code runs before SMP initialisation, there's no race condition
      inside reserve_bootmem_core().
      
      [akpm@linux-foundation.org: coding-style fixes]
      [akpm@linux-foundation.org: fix powerpc build]
      Signed-off-by: default avatarBernhard Walle <bwalle@suse.de>
      Cc: <linux-arch@vger.kernel.org>
      Cc: "Eric W. Biederman" <ebiederm@xmission.com>
      Cc: Vivek Goyal <vgoyal@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
      72a7fe39
  5. 07 Dec, 2006 2 commits
  6. 26 Sep, 2006 6 commits
  7. 10 Jul, 2006 1 commit
  8. 09 Apr, 2006 1 commit
    • Andi Kleen's avatar
      [PATCH] x86_64: Handle empty PXMs that only contain hotplug memory · a8062231
      Andi Kleen authored
      
      
      The node setup code would try to allocate the node metadata in the node
      itself, but that fails if there is no memory in there.
      
      This can happen with memory hotplug when the hotplug area defines an so
      far empty node.
      
      Now use bootmem to try to allocate the mem_map in other nodes.
      
      And if it fails don't panic, but just ignore the node.
      
      To make this work I added a new __alloc_bootmem_nopanic function that
      does what its name implies.
      
      TBD should try to use nearby nodes here.  Currently we just use any.
      It's hard to do it better because bootmem doesn't have proper fallback
      lists yet.
      Signed-off-by: default avatarAndi Kleen <ak@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      a8062231
  9. 27 Mar, 2006 1 commit
  10. 25 Mar, 2006 1 commit
  11. 06 Jan, 2006 2 commits
  12. 12 Dec, 2005 1 commit
    • Haren Myneni's avatar
      [PATCH] fix in __alloc_bootmem_core() when there is no free page in first node's memory · 66d43e98
      Haren Myneni authored
      
      
      Hitting BUG_ON() in __alloc_bootmem_core() when there is no free page
      available in the first node's memory.  For the case of kdump on PPC64
      (Power 4 machine), the captured kernel is used two memory regions - memory
      for TCE tables (tce-base and tce-size at top of RAM and reserved) and
      captured kernel memory region (crashk_base and crashk_size).  Since we
      reserve the memory for the first node, we should be returning from
      __alloc_bootmem_core() to search for the next node (pg_dat).
      
      Currently, find_next_zero_bit() is returning the n^th bit (eidx) when there
      is no free page.  Then, test_bit() is failed since we set 0xff only for the
      actual size initially (init_bootmem_core()) even though rounded up to one
      page for bdata->node_bootmem_map.  We are hitting the BUG_ON after failing
      to enter second "for" loop.
      Signed-off-by: default avatarHaren Myneni <haren@us.ibm.com>
      Cc: Andy Whitcroft <apw@shadowen.org>
      Cc: Dave Hansen <haveblue@us.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      66d43e98
  13. 30 Oct, 2005 1 commit
    • Nick Piggin's avatar
      [PATCH] core remove PageReserved · b5810039
      Nick Piggin authored
      
      
      Remove PageReserved() calls from core code by tightening VM_RESERVED
      handling in mm/ to cover PageReserved functionality.
      
      PageReserved special casing is removed from get_page and put_page.
      
      All setting and clearing of PageReserved is retained, and it is now flagged
      in the page_alloc checks to help ensure we don't introduce any refcount
      based freeing of Reserved pages.
      
      MAP_PRIVATE, PROT_WRITE of VM_RESERVED regions is tentatively being
      deprecated.  We never completely handled it correctly anyway, and is be
      reintroduced in future if required (Hugh has a proof of concept).
      
      Once PageReserved() calls are removed from kernel/power/swsusp.c, and all
      arch/ and driver code, the Set and Clear calls, and the PG_reserved bit can
      be trivially removed.
      
      Last real user of PageReserved is swsusp, which uses PageReserved to
      determine whether a struct page points to valid memory or not.  This still
      needs to be addressed (a generic page_is_ram() should work).
      
      A last caveat: the ZERO_PAGE is now refcounted and managed with rmap (and
      thus mapcounted and count towards shared rss).  These writes to the struct
      page could cause excessive cacheline bouncing on big systems.  There are a
      number of ways this could be addressed if it is an issue.
      Signed-off-by: default avatarNick Piggin <npiggin@suse.de>
      
      Refcount bug fix for filemap_xip.c
      Signed-off-by: default avatarCarsten Otte <cotte@de.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      b5810039
  14. 20 Oct, 2005 1 commit
    • Yasunori Goto's avatar
      [PATCH] swiotlb: make sure initial DMA allocations really are in DMA memory · 281dd25c
      Yasunori Goto authored
      
      
      This introduces a limit parameter to the core bootmem allocator; The new
      parameter indicates that physical memory allocated by the bootmem
      allocator should be within the requested limit.
      
      We also introduce alloc_bootmem_low_pages_limit, alloc_bootmem_node_limit,
      alloc_bootmem_low_pages_node_limit apis, but alloc_bootmem_low_pages_limit
      is the only api used for swiotlb.
      
      The existing alloc_bootmem_low_pages() api could instead have been
      changed and made to pass right limit to the core allocator.  But that
      would make the patch more intrusive for 2.6.14, as other arches use
      alloc_bootmem_low_pages().  We may be done that post 2.6.14 as a
      cleanup.
      
      With this, swiotlb gets memory within 4G for both x86_64 and ia64
      arches.
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Cc: Ravikiran G Thirumalai <kiran@scalex86.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      281dd25c
  15. 30 Sep, 2005 1 commit
  16. 12 Sep, 2005 1 commit
  17. 25 Jun, 2005 2 commits
  18. 23 Jun, 2005 1 commit
    • Andy Whitcroft's avatar
      [PATCH] sparsemem memory model · d41dee36
      Andy Whitcroft authored
      
      
      Sparsemem abstracts the use of discontiguous mem_maps[].  This kind of
      mem_map[] is needed by discontiguous memory machines (like in the old
      CONFIG_DISCONTIGMEM case) as well as memory hotplug systems.  Sparsemem
      replaces DISCONTIGMEM when enabled, and it is hoped that it can eventually
      become a complete replacement.
      
      A significant advantage over DISCONTIGMEM is that it's completely separated
      from CONFIG_NUMA.  When producing this patch, it became apparent in that NUMA
      and DISCONTIG are often confused.
      
      Another advantage is that sparse doesn't require each NUMA node's ranges to be
      contiguous.  It can handle overlapping ranges between nodes with no problems,
      where DISCONTIGMEM currently throws away that memory.
      
      Sparsemem uses an array to provide different pfn_to_page() translations for
      each SECTION_SIZE area of physical memory.  This is what allows the mem_map[]
      to be chopped up.
      
      In order to do quick pfn_to_page() operations, the section number of the page
      is encoded in page->flags.  Part of the sparsemem infrastructure enables
      sharing of these bits more dynamically (at compile-time) between the
      page_zone() and sparsemem operations.  However, on 32-bit architectures, the
      number of bits is quite limited, and may require growing the size of the
      page->flags type in certain conditions.  Several things might force this to
      occur: a decrease in the SECTION_SIZE (if you want to hotplug smaller areas of
      memory), an increase in the physical address space, or an increase in the
      number of used page->flags.
      
      One thing to note is that, once sparsemem is present, the NUMA node
      information no longer needs to be stored in the page->flags.  It might provide
      speed increases on certain platforms and will be stored there if there is
      room.  But, if out of room, an alternate (theoretically slower) mechanism is
      used.
      
      This patch introduces CONFIG_FLATMEM.  It is used in almost all cases where
      there used to be an #ifndef DISCONTIG, because SPARSEMEM and DISCONTIGMEM
      often have to compile out the same areas of code.
      Signed-off-by: default avatarAndy Whitcroft <apw@shadowen.org>
      Signed-off-by: default avatarDave Hansen <haveblue@us.ibm.com>
      Signed-off-by: default avatarMartin Bligh <mbligh@aracnet.com>
      Signed-off-by: default avatarAdrian Bunk <bunk@stusta.de>
      Signed-off-by: default avatarYasunori Goto <y-goto@jp.fujitsu.com>
      Signed-off-by: default avatarBob Picco <bob.picco@hp.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
      d41dee36
  19. 16 Apr, 2005 1 commit
    • Linus Torvalds's avatar
      Linux-2.6.12-rc2 · 1da177e4
      Linus Torvalds authored
      Initial git repository build. I'm not bothering with the full history,
      even though we have it. We can create a separate "historical" git
      archive of that later if we want to, and in the meantime it's about
      3.2GB when imported into git - space that would just make the early
      git days unnecessarily complicated, when we don't have a lot of good
      infrastructure for it.
      
      Let it rip!
      1da177e4