1. 17 Apr, 2008 19 commits
  2. 11 Apr, 2008 2 commits
    • Add commentary about the new "asmlinkage_protect()" macro · d10d89ec
      Linus Torvalds authored
      
      
      It's really a pretty ugly thing to need, and some day it will hopefully
      be obviated by teaching gcc about the magic calling conventions for the
      low-level system call code, but in the meantime we can at least add big
      honking comments about why we need these insane and strange macros.
      
      I took my comments from my version of the macro, but I ended up deciding
      to just pick Roland's version of the actual code instead (with his
      prettier syntax that uses vararg macros).  Thus the previous two commits
      that actually implement it.
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
    • asmlinkage_protect replaces prevent_tail_call · 54a01510
      Roland McGrath authored
      
      
      The prevent_tail_call() macro works around the problem of the compiler
      clobbering argument words on the stack, which for asmlinkage functions
      is the caller's (user's) struct pt_regs.  The tail/sibling-call
      optimization is not the only way that the compiler can decide to use
      stack argument words as scratch space, which we have to prevent.
      Other optimizations can do it too.
      
      Until we have new compiler support to make "asmlinkage" binding on the
      compiler's own use of the stack argument frame, we have to work around
      all the manifestations of this issue that crop up.
      
      More cases seem to be prevented by also keeping the incoming argument
      variables live at the end of the function.  This makes their original
      stack slots attractive places to leave those variables, so the compiler
      tends not to clobber them for something else.  It's still no guarantee, but
      it handles some observed cases that prevent_tail_call() did not.
      Signed-off-by: Roland McGrath <roland@redhat.com>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
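One way to keep incoming arguments live at the end of a function in GNU C is an empty inline-asm statement that names them as inputs. The sketch below illustrates that idea with a vararg macro; the `sketch_*` names are hypothetical, and it assumes gcc or clang (named variadic macros and extended asm are GNU extensions). It is not presented as the exact macro from the commit.

```c
/* Sketch only: requires GNU C (gcc or clang). All names are hypothetical. */

/* Empty asm that takes ret as an in/out operand and every extra
 * argument as an input, so the compiler must keep them live here. */
#define sketch_protect_n(ret, args...) \
	__asm__ __volatile__ ("" : "=r" (ret) : "0" (ret), ##args)

#define sketch_protect1(ret, a1) \
	sketch_protect_n(ret, "g" (a1))
#define sketch_protect2(ret, a1, a2) \
	sketch_protect_n(ret, "g" (a1), "g" (a2))

/* n selects the helper matching the number of protected arguments. */
#define sketch_protect(n, ret, args...) \
	sketch_protect##n(ret, ##args)

/* Hypothetical stand-in for a system call body with two arguments. */
static long sketch_add(long a, long b)
{
	long ret = a + b;

	/* a and b are "used" here, at the very end of the function. */
	sketch_protect(2, ret, a, b);
	return ret;
}
```

As the commit message notes, this kind of trick only makes the original stack slots attractive places to leave the variables; it does not guarantee the compiler leaves them untouched.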
  3. 07 Apr, 2008 1 commit
    • x86: fix 64-bit asm NOPS for CONFIG_GENERIC_CPU · 871de939
      Suresh Siddha authored
      
      
      ASM_NOPs for 64-bit kernels with CONFIG_GENERIC_CPU have been broken
      since the recent x86 nops merge. They were using GENERIC_NOPS,
      which truncate the upper 32 bits of %rsi because of the missing
      64-bit REX prefix.
      
      For now, fall back to K8 NOPS for the generic CPU ASM NOPS, similar
      to the code before the faulty x86 nops merge.
      
      This should resolve the crash seen by Ingo on a test-system:
      
      BUG: unable to handle kernel paging request at 00000000d80d8ee8
      IP: [<ffffffff802121af>] save_i387_ia32+0x61/0xd8
      PGD b8e0067 PUD 51490067 PMD 0
      Oops: 0000 [1] SMP
      CPU 2
      Modules linked in:
      Pid: 3871, comm: distcc Not tainted 2.6.25-rc7-sched-devel.git-x86-latest.git #359
      RIP: 0010:[<ffffffff802121af>]  [<ffffffff802121af>] save_i387_ia32+0x61/0xd8
      RSP: 0000:ffff81003abd3cb8  EFLAGS: 00010246
      RAX: ffff810082e93400 RBX: 00000000ffc37f84 RCX: ffff8100d80d8ee0
      RDX: 0000000000000000 RSI: 00000000d80d8ee0 RDI: ffff810082e93400
      RBP: 00000000ffc37fdc R08: 00000000ffc37f88 R09: 0000000000000008
      R10: ffff81003abd2000 R11: 0000000000000000 R12: ffff810082e93400
      R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff81011fb12dc0(0063) knlGS:00000000f7f1a6c0
      CS:  0010 DS: 002b ES: 002b CR0: 0000000080050033
      CR2: 00000000d80d8ee8 CR3: 0000000076922000 CR4: 00000000000006e0
      DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
      Process distcc (pid: 3871, threadinfo ffff81003abd2000, task ffff8100d80d8ee0)
      Stack:  ffff8100bb670380 ffffffff8026de50 0000000000000118 0000000000000002
       0000000000000002 ffff81003abd3e68 ffff81003abd3ed8 ffff81003abd3de8
       ffff81003abd3d18 ffffffff80229785 ffff8100d80d8ee0 ffff810001041280
      Call Trace:
       [<ffffffff8026de50>] ? __generic_file_aio_write_nolock+0x343/0x377
       [<ffffffff80229785>] ? update_curr+0x54/0x64
       [<ffffffff80227cd3>] ? ia32_setup_sigcontext+0x125/0x1d2
       [<ffffffff8022839f>] ? ia32_setup_frame+0x73/0x1a5
       [<ffffffff8020b2a5>] ? do_notify_resume+0x1aa/0x7db
       [<ffffffff8024ae8c>] ? getnstimeofday+0x31/0x85
       [<ffffffff80249858>] ? ktime_get_ts+0x17/0x48
       [<ffffffff80249933>] ? ktime_get+0xc/0x41
       [<ffffffff8024973e>] ? hrtimer_nanosleep+0x75/0xd5
       [<ffffffff80249261>] ? hrtimer_wakeup+0x0/0x21
       [<ffffffff8020bfbc>] ? int_signal+0x12/0x17
       [<ffffffff8030e6b3>] ? dummy_file_free_security+0x0/0x1
      
      Code: a6 08 05 00 00 f6 40 14 01 74 34 4c 89 e7 48 0f ae 07 48 8b 86 08 05 00 00 80 78 02 00 79 02 db e2 90 8d b4 26 00 00 00 00 89 f6 <48> 8b 46 08 83 60 14 fe 0f 20 c0 48 83 c8 08 0f 22 c0 eb 07 c6 
      Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
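The truncation can be checked against the oops above: the task pointer is ffff8100d80d8ee0, RSI holds 00000000d80d8ee0, and the faulting address in CR2 is that value plus 8. On x86-64, an instruction that writes a 32-bit register without a REX.W prefix zero-extends the result into the full 64-bit register, so a 32-bit "NOP" such as `mov %esi,%esi` (which is how the `89 f6` bytes in the Code: line appear to decode) is not a NOP in 64-bit mode. The sketch below models that rule in plain C; the function names are hypothetical.

```c
#include <stdint.h>

/* Models an x86-64 instruction with a 32-bit destination and no REX.W
 * prefix: the 32-bit result is zero-extended into the 64-bit register. */
static uint64_t write_reg32(uint64_t old_reg, uint32_t value)
{
	(void)old_reg;           /* the old upper half is discarded */
	return (uint64_t)value;
}

/* Models "mov %esi,%esi": harmless on 32-bit, truncating on 64-bit. */
static uint64_t nop32_on_64bit(uint64_t rsi)
{
	return write_reg32(rsi, (uint32_t)rsi);
}
```

Feeding the task pointer from the oops through `nop32_on_64bit()` yields the RSI value shown in the register dump.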
  4. 04 Apr, 2008 1 commit
  5. 28 Mar, 2008 1 commit
  6. 27 Mar, 2008 1 commit
  7. 26 Mar, 2008 1 commit
  8. 24 Mar, 2008 1 commit
  9. 22 Mar, 2008 1 commit
    • x86: revert: reserve dma32 early for gart · 9e963048
      Thomas Gleixner authored
      Revert
      
      commit f62f1fc9
      Author: Yinghai Lu <yhlu.kernel@gmail.com>
      Date:   Fri Mar 7 15:02:50 2008 -0800
      
          x86: reserve dma32 early for gart
      
      The patch has a dependency on bootmem modifications which are not .25
      material that late in the -rc cycle. The problem which is addressed by
      the patch is limited to machines with 256G and more memory booted with
      NUMA disabled. This is not a .25 regression and the audience which is
      affected by this problem is very limited, so it's safer to do the
      revert than pulling in intrusive bootmem changes right now.
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
  10. 21 Mar, 2008 5 commits
    • sync_bitops: fix wrong comments [Bug 10247] · 7800c0c3
      Matti Linnanvuori authored
      
      
      Fix wrong function name and references to non-x86 architectures.
      
      Signed-off-by: Matti Linnanvuori <mattilinnanvuori@yahoo.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: trim mtrr don't close gap for resource allocation. · 5dca6a1b
      Yinghai Lu authored
      fix the bug reported here:
      
      	http://bugzilla.kernel.org/show_bug.cgi?id=10232
      
      
      
      use update_memory_range() instead of add_memory_range() directly
      to avoid closing the gap.
      
      ( the new code only affects and runs on systems where the MTRR
        workaround triggers. )
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
    • x86: reserve dma32 early for gart · f62f1fc9
      Yinghai Lu authored
      
      
      A system with 256 GB of RAM crashes the following way when NUMA is
      disabled:
      
      Your BIOS doesn't leave a aperture memory hole
      Please enable the IOMMU option in the BIOS setup
      This costs you 64 MB of RAM
      Cannot allocate aperture memory hole (ffff8101c0000000,65536K)
      Kernel panic - not syncing: Not enough memory for aperture
      Pid: 0, comm: swapper Not tainted 2.6.25-rc4-x86-latest.git #33
      
      Call Trace:
       [<ffffffff84037c62>] panic+0xb2/0x190
       [<ffffffff840381fc>] ? release_console_sem+0x7c/0x250
       [<ffffffff847b1628>] ? __alloc_bootmem_nopanic+0x48/0x90
       [<ffffffff847b0ac9>] ? free_bootmem+0x29/0x50
       [<ffffffff847ac1f7>] gart_iommu_hole_init+0x5e7/0x680
       [<ffffffff847b255b>] ? alloc_large_system_hash+0x16b/0x310
       [<ffffffff84506a2f>] ? _etext+0x0/0x1
       [<ffffffff847a2e8c>] pci_iommu_alloc+0x1c/0x40
       [<ffffffff847ac795>] mem_init+0x45/0x1a0
       [<ffffffff8479ff35>] start_kernel+0x295/0x380
       [<ffffffff8479f1c2>] _sinittext+0x1c2/0x230
      
      The root cause is that the memmap is too big:
      [ffffe200e0600000-ffffe200e07fffff] PMD ->ffff81383c000000 on node 0
      It comes to almost 4G, and vmemmap_alloc_block will use up the RAM under 4G.
      
      Possible solutions:
      1. Make the memmap allocation get memory above 4G.
      2. Reserve some dma32 range early, before we try to set up the memmap for
         all memory, and release it before pci_iommu_alloc, so that gart or
         swiotlb is sure to get some range under the 4G limit.
      
      This patch uses method 2, because method 1 may need more code to handle
      SPARSEMEM and SPARSEMEM_VMEMMAP.
      
      With the patch applied, the boot log shows:
      Your BIOS doesn't leave a aperture memory hole
      Please enable the IOMMU option in the BIOS setup
      This costs you 64 MB of RAM
      Mapping aperture over 65536 KB of RAM @ 4000000
      Memory: 264245736k/268959744k available (8484k kernel code, 4187464k reserved, 4004k data, 724k init)
      Signed-off-by: Yinghai Lu <yhlu.kernel@gmail.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
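The reserve-early, release-late approach (method 2 above) can be illustrated with a toy model. Everything in this sketch is hypothetical: the two-pool allocator, the pool sizes, and the 128 MB reservation are invented for illustration and do not reflect the bootmem API or the patch itself. Only the 64 MB aperture size is taken from the log above.

```c
#include <stdbool.h>

/* Toy model: free memory split into "below 4G" and "above 4G" pools.
 * All names and sizes are hypothetical; this is not the bootmem API. */
struct toy_mem {
	unsigned long long low_free;   /* bytes free below 4G */
	unsigned long long high_free;  /* bytes free above 4G */
};

/* General allocation: consumes low memory first, then high memory. */
static bool toy_alloc_any(struct toy_mem *m, unsigned long long size)
{
	unsigned long long from_low = size < m->low_free ? size : m->low_free;

	if (size - from_low > m->high_free)
		return false;
	m->low_free -= from_low;
	m->high_free -= size - from_low;
	return true;
}

/* Allocation that must come from below 4G (for example an aperture). */
static bool toy_alloc_low(struct toy_mem *m, unsigned long long size)
{
	if (size > m->low_free)
		return false;
	m->low_free -= size;
	return true;
}

/* Returns true if the final aperture allocation succeeds. */
static bool toy_boot(bool reserve_early)
{
	const unsigned long long MB = 1ULL << 20;
	struct toy_mem m = { 3000 * MB, 250000 * MB };
	unsigned long long reserved = 0;

	/* Set a low range aside before the big allocation runs. */
	if (reserve_early && toy_alloc_low(&m, 128 * MB))
		reserved = 128 * MB;

	/* Large memmap-like allocation that drains the low pool. */
	toy_alloc_any(&m, 3500 * MB);

	/* Release the reservation just before the aperture is needed. */
	m.low_free += reserved;
	return toy_alloc_low(&m, 64 * MB);
}
```

In this model `toy_boot(false)` fails to find the aperture because the large allocation has already consumed everything below 4G, while `toy_boot(true)` succeeds.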
    • x86: fix {clear,copy}_user_page() declarations in page.h · f2f7abcb
      Chuck Lever authored
      Clean up: eliminate some compiler noise on x86 when building with strict
      warnings enabled, introduced by commit 345b904c.
      
      In file included from include2/asm/thread_info_64.h:12,
                       from include2/asm/thread_info.h:4,
                       from /home/cel/src/linux/nfs-2.6/include/linux/thread_info.h:35,
                       from /home/cel/src/linux/nfs-2.6/include/linux/preempt.h:9,
                       from /home/cel/src/linux/nfs-2.6/include/linux/spinlock.h:49,
                       from /home/cel/src/linux/nfs-2.6/include/linux/mmzone.h:7,
                       from /home/cel/src/linux/nfs-2.6/include/linux/gfp.h:4,
                       from /home/cel/src/linux/nfs-2.6/include/linux/slab.h:14,
                       from /home/cel/src/linux/nfs-2.6/fs/nfsd/nfs4acl.c:40:
      include2/asm/page.h:55: warning: `inline' is not at beginning of declaration
      include2/asm/page.h:61: warning: `inline' is not at beginning of declaration
      Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
      Cc: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
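The warning concerns the order of declaration specifiers: GCC accepts `inline` after the return type, but with strict warnings enabled (I believe this is `-Wold-style-declaration`) it complains that it is not at the beginning of the declaration. The sketch below shows both orderings on a hypothetical helper, `clear_buf_sketch()`, which is not the function from page.h.

```c
#include <string.h>

/*
 * Specifier order that triggers the warning under strict warnings
 * (left as a comment so this file builds cleanly):
 *
 *     static void inline clear_buf_sketch(void *buf, size_t len);
 */

/* Preferred order: storage class, then 'inline', then the return type. */
static inline void clear_buf_sketch(void *buf, size_t len)
{
	memset(buf, 0, len);
}
```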
    • x86: cast cmpxchg and cmpxchg_local result for 386 and 486 · 3078b79d
      Mathieu Desnoyers authored
      
      
      mm/slub.c: In function 'slab_alloc':
      mm/slub.c:1637: warning: assignment makes pointer from integer without a cast
      mm/slub.c:1637: warning: assignment makes pointer from integer without a cast
      mm/slub.c: In function 'slab_free':
      mm/slub.c:1796: warning: assignment makes pointer from integer without a cast
      mm/slub.c:1796: warning: assignment makes pointer from integer without a cast
      
      A cast is needed in the 386 and 486 code because the type is a pointer.  In
      every other integer case the original cmpxchg code (and the cmpxchg_local
      which has been copied from it) worked fine, but since we touch a pointer,
      the type needs to be cast in the cmpxchg_local and cmpxchg macros.
      
      The more recent code (586+) does not have this problem (the cast is already
      there).
      Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Reviewed-by: Christoph Lameter <clameter@sgi.com>
      Cc: Vegard Nossum <vegard.nossum@gmail.com>
      Cc: Pekka Enberg <penberg@cs.helsinki.fi>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
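The warnings arise when a helper returns an integer and the macro hands that integer straight back to a caller that stores it in a pointer. A minimal sketch of the cast, assuming GNU C `__typeof__` and a platform where `unsigned long` and pointers have the same size, is shown below. The helper is a hypothetical, non-atomic stand-in and handles only word-sized objects; it is not the kernel's implementation.

```c
/* Hypothetical, NON-atomic stand-in for a helper that always works on
 * unsigned long. Only valid for objects the size of unsigned long. */
static unsigned long sketch_cmpxchg_ul(volatile void *ptr,
				       unsigned long old,
				       unsigned long new_val)
{
	volatile unsigned long *p = ptr;
	unsigned long prev = *p;

	if (prev == old)
		*p = new_val;
	return prev;
}

/* The outer cast converts the integer result back to the type of *ptr,
 * which avoids "makes pointer from integer without a cast". */
#define sketch_cmpxchg(ptr, o, n)                                      \
	((__typeof__(*(ptr)))sketch_cmpxchg_ul((ptr),                   \
					       (unsigned long)(o),      \
					       (unsigned long)(n)))
```

With the cast in place, `int *prev = sketch_cmpxchg(&slot, (int *)0, &x);` compiles without the warning when `slot` has type `int *`.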
  11. 11 Mar, 2008 1 commit
    • x86: remove quicklists · 985a34bd
      Thomas Gleixner authored
      quicklists cause a serious memory leak on 32-bit x86,
      as documented at:
      
        http://bugzilla.kernel.org/show_bug.cgi?id=9991
      
      
      
      the reason is that the quicklist pool is a special-purpose
      cache that grows out of proportion. It is not accounted for
      anywhere and users have no way to even realize that it's
      the quicklists that are causing RAM usage spikes. It was
      supposed to be a relatively small pool, but as demonstrated
      by KOSAKI Motohiro, they can grow as large as:
      
        Quicklists:    1194304 kB
      
      given how much trouble this code has caused historically,
      and given that Andrew objected to its introduction on x86
      (years ago), the best option at this point is to remove them.
      
      [ any performance benefits of caching constructed pgds should
        be implemented in a more generic way (possibly within the page
        allocator), while still allowing constructed pages to be
        allocated by other workloads. ]
      Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
  12. 06 Mar, 2008 1 commit
  13. 05 Mar, 2008 1 commit
  14. 03 Mar, 2008 2 commits
  15. 29 Feb, 2008 2 commits
    • x86 ptrace: fix ptrace_bts_config structure declaration · 53c58588
      Dave Anderson authored
      
      
      The 2.6.25 ptrace_bts_config structure in asm-x86/ptrace-abi.h
      is defined with u32 types:
      
         #include <asm/types.h>
      
         /* configuration/status structure used in PTRACE_BTS_CONFIG and
            PTRACE_BTS_STATUS commands.
         */
         struct ptrace_bts_config {
                 /* requested or actual size of BTS buffer in bytes */
                 u32 size;
                 /* bitmask of below flags */
                 u32 flags;
                 /* buffer overflow signal */
                 u32 signal;
                 /* actual size of bts_struct in bytes */
                 u32 bts_size;
         };
         #endif
      
      But u32 is only accessible in asm-x86/types.h if __KERNEL__,
      leading to compile errors when ptrace.h is included from
      user-space. The double-underscore versions that are exported
      to user-space in asm-x86/types.h should be used instead.
      Signed-off-by: Dave Anderson <anderson@redhat.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
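A sketch of what the structure looks like with the double-underscore types follows. It assumes a Linux user-space build where `<linux/types.h>` provides `__u32`; the fallback typedef and the `_sketch` suffix are additions for illustration only, so that the example also builds where those headers are missing.

```c
#if defined(__has_include)
#  if __has_include(<linux/types.h>)
#    include <linux/types.h>     /* exports __u32 to user space */
#    define SKETCH_HAVE_LINUX_TYPES 1
#  endif
#endif
#ifndef SKETCH_HAVE_LINUX_TYPES
#  include <stdint.h>
typedef uint32_t __u32;           /* fallback for this sketch only */
#endif

/* Same fields as the structure quoted above, using the exported type. */
struct ptrace_bts_config_sketch {
	__u32 size;      /* requested or actual size of BTS buffer in bytes */
	__u32 flags;     /* bitmask of flags */
	__u32 signal;    /* buffer overflow signal */
	__u32 bts_size;  /* actual size of bts_struct in bytes */
};
```

Because `__u32` is visible without `__KERNEL__`, a user-space program can include the definition and the layout stays at four 32-bit fields.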
    • x86: fix pmd_bad and pud_bad to support huge pages · cded932b
      Hans Rosenfeld authored
      
      
      I recently stumbled upon a problem in the support for huge pages. If a
      program using huge pages does not explicitly unmap them, they remain
      mapped (and therefore, are lost) after the program exits.
      
      I observed that the free huge page count in /proc/meminfo decreased when
      running my program, and it did not increase after the program exited.
      After running the program a few times, no more huge pages could be
      allocated.
      
      The reason for this seems to be that the x86 pmd_bad and pud_bad
      consider pmd/pud entries having the PSE bit set invalid. I think there
      is nothing wrong with this bit being set, it just indicates that the
      lowest level of translation has been reached. This bit has to be (and
      is) checked after the basic validity of the entry has been checked, like
      in this fragment from follow_page() in mm/memory.c:
      
        if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
                goto no_page_table;
      
        if (pmd_huge(*pmd)) {
                BUG_ON(flags & FOLL_GET);
                page = follow_huge_pmd(mm, address, pmd, flags & FOLL_WRITE);
                goto out;
        }
      
      Note that this code currently doesn't work as intended if the pmd refers
      to a huge page: pmd_bad() rejects the entry first, so the pmd_huge()
      check is never reached.
      
      Extending pmd_bad() (and, for future 1GB page support, pud_bad()) to
      allow for the PSE bit being set fixes this. For similar reasons,
      allowing the NX bit being set is necessary, too. I have seen huge pages
      having the NX bit set in their pmd entry, which would cause the same
      problem.
      Signed-off-by: Hans Rosenfeld <hans.rosenfeld@amd.com>
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
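The difference between a strict check and one that tolerates PSE and NX can be sketched as follows. The flag bit positions follow the x86 page-table entry layout, but the address mask, the `SK_`/`sk_` names, and the exact form of both checks are simplified assumptions for illustration rather than the kernel's macros.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits follow the x86 page-table layout; the address mask is a
 * simplified assumption for this sketch. */
#define SK_PAGE_PRESENT  0x001ULL
#define SK_PAGE_RW       0x002ULL
#define SK_PAGE_USER     0x004ULL
#define SK_PAGE_ACCESSED 0x020ULL
#define SK_PAGE_DIRTY    0x040ULL
#define SK_PAGE_PSE      0x080ULL        /* entry maps a huge page */
#define SK_PAGE_NX       (1ULL << 63)    /* no-execute */
#define SK_ADDR_MASK     0x000ffffffffff000ULL
#define SK_KERNPG_TABLE  (SK_PAGE_PRESENT | SK_PAGE_RW | \
			  SK_PAGE_ACCESSED | SK_PAGE_DIRTY)

/* Strict check: any flag beyond the expected table bits marks it bad. */
static bool sk_pmd_bad_strict(uint64_t pmd)
{
	return (pmd & ~SK_ADDR_MASK & ~SK_PAGE_USER) != SK_KERNPG_TABLE;
}

/* Relaxed check: PSE and NX are masked out before comparing. */
static bool sk_pmd_bad_relaxed(uint64_t pmd)
{
	uint64_t ignore = SK_ADDR_MASK | SK_PAGE_USER |
			  SK_PAGE_PSE | SK_PAGE_NX;

	return (pmd & ~ignore) != SK_KERNPG_TABLE;
}
```

A huge-page entry such as `0x40000000 | SK_KERNPG_TABLE | SK_PAGE_PSE` is rejected by the strict check but accepted by the relaxed one, with or without NX, while an entry with only PSE set is still treated as bad.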