1. 06 Jan, 2006 9 commits
  2. 09 Nov, 2005 4 commits
    • [PATCH] sched: resched and cpu_idle rework · 64c7c8f8
      Nick Piggin authored
      
      
      Make some changes to the NEED_RESCHED and POLLING_NRFLAG flags to reduce
      confusion, and make their semantics rigid.  Improves efficiency of
      resched_task and some cpu_idle routines.
      
      * In resched_task:
      - TIF_NEED_RESCHED is only cleared with the task's runqueue lock held,
        and as we hold it during resched_task, then there is no need for an
        atomic test and set there. The only other time this should be set is
        when the task's quantum expires, in the timer interrupt - this is
        protected against because the rq lock is irq-safe.
      
      - If TIF_NEED_RESCHED is set, then we don't need to do anything. It
        won't get unset until the task gets schedule()d off.
      
      - If we are running on the same CPU as the task we resched, then set
        TIF_NEED_RESCHED and no further action is required.
      
      - If we are running on another CPU, and TIF_POLLING_NRFLAG is *not* set
        after TIF_NEED_RESCHED has been set, then we need to send an IPI.
      
      Using these rules, we are able to remove the test and set operation in
      resched_task, and make clear the previously vague semantics of
      POLLING_NRFLAG.
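      The four rules above can be sketched as a small user-space model. This
      is only an illustration under stated assumptions, not the kernel code:
      struct task_model, resched_task_model() and the flag constants are
      hypothetical stand-ins, C11 atomics take the place of the kernel's
      thread-flag helpers, and the caller is assumed to hold the runqueue lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TIF_NEED_RESCHED and TIF_POLLING_NRFLAG. */
enum { NEED_RESCHED = 1u << 0, POLLING_NRFLAG = 1u << 1 };

struct task_model {
    atomic_uint flags;  /* models the task's thread-info flags word */
    int cpu;            /* CPU the task is currently running on */
};

/*
 * Models the decision made in resched_task(). The caller is assumed to
 * hold the task's runqueue lock, so a plain test followed by a set is
 * sufficient; no atomic test-and-set is needed.
 * Returns true when a resched IPI would have to be sent.
 */
static inline bool resched_task_model(struct task_model *t, int this_cpu)
{
    assert(t);

    /* Already marked: nothing to do until the task is scheduled off. */
    if (atomic_load(&t->flags) & NEED_RESCHED)
        return false;

    /* The OR is atomic only because the target may change POLLING_NRFLAG
     * in the same word concurrently; it is not a test-and-set. */
    atomic_fetch_or(&t->flags, NEED_RESCHED);

    /* Same CPU: setting the flag is all that is required. */
    if (t->cpu == this_cpu)
        return false;

    /* Remote CPU: POLLING_NRFLAG is read *after* NEED_RESCHED was set.
     * A polling target will notice the flag on its own; otherwise an
     * IPI is needed. */
    return !(atomic_load(&t->flags) & POLLING_NRFLAG);
}
```

      In this model an IPI is requested only for a remote, non-polling task
      that was not already marked for rescheduling.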
      
      * In idle routines:
      - Enter cpu_idle with preempt disabled. When the need_resched() condition
        becomes true, explicitly call schedule(). This makes things a bit clearer
        (IMO), but I haven't updated all architectures yet.
      
      - Many do a test and clear of TIF_NEED_RESCHED for some reason. According
        to the resched_task rules, this isn't needed (and actually breaks the
        assumption that TIF_NEED_RESCHED is only cleared with the runqueue lock
        held). So remove that. Generally one less locked memory op when switching
        to the idle thread.
      
      - Many idle routines clear TIF_POLLING_NRFLAG, and only set it in the
        innermost polling idle loops. The above resched_task semantics allow it
        to stay set until just before the last need_resched() check that
        precedes a halt requiring interrupt wakeup.
      
        Many idle routines simply never enter such a halt, and so POLLING_NRFLAG
        can be always left set, completely eliminating resched IPIs when rescheduling
        the idle task.
      
        POLLING_NRFLAG width can be increased, to reduce the chance of resched IPIs.
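      The two idle styles described above can be sketched as follows. This is
      a user-space model under stated assumptions: poll_idle_model(),
      may_halt_model() and the flag constants are hypothetical names, and C11
      atomics stand in for the kernel's flag helpers. Neither routine clears
      NEED_RESCHED; that is left to the scheduler.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for TIF_NEED_RESCHED and TIF_POLLING_NRFLAG. */
enum { NEED_RESCHED = 1u << 0, POLLING_NRFLAG = 1u << 1 };

/*
 * Polling idle: never halts, so POLLING_NRFLAG can stay set for the
 * whole loop and a remote CPU only has to set NEED_RESCHED (no IPI).
 * max_spins bounds the loop so the model always terminates.
 * Returns the number of poll iterations performed.
 */
static inline unsigned long poll_idle_model(atomic_uint *flags,
                                            unsigned long max_spins)
{
    unsigned long spins = 0;

    assert(flags);
    atomic_fetch_or(flags, POLLING_NRFLAG);
    while (!(atomic_load(flags) & NEED_RESCHED) && spins < max_spins)
        spins++;
    return spins;   /* the real loop would now call schedule() explicitly */
}

/*
 * Halting idle: POLLING_NRFLAG is cleared *before* the final
 * need_resched() check. Any waker arriving after the clear sees the
 * flag unset and sends an IPI, which wakes the halted CPU.
 * Returns true when it is safe to halt.
 */
static inline bool may_halt_model(atomic_uint *flags)
{
    assert(flags);
    atomic_fetch_and(flags, ~(unsigned)POLLING_NRFLAG);
    return !(atomic_load(flags) & NEED_RESCHED);
}
```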
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Cc: Ingo Molnar <mingo@elte.hu>
      Cc: Con Kolivas <kernel@kolivas.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] sched: disable preempt in idle tasks · 5bfb5d69
      Nick Piggin authored
      
      
      Run idle threads with preempt disabled.
      
      Also corrected a bug in arm26's cpu_idle (make it actually call schedule()).
      How did it ever work before?
      
      Might fix the CPU hotplugging hang which Nigel Cunningham noted.
      
      We think the bug hits if the idle thread is preempted after checking
      need_resched() and before going to sleep, and the CPU is then offlined.
      
      After calling stop_machine_run, the CPU eventually returns from preemption
      into the idle thread and goes to sleep.  The CPU will continue executing the
      previous idle routine and have no chance to call play_dead.
      
      By disabling preemption until we are ready to explicitly schedule, this bug is
      fixed and the idle threads generally become more robust.
      
      From: alexs <ashepard@u.washington.edu>
      
        PPC build fix
      
      From: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      
        MIPS build fix
      Signed-off-by: Nick Piggin <npiggin@suse.de>
      Signed-off-by: Yoichi Yuasa <yuasa@hh.iij4u.or.jp>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] s390: "extern inline" -> "static inline" · 4448aaf0
      Adrian Bunk authored
      
      
      "extern inline" -> "static inline"
      Signed-off-by: Adrian Bunk <bunk@stusta.de>
      Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
    • [PATCH] Fix sysctl unregistration oops (CVE-2005-2709) · 330d57fb
      Al Viro authored
      
      
      You could open the /proc/sys/net/ipv4/conf/<if>/<whatever> file, then
      wait for the interface to go away, and try to grab as much memory as
      possible in the hope of hitting the (kfreed) ctl_table.  Then fill it with
      pointers to your function.  Then do a read from the file you've opened and,
      if you are lucky, you'll get it called as ->proc_handler() in kernel mode.
      
      So this is at least an Oops and possibly more.  It does depend on an
      interface going away though, so less of a security risk than it would
      otherwise be.
      Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  3. 07 Nov, 2005 6 commits
  4. 31 Oct, 2005 4 commits
  5. 30 Oct, 2005 1 commit
    • [PATCH] mm: init_mm without ptlock · 872fec16
      Hugh Dickins authored
      
      
      First step in pushing down the page_table_lock.  init_mm.page_table_lock has
      been used throughout the architectures (usually for ioremap): not to serialize
      kernel address space allocation (that's usually vmlist_lock), but because
      pud_alloc, pmd_alloc and pte_alloc_kernel expect the caller to hold it.
      
      Reverse that: don't lock or unlock init_mm.page_table_lock in any of the
      architectures; instead rely on pud_alloc, pmd_alloc and pte_alloc_kernel to
      take and drop it when allocating a new table, to check whether a racing task
      already did.  Similarly no page_table_lock in vmalloc's map_vm_area.
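      The "take and drop the lock inside the allocator, then check for a racing
      task" pattern can be sketched in user space as below. This is a model
      under stated assumptions, not the kernel implementation: struct
      dir_entry_model and table_alloc_model() are hypothetical names, a pthread
      mutex stands in for init_mm.page_table_lock, and calloc() stands in for
      the page allocator.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Models one directory slot that may point at a lower-level table. */
struct dir_entry_model {
    void *table;
};

/* Stand-in for init_mm.page_table_lock. */
static pthread_mutex_t page_table_lock_model = PTHREAD_MUTEX_INITIALIZER;

/*
 * The caller does NOT hold the lock. The new table is allocated with
 * the lock dropped; the lock is taken only to install it, and the slot
 * is re-checked in case a racing task populated it first.
 * Returns 0 on success, -1 if the allocation failed.
 */
static inline int table_alloc_model(struct dir_entry_model *slot)
{
    void *new_table;

    assert(slot);
    new_table = calloc(1, 4096);    /* allocate outside the lock */
    if (!new_table)
        return -1;

    pthread_mutex_lock(&page_table_lock_model);
    if (slot->table)
        free(new_table);            /* lost the race: discard our copy */
    else
        slot->table = new_table;    /* won the race: install it */
    pthread_mutex_unlock(&page_table_lock_model);
    return 0;
}
```

      Callers never touch the lock themselves, which is the property the patch
      relies on when it removes the locking from the architecture code.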
      
      Some temporary ugliness in __pud_alloc and __pmd_alloc: since they also handle
      user mms, which are converted only by a later patch, for now they have to lock
      differently according to whether or not it's init_mm.
      
      If sources get muddled, there's a danger that an arch source taking
      init_mm.page_table_lock will be mixed with common source also taking it (or
      neither take it).  So break the rules and make another change, which should
      break the build for such a mismatch: remove the redundant mm arg from
      pte_alloc_kernel (ppc64 scrapped its distinct ioremap_mm in 2.6.13).
      
      Exceptions: arm26 used pte_alloc_kernel on user mm, now pte_alloc_map; ia64
      used pte_alloc_map on init_mm, now pte_alloc_kernel; parisc had bad args to
      pmd_alloc and pte_alloc_kernel in unused USE_HPPA_IOREMAP code; ppc64
      map_io_page forgot to unlock on failure; ppc mmu_mapin_ram and ppc64 im_free
      took page_table_lock for no good reason.
      Signed-off-by: Hugh Dickins <hugh@veritas.com>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  6. 29 Sep, 2005 1 commit
  7. 17 Sep, 2005 4 commits
  8. 12 Sep, 2005 1 commit
    • [PATCH] x86-64: Fix 32bit sendfile · 83b942bd
      Tsuneo.Yoshioka@f-secure.com authored
      
      
      If we use a 64-bit kernel on the ia64/x86_64/s390 architectures and run a
      32-bit binary in 32-bit compatibility mode, the sendfile system call does
      not seem to set the offset argument.
      
      This is because sendfile's return value is not zero on success, but the
      code judges the result by whether the return value is zero.
      
      This problem affects ia64/x86_64/s390 and does not affect the other
      architectures (mips/parisc/ppc64/sparc64).
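      The effect of testing the return value against zero can be shown with a
      small model. Everything here is a hypothetical sketch: the function names
      are invented for illustration, the native call is simulated, and the
      "fixed" condition (treating any non-negative return as success) is an
      assumption about the shape of the fix rather than a quote from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Simulated native sendfile: advances the offset and returns the
 * number of bytes "sent" (a positive value on success). */
static inline long native_sendfile_model(long long *off, long count)
{
    *off += count;
    return count;
}

/*
 * Models a 32-bit compatibility wrapper that widens the caller's
 * offset, calls the native routine, and copies the offset back.
 *   fixed == false: offset copied back only when ret == 0 (the bug)
 *   fixed == true:  offset copied back whenever ret >= 0 (assumed fix)
 */
static inline long compat_sendfile_model(int *user_off32, long count,
                                         bool fixed)
{
    long long off;
    long ret;
    bool success;

    assert(user_off32);
    off = *user_off32;
    ret = native_sendfile_model(&off, count);
    success = fixed ? (ret >= 0) : (ret == 0);
    if (success)
        *user_off32 = (int)off;     /* write the new offset back */
    return ret;
}
```

      With the buggy test, a successful 100-byte transfer leaves the caller's
      offset untouched; with the non-negative test it is advanced by 100.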
      Signed-off-by: Andi Kleen <ak@suse.de>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  9. 10 Sep, 2005 1 commit
    • [PATCH] spinlock consolidation · fb1c8f93
      Ingo Molnar authored
      
      
      This patch (written by me and also containing many suggestions of Arjan van
      de Ven) does a major cleanup of the spinlock code.  It does the following
      things:
      
       - consolidates and enhances the spinlock/rwlock debugging code
      
       - simplifies the asm/spinlock.h files
      
       - encapsulates the raw spinlock type and moves generic spinlock
         features (such as ->break_lock) into the generic code.
      
       - cleans up the spinlock code hierarchy to get rid of the spaghetti.
      
      Most notably there's now only a single variant of the debugging code,
      located in lib/spinlock_debug.c.  (previously we had one SMP debugging
      variant per architecture, plus a separate generic one for UP builds)
      
      Also, I've enhanced the rwlock debugging facility; it will now track
      write-owners.  There is new spinlock-owner/CPU-tracking on SMP builds too.
      All locks have lockup detection now, which will work for both soft and hard
      spin/rwlock lockups.
      
      The arch-level include files now only contain the minimally necessary
      subset of the spinlock code - all the rest that can be generalized now
      lives in the generic headers:
      
       include/asm-i386/spinlock_types.h       |   16
       include/asm-x86_64/spinlock_types.h     |   16
      
      I have also split up the various spinlock variants into separate files,
      making it easier to see which does what. The new layout is:
      
         SMP                         |  UP
         ----------------------------|-----------------------------------
         asm/spinlock_types_smp.h    |  linux/spinlock_types_up.h
         linux/spinlock_types.h      |  linux/spinlock_types.h
         asm/spinlock_smp.h          |  linux/spinlock_up.h
         linux/spinlock_api_smp.h    |  linux/spinlock_api_up.h
         linux/spinlock.h            |  linux/spinlock.h
      
      /*
       * here's the role of the various spinlock/rwlock related include files:
       *
       * on SMP builds:
       *
       *  asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the
       *                        initializers
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  asm/spinlock.h:       contains the __raw_spin_*()/etc. lowlevel
       *                        implementations, mostly inline assembly code
       *
       *   (also included on UP-debug builds:)
       *
       *  linux/spinlock_api_smp.h:
       *                        contains the prototypes for the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       *
       * on UP builds:
       *
       *  linux/spinlock_types_up.h:
       *                        contains the generic, simplified UP spinlock type.
       *                        (which is an empty structure on non-debug builds)
       *
       *  linux/spinlock_types.h:
       *                        defines the generic type and initializers
       *
       *  linux/spinlock_up.h:
       *                        contains the __raw_spin_*()/etc. version of UP
       *                        builds. (which are NOPs on non-debug, non-preempt
       *                        builds)
       *
       *   (included on UP-non-debug builds:)
       *
       *  linux/spinlock_api_up.h:
       *                        builds the _spin_*() APIs.
       *
       *  linux/spinlock.h:     builds the final spin_*() APIs.
       */
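      The layering described in the comment above can be sketched in user space
      as follows. This is a hypothetical model, not the kernel headers: all
      model_* names are invented, a C11 atomic_flag stands in for the
      arch-specific lock word, and the owner_cpu field only gestures at the
      owner tracking mentioned earlier.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Role of asm/spinlock_types.h: the arch-specific raw lock type. */
typedef struct {
    atomic_flag slock;
} model_raw_spinlock_t;

/* Role of linux/spinlock_types.h: generic type wrapping the raw lock
 * and carrying generic (here: debugging) state. */
typedef struct {
    model_raw_spinlock_t raw_lock;
    int owner_cpu;              /* -1 while unlocked */
} model_spinlock_t;

/* Role of asm/spinlock.h: low-level lock operations. */
static inline bool model_raw_spin_trylock(model_raw_spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->slock,
                                              memory_order_acquire);
}

static inline void model_raw_spin_lock(model_raw_spinlock_t *l)
{
    while (!model_raw_spin_trylock(l))
        ;                       /* spin until the flag is released */
}

static inline void model_raw_spin_unlock(model_raw_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->slock, memory_order_release);
}

/* Role of the generic API layer: built on the raw operations, with
 * owner tracking added on top. */
static inline void model_spin_lock_init(model_spinlock_t *lock)
{
    atomic_flag_clear(&lock->raw_lock.slock);
    lock->owner_cpu = -1;
}

static inline bool model_spin_trylock(model_spinlock_t *lock, int cpu)
{
    if (!model_raw_spin_trylock(&lock->raw_lock))
        return false;
    lock->owner_cpu = cpu;
    return true;
}

static inline void model_spin_lock(model_spinlock_t *lock, int cpu)
{
    model_raw_spin_lock(&lock->raw_lock);
    lock->owner_cpu = cpu;
}

static inline void model_spin_unlock(model_spinlock_t *lock, int cpu)
{
    assert(lock->owner_cpu == cpu);     /* debug check: owner unlocks */
    lock->owner_cpu = -1;
    model_raw_spin_unlock(&lock->raw_lock);
}
```

      Only the raw type and the three raw operations would differ per
      architecture in this model; the wrapper and its debugging state are
      shared.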
      
      All SMP and UP architectures are converted by this patch.
      
      arm, i386, ia64, ppc, ppc64, s390/s390x, x64 were build-tested via
      crosscompilers.  m32r, mips, sh, sparc have not been tested yet, but should
      be mostly fine.
      
      From: Grant Grundler <grundler@parisc-linux.org>
      
        Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU).
        Builds 32-bit SMP kernel (not booted or tested).  I did not try to build
        non-SMP kernels.  That should be trivial to fix up later if necessary.
      
        I converted bit ops atomic_hash lock to raw_spinlock_t.  Doing so avoids
        some ugly nesting of linux/*.h and asm/*.h files.  Those particular locks
        are well tested and contained entirely inside arch specific code.  I do NOT
        expect any new issues to arise with them.
      
        If someone does ever need to use debug/metrics with them, then they will
        need to unravel this hairball between spinlocks, atomic ops, and bit ops
        that exist only because parisc has exactly one atomic instruction: LDCW
        (load and clear word).
      
      From: "Luck, Tony" <tony.luck@intel.com>
      
         ia64 fix
      Signed-off-by: Ingo Molnar <mingo@elte.hu>
      Signed-off-by: Arjan van de Ven <arjanv@infradead.org>
      Signed-off-by: Grant Grundler <grundler@parisc-linux.org>
      Cc: Matthew Wilcox <willy@debian.org>
      Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
      Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se>
      Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org>
      Signed-off-by: Andrew Morton <akpm@osdl.org>
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  10. 09 Sep, 2005 2 commits
  11. 07 Sep, 2005 1 commit
  12. 05 Sep, 2005 4 commits
  13. 29 Aug, 2005 1 commit
    • [PATCH] convert signal handling of NODEFER to act like other Unix boxes. · 69be8f18
      Steven Rostedt authored
      
      
      It has been reported that the way Linux handles NODEFER for signals is
      not consistent with the way other Unix boxes handle it.  I've written a
      program to test how this flag affects signals and had several reports
      from people who ran it on various Unix boxes, confirming that Linux
      seems to be unique in the way this is handled.
      
      The way NODEFER affects signals on other Unix boxes is as follows:
      
      1) If NODEFER is set, other signals in sa_mask are still blocked.
      
      2) If NODEFER is set and the signal is in sa_mask, then the signal is
      still blocked. (Note: this is the behavior of all tested systems except
      Linux _and_ NetBSD 2.0 *).
      
      The way NODEFER affects signals on Linux:
      
      1) If NODEFER is set, other signals are _not_ blocked regardless of
      sa_mask (Even NetBSD doesn't do this).
      
      2) If NODEFER is set and the signal is in sa_mask, then the signal being
      handled is not blocked.
      
      The patch converts signal handling in all current Linux architectures to
      the way most Unix boxes work.
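      A probe along the lines of the test program mentioned above might look
      like this. It is not the author's program, only a minimal sketch:
      probe_nodefer() and blocked_in_handler() are hypothetical helpers, and the
      choice of SIGUSR1/SIGUSR2 is arbitrary. It installs a SIGUSR1 handler with
      SA_NODEFER, puts SIGUSR2 (and optionally SIGUSR1 itself) in sa_mask, and
      records which signals are blocked while the handler runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Signal mask observed while the handler was running. */
static sigset_t mask_in_handler;

static void record_mask(int sig)
{
    (void)sig;
    /* With a NULL new set, sigprocmask() only reads the current mask. */
    sigprocmask(SIG_BLOCK, NULL, &mask_in_handler);
}

/*
 * Installs the handler for SIGUSR1 with SA_NODEFER and SIGUSR2 in
 * sa_mask; when self_in_mask is true, SIGUSR1 is added to sa_mask too.
 * Then raises SIGUSR1 so the handler records the mask in effect.
 * Returns 0 on success, nonzero on failure.
 */
static inline int probe_nodefer(bool self_in_mask)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = record_mask;
    sa.sa_flags = SA_NODEFER;
    sigemptyset(&sa.sa_mask);
    sigaddset(&sa.sa_mask, SIGUSR2);
    if (self_in_mask)
        sigaddset(&sa.sa_mask, SIGUSR1);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return -1;

    sigemptyset(&mask_in_handler);
    return raise(SIGUSR1);      /* handler runs before raise() returns */
}

/* True if sig was blocked during the most recent probe's handler. */
static inline bool blocked_in_handler(int sig)
{
    assert(sig > 0);
    return sigismember(&mask_in_handler, sig) == 1;
}
```

      On a system that follows the behaviour this patch adopts, SIGUSR2 should
      be reported as blocked in both probes, and SIGUSR1 only when it was
      placed in sa_mask.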
      
      Unix boxes that were tested:  DU4, AIX 5.2, Irix 6.5, NetBSD 2.0, SFU
      3.5 on WinXP, AIX 5.3, Mac OSX, and of course Linux 2.6.13-rcX.
      
      * NetBSD was the only other Unix to behave like Linux on point #2. The
      main concern was brought up by point #1, where even NetBSD doesn't behave
      like Linux.  So with this patch, we leave NetBSD as the lonely one that
      behaves differently here with #2.
      Signed-off-by: Linus Torvalds <torvalds@osdl.org>
  14. 24 Aug, 2005 1 commit