1. 31 Oct, 2005 2 commits
  2. 30 Oct, 2005 1 commit
    • Hugh Dickins's avatar
      [PATCH] mm: update_hiwaters just in time · 365e9c87
      Hugh Dickins authored
      update_mem_hiwater has attracted various criticisms, in particular from those
      concerned with mm scalability.  Originally it was called whenever rss or
      total_vm got raised.  Then many of those callsites were replaced by a timer
      tick call from account_system_time.  Now Frank van Maarseveen reports that to
      be found inadequate.  How about this?  Works for Frank.
      Replace update_mem_hiwater, a poor combination of two unrelated ops, by macros
      update_hiwater_rss and update_hiwater_vm.  Don't attempt to keep
      mm->hiwater_rss up to date at timer tick, nor every time we raise rss (usually
      by 1): those are hot paths.  Do the opposite, update only when about to lower
      rss (usually by many), or just before final accounting in do_exit.  Handle
      mm->hiwater_vm in the same way, though it's much less of an issue.  Demand
      that whoever collects these hiwater statistics do the work of taking the
      maximum with rss or total_vm.
      And there has been no collector of these hiwater statistics in the tree.  The
      new convention needs an example, so match Frank's usage by adding a VmPeak
      line above VmSize to /proc/<pid>/status, and also a VmHWM line above VmRSS
      (High-Water-Mark or High-Water-Memory).
      There was a particular anomaly during mremap move, that hiwater_vm might be
      captured too high.  A fleeting such anomaly remains, but it's quickly
      corrected now, whereas before it would stick.
      What locking?  None: if the app is racy then these statistics will be racy,
      it's not worth any overhead to make them exact.  But whenever it suits,
      hiwater_vm is updated under exclusive mmap_sem, and hiwater_rss under
      page_table_lock (for now) or with preemption disabled (later on): without
      going to any trouble, minimize the time between reading current values and
      updating, to minimize those occasions when a racing thread bumps a count up
      and back down in between.
      Signed-off-by: default avatarHugh Dickins <hugh@veritas.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  3. 27 Oct, 2005 1 commit
  4. 24 Oct, 2005 1 commit
    • Oleg Nesterov's avatar
      [PATCH] posix-timers: remove false BUG_ON() from run_posix_cpu_timers() · 3de463c7
      Oleg Nesterov authored
      do_exit() clears ->it_##clock##_expires, but nothing prevents
      another cpu to attach the timer to exiting process after that.
      After exit_notify() does 'write_unlock_irq(&tasklist_lock)' and
      before do_exit() calls 'schedule() local timer interrupt can find
      tsk->exit_state != 0. If that state was EXIT_DEAD (or another cpu
      does sys_wait4) interrupted task has ->signal == NULL.
      At this moment exiting task has no pending cpu timers, they were cleaned
      up in __exit_signal()->posix_cpu_timers_exit{,_group}(), so we can just
      return from irq.
      Signed-off-by: default avatarOleg Nesterov <oleg@tv-sign.ru>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  5. 21 Oct, 2005 1 commit
    • Roland McGrath's avatar
      [PATCH] Call exit_itimers from do_exit, not __exit_signal · 25f407f0
      Roland McGrath authored
      When I originally moved exit_itimers into __exit_signal, that was the only
      place where we could reliably know it was the last thread in the group
      dying, without races.  Since then we've gotten the signal_struct.live
      counter, and do_exit can reliably do group-wide cleanup work.
      This patch moves the call to do_exit, where it's made without locks.  This
      avoids the deadlock issues that the old __exit_signal code's comment talks
      about, and the one that Oleg found recently with process CPU timers.
      [ This replaces e03d13e9
      , which is why
        it was just reverted. ]
      Signed-off-by: default avatarRoland McGrath <roland@redhat.com>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  6. 01 Oct, 2005 1 commit
    • Linus Torvalds's avatar
      Fix inequality comparison against "task->state" · 14bf01bb
      Linus Torvalds authored
      We should always use bitmask ops, rather than depend on some ordering of
      the different states.  With the TASK_NONINTERACTIVE flag, the inequality
      doesn't really work.
      Oleg Nesterov argues (likely correctly) that this test is unnecessary in
      the first place.  However, the minimal fix for now is to at least make
      it work in the presense of TASK_NONINTERACTIVE.  Waiting for consensus
      from Roland & co on potential bigger cleanups.
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  7. 17 Sep, 2005 1 commit
  8. 09 Sep, 2005 2 commits
    • Dipankar Sarma's avatar
      [PATCH] files: files struct with RCU · ab2af1f5
      Dipankar Sarma authored
      Patch to eliminate struct files_struct.file_lock spinlock on the reader side
      and use rcu refcounting rcuref_xxx api for the f_count refcounter.  The
      updates to the fdtable are done by allocating a new fdtable structure and
      setting files->fdt to point to the new structure.  The fdtable structure is
      protected by RCU thereby allowing lock-free lookup.  For fd arrays/sets that
      are vmalloced, we use keventd to free them since RCU callbacks can't sleep.  A
      global list of fdtable to be freed is not scalable, so we use a per-cpu list.
      If keventd is already handling the current cpu's work, we use a timer to defer
      queueing of that work.
      Since the last publication, this patch has been re-written to avoid using
      explicit memory barriers and use rcu_assign_pointer(), rcu_dereference()
      premitives instead.  This required that the fd information is kept in a
      separate structure (fdtable) and updated atomically.
      Signed-off-by: default avatarDipankar Sarma <dipankar@in.ibm.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
    • Dipankar Sarma's avatar
      [PATCH] files: break up files struct · badf1662
      Dipankar Sarma authored
      In order for the RCU to work, the file table array, sets and their sizes must
      be updated atomically.  Instead of ensuring this through too many memory
      barriers, we put the arrays and their sizes in a separate structure.  This
      patch takes the first step of putting the file table elements in a separate
      structure fdtable that is embedded withing files_struct.  It also changes all
      the users to refer to the file table using files_fdtable() macro.  Subsequent
      applciation of RCU becomes easier after this.
      Signed-off-by: default avatarDipankar Sarma <dipankar@in.ibm.com>
      Signed-Off-By: default avatarDavid Howells <dhowells@redhat.com>
      Signed-off-by: default avatarAndrew Morton <akpm@osdl.org>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  9. 04 Aug, 2005 1 commit
  10. 27 Jun, 2005 1 commit
    • Jens Axboe's avatar
      [PATCH] Update cfq io scheduler to time sliced design · 22e2c507
      Jens Axboe authored
      This updates the CFQ io scheduler to the new time sliced design (cfq
      v3).  It provides full process fairness, while giving excellent
      aggregate system throughput even for many competing processes.  It
      supports io priorities, either inherited from the cpu nice value or set
      directly with the ioprio_get/set syscalls.  The latter closely mimic
      This import is based on my latest from -mm.
      Signed-off-by: default avatarJens Axboe <axboe@suse.de>
      Signed-off-by: default avatarLinus Torvalds <torvalds@osdl.org>
  11. 23 Jun, 2005 2 commits
  12. 17 Jun, 2005 1 commit
  13. 03 May, 2005 1 commit
    • Russ Anderson's avatar
      [patch] MCA recovery module undefined symbol fix · 012914da
      Russ Anderson authored
      The patch "MCA recovery improvements" added do_exit to mca_drv.c.
      That's fine when the mca recovery code is built in the kernel
      (CONFIG_IA64_MCA_RECOVERY=y) but breaks building the mca recovery
      code as a module (CONFIG_IA64_MCA_RECOVERY=m).
      Most users are currently building this as a module, as loading
      and unloading the module provides a very convenient way to turn
      on/off error recovery.
      This patch exports do_exit, so mca_drv.c can build as a module.
      Signed-off-by: Russ Anderson (rja@sgi.com)
      Signed-off-by: default avatarTony Luck <tony.luck@intel.com>
  14. 01 May, 2005 3 commits
  15. 29 Apr, 2005 1 commit
  16. 16 Apr, 2005 2 commits