    [PATCH] cpuset: fix obscure attach_task vs exiting race · 181b6480
    Paul Jackson authored
    
    
    Fix obscure race condition in kernel/cpuset.c attach_task() code.
    
    There is basically zero chance of anyone accidentally being harmed by this
    race.
    
    It requires a special 'micro-stress' load and special timing loop hacks
    in the kernel to hit in less than an hour, and even then you'd have to hit
    it hundreds or thousands of times, followed by some unusual and senseless
    cpuset configuration requests, including removing the top cpuset, to cause
    any visibly harmful effects.
    
    One could, with perhaps a few days or weeks of such effort, get the
    reference count on the top cpuset below zero, and manage to crash the
    kernel by asking to remove the top cpuset.
    
    I found it by code inspection.
    
    The race dates from when 'the_top_cpuset_hack' was added, and one
    piece of code was not updated.  An old check for a possibly null task
    cpuset pointer needed to be changed to a check for a task marked
    PF_EXITING.  The pointer can't be null anymore, thanks to
    the_top_cpuset_hack (documented in kernel/cpuset.c).  But the task could
    have gone into PF_EXITING state after it was found in the task_list scan.
    
    If a task is PF_EXITING in this code, it is possible that its task->cpuset
    pointer is pointing to the top cpuset due to the_top_cpuset_hack, rather
    than because the top_cpuset was that task's last valid cpuset.  In that
    case, the wrong cpuset reference counter would be decremented.
    
    The fix is trivial.  Instead of failing the system call if the task's cpuset
    pointer is null here, fail it if the task is in PF_EXITING state.
    
    The code for 'the_top_cpuset_hack' that changes an exiting task's cpuset to
    the top_cpuset runs without locking, so it could happen at any time.  But it
    runs during the exit handling, after the PF_EXITING flag is set.  So if
    we verify that a task is still not PF_EXITING after we copy out its cpuset
    pointer (into 'oldcs', below), we know that 'oldcs' is not one of these
    hack references to the top_cpuset.
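
    The ordering argument above can be sketched as a small user-space model.
    This is not the kernel code: 'struct cpuset', 'struct task', model_exit()
    and model_attach_task() are simplified, hypothetical stand-ins, locking is
    left out, and the model assumes the exit path drops the task's reference
    on its last valid cpuset before pointing it at the top_cpuset.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* User-space model only; simplified stand-ins for the kernel structures. */
#define PF_EXITING 0x00000004

struct cpuset { int count; };                 /* reference count only */
struct task { unsigned int flags; struct cpuset *cpuset; };

static struct cpuset top_cpuset = { 0 };

/*
 * Models the exit path (assumed shape): PF_EXITING is set first, then the
 * task's cpuset pointer is redirected to top_cpuset WITHOUT taking a
 * reference on it.  That uncounted pointer is the "hack reference".
 */
static void model_exit(struct task *tsk)
{
    tsk->flags |= PF_EXITING;
    tsk->cpuset->count--;          /* drop ref on the last valid cpuset */
    tsk->cpuset = &top_cpuset;     /* hack reference: no count taken */
}

/*
 * Models the attach path with the check described above: copy out the old
 * pointer, then refuse to proceed if the task is exiting, because 'oldcs'
 * might then be an uncounted hack reference to top_cpuset.
 */
static int model_attach_task(struct task *tsk, struct cpuset *cs)
{
    struct cpuset *oldcs;

    assert(tsk != NULL && cs != NULL);

    oldcs = tsk->cpuset;
    if (tsk->flags & PF_EXITING)
        return -ESRCH;             /* never touch oldcs->count */

    cs->count++;                   /* take ref on the new cpuset */
    tsk->cpuset = cs;
    oldcs->count--;                /* safe: oldcs was a counted reference */
    return 0;
}
```

    In this model, attaching an exiting task leaves top_cpuset.count
    untouched, whereas a null-pointer check would have let the decrement
    through, since the pointer is never null.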
    
    Signed-off-by: Paul Jackson <pj@sgi.com>
    Signed-off-by: Andrew Morton <akpm@osdl.org>
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>