    mempolicy: add bitmap_onto() and bitmap_fold() operations · 7ea931c9
    Paul Jackson authored
    
    
    The following adds two more bitmap operators, bitmap_onto() and bitmap_fold(),
    with the usual cpumask and nodemask wrappers.
    
    The bitmap_onto() operator computes one bitmap relative to another.  If the
    n-th bit in the origin mask is set, then the m-th bit of the destination mask
    will be set, where m is the position of the n-th set bit in the relative mask.
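
    The mapping described above can be sketched on a single 64-bit word.
    This is only an illustration: the name onto64() and the uint64_t
    representation are assumptions made for the example, whereas the
    operator in this patch works on kernel bitmaps.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical single-word sketch of the "onto" mapping.
 * For each n, if bit n of orig is set and relmap has an n-th set bit
 * (counting from 0), set the bit of dst at that set bit's position.
 * Bits of orig at or beyond the weight of relmap are dropped.
 */
static uint64_t onto64(uint64_t orig, uint64_t relmap, unsigned int nbits)
{
    uint64_t dst = 0;
    unsigned int n = 0;     /* ordinal of the current set bit in relmap */
    unsigned int m;

    assert(nbits <= 64);
    for (m = 0; m < nbits; m++) {
        if (!(relmap & (UINT64_C(1) << m)))
            continue;
        /* m is the position of the n-th set bit of relmap */
        if (orig & (UINT64_C(1) << n))
            dst |= UINT64_C(1) << m;
        n++;
    }
    return dst;
}
```

    With relmap holding bits 30-39 and orig holding bits 1, 5 and 7, the
    sketch yields bits 31, 35 and 37.  An orig bit 11 would be dropped,
    because relmap has only ten set bits.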
    
    The bitmap_fold() operator folds a bitmap into a second that has bit m set iff
    the input bitmap has some bit n set, where m == n mod sz, for the specified sz
    value.
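
    A minimal single-word sketch of the fold, under the same assumptions
    as before (fold64() and the uint64_t representation are illustrative
    only, not the interface added by this patch):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical single-word sketch of the "fold" operation.
 * Bit m of the result is set iff some bit n of orig is set
 * with m == n mod sz.
 */
static uint64_t fold64(uint64_t orig, unsigned int sz, unsigned int nbits)
{
    uint64_t dst = 0;
    unsigned int n;

    assert(sz > 0 && nbits <= 64);
    for (n = 0; n < nbits; n++) {
        if (orig & (UINT64_C(1) << n))
            dst |= UINT64_C(1) << (n % sz);
    }
    return dst;
}
```

    For example, folding bits 12-15 (0xF000) with sz == 8 gives bits 4-7
    (0xF0).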
    
    There are two substantive changes between this patch and its
    predecessor bitmap_relative:
     1) Renamed bitmap_relative() to be bitmap_onto().
     2) Added bitmap_fold().
    
    The essential motivation for bitmap_onto() is to provide a mechanism for
    converting a cpuset-relative CPU or Node mask to an absolute mask.  Cpuset
    relative masks are written as if the current task were in a cpuset whose CPUs
    or Nodes were just the consecutive ones numbered 0..N-1, for some N.  The
    bitmap_onto() operator is provided in anticipation of adding support for the
    first such cpuset relative mask, by the mbind() and set_mempolicy() system
    calls, using a planned flag of MPOL_F_RELATIVE_NODES.  These bitmap operators
    (and their nodemask wrappers, in particular) will be used in code that
    converts the user specified cpuset relative memory policy to a specific system
    node numbered policy, given the current mems_allowed of the task's cpuset.
    
    Such cpuset relative mempolicies will address two deficiencies
    of the existing interface between cpusets and mempolicies:
     1) A task cannot at present reliably establish a cpuset
        relative mempolicy because there is an essential race
        condition, in that the task's cpuset may be changed in
        between the time the task can query its cpuset placement,
        and the time the task can issue the applicable mbind or
        set_mempolicy system call.
     2) A task cannot at present establish what cpuset relative
        mempolicy it would like to have, if it is in a smaller
        cpuset than it might have mempolicy preferences for,
        because the existing interface only allows specifying
        mempolicies for nodes currently allowed by the cpuset.
    
    Cpuset relative mempolicies are useful for tasks that don't distinguish
    particularly between one CPU or Node and another, but only between how many of
    each are allowed, and the proper placement of threads and memory pages on the
    various CPUs and Nodes available.
    
    The motivation for the added bitmap_fold() can be seen in the following
    example.
    
    Let's say an application has specified some mempolicies that presume 16 memory
    nodes, including say a mempolicy that specified MPOL_F_RELATIVE_NODES (cpuset
    relative) nodes 12-15.  Then let's say that application is crammed into a
    cpuset that only has 8 memory nodes, 0-7.  If one just uses bitmap_onto(),
    this mempolicy, mapped to that cpuset, would ignore the requested relative
    nodes above 7, leaving it empty of nodes.  That's not good; better to fold the
    higher nodes down, so that some nodes are included in the resulting mapped
    mempolicy.  In this case, the mempolicy nodes 12-15 are taken modulo 8 (the
    weight of the mems_allowed of the confining cpuset), resulting in a mempolicy
    specifying nodes 4-7.
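
    The example can be worked through in a small user-space sketch.  It
    assumes the fold is applied first, using the weight of mems_allowed as
    sz, and the folded mask is then mapped onto mems_allowed.  All names
    here (weight64(), fold64(), onto64(), map_relative_nodes()) are
    hypothetical helpers for the illustration, not the kernel interface.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of set bits in a 64-bit mask. */
static unsigned int weight64(uint64_t mask)
{
    unsigned int w = 0;

    for (; mask; mask &= mask - 1)
        w++;
    return w;
}

/* Hypothetical helper: bit (n mod sz) is set for every set bit n. */
static uint64_t fold64(uint64_t orig, unsigned int sz)
{
    uint64_t dst = 0;
    unsigned int n;

    assert(sz > 0);
    for (n = 0; n < 64; n++)
        if (orig & (UINT64_C(1) << n))
            dst |= UINT64_C(1) << (n % sz);
    return dst;
}

/* Hypothetical helper: map bit n of orig to the n-th set bit of relmap. */
static uint64_t onto64(uint64_t orig, uint64_t relmap)
{
    uint64_t dst = 0;
    unsigned int n = 0, m;

    for (m = 0; m < 64; m++) {
        if (!(relmap & (UINT64_C(1) << m)))
            continue;
        if (orig & (UINT64_C(1) << n))
            dst |= UINT64_C(1) << m;
        n++;
    }
    return dst;
}

/*
 * Sketch of converting a cpuset relative node mask to an absolute one:
 * fold by the weight of mems_allowed, then map onto mems_allowed.
 */
static uint64_t map_relative_nodes(uint64_t relative, uint64_t mems_allowed)
{
    unsigned int w = weight64(mems_allowed);

    if (w == 0)
        return 0;   /* no allowed nodes, nothing to map onto */
    return onto64(fold64(relative, w), mems_allowed);
}
```

    With relative nodes 12-15 (0xF000) and mems_allowed 0-7 (0xFF),
    onto64() alone returns an empty mask, while map_relative_nodes()
    returns nodes 4-7 (0xF0).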
    
    Signed-off-by: Paul Jackson <pj@sgi.com>
    Signed-off-by: David Rientjes <rientjes@google.com>
    Cc: Christoph Lameter <clameter@sgi.com>
    Cc: Andi Kleen <ak@suse.de>
    Cc: Mel Gorman <mel@csn.ul.ie>
    Cc: Lee Schermerhorn <lee.schermerhorn@hp.com>
    Cc: <kosaki.motohiro@jp.fujitsu.com>
    Cc: <ray-lk@madrabbit.org>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>