    blkcg: shoot down blkio_groups on elevator switch · 72e06c25
    Tejun Heo authored
    Elevator switch may involve changes to blkcg policies.  Implement
    shoot down of blkio_groups.
    Combined with the previous bypass updates, the end goal is updating
    blkcg core such that it can ensure that the blkcgs being affected become
    quiescent and don't have any per-blkg data hanging around before
    commencing any policy updates.  Until queues are made aware of the
    policies that apply to them, as an interim step, all per-policy blkg
    data will be shot down.
    * blk-throtl doesn't need this change as it can't be disabled for a
      live queue; however, update it anyway as the scheduled blkg
      unification requires this behavior change.  This means that
      blk-throtl configuration will be unnecessarily lost over elevator
      switch.  This oddity will be removed after blkcg learns to associate
      individual policies with request_queues.
    * blk-throtl doesn't shoot down root_tg.  This is to ease transition.
      Unified blkg will always have persistent root group and not shooting
      down root_tg for now eases transition to that point by avoiding
      having to update td->root_tg and is safe as blk-throtl can never be
      disabled.
    -v2: Vivek pointed out that group list is not guaranteed to be empty
         on return from clear function if it raced cgroup removal and
         lost.  Fix it by waiting a bit and retrying.  This kludge will
         soon be removed once locking is updated such that blkg is never
         in limbo state between blkcg and request_queue locks.
         blk-throtl no longer shoots down root_tg to avoid breaking
         Also, nest queue_lock inside blkio_list_lock, not the other way
         around, to avoid introducing a possible deadlock via blkcg lock.
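
The v2 retry and lock nesting can be pictured with a small userspace model. This is only a sketch under stated assumptions, not the code from this commit: `struct fake_queue`, `fake_clear_queue()`, `fake_backoff()` and `destroy_all_groups()` are hypothetical names, the two locks are modeled as plain counters, and the "wait a bit" step is reduced to counting retries.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct fake_queue {
    int racing_groups;    /* groups still held by a concurrent cgroup removal */
    int list_lock_depth;  /* models blkio_list_lock (outer lock) */
    int queue_lock_depth; /* models queue_lock (inner lock) */
    int retries;          /* how many times we backed off and retried */
};

/* Hypothetical per-policy hook: returns true once the group list is empty. */
typedef bool (*clear_queue_fn)(struct fake_queue *q);

static bool fake_clear_queue(struct fake_queue *q)
{
    /* Expect both locks held, with queue_lock nested inside the list lock. */
    assert(q->list_lock_depth == 1 && q->queue_lock_depth == 1);

    if (q->racing_groups > 0) {
        q->racing_groups--; /* the concurrent removal makes forward progress */
        return false;       /* list was not empty on return: lost the race */
    }
    return true;
}

/* Stands in for "waiting a bit"; a real implementation would sleep. */
static void fake_backoff(struct fake_queue *q)
{
    q->retries++;
}

static void destroy_all_groups(struct fake_queue *q,
                               clear_queue_fn *pols, size_t npols)
{
    for (;;) {
        bool done = true;

        q->list_lock_depth++;  /* take the outer lock first */
        q->queue_lock_depth++; /* then the inner queue lock */

        for (size_t i = 0; i < npols; i++)
            if (!pols[i](q))
                done = false;

        q->queue_lock_depth--; /* release in reverse order */
        q->list_lock_depth--;

        if (done)
            break;
        fake_backoff(q);       /* both locks dropped before waiting */
    }
}
```

The point of the model is that both locks are released before backing off, so the racing removal can finish, and that the loop only exits once every policy reports an empty group list.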
    -v3: blkcg_clear_queue() repositioned and renamed to
         blkg_destroy_all() to increase consistency with later changes.
         cfq_clear_queue() updated to check q->elevator before
         dereferencing it to avoid NULL dereference on not fully
         initialized queues (used by later change).
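
The q->elevator check described in v3 follows a common guard pattern. The sketch below is an illustration only, assuming hypothetical types `struct sketch_elevator` and `struct sketch_queue` and a hypothetical `clear_queue_checked()`; it is not the body of cfq_clear_queue().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; the real kernel structures differ. */
struct sketch_elevator {
    int ngroups; /* per-policy groups attached through this elevator */
};

struct sketch_queue {
    struct sketch_elevator *elevator; /* NULL until the queue is fully set up */
};

/* Returns true when no groups remain for this queue. */
static bool clear_queue_checked(struct sketch_queue *q)
{
    /* A not fully initialized queue has no elevator and nothing to clear. */
    if (!q->elevator)
        return true;

    q->elevator->ngroups = 0; /* shoot down all groups */
    return true;
}
```

Checking the pointer first lets the same clear path be called on queues that have not finished initialization, which is the case the v3 note says a later change relies on.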
    Signed-off-by: Tejun Heo <tj@kernel.org>
    Cc: Vivek Goyal <vgoyal@redhat.com>
    Signed-off-by: Jens Axboe <axboe@kernel.dk>