• Tejun Heo's avatar
    blkcg: fix use-after-free in __blkg_release_rcu() by making blkcg_gq refcnt an atomic_t · a5049a8a
    Tejun Heo authored
    So, this patch should do.  Joe, Vivek, can one of you guys please
    verify that the oops goes away with this patch?
    Jens, the original thread can be read at
    The fix converts blkg->refcnt from int to atomic_t.  It does some
    overhead but it should be minute compared to everything else which is
    going on and the involved cacheline bouncing, so I think it's highly
    unlikely to cause any noticeable difference.  Also, the refcnt in
    question should be converted to a perpcu_ref for blk-mq anyway, so the
    atomic_t is likely to go away pretty soon anyway.
    ------- 8< -------
    __blkg_release_rcu() may be invoked after the associated request_queue
    is released with a RCU grace period inbetween.  As such, the function
    and callbacks invoked from it must not dereference the associated
    request_queue.  This is clearly indicated in the comment above the
    Unfortunately, while trying to fix a different issue, 2a4fd070
    ("blkcg: move bulk of blkcg_gq release operations to the RCU
    callback") ignored this and added [un]locking of @blkg->q->queue_lock
    to __blkg_release_rcu().  This of course can cause oops as the
    request_queue may be long gone by the time this code gets executed.
      general protection fault: 0000 [#1] SMP
      CPU: 21 PID: 30 Comm: rcuos/21 Not tainted 3.15.0 #1
      Hardware name: Stratus ftServer 6400/G7LAZ, BIOS BIOS Version 6.3:57 12/25/2013
      task: ffff880854021de0 ti: ffff88085403c000 task.ti: ffff88085403c000
      RIP: 0010:[<ffffffff8162e9e5>]  [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
      RSP: 0018:ffff88085403fdf0  EFLAGS: 00010086
      RAX: 0000000000020000 RBX: 0000000000000010 RCX: 0000000000000000
      RDX: 000060ef80008248 RSI: 0000000000000286 RDI: 6b6b6b6b6b6b6b6b
      RBP: ffff88085403fdf0 R08: 0000000000000286 R09: 0000000000009f39
      R10: 0000000000020001 R11: 0000000000020001 R12: ffff88103c17a130
      R13: ffff88103c17a080 R14: 0000000000000000 R15: 0000000000000000
      FS:  0000000000000000(0000) GS:ffff88107fca0000(0000) knlGS:0000000000000000
      CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      CR2: 00000000006e5ab8 CR3: 000000000193d000 CR4: 00000000000407e0
       ffff88085403fe18 ffffffff812cbfc2 ffff88103c17a130 0000000000000000
       ffff88103c17a130 ffff88085403fec0 ffffffff810d1d28 ffff880854021de0
       ffff880854021de0 ffff88107fcaec58 ffff88085403fe80 ffff88107fcaec30
      Call Trace:
       [<ffffffff812cbfc2>] __blkg_release_rcu+0x72/0x150
       [<ffffffff810d1d28>] rcu_nocb_kthread+0x1e8/0x300
       [<ffffffff81091d81>] kthread+0xe1/0x100
       [<ffffffff8163813c>] ret_from_fork+0x7c/0xb0
      Code: ff 47 04 48 8b 7d 08 be 00 02 00 00 e8 55 48 a4 ff 5d c3 0f 1f 00 66 66 66 66 90 55 48 89 e5
      +fa 66 66 90 66 66 90 b8 00 00 02 00 <f0> 0f c1 07 89 c2 c1 ea 10 66 39 c2 75 02 5d c3 83 e2 fe 0f
      RIP  [<ffffffff8162e9e5>] _raw_spin_lock_irq+0x15/0x60
       RSP <ffff88085403fdf0>
    The request_queue locking was added because blkcg_gq->refcnt is an int
    protected with the queue lock and __blkg_release_rcu() needs to put
    the parent.  Let's fix it by making blkcg_gq->refcnt an atomic_t and
    dropping queue locking in the function.
    Given the general heavy weight of the current request_queue and blkcg
    operations, this is unlikely to cause any noticeable overhead.
    Moreover, blkcg_gq->refcnt is likely to be converted to percpu_ref in
    the near future, so whatever (most likely negligible) overhead it may
    add is temporary.
    Signed-off-by: default avatarTejun Heo <tj@kernel.org>
    Reported-by: default avatarJoe Lawrence <joe.lawrence@stratus.com>
    Acked-by: default avatarVivek Goyal <vgoyal@redhat.com>
    Link: http://lkml.kernel.org/g/alpine.DEB.2.02.1406081816540.17948@jlaw-desktop.mno.stratus.com
    Cc: stable@vger.kernel.org
    Signed-off-by: default avatarJens Axboe <axboe@fb.com>
blk-cgroup.h 16.7 KB