1. 09 Nov, 2016 1 commit
  2. 02 Nov, 2016 1 commit
  3. 01 Nov, 2016 1 commit
  4. 28 Oct, 2016 2 commits
    • block: better op and flags encoding · ef295ecf
      Christoph Hellwig authored
      
      
      Now that we don't need the common flags to overflow outside the range
      of a 32-bit type we can encode them the same way for both the bio and
      request fields.  This in addition allows us to place the operation
      first (and make some room for more ops while we're at it) and to
      stop having to shift around the operation values.
      
      In addition this allows passing around only one value in the block layer
      instead of two (and eventually also in the file systems, but we can do
      that later) and thus clean up a lot of code.
      
      Last but not least this allows decreasing the size of the cmd_flags
      field in struct request to 32-bits.  Various functions passing this
      value could also be updated, but I'd like to avoid the churn for now.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      ef295ecf
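
The encoding described in this commit message can be sketched as follows. This is a minimal illustration, not the kernel's code: every `ex_`/`EX_` name is hypothetical, and the 8-bit operation / 24-bit flags split is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout: the operation sits in the low bits, flags above it. */
#define EX_OP_BITS 8
#define EX_OP_MASK ((1u << EX_OP_BITS) - 1)

enum ex_op { EX_OP_READ = 0, EX_OP_WRITE = 1, EX_OP_FLUSH = 2 };

/* Flags start right above the operation bits, so no shifting of the op is needed. */
#define EX_FLAG_SYNC     (1u << (EX_OP_BITS + 0))
#define EX_FLAG_FUA      (1u << (EX_OP_BITS + 1))
#define EX_FLAG_PREFLUSH (1u << (EX_OP_BITS + 2))

static_assert(EX_OP_FLUSH <= EX_OP_MASK, "operation must fit in the low bits");

/* Combine an operation and its flags into a single 32-bit value. */
static inline uint32_t ex_encode(enum ex_op op, uint32_t flags)
{
    return (uint32_t)op | (flags & ~EX_OP_MASK);
}

/* Recover the operation: a mask, with no shift. */
static inline enum ex_op ex_op_of(uint32_t v)
{
    return (enum ex_op)(v & EX_OP_MASK);
}

/* Recover the flags: everything above the operation bits. */
static inline uint32_t ex_flags_of(uint32_t v)
{
    return v & ~EX_OP_MASK;
}
```

With a layout like this, a caller can pass one combined value such as `ex_encode(EX_OP_WRITE, EX_FLAG_SYNC | EX_FLAG_FUA)` instead of an operation and a flags word separately.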
    • block: split out request-only flags into a new namespace · e8064021
      Christoph Hellwig authored
      
      
      A lot of the REQ_* flags are only used on struct requests, and only of
      use to the block layer and a few drivers that dig into struct request
      internals.
      
      This patch adds a new req_flags_t rq_flags field to struct request for
      them, and thus dramatically shrinks the number of common request flags.
      It also removes the unfortunate situation where we have to fit the
      fields from the same enum into 32 bits for struct bio and 64 bits for
      struct request.
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Shaun Tancheff <shaun.tancheff@seagate.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      e8064021
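
The split described above can be pictured roughly as below. This is a sketch under assumptions: the `ex_`/`EX_` names, the field widths, and the specific flag bits are hypothetical stand-ins, and only the idea of a separate `rq_flags` field of its own type comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for req_flags_t: a separate type for request-only flags. */
typedef uint32_t ex_req_flags_t;

static_assert(sizeof(ex_req_flags_t) == 4, "sketch assumes 32-bit request flags");

/* Request-only flags live in their own namespace and their own bit space. */
#define EX_RQF_STARTED ((ex_req_flags_t)(1u << 0))
#define EX_RQF_QUEUED  ((ex_req_flags_t)(1u << 1))

/* A common flag shared by bios and requests (bit position is an assumption). */
#define EX_REQ_SYNC (1u << 8)

struct ex_bio {
    uint32_t opf;             /* common op and flags */
};

struct ex_request {
    uint32_t cmd_flags;       /* common flags, same meaning as in the bio */
    ex_req_flags_t rq_flags;  /* request-only state, never seen by bios */
};

/* Common flags are copied over unchanged; request-only state starts empty. */
static inline void ex_init_request(struct ex_request *rq, const struct ex_bio *bio)
{
    rq->cmd_flags = bio->opf;
    rq->rq_flags = 0;
}

/* Request-only state changes touch rq_flags and leave cmd_flags alone. */
static inline void ex_mark_started(struct ex_request *rq)
{
    rq->rq_flags |= EX_RQF_STARTED;
}
```

Because the two sets of flags no longer share one enum, the common set only has to fit the narrower field.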
  5. 15 Sep, 2016 1 commit
  6. 07 Jun, 2016 4 commits
  7. 13 Apr, 2016 1 commit
  8. 25 Nov, 2015 2 commits
    • Revert "blk-flush: Queue through IO scheduler when flush not required" · d7cf931d
      Jens Axboe authored
      This reverts commit 1b2ff19e.
      
      Jan writes:
      
      --
      
      Thanks for report! After some investigation I found out we allocate
      elevator specific data in __get_request() only for non-flush requests. And
      this is actually required since the flush machinery uses the space in
      struct request for something else. Doh. So my patch is just wrong and not
      easy to fix since at the time __get_request() is called we are not sure
      whether the flush machinery will be used in the end. Jens, please revert
      1b2ff19e. Thanks!
      
      I'm somewhat surprised that you can reliably hit the race where flushing
      gets disabled for the device just while the request is in flight. But I
      guess during boot it makes some sense.
      
      --
      
      So let's just revert it, we can fix the queue run manually after the
      fact. This race is rare enough that it didn't trigger in testing, it
      requires the specific disable-while-in-flight scenario to trigger.
      d7cf931d
    • Revert "blk-flush: Queue through IO scheduler when flush not required" · dcd8376c
      Jens Axboe authored
      This reverts commit 1b2ff19e.
      
      Jan writes:
      
      --
      
      Thanks for report! After some investigation I found out we allocate
      elevator specific data in __get_request() only for non-flush requests. And
      this is actually required since the flush machinery uses the space in
      struct request for something else. Doh. So my patch is just wrong and not
      easy to fix since at the time __get_request() is called we are not sure
      whether the flush machinery will be used in the end. Jens, please revert
      1b2ff19e. Thanks!
      
      I'm somewhat surprised that you can reliably hit the race where flushing
      gets disabled for the device just while the request is in flight. But I
      guess during boot it makes some sense.
      
      --
      
      So let's just revert it, we can fix the queue run manually after the
      fact. This race is rare enough that it didn't trigger in testing, it
      requires the specific disable-while-in-flight scenario to trigger.
      dcd8376c
  9. 16 Nov, 2015 1 commit
    • blk-flush: Queue through IO scheduler when flush not required · 1b2ff19e
      Jan Kara authored
      
      
      Currently blk_insert_flush() just adds flush request to q->queue_head
      when flush is not required. That completely bypasses IO scheduler so
      e.g. CFQ can be idling waiting for new request to arrive and will idle
      through the whole window unnecessarily. Luckily this only happens in
      rare cases as usually checks in generic_make_request_checks() clear
      FLUSH and FUA flags early if they are not needed.
      
      When no flushing is actually required, we can easily fix the problem by
      properly queueing the request through the IO scheduler. Ideally IO
      scheduler should be also made aware of requests queued via
      blk_flush_queue_rq(). However inserting flush request through IO
      scheduler can have unwanted side-effects since due to flush batching
      delaying the flush request in IO scheduler will delay all flush requests
      possibly coming from other processes. So we keep adding the request
      directly to q->queue_head.
      Signed-off-by: Jan Kara <jack@suse.com>
      Reviewed-by: Jeff Moyer <jmoyer@redhat.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      1b2ff19e
  10. 15 Aug, 2015 1 commit
    • blk-mq: fix race between timeout and freeing request · 0048b483
      Ming Lei authored
      
      
      Inside the timeout handler, blk_mq_tag_to_rq() is called
      to retrieve the request from a tag. This is obviously
      wrong because the request can be freed at any time and some
      fields of the request can't be trusted, so a kernel oops
      might be triggered[1].
      
      Currently, wrt. blk_mq_tag_to_rq(), the only special case is
      that the flush request can share the same tag with the request
      it was cloned from, and the two requests can't be active at the same
      time, so this patch fixes the above issue by updating tags->rqs[tag]
      with the active request (either the flush rq or the request it was
      cloned from) of the tag.
      
      Also blk_mq_tag_to_rq() gets much simplified with this patch.
      
      Given blk_mq_tag_to_rq() is mainly for drivers and the caller must
      make sure the request can't be freed, so in bt_for_each() this
      helper is replaced with tags->rqs[tag].
      
      [1] kernel oops log
      [  439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158
      [  439.697162] IP: [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e
      [  439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0
      [  439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
      [  439.700653] Dumping ftrace buffer:
      [  439.700653]    (ftrace buffer empty)
      [  439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw
      [  439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265
      [  439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
      [  439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000
      [  439.730500] RIP: 0010:[<ffffffff812d89ba>]  [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e
      [  439.730500] RSP: 0018:ffff880819203da0  EFLAGS: 00010283
      [  439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002
      [  439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000
      [  439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000
      [  439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202
      [  439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00
      [  439.730500] FS:  00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000
      [  439.730500] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
      [  439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0
      [  439.730500] Stack:
      [  439.730500]  0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104
      [  439.755663]  ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80
      [  439.755663] Call Trace:
      [  439.755663]  <IRQ>
      [  439.755663]  [<ffffffff812dc104>] bt_for_each+0x6e/0xc8
      [  439.755663]  [<ffffffff812d9f5e>] ? blk_mq_rq_timed_out+0x6a/0x6a
      [  439.755663]  [<ffffffff812d9f5e>] ? blk_mq_rq_timed_out+0x6a/0x6a
      [  439.755663]  [<ffffffff812dc1b3>] blk_mq_tag_busy_iter+0x55/0x5e
      [  439.755663]  [<ffffffff812d88b4>] ? blk_mq_bio_to_request+0x38/0x38
      [  439.755663]  [<ffffffff812d8911>] blk_mq_rq_timer+0x5d/0xd4
      [  439.755663]  [<ffffffff810a3e10>] call_timer_fn+0xf7/0x284
      [  439.755663]  [<ffffffff810a3d1e>] ? call_timer_fn+0x5/0x284
      [  439.755663]  [<ffffffff812d88b4>] ? blk_mq_bio_to_request+0x38/0x38
      [  439.755663]  [<ffffffff810a46d6>] run_timer_softirq+0x1ce/0x1f8
      [  439.755663]  [<ffffffff8104c367>] __do_softirq+0x181/0x3a4
      [  439.755663]  [<ffffffff8104c76e>] irq_exit+0x40/0x94
      [  439.755663]  [<ffffffff81031482>] smp_apic_timer_interrupt+0x33/0x3e
      [  439.755663]  [<ffffffff815559a4>] apic_timer_interrupt+0x84/0x90
      [  439.755663]  <EOI>
      [  439.755663]  [<ffffffff81554350>] ? _raw_spin_unlock_irq+0x32/0x4a
      [  439.755663]  [<ffffffff8106a98b>] finish_task_switch+0xe0/0x163
      [  439.755663]  [<ffffffff8106a94d>] ? finish_task_switch+0xa2/0x163
      [  439.755663]  [<ffffffff81550066>] __schedule+0x469/0x6cd
      [  439.755663]  [<ffffffff8155039b>] schedule+0x82/0x9a
      [  439.789267]  [<ffffffff8119b28b>] signalfd_read+0x186/0x49a
      [  439.790911]  [<ffffffff8106d86a>] ? wake_up_q+0x47/0x47
      [  439.790911]  [<ffffffff811618c2>] __vfs_read+0x28/0x9f
      [  439.790911]  [<ffffffff8117a289>] ? __fget_light+0x4d/0x74
      [  439.790911]  [<ffffffff811620a7>] vfs_read+0x7a/0xc6
      [  439.790911]  [<ffffffff8116292b>] SyS_read+0x49/0x7f
      [  439.790911]  [<ffffffff81554c17>] entry_SYSCALL_64_fastpath+0x12/0x6f
      [  439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89
      f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b
      53 38 <48> 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10
      [  439.790911] RIP  [<ffffffff812d89ba>] blk_mq_tag_to_rq+0x21/0x6e
      [  439.790911]  RSP <ffff880819203da0>
      [  439.790911] CR2: 0000000000000158
      [  439.790911] ---[ end trace d40af58949325661 ]---
      
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Ming Lei <ming.lei@canonical.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      0048b483
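
The idea of keeping tags->rqs[tag] pointed at whichever request is currently active for a tag can be sketched as below. This is a simplified model rather than the kernel's implementation: all `ex_`/`EX_` names, the fixed-size table, and the helper functions are hypothetical, and locking and request lifetime are left out.

```c
#include <assert.h>
#include <stddef.h>

#define EX_NR_TAGS 4

static_assert(EX_NR_TAGS > 0, "need at least one tag");

struct ex_request {
    unsigned int tag;
    int is_flush;
};

/* Hypothetical tag table: one slot per tag, holding the active request. */
struct ex_tags {
    struct ex_request *rqs[EX_NR_TAGS];
};

/* Lookup becomes a plain array read: the slot always names the active request. */
static inline struct ex_request *ex_tag_to_rq(struct ex_tags *tags, unsigned int tag)
{
    return tag < EX_NR_TAGS ? tags->rqs[tag] : NULL;
}

/* The flush request borrows the tag of the request it was cloned from
 * and takes over the slot while it is active. */
static inline void ex_flush_start(struct ex_tags *tags,
                                  struct ex_request *flush_rq,
                                  struct ex_request *orig)
{
    flush_rq->tag = orig->tag;
    flush_rq->is_flush = 1;
    tags->rqs[orig->tag] = flush_rq;
}

/* When the flush completes, the slot goes back to the original request. */
static inline void ex_flush_end(struct ex_tags *tags, struct ex_request *orig)
{
    tags->rqs[orig->tag] = orig;
}
```

Since the two requests sharing a tag are never active at the same time, a single slot per tag is enough and the lookup needs no special case for flushes.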
  11. 25 Sep, 2014 9 commits
  12. 22 Sep, 2014 2 commits
  13. 11 Jun, 2014 1 commit
  14. 04 Jun, 2014 1 commit
  15. 30 May, 2014 1 commit
  16. 28 May, 2014 1 commit
  17. 16 Apr, 2014 1 commit
  18. 15 Apr, 2014 2 commits
  19. 09 Apr, 2014 1 commit
  20. 21 Mar, 2014 1 commit
  21. 09 Mar, 2014 1 commit
  22. 21 Feb, 2014 1 commit
  23. 10 Feb, 2014 1 commit
    • blk-mq: rework flush sequencing logic · 18741986
      Christoph Hellwig authored
      
      
      Switch to using a preallocated flush_rq for blk-mq similar to what's done
      with the old request path.  This allows us to set up the request properly
      with a tag from the actually allowed range and ->rq_disk as needed by
      some drivers.  To make life easier we also switch to dynamic allocation
      of ->flush_rq for the old path.
      
      This effectively reverts most of
      
          "blk-mq: fix for flush deadlock"
      
      and
      
          "blk-mq: Don't reserve a tag for flush request"
      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@fb.com>
      18741986
  24. 30 Jan, 2014 1 commit
    • blk-mq: Don't reserve a tag for flush request · f0276924
      Shaohua Li authored
      
      
      Reserving a tag (request) for flush to avoid deadlock is overkill. A
      tag is a valuable resource. We can track the number of flush requests
      and disallow having too many pending flush requests allocated. With this
      patch, blk_mq_alloc_request_pinned() could do a busy nop (but not a dead
      loop) if too many pending requests are allocated and a new flush request
      is allocated. But this should not be a problem, since too many pending
      flush requests is a very rare case.
      
      I verified this can fix the deadlock caused by too many pending flush
      requests.
      Signed-off-by: Shaohua Li <shli@fusionio.com>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
      f0276924
  25. 24 Nov, 2013 1 commit