1. 21 Dec, 2018 1 commit
    • dm thin: send event about thin-pool state change _after_ making it · cd5d8a92
      Mike Snitzer authored
      commit f6c36758 upstream.
      
      Sending a DM event before a thin-pool state change is about to happen is
      a bug.  It wasn't realized until it became clear that userspace response
      to the event raced with the actual state change that the event was
      meant to notify about.
      
      Fix this by first updating internal thin-pool state to reflect what the
      DM event is being issued about.  This fixes a long-standing racy/buggy
      userspace device-mapper-test-suite 'resize_io' test that would get an
      event but not find the state it was looking for -- so it would just go
      on to hang because no other events caused the test to reevaluate the
      thin-pool's state.
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
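The ordering problem in this commit can be illustrated with a toy userspace model. This is a Python sketch, not kernel code: the class `EventPool`, its methods, and the mode strings are hypothetical names chosen for illustration, and the real race is asynchronous rather than a synchronous callback.

```python
class EventPool:
    """Toy stand-in for a thin-pool: a mode plus event subscribers."""

    def __init__(self):
        self.mode = "write"
        self.listeners = []

    def _send_event(self):
        # Each listener samples the mode at the moment the event arrives.
        for listener in self.listeners:
            listener(self.mode)

    def set_mode_event_first(self, new_mode):
        # Old ordering: the event goes out while the old mode is still visible.
        self._send_event()
        self.mode = new_mode

    def set_mode_state_first(self, new_mode):
        # Fixed ordering: update internal state, then notify.
        self.mode = new_mode
        self._send_event()


def observed_mode(setter_name):
    """Return the mode a listener sees when the named setter fires its event."""
    pool = EventPool()
    seen = []
    pool.listeners.append(seen.append)
    getattr(pool, setter_name)("out_of_data_space")
    return seen[0]
```

With the event sent first, the listener still reads the old mode and has no later event to prompt a re-check, which matches the hang described for the 'resize_io' test.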
  2. 10 Oct, 2018 1 commit
    • dm thin metadata: try to avoid ever aborting transactions · 1484d4ff
      Joe Thornber authored
      [ Upstream commit 3ab91828 ]
      
      Committing a transaction can consume some metadata of its own, so we now
      reserve a small amount of metadata to cover this.  Free metadata
      reported by the kernel will not include this reserve.
      
      If any of the reserve has been used after a commit we enter a new
      internal state PM_OUT_OF_METADATA_SPACE.  This is reported as
      PM_READ_ONLY, so no userland changes are needed.  If the metadata
      device is resized the pool will move back to PM_WRITE.
      
      These changes mean we never need to abort and rollback a transaction due
      to running out of metadata space.  This is particularly important
      because there have been a handful of reports of data corruption against
      DM thin-provisioning that can all be attributed to the thin-pool having
      run out of metadata space.

      Signed-off-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Sasha Levin <alexander.levin@microsoft.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
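A minimal sketch of the reserve accounting described in this commit, as a toy Python model rather than the kernel implementation. `RESERVE_BLOCKS`, the class `MetadataSpace`, and the exact condition used on resize are assumptions made for illustration; only the mode names come from the commit message.

```python
RESERVE_BLOCKS = 4  # hypothetical reserve size, chosen only for illustration


class MetadataSpace:
    def __init__(self, total_blocks):
        self.total = total_blocks
        self.used = 0
        self.mode = "PM_WRITE"

    def reported_free(self):
        # Free space shown to userspace excludes the reserve.
        return max(self.total - self.used - RESERVE_BLOCKS, 0)

    def commit(self, blocks_consumed):
        # A commit may itself consume metadata; the reserve absorbs that.
        self.used += blocks_consumed
        if self.total - self.used < RESERVE_BLOCKS:
            # Part of the reserve was used: change state instead of rolling back.
            self.mode = "PM_OUT_OF_METADATA_SPACE"

    def reported_mode(self):
        # The new internal state is presented to userland as read-only.
        if self.mode == "PM_OUT_OF_METADATA_SPACE":
            return "PM_READ_ONLY"
        return self.mode

    def resize(self, new_total):
        # Assumption: growing the device past the reserve restores write mode.
        self.total = new_total
        if self.total - self.used >= RESERVE_BLOCKS:
            self.mode = "PM_WRITE"
```

The point of the model is that the commit itself always has room to finish, so the pool changes state after the commit instead of aborting the transaction.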
  3. 09 Sep, 2018 1 commit
    • dm thin: stop no_space_timeout worker when switching to write-mode · 3bef8825
      Hou Tao authored
      commit 75294442 upstream.
      
      Now both check_for_space() and do_no_space_timeout() will read & write
      pool->pf.error_if_no_space.  If these functions run concurrently, as
      shown in the following case, the default setting of "queue_if_no_space"
      can get lost.
      
      precondition:
          * error_if_no_space = false (aka "queue_if_no_space")
          * pool is in Out-of-Data-Space (OODS) mode
          * no_space_timeout worker has been queued
      
      CPU 0:                          CPU 1:
      // delete a thin device
      process_delete_mesg()
      // check_for_space() invoked by commit()
      set_pool_mode(pool, PM_WRITE)
          pool->pf.error_if_no_space = \
           pt->requested_pf.error_if_no_space
      
      				// timeout, pool is still in OODS mode
      				do_no_space_timeout
      				    // "queue_if_no_space" config is lost
      				    pool->pf.error_if_no_space = true
          pool->pf.mode = new_mode
      
      Fix it by stopping no_space_timeout worker when switching to write mode.
      
      Fixes: bcc696fa ("dm thin: stay in out-of-data-space mode once no_space_timeout expires")
      Cc: stable@vger.kernel.org
      Signed-off-by: Hou Tao <houtao1@huawei.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
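The interleaving in the CPU 0 / CPU 1 diagram of this commit can be replayed deterministically in a toy Python model. This is a sketch under stated assumptions: `TimeoutPool`, the `interleave` hook, and the `timeout_queued` flag are hypothetical stand-ins, not the kernel's data structures.

```python
class TimeoutPool:
    def __init__(self, requested_error_if_no_space=False):
        self.requested_error_if_no_space = requested_error_if_no_space
        self.error_if_no_space = requested_error_if_no_space
        self.mode = "OODS"
        self.timeout_queued = True

    def do_no_space_timeout(self):
        # Timeout worker: the pool still looks out of space, so start erroring IO.
        if self.timeout_queued and self.mode == "OODS":
            self.error_if_no_space = True
        self.timeout_queued = False

    def set_write_mode(self, cancel_worker, interleave=None):
        if cancel_worker:
            # The fix: stop the worker before touching the shared flag.
            self.timeout_queued = False
        self.error_if_no_space = self.requested_error_if_no_space
        if interleave is not None:
            interleave()  # the point where CPU 1 runs in the diagram
        self.mode = "WRITE"
```

Without cancelling the worker, the pool ends up in write mode with `error_if_no_space` set to true, so the requested "queue_if_no_space" behaviour is lost.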
  4. 03 Jul, 2018 1 commit
    • dm thin: handle running out of data space vs concurrent discard · 0b19825f
      Mike Snitzer authored
      commit a685557f upstream.
      
      Discards issued to a DM thin device can complete to userspace (via
      fstrim) _before_ the metadata changes associated with the discards are
      reflected in the thinp superblock (e.g. free blocks).  As such, if a
      user constructs a test that loops repeatedly over these steps, block
      allocation can fail due to discards not having completed yet:
      1) fill thin device via filesystem file
      2) remove file
      3) fstrim
      
      From initial report, here:
      https://www.redhat.com/archives/dm-devel/2018-April/msg00022.html
      
      
      
      "The root cause of this issue is that dm-thin will first remove
      mapping and increase corresponding blocks' reference count to prevent
      them from being reused before DISCARD bios get processed by the
      underlying layers. However, increasing blocks' reference count could
      also increase the nr_allocated_this_transaction in struct sm_disk
      which makes smd->old_ll.nr_allocated +
      smd->nr_allocated_this_transaction bigger than smd->old_ll.nr_blocks.
      In this case, alloc_data_block() will never commit metadata to reset
      the begin pointer of struct sm_disk, because sm_disk_get_nr_free()
      always return an underflow value."
      
      While there is room for improvement to the space-map accounting that
      thinp is making use of, the reality is this test is inherently racy and
      will result in the previous iteration's fstrim discard(s) completing
      concurrently with block allocation, via dd, in the next iteration of the
      loop.
      
      No amount of space map accounting improvements will be able to allow
      users to use a block before a discard of that block has completed.
      
      So the best we can really do is allow DM thinp to gracefully handle such
      aggressive use of all the pool's data by degrading the pool into
      out-of-data-space (OODS) mode.  We _should_ get that behaviour already
      (if space map accounting didn't falsely cause alloc_data_block() to
      believe free space was available), but short of that we handle the
      current reality that dm_pool_alloc_data_block() can return -ENOSPC.

      Reported-by: Dennis Yang <dennisyang@qnap.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
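The underflow mentioned in the quoted report can be shown with a small worked example. This Python sketch assumes 64-bit unsigned block counts and a free count computed as the total minus both allocation counters; the function name `nr_free_unsigned` and the sample numbers are hypothetical.

```python
U64_MASK = (1 << 64) - 1  # emulate unsigned 64-bit wraparound


def nr_free_unsigned(nr_blocks, nr_allocated, nr_allocated_this_transaction):
    """Free-block count as an unsigned 64-bit subtraction would produce it."""
    return (nr_blocks - nr_allocated - nr_allocated_this_transaction) & U64_MASK
```

Once the two allocation counters together exceed the block count, the result wraps around to a huge value, so an allocator consulting it would believe free space is available.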
  5. 20 Dec, 2017 1 commit
    • dm: fix various targets to dm_register_target after module __init resources created · fbce429b
      monty_pavel@sina.com authored
      commit 7e6358d2 upstream.
      
      A NULL pointer dereference is seen if two concurrent "vgchange -ay -K <vg name>"
      processes race to load the dm-thin-pool module:
      
       PID: 25992 TASK: ffff883cd7d23500 CPU: 4 COMMAND: "vgchange"
        #0 [ffff883cd743d600] machine_kexec at ffffffff81038fa9
        #1 [ffff883cd743d660] crash_kexec at ffffffff810c5992
        #2 [ffff883cd743d730] oops_end at ffffffff81515c90
        #3 [ffff883cd743d760] no_context at ffffffff81049f1b
        #4 [ffff883cd743d7b0] __bad_area_nosemaphore at ffffffff8104a1a5
        #5 [ffff883cd743d800] bad_area at ffffffff8104a2ce
        #6 [ffff883cd743d830] __do_page_fault at ffffffff8104aa6f
        #7 [ffff883cd743d950] do_page_fault at ffffffff81517bae
        #8 [ffff883cd743d980] page_fault at ffffffff81514f95
           [exception RIP: kmem_cache_alloc+108]
           RIP: ffffffff8116ef3c RSP: ffff883cd743da38 RFLAGS: 00010046
           RAX: 0000000000000004 RBX: ffffffff81121b90 RCX: ffff881bf1e78cc0
           RDX: 0000000000000000 RSI: 00000000000000d0 RDI: 0000000000000000
           RBP: ffff883cd743da68 R8: ffff881bf1a4eb00 R9: 0000000080042000
           R10: 0000000000002000 R11: 0000000000000000 R12: 00000000000000d0
           R13: 0000000000000000 R14: 00000000000000d0 R15: 0000000000000246
           ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
        #9 [ffff883cd743da70] mempool_alloc_slab at ffffffff81121ba5
       #10 [ffff883cd743da80] mempool_create_node at ffffffff81122083
       #11 [ffff883cd743dad0] mempool_create at ffffffff811220f4
       #12 [ffff883cd743dae0] pool_ctr at ffffffffa08de049 [dm_thin_pool]
       #13 [ffff883cd743dbd0] dm_table_add_target at ffffffffa0005f2f [dm_mod]
       #14 [ffff883cd743dc30] table_load at ffffffffa0008ba9 [dm_mod]
       #15 [ffff883cd743dc90] ctl_ioctl at ffffffffa0009dc4 [dm_mod]
      
      The race results in a NULL pointer dereference because:
      
      Process A (vgchange -ay -K):
       	a. send DM_LIST_VERSIONS_CMD ioctl;
       	b. pool_target not registered;
       	c. modprobe dm_thin_pool and wait until end.
      
      Process B (vgchange -ay -K):
       	a. send DM_LIST_VERSIONS_CMD ioctl;
       	b. pool_target registered;
       	c. table_load->dm_table_add_target->pool_ctr;
       	d. _new_mapping_cache is NULL, so the kernel panics.
      Note:
       	1. process A and process B are two concurrent processes.
       	2. pool_target can be detected by process B but
       	_new_mapping_cache initialization has not ended.
      
      To fix dm-thin-pool, and other targets (cache, multipath, and snapshot)
      with the same problem, simply call dm_register_target() after all resources
      created during module init (as labelled with __init) are finished.

      Signed-off-by: monty <monty_pavel@sina.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
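The init-ordering race in this commit can be sketched as a toy Python model. It is not the kernel code: `TargetRegistry`, `ThinPoolModule`, `try_table_load`, and `module_init` are hypothetical names, and the second process is simulated by a call placed between the two init steps.

```python
class TargetRegistry:
    def __init__(self):
        self.targets = {}

    def register(self, name, ctr):
        self.targets[name] = ctr


class ThinPoolModule:
    def __init__(self):
        self.new_mapping_cache = None  # stands in for _new_mapping_cache

    def create_resources(self):
        self.new_mapping_cache = object()

    def pool_ctr(self):
        if self.new_mapping_cache is None:
            raise RuntimeError("_new_mapping_cache is NULL")
        return "pool created"


def try_table_load(registry):
    """What process B sees when it attempts a table load right now."""
    ctr = registry.targets.get("thin-pool")
    if ctr is None:
        return "target not registered yet"
    try:
        return ctr()
    except RuntimeError as err:
        return f"crash: {err}"


def module_init(register_first):
    """Run both init steps, letting process B attempt a load in between."""
    registry = TargetRegistry()
    module = ThinPoolModule()

    def register():
        registry.register("thin-pool", module.pool_ctr)

    steps = [register, module.create_resources]
    if not register_first:
        steps.reverse()
    steps[0]()
    seen_by_process_b = try_table_load(registry)
    steps[1]()
    return seen_by_process_b, try_table_load(registry)
```

Registering last means a concurrent loader either cannot find the target yet or finds it fully initialised; it never observes the half-initialised state.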
  6. 28 Aug, 2017 1 commit
  7. 23 Aug, 2017 1 commit
    • block: replace bi_bdev with a gendisk pointer and partitions index · 74d46992
      Christoph Hellwig authored
      
      
      This way we don't need a block_device structure to submit I/O.  The
      block_device has different lifetime rules from the gendisk and
      request_queue and is usually only available when the block device node
      is open.  Other callers need to explicitly create one (e.g. the lightnvm
      passthrough code, or the new nvme multipathing code).
      
      For the actual I/O path all that we need is the gendisk, which exists
      once per block device.  But given that the block layer also does
      partition remapping we additionally need a partition index, which is
      used for said remapping in generic_make_request.
      
      Note that all the block drivers generally want request_queue or
      sometimes the gendisk, so this removes a layer of indirection all
      over the stack.

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Signed-off-by: Jens Axboe <axboe@kernel.dk>
  8. 27 Jun, 2017 1 commit
    • dm thin: do not queue freed thin mapping for next stage processing · 00a0ea33
      Vallish Vaidyeshwara authored
      
      
      process_prepared_discard_passdown_pt1() should clean up
      dm_thin_new_mapping in cases of error.
      
      dm_pool_inc_data_range() can fail trying to get a block reference:
      
      metadata operation 'dm_pool_inc_data_range' failed: error = -61
      
      When dm_pool_inc_data_range() fails, dm thin aborts the current metadata
      transaction and marks the pool as PM_READ_ONLY. Memory for the thin mapping
      is released as well. However, the current thin mapping will still be queued
      onto the next stage as part of queue_passdown_pt2() or passdown_endio().
      When this dangling thin mapping memory is processed and accessed in the
      next stage, device mapper crashes.
      
      Code flow without fix:
      -> process_prepared_discard_passdown_pt1(m)
         -> dm_thin_remove_range()
         -> discard passdown
            --> passdown_endio(m) queues m onto next stage
         -> dm_pool_inc_data_range() fails, frees memory m
                  but does not remove it from next stage queue
      
      -> process_prepared_discard_passdown_pt2(m)
         -> processes freed memory m and crashes
      
      One such stack:
      
      Call Trace:
      [<ffffffffa037a46f>] dm_cell_release_no_holder+0x2f/0x70 [dm_bio_prison]
      [<ffffffffa039b6dc>] cell_defer_no_holder+0x3c/0x80 [dm_thin_pool]
      [<ffffffffa039b88b>] process_prepared_discard_passdown_pt2+0x4b/0x90 [dm_thin_pool]
      [<ffffffffa0399611>] process_prepared+0x81/0xa0 [dm_thin_pool]
      [<ffffffffa039e735>] do_worker+0xc5/0x820 [dm_thin_pool]
      [<ffffffff8152bf54>] ? __schedule+0x244/0x680
      [<ffffffff81087e72>] ? pwq_activate_delayed_work+0x42/0xb0
      [<ffffffff81089f53>] process_one_work+0x153/0x3f0
      [<ffffffff8108a71b>] worker_thread+0x12b/0x4b0
      [<ffffffff8108a5f0>] ? rescuer_thread+0x350/0x350
      [<ffffffff8108fd6a>] kthread+0xca/0xe0
      [<ffffffff8108fca0>] ? kthread_park+0x60/0x60
      [<ffffffff81530b45>] ret_from_fork+0x25/0x30
      
      The fix is to first take the block ref count for the discarded block and
      only then pass the discard down. If taking the ref count fails, bail out:
      abort the current metadata transaction, mark the pool as PM_READ_ONLY and
      free the current thin mapping memory (existing error handling code)
      without queueing this thin mapping onto the next stage of processing. If
      taking the ref count succeeds, pass the discard of this block down; the
      passdown_endio() discard callback will then queue this thin mapping onto
      the next stage of processing.
      
      Code flow with fix:
      -> process_prepared_discard_passdown_pt1(m)
         -> dm_thin_remove_range()
         -> dm_pool_inc_data_range()
            --> if fails, free memory m and bail out
         -> discard passdown
            --> passdown_endio(m) queues m onto next stage
      
      Cc: stable <stable@vger.kernel.org> # v4.9+
      Reviewed-by: Eduardo Valentin <eduval@amazon.com>
      Reviewed-by: Cristian Gafton <gafton@amazon.com>
      Reviewed-by: Anchal Agarwal <anchalag@amazon.com>
      Signed-off-by: Vallish Vaidyeshwara <vallish@amazon.com>
      Reviewed-by: Joe Thornber <ejt@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
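The two code flows in this commit can be compared with a small Python sketch. It only tracks two booleans for the mapping m (freed, and queued for the next stage); the function name `passdown_pt1` and its parameters are hypothetical, and no real memory management is modelled.

```python
def passdown_pt1(inc_data_range_ok, fixed):
    """Return (freed, queued_for_pt2) for mapping m after stage one."""
    if fixed:
        # Fixed flow: take the block reference before issuing the passdown.
        if not inc_data_range_ok:
            return True, False  # free m and bail out; nothing was queued
        return False, True      # passdown_endio(m) queues m for stage two
    # Original flow: the passdown is issued first, so endio queues m ...
    queued = True
    # ... and a later reference-count failure frees m while it is still queued.
    freed = not inc_data_range_ok
    return freed, queued
```

The dangerous outcome is a mapping that is both freed and queued, which only the original flow can produce.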
  9. 09 Jun, 2017 2 commits
  10. 24 Apr, 2017 1 commit
    • dm thin: fix a memory leak when passing discard bio down · 948f581a
      Dennis Yang authored
      
      
      dm-thin does not free the discard_parent bio after all chained sub
      bios have finished. The following kmemleak report can be observed after
      a pool with the discard_passdown option processes discard bios in
      Linux v4.11-rc7. To fix this, drop the discard_parent bio reference
      when its endio (passdown_endio) is called.
      
      unreferenced object 0xffff8803d6b29700 (size 256):
        comm "kworker/u8:0", pid 30349, jiffies 4379504020 (age 143002.776s)
        hex dump (first 32 bytes):
          00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
          01 00 00 00 00 00 00 f0 00 00 00 00 00 00 00 00  ................
        backtrace:
          [<ffffffff81a5efd9>] kmemleak_alloc+0x49/0xa0
          [<ffffffff8114ec34>] kmem_cache_alloc+0xb4/0x100
          [<ffffffff8110eec0>] mempool_alloc_slab+0x10/0x20
          [<ffffffff8110efa5>] mempool_alloc+0x55/0x150
          [<ffffffff81374939>] bio_alloc_bioset+0xb9/0x260
          [<ffffffffa018fd20>] process_prepared_discard_passdown_pt1+0x40/0x1c0 [dm_thin_pool]
          [<ffffffffa018b409>] break_up_discard_bio+0x1a9/0x200 [dm_thin_pool]
          [<ffffffffa018b484>] process_discard_cell_passdown+0x24/0x40 [dm_thin_pool]
          [<ffffffffa018b24d>] process_discard_bio+0xdd/0xf0 [dm_thin_pool]
          [<ffffffffa018ecf6>] do_worker+0xa76/0xd50 [dm_thin_pool]
          [<ffffffff81086239>] process_one_work+0x139/0x370
          [<ffffffff810867b1>] worker_thread+0x61/0x450
          [<ffffffff8108b316>] kthread+0xd6/0xf0
          [<ffffffff81a6cd1f>] ret_from_fork+0x3f/0x70
          [<ffffffffffffffff>] 0xffffffffffffffff
      
      Cc: stable@vger.kernel.org
      Signed-off-by: Dennis Yang <dennisyang@qnap.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  11. 08 Apr, 2017 1 commit
  12. 07 Mar, 2017 1 commit
  13. 02 Feb, 2017 1 commit
  14. 27 Jan, 2017 1 commit
  15. 07 Aug, 2016 1 commit
    • block: rename bio bi_rw to bi_opf · 1eff9d32
      Jens Axboe authored
      Since commit 63a4cc24, bio->bi_rw contains flags in the lower
      portion and the op code in the higher portions. This means that
      old code that relies on manually setting bi_rw is most likely
      going to be broken. Instead of letting that brokenness linger,
      rename the member, to force old and out-of-tree code to break
      at compile time instead of at runtime.
      
      No intended functional changes in this commit.

      Signed-off-by: Jens Axboe <axboe@fb.com>
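A rough illustration of why a combined op-plus-flags field breaks callers that assign raw values. This Python sketch uses assumed field widths and hypothetical constants (`OP_BITS`, `FIELD_BITS`, `OP_WRITE`, `FLAG_SYNC`); it is not the kernel's actual layout.

```python
OP_BITS = 3                      # assumed width of the op code field
FIELD_BITS = 32                  # assumed width of the whole field
OP_SHIFT = FIELD_BITS - OP_BITS  # op code lives in the highest bits

OP_READ, OP_WRITE = 0, 1         # hypothetical op code values
FLAG_SYNC = 1 << 0               # hypothetical flag bit in the lower portion


def pack(op, flags):
    """Combine an op code (high bits) and flags (low bits) into one value."""
    return (op << OP_SHIFT) | flags


def op_of(value):
    return value >> OP_SHIFT


def flags_of(value):
    return value & ((1 << OP_SHIFT) - 1)
```

A hypothetical legacy caller that stores the raw value 1 intending "write" would be decoded as op `OP_READ` with a flag set under this layout, which is the kind of silent runtime breakage the rename turns into a compile error.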
  16. 20 Jul, 2016 1 commit
  17. 07 Jun, 2016 4 commits
  18. 13 May, 2016 3 commits
  19. 05 May, 2016 1 commit
  20. 11 Mar, 2016 1 commit
    • dm thin: consistently return -ENOSPC if pool has run out of data space · c3667cc6
      Mike Snitzer authored
      Commit 0a927c2f ("dm thin: return -ENOSPC when erroring retry list due
      to out of data space") was a step in the right direction but didn't go
      far enough.
      
      Add a new 'out_of_data_space' flag to 'struct pool' and set it if/when
      the pool runs out of data space.  This fixes cell_error() and
      error_retry_list() to not blindly return -EIO.
      
      We cannot rely on the 'error_if_no_space' feature flag since it is
      transient (in that it can be reset once space is added, plus it only
      controls whether errors are issued, it doesn't reflect whether the
      pool is actually out of space).

      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
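The distinction this commit draws between the policy flag and the actual condition can be sketched as follows. This is a toy Python model: the class `SpacePool` and the method `io_error_code` are hypothetical, and only the two flag names come from the commit message.

```python
import errno


class SpacePool:
    def __init__(self):
        self.out_of_data_space = False  # tracks the actual condition
        self.error_if_no_space = False  # policy flag; transient

    def io_error_code(self):
        # Choose the errno from the real condition, not from the policy flag.
        if self.out_of_data_space:
            return -errno.ENOSPC
        return -errno.EIO
```

Keying the error code on `out_of_data_space` keeps the result stable even when `error_if_no_space` is reset after space is added.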
  21. 23 Feb, 2016 1 commit
  22. 07 Jan, 2016 1 commit
  23. 17 Dec, 2015 1 commit
    • dm thin: fix race condition when destroying thin pool workqueue · 18d03e8c
      Nikolay Borisov authored
      When a thin pool is being destroyed, delayed work items are
      cancelled using cancel_delayed_work(), which doesn't guarantee that on
      return the delayed item isn't running.  This can cause the work item to
      requeue itself on an already destroyed workqueue.  Fix this by using
      cancel_delayed_work_sync() which guarantees that on return the work item
      is not running anymore.
      
      Fixes: 905e51b3 ("dm thin: commit outstanding data every second")
      Fixes: 85ad643b ("dm thin: add timeout to stop out-of-data-space mode holding IO forever")
      Signed-off-by: Nikolay Borisov <kernel@kyup.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org
  24. 23 Nov, 2015 1 commit
    • dm thin: fix regression in advertised discard limits · 0fcb04d5
      Mike Snitzer authored
      When establishing a thin device's discard limits we cannot rely on the
      underlying thin-pool device's discard capabilities (which are inherited
      from the thin-pool's underlying data device) given that DM thin devices
      must provide discard support even when the thin-pool's underlying data
      device doesn't support discards.
      
      Users were exposed to this thin device discard limits regression if
      their thin-pool's underlying data device does _not_ support discards.
      This regression caused all upper-layers that called the
      blkdev_issue_discard() interface to not be able to issue discards to
      thin devices (because discard_granularity was 0).  This regression
      wasn't caught earlier because the device-mapper-test-suite's extensive
      'thin-provisioning' discard tests are only ever performed against
      thin-pools with data devices that support discards.
      
      Fix is to have thin_io_hints() test the pool's 'discard_enabled' feature
      rather than inferring whether or not a thin device's discard support
      should be enabled by looking at the thin-pool's discard_granularity.
      
      Fixes: 21607670 ("dm thin: disable discard support for thin devices if pool's is disabled")
      Reported-by: Mike Gerber <mike@sprachgewalt.de>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Cc: stable@vger.kernel.org # 4.1+
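The change of test described in this commit can be summarised in a short Python sketch. Both function names and their parameters are hypothetical; they only contrast inferring support from the pool's discard_granularity with checking the pool's 'discard_enabled' feature.

```python
def thin_discard_supported_old(pool_discard_enabled, pool_discard_granularity):
    # Regressed logic: infer support from a limit inherited from the data device.
    return pool_discard_granularity != 0


def thin_discard_supported_fixed(pool_discard_enabled, pool_discard_granularity):
    # Fixed logic: trust the pool's own feature setting.
    return pool_discard_enabled
```

For a pool with discards enabled on top of a data device that lacks discard support (granularity 0), the old test wrongly disables discards on the thin device while the fixed test keeps them.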
  25. 16 Nov, 2015 1 commit
    • dm thin: restore requested 'error_if_no_space' setting on OODS to WRITE transition · 172c2386
      Mike Snitzer authored
      
      
      A thin-pool that is in out-of-data-space (OODS) mode may transition back
      to write mode -- without the admin adding more space to the thin-pool --
      if/when blocks are released (either by deleting thin devices or
      discarding provisioned blocks).
      
      But as part of the thin-pool's earlier transition to out-of-data-space
      mode the thin-pool may have set the 'error_if_no_space' flag to true if
      the no_space_timeout expires without more space having been made
      available.  That implementation detail, of changing the pool's
      error_if_no_space setting, needs to be reset back to the default that
      the user specified when the thin-pool's table was loaded.
      
      Otherwise we'll drop the user requested behaviour on the floor when this
      out-of-data-space to write mode transition occurs.

      Reported-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
      Acked-by: Joe Thornber <ejt@redhat.com>
      Fixes: 2c43fd26 ("dm thin: fix missing out-of-data-space to write mode transition if blocks are released")
      Cc: stable@vger.kernel.org
  26. 13 Oct, 2015 1 commit
  27. 14 Sep, 2015 1 commit
  28. 18 Aug, 2015 1 commit
  29. 13 Aug, 2015 1 commit
    • block: kill merge_bvec_fn() completely · 8ae12666
      Kent Overstreet authored
      
      
      As generic_make_request() is now able to handle arbitrarily sized bios,
      it's no longer necessary for each individual block driver to define its
      own ->merge_bvec_fn() callback. Remove every invocation completely.
      
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Lars Ellenberg <drbd-dev@lists.linbit.com>
      Cc: drbd-user@lists.linbit.com
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Yehuda Sadeh <yehuda@inktank.com>
      Cc: Sage Weil <sage@inktank.com>
      Cc: Alex Elder <elder@kernel.org>
      Cc: ceph-devel@vger.kernel.org
      Cc: Alasdair Kergon <agk@redhat.com>
      Cc: Mike Snitzer <snitzer@redhat.com>
      Cc: dm-devel@redhat.com
      Cc: Neil Brown <neilb@suse.de>
      Cc: linux-raid@vger.kernel.org
      Cc: Christoph Hellwig <hch@infradead.org>
      Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
      Acked-by: NeilBrown <neilb@suse.de> (for the 'md' bits)
      Acked-by: Mike Snitzer <snitzer@redhat.com>
      Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
      [dpark: also remove ->merge_bvec_fn() in dm-thin as well as
       dm-era-target, and resolve merge conflicts]
      Signed-off-by: Dongsu Park <dpark@posteo.net>
      Signed-off-by: Ming Lin <ming.l@ssi.samsung.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  30. 29 Jul, 2015 1 commit
    • block: add a bi_error field to struct bio · 4246a0b6
      Christoph Hellwig authored
      
      
      Currently we have two different ways to signal an I/O error on a BIO:
      
       (1) by clearing the BIO_UPTODATE flag
       (2) by returning a Linux errno value to the bi_end_io callback
      
      The first one has the drawback of only communicating a single possible
      error (-EIO), and the second one has the drawback of not being persistent
      when bios are queued up, and of not being passed along from child to parent
      bio in the ever more popular chaining scenario.  Having both mechanisms
      available has the additional drawback of utterly confusing driver authors
      and introducing bugs where various I/O submitters only deal with one of
      them, and the others have to add boilerplate code to deal with both kinds
      of error returns.
      
      So add a new bi_error field to store an errno value directly in struct
      bio and remove the existing mechanisms to clean all this up.

      Signed-off-by: Christoph Hellwig <hch@lst.de>
      Reviewed-by: Hannes Reinecke <hare@suse.de>
      Reviewed-by: NeilBrown <neilb@suse.com>
      Signed-off-by: Jens Axboe <axboe@fb.com>
  31. 26 Jul, 2015 1 commit
  32. 16 Jul, 2015 2 commits
    • dm thin: display 'needs_check' in status if it is set · e4c78e21
      Mike Snitzer authored
      
      
      There is currently no way to see that the needs_check flag has been set
      in the metadata.  Display 'needs_check' in the thin-pool status if it is
      set in the thinp metadata.
      
      Also, update thinp documentation.

      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
    • dm thin: stay in out-of-data-space mode once no_space_timeout expires · bcc696fa
      Mike Snitzer authored
      
      
      This fixes an issue where running out of data space would cause the
      thin-pool's metadata to become read-only.  There was no reason to make
      metadata read-only -- calling set_pool_mode() with PM_READ_ONLY was a
      misguided way to error all queued and future write IOs.  We can
      accomplish the same by degrading from PM_OUT_OF_DATA_SPACE to
      PM_OUT_OF_DATA_SPACE with error_if_no_space enabled.
      
      Otherwise, the use of PM_READ_ONLY could cause a race where commit() was
      started before the PM_READ_ONLY transition but dm_pool_commit_metadata()
      would go on to fail because the block manager had transitioned to
      read-only.  The return of -EPERM from dm_pool_commit_metadata(), due to
      attempting to commit while in read-only mode, caused the thin-pool to
      set 'needs_check' via metadata_operation_failed().  This needless
      cascade of failures makes life for users more difficult than needed.

      Reported-by: Vivek Goyal <vgoyal@redhat.com>
      Signed-off-by: Mike Snitzer <snitzer@redhat.com>
  33. 06 Jul, 2015 1 commit