Commit 28a8f0d3 authored by Mike Christie's avatar Mike Christie Committed by Jens Axboe
Browse files

block, drivers, fs: rename REQ_FLUSH to REQ_PREFLUSH



To avoid confusion between REQ_OP_FLUSH, which is handled by
request_fn drivers, and upper layers requesting the block layer
perform a flush sequence along with possibly a WRITE, this patch
renames REQ_FLUSH to REQ_PREFLUSH.
Signed-off-by: default avatarMike Christie <mchristi@redhat.com>
Reviewed-by: default avatarChristoph Hellwig <hch@lst.de>
Reviewed-by: default avatarHannes Reinecke <hare@suse.com>
Signed-off-by: default avatarJens Axboe <axboe@fb.com>
parent a418090a
......@@ -20,11 +20,11 @@ a forced cache flush, and the Force Unit Access (FUA) flag for requests.
Explicit cache flushes
----------------------
The REQ_FLUSH flag can be OR ed into the r/w flags of a bio submitted from
The REQ_PREFLUSH flag can be OR ed into the r/w flags of a bio submitted from
the filesystem and will make sure the volatile cache of the storage device
has been flushed before the actual I/O operation is started. This explicitly
guarantees that previously completed write requests are on non-volatile
storage before the flagged bio starts. In addition the REQ_FLUSH flag can be
storage before the flagged bio starts. In addition the REQ_PREFLUSH flag can be
set on an otherwise empty bio structure, which causes only an explicit cache
flush without any dependent I/O. It is recommend to use
the blkdev_issue_flush() helper for a pure cache flush.
......@@ -41,21 +41,21 @@ signaled after the data has been committed to non-volatile storage.
Implementation details for filesystems
--------------------------------------
Filesystems can simply set the REQ_FLUSH and REQ_FUA bits and do not have to
Filesystems can simply set the REQ_PREFLUSH and REQ_FUA bits and do not have to
worry if the underlying devices need any explicit cache flushing and how
the Forced Unit Access is implemented. The REQ_FLUSH and REQ_FUA flags
the Forced Unit Access is implemented. The REQ_PREFLUSH and REQ_FUA flags
may both be set on a single bio.
Implementation details for make_request_fn based block drivers
--------------------------------------------------------------
These drivers will always see the REQ_FLUSH and REQ_FUA bits as they sit
These drivers will always see the REQ_PREFLUSH and REQ_FUA bits as they sit
directly below the submit_bio interface. For remapping drivers the REQ_FUA
bits need to be propagated to underlying devices, and a global flush needs
to be implemented for bios with the REQ_FLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_FLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_FLUSH requests without
to be implemented for bios with the REQ_PREFLUSH bit set. For real device
drivers that do not have a volatile cache the REQ_PREFLUSH and REQ_FUA bits
on non-empty bios can simply be ignored, and REQ_PREFLUSH requests without
data can be completed successfully without doing any work. Drivers for
devices with volatile caches need to implement the support for these
flags themselves without any help from the block layer.
......@@ -65,8 +65,8 @@ Implementation details for request_fn based block drivers
--------------------------------------------------------------
For devices that do not support volatile write caches there is no driver
support required, the block layer completes empty REQ_FLUSH requests before
entering the driver and strips off the REQ_FLUSH and REQ_FUA bits from
support required, the block layer completes empty REQ_PREFLUSH requests before
entering the driver and strips off the REQ_PREFLUSH and REQ_FUA bits from
requests that have a payload. For devices with volatile write caches the
driver needs to tell the block layer that it supports flushing caches by
doing:
......@@ -74,7 +74,7 @@ doing:
blk_queue_write_cache(sdkp->disk->queue, true, false);
and handle empty REQ_OP_FLUSH requests in its prep_fn/request_fn. Note that
REQ_FLUSH requests with a payload are automatically turned into a sequence
REQ_PREFLUSH requests with a payload are automatically turned into a sequence
of an empty REQ_OP_FLUSH request followed by the actual write by the block
layer. For devices that also support the FUA bit the block layer needs
to be told to pass through the REQ_FUA bit using:
......
......@@ -14,14 +14,14 @@ Log Ordering
We log things in order of completion once we are sure the write is no longer in
cache. This means that normal WRITE requests are not actually logged until the
next REQ_FLUSH request. This is to make it easier for userspace to replay the
log in a way that correlates to what is on disk and not what is in cache, to
make it easier to detect improper waiting/flushing.
next REQ_PREFLUSH request. This is to make it easier for userspace to replay
the log in a way that correlates to what is on disk and not what is in cache,
to make it easier to detect improper waiting/flushing.
This works by attaching all WRITE requests to a list once the write completes.
Once we see a REQ_FLUSH request we splice this list onto the request and once
Once we see a REQ_PREFLUSH request we splice this list onto the request and once
the FLUSH request completes we log all of the WRITEs and then the FLUSH. Only
completed WRITEs, at the time the REQ_FLUSH is issued, are added in order to
completed WRITEs, at the time the REQ_PREFLUSH is issued, are added in order to
simulate the worst case scenario with regard to power failures. Consider the
following example (W means write, C means complete):
......
......@@ -1029,7 +1029,7 @@ static bool blk_rq_should_init_elevator(struct bio *bio)
* Flush requests do not use the elevator so skip initialization.
* This allows a request to share the flush and elevator data.
*/
if (bio->bi_rw & (REQ_FLUSH | REQ_FUA))
if (bio->bi_rw & (REQ_PREFLUSH | REQ_FUA))
return false;
return true;
......@@ -1736,7 +1736,7 @@ static blk_qc_t blk_queue_bio(struct request_queue *q, struct bio *bio)
return BLK_QC_T_NONE;
}
if (bio->bi_rw & (REQ_FLUSH | REQ_FUA)) {
if (bio->bi_rw & (REQ_PREFLUSH | REQ_FUA)) {
spin_lock_irq(q->queue_lock);
where = ELEVATOR_INSERT_FLUSH;
goto get_rq;
......@@ -1968,9 +1968,9 @@ generic_make_request_checks(struct bio *bio)
* drivers without flush support don't have to worry
* about them.
*/
if ((bio->bi_rw & (REQ_FLUSH | REQ_FUA)) &&
if ((bio->bi_rw & (REQ_PREFLUSH | REQ_FUA)) &&
!test_bit(QUEUE_FLAG_WC, &q->queue_flags)) {
bio->bi_rw &= ~(REQ_FLUSH | REQ_FUA);
bio->bi_rw &= ~(REQ_PREFLUSH | REQ_FUA);
if (!nr_sectors) {
err = 0;
goto end_io;
......@@ -2217,7 +2217,7 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
*/
BUG_ON(blk_queued_rq(rq));
if (rq->cmd_flags & (REQ_FLUSH|REQ_FUA))
if (rq->cmd_flags & (REQ_PREFLUSH | REQ_FUA))
where = ELEVATOR_INSERT_FLUSH;
add_acct_request(q, rq, where);
......@@ -3311,7 +3311,7 @@ void blk_flush_plug_list(struct blk_plug *plug, bool from_schedule)
/*
* rq is already accounted, so use raw insert
*/
if (rq->cmd_flags & (REQ_FLUSH | REQ_FUA))
if (rq->cmd_flags & (REQ_PREFLUSH | REQ_FUA))
__elv_add_request(q, rq, ELEVATOR_INSERT_FLUSH);
else
__elv_add_request(q, rq, ELEVATOR_INSERT_SORT_MERGE);
......
......@@ -10,8 +10,8 @@
* optional steps - PREFLUSH, DATA and POSTFLUSH - according to the request
* properties and hardware capability.
*
* If a request doesn't have data, only REQ_FLUSH makes sense, which
* indicates a simple flush request. If there is data, REQ_FLUSH indicates
* If a request doesn't have data, only REQ_PREFLUSH makes sense, which
* indicates a simple flush request. If there is data, REQ_PREFLUSH indicates
* that the device cache should be flushed before the data is executed, and
* REQ_FUA means that the data must be on non-volatile media on request
* completion.
......@@ -20,11 +20,11 @@
* difference. The requests are either completed immediately if there's no
* data or executed as normal requests otherwise.
*
* If the device has writeback cache and supports FUA, REQ_FLUSH is
* If the device has writeback cache and supports FUA, REQ_PREFLUSH is
* translated to PREFLUSH but REQ_FUA is passed down directly with DATA.
*
* If the device has writeback cache and doesn't support FUA, REQ_FLUSH is
* translated to PREFLUSH and REQ_FUA to POSTFLUSH.
* If the device has writeback cache and doesn't support FUA, REQ_PREFLUSH
* is translated to PREFLUSH and REQ_FUA to POSTFLUSH.
*
* The actual execution of flush is double buffered. Whenever a request
* needs to execute PRE or POSTFLUSH, it queues at
......@@ -103,7 +103,7 @@ static unsigned int blk_flush_policy(unsigned long fflags, struct request *rq)
policy |= REQ_FSEQ_DATA;
if (fflags & (1UL << QUEUE_FLAG_WC)) {
if (rq->cmd_flags & REQ_FLUSH)
if (rq->cmd_flags & REQ_PREFLUSH)
policy |= REQ_FSEQ_PREFLUSH;
if (!(fflags & (1UL << QUEUE_FLAG_FUA)) &&
(rq->cmd_flags & REQ_FUA))
......@@ -391,9 +391,9 @@ void blk_insert_flush(struct request *rq)
/*
* @policy now records what operations need to be done. Adjust
* REQ_FLUSH and FUA for the driver.
* REQ_PREFLUSH and FUA for the driver.
*/
rq->cmd_flags &= ~REQ_FLUSH;
rq->cmd_flags &= ~REQ_PREFLUSH;
if (!(fflags & (1UL << QUEUE_FLAG_FUA)))
rq->cmd_flags &= ~REQ_FUA;
......
......@@ -1247,7 +1247,7 @@ static int blk_mq_direct_issue_request(struct request *rq, blk_qc_t *cookie)
static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
{
const int is_sync = rw_is_sync(bio_op(bio), bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
const int is_flush_fua = bio->bi_rw & (REQ_PREFLUSH | REQ_FUA);
struct blk_map_ctx data;
struct request *rq;
unsigned int request_count = 0;
......@@ -1344,7 +1344,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
{
const int is_sync = rw_is_sync(bio_op(bio), bio->bi_rw);
const int is_flush_fua = bio->bi_rw & (REQ_FLUSH | REQ_FUA);
const int is_flush_fua = bio->bi_rw & (REQ_PREFLUSH | REQ_FUA);
struct blk_plug *plug;
unsigned int request_count = 0;
struct blk_map_ctx data;
......
......@@ -148,7 +148,7 @@ static int _drbd_md_sync_page_io(struct drbd_device *device,
device->md_io.error = -ENODEV;
if ((op == REQ_OP_WRITE) && !test_bit(MD_NO_FUA, &device->flags))
op_flags |= REQ_FUA | REQ_FLUSH;
op_flags |= REQ_FUA | REQ_PREFLUSH;
op_flags |= REQ_SYNC | REQ_NOIDLE;
bio = bio_alloc_drbd(GFP_NOIO);
......@@ -847,7 +847,7 @@ int __drbd_change_sync(struct drbd_device *device, sector_t sector, int size,
unsigned long count = 0;
sector_t esector, nr_sectors;
/* This would be an empty REQ_FLUSH, be silent. */
/* This would be an empty REQ_PREFLUSH, be silent. */
if ((mode == SET_OUT_OF_SYNC) && size == 0)
return 0;
......
......@@ -1609,7 +1609,7 @@ static u32 bio_flags_to_wire(struct drbd_connection *connection,
if (connection->agreed_pro_version >= 95)
return (bio->bi_rw & REQ_SYNC ? DP_RW_SYNC : 0) |
(bio->bi_rw & REQ_FUA ? DP_FUA : 0) |
(bio->bi_rw & REQ_FLUSH ? DP_FLUSH : 0) |
(bio->bi_rw & REQ_PREFLUSH ? DP_FLUSH : 0) |
(bio_op(bio) == REQ_OP_DISCARD ? DP_DISCARD : 0);
else
return bio->bi_rw & REQ_SYNC ? DP_RW_SYNC : 0;
......
......@@ -112,7 +112,7 @@ struct p_header100 {
#define DP_MAY_SET_IN_SYNC 4
#define DP_UNPLUG 8 /* not used anymore */
#define DP_FUA 16 /* equals REQ_FUA */
#define DP_FLUSH 32 /* equals REQ_FLUSH */
#define DP_FLUSH 32 /* equals REQ_PREFLUSH */
#define DP_DISCARD 64 /* equals REQ_DISCARD */
#define DP_SEND_RECEIVE_ACK 128 /* This is a proto B write request */
#define DP_SEND_WRITE_ACK 256 /* This is a proto C write request */
......
......@@ -2158,7 +2158,7 @@ static unsigned long wire_flags_to_bio_flags(u32 dpf)
{
return (dpf & DP_RW_SYNC ? REQ_SYNC : 0) |
(dpf & DP_FUA ? REQ_FUA : 0) |
(dpf & DP_FLUSH ? REQ_FLUSH : 0);
(dpf & DP_FLUSH ? REQ_PREFLUSH : 0);
}
static unsigned long wire_flags_to_bio_op(u32 dpf)
......
......@@ -1132,7 +1132,7 @@ static int drbd_process_write_request(struct drbd_request *req)
* replicating, in which case there is no point. */
if (unlikely(req->i.size == 0)) {
/* The only size==0 bios we expect are empty flushes. */
D_ASSERT(device, req->master_bio->bi_rw & REQ_FLUSH);
D_ASSERT(device, req->master_bio->bi_rw & REQ_PREFLUSH);
if (remote)
_req_mod(req, QUEUE_AS_DRBD_BARRIER);
return remote;
......
......@@ -631,7 +631,7 @@ static void journal_write_unlocked(struct closure *cl)
bio->bi_end_io = journal_write_endio;
bio->bi_private = w;
bio_set_op_attrs(bio, REQ_OP_WRITE,
REQ_SYNC|REQ_META|REQ_FLUSH|REQ_FUA);
REQ_SYNC|REQ_META|REQ_PREFLUSH|REQ_FUA);
bch_bio_map(bio, w->data);
trace_bcache_journal_write(bio);
......
......@@ -205,10 +205,10 @@ static void bch_data_insert_start(struct closure *cl)
return bch_data_invalidate(cl);
/*
* Journal writes are marked REQ_FLUSH; if the original write was a
* Journal writes are marked REQ_PREFLUSH; if the original write was a
* flush, it'll wait on the journal write.
*/
bio->bi_rw &= ~(REQ_FLUSH|REQ_FUA);
bio->bi_rw &= ~(REQ_PREFLUSH|REQ_FUA);
do {
unsigned i;
......@@ -668,7 +668,7 @@ static inline struct search *search_alloc(struct bio *bio,
s->iop.write_prio = 0;
s->iop.error = 0;
s->iop.flags = 0;
s->iop.flush_journal = (bio->bi_rw & (REQ_FLUSH|REQ_FUA)) != 0;
s->iop.flush_journal = (bio->bi_rw & (REQ_PREFLUSH|REQ_FUA)) != 0;
s->iop.wq = bcache_wq;
return s;
......@@ -920,7 +920,7 @@ static void cached_dev_write(struct cached_dev *dc, struct search *s)
bch_writeback_add(dc);
s->iop.bio = bio;
if (bio->bi_rw & REQ_FLUSH) {
if (bio->bi_rw & REQ_PREFLUSH) {
/* Also need to send a flush to the backing device */
struct bio *flush = bio_alloc_bioset(GFP_NOIO, 0,
dc->disk.bio_split);
......
......@@ -788,7 +788,7 @@ static void check_if_tick_bio_needed(struct cache *cache, struct bio *bio)
spin_lock_irqsave(&cache->lock, flags);
if (cache->need_tick_bio &&
!(bio->bi_rw & (REQ_FUA | REQ_FLUSH)) &&
!(bio->bi_rw & (REQ_FUA | REQ_PREFLUSH)) &&
bio_op(bio) != REQ_OP_DISCARD) {
pb->tick = true;
cache->need_tick_bio = false;
......@@ -830,7 +830,7 @@ static dm_oblock_t get_bio_block(struct cache *cache, struct bio *bio)
static int bio_triggers_commit(struct cache *cache, struct bio *bio)
{
return bio->bi_rw & (REQ_FLUSH | REQ_FUA);
return bio->bi_rw & (REQ_PREFLUSH | REQ_FUA);
}
/*
......@@ -1069,7 +1069,7 @@ static void dec_io_migrations(struct cache *cache)
static bool discard_or_flush(struct bio *bio)
{
return bio_op(bio) == REQ_OP_DISCARD ||
bio->bi_rw & (REQ_FLUSH | REQ_FUA);
bio->bi_rw & (REQ_PREFLUSH | REQ_FUA);
}
static void __cell_defer(struct cache *cache, struct dm_bio_prison_cell *cell)
......@@ -1614,8 +1614,8 @@ static void process_flush_bio(struct cache *cache, struct bio *bio)
remap_to_cache(cache, bio, 0);
/*
* REQ_FLUSH is not directed at any particular block so we don't
* need to inc_ds(). REQ_FUA's are split into a write + REQ_FLUSH
* REQ_PREFLUSH is not directed at any particular block so we don't
* need to inc_ds(). REQ_FUA's are split into a write + REQ_PREFLUSH
* by dm-core.
*/
issue(cache, bio);
......@@ -1980,7 +1980,7 @@ static void process_deferred_bios(struct cache *cache)
bio = bio_list_pop(&bios);
if (bio->bi_rw & REQ_FLUSH)
if (bio->bi_rw & REQ_PREFLUSH)
process_flush_bio(cache, bio);
else if (bio_op(bio) == REQ_OP_DISCARD)
process_discard_bio(cache, &structs, bio);
......
......@@ -1911,11 +1911,12 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
struct crypt_config *cc = ti->private;
/*
* If bio is REQ_FLUSH or REQ_OP_DISCARD, just bypass crypt queues.
* - for REQ_FLUSH device-mapper core ensures that no IO is in-flight
* If bio is REQ_PREFLUSH or REQ_OP_DISCARD, just bypass crypt queues.
* - for REQ_PREFLUSH device-mapper core ensures that no IO is in-flight
* - for REQ_OP_DISCARD caller must use flush if IO ordering matters
*/
if (unlikely(bio->bi_rw & REQ_FLUSH || bio_op(bio) == REQ_OP_DISCARD)) {
if (unlikely(bio->bi_rw & REQ_PREFLUSH ||
bio_op(bio) == REQ_OP_DISCARD)) {
bio->bi_bdev = cc->dev->bdev;
if (bio_sectors(bio))
bio->bi_iter.bi_sector = cc->start +
......
......@@ -1540,9 +1540,9 @@ static int era_map(struct dm_target *ti, struct bio *bio)
remap_to_origin(era, bio);
/*
* REQ_FLUSH bios carry no data, so we're not interested in them.
* REQ_PREFLUSH bios carry no data, so we're not interested in them.
*/
if (!(bio->bi_rw & REQ_FLUSH) &&
if (!(bio->bi_rw & REQ_PREFLUSH) &&
(bio_data_dir(bio) == WRITE) &&
!metadata_current_marked(era->md, block)) {
defer_bio(era, bio);
......
......@@ -380,7 +380,7 @@ static void dispatch_io(int op, int op_flags, unsigned int num_regions,
*/
for (i = 0; i < num_regions; i++) {
*dp = old_pages;
if (where[i].count || (op_flags & REQ_FLUSH))
if (where[i].count || (op_flags & REQ_PREFLUSH))
do_region(op, op_flags, i, where + i, dp, io);
}
......
......@@ -555,7 +555,7 @@ static int log_writes_map(struct dm_target *ti, struct bio *bio)
struct bio_vec bv;
size_t alloc_size;
int i = 0;
bool flush_bio = (bio->bi_rw & REQ_FLUSH);
bool flush_bio = (bio->bi_rw & REQ_PREFLUSH);
bool fua_bio = (bio->bi_rw & REQ_FUA);
bool discard_bio = (bio_op(bio) == REQ_OP_DISCARD);
......
......@@ -704,7 +704,7 @@ static void do_writes(struct mirror_set *ms, struct bio_list *writes)
bio_list_init(&requeue);
while ((bio = bio_list_pop(writes))) {
if ((bio->bi_rw & REQ_FLUSH) ||
if ((bio->bi_rw & REQ_PREFLUSH) ||
(bio_op(bio) == REQ_OP_DISCARD)) {
bio_list_add(&sync, bio);
continue;
......@@ -1253,7 +1253,8 @@ static int mirror_end_io(struct dm_target *ti, struct bio *bio, int error)
* We need to dec pending if this was a write.
*/
if (rw == WRITE) {
if (!(bio->bi_rw & REQ_FLUSH) && bio_op(bio) != REQ_OP_DISCARD)
if (!(bio->bi_rw & REQ_PREFLUSH) &&
bio_op(bio) != REQ_OP_DISCARD)
dm_rh_dec(ms->rh, bio_record->write_region);
return error;
}
......
......@@ -398,7 +398,7 @@ void dm_rh_mark_nosync(struct dm_region_hash *rh, struct bio *bio)
region_t region = dm_rh_bio_to_region(rh, bio);
int recovering = 0;
if (bio->bi_rw & REQ_FLUSH) {
if (bio->bi_rw & REQ_PREFLUSH) {
rh->flush_failure = 1;
return;
}
......@@ -526,7 +526,7 @@ void dm_rh_inc_pending(struct dm_region_hash *rh, struct bio_list *bios)
struct bio *bio;
for (bio = bios->head; bio; bio = bio->bi_next) {
if (bio->bi_rw & REQ_FLUSH || bio_op(bio) == REQ_OP_DISCARD)
if (bio->bi_rw & REQ_PREFLUSH || bio_op(bio) == REQ_OP_DISCARD)
continue;
rh_inc(rh, dm_rh_bio_to_region(rh, bio));
}
......
......@@ -1680,7 +1680,7 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
init_tracked_chunk(bio);
if (bio->bi_rw & REQ_FLUSH) {
if (bio->bi_rw & REQ_PREFLUSH) {
bio->bi_bdev = s->cow->bdev;
return DM_MAPIO_REMAPPED;
}
......@@ -1799,7 +1799,7 @@ static int snapshot_merge_map(struct dm_target *ti, struct bio *bio)
init_tracked_chunk(bio);
if (bio->bi_rw & REQ_FLUSH) {
if (bio->bi_rw & REQ_PREFLUSH) {
if (!dm_bio_get_target_bio_nr(bio))
bio->bi_bdev = s->origin->bdev;
else
......@@ -2285,7 +2285,7 @@ static int origin_map(struct dm_target *ti, struct bio *bio)
bio->bi_bdev = o->dev->bdev;
if (unlikely(bio->bi_rw & REQ_FLUSH))
if (unlikely(bio->bi_rw & REQ_PREFLUSH))
return DM_MAPIO_REMAPPED;
if (bio_rw(bio) != WRITE)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment