    Btrfs: faster file extent item replace operations · 1acae57b
    Filipe David Borba Manana authored
    
    
    When writing to a file we drop existing file extent items that cover the
    write range and then add a new file extent item that represents that write
    range.
    
    Before this change we were doing a tree lookup to remove the file extent
    items, and then after we did another tree lookup to insert the new file
    extent item.
    Most of the time, all the file extent items we need to drop are located
    within a single leaf, which is also the leaf where our new file extent
    item ends up. Therefore, in this common case, just combine these 2
    operations into a single one.
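
    The common case described above can be sketched with a toy model. This is
    not the btrfs code: struct leaf, struct extent_item and
    leaf_replace_extents() are hypothetical names, and a leaf is reduced to a
    sorted array of non-overlapping extents. The point is only that one search
    finds the slot, the covered items are dropped, and the new item is written
    into the freed slot without a second lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LEAF_CAPACITY 16

/* Toy stand-in for a file extent item covering [offset, offset + len). */
struct extent_item {
    unsigned long long offset;
    unsigned long long len;
};

/* Toy stand-in for a btree leaf: items sorted by offset, non-overlapping. */
struct leaf {
    struct extent_item items[LEAF_CAPACITY];
    size_t nr;
};

/* Binary search: index of the first item whose offset is >= start. */
static size_t leaf_lower_bound(const struct leaf *l, unsigned long long start)
{
    size_t lo = 0, hi = l->nr;

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (l->items[mid].offset < start)
            lo = mid + 1;
        else
            hi = mid;
    }
    return lo;
}

/*
 * Drop every item fully inside [start, start + len) and store the new item
 * in the freed slot, using a single search of the leaf.
 *
 * Returns 1 on success. Returns 0, leaving the leaf untouched, when the
 * caller would have to fall back to a separate drop and insert: an item only
 * partially overlaps the range, the range reaches the end of the leaf (a
 * following leaf might hold more items), or there is no room for the item.
 */
static int leaf_replace_extents(struct leaf *l, unsigned long long start,
                                unsigned long long len)
{
    unsigned long long end = start + len;
    size_t first, last, dropped;

    assert(len > 0);

    first = leaf_lower_bound(l, start);
    /* The previous item overlaps the start of the range: needs trimming. */
    if (first > 0 &&
        l->items[first - 1].offset + l->items[first - 1].len > start)
        return 0;

    /* Walk forward over the items that begin inside the range. */
    last = first;
    while (last < l->nr && l->items[last].offset < end) {
        if (l->items[last].offset + l->items[last].len > end)
            return 0; /* item straddles the end of the range */
        last++;
    }
    if (last == l->nr)
        return 0; /* cannot rule out more items in a following leaf */

    dropped = last - first;
    if (dropped == 0 && l->nr == LEAF_CAPACITY)
        return 0; /* nothing freed and no spare slot */

    /* Close the gap (or open one slot) and write the new item in place. */
    memmove(&l->items[first + 1], &l->items[last],
            (l->nr - last) * sizeof(l->items[0]));
    l->items[first].offset = start;
    l->items[first].len = len;
    l->nr = l->nr - dropped + 1;
    return 1;
}
```

    In this sketch a return value of 0 stands for the slower path of a separate
    drop followed by a separate insert.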
    
    By avoiding the second btree navigation for insertion of the new file extent
    item, we reduce btree node/leaf lock acquisitions/releases, btree block/leaf
    COW operations, CPU time on btree node/leaf key binary searches, etc.
    
    Besides file writes, this operation also happens for file fsyncs.
    However, log btrees are much less likely to be as big as regular fs
    btrees, so the impact of this change there is smaller.
    
    The following benchmark was run against an SSD and an HDD, for both
    random and sequential writes:
    
      sysbench --test=fileio --file-num=4096 --file-total-size=8G \
         --file-test-mode=[rndwr|seqwr] --num-threads=512 \
         --file-block-size=8192 --max-requests=1000000 \
         --file-fsync-freq=0 --file-io-mode=sync [prepare|run]
    
    All results below are averages of 10 runs of the respective test.
    
    ** SSD sequential writes
    
    Before this change: 225.88 Mb/sec
    After this change:  277.26 Mb/sec
    
    ** SSD random writes
    
    Before this change: 49.91 Mb/sec
    After this change:  56.39 Mb/sec
    
    ** HDD sequential writes
    
    Before this change: 68.53 Mb/sec
    After this change:  69.87 Mb/sec
    
    ** HDD random writes
    
    Before this change: 13.04 Mb/sec
    After this change:  14.39 Mb/sec
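
    For a quick comparison, the relative gain of each pair above can be
    computed as (after - before) / before. The helper below is only an
    illustration (pct_gain() is a hypothetical name). Applied to the numbers
    above, it gives roughly 23% for SSD sequential writes, 13% for SSD random
    writes, 2% for HDD sequential writes and 10% for HDD random writes.

```c
#include <assert.h>

/* Relative throughput gain in percent, from two throughput figures. */
static double pct_gain(double before, double after)
{
    assert(before > 0.0);
    return (after - before) / before * 100.0;
}
```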
    
    Signed-off-by: Filipe David Borba Manana <fdmanana@gmail.com>
    Signed-off-by: Josef Bacik <jbacik@fb.com>
    Signed-off-by: Chris Mason <clm@fb.com>