1. 24 Sep, 2012 4 commits
    • GFS2: Update gfs2_get_block_type() to use rbm · 3983903a
      Steven Whitehouse authored
      
      
      Use the new gfs2_rbm_from_block() function to replace an open
      coded version of the same code.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Replace rgblk_search with gfs2_rbm_find · 5b924ae2
      Steven Whitehouse authored
      
      
      This is part of a series of patches which are introducing the
      gfs2_rbm structure throughout the block allocation code. The
      main aim of this part is to create a search function which can
      deal directly with struct gfs2_rbm. In this case it specifies
      the initial position at which to start the search and also the
      point at which the search terminates.
      
      The net result of this is to clean up the search code and make
      it rather more readable, and the various possible exceptions which
      may occur during the search are partitioned into their own functions.
      
      There are some bug fixes too. We should not be checking the reservations
      while allocating extents - the time for that is when we are searching
      for where to put the extent, not when we've already made that decision.
      
      Also, rgblk_search had two uses, and in only one of those cases did
      it make sense to check for reservations. This is fixed in the new
      gfs2_rbm_find function, which has a cleaner interface.
      
      The reservation checking has been improved by always checking for
      contiguous reservations, and returning the first free block after
      all contiguous reservations. This is done under the spin lock to
      ensure consistency of the tree.
      
      The allocation of extents is now in all cases done by the existing
      allocation code, and if there is an active reservation, that is updated
      after the fact. Again this is done under the spin lock, since it entails
      changing the lookup key for the reservation in question.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
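The "first free block after all contiguous reservations" rule described in this commit can be sketched in user space as follows. This is only an illustration under simplifying assumptions: the reservations live in a sorted array rather than an rbtree, no spin lock is taken, and the names `sketch_resv` and `sketch_skip_reservations` are hypothetical, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one block reservation: [start, start + len). */
struct sketch_resv {
    uint64_t start;
    uint32_t len;
};

/*
 * Given a candidate block, return the first block at or after it that is
 * not covered by any reservation. The array must be sorted by start and
 * must not contain overlapping entries.
 */
uint64_t sketch_skip_reservations(const struct sketch_resv *rs, size_t n,
                                  uint64_t block)
{
    size_t i;

    for (i = 0; i < n; i++) {
        uint64_t end = rs[i].start + rs[i].len;

        /* Sanity check on the assumed ordering of the input. */
        assert(i == 0 || rs[i - 1].start + rs[i - 1].len <= rs[i].start);
        if (block < rs[i].start)
            break;          /* candidate sits in a gap: it is free */
        if (block < end)
            block = end;    /* inside a reservation: jump past it */
    }
    return block;
}
```

With reservations at blocks 10-14, 15-19 and 30-34, a candidate of 12 is pushed past both adjacent reservations in one call, whereas a candidate that already lies in a gap is returned unchanged.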
    • GFS2: Add structure to contain rgrp, bitmap, offset tuple · 4a993fb1
      Steven Whitehouse authored
      
      
      This patch introduces a new structure, gfs2_rbm, which is a
      tuple of a resource group, a bitmap within the resource group
      and an offset within that bitmap. This is designed to make
      manipulating these sets of variables easier. There is also a
      new helper function which converts this representation back
      to a disk block address.
      
      In addition, the rbtree nodes which are used for the reservations
      were not being correctly initialised, which is now fixed. Also,
      the tracing was not passing through the inode where it should
      have been. That is mostly fixed aside from one corner case. This
      needs to be revisited since there can also be a NULL rgrp in
      some cases which results in the device being incorrect in the
      trace.
      
      This is intended to be the first step towards cleaning up some
      of the allocation code, and some further bug fixes.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
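To make the (resource group, bitmap, offset) tuple concrete, here is a minimal user-space sketch. All type, field and function names below are hypothetical simplifications chosen for illustration; the kernel's real structures carry far more state, and the conversion helpers here only show the arithmetic the commit message describes.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct sketch_bitmap {
    uint64_t first_blk;  /* first block covered, relative to the rgrp data start */
    uint32_t nblocks;    /* number of blocks covered by this bitmap */
};

struct sketch_rgrp {
    uint64_t data0;                    /* disk address of the first data block */
    const struct sketch_bitmap *bits;  /* bitmaps, in ascending order */
    uint32_t nbitmaps;
};

/* The tuple: which rgrp, which bitmap within it, which offset in that bitmap. */
struct sketch_rbm {
    const struct sketch_rgrp *rgd;
    const struct sketch_bitmap *bi;
    uint32_t offset;
};

/* Convert the tuple back to a disk block address. */
uint64_t sketch_rbm_to_block(const struct sketch_rbm *rbm)
{
    assert(rbm->offset < rbm->bi->nblocks);
    return rbm->rgd->data0 + rbm->bi->first_blk + rbm->offset;
}

/* Inverse direction: returns 0 on success, -1 if the block is outside the rgrp. */
int sketch_rbm_from_block(struct sketch_rbm *rbm,
                          const struct sketch_rgrp *rgd, uint64_t block)
{
    uint64_t rel;
    uint32_t i;

    if (block < rgd->data0)
        return -1;
    rel = block - rgd->data0;
    for (i = 0; i < rgd->nbitmaps; i++) {
        const struct sketch_bitmap *bi = &rgd->bits[i];

        if (rel >= bi->first_blk && rel < bi->first_blk + bi->nblocks) {
            rbm->rgd = rgd;
            rbm->bi = bi;
            rbm->offset = (uint32_t)(rel - bi->first_blk);
            return 0;
        }
    }
    return -1;
}
```

Keeping the three values in one structure means a search routine can take a start position and hand back a result position in the same form, instead of threading three separate variables through every call.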
    • GFS2: Remove rs_requested field from reservations · 71f890f7
      Steven Whitehouse authored
      
      
      The rs_requested field is left over from the original allocation
      code, however this should have been a parameter passed to the
      various functions from gfs2_inplace_reserve() and not a member of the
      reservation structure as the value is not required after the
      initial allocation.
      
      This also simplifies the code, since we no longer need to set
      rs_requested to zero. The gfs2_inplace_release() function can be
      simplified as well, since the reservation structure will always be
      defined when it is called and the only remaining task is to unlock
      the rgrp if required. It can now also be called unconditionally,
      resulting in a further simplification.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  2. 13 Sep, 2012 1 commit
    • GFS2: Take account of blockages when using reserved blocks · 62e252ee
      Steven Whitehouse authored
      
      
      The claim_reserved_blks() function was not taking account of
      the possibility of "blockages" while performing allocation.
      This can be caused by another node allocating something in
      the same extent which has been reserved locally.
      
      This patch tests for this condition and then skips the remainder
      of the reservation in this case. This is a relatively rare event,
      so it should not affect the general performance improvement
      which the block reservations provide.
      
      The claim_reserved_blks() function also appears not to be able
      to deal with reservations which cross bitmap boundaries, but
      that can be dealt with in a future patch since we don't generate
      boundary crossing reservations currently.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
      Reported-by: David Teigland <teigland@redhat.com>
      Cc: Bob Peterson <rpeterso@redhat.com>
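The blockage handling described above might look roughly like the following user-space sketch. It assumes a hypothetical one-byte-per-block map instead of GFS2's packed bitmaps, and `sketch_claim_reserved` is an illustrative name, not the kernel's claim_reserved_blks().

```c
#include <assert.h>
#include <stdint.h>

/*
 * in_use[] is a hypothetical one-byte-per-block map (nonzero = allocated).
 * Try to claim up to 'want' blocks from the reservation that starts at
 * 'start' and is 'resv_len' blocks long. Returns the number of blocks
 * claimed; *blocked is set to 1 if a block inside the reservation was
 * already in use, in which case the rest of the reservation is skipped.
 */
uint32_t sketch_claim_reserved(unsigned char *in_use, uint64_t start,
                               uint32_t resv_len, uint32_t want, int *blocked)
{
    uint32_t claimed = 0;

    assert(in_use && blocked);
    *blocked = 0;
    while (claimed < want && claimed < resv_len) {
        if (in_use[start + claimed]) {
            *blocked = 1;   /* blockage: give up on the remainder */
            break;
        }
        in_use[start + claimed] = 1;
        claimed++;
    }
    return claimed;
}
```

A caller that sees `*blocked` set would fall back to a normal search for the blocks it still needs, which matches the "skip the remainder of the reservation" behaviour in the commit message.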
  3. 19 Jul, 2012 1 commit
    • GFS2: Reduce file fragmentation · 8e2e0047
      Bob Peterson authored
      
      
      This patch reduces GFS2 file fragmentation by pre-reserving blocks. The
      resulting improved on disk layout greatly speeds up operations in cases
      which would have resulted in interlaced allocation of blocks previously.
      A typical example of this is 10 parallel dd processes, each writing to a
      file in a common directory.
      
      The implementation uses an rbtree of reservations attached to each
      resource group (and each inode).
      
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
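The effect of pre-reserving blocks on interlaced allocation can be seen in a toy simulation. The sketch below is not GFS2 code: it assumes two writers (rather than the ten dd processes mentioned above) taking strict turns, a trivial "next free block" allocator, and hypothetical names throughout. A reservation length of 1 models the behaviour without reservations.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_WRITERS   2
#define SKETCH_MAX_BLKS  64

/* Count the extents (runs of consecutive block numbers) in one file. */
unsigned sketch_count_extents(const uint64_t *blocks, unsigned n)
{
    unsigned i, extents = n ? 1 : 0;

    for (i = 1; i < n; i++)
        if (blocks[i] != blocks[i - 1] + 1)
            extents++;
    return extents;
}

/*
 * Writers take turns appending one block each. When a writer has no
 * reserved blocks left, it reserves the next 'resv_len' free blocks.
 * Returns the number of extents in the first writer's file.
 */
unsigned sketch_simulate(unsigned blocks_per_file, unsigned resv_len)
{
    uint64_t next_free = 0;
    uint64_t resv_next[SKETCH_WRITERS] = {0};
    unsigned resv_left[SKETCH_WRITERS] = {0};
    uint64_t file0[SKETCH_MAX_BLKS];
    unsigned i, w;

    assert(blocks_per_file <= SKETCH_MAX_BLKS && resv_len >= 1);
    for (i = 0; i < blocks_per_file; i++) {
        for (w = 0; w < SKETCH_WRITERS; w++) {
            if (resv_left[w] == 0) {
                resv_next[w] = next_free;   /* start a new reservation */
                resv_left[w] = resv_len;
                next_free += resv_len;
            }
            if (w == 0)
                file0[i] = resv_next[w];    /* record writer 0's layout */
            resv_next[w]++;
            resv_left[w]--;
        }
    }
    return sketch_count_extents(file0, blocks_per_file);
}
```

In this model an 8-block file ends up in 8 single-block extents when no blocks are reserved ahead, and in 2 extents when each writer reserves 4 blocks at a time.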
  4. 18 Jul, 2012 1 commit
  5. 14 Jun, 2012 1 commit
  6. 08 Jun, 2012 1 commit
    • GFS2: Use lvbs for storing rgrp information with mount option · 90306c41
      Benjamin Marzinski authored
      
      
      Instead of reading in the resource groups when gfs2 is checking
      for free space to allocate from, gfs2 can store the necessary information
      in the resource group's lvb.  Also, instead of searching for unlinked
      inodes in every resource group that's checked for free space, gfs2 can
      store the number of unlinked inodes in the lvb, and only check for
      unlinked inodes if it will find some.
      
      The first time a resource group is locked, the lvb must be initialized.
      Since this involves counting the unlinked inodes in the resource group,
      this takes a little extra time.  But after that, if the resource group
      is locked with GL_SKIP, the buffer head won't be read in unless it's
      actually needed.
      
      Enabling the resource group lvbs is done via the rgrplvb mount option.  If
      this option isn't set, the lvbs will still be set and updated, but they won't
      be verified or used by the filesystem.  To safely turn on this option, all of
      the nodes mounting the filesystem must be running code with this patch, and
      the filesystem must have been completely unmounted since they were updated.
      
      Signed-off-by: Benjamin Marzinski <bmarzins@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
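The decisions that the lvb makes cheap can be sketched as two small predicates. This is an assumption-laden illustration: the structure layout, the `valid` flag and both function names are hypothetical, and the real on-wire lvb format is not shown here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical view of the per-rgrp data cached in the lock value block. */
struct sketch_rgrp_lvb {
    int valid;          /* has the lvb been initialised yet? */
    uint32_t free;      /* cached count of free blocks */
    uint32_t unlinked;  /* cached count of unlinked inodes */
};

/* Is it worth reading the rgrp from disk to satisfy 'requested' blocks? */
int sketch_worth_reading(const struct sketch_rgrp_lvb *lvb, uint32_t requested)
{
    assert(lvb);
    if (!lvb->valid)
        return 1;       /* nothing cached yet: must read and initialise */
    return lvb->free >= requested;
}

/* Should this rgrp be scanned for unlinked inodes? */
int sketch_should_scan_unlinked(const struct sketch_rgrp_lvb *lvb)
{
    assert(lvb);
    if (!lvb->valid)
        return 1;       /* count unknown until the first initialisation */
    return lvb->unlinked > 0;
}
```

The point of the design is that both questions can be answered from data delivered with the lock itself, so the rgrp's buffers only need to be read when the answer is yes.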
  7. 06 Jun, 2012 2 commits
  8. 11 May, 2012 1 commit
    • GFS2: Add rgrp information to block_alloc trace point · 41db1ab9
      Bob Peterson authored
      
      
      This is a second attempt at a patch that adds rgrp information to the
      block allocation trace point for GFS2. As suggested, the patch was
      modified to list the rgrp information _after_ the fields that exist today.
      
      Again, the reason for this patch is to allow us to trace and debug
      problems with the block reservations patch, which is still in the works.
      We can debug problems with reservations if we can see what block allocations
      result from the block reservations. It may also be handy in figuring out
      if there are problems in rgrp free space accounting. In other words,
      we can use it to track the rgrp and its free space alongside the allocations
      that are taking place.
      
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  9. 27 Apr, 2012 1 commit
  10. 24 Apr, 2012 5 commits
  11. 05 Apr, 2012 1 commit
  12. 26 Mar, 2012 1 commit
  13. 05 Mar, 2012 2 commits
    • GFS2: make sure rgrps are up to date in func gfs2_blk2rgrpd · 58884c4d
      Bob Peterson authored
      
      
      This patch adds a call to gfs2_rindex_update from function gfs2_blk2rgrpd
      and removes calls to it that are made redundant by it. The problem is
      that a gfs2_grow can add rgrps to the rindex, then put those rgrps into
      use, thus rendering the rindex we read in at mount time incomplete.
      
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Eliminate sd_rindex_mutex · 6aad1c3d
      Bob Peterson authored
      
      
      Over time, we've slowly eliminated the use of sd_rindex_mutex.
      Up to this point, it was only used in two places: function
      gfs2_ri_total (which totals the file system size by reading
      and parsing the rindex file) and function gfs2_rindex_update
      which updates the rgrps in memory. Both of these functions have
      the rindex glock to protect them, so the mutex is unnecessary.
      Since gfs2_grow writes to the rindex via the meta_fs, the mutex
      is in the wrong order according to the normal rules. This patch
      eliminates the mutex entirely to avoid the problem.
      
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  14. 01 Mar, 2012 1 commit
  15. 28 Feb, 2012 2 commits
    • GFS2: FITRIM ioctl support · 66fc061b
      Steven Whitehouse authored
      
      
      The FITRIM ioctl provides an alternative way to send discard requests to
      the underlying device. Using the discard mount option results in every
      freed block generating a discard request to the block device. This can
      be slow, since many block devices can only process discard requests of
      larger sizes, and also such operations can be time consuming.
      
      Rather than using the discard mount option, FITRIM allows an occasional
      sweep of the filesystem, and can optionally avoid sending down discard
      requests for smaller regions.
      
      In GFS2 FITRIM will work at resource group granularity. There is a flag
      for each resource group which keeps track of which resource groups have
      been trimmed. This flag is reset whenever a deallocation occurs in the
      resource group, and set whenever a successful FITRIM of that resource
      group has taken place. This helps to reduce repeated discard requests
      for the same block ranges, again improving performance.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
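The per-resource-group "already trimmed" flag behaves like a small state machine, sketched below in user space. The structure and function names are hypothetical, and the `discard_ok` parameter stands in for whether the discard requests for that resource group succeeded.

```c
#include <assert.h>

/* Hypothetical stand-in for a resource group with its trim flag. */
struct sketch_rgrp {
    int trimmed;    /* nonzero once a FITRIM pass has covered this rgrp */
};

/* Any deallocation may free blocks that have not been discarded yet. */
void sketch_dealloc(struct sketch_rgrp *rgd)
{
    assert(rgd);
    rgd->trimmed = 0;
}

/*
 * One FITRIM visit to a resource group. Returns 1 if discards were
 * issued, 0 if the rgrp was skipped because nothing was freed since the
 * last successful trim.
 */
int sketch_fitrim_rgrp(struct sketch_rgrp *rgd, int discard_ok)
{
    assert(rgd);
    if (rgd->trimmed)
        return 0;
    if (discard_ok)
        rgd->trimmed = 1;   /* only a successful trim sets the flag */
    return 1;
}
```

A second sweep over an untouched resource group therefore issues no discards, which is where the reduction in repeated requests comes from.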
    • GFS2: Read resource groups on mount · a365fbf3
      Steven Whitehouse authored
      
      
      This makes mount take slightly longer, but at the same time, the first
      write to the filesystem will be faster too. It also means that if there
      is a problem in the resource index, then we can refuse to mount rather
      than having to try and report that when the first write occurs.
      
      In addition, to avoid recursive locking, we have to take account of
      instances when the rindex glock may already be held when we are
      trying to update the rbtree of resource groups.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  16. 11 Jan, 2012 1 commit
  17. 22 Nov, 2011 2 commits
    • GFS2: Fix multi-block allocation · 6a8099ed
      Steven Whitehouse authored
      
      
      Clean up gfs2_alloc_blocks so that it takes the full extent length
      rather than just the number of non-inode blocks as an argument. That
      will only make a difference in the inode allocation case for now.
      
      Also, this fixes the extent length handling around gfs2_alloc_extent() so
      that multi block allocations will work again.
      
      The rd_last_alloc block is set to the final block in the allocated
      extent (as per the update to i_goal, but referenced to a different
      start point).
      
      This also removes the dinode argument to rgblk_search() which is no
      longer used.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: decouple quota allocations from block allocations · 564e12b1
      Bob Peterson authored
      
      
      This patch separates the code pertaining to allocations into two
      parts: quota-related information and block reservations.
      This patch also moves all the block reservation structure allocations to
      function gfs2_inplace_reserve to simplify the code, and moves
      the frees to function gfs2_inplace_release.
      
      Signed-off-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
  18. 21 Nov, 2011 3 commits
  19. 18 Nov, 2011 1 commit
  20. 15 Nov, 2011 2 commits
  21. 21 Oct, 2011 6 commits
    • GFS2: Remove two unused variables · 9ae32429
      Steven Whitehouse authored
      
      
      The two variables being initialised in gfs2_inplace_reserve
      to track the file & line number of the caller are never
      used, so we might as well remove them.
      
      If something does go wrong, then a stack trace is probably
      more useful anyway.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Fix off-by-one in gfs2_blk2rgrpd · f75bbfb4
      Steven Whitehouse authored
      
      
      Bob reported:
      
      I found an off-by-one problem with how I coded this section:
      It should be:
      
      + else if (blk >= cur->rd_data0 + cur->rd_data)
      
      In fact, cur->rd_data0 + cur->rd_data is the start of the next
      rgrp (the next ri_addr), so without the "=" check it can land on
      the wrong rgrp.
      
      In all normal cases, this won't be a problem: you're searching
      for a block _within_ the rgrp, which will pass the test properly.
      Where it gets into trouble is if you search the rgrps for the
      block exactly equal to ri_addr.  I don't think anything in the
      kernel does this, but I found a place in gfs2-utils gfs2_edit
      where it does.  So I definitely need to fix it in libgfs2.  I'd
      like to suggest we fix it in the kernel as well for the sake of
      keeping the functions similar.
      
      So this patch fixes the above mentioned off by one error as well
      as removing the unused parent pointer.
      
      Reported-by: Bob Peterson <rpeterso@redhat.com>
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
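The boundary condition in the report can be reproduced with a small user-space lookup. This sketch assumes a sorted array searched by bisection in place of the kernel's rbtree walk; the structure reuses the field meanings from the message (`rd_addr`, `rd_data0`, `rd_data`) under shortened, hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified resource group descriptor. */
struct sketch_rgrp {
    uint64_t addr;   /* first block of the rgrp (its ri_addr) */
    uint64_t data0;  /* first data block */
    uint32_t data;   /* number of data blocks */
};

/*
 * Find the rgrp containing 'blk' in an array sorted by addr.
 * Returns the array index, or -1 if no rgrp contains the block.
 */
int sketch_blk2rgrp(const struct sketch_rgrp *rgs, size_t n, uint64_t blk)
{
    size_t lo = 0, hi = n;

    assert(rgs || n == 0);
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        const struct sketch_rgrp *cur = &rgs[mid];

        if (blk < cur->addr)
            hi = mid;
        else if (blk >= cur->data0 + cur->data)
            lo = mid + 1;   /* ">=": data0 + data is the next rgrp's start */
        else
            return (int)mid;
    }
    return -1;
}
```

With three back-to-back resource groups starting at blocks 100, 152 and 204, looking up block 204 first probes the middle group, whose `data0 + data` is exactly 204. Using `>` in that comparison would wrongly report the middle group; `>=` moves the search on to the third.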
    • GFS2: Correctly set goal block after allocation · ccad4e14
      Steven Whitehouse authored
      
      
      The new goal block should be set to the end of the newly
      allocated extent, not the start of it.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
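As a one-line illustration of the fix, assuming "end of the extent" means its final block, the new goal for an extent of `len` blocks starting at `start` could be computed as below. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Goal block after allocating the extent [start, start + len). */
uint64_t sketch_goal_after_alloc(uint64_t start, uint32_t len)
{
    assert(len > 0);
    return start + len - 1;     /* last block of the extent, not the first */
}
```

Setting the goal to `start` instead would make the next search begin at blocks that were just allocated.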
    • GFS2: Use cached rgrp in gfs2_rlist_add() · 70b0c365
      Steven Whitehouse authored
      
      
      Each block which is deallocated requires a call to gfs2_rlist_add(),
      and each of those calls was calling gfs2_blk2rgrpd() in order to
      figure out which rgrp the block belonged in. This can be sped up
      by making use of the rgrp cached in the inode. We also reset this
      cached rgrp in case the block has changed rgrp. This should provide
      a big reduction in gfs2_blk2rgrpd() calls during deallocation.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Remove obsolete assert · 534029e2
      Steven Whitehouse authored
      
      
      Given that a resource group has been locked, there is no reason why
      we should not be able to allocate as many blocks as are free. The
      al_requested parameter should really be considered as a minimum
      number of blocks to be available. Should this limit be overshot,
      there are other mechanisms which will prevent over allocation.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
    • GFS2: Cache the most recently used resource group in the inode · 54335b1f
      Steven Whitehouse authored
      
      
      This means that after the initial allocation for any inode, the
      last used resource group is cached in the inode for future use.
      This drastically reduces the number of lookups of resource
      groups in the common case, and thus the contention on that
      data structure.
      
      The allocation algorithm is the same as previously, except that we
      always check to see if the goal block is within the cached rgrp
      first before going to the rbtree to look one up.
      
      Signed-off-by: Steven Whitehouse <swhiteho@redhat.com>
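The "check the cached rgrp first" step can be sketched as follows. This is a user-space approximation with hypothetical names: a linear scan stands in for the rbtree lookup, and a counter makes the number of full lookups visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified resource group: blocks [addr, addr + length). */
struct sketch_rgrp {
    uint64_t addr;
    uint32_t length;
};

/* Hypothetical inode fields relevant to allocation. */
struct sketch_inode {
    uint64_t goal;                      /* goal block for the next allocation */
    const struct sketch_rgrp *cached;   /* most recently used rgrp, or NULL */
};

int sketch_rgrp_contains(const struct sketch_rgrp *rgd, uint64_t blk)
{
    return blk >= rgd->addr && blk < rgd->addr + rgd->length;
}

/* Stand-in for the shared rbtree lookup; counts how often it is used. */
const struct sketch_rgrp *sketch_lookup(const struct sketch_rgrp *rgs,
                                        size_t n, uint64_t blk,
                                        unsigned *lookups)
{
    size_t i;

    (*lookups)++;
    for (i = 0; i < n; i++)
        if (sketch_rgrp_contains(&rgs[i], blk))
            return &rgs[i];
    return NULL;
}

/* Use the cached rgrp if the goal block lies inside it, else look it up. */
const struct sketch_rgrp *sketch_rgrp_for_goal(struct sketch_inode *ip,
                                               const struct sketch_rgrp *rgs,
                                               size_t n, unsigned *lookups)
{
    assert(ip && lookups);
    if (ip->cached && sketch_rgrp_contains(ip->cached, ip->goal))
        return ip->cached;
    ip->cached = sketch_lookup(rgs, n, ip->goal, lookups);
    return ip->cached;
}
```

As long as successive goal blocks stay inside the same resource group, only the first call touches the shared lookup structure; later calls are satisfied from the inode alone.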