    blkio: Add more debug-only per-cgroup stats · 812df48d
    Divyesh Shah authored
    1) group_wait_time - This is the amount of time the cgroup had to wait to get a
      timeslice for one of its queues from when it became busy, i.e., went from 0
      to 1 request queued. This is different from the io_wait_time which is the
      cumulative total of the amount of time spent by each IO in that cgroup waiting
      in the scheduler queue. This stat is a great way to find out any jobs in the
      fleet that are being starved or waiting for longer than what is expected (due
      to an IO controller bug or any other issue).
    2) empty_time - This is the amount of time a cgroup spends w/o any pending
       requests. This stat is useful when a job does not seem to be able to use its
       assigned disk share by helping check if that is happening due to an IO
       controller bug or because the job is not submitting enough IOs.
    3) idle_time - This is the amount of time spent by the IO scheduler idling
       for a given cgroup in anticipation of a better request than the existing ones
       from other queues/cgroups.
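
To illustrate how group_wait_time differs from io_wait_time, here is a minimal user-space sketch in C. It is not the kernel code from this patch: the struct, the field names, and the on_enqueue/on_timeslice/on_dispatch hooks are all hypothetical, and timestamps are passed in by the caller rather than taken from sched_clock(). It also ignores the case where a group empties and refills while still holding its timeslice.

```c
#include <stdint.h>

/* Hypothetical per-cgroup counters; names are illustrative only. */
struct group_stats {
    unsigned int queued;        /* requests currently queued */
    int          waiting;       /* 1 while busy but not yet given a timeslice */
    uint64_t     busy_since_ns; /* when the group went from 0 to 1 queued */
    uint64_t     group_wait_ns; /* time from becoming busy to first timeslice */
    uint64_t     io_wait_ns;    /* sum of every request's time in the queue */
};

/* A request is queued; the group wait clock starts on the 0 -> 1 edge. */
static void on_enqueue(struct group_stats *g, uint64_t now_ns)
{
    if (g->queued == 0 && !g->waiting) {
        g->busy_since_ns = now_ns;
        g->waiting = 1;
    }
    g->queued++;
}

/* The scheduler grants the group a timeslice; the group wait clock stops. */
static void on_timeslice(struct group_stats *g, uint64_t now_ns)
{
    if (g->waiting) {
        g->group_wait_ns += now_ns - g->busy_since_ns;
        g->waiting = 0;
    }
}

/* A request leaves the queue; its individual wait is added to io_wait_ns. */
static void on_dispatch(struct group_stats *g, uint64_t enqueued_ns,
                        uint64_t now_ns)
{
    g->io_wait_ns += now_ns - enqueued_ns;
    g->queued--;
}
```

With three requests queued at t=0, 10 and 20 and a timeslice granted at t=50, the group waited 50 once, while the per-IO waits add up to 50 + 40 + 30 = 120.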
    All these stats are recorded using start and stop events. When reading these
    stats, we do not add the delta between the current time and the last start time
    if we're between the start and stop events. We avoid doing this to make sure
    that these numbers are always monotonically increasing when read. Since we're
    using sched_clock() which may use the tsc as its source, it may induce some
    inconsistency (due to tsc resync across cpus) if we included the current delta.
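
The start/stop accounting described above can be sketched roughly as follows. This is a simplified user-space illustration, not the code added by this patch: struct interval_stat and the interval_* helpers are hypothetical names, and the caller supplies timestamps instead of calling sched_clock(). The point is that a read returns only completed intervals, so the reported value never decreases even if the clock source is inconsistent across CPUs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stat accumulated from paired start/stop events. */
struct interval_stat {
    uint64_t total_ns; /* sum of completed intervals only */
    uint64_t start_ns; /* timestamp of the pending start event */
    bool     running;  /* true between a start and its matching stop */
};

static void interval_start(struct interval_stat *s, uint64_t now_ns)
{
    if (s->running)
        return; /* ignore a start while one is already pending */
    s->start_ns = now_ns;
    s->running = true;
}

static void interval_stop(struct interval_stat *s, uint64_t now_ns)
{
    if (!s->running)
        return;
    /* Skip the interval if the clock appears to have gone backwards. */
    if (now_ns > s->start_ns)
        s->total_ns += now_ns - s->start_ns;
    s->running = false;
}

/*
 * Deliberately does NOT add (now - start_ns) for an interval in progress,
 * so successive reads are monotonically non-decreasing.
 */
static uint64_t interval_read(const struct interval_stat *s)
{
    return s->total_ns;
}
```

The trade-off is that a long interval still in progress is invisible to readers until its stop event arrives.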
    Signed-off-by: Divyesh Shah <dpshah@google.com>
    Signed-off-by: Jens Axboe <jens.axboe@oracle.com>