Commit 9138125b authored by Tejun Heo
blk-throttle: implement proper hierarchy support

With the recent updates, blk-throttle is finally ready for proper
hierarchy support.  Dispatching now honors service_queue->parent_sq
and propagates correctly.  The only thing missing is setting
->parent_sq correctly so that throtl_grp hierarchy matches the cgroup
hierarchy.

This patch updates throtl_pd_init() such that service_queues form the
same hierarchy as the cgroup hierarchy if sane_behavior is enabled.
As this concludes proper hierarchy support for blkcg, the shameful
.broken_hierarchy tag is removed from blkio_subsys.
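
The effect of the parent selection can be sketched with a small userspace
model. This is an illustration under stated assumptions, not the kernel
code: struct model_grp, model_parent() and model_effective_bps() are
hypothetical names, UINT64_MAX is assumed to stand for "no limit", and the
real throttler queues bios at each level rather than computing a minimum
up front. The model only shows that, with sane_behavior, a group can never
exceed the tightest limit among its ancestors, while the flat behavior
consults the group's own limit alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for throtl_grp; not the kernel structure. */
struct model_grp {
	struct model_grp *parent;	/* cgroup parent, NULL for the root group */
	uint64_t bps_limit;		/* UINT64_MAX is assumed to mean "no limit" */
};

/*
 * Mirrors the parent selection done in throtl_pd_init(): with sane_behavior
 * a group chains to its cgroup parent; otherwise, or for the root group, it
 * hangs directly off the top level, represented here by NULL.
 */
static struct model_grp *model_parent(struct model_grp *g, bool sane_behavior)
{
	assert(g != NULL);
	return (sane_behavior && g->parent) ? g->parent : NULL;
}

/* Tightest limit met while walking from g up to the top level. */
static uint64_t model_effective_bps(struct model_grp *g, bool sane_behavior)
{
	uint64_t limit = UINT64_MAX;

	for (; g; g = model_parent(g, sane_behavior))
		if (g->bps_limit < limit)
			limit = g->bps_limit;
	return limit;
}
```

For example, with a 16M limit on the root group and a 32M limit on a
descendant, the model yields 16M for the descendant when sane_behavior is
on and 32M when it is off.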

v2: Updated blkio-controller.txt as suggested by Vivek.
Signed-off-by: Tejun Heo <>
Acked-by: Vivek Goyal <>
Cc: Li Zefan <>
parent 693e751e
@@ -94,11 +94,13 @@ Throttling/Upper Limit policy
Hierarchical Cgroups
- Currently only CFQ supports hierarchical groups. For throttling,
cgroup interface does allow creation of hierarchical cgroups and
internally it treats them as flat hierarchy.
If somebody created a hierarchy like as follows.
Both CFQ and throttling implement hierarchy support; however,
throttling's hierarchy support is enabled iff "sane_behavior" is
enabled from cgroup side, which currently is a development option and
not publicly available.
If somebody created a hierarchy like as follows.
/ \
@@ -106,21 +108,20 @@ Hierarchical Cgroups
CFQ will handle the hierarchy correctly but and throttling will
practically treat all groups at same level. For details on CFQ
hierarchy support, refer to Documentation/block/cfq-iosched.txt.
Throttling will treat the hierarchy as if it looks like the
CFQ by default and throttling with "sane_behavior" will handle the
hierarchy correctly. For details on CFQ hierarchy support, refer to
Documentation/block/cfq-iosched.txt. For throttling, all limits apply
to the whole subtree while all statistics are local to the IOs
directly generated by tasks in that cgroup.
Throttling without "sane_behavior" enabled from cgroup side will
practically treat all groups at same level as if it looks like the
/ / \ \
root test1 test2 test3
Nesting cgroups, while allowed, isn't officially supported and blkio
genereates warning when cgroups nest. Once throttling implements
hierarchy support, hierarchy will be supported and the warning will
be removed.
Various user visible config options
@@ -911,14 +911,6 @@ struct cgroup_subsys blkio_subsys = {
.subsys_id = blkio_subsys_id,
.base_cftypes = blkcg_files,
.module = THIS_MODULE,
* blkio subsystem is utterly broken in terms of hierarchy support.
* It treats all cgroups equally regardless of where they're
* located in the hierarchy - all cgroups are treated as if they're
* right below the root. Fix it and remove the following.
.broken_hierarchy = true,
@@ -397,10 +397,30 @@ static void throtl_pd_init(struct blkcg_gq *blkg)
struct throtl_grp *tg = blkg_to_tg(blkg);
struct throtl_data *td = blkg->q->td;
struct throtl_service_queue *parent_sq;
unsigned long flags;
int rw;
throtl_service_queue_init(&tg->service_queue, &td->service_queue);
* If sane_hierarchy is enabled, we switch to properly hierarchical
* behavior where limits on a given throtl_grp are applied to the
* whole subtree rather than just the group itself. e.g. If 16M
* read_bps limit is set on the root group, the whole system can't
* exceed 16M for the device.
* If sane_hierarchy is not enabled, the broken flat hierarchy
* behavior is retained where all throtl_grps are treated as if
* they're all separate root groups right below throtl_data.
* Limits of a group don't interact with limits of other groups
* regardless of the position of the group in the hierarchy.
parent_sq = &td->service_queue;
if (cgroup_sane_behavior(blkg->blkcg->css.cgroup) && blkg->parent)
parent_sq = &blkg_to_tg(blkg->parent)->service_queue;
throtl_service_queue_init(&tg->service_queue, parent_sq);
for (rw = READ; rw <= WRITE; rw++) {
throtl_qnode_init(&tg->qnode_on_self[rw], tg);
throtl_qnode_init(&tg->qnode_on_parent[rw], tg);
@@ -272,6 +272,8 @@ enum {
* - memcg: use_hierarchy is on by default and the cgroup file for
* the flag is not created.
* - blkcg: blk-throttle becomes properly hierarchical.
* The followings are planned changes.
* - release_agent will be disallowed once replacement notification
