    lockdep: teach lockdep about memalloc_noio_save · 6d7225f0
    Nikolay Borisov authored
    Patch series "scope GFP_NOFS api", v5.
    
    This patch (of 7):
    
    Commit 21caf2fc ("mm: teach mm by current context info to not do I/O
    during memory allocation") added the memalloc_noio_(save|restore)
    functions to enable people to modify the MM behavior by disabling I/O
    during memory allocation.
    
    This was further extended in commit 934f3072 ("mm: clear __GFP_FS
    when PF_MEMALLOC_NOIO is set").
    
    memalloc_noio_* functions prevent allocation paths recursing back into
    the filesystem without explicitly changing the flags for every
    allocation site.
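
As a rough illustration of the scoped save/restore idea, the sketch below is a userspace model, not the kernel code. The names `model_task`, `model_noio_save`, `model_noio_restore` and the bit value of `MODEL_PF_MEMALLOC_NOIO` are assumptions made for this example. It shows why save returns the previous state: nested scopes restore correctly, and the flag is only cleared when the outermost scope ends.

```c
#include <assert.h>

/* Hypothetical bit value; the kernel's real PF_MEMALLOC_NOIO differs. */
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Stand-in for the per-task flags word (illustrative only). */
struct model_task {
    unsigned int flags;
};

/* Enter a NOIO scope: remember whether the flag was already set, then set it. */
static unsigned int model_noio_save(struct model_task *t)
{
    unsigned int old = t->flags & MODEL_PF_MEMALLOC_NOIO;

    t->flags |= MODEL_PF_MEMALLOC_NOIO;
    return old;
}

/* Leave a NOIO scope: put the flag back to the state seen at save time. */
static void model_noio_restore(struct model_task *t, unsigned int old)
{
    t->flags = (t->flags & ~MODEL_PF_MEMALLOC_NOIO) | old;
}

/* Returns 1 if nesting behaves as expected: inner restore keeps the flag set,
 * outer restore clears it. */
static int model_noio_nesting_demo(void)
{
    struct model_task t = { 0 };
    unsigned int outer = model_noio_save(&t);
    unsigned int inner = model_noio_save(&t);
    int still_set;

    model_noio_restore(&t, inner);
    still_set = (t.flags & MODEL_PF_MEMALLOC_NOIO) != 0;
    model_noio_restore(&t, outer);

    return still_set && t.flags == 0;
}
```

Every allocation made between the save and the matching restore inherits the restriction from the task state, so no individual call site has to change its GFP flags.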
    
    However, lockdep has not kept up with these changes and does not handle
    the memalloc_noio adjustments at all.  Instead, it is left to the
    callers of __lockdep_trace_alloc to call the function only after they
    have masked out the respective GFP flags, which can lead to false
    positives:
    
      =================================
       [ INFO: inconsistent lock state ]
       4.10.0-nbor #134 Not tainted
       ---------------------------------
       inconsistent {IN-RECLAIM_FS-W} -> {RECLAIM_FS-ON-W} usage.
       fsstress/3365 [HC0[0]:SC0[0]:HE1:SE1] takes:
        (&xfs_nondir_ilock_class){++++?.}, at: xfs_ilock+0x141/0x230
       {IN-RECLAIM_FS-W} state was registered at:
         __lock_acquire+0x62a/0x17c0
         lock_acquire+0xc5/0x220
         down_write_nested+0x4f/0x90
         xfs_ilock+0x141/0x230
         xfs_reclaim_inode+0x12a/0x320
         xfs_reclaim_inodes_ag+0x2c8/0x4e0
         xfs_reclaim_inodes_nr+0x33/0x40
         xfs_fs_free_cached_objects+0x19/0x20
         super_cache_scan+0x191/0x1a0
         shrink_slab+0x26f/0x5f0
         shrink_node+0xf9/0x2f0
         kswapd+0x356/0x920
         kthread+0x10c/0x140
         ret_from_fork+0x31/0x40
       irq event stamp: 173777
       hardirqs last  enabled at (173777): __local_bh_enable_ip+0x70/0xc0
       hardirqs last disabled at (173775): __local_bh_enable_ip+0x37/0xc0
       softirqs last  enabled at (173776): _xfs_buf_find+0x67a/0xb70
       softirqs last disabled at (173774): _xfs_buf_find+0x5db/0xb70
    
       other info that might help us debug this:
        Possible unsafe locking scenario:
    
              CPU0
              ----
         lock(&xfs_nondir_ilock_class);
         <Interrupt>
           lock(&xfs_nondir_ilock_class);
    
        *** DEADLOCK ***
    
       4 locks held by fsstress/3365:
        #0:  (sb_writers#10){++++++}, at: mnt_want_write+0x24/0x50
        #1:  (&sb->s_type->i_mutex_key#12){++++++}, at: vfs_setxattr+0x6f/0xb0
        #2:  (sb_internal#2){++++++}, at: xfs_trans_alloc+0xfc/0x140
        #3:  (&xfs_nondir_ilock_class){++++?.}, at: xfs_ilock+0x141/0x230
    
       stack backtrace:
       CPU: 0 PID: 3365 Comm: fsstress Not tainted 4.10.0-nbor #134
       Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
       Call Trace:
        kmem_cache_alloc_node_trace+0x3a/0x2c0
        vm_map_ram+0x2a1/0x510
        _xfs_buf_map_pages+0x77/0x140
        xfs_buf_get_map+0x185/0x2a0
        xfs_attr_rmtval_set+0x233/0x430
        xfs_attr_leaf_addname+0x2d2/0x500
        xfs_attr_set+0x214/0x420
        xfs_xattr_set+0x59/0xb0
        __vfs_setxattr+0x76/0xa0
        __vfs_setxattr_noperm+0x5e/0xf0
        vfs_setxattr+0xae/0xb0
        setxattr+0x15e/0x1a0
        path_setxattr+0x8f/0xc0
        SyS_lsetxattr+0x11/0x20
        entry_SYSCALL_64_fastpath+0x23/0xc6
    
    Fix this by making lockdep itself mask out the respective GFP flags
    before it checks them.
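
The effect of moving the masking into lockdep can be sketched with a small userspace model. Everything here is an assumption for illustration: the `MODEL_*` bit values are made up, and `model_noio_flags` and `model_trace_alloc_sees_fs` are hypothetical stand-ins, not the kernel's functions. The point is only that the check applies the task's NOIO state itself, so it no longer matters whether the caller already stripped the flags.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; the kernel's real flag values differ. */
#define MODEL_GFP_IO           0x1u
#define MODEL_GFP_FS           0x2u
#define MODEL_PF_MEMALLOC_NOIO 0x1u

/* Drop the IO and FS bits from a GFP mask when the task is in a NOIO scope. */
static unsigned int model_noio_flags(unsigned int task_flags, unsigned int gfp)
{
    if (task_flags & MODEL_PF_MEMALLOC_NOIO)
        gfp &= ~(MODEL_GFP_IO | MODEL_GFP_FS);
    return gfp;
}

/* Model of the lock-state check: returns true when the allocation would be
 * treated as able to recurse into filesystem reclaim. The mask is adjusted
 * here, inside the check, rather than by each caller. */
static bool model_trace_alloc_sees_fs(unsigned int task_flags, unsigned int gfp)
{
    gfp = model_noio_flags(task_flags, gfp);
    return (gfp & MODEL_GFP_FS) != 0;
}
```

In this model, an allocation whose mask includes the FS bit is still treated as FS-capable outside a NOIO scope, while the same mask inside a NOIO scope is not, which is the case the report above flags as a false positive.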
    
    Fixes: 934f3072 ("mm: clear __GFP_FS when PF_MEMALLOC_NOIO is set")
    Link: http://lkml.kernel.org/r/20170306131408.9828-2-mhocko@kernel.org
    
    
    Signed-off-by: Nikolay Borisov <nborisov@suse.com>
    Signed-off-by: Michal Hocko <mhocko@suse.com>
    Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
    Cc: Dave Chinner <david@fromorbit.com>
    Cc: Theodore Ts'o <tytso@mit.edu>
    Cc: Chris Mason <clm@fb.com>
    Cc: David Sterba <dsterba@suse.cz>
    Cc: Jan Kara <jack@suse.cz>
    Cc: Brian Foster <bfoster@redhat.com>
    Cc: Darrick J. Wong <darrick.wong@oracle.com>
    Cc: Vlastimil Babka <vbabka@suse.cz>
    Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
    Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>