Skip to content
  • Joseph Qi's avatar
    ocfs2/dlm: fix race between convert and recovery · ac7cf246
    Joseph Qi authored
    
    
    There is a race window between dlmconvert_remote and
    dlm_move_lockres_to_recovery_list, which will cause a lock with
    OCFS2_LOCK_BUSY in grant list, thus system hangs.
    
    dlmconvert_remote
    {
            spin_lock(&res->spinlock);
            list_move_tail(&lock->list, &res->converting);
            lock->convert_pending = 1;
            spin_unlock(&res->spinlock);
    
            status = dlm_send_remote_convert_request();
            >>>>>> race window, master has queued ast and return DLM_NORMAL,
                   and then down before sending ast.
                   this node detects master down and calls
                   dlm_move_lockres_to_recovery_list, which will revert the
                   lock to grant list.
                   Then OCFS2_LOCK_BUSY won't be cleared as new master won't
                   send ast any more because it thinks already be authorized.
    
            spin_lock(&res->spinlock);
            lock->convert_pending = 0;
            if (status != DLM_NORMAL)
                    dlm_revert_pending_convert(res, lock);
            spin_unlock(&res->spinlock);
    }
    
    In this case, check if res->state has DLM_LOCK_RES_RECOVERING bit set
    (res is still in recovering) or res master changed (new master has
    finished recovery), reset the status to DLM_RECOVERING, then it will
    retry convert.
    
    Signed-off-by: default avatarJoseph Qi <joseph.qi@huawei.com>
    Reported-by: default avatarYiwen Jiang <jiangyiwen@huawei.com>
    Reviewed-by: default avatarJunxiao Bi <junxiao.bi@oracle.com>
    Cc: Mark Fasheh <mfasheh@suse.de>
    Cc: Joel Becker <jlbec@evilplan.org>
    Cc: Tariq Saeed <tariq.x.saeed@oracle.com>
    Cc: Junxiao Bi <junxiao.bi@oracle.com>
    Cc: <stable@vger.kernel.org>
    Signed-off-by: default avatarAndrew Morton <akpm@linux-foundation.org>
    Signed-off-by: default avatarLinus Torvalds <torvalds@linux-foundation.org>
    ac7cf246