    IB/hfi1: Fix destroy_qp hang after a link down · b4a4957d
    Michael J. Ruhl authored
    rvt_destroy_qp() cannot complete until all in-process packets have
    been released from the underlying hardware.  If a link down event
    occurs, an application can hang with a kernel stack similar to:
    
    cat /proc/<app PID>/stack
     quiesce_qp+0x178/0x250 [hfi1]
     rvt_reset_qp+0x23d/0x400 [rdmavt]
     rvt_destroy_qp+0x69/0x210 [rdmavt]
     ib_destroy_qp+0xba/0x1c0 [ib_core]
     nvme_rdma_destroy_queue_ib+0x46/0x80 [nvme_rdma]
     nvme_rdma_free_queue+0x3c/0xd0 [nvme_rdma]
     nvme_rdma_destroy_io_queues+0x88/0xd0 [nvme_rdma]
     nvme_rdma_error_recovery_work+0x52/0xf0 [nvme_rdma]
     process_one_work+0x17a/0x440
     worker_thread+0x126/0x3c0
     kthread+0xcf/0xe0
     ret_from_fork+0x58/0x90
     0xffffffffffffffff
    
    quiesce_qp() waits until all outstanding packets have been freed.
    This wait should be momentary.  During a link down event, the cleanup
    handling does not ensure that all packets caught by the link down are
    flushed properly.
    
    This happens because the freeze path and the link down event are
    handled the same way, which is not correct.  The freeze path waits
    until the HFI is unfrozen and then restarts PIO.  A link down is not
    a freeze event.  The link down path cannot restart PIO until the link
    is restored.  If the PIO path is restarted before the link comes up,
    the application (QP) using the PIO path will hang (until the link is
    restored).
    
    Fix by separating the link down path from the freeze path and using
    the link down path for link down events.
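
The difference between the two paths can be illustrated with a small
user-space model.  This is only a sketch under stated assumptions: the
struct, enum, and function names below are hypothetical and are not hfi1
driver symbols, and the model assumes that a freeze leaves in-flight
packets in place while a link down flushes them.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; these names are not hfi1 driver symbols. */
enum halt_reason { HALT_NONE, HALT_FREEZE, HALT_LINK_DOWN };

struct pio_model {
	bool pio_enabled;        /* may new PIO sends be started? */
	int inflight;            /* packets not yet released to their QP */
	enum halt_reason reason; /* why PIO is currently halted, if it is */
};

/* Freeze (model assumption): stop PIO, keep in-flight packets as they are. */
static void model_freeze(struct pio_model *p)
{
	p->pio_enabled = false;
	p->reason = HALT_FREEZE;
}

/* Link down: stop PIO and flush every packet caught by the event. */
static void model_link_down(struct pio_model *p)
{
	p->pio_enabled = false;
	p->reason = HALT_LINK_DOWN;
	p->inflight = 0;
}

/* Unfreeze restarts PIO only when the halt really was a freeze. */
static void model_unfreeze(struct pio_model *p)
{
	if (p->reason != HALT_FREEZE)
		return;
	p->reason = HALT_NONE;
	p->pio_enabled = true;
}

/* Link up is the only event that restarts PIO after a link down. */
static void model_link_up(struct pio_model *p)
{
	if (p->reason != HALT_LINK_DOWN)
		return;
	p->reason = HALT_NONE;
	p->pio_enabled = true;
}

/* Stand-in for the quiesce wait: it finishes once nothing is in flight. */
static bool model_quiesce_done(const struct pio_model *p)
{
	assert(p->inflight >= 0);
	return p->inflight == 0;
}
```

In this model, a context halted by model_link_down() ignores
model_unfreeze() and stays halted until model_link_up(), while its
packets are already flushed, so the quiesce stand-in does not block.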
    
    Close a race condition in sc_disable() by acquiring both the progress
    and release locks.
    
    Close a race condition in sc_stop() by moving the setting of the flag
    bits under the alloc lock.
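
The locking discipline behind the two race fixes can be sketched in user
space with C11 atomics.  This is a hypothetical model, not the driver
code: the flag values, the pending counter, the ctx_* helpers, and the
lock order (alloc, then progress, then release) are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space model; flag and field names are illustrative. */
#define CTX_ENABLED 0x1u
#define CTX_HALTED  0x2u

/* Minimal spinlock built on a C11 atomic_flag. */
struct spin { atomic_flag f; };

static void spin_init(struct spin *l) { atomic_flag_clear(&l->f); }

static void spin_acquire(struct spin *l)
{
	while (atomic_flag_test_and_set_explicit(&l->f, memory_order_acquire))
		; /* busy-wait until the current holder releases the lock */
}

static void spin_release(struct spin *l)
{
	atomic_flag_clear_explicit(&l->f, memory_order_release);
}

struct ctx_model {
	struct spin alloc_lock;    /* guards flags and new allocations */
	struct spin progress_lock; /* guards progress on pending buffers */
	struct spin release_lock;  /* guards release of pending buffers */
	unsigned int flags;
	int pending;               /* buffers allocated but not yet released */
};

static void ctx_init(struct ctx_model *c)
{
	spin_init(&c->alloc_lock);
	spin_init(&c->progress_lock);
	spin_init(&c->release_lock);
	c->flags = CTX_ENABLED;
	c->pending = 0;
}

/* Allocation checks the enabled bit under the alloc lock. */
static bool ctx_try_alloc(struct ctx_model *c)
{
	bool ok;

	spin_acquire(&c->alloc_lock);
	ok = (c->flags & CTX_ENABLED) != 0;
	if (ok)
		c->pending++;
	spin_release(&c->alloc_lock);
	return ok;
}

/* Stop: both flag updates happen inside the alloc lock. */
static void ctx_stop(struct ctx_model *c, unsigned int flag)
{
	spin_acquire(&c->alloc_lock);
	c->flags |= flag;
	c->flags &= ~CTX_ENABLED;
	spin_release(&c->alloc_lock);
}

/* Disable: hold the progress and release locks while flushing. */
static void ctx_disable(struct ctx_model *c)
{
	spin_acquire(&c->alloc_lock);
	c->flags &= ~CTX_ENABLED;
	spin_acquire(&c->progress_lock);
	spin_acquire(&c->release_lock);
	c->pending = 0; /* flush everything still outstanding */
	spin_release(&c->release_lock);
	spin_release(&c->progress_lock);
	spin_release(&c->alloc_lock);
}
```

Because ctx_try_alloc() and ctx_stop() serialize on the same lock in this
model, an allocator cannot observe the enabled bit between the two flag
updates, and ctx_disable() cannot flush while another holder of the
progress or release lock is mid-update.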
    
    Cc: <stable@vger.kernel.org> # 4.9.x+
    Fixes: 77241056 ("IB/hfi1: add driver files")
    Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
    Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
    Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>