    PCI: pciehp: Use per-slot workqueues to avoid deadlock · c2be6f93
    Yijing Wang authored
    When we have a hotplug-capable PCIe port with a second hotplug-capable
    PCIe port below it, removing the device below the upstream port causes
    a deadlock.
    
    The deadlock happens because we use the pciehp_wq workqueue to run
    pciehp_power_thread(), which uses pciehp_disable_slot() to remove devices
    below the upstream port.  When we remove the downstream PCIe port, we call
    pciehp_remove(), the pciehp driver's .remove() method.  That calls
    flush_workqueue(pciehp_wq), which waits for all work queued on pciehp_wq
    to finish.  That includes the pciehp_power_thread() work item making the
    call, which is still running, so the flush never returns.
    
    This patch avoids the deadlock by creating a workqueue for every PCIe port
    and removing the single shared workqueue.
    
    Here's the call path that leads to the deadlock:
    
      pciehp_queue_pushbutton_work
        queue_work(pciehp_wq)                   # queue pciehp_power_thread
        ...
    
      pciehp_power_thread
        pciehp_disable_slot
          remove_board
            pciehp_unconfigure_device
              pci_stop_and_remove_bus_device
                ...
                  pciehp_remove                 # pciehp driver .remove method
                    pciehp_release_ctrl
                      pcie_cleanup_slot
                        flush_workqueue(pciehp_wq)
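
    The call path above can be modeled in userspace to show why a shared
    queue cannot make progress and why per-slot queues can.  This is only a
    sketch, not kernel code: struct model_wq, current_wq and the model_*()
    helpers are hypothetical names, and the model assumes that every work
    item other than the caller eventually finishes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these types and helpers are hypothetical. */
struct model_wq {
    const char *name;
};

/* Queue whose work item is executing right now, or NULL outside any work item. */
static struct model_wq *current_wq = NULL;

/*
 * Models flush_workqueue(): the caller waits for all work on `wq`.
 * Returns false when the caller is itself a work item on `wq`, because
 * it would then wait for its own completion. Other items are assumed
 * to finish eventually.
 */
static bool model_flush(const struct model_wq *wq)
{
    return current_wq != wq;
}

/* Models pciehp_remove() for a port whose work runs on `port_wq`. */
static bool model_remove_port(const struct model_wq *port_wq)
{
    return model_flush(port_wq);
}

/*
 * Models pciehp_power_thread() running as a work item on `run_on` and
 * removing a downstream port whose removal flushes `downstream_wq`.
 * Returns true if the removal can complete.
 */
static bool model_power_thread(struct model_wq *run_on,
                               const struct model_wq *downstream_wq)
{
    struct model_wq *saved = current_wq;
    bool ok;

    assert(run_on != NULL);

    current_wq = run_on;                    /* enter the work item */
    ok = model_remove_port(downstream_wq);  /* nested .remove() flushes */
    current_wq = saved;                     /* leave the work item */
    return ok;
}
```

    With one shared queue, model_power_thread(&shared, &shared) reports
    that the removal cannot complete.  With one queue per slot,
    model_power_thread(&upstream, &downstream) reports that it can, because
    the flush targets a queue the caller is not running on.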
    
    This is fairly urgent because it can be caused by simply unplugging a
    Thunderbolt adapter, as reported by Daniel below.
    
    [bhelgaas: changelog]
    Reference: http://lkml.kernel.org/r/CAMVG2ssiRgcTD1bej2tkUUfsWmpL5eNtPcNif9va2-Gzb2u8nQ@mail.gmail.com
    
    
    Reported-and-tested-by: Daniel J Blueman <daniel@quora.org>
    Reviewed-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
    Signed-off-by: Yijing Wang <wangyijing@huawei.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
    CC: stable@vger.kernel.org