  • staging/rdma/hfi1: Improve performance of SDMA transfers · 0f2d87d2
    Mitko Haralanov authored
    Commit a0d40693 ("staging/rdma/hfi1: Add page lock limit check for
    SDMA requests") added a mechanism to delay the clean-up of user SDMA
    requests in order to facilitate proper locked page counting.
    
    This delayed processing was done using a kernel workqueue, which
    meant that a kernel thread would have to spin up and take CPU
    cycles to do the clean-up.
    
    This proved detrimental to performance because now there are two
    execution threads (the kernel workqueue and the user process)
    needing cycles on the same CPU.
    
    Performance-wise, it is much better to do as much of the clean-up
    as can be done in interrupt context (during the callback) and do
    the remaining work in-line during subsequent calls of the user
    process into the driver.
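
    The split described above can be illustrated with a small user-space
    model. This is only a sketch under stated assumptions, not the driver
    code: every name in it (sdma_queue, completion_cb, reap_completed,
    submit_request) is hypothetical, and a plain counter stands in for
    pinned pages. The callback does only cheap bookkeeping, and the
    submitting process finishes the clean-up the next time it calls in,
    so no extra thread competes for the CPU.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names come from the hfi1 driver. */
enum slot_state { SLOT_FREE, SLOT_IN_FLIGHT, SLOT_NEEDS_CLEANUP };

struct sdma_slot {
    enum slot_state state;
    int pinned_pages;      /* stands in for resources that are costly to release */
};

#define NUM_SLOTS 4

struct sdma_queue {
    struct sdma_slot slots[NUM_SLOTS];
    int total_pinned;      /* models the locked-page accounting */
};

/*
 * Stage 1: models the completion callback that runs in interrupt context.
 * It only flips a state flag; nothing here may block. A real driver would
 * need atomic or locked updates, which this single-threaded model omits.
 */
static void completion_cb(struct sdma_queue *q, int idx)
{
    struct sdma_slot *s = &q->slots[idx];

    if (s->state == SLOT_IN_FLIGHT)
        s->state = SLOT_NEEDS_CLEANUP;
}

/*
 * Stage 2: runs in-line in the submitting process. It releases the
 * resources of every completed slot and returns how many were reaped.
 */
static int reap_completed(struct sdma_queue *q)
{
    int reaped = 0;

    for (int i = 0; i < NUM_SLOTS; i++) {
        struct sdma_slot *s = &q->slots[i];

        if (s->state != SLOT_NEEDS_CLEANUP)
            continue;
        q->total_pinned -= s->pinned_pages;
        s->pinned_pages = 0;
        s->state = SLOT_FREE;
        reaped++;
    }
    return reaped;
}

/*
 * Submission path: finish any pending clean-up first, then claim a free
 * slot. Returns the slot index, or -1 if every slot is still busy.
 */
static int submit_request(struct sdma_queue *q, int npages)
{
    reap_completed(q);
    for (int i = 0; i < NUM_SLOTS; i++) {
        if (q->slots[i].state == SLOT_FREE) {
            q->slots[i].state = SLOT_IN_FLIGHT;
            q->slots[i].pinned_pages = npages;
            q->total_pinned += npages;
            return i;
        }
    }
    return -1;
}
```

    In this model the page accounting stays untouched by the callback and
    is corrected by the next submit_request() call from the same process,
    which is the behaviour the workqueue previously provided from a
    separate thread.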
    
    The changes required to implement the above also significantly
    simplify the entire SDMA completion processing code and eliminate
    a memory corruption causing the following observed crash:
    
        [ 2881.703362] BUG: unable to handle kernel NULL pointer dereference at        (null)
        [ 2881.703389] IP: [<ffffffffa02897e4>] user_sdma_send_pkts+0xcd4/0x18e0 [hfi1]
        [ 2881.703422] PGD 7d4d25067 PUD 77d96d067 PMD 0
        [ 2881.703427] Oops: 0000 [#1] SMP
        [ 2881.703431] Modules linked in:
        [ 2881.703504] CPU: 28 PID: 6668 Comm: mpi_stress Tainted: G           OENX 3.12.28-4-default #1
        [ 2881.703508] Hardware name: Intel Corporation S2600KP/S2600KP, BIOS SE5C610.86B.11.01.0044.090
        [ 2881.703512] task: ffff88077da8e0c0 ti: ffff880856772000 task.ti: ffff880856772000
        [ 2881.703515] RIP: 0010:[<ffffffffa02897e4>]  [<ffffffffa02897e4>] user_sdma_send_pkts+0xcd4/0x
        [ 2881.703529] RSP: 0018:ffff880856773c48  EFLAGS: 00010287
        [ 2881.703531] RAX: 0000000000000000 RBX: 0000000000001000 RCX: 0000000000002000
        [ 2881.703534] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000002000
        [ 2881.703537] RBP: 0000000000000000 R08: 0000000000000001 R09: 0000000000000000
        [ 2881.703540] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
        [ 2881.703543] R13: 0000000000000000 R14: ffff88071e782e68 R15: ffff8810532955c0
        [ 2881.703546] FS:  00007f9c4375e700(0000) GS:ffff88107eec0000(0000) knlGS:0000000000000000
        [ 2881.703549] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
        [ 2881.703551] CR2: 0000000000000000 CR3: 00000007d4cba000 CR4: 00000000003407e0
        [ 2881.703554] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
        [ 2881.703556] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
        [ 2881.703558] Stack:
        [ 2881.703559]  ffffffff00002000 ffff881000001800 ffffffff00000000 00000000000080d0
        [ 2881.703570]  0000000000000000 0000200000000000 0000000000000000 ffff88071e782db8
        [ 2881.703580]  ffff8807d4d08d80 ffff881053295600 0000000000000008 ffff88071e782fc8
        [ 2881.703589] Call Trace:
        [ 2881.703691]  [<ffffffffa028b5da>] hfi1_user_sdma_process_request+0x84a/0xab0 [hfi1]
        [ 2881.703777]  [<ffffffffa0255412>] hfi1_aio_write+0xd2/0x110 [hfi1]
        [ 2881.703828]  [<ffffffff8119e3d8>] do_sync_readv_writev+0x48/0x80
        [ 2881.703837]  [<ffffffff8119f78b>] do_readv_writev+0xbb/0x230
        [ 2881.703843]  [<ffffffff8119fab8>] SyS_writev+0x48/0xc0
    
    This commit also addresses issues with notifying user processes of
    SDMA request slot availability. A slot should be cleaned up before
    the user process is notified that it is available.
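
    The ordering requirement can be sketched in user-space C11 as follows.
    This is an illustration only, assuming a hypothetical request_slot and
    a hypothetical comp_entry whose status field is what the user process
    polls; the driver's real structures and synchronization may differ.
    The point is that the status is published last, so a process that
    sees COMP_COMPLETE never finds stale state in the slot.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names for illustration; not the hfi1 driver's definitions. */
enum comp_state { COMP_FREE, COMP_QUEUED, COMP_COMPLETE };

struct request_slot {
    void *tx_buf;          /* per-request resource that must be released */
    int in_use;
};

struct comp_entry {
    _Atomic int status;    /* the value the user process polls */
};

/* Clean the slot first, then publish its availability. */
static void complete_slot(struct request_slot *slot, struct comp_entry *comp)
{
    /* Step 1: drop every reference the finished request still holds. */
    slot->tx_buf = NULL;
    slot->in_use = 0;

    /*
     * Step 2: notify. The release store orders the clean-up above before
     * the status becomes visible to a reader using an acquire load.
     */
    atomic_store_explicit(&comp->status, COMP_COMPLETE, memory_order_release);
}

/* What the user side would check before reusing the slot. */
static int slot_reusable(const struct request_slot *slot, struct comp_entry *comp)
{
    if (atomic_load_explicit(&comp->status, memory_order_acquire) != COMP_COMPLETE)
        return 0;
    return !slot->in_use;
}
```

    Swapping the two steps in complete_slot() would let a process observe
    COMP_COMPLETE and resubmit into a slot that is still being torn down.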
    
    Reviewed-by: Arthur Kepner <arthur.kepner@intel.com>
    Reviewed-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
    Signed-off-by: Mitko Haralanov <mitko.haralanov@intel.com>
    Signed-off-by: Jubin John <jubin.john@intel.com>
    Signed-off-by: Doug Ledford <dledford@redhat.com>