  1. 28 Feb, 2018 1 commit
    • IB/core : Add null pointer check in addr_resolve · 4cd482c1
      Muneendra Kumar M authored
      addr_resolve calls dev_get_by_index, which can return NULL;
      dereferencing that NULL pointer leads to a kernel crash.
      
      The following call trace is observed while running the
      rdma_lat test application:
      
      [  146.173149] BUG: unable to handle kernel NULL pointer dereference at 00000000000004a0
      [  146.173198] IP: addr_resolve+0x9e/0x3e0 [ib_core]
      [  146.173221] PGD 0 P4D 0
      [  146.173869] Oops: 0000 [#1] SMP PTI
      [  146.182859] CPU: 8 PID: 127 Comm: kworker/8:1 Tainted: G  O 4.15.0-rc6+ #18
      [  146.183758] Hardware name: LENOVO System x3650 M5: -[8871AC1]-/01KN179, BIOS-[TCE132H-2.50]- 10/11/2017
      [  146.184691] Workqueue: ib_cm cm_work_handler [ib_cm]
      [  146.185632] RIP: 0010:addr_resolve+0x9e/0x3e0 [ib_core]
      [  146.186584] RSP: 0018:ffffc9000362faa0 EFLAGS: 00010246
      [  146.187521] RAX: 000000000000001b RBX: ffffc9000362fc08 RCX: 0000000000000006
      [  146.188472] RDX: 0000000000000000 RSI: 0000000000000096 RDI: ffff88087fc16990
      [  146.189427] RBP: ffffc9000362fb18 R08: 00000000ffffff9d R09: 00000000000004ac
      [  146.190392] R10: 00000000000001e7 R11: 0000000000000001 R12: ffff88086af2e090
      [  146.191361] R13: 0000000000000000 R14: 0000000000000001 R15: 00000000ffffff9d
      [  146.192327] FS:  0000000000000000(0000) GS:ffff88087fc00000(0000) knlGS:0000000000000000
      [  146.193301] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
      [  146.194274] CR2: 00000000000004a0 CR3: 000000000220a002 CR4: 00000000003606e0
      [  146.195258] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
      [  146.196256] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
      [  146.197231] Call Trace:
      [  146.198209]  ? rdma_addr_register_client+0x30/0x30 [ib_core]
      [  146.199199]  rdma_resolve_ip+0x1af/0x280 [ib_core]
      [  146.200196]  rdma_addr_find_l2_eth_by_grh+0x154/0x2b0 [ib_core]
      
      This patch adds the missing NULL check on the pointer
      returned by dev_get_by_index before accessing the netdev, to
      avoid the kernel crash.
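
The check described above can be sketched in user space as follows. This is only an illustration, not the actual patch: `fake_netdev`, `fake_dev_get_by_index()` and `read_dev_flags()` are hypothetical stand-ins, and returning `-ENODEV` is an assumption about a suitable error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct net_device; only the fields we read. */
struct fake_netdev {
    int ifindex;
    unsigned int flags;
};

#define FAKE_IFF_LOOPBACK 0x8u

/* Hypothetical stand-in for dev_get_by_index(): returns NULL when no
 * device with the given index exists. */
static struct fake_netdev *fake_dev_get_by_index(struct fake_netdev *table,
                                                 size_t count, int ifindex)
{
    for (size_t i = 0; i < count; i++)
        if (table[i].ifindex == ifindex)
            return &table[i];
    return NULL;
}

/* Reads the device flags; returns -ENODEV instead of dereferencing NULL. */
static int read_dev_flags(struct fake_netdev *table, size_t count,
                          int ifindex, unsigned int *flags_out)
{
    struct fake_netdev *ndev;

    assert(flags_out != NULL);
    ndev = fake_dev_get_by_index(table, count, ifindex);
    if (!ndev)
        return -ENODEV; /* the kind of check the message says was missing */
    *flags_out = ndev->flags;
    return 0;
}
```

With this shape, a lookup for an index that does not exist yields an error code for the caller to handle rather than a NULL dereference.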
      
      We observed the crash above when running the following test.
      
       server                       client
       ---------                    ---------
       |1.1.1.1|<----rxe-channel--->|1.1.1.2|
       ---------                    ---------
      
      On server: rdma_lat -c -n 2 -s 1024
      On client: rdma_lat 1.1.1.1 -c -n 2 -s 1024
      
      Fixes: 20029832 ("IB/core: Validate route when we init ah")
      Signed-off-by: Muneendra <muneendra.kumar@broadcom.com>
      Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>
  2. 18 Dec, 2017 5 commits
  3. 13 Nov, 2017 1 commit
  4. 18 Oct, 2017 1 commit
  5. 10 Aug, 2017 4 commits
  6. 04 Aug, 2017 1 commit
    • IB/core: Fix race condition in resolving IP to MAC · 5fff41e1
      Parav Pandit authored
      Currently, while resolving IP addresses to MAC addresses, a single
      delayed work item is used to serve multiple resolve requests. This
      single work item essentially performs two tasks:
      (a) any retry needed to resolve, and
      (b) executing the callback function for all completed requests.
      
      While the work item is executing callbacks, any new work scheduled on
      this workqueue is lost, because the workqueue has finished looking at
      all pending requests and is now looking at callbacks, while the work is
      still under execution. Any further retry to look at pending requests in
      process_req() after executing callbacks would lead to a similar race
      condition (it may reduce the probability further but doesn't eliminate it).
      Retrying to enqueue work from queue_req() context is not something
      the rest of the kernel modules have followed.
      
      Therefore the fix in this patch uses the kernel facility to enqueue
      multiple work items to a workqueue. This ensures that no such request
      gets lost in synchronization. The request list is still maintained so that
      rdma_cancel_addr() can unlink the request and get the completion with
      an error sooner. Neighbour update event handling continues to be handled
      in the same way as before.
      Additionally, the process_req() work entry cancels any pending work for a
      request that gets completed while processing those requests.
      
      Originally ib_addr was a single-threaded (ST) workqueue, but it became a
      multi-threaded (MT) workqueue with the patch in [1]. This patch again
      makes it similar to ST so that the neighbour update event handler work
      item doesn't race with other work items.
      
      In one such trace below (though on a 4.5-based kernel), it can be seen
      that process_req() never executed the callback, which is likely for an
      event that was scheduled by queue_req() while a previous callback was
      being executed by the workqueue.
      
       [<ffffffff816b0dde>] schedule+0x3e/0x90
       [<ffffffff816b3c45>] schedule_timeout+0x1b5/0x210
       [<ffffffff81618c37>] ? ip_route_output_flow+0x27/0x70
       [<ffffffffa027f9c9>] ? addr_resolve+0x149/0x1b0 [ib_addr]
       [<ffffffff816b228f>] wait_for_completion+0x10f/0x170
       [<ffffffff810b6140>] ? try_to_wake_up+0x210/0x210
       [<ffffffffa027f220>] ? rdma_copy_addr+0xa0/0xa0 [ib_addr]
       [<ffffffffa0280120>] rdma_addr_find_l2_eth_by_grh+0x1d0/0x278 [ib_addr]
       [<ffffffff81321297>] ? sub_alloc+0x77/0x1c0
       [<ffffffffa02943b7>] ib_init_ah_from_wc+0x3a7/0x5a0 [ib_core]
       [<ffffffffa0457aba>] cm_req_handler+0xea/0x580 [ib_cm]
       [<ffffffff81015982>] ? __switch_to+0x212/0x5e0
       [<ffffffffa04582fd>] cm_work_handler+0x6d/0x150 [ib_cm]
       [<ffffffff810a14c1>] process_one_work+0x151/0x4b0
       [<ffffffff810a1940>] worker_thread+0x120/0x480
       [<ffffffff816b074b>] ? __schedule+0x30b/0x890
       [<ffffffff810a1820>] ? process_one_work+0x4b0/0x4b0
       [<ffffffff810a1820>] ? process_one_work+0x4b0/0x4b0
       [<ffffffff810a6b1e>] kthread+0xce/0xf0
       [<ffffffff810a6a50>] ? kthread_freezable_should_stop+0x70/0x70
       [<ffffffff816b53a2>] ret_from_fork+0x42/0x70
       [<ffffffff810a6a50>] ? kthread_freezable_should_stop+0x70/0x70
      INFO: task kworker/u144:1:156520 blocked for more than 120 seconds.
      "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this
      message.
      kworker/u144:1  D ffff883ffe1d7600     0 156520      2 0x00000080
      Workqueue: ib_addr process_req [ib_addr]
       ffff883f446fbbd8 0000000000000046 ffff881f95280000 ffff881ff24de200
       ffff883f66120000 ffff883f446f8008 ffff881f95280000 ffff883f6f9208c4
       ffff883f6f9208c8 00000000ffffffff ffff883f446fbbf8 ffffffff816b0dde
      
      [1] http://lkml.iu.edu/hypermail/linux/kernel/1608.1/05834.html
      
      Signed-off-by: Parav Pandit <parav@mellanox.com>
      Reviewed-by: Mark Bloch <markb@mellanox.com>
      Signed-off-by: Leon Romanovsky <leon@kernel.org>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  7. 17 Jul, 2017 2 commits
  8. 16 Jun, 2017 1 commit
    • networking: make skb_put & friends return void pointers · 4df864c1
      Johannes Berg authored
      It seems like a historic accident that these return unsigned char *,
      and in many places that means casts are required, more often than not.
      
      Make these functions (skb_put, __skb_put and pskb_put) return void *
      and remove all the casts across the tree, adding a (u8 *) cast only
      where the unsigned char pointer was used directly, all done with the
      following spatch:
      
          @@
          expression SKB, LEN;
          typedef u8;
          identifier fn = { skb_put, __skb_put };
          @@
          - *(fn(SKB, LEN))
          + *(u8 *)fn(SKB, LEN)
      
          @@
          expression E, SKB, LEN;
          identifier fn = { skb_put, __skb_put };
          type T;
          @@
          - E = ((T *)(fn(SKB, LEN)))
          + E = fn(SKB, LEN)
      
      which actually doesn't cover pskb_put since there are only three
      users overall.
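
The effect of the return-type change can be sketched with a simplified user-space stand-in. `fake_skb`, `fake_put_old()` and `fake_put_new()` are hypothetical and far simpler than the kernel's `struct sk_buff` and `skb_put()`; the sketch only shows why typed callers no longer need a cast while direct byte stores still do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, much simplified buffer; not the kernel's struct sk_buff. */
struct fake_skb {
    unsigned char data[64];
    size_t len;
};

struct fake_hdr {
    unsigned char type;
    unsigned char code;
};

/* Old style: returns unsigned char *, so typed users need a cast. */
static unsigned char *fake_put_old(struct fake_skb *skb, size_t len)
{
    unsigned char *tail = skb->data + skb->len;

    assert(skb->len + len <= sizeof(skb->data));
    skb->len += len;
    return tail;
}

/* New style: returns void *, which converts implicitly in C. */
static void *fake_put_new(struct fake_skb *skb, size_t len)
{
    return fake_put_old(skb, len);
}

static void fill_headers(struct fake_skb *skb)
{
    /* Before: an explicit cast is required. */
    struct fake_hdr *a = (struct fake_hdr *)fake_put_old(skb, sizeof(*a));
    /* After: no cast for typed pointers. */
    struct fake_hdr *b = fake_put_new(skb, sizeof(*b));

    a->type = 1;
    a->code = 2;
    b->type = 3;
    b->code = 4;

    /* A direct byte store still needs a cast, as in the first rule. */
    *(unsigned char *)fake_put_new(skb, 1) = 0xff;
}
```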
      
      A handful of stragglers were converted manually, notably a macro in
      drivers/isdn/i4l/isdn_bsdcomp.c and, oddly enough, one of the many
      instances in net/bluetooth/hci_sock.c. In the former file, I also
      had to fix one whitespace problem spatch introduced.
      Signed-off-by: Johannes Berg <johannes.berg@intel.com>
      Signed-off-by: David S. Miller <davem@davemloft.net>
  9. 07 Jun, 2017 1 commit
    • IB/addr: Fix setting source address in addr6_resolve() · 79e25959
      Roland Dreier authored
      Commit eea40b8f ("infiniband: call ipv6 route lookup via the stub
      interface") introduced a regression in address resolution when connecting
      to IPv6 destination addresses.  The old code called ip6_route_output(),
      while the new code calls ipv6_stub->ipv6_dst_lookup().  The two are almost
      the same, except that ipv6_dst_lookup() also calls ip6_route_get_saddr()
      if the source address is in6addr_any.
      
      This means that the test of ipv6_addr_any(&fl6.saddr) now never succeeds,
      and so we never copy the source address out.  This ends up causing
      rdma_resolve_addr() to fail, because without a resolved source address,
      cma_acquire_dev() will fail to find an RDMA device to use.  For me, this
      causes connecting to an NVMe over Fabrics target via RoCE / IPv6 to fail.
      
      Fix this by copying out fl6.saddr if ipv6_addr_any() is true for the original
      source address passed into addr6_resolve().  We can drop our call to
      ipv6_dev_get_saddr() because ipv6_dst_lookup() already does that work.
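
A user-space sketch of the logic described above, under stated assumptions: `fake_flow6`, `fake_dst_lookup()` and `resolve_src()` are hypothetical names, and the lookup is modelled only as "fill in a source address when the caller passed `::`". The point is that the test must be made on the caller's original address, not on the flow key after the lookup.

```c
#include <assert.h>
#include <netinet/in.h>
#include <string.h>

/* Hypothetical stand-in for the flow key; only the source address. */
struct fake_flow6 {
    struct in6_addr saddr;
};

/* Hypothetical stand-in for the route lookup: it fills in a source
 * address when the caller passed the unspecified address (::). */
static void fake_dst_lookup(struct fake_flow6 *fl6,
                            const struct in6_addr *chosen)
{
    if (IN6_IS_ADDR_UNSPECIFIED(&fl6->saddr))
        fl6->saddr = *chosen;
}

static void resolve_src(struct in6_addr *src, const struct in6_addr *chosen)
{
    struct fake_flow6 fl6;

    assert(src != NULL && chosen != NULL);
    fl6.saddr = *src;
    fake_dst_lookup(&fl6, chosen);

    /* Test the caller's ORIGINAL address, not fl6.saddr: after the
     * lookup fl6.saddr is no longer unspecified, so testing it would
     * never copy anything out. */
    if (IN6_IS_ADDR_UNSPECIFIED(src))
        *src = fl6.saddr;
}
```

An explicitly supplied source address is left untouched, while an unspecified one is replaced by whatever the lookup selected.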
      
      Fixes: eea40b8f ("infiniband: call ipv6 route lookup via the stub interface")
      Cc: <stable@vger.kernel.org> # 3.12+
      Signed-off-by: Roland Dreier <roland@purestorage.com>
      Acked-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  10. 02 May, 2017 1 commit
  11. 28 Apr, 2017 1 commit
    • infiniband: call ipv6 route lookup via the stub interface · eea40b8f
      Paolo Abeni authored
      The infiniband address handle can be triggered to resolve an ipv6
      address in response to MAD packets, regardless of the ipv6
      module being disabled via the kernel command line argument.
      
      That will cause a call into the ipv6 routing code, which is not
      initialized, and a consequent oops.
      
      This commit addresses the above issue by replacing the direct lookup
      call with an indirect one via the ipv6 stub, which is properly
      initialized according to the ipv6 status (e.g. if ipv6 is
      disabled, the routing lookup fails gracefully).
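
The indirection idea can be illustrated with a small function-pointer table. This is a sketch under assumptions, not the kernel's `ipv6_stub`: the struct, both lookup functions and the `-EAFNOSUPPORT` return value are hypothetical choices made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical ops table modelled on the idea of an IPv6 stub. */
struct fake_ipv6_stub {
    int (*dst_lookup)(int dest, int *route_out);
};

/* Default used while the real implementation is not available. */
static int stub_dst_lookup(int dest, int *route_out)
{
    (void)dest;
    (void)route_out;
    return -EAFNOSUPPORT; /* assumed error code for "not supported" */
}

/* "Real" lookup, installed only when the feature is initialised. */
static int real_dst_lookup(int dest, int *route_out)
{
    *route_out = dest + 1000; /* arbitrary illustrative route id */
    return 0;
}

static const struct fake_ipv6_stub disabled_stub = { stub_dst_lookup };
static const struct fake_ipv6_stub enabled_stub = { real_dst_lookup };

/* Callers always go through the pointer and never call
 * real_dst_lookup() directly, so a disabled feature yields an error
 * instead of running uninitialised code. */
static int resolve_route(const struct fake_ipv6_stub *stub, int dest,
                         int *route_out)
{
    assert(stub != NULL && stub->dst_lookup != NULL);
    return stub->dst_lookup(dest, route_out);
}
```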
      
      Cc: stable@vger.kernel.org # 3.12+
      Signed-off-by: Paolo Abeni <pabeni@redhat.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  12. 13 Apr, 2017 1 commit
  13. 17 Nov, 2016 1 commit
  14. 07 Oct, 2016 1 commit
  15. 24 May, 2016 2 commits
  16. 19 Jan, 2016 3 commits
  17. 23 Dec, 2015 2 commits
  18. 28 Oct, 2015 1 commit
  19. 22 Oct, 2015 1 commit
    • IB/core: Use GID table in AH creation and dmac resolution · dbf727de
      Matan Barak authored
      Previously, vlan id and source MAC were used from QP attributes. Since
      the net device is now stored in the GID attributes, they could be used
      instead of getting this information from the QP attributes.
      
      IB_QP_SMAC, IB_QP_ALT_SMAC, IB_QP_VID and IB_QP_ALT_VID were removed
      because there is no known libibverbs that uses them.
      
      This commit also modifies the vendor drivers (mlx4, ocrdma) in order
      to use the new approach.
      
      ocrdma driver changes were done by Somnath Kotur <Somnath.Kotur@Avagotech.Com>
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Doug Ledford <dledford@redhat.com>
  20. 02 Jun, 2015 1 commit
  21. 05 May, 2015 1 commit
  22. 16 Dec, 2014 1 commit
    • IB/addr: Improve address resolution callback scheduling · 346f98b4
      Or Kehati authored
      Address resolution always does a context switch to a work-queue to
      deliver the address resolution event.  When the IP address is already
      cached in the system ARP table, we go through the following
      chain:
      
          rdma_resolve_ip --> addr_resolve (cache hit) -->
      
      which ends up with:
      
          queue_req --> set_timeout (now) --> mod_delayed_work(,, delay=1)
      
      We actually do realize that the timeout should be zero, but the code
      forces it to a minimum of one jiffy.
      
      Using one jiffy as the minimum delay value results in sub-optimal
      scheduling of this work item by the workqueue, which on the
      testbed below costs about 3-4ms out of 12ms total time.
      
      To fix that, we let the minimum delay be zero.  Note that the
      connect step times change too, as there are address resolution calls
      from that flow.
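
The difference between a one-tick and a zero-tick floor can be sketched with a small helper. `delay_until()` is a hypothetical function written for this illustration; it is not the code from the patch, and it assumes the usual two's-complement behaviour when casting the unsigned difference to a signed value.

```c
#include <assert.h>

/* Returns the delay, in ticks, to wait until `deadline`, never less
 * than `min_delay`. Unsigned subtraction followed by a signed cast
 * keeps the comparison meaningful across counter wrap-around. */
static unsigned long delay_until(unsigned long deadline, unsigned long now,
                                 unsigned long min_delay)
{
    long delta = (long)(deadline - now);

    assert((long)min_delay >= 0);
    if (delta < (long)min_delay)
        return min_delay;
    return (unsigned long)delta;
}
```

With `min_delay` set to 1, a deadline of "now" still waits one tick; with `min_delay` set to 0 the work can be queued to run immediately, which is the behaviour the message argues for.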
      
      The results were taken from running both client and server on the
      same node, over mlx4 RoCE port.
      
      before -->
      step              total ms     max ms     min us  us / conn
      create id    :        0.01       0.01       6.00       6.00
      resolve addr :        4.02       4.01    4013.00    4016.00
      resolve route:        0.18       0.18     182.00     183.00
      create qp    :        1.15       1.15    1150.00    1150.00
      connect      :        6.73       6.73    6730.00    6731.00
      disconnect   :        0.55       0.55     549.00     550.00
      destroy      :        0.01       0.01       9.00       9.00
      
      after -->
      step              total ms     max ms     min us  us / conn
      create id    :        0.01       0.01       6.00       6.00
      resolve addr :        0.05       0.05      49.00      52.00
      resolve route:        0.21       0.21     207.00     208.00
      create qp    :        1.10       1.10    1104.00    1104.00
      connect      :        1.22       1.22    1220.00    1221.00
      disconnect   :        0.71       0.71     713.00     713.00
      destroy      :        0.01       0.01       9.00       9.00
      Signed-off-by: Or Kehati <ork@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Acked-by: Sean Hefty <sean.hefty@intel.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  23. 14 Jan, 2014 1 commit
    • IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Matan Barak authored
      This patch adds support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
      in a manner similar to how the IB L2 (and the L4 PKEY) attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills its internal structures from the
      path provided by the ULP.  There we add code that takes the ETH L2
      attributes and places them into the CM Address Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  There we add code that takes the
      ETH L2 attributes from the WC.
      
      When the HW driver provides the required ETH L2 attributes in the WC,
      it sets the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
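
The flag check described above might look roughly like this. The flag values, `fake_wc` and `wc_needs_l2_resolution()` are hypothetical, and the sketch assumes that a missing flag of either kind triggers the fallback to address resolution.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag values; the real IB_WC_WITH_* constants differ. */
#define FAKE_WC_WITH_SMAC (1u << 0)
#define FAKE_WC_WITH_VLAN (1u << 1)

/* Hypothetical, reduced work completion carrying the L2 attributes. */
struct fake_wc {
    unsigned int wc_flags;
    unsigned char smac[6];
    unsigned short vlan_id;
};

/* True when the core must fall back to address resolution because
 * the driver did not supply both L2 attributes. */
static bool wc_needs_l2_resolution(const struct fake_wc *wc)
{
    const unsigned int need = FAKE_WC_WITH_SMAC | FAKE_WC_WITH_VLAN;

    assert(wc != NULL);
    return (wc->wc_flags & need) != need;
}
```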
      
      ib_modify_qp_is_ok is also updated to consider the link layer. Some
      parameters are mandatory for Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: Matan Barak <matanb@mellanox.com>
      Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: Roland Dreier <roland@purestorage.com>
  24. 20 Jun, 2013 1 commit
  25. 13 Aug, 2012 1 commit
    • workqueue: use mod_delayed_work() instead of cancel + queue · 41f63c53
      Tejun Heo authored
      Convert delayed_work users doing cancel_delayed_work() followed by
      queue_delayed_work() to mod_delayed_work().
      
      Most conversions are straight-forward.  Ones worth mentioning are,
      
      * drivers/edac/edac_mc.c: edac_mc_workq_setup() converted to always
        use mod_delayed_work() and cancel loop in
        edac_mc_reset_delay_period() is dropped.
      
      * drivers/platform/x86/thinkpad_acpi.c: No need to remember whether
        watchdog is active or not.  @fan_watchdog_active and related code
        dropped.
      
      * drivers/power/charger-manager.c: Seemingly a lot of
        delayed_work_pending() abuse going on here.
        [delayed_]work_pending() are unsynchronized and racy when used like
        this.  I converted one instance in fullbatt_handler().  Please
        convert the rest so that it invokes workqueue APIs for the intended
        target state rather than trying to game work item pending state
        transitions.  e.g. if timer should be modified - call
        mod_delayed_work(), canceled - call cancel_delayed_work[_sync]().
      
      * drivers/thermal/thermal_sys.c: thermal_zone_device_set_polling()
        simplified.  Note that round_jiffies() calls in this function are
        meaningless.  round_jiffies() works on absolute jiffies, not the
        delta delay used by delayed_work.
      
      v2: Tomi pointed out that __cancel_delayed_work() users can't be
          safely converted to mod_delayed_work().  They could be calling it
          from irq context and if that happens while delayed_work_timer_fn()
          is running, it could deadlock.  __cancel_delayed_work() users are
          dropped.
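
The conversion pattern can be modelled with a single-threaded toy. `fake_dwork` and the `fake_*` helpers are hypothetical and ignore all locking and timer details; the sketch only contrasts the two-step cancel + queue sequence with a one-step "set the target expiry" call, and mirrors the convention that the one-step call reports whether the item was already pending.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, single-threaded model of a delayed work item. */
struct fake_dwork {
    bool pending;
    unsigned long expires;
};

static bool fake_cancel(struct fake_dwork *w)
{
    bool was_pending = w->pending;

    w->pending = false;
    return was_pending;
}

/* Queueing has no effect when the item is already pending. */
static bool fake_queue(struct fake_dwork *w, unsigned long now,
                       unsigned long delay)
{
    if (w->pending)
        return false;
    w->pending = true;
    w->expires = now + delay;
    return true;
}

/* Old two-step pattern: cancel, then queue again. */
static void rearm_cancel_then_queue(struct fake_dwork *w, unsigned long now,
                                    unsigned long delay)
{
    fake_cancel(w);
    fake_queue(w, now, delay);
}

/* One-step pattern: always leaves the item pending with the new
 * expiry; returns true if it was already pending. */
static bool fake_mod(struct fake_dwork *w, unsigned long now,
                     unsigned long delay)
{
    bool was_pending;

    assert(w != NULL);
    was_pending = w->pending;
    w->pending = true;
    w->expires = now + delay;
    return was_pending;
}
```

In the real kernel the benefit is that the one-step call leaves no window between cancel and queue; the toy cannot show that concurrency aspect and only illustrates the resulting state.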
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Henrique de Moraes Holschuh <hmh@hmh.eng.br>
      Acked-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
      Acked-by: Anton Vorontsov <cbouatmailru@gmail.com>
      Acked-by: David Howells <dhowells@redhat.com>
      Cc: Tomi Valkeinen <tomi.valkeinen@ti.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Jiri Kosina <jkosina@suse.cz>
      Cc: Doug Thompson <dougthompson@xmission.com>
      Cc: David Airlie <airlied@linux.ie>
      Cc: Roland Dreier <roland@kernel.org>
      Cc: "John W. Linville" <linville@tuxdriver.com>
      Cc: Zhang Rui <rui.zhang@intel.com>
      Cc: Len Brown <len.brown@intel.com>
      Cc: "J. Bruce Fields" <bfields@fieldses.org>
      Cc: Johannes Berg <johannes@sipsolutions.net>
  26. 09 Jul, 2012 1 commit
  27. 26 Jan, 2012 1 commit
  28. 05 Dec, 2011 1 commit