1. 18 May, 2015 11 commits
  2. 05 May, 2015 1 commit
  3. 10 Jun, 2014 1 commit
    • Tatyana Nikolova's avatar
      RDMA/core: Add support for iWARP Port Mapper user space service · 30dc5e63
      Tatyana Nikolova authored
      This patch adds iWARP Port Mapper (IWPM) Version 2 support.  The iWARP
      Port Mapper implementation is based on the port mapper specification
      section in the Sockets Direct Protocol paper -
      http://www.rdmaconsortium.org/home/draft-pinkerton-iwarp-sdp-v1.0.pdf
      
      Existing iWARP RDMA providers use the same IP address as the native
      TCP/IP stack when creating RDMA connections.  They need a mechanism to
      claim the TCP ports used for RDMA connections to prevent TCP port
      collisions when other host applications use TCP ports.  The iWARP Port
      Mapper provides a standard mechanism to accomplish this.  Without this
      service it is possible for RDMA application to bind/listen on the same
      port which is already being used by native TCP host application.  If
      that happens the incoming TCP connection data can be passed to the
      RDMA stack with error.
      
      The iWARP Port Mapper solution doesn't contain any changes to the
      existing network stack in the kernel space.  All the changes are
      contained with the infiniband tree and also in user space.
      
      The iWARP Port Mapper service is implemented as a user space daemon
      process.  Source for the IWPM service is located at
      http://git.openfabrics.org/git?p=~tnikolova/libiwpm-1.0.0/.git;a=summary
      
      
      
      The iWARP driver (port mapper client) sends to the IWPM service the
      local IP address and TCP port it has received from the RDMA
      application, when starting a connection.  The IWPM service performs a
      socket bind from user space to get an available TCP port, called a
      mapped port, and communicates it back to the client.  In that sense,
      the IWPM service is used to map the TCP port, which the RDMA
      application uses to any port available from the host TCP port
      space. The mapped ports are used in iWARP RDMA connections to avoid
      collisions with native TCP stack which is aware that these ports are
      taken. When an RDMA connection using a mapped port is terminated, the
      client notifies the IWPM service, which then releases the TCP port.
      
      The message exchange between the IWPM service and the iWARP drivers
      (between user space and kernel space) is implemented using netlink
      sockets.
      
      1) Netlink interface functions are added: ibnl_unicast() and
         ibnl_mulitcast() for sending netlink messages to user space
      
      2) The signature of the existing ibnl_put_msg() is changed to be more
         generic
      
      3) Two netlink clients are added: RDMA_NL_NES, RDMA_NL_C4IW
         corresponding to the two iWarp drivers - nes and cxgb4 which use
         the IWPM service
      
      4) Enums are added to enumerate the attributes in the netlink
         messages, which are exchanged between the user space IWPM service
         and the iWARP drivers
      Signed-off-by: default avatarTatyana Nikolova <tatyana.e.nikolova@intel.com>
      Signed-off-by: default avatarSteve Wise <swise@opengridcomputing.com>
      Reviewed-by: default avatarPJ Waskiewicz <pj.waskiewicz@solidfire.com>
      
      [ Fold in range checking fixes and nlh_next removal as suggested by Dan
        Carpenter and Steve Wise.  Fix sparse endianness in hash.  - Roland ]
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      30dc5e63
  4. 01 Apr, 2014 1 commit
  5. 23 Jan, 2014 1 commit
  6. 18 Jan, 2014 1 commit
  7. 14 Jan, 2014 2 commits
    • Aruna-Hewapathirane's avatar
      net: replace macros net_random and net_srandom with direct calls to prandom · 63862b5b
      Aruna-Hewapathirane authored
      
      
      This patch removes the net_random and net_srandom macros and replaces
      them with direct calls to the prandom ones. As new commits only seem to
      use prandom_u32 there is no use to keep them around.
      This change makes it easier to grep for users of prandom_u32.
      Signed-off-by: default avatarAruna-Hewapathirane <aruna.hewapathirane@gmail.com>
      Suggested-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Acked-by: default avatarHannes Frederic Sowa <hannes@stressinduktion.org>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      63862b5b
    • Matan Barak's avatar
      IB/core: Ethernet L2 attributes in verbs/cm structures · dd5f03be
      Matan Barak authored
      
      
      This patch add the support for Ethernet L2 attributes in the
      verbs/cm/cma structures.
      
      When dealing with L2 Ethernet, we should use smac, dmac, vlan ID and priority
      in a similar manner that the IB L2 (and the L4 PKEY) attributes are used.
      
      Thus, those attributes were added to the following structures:
      
      * ib_ah_attr - added dmac
      * ib_qp_attr - added smac and vlan_id, (sl remains vlan priority)
      * ib_wc - added smac, vlan_id
      * ib_sa_path_rec - added smac, dmac, vlan_id
      * cm_av - added smac and vlan_id
      
      For the path record structure, extra care was taken to avoid the new
      fields when packing it into wire format, so we don't break the IB CM
      and SA wire protocol.
      
      On the active side, the CM fills. its internal structures from the
      path provided by the ULP.  We add there taking the ETH L2 attributes
      and placing them into the CM Address Handle (struct cm_av).
      
      On the passive side, the CM fills its internal structures from the WC
      associated with the REQ message.  We add there taking the ETH L2
      attributes from the WC.
      
      When the HW driver provides the required ETH L2 attributes in the WC,
      they set the IB_WC_WITH_SMAC and IB_WC_WITH_VLAN flags. The IB core
      code checks for the presence of these flags, and in their absence does
      address resolution from the ib_init_ah_from_wc() helper function.
      
      ib_modify_qp_is_ok is also updated to consider the link layer. Some
      parameters are mandatory for Ethernet link layer, while they are
      irrelevant for IB.  Vendor drivers are modified to support the new
      function signature.
      Signed-off-by: default avatarMatan Barak <matanb@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      dd5f03be
  8. 11 Nov, 2013 1 commit
  9. 08 Nov, 2013 3 commits
    • Doug Ledford's avatar
      IB/cma: Check for GID on listening device first · be9130cc
      Doug Ledford authored
      
      
      As a simple optimization that should speed up the vast majority of
      connect attemps on IB devices, when we are searching for the GID of an
      incoming connection in the cached GID lists of devices, search the
      device that received the incoming connection request first.  If we
      don't find it there, then move on to other devices.
      
      This reduces the time to perform 10,000 connections considerably.
      Prior to this patch, a bad run of cmtime would look like this:
      
      connect      :    12399.26   12351.10    8609.00    1239.93
      
      With this patch, it looks more like this:
      
      connect      :     5864.86    5799.80    8876.00     586.49
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      be9130cc
    • Doug Ledford's avatar
      IB/cma: Use cached gids · 29f27e84
      Doug Ledford authored
      The cma_acquire_dev function was changed by commit 3c86aa70
      
      
      ("RDMA/cm: Add RDMA CM support for IBoE devices") to use find_gid_port()
      because multiport devices might have either IB or IBoE formatted gids.
      The old function assumed that all ports on the same device used the
      same GID format.
      
      However, when it was changed to use find_gid_port(), we inadvertently
      lost usage of the GID cache.  This turned out to be a very costly
      change.  In our testing, each iteration through each index of the GID
      table takes roughly 35us.  When you have multiple devices in a system,
      and the GID you are looking for is on one of the later devices, the
      code loops through all of the GID indexes on all of the early devices
      before it finally succeeds on the target device.  This pathological
      search behavior combined with 35us per GID table index retrieval
      results in results such as the following from the cmtime application
      that's part of the latest librdmacm git repo:
      
      ib1:
      step              total ms     max ms     min us  us / conn
      create id    :       29.42       0.04       1.00       2.94
      bind addr    :   186705.66      19.00   18556.00   18670.57
      resolve addr :       41.93       9.68     619.00       4.19
      resolve route:      486.93       0.48     101.00      48.69
      create qp    :     4021.95       6.18     330.00     402.20
      connect      :    68350.39   68588.17   24632.00    6835.04
      disconnect   :     1460.43     252.65-1862269.00     146.04
      destroy      :       41.16       0.04       2.00       4.12
      
      ib0:
      step              total ms     max ms     min us  us / conn
      create id    :       28.61       0.68       1.00       2.86
      bind addr    :     2178.86       2.95     201.00     217.89
      resolve addr :       51.26      16.85     845.00       5.13
      resolve route:      620.08       0.43      92.00      62.01
      create qp    :     3344.40       6.36     273.00     334.44
      connect      :     6435.99    6368.53    7844.00     643.60
      disconnect   :     5095.38     321.90     757.00     509.54
      destroy      :       37.13       0.02       2.00       3.71
      
      Clearly, both the bind address and connect operations suffer
      a huge penalty for being anything other than the default
      GID on the first port in the system.
      
      After applying this patch, the numbers now look like this:
      
      ib1:
      step              total ms     max ms     min us  us / conn
      create id    :       30.15       0.03       1.00       3.01
      bind addr    :       80.27       0.04       7.00       8.03
      resolve addr :       43.02      13.53     589.00       4.30
      resolve route:      482.90       0.45     100.00      48.29
      create qp    :     3986.55       5.80     330.00     398.66
      connect      :     7141.53    7051.29    5005.00     714.15
      disconnect   :     5038.85     193.63     918.00     503.88
      destroy      :       37.02       0.04       2.00       3.70
      
      ib0:
      step              total ms     max ms     min us  us / conn
      create id    :       34.27       0.05       1.00       3.43
      bind addr    :       26.45       0.04       1.00       2.64
      resolve addr :       38.25      10.54     760.00       3.82
      resolve route:      604.79       0.43      97.00      60.48
      create qp    :     3314.95       6.34     273.00     331.49
      connect      :    12399.26   12351.10    8609.00    1239.93
      disconnect   :     5096.76     270.72    1015.00     509.68
      destroy      :       37.10       0.03       2.00       3.71
      
      It's worth noting that we still suffer a bit of a penalty on
      connect to the wrong device, but the penalty is much less than
      it used to be.  Follow on patches deal with this penalty.
      
      Many thanks to Neil Horman for helping to track the source of
      slow function that allowed us to track down the fact that
      the original patch I mentioned above backed out cache usage
      and identify just how much that impacted the system.
      Signed-off-by: default avatarDoug Ledford <dledford@redhat.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      29f27e84
    • Eyal Perry's avatar
      RDMA/cma: Set IBoE SL (user-priority) by egress map when using vlans · eb072c4b
      Eyal Perry authored
      On top of commit 366cddb4
      
       "IB/rdma_cm: TOS <=> UP mapping for IBoE", add
      support for case vlan egress map is used.
      
      When the IBoE session is being set over a vlan, inherit the socket priority
      to vlan priority mapping which was configured for the vlan device egress map.
      Signed-off-by: default avatarEyal Perry <eyalpe@mellanox.com>
      Signed-off-by: default avatarAmir Vadai <amirv@mellanox.com>
      Signed-off-by: default avatarOr Gerlitz <ogerlitz@mellanox.com>
      Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
      eb072c4b
  10. 01 Oct, 2013 1 commit
  11. 12 Aug, 2013 1 commit
  12. 31 Jul, 2013 2 commits
  13. 30 Jul, 2013 1 commit
    • Paul Bolle's avatar
      RDMA/cma: Fix gcc warning · 8fb488d7
      Paul Bolle authored
      
      
      Building cma.o triggers this gcc warning:
      
          drivers/infiniband/core/cma.c: In function ‘rdma_resolve_addr’:
          drivers/infiniband/core/cma.c:465:23: warning: ‘port’ may be used uninitialized in this function [-Wmaybe-uninitialized]
          drivers/infiniband/core/cma.c:426:5: note: ‘port’ was declared here
      
      This is a false positive, as "port" will always be initialized if we're
      at "found". But if we assign to "id_priv->id.port_num" directly, we can
      drop "port". That will, obviously, silence gcc.
      Signed-off-by: default avatarPaul Bolle <pebolle@tiscali.nl>
      Signed-off-by: default avatarSean Hefty <sean.hefty@intel.com>
      Signed-off-by: default avatarRoland Dreier <roland@purestorage.com>
      8fb488d7
  14. 15 Jul, 2013 1 commit
  15. 21 Jun, 2013 7 commits
  16. 20 Jun, 2013 5 commits