Commit bd9e89f2 authored by Claudiu Manoil's avatar Claudiu Manoil Committed by David S. Miller
Browse files

gianfar: GRO_DROP is unlikely



The change is significant since it affects the rx hot path.
Paul observed and documented the effects at asm level, see
below:

"It turns out that it does make a difference, since gfar_process_frame
gets inlined, and so the increment code gets moved out of line (I have
marked the if statment with * and the increment code within "-----"):

  ------------------------- as is currently ------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       40 9e 00 1c     bne-    cr7,4d40
<gfar_clean_rx_ring+0x130>
        ----------------------------
     4d28:       81 3c 01 f8     lwz     r9,504(r28)
     4d2c:       81 5c 01 fc     lwz     r10,508(r28)
     4d30:       31 4a 00 01     addic   r10,r10,1
     4d34:       7d 29 01 94     addze   r9,r9
     4d38:       91 3c 01 f8     stw     r9,504(r28)
     4d3c:       91 5c 01 fc     stw     r10,508(r28)
        ----------------------------
     4d40:       a0 1f 00 24     lhz     r0,36(r31)
     4d44:       81 3f 00 00     lwz     r9,0(r31)
     4d48:       7f a4 eb 78     mr      r4,r29
     4d4c:       7f e3 fb 78     mr      r3,r31

  -------------------------- unlikely ------------------------
     4d14:       80 61 00 18     lwz     r3,24(r1)
     4d18:       7f c4 f3 78     mr      r4,r30
     4d1c:       48 00 00 01     bl      4d1c <gfar_clean_rx_ring+0x10c>
  *  4d20:       2f 83 00 04     cmpwi   cr7,r3,4
     4d24:       41 9e 03 94     beq-    cr7,50b8
<gfar_clean_rx_ring+0x4a8>
     4d28:       a0 1f 00 24     lhz     r0,36(r31)
     4d2c:       81 3f 00 00     lwz     r9,0(r31)
     4d30:       7f a4 eb 78     mr      r4,r29
     4d34:       7f e3 fb 78     mr      r3,r31
[...]
     50b8:       81 3c 01 f8     lwz     r9,504(r28)
     50bc:       81 5c 01 fc     lwz     r10,508(r28)
     50c0:       31 4a 00 01     addic   r10,r10,1
     50c4:       7d 29 01 94     addze   r9,r9
     50c8:       91 3c 01 f8     stw     r9,504(r28)
     50cc:       91 5c 01 fc     stw     r10,508(r28)
     50d0:       4b ff fc 58     b       4d28 <gfar_clean_rx_ring+0x118>

So, the increment does actually get moved ~1k away."

Cc: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: default avatarClaudiu Manoil <claudiu.manoil@freescale.com>
Signed-off-by: default avatarDavid S. Miller <davem@davemloft.net>
parent b597d20d
...@@ -2740,7 +2740,7 @@ static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb, ...@@ -2740,7 +2740,7 @@ static int gfar_process_frame(struct net_device *dev, struct sk_buff *skb,
/* Send the packet up the stack */ /* Send the packet up the stack */
ret = napi_gro_receive(napi, skb); ret = napi_gro_receive(napi, skb);
if (GRO_DROP == ret) if (unlikely(GRO_DROP == ret))
atomic64_inc(&priv->extra_stats.kernel_dropped); atomic64_inc(&priv->extra_stats.kernel_dropped);
return 0; return 0;
......
Supports Markdown
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment