浏览代码

vxge: prefetch RxD descriptors

This patch prefetches RxD descriptors which helps to lower the latency of a
cache miss in vxge_hw_ring_rxd_next_completed.  This lowers the % of CPU
time used by vxge_hw_ring_rxd_next_completed() where the descriptor is
accessed in profiling netperf on a P4 Xeon from 1.5% to 1.0%.

Signed-off-by: Benjamin LaHaise <ben.lahaise@neterion.com>
Signed-off-by: Sreenivasa Honnur <sreenivasa.honnur@neterion.com>
Signed-off-by: Ramkrishna Vepa <ram.vepa@neterion.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Benjamin LaHaise 16 年之前
父节点
当前提交
3f23e436d2
共有 2 个文件被更改,包括 2 次插入0 次删除
  1. 1 0
      drivers/net/vxge/vxge-main.c
  2. 1 0
      drivers/net/vxge/vxge-traffic.c

+ 1 - 0
drivers/net/vxge/vxge-main.c

@@ -445,6 +445,7 @@ vxge_rx_1b_compl(struct __vxge_hw_ring *ringh, void *dtr,
 	vxge_hw_ring_replenish(ringh, 0);
 
 	do {
+		prefetch((char *)dtr + L1_CACHE_BYTES);
 		rx_priv = vxge_hw_ring_rxd_private_get(dtr);
 		skb = rx_priv->skb;
 		data_size = rx_priv->data_size;

+ 1 - 0
drivers/net/vxge/vxge-traffic.c

@@ -731,6 +731,7 @@ vxge_hw_channel_dtr_try_complete(struct __vxge_hw_channel *channel, void **dtrh)
 	vxge_assert(channel->compl_index < channel->length);
 
 	*dtrh =	channel->work_arr[channel->compl_index];
+	prefetch(*dtrh);
 }
 
 /*