
intel-iommu: Speed up map routines by using cached domain ASAP

The map routines already ended up using the cached domain, but only at the
bottom of a long stack of function calls. Add an inline wrapper,
get_valid_domain_for_dev(), which checks the cached domain _first_ and makes
the out-of-line call only if it is not already set.
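
As a rough userspace illustration of the pattern described above (an inline fast path that returns a cached pointer, with an out-of-line fallback), here is a minimal sketch. The types `struct domain`, `struct dev_info` and `struct device`, the functions `lookup_domain()` and `lookup_domain_slow()`, and the `slow_calls` counter are hypothetical stand-ins, not the kernel's definitions; `likely()` is defined locally via the GCC/Clang `__builtin_expect` builtin.

```c
#include <stddef.h>

/* Branch-prediction hint; assumes GCC or Clang (hypothetical local macro). */
#define likely(x) __builtin_expect(!!(x), 1)

/* Hypothetical stand-ins for the domain, the per-device info, and the device. */
struct domain { int id; };
struct dev_info { struct domain *domain; };
struct device { struct dev_info *info; };   /* info == NULL means "not cached yet" */

static struct domain slow_domain = { 42 };
static struct dev_info slow_info = { &slow_domain };
static int slow_calls;                      /* counts how often the slow path runs */

/* Out-of-line slow path: stands in for the expensive lookup and fills the cache. */
static __attribute__((noinline)) struct domain *lookup_domain_slow(struct device *dev)
{
	slow_calls++;
	dev->info = &slow_info;
	return dev->info->domain;
}

/* Inline fast path: use the cached domain if present, else fall back. */
static inline struct domain *lookup_domain(struct device *dev)
{
	struct dev_info *info = dev->info;

	if (likely(info))
		return info->domain;

	return lookup_domain_slow(dev);
}
```

In this sketch the first call on a device with `info == NULL` takes the slow path once; every later call returns straight from the cached pointer without leaving the inline function.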

This cuts the average time for a 1-page intel_map_sg() from 5961 cycles
to 4812 cycles on my Lenovo x200s test box, a reduction of roughly 19%.
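
The improvement quoted above can be checked with a little arithmetic. The helpers below are a small sketch with hypothetical names (`percent_reduction()`, `speedup_ratio()`); they only restate the two cycle counts from the message in two common ways.

```c
/* Percentage reduction in cycle count going from `before` to `after`. */
static double percent_reduction(double before, double after)
{
	return (before - after) / before * 100.0;
}

/* How many times faster the new path is: old cycles divided by new cycles. */
static double speedup_ratio(double before, double after)
{
	return before / after;
}
```

With the figures from the message, `percent_reduction(5961, 4812)` comes to about 19.3 (1149 fewer cycles out of 5961), and `speedup_ratio(5961, 4812)` to about 1.24.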

Signed-off-by: David Woodhouse <David.Woodhouse@intel.com>
David Woodhouse, 16 years ago
commit 147202aa77
1 file changed, 13 additions, 2 deletions

drivers/pci/intel-iommu.c (+13 -2)

@@ -2455,8 +2455,7 @@ static struct iova *intel_alloc_iova(struct device *dev,
 	return iova;
 }
 
-static struct dmar_domain *
-get_valid_domain_for_dev(struct pci_dev *pdev)
+static struct dmar_domain *__get_valid_domain_for_dev(struct pci_dev *pdev)
 {
 	struct dmar_domain *domain;
 	int ret;
@@ -2484,6 +2483,18 @@ get_valid_domain_for_dev(struct pci_dev *pdev)
 	return domain;
 }
 
+static inline struct dmar_domain *get_valid_domain_for_dev(struct pci_dev *dev)
+{
+	struct device_domain_info *info;
+
+	/* No lock here, assumes no domain exit in normal case */
+	info = dev->dev.archdata.iommu;
+	if (likely(info))
+		return info->domain;
+
+	return __get_valid_domain_for_dev(dev);
+}
+
 static int iommu_dummy(struct pci_dev *pdev)
 {
 	return pdev->dev.archdata.iommu == DUMMY_DEVICE_DOMAIN_INFO;