浏览代码

[PATCH] Fix NUMA node sizing in nr_free_zone_pages

We are iterating over all nodes in nr_free_zone_pages().  Because the
fallback zonelists contain all nodes in the system, and we walk all the
zonelists, we're counting memory multiple times (once for each node).  This
caused us to make a size estimate of 32GB for an 8GB AMD64 box, which makes
all the dirty ratio calculations, etc incorrect.

There's still a further bug to fix from e820 holes causing overestimation
as well, but this fix is separate, and good as is, and fixes one class of
problems.  Problem found by Badari, and tested by Ram Pai - thanks!

Signed-off-by:  Martin J. Bligh <mbligh@mbligh.org>
Signed-off-by:  Matt Dobson <colpatch@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Martin J. Bligh 20 年之前
父节点
当前提交
e310fd4325
共有 1 个文件被更改,包括 10 次插入11 次删除
  1. 10 11
      mm/page_alloc.c

+ 10 - 11
mm/page_alloc.c

@@ -1061,20 +1061,19 @@ unsigned int nr_free_pages_pgdat(pg_data_t *pgdat)
 
 static unsigned int nr_free_zone_pages(int offset)
 {
-	pg_data_t *pgdat;
+	/* Just pick one node, since fallback list is circular */
+	pg_data_t *pgdat = NODE_DATA(numa_node_id());
 	unsigned int sum = 0;
 
-	for_each_pgdat(pgdat) {
-		struct zonelist *zonelist = pgdat->node_zonelists + offset;
-		struct zone **zonep = zonelist->zones;
-		struct zone *zone;
+	struct zonelist *zonelist = pgdat->node_zonelists + offset;
+	struct zone **zonep = zonelist->zones;
+	struct zone *zone;
 
-		for (zone = *zonep++; zone; zone = *zonep++) {
-			unsigned long size = zone->present_pages;
-			unsigned long high = zone->pages_high;
-			if (size > high)
-				sum += size - high;
-		}
+	for (zone = *zonep++; zone; zone = *zonep++) {
+		unsigned long size = zone->present_pages;
+		unsigned long high = zone->pages_high;
+		if (size > high)
+			sum += size - high;
 	}
 
 	return sum;