Kyle McMartin fe72140
From df43fae25437d7bc7dfff72599c1e825038b67cf Mon Sep 17 00:00:00 2001
From: Mel Gorman <mel@csn.ul.ie>
Date: Wed, 24 Nov 2010 22:18:23 -0500
Subject: [PATCH 1/2] mm: page allocator: Adjust the per-cpu counter threshold when memory is low

Commit aa45484 ("calculate a better estimate of NR_FREE_PAGES when memory
is low") noted that watermarks were based on the vmstat NR_FREE_PAGES.  To
avoid synchronization overhead, these counters are maintained on a per-cpu
basis and drained both periodically and when a per-cpu delta exceeds a
threshold.  On systems with many CPUs, the difference between the estimate
and the real value of NR_FREE_PAGES can be very high.  The system can get
into a state where pages are allocated far below the min watermark,
potentially causing livelock.  That commit solved the problem by taking a
more accurate reading of NR_FREE_PAGES when memory was low.

Unfortunately, as reported by Shaohua Li, this accurate reading can consume
a large amount of CPU time on systems with many sockets due to cache line
bouncing.  This patch takes a different approach.  On large machines where
counter drift might be unsafe, the per-cpu thresholds for the target pgdat
are reduced while kswapd is awake, limiting the drift to what should be a
safe level.  This incurs a performance penalty under heavy memory pressure
by a factor that depends on the workload and the machine, but the machine
should function correctly without accidentally exhausting all memory on a
node.  There is an additional cost when kswapd wakes and sleeps, but the
event is not expected to be frequent - in Shaohua's test case, at least one
sleep and wake event was recorded.

To ensure that kswapd wakes up, a safe version of zone_watermark_ok() is
introduced that takes a more accurate reading of NR_FREE_PAGES when called
from wakeup_kswapd, when deciding whether it is really safe to go back to
sleep in sleeping_prematurely() and when deciding if a zone is really
balanced or not in balance_pgdat().  We are still using an expensive
function but limiting how often it is called.

When the test case is reproduced, the time spent in the watermark functions
is reduced.  The following report shows the cumulative percentage of time
spent in the functions zone_nr_free_pages(), zone_watermark_ok(),
__zone_watermark_ok(), zone_watermark_ok_safe(),
zone_page_state_snapshot() and zone_page_state():

vanilla                      11.6615%
disable-threshold            0.2584%

Reported-by: Shaohua Li <shaohua.li@intel.com>
Signed-off-by: Mel Gorman <mel@csn.ul.ie>
Reviewed-by: Christoph Lameter <cl@linux.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
[http://userweb.kernel.org/~akpm/mmotm/broken-out/mm-page-allocator-adjust-the-per-cpu-counter-threshold-when-memory-is-low.patch]
---
 include/linux/mmzone.h |   10 ++-----
 include/linux/vmstat.h |    5 +++
 mm/mmzone.c            |   21 ---------------
 mm/page_alloc.c        |   35 +++++++++++++++++++-----
 mm/vmscan.c            |   23 +++++++++-------
 mm/vmstat.c            |   68 +++++++++++++++++++++++++++++++++++++++++++++++-
 6 files changed, 115 insertions(+), 47 deletions(-)

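The key idea behind the reduced "pressure" threshold is that worst-case
counter drift is roughly (per-cpu threshold * number of CPUs), so while
kswapd is awake the threshold is shrunk until that worst case fits inside
the gap between the low and min watermarks.  A minimal userspace sketch of
the patch's calculate_pressure_threshold() arithmetic (the function name
pressure_threshold() and all values below are illustrative, not the
kernel's):

```c
#include <assert.h>

/*
 * Userspace sketch of the pressure-threshold calculation in this patch.
 * low_wmark, min_wmark and ncpus stand in for low_wmark_pages(zone),
 * min_wmark_pages(zone) and num_online_cpus().
 */
static int pressure_threshold(int low_wmark, int min_wmark, int ncpus)
{
	/*
	 * Worst-case drift is about threshold * ncpus, so pick a
	 * threshold small enough that total drift cannot silently eat
	 * the distance between the low and min watermarks.
	 */
	int watermark_distance = low_wmark - min_wmark;
	int threshold = watermark_distance / ncpus;

	if (threshold < 1)
		threshold = 1;	/* a threshold of 0 makes no sense */
	if (threshold > 125)
		threshold = 125;	/* same cap as the normal threshold */
	return threshold;
}
```

With a 200-page watermark gap and 16 CPUs this yields a threshold of 12,
so even fully drifted counters overstate free pages by at most ~192 pages.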
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 3984c4e..8d789d7 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -448,12 +448,6 @@ static inline int zone_is_oom_locked(const struct zone *zone)
 	return test_bit(ZONE_OOM_LOCKED, &zone->flags);
 }
 
-#ifdef CONFIG_SMP
-unsigned long zone_nr_free_pages(struct zone *zone);
-#else
-#define zone_nr_free_pages(zone) zone_page_state(zone, NR_FREE_PAGES)
-#endif /* CONFIG_SMP */
-
 /*
  * The "priority" of VM scanning is how much of the queues we will scan in one
  * go. A value of 12 for DEF_PRIORITY implies that we will scan 1/4096th of the
@@ -651,7 +645,9 @@ typedef struct pglist_data {
 extern struct mutex zonelists_mutex;
 void build_all_zonelists(void *data);
 void wakeup_kswapd(struct zone *zone, int order);
-int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
+bool zone_watermark_ok(struct zone *z, int order, unsigned long mark,
+		int classzone_idx, int alloc_flags);
+bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
 		int classzone_idx, int alloc_flags);
 enum memmap_context {
 	MEMMAP_EARLY,
diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index eaaea37..e4cc21c 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -254,6 +254,8 @@ extern void dec_zone_state(struct zone *, enum zone_stat_item);
 extern void __dec_zone_state(struct zone *, enum zone_stat_item);
 
 void refresh_cpu_vm_stats(int);
+void reduce_pgdat_percpu_threshold(pg_data_t *pgdat);
+void restore_pgdat_percpu_threshold(pg_data_t *pgdat);
 #else /* CONFIG_SMP */
 
 /*
@@ -298,6 +300,9 @@ static inline void __dec_zone_page_state(struct page *page,
 #define dec_zone_page_state __dec_zone_page_state
 #define mod_zone_page_state __mod_zone_page_state
 
+static inline void reduce_pgdat_percpu_threshold(pg_data_t *pgdat) { }
+static inline void restore_pgdat_percpu_threshold(pg_data_t *pgdat) { }
+
 static inline void refresh_cpu_vm_stats(int cpu) { }
 #endif
 
diff --git a/mm/mmzone.c b/mm/mmzone.c
index e35bfb8..f5b7d17 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -87,24 +87,3 @@ int memmap_valid_within(unsigned long pfn,
 	return 1;
 }
 #endif /* CONFIG_ARCH_HAS_HOLES_MEMORYMODEL */
-
-#ifdef CONFIG_SMP
-/* Called when a more accurate view of NR_FREE_PAGES is needed */
-unsigned long zone_nr_free_pages(struct zone *zone)
-{
-	unsigned long nr_free_pages = zone_page_state(zone, NR_FREE_PAGES);
-
-	/*
-	 * While kswapd is awake, it is considered the zone is under some
-	 * memory pressure. Under pressure, there is a risk that
-	 * per-cpu-counter-drift will allow the min watermark to be breached
-	 * potentially causing a live-lock. While kswapd is awake and
-	 * free pages are low, get a better estimate for free pages
-	 */
-	if (nr_free_pages < zone->percpu_drift_mark &&
-			!waitqueue_active(&zone->zone_pgdat->kswapd_wait))
-		return zone_page_state_snapshot(zone, NR_FREE_PAGES);
-
-	return nr_free_pages;
-}
-#endif /* CONFIG_SMP */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index f12ad18..0286150 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1454,24 +1454,24 @@ static inline int should_fail_alloc_page(gfp_t gfp_mask, unsigned int order)
 #endif /* CONFIG_FAIL_PAGE_ALLOC */
 
 /*
- * Return 1 if free pages are above 'mark'. This takes into account the order
+ * Return true if free pages are above 'mark'. This takes into account the order
  * of the allocation.
  */
-int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
-		      int classzone_idx, int alloc_flags)
+static bool __zone_watermark_ok(struct zone *z, int order, unsigned long mark,
+		      int classzone_idx, int alloc_flags, long free_pages)
 {
 	/* free_pages my go negative - that's OK */
 	long min = mark;
-	long free_pages = zone_nr_free_pages(z) - (1 << order) + 1;
 	int o;
 
+	free_pages -= (1 << order) + 1;
 	if (alloc_flags & ALLOC_HIGH)
 		min -= min / 2;
 	if (alloc_flags & ALLOC_HARDER)
 		min -= min / 4;
 
 	if (free_pages <= min + z->lowmem_reserve[classzone_idx])
-		return 0;
+		return false;
 	for (o = 0; o < order; o++) {
 		/* At the next order, this order's pages become unavailable */
 		free_pages -= z->free_area[o].nr_free << o;
@@ -1480,9 +1480,28 @@ int zone_watermark_ok(struct zone *z, int order, unsigned long mark,
 		min >>= 1;
 
 		if (free_pages <= min)
-			return 0;
+			return false;
 	}
-	return 1;
+	return true;
+}
+
+bool zone_watermark_ok(struct zone *z, int order, unsigned long mark,
+		      int classzone_idx, int alloc_flags)
+{
+	return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
+					zone_page_state(z, NR_FREE_PAGES));
+}
+
+bool zone_watermark_ok_safe(struct zone *z, int order, unsigned long mark,
+		      int classzone_idx, int alloc_flags)
+{
+	long free_pages = zone_page_state(z, NR_FREE_PAGES);
+
+	if (z->percpu_drift_mark && free_pages < z->percpu_drift_mark)
+		free_pages = zone_page_state_snapshot(z, NR_FREE_PAGES);
+
+	return __zone_watermark_ok(z, order, mark, classzone_idx, alloc_flags,
+								free_pages);
 }
 
 #ifdef CONFIG_NUMA
@@ -2436,7 +2455,7 @@ void show_free_areas(void)
 			" all_unreclaimable? %s"
 			"\n",
 			zone->name,
-			K(zone_nr_free_pages(zone)),
+			K(zone_page_state(zone, NR_FREE_PAGES)),
 			K(min_wmark_pages(zone)),
 			K(low_wmark_pages(zone)),
 			K(high_wmark_pages(zone)),
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5dfabf..3e71cb1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2082,7 +2082,7 @@ static int sleeping_prematurely(pg_data_t *pgdat, int order, long remaining)
 		if (zone->all_unreclaimable)
 			continue;
 
-		if (!zone_watermark_ok(zone, order, high_wmark_pages(zone),
+		if (!zone_watermark_ok_safe(zone, order, high_wmark_pages(zone),
 								0, 0))
 			return 1;
 	}
@@ -2169,7 +2169,7 @@ loop_again:
 				shrink_active_list(SWAP_CLUSTER_MAX, zone,
 							&sc, priority, 0);
 
-			if (!zone_watermark_ok(zone, order,
+			if (!zone_watermark_ok_safe(zone, order,
 					high_wmark_pages(zone), 0, 0)) {
 				end_zone = i;
 				break;
@@ -2215,7 +2215,7 @@ loop_again:
 			 * We put equal pressure on every zone, unless one
 			 * zone has way too many pages free already.
 			 */
-			if (!zone_watermark_ok(zone, order,
+			if (!zone_watermark_ok_safe(zone, order,
 					8*high_wmark_pages(zone), end_zone, 0))
 				shrink_zone(priority, zone, &sc);
 			reclaim_state->reclaimed_slab = 0;
@@ -2236,7 +2236,7 @@ loop_again:
 			    total_scanned > sc.nr_reclaimed + sc.nr_reclaimed / 2)
 				sc.may_writepage = 1;
 
-			if (!zone_watermark_ok(zone, order,
+			if (!zone_watermark_ok_safe(zone, order,
 					high_wmark_pages(zone), end_zone, 0)) {
 				all_zones_ok = 0;
 				/*
@@ -2244,7 +2244,7 @@ loop_again:
 				 * means that we have a GFP_ATOMIC allocation
 				 * failure risk. Hurry up!
 				 */
-				if (!zone_watermark_ok(zone, order,
+				if (!zone_watermark_ok_safe(zone, order,
 					    min_wmark_pages(zone), end_zone, 0))
 					has_under_min_watermark_zone = 1;
 			}
@@ -2378,7 +2378,9 @@ static int kswapd(void *p)
 				 */
 				if (!sleeping_prematurely(pgdat, order, remaining)) {
 					trace_mm_vmscan_kswapd_sleep(pgdat->node_id);
+					restore_pgdat_percpu_threshold(pgdat);
 					schedule();
+					reduce_pgdat_percpu_threshold(pgdat);
 				} else {
 					if (remaining)
 						count_vm_event(KSWAPD_LOW_WMARK_HIT_QUICKLY);
@@ -2417,16 +2419,17 @@ void wakeup_kswapd(struct zone *zone, int order)
 	if (!populated_zone(zone))
 		return;
 
-	pgdat = zone->zone_pgdat;
-	if (zone_watermark_ok(zone, order, low_wmark_pages(zone), 0, 0))
+	if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
 		return;
+	pgdat = zone->zone_pgdat;
 	if (pgdat->kswapd_max_order < order)
 		pgdat->kswapd_max_order = order;
-	trace_mm_vmscan_wakeup_kswapd(pgdat->node_id, zone_idx(zone), order);
-	if (!cpuset_zone_allowed_hardwall(zone, GFP_KERNEL))
-		return;
 	if (!waitqueue_active(&pgdat->kswapd_wait))
 		return;
+	if (zone_watermark_ok_safe(zone, order, low_wmark_pages(zone), 0, 0))
+		return;
+
+	trace_mm_vmscan_wakeup_kswapd(pgdat->node_id, zone_idx(zone), order);
 	wake_up_interruptible(&pgdat->kswapd_wait);
 }
 
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 355a9e6..4d7faeb 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -81,6 +81,30 @@ EXPORT_SYMBOL(vm_stat);
 
 #ifdef CONFIG_SMP
 
+static int calculate_pressure_threshold(struct zone *zone)
+{
+	int threshold;
+	int watermark_distance;
+
+	/*
+	 * As vmstats are not up to date, there is drift between the estimated
+	 * and real values. For high thresholds and a high number of CPUs, it
+	 * is possible for the min watermark to be breached while the estimated
+	 * value looks fine. The pressure threshold is a reduced value such
+	 * that even the maximum amount of drift will not accidentally breach
+	 * the min watermark
+	 */
+	watermark_distance = low_wmark_pages(zone) - min_wmark_pages(zone);
+	threshold = max(1, (int)(watermark_distance / num_online_cpus()));
+
+	/*
+	 * Maximum threshold is 125
+	 */
+	threshold = min(125, threshold);
+
+	return threshold;
+}
+
 static int calculate_threshold(struct zone *zone)
 {
 	int threshold;
@@ -159,6 +183,48 @@ static void refresh_zone_stat_thresholds(void)
 	}
 }
 
+void reduce_pgdat_percpu_threshold(pg_data_t *pgdat)
+{
+	struct zone *zone;
+	int cpu;
+	int threshold;
+	int i;
+
+	get_online_cpus();
+	for (i = 0; i < pgdat->nr_zones; i++) {
+		zone = &pgdat->node_zones[i];
+		if (!zone->percpu_drift_mark)
+			continue;
+
+		threshold = calculate_pressure_threshold(zone);
+		for_each_online_cpu(cpu)
+			per_cpu_ptr(zone->pageset, cpu)->stat_threshold
+							= threshold;
+	}
+	put_online_cpus();
+}
+
+void restore_pgdat_percpu_threshold(pg_data_t *pgdat)
+{
+	struct zone *zone;
+	int cpu;
+	int threshold;
+	int i;
+
+	get_online_cpus();
+	for (i = 0; i < pgdat->nr_zones; i++) {
+		zone = &pgdat->node_zones[i];
+		if (!zone->percpu_drift_mark)
+			continue;
+
+		threshold = calculate_threshold(zone);
+		for_each_online_cpu(cpu)
+			per_cpu_ptr(zone->pageset, cpu)->stat_threshold
+							= threshold;
+	}
+	put_online_cpus();
+}
+
 /*
  * For use when we know that interrupts are disabled.
  */
@@ -826,7 +892,7 @@ static void zoneinfo_show_print(struct seq_file *m, pg_data_t *pgdat,
 		   "\n        scanned  %lu"
 		   "\n        spanned  %lu"
 		   "\n        present  %lu",
-		   zone_nr_free_pages(zone),
+		   zone_page_state(zone, NR_FREE_PAGES),
 		   min_wmark_pages(zone),
 		   low_wmark_pages(zone),
 		   high_wmark_pages(zone),
-- 
1.7.3.2
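The zone_watermark_ok_safe() path added by this patch follows a simple rule:
use the cheap global counter unless it has fallen below percpu_drift_mark,
in which case fold in the undrained per-cpu deltas before testing the
watermark.  A userspace sketch of that decision (struct fake_zone and all
field names here are illustrative stand-ins, not the kernel's layout):

```c
#include <assert.h>

#define NR_FAKE_CPUS 4

/*
 * The global counter may lag the truth by up to
 * (stat_threshold * nr_cpus) pages, so below the drift mark we pay
 * for an accurate sum, mirroring zone_page_state_snapshot().
 */
struct fake_zone {
	long nr_free_estimate;		/* possibly stale global counter */
	long drift_mark;		/* stands in for percpu_drift_mark */
	long pcp_diff[NR_FAKE_CPUS];	/* undrained per-cpu deltas */
};

/* Accurate reading: global counter plus all per-cpu deltas. */
static long snapshot_free_pages(const struct fake_zone *z)
{
	long sum = z->nr_free_estimate;
	int cpu;

	for (cpu = 0; cpu < NR_FAKE_CPUS; cpu++)
		sum += z->pcp_diff[cpu];
	return sum;
}

/* Free-page figure a "safe" watermark check would use. */
static long safe_free_pages(const struct fake_zone *z)
{
	if (z->drift_mark && z->nr_free_estimate < z->drift_mark)
		return snapshot_free_pages(z);	/* expensive, accurate */
	return z->nr_free_estimate;		/* cheap, possibly stale */
}
```

This is why the expensive snapshot is only taken on the slow paths
(wakeup_kswapd, sleeping_prematurely, balance_pgdat): above the drift mark
the cheap estimate is already safe, so the fast allocation path never pays
for the summation.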