14 years ago · 28d9bfc37c
--- a/CREDITS
+++ b/CREDITS
@@ -2365,8 +2365,6 @@ E: acme@redhat.com
 W: http://oops.ghostprotocols.net:81/blog/
 P: 1024D/9224DF01 D5DF E3BB E3C8 BCBB F8AD  841A B6AB 4681 9224 DF01
 D: IPX, LLC, DCCP, cyc2x, wl3501_cs, net/ hacks
-S: R. Brasílio Itiberê, 4270/1010 - Água Verde
-S: 80240-060 - Curitiba - Paraná
 S: Brazil
 
 N: Karsten Merker
--- a/Documentation/kernel-parameters.txt
+++ b/Documentation/kernel-parameters.txt
@@ -1579,20 +1579,12 @@ and is between 256 and 4096 characters. It is defined in the file
 
 	nmi_watchdog=	[KNL,BUGS=X86] Debugging features for SMP kernels
 			Format: [panic,][num]
-			Valid num: 0,1,2
+			Valid num: 0
 			0 - turn nmi_watchdog off
-			1 - use the IO-APIC timer for the NMI watchdog
-			2 - use the local APIC for the NMI watchdog using
-			a performance counter. Note: This will use one
-			performance counter and the local APIC's performance
-			vector.
 			When panic is specified, panic when an NMI watchdog
 			timeout occurs.
 			This is useful when you use a panic=... timeout and
 			need the box quickly up again.
-			Instead of 1 and 2 it is possible to use the following
-			symbolic names: lapic and ioapic
-			Example: nmi_watchdog=2 or nmi_watchdog=panic,lapic
 
 	netpoll.carrier_timeout=
 			[NET] Specifies amount of time (in seconds) that
--- a/Documentation/trace/events-power.txt
+++ b/Documentation/trace/events-power.txt
@@ -0,0 +1,90 @@
+
+			Subsystem Trace Points: power
+
+The power tracing system captures events related to power transitions
+within the kernel. Broadly speaking there are three major subheadings:
+
+  o Power state switch which reports events related to suspend (S-states),
+     cpuidle (C-states) and cpufreq (P-states)
+  o System clock related changes
+  o Power domains related changes and transitions
+
+This document describes what each of the tracepoints is and why they
+might be useful.
+
+Cf. include/trace/events/power.h for the events definitions.
+
+1. Power state switch events
+============================
+
+1.1 New trace API
+-----------------
+
+A 'cpu' event class gathers the CPU-related events: cpuidle and
+cpufreq.
+
+cpu_idle		"state=%lu cpu_id=%lu"
+cpu_frequency		"state=%lu cpu_id=%lu"
+
+A suspend event is used to indicate the system going in and out of the
+suspend mode:
+
+machine_suspend		"state=%lu"
+
+
+Note: the value of '-1' or '4294967295' for state means an exit from the current state,
+i.e. trace_cpu_idle(4, smp_processor_id()) means that the system
+enters the idle state 4, while trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id())
+means that the system exits the previous idle state.
+
+The event which has 'state=4294967295' in the trace is very important to the user
+space tools which are using it to detect the end of the current state, and so to
+correctly draw the state diagrams and to calculate accurate statistics etc.
+
+1.2 DEPRECATED trace API
+------------------------
+
+A new Kconfig option CONFIG_EVENT_POWER_TRACING_DEPRECATED with the default value of
+'y' has been created. This allows the legacy trace power API to be used conjointly
+with the new trace API.
+The Kconfig option, the old trace API (in include/trace/events/power.h) and the
+old trace points will disappear in a future release (namely 2.6.41).
+
+power_start		"type=%lu state=%lu cpu_id=%lu"
+power_frequency		"type=%lu state=%lu cpu_id=%lu"
+power_end		"cpu_id=%lu"
+
+The 'type' parameter takes one of these macros:
+ . POWER_NONE	= 0,
+ . POWER_CSTATE	= 1,	/* C-State */
+ . POWER_PSTATE	= 2,	/* Frequency change or DVFS */
+
+The 'state' parameter is set depending on the type:
+ . Target C-state for type=POWER_CSTATE,
+ . Target frequency for type=POWER_PSTATE,
+
+power_end is used to indicate the exit of a state, corresponding to the latest
+power_start event.
+
+2. Clocks events
+================
+The clock events are used for clock enable/disable and for
+clock rate change.
+
+clock_enable		"%s state=%lu cpu_id=%lu"
+clock_disable		"%s state=%lu cpu_id=%lu"
+clock_set_rate		"%s state=%lu cpu_id=%lu"
+
+The first parameter gives the clock name (e.g. "gpio1_iclk").
+The second parameter is '1' for enable, '0' for disable, the target
+clock rate for set_rate.
+
+3. Power domains events
+=======================
+The power domain events are used for power domains transitions
+
+power_domain_target	"%s state=%lu cpu_id=%lu"
+
+The first parameter gives the power domain name (e.g. "mpu_pwrdm").
+The second parameter is the power domain target state.
+
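Note on the new trace API documented in events-power.txt: a user-space tool has to treat 'state=4294967295' as the end of the current idle state. The following is a minimal sketch of that check, assuming the event payload has already been cut out of the trace line. The names idle_event, parse_idle_event, is_exit_event and PWR_EVENT_EXIT_U32 are hypothetical and are not part of this patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: the exit marker documented as 4294967295, i.e. 2^32 - 1. */
#define PWR_EVENT_EXIT_U32 UINT32_C(0xFFFFFFFF)

/* Hypothetical record for one parsed cpu_idle event. */
struct idle_event {
	uint32_t state;
	uint32_t cpu_id;
};

/*
 * Parse a cpu_idle payload such as "state=4 cpu_id=0".
 * Returns 0 on success, -1 if the text does not match the format.
 */
static int parse_idle_event(const char *payload, struct idle_event *ev)
{
	unsigned long state, cpu_id;

	assert(payload != NULL && ev != NULL);
	if (sscanf(payload, "state=%lu cpu_id=%lu", &state, &cpu_id) != 2)
		return -1;
	ev->state = (uint32_t)state;
	ev->cpu_id = (uint32_t)cpu_id;
	return 0;
}

/* Non-zero when the event marks the exit from the previous idle state. */
static int is_exit_event(const struct idle_event *ev)
{
	return ev->state == PWR_EVENT_EXIT_U32;
}
```

A tool could pair each entry event with the next exit event on the same cpu_id to compute the time spent in a state.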
			
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -4627,7 +4627,7 @@ PERFORMANCE EVENTS SUBSYSTEM
 
				 M:	Peter Zijlstra <a.p.zijlstra@chello.nl>
			
 
				 M:	Paul Mackerras <paulus@samba.org>
			
 
				 M:	Ingo Molnar <mingo@elte.hu>
			
 
				-M:	Arnaldo Carvalho de Melo <acme@redhat.com>
			
 
				+M:	Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
			
 
				 S:	Supported
			
 
				 F:	kernel/perf_event*.c
			
 
				 F:	include/linux/perf_event.h
			
--- a/arch/alpha/include/asm/perf_event.h
+++ b/arch/alpha/include/asm/perf_event.h
@@ -1,10 +1,4 @@
 
				 #ifndef __ASM_ALPHA_PERF_EVENT_H
			
 
				 #define __ASM_ALPHA_PERF_EVENT_H
			
 
				 
			
 
				-#ifdef CONFIG_PERF_EVENTS
			
 
				-extern void init_hw_perf_events(void);
			
 
				-#else
			
 
				-static inline void init_hw_perf_events(void)    { }
			
 
				-#endif
			
 
				-
			
 
				 #endif /* __ASM_ALPHA_PERF_EVENT_H */
			
--- a/arch/alpha/kernel/irq_alpha.c
+++ b/arch/alpha/kernel/irq_alpha.c
@@ -112,8 +112,6 @@ init_IRQ(void)
 
				 	wrent(entInt, 0);
			
 
				 
			
 
				 	alpha_mv.init_irq();
			
 
				-
			
 
				-	init_hw_perf_events();
			
 
				 }
			
 
				 
			
 
				 /*
			
--- a/arch/alpha/kernel/perf_event.c
+++ b/arch/alpha/kernel/perf_event.c
@@ -14,6 +14,7 @@
 
				 #include <linux/kernel.h>
			
 
				 #include <linux/kdebug.h>
			
 
				 #include <linux/mutex.h>
			
 
				+#include <linux/init.h>
			
 
				 
			
 
				 #include <asm/hwrpb.h>
			
 
				 #include <asm/atomic.h>
			
@@ -863,13 +864,13 @@ static void alpha_perf_event_irq_handler(unsigned long la_ptr,
 
				 /*
			
 
				  * Init call to initialise performance events at kernel startup.
			
 
				  */
			
 
				-void __init init_hw_perf_events(void)
			
 
				+int __init init_hw_perf_events(void)
			
 
				 {
			
 
				 	pr_info("Performance events: ");
			
 
				 
			
 
				 	if (!supported_cpu()) {
			
 
				 		pr_cont("No support for your CPU.\n");
			
 
				-		return;
			
 
				+		return 0;
			
 
				 	}
			
 
				 
			
 
				 	pr_cont("Supported CPU type!\n");
			
@@ -881,6 +882,8 @@ void __init init_hw_perf_events(void)
 
				 	/* And set up PMU specification */
			
 
				 	alpha_pmu = &ev67_pmu;
			
 
				 
			
 
				-	perf_pmu_register(&pmu);
			
 
				-}
			
 
				+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
			
 
				 
			
 
				+	return 0;
			
 
				+}
			
 
				+early_initcall(init_hw_perf_events);
			
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -3034,11 +3034,11 @@ init_hw_perf_events(void)
 
				 		pr_info("no hardware support available\n");
			
 
				 	}
			
 
				 
			
 
				-	perf_pmu_register(&pmu);
			
 
				+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
			
 
				 
			
 
				 	return 0;
			
 
				 }
			
 
				-arch_initcall(init_hw_perf_events);
			
 
				+early_initcall(init_hw_perf_events);
			
 
				 
			
 
				 /*
			
 
				  * Callchain handling code.
			
--- a/arch/mips/kernel/perf_event_mipsxx.c
+++ b/arch/mips/kernel/perf_event_mipsxx.c
@@ -1047,6 +1047,6 @@ init_hw_perf_events(void)
 
				 
			
 
				 	return 0;
			
 
				 }
			
 
				-arch_initcall(init_hw_perf_events);
			
 
				+early_initcall(init_hw_perf_events);
			
 
				 
			
 
				 #endif /* defined(CONFIG_CPU_MIPS32)... */
			
--- a/arch/powerpc/kernel/e500-pmu.c
+++ b/arch/powerpc/kernel/e500-pmu.c
@@ -126,4 +126,4 @@ static int init_e500_pmu(void)
 
				 	return register_fsl_emb_pmu(&e500_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_e500_pmu);
			
 
				+early_initcall(init_e500_pmu);
			
--- a/arch/powerpc/kernel/mpc7450-pmu.c
+++ b/arch/powerpc/kernel/mpc7450-pmu.c
@@ -414,4 +414,4 @@ static int init_mpc7450_pmu(void)
 
				 	return register_power_pmu(&mpc7450_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_mpc7450_pmu);
			
 
				+early_initcall(init_mpc7450_pmu);
			
--- a/arch/powerpc/kernel/perf_event.c
+++ b/arch/powerpc/kernel/perf_event.c
@@ -1379,7 +1379,7 @@ int register_power_pmu(struct power_pmu *pmu)
 
				 		freeze_events_kernel = MMCR0_FCHV;
			
 
				 #endif /* CONFIG_PPC64 */
			
 
				 
			
 
				-	perf_pmu_register(&power_pmu);
			
 
				+	perf_pmu_register(&power_pmu, "cpu", PERF_TYPE_RAW);
			
 
				 	perf_cpu_notifier(power_pmu_notifier);
			
 
				 
			
 
				 	return 0;
			
--- a/arch/powerpc/kernel/perf_event_fsl_emb.c
+++ b/arch/powerpc/kernel/perf_event_fsl_emb.c
@@ -681,7 +681,7 @@ int register_fsl_emb_pmu(struct fsl_emb_pmu *pmu)
 
				 	pr_info("%s performance monitor hardware support registered\n",
			
 
				 		pmu->name);
			
 
				 
			
 
				-	perf_pmu_register(&fsl_emb_pmu);
			
 
				+	perf_pmu_register(&fsl_emb_pmu, "cpu", PERF_TYPE_RAW);
			
 
				 
			
 
				 	return 0;
			
 
				 }
			
--- a/arch/powerpc/kernel/power4-pmu.c
+++ b/arch/powerpc/kernel/power4-pmu.c
@@ -613,4 +613,4 @@ static int init_power4_pmu(void)
 
				 	return register_power_pmu(&power4_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_power4_pmu);
			
 
				+early_initcall(init_power4_pmu);
			
--- a/arch/powerpc/kernel/power5+-pmu.c
+++ b/arch/powerpc/kernel/power5+-pmu.c
@@ -682,4 +682,4 @@ static int init_power5p_pmu(void)
 
				 	return register_power_pmu(&power5p_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_power5p_pmu);
			
 
				+early_initcall(init_power5p_pmu);
			
--- a/arch/powerpc/kernel/power5-pmu.c
+++ b/arch/powerpc/kernel/power5-pmu.c
@@ -621,4 +621,4 @@ static int init_power5_pmu(void)
 
				 	return register_power_pmu(&power5_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_power5_pmu);
			
 
				+early_initcall(init_power5_pmu);
			
--- a/arch/powerpc/kernel/power6-pmu.c
+++ b/arch/powerpc/kernel/power6-pmu.c
@@ -544,4 +544,4 @@ static int init_power6_pmu(void)
 
				 	return register_power_pmu(&power6_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_power6_pmu);
			
 
				+early_initcall(init_power6_pmu);
			
--- a/arch/powerpc/kernel/power7-pmu.c
+++ b/arch/powerpc/kernel/power7-pmu.c
@@ -369,4 +369,4 @@ static int init_power7_pmu(void)
 
				 	return register_power_pmu(&power7_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_power7_pmu);
			
 
				+early_initcall(init_power7_pmu);
			
--- a/arch/powerpc/kernel/ppc970-pmu.c
+++ b/arch/powerpc/kernel/ppc970-pmu.c
@@ -494,4 +494,4 @@ static int init_ppc970_pmu(void)
 
				 	return register_power_pmu(&ppc970_pmu);
			
 
				 }
			
 
				 
			
 
				-arch_initcall(init_ppc970_pmu);
			
 
				+early_initcall(init_ppc970_pmu);
			
--- a/arch/sh/kernel/cpu/sh4/perf_event.c
+++ b/arch/sh/kernel/cpu/sh4/perf_event.c
@@ -250,4 +250,4 @@ static int __init sh7750_pmu_init(void)
 
				 
			
 
				 	return register_sh_pmu(&sh7750_pmu);
			
 
				 }
			
 
				-arch_initcall(sh7750_pmu_init);
			
 
				+early_initcall(sh7750_pmu_init);
			
--- a/arch/sh/kernel/cpu/sh4a/perf_event.c
+++ b/arch/sh/kernel/cpu/sh4a/perf_event.c
@@ -284,4 +284,4 @@ static int __init sh4a_pmu_init(void)
 
				 
			
 
				 	return register_sh_pmu(&sh4a_pmu);
			
 
				 }
			
 
				-arch_initcall(sh4a_pmu_init);
			
 
				+early_initcall(sh4a_pmu_init);
			
--- a/arch/sh/kernel/perf_event.c
+++ b/arch/sh/kernel/perf_event.c
@@ -389,7 +389,7 @@ int __cpuinit register_sh_pmu(struct sh_pmu *_pmu)
 
				 
			
 
				 	WARN_ON(_pmu->num_events > MAX_HWEVENTS);
			
 
				 
			
 
				-	perf_pmu_register(&pmu);
			
 
				+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
			
 
				 	perf_cpu_notifier(sh_pmu_notifier);
			
 
				 	return 0;
			
 
				 }
			
--- a/arch/sparc/include/asm/perf_event.h
+++ b/arch/sparc/include/asm/perf_event.h
@@ -4,8 +4,6 @@
 
				 #ifdef CONFIG_PERF_EVENTS
			
 
				 #include <asm/ptrace.h>
			
 
				 
			
 
				-extern void init_hw_perf_events(void);
			
 
				-
			
 
				 #define perf_arch_fetch_caller_regs(regs, ip)		\
			
 
				 do {							\
			
 
				 	unsigned long _pstate, _asi, _pil, _i7, _fp;	\
			
@@ -26,8 +24,6 @@ do {							\
 
				 	(regs)->u_regs[UREG_I6] = _fp;			\
			
 
				 	(regs)->u_regs[UREG_I7] = _i7;			\
			
 
				 } while (0)
			
 
				-#else
			
 
				-static inline void init_hw_perf_events(void)	{ }
			
 
				 #endif
			
 
				 
			
 
				 #endif
			
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -270,8 +270,6 @@ int __init nmi_init(void)
 
				 			atomic_set(&nmi_active, -1);
			
 
				 		}
			
 
				 	}
			
 
				-	if (!err)
			
 
				-		init_hw_perf_events();
			
 
				 
			
 
				 	return err;
			
 
				 }
			
--- a/arch/sparc/kernel/perf_event.c
+++ b/arch/sparc/kernel/perf_event.c
@@ -1307,20 +1307,23 @@ static bool __init supported_pmu(void)
 
				 	return false;
			
 
				 }
			
 
				 
			
 
				-void __init init_hw_perf_events(void)
			
 
				+int __init init_hw_perf_events(void)
			
 
				 {
			
 
				 	pr_info("Performance events: ");
			
 
				 
			
 
				 	if (!supported_pmu()) {
			
 
				 		pr_cont("No support for PMU type '%s'\n", sparc_pmu_type);
			
 
				-		return;
			
 
				+		return 0;
			
 
				 	}
			
 
				 
			
 
				 	pr_cont("Supported PMU type is '%s'\n", sparc_pmu_type);
			
 
				 
			
 
				-	perf_pmu_register(&pmu);
			
 
				+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
			
 
				 	register_die_notifier(&perf_event_nmi_notifier);
			
 
				+
			
 
				+	return 0;
			
 
				 }
			
 
				+early_initcall(init_hw_perf_events);
			
 
				 
			
 
				 void perf_callchain_kernel(struct perf_callchain_entry *entry,
			
 
				 			   struct pt_regs *regs)
			
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -180,8 +180,15 @@ extern void *text_poke_early(void *addr, const void *opcode, size_t len);
 
				  * On the local CPU you need to be protected again NMI or MCE handlers seeing an
			
 
				  * inconsistent instruction while you patch.
			
 
				  */
			
 
				+struct text_poke_param {
			
 
				+	void *addr;
			
 
				+	const void *opcode;
			
 
				+	size_t len;
			
 
				+};
			
 
				+
			
 
				 extern void *text_poke(void *addr, const void *opcode, size_t len);
			
 
				 extern void *text_poke_smp(void *addr, const void *opcode, size_t len);
			
 
				+extern void text_poke_smp_batch(struct text_poke_param *params, int n);
			
 
				 
			
 
				 #if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
			
 
				 #define IDEAL_NOP_SIZE_5 5
			
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -15,10 +15,6 @@ static inline int irq_canonicalize(int irq)
 
				 	return ((irq == 2) ? 9 : irq);
			
 
				 }
			
 
				 
			
 
				-#ifdef CONFIG_X86_LOCAL_APIC
			
 
				-# define ARCH_HAS_NMI_WATCHDOG
			
 
				-#endif
			
 
				-
			
 
				 #ifdef CONFIG_X86_32
			
 
				 extern void irq_ctx_init(int cpu);
			
 
				 #else
			
--- a/arch/x86/include/asm/kdebug.h
+++ b/arch/x86/include/asm/kdebug.h
@@ -28,7 +28,7 @@ extern void die(const char *, struct pt_regs *,long);
 
				 extern int __must_check __die(const char *, struct pt_regs *, long);
			
 
				 extern void show_registers(struct pt_regs *regs);
			
 
				 extern void show_trace(struct task_struct *t, struct pt_regs *regs,
			
 
				-		       unsigned long *sp, unsigned long bp);
			
 
				+		       unsigned long *sp);
			
 
				 extern void __show_regs(struct pt_regs *regs, int all);
			
 
				 extern void show_regs(struct pt_regs *regs);
			
 
				 extern unsigned long oops_begin(void);
			
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -123,6 +123,10 @@
 
				 #define MSR_AMD64_IBSCTL		0xc001103a
			
 
				 #define MSR_AMD64_IBSBRTARGET		0xc001103b
			
 
				 
			
 
				+/* Fam 15h MSRs */
			
 
				+#define MSR_F15H_PERF_CTL		0xc0010200
			
 
				+#define MSR_F15H_PERF_CTR		0xc0010201
			
 
				+
			
 
				 /* Fam 10h MSRs */
			
 
				 #define MSR_FAM10H_MMIO_CONF_BASE	0xc0010058
			
 
				 #define FAM10H_MMIO_CONF_ENABLE		(1<<0)
			
--- a/arch/x86/include/asm/nmi.h
+++ b/arch/x86/include/asm/nmi.h
@@ -5,41 +5,15 @@
 
				 #include <asm/irq.h>
			
 
				 #include <asm/io.h>
			
 
				 
			
 
				-#ifdef ARCH_HAS_NMI_WATCHDOG
			
 
				-
			
 
				-/**
			
 
				- * do_nmi_callback
			
 
				- *
			
 
				- * Check to see if a callback exists and execute it.  Return 1
			
 
				- * if the handler exists and was handled successfully.
			
 
				- */
			
 
				-int do_nmi_callback(struct pt_regs *regs, int cpu);
			
 
				+#ifdef CONFIG_X86_LOCAL_APIC
			
 
				 
			
 
				 extern void die_nmi(char *str, struct pt_regs *regs, int do_panic);
			
 
				-extern int check_nmi_watchdog(void);
			
 
				-#if !defined(CONFIG_LOCKUP_DETECTOR)
			
 
				-extern int nmi_watchdog_enabled;
			
 
				-#endif
			
 
				 extern int avail_to_resrv_perfctr_nmi_bit(unsigned int);
			
 
				 extern int reserve_perfctr_nmi(unsigned int);
			
 
				 extern void release_perfctr_nmi(unsigned int);
			
 
				 extern int reserve_evntsel_nmi(unsigned int);
			
 
				 extern void release_evntsel_nmi(unsigned int);
			
 
				 
			
 
				-extern void setup_apic_nmi_watchdog(void *);
			
 
				-extern void stop_apic_nmi_watchdog(void *);
			
 
				-extern void disable_timer_nmi_watchdog(void);
			
 
				-extern void enable_timer_nmi_watchdog(void);
			
 
				-extern int nmi_watchdog_tick(struct pt_regs *regs, unsigned reason);
			
 
				-extern void cpu_nmi_set_wd_enabled(void);
			
 
				-
			
 
				-extern atomic_t nmi_active;
			
 
				-extern unsigned int nmi_watchdog;
			
 
				-#define NMI_NONE	0
			
 
				-#define NMI_IO_APIC	1
			
 
				-#define NMI_LOCAL_APIC	2
			
 
				-#define NMI_INVALID	3
			
 
				-
			
 
				 struct ctl_table;
			
 
				 extern int proc_nmi_enabled(struct ctl_table *, int ,
			
 
				 			void __user *, size_t *, loff_t *);
			
@@ -47,33 +21,8 @@ extern int unknown_nmi_panic;
 
				 
			
 
				 void arch_trigger_all_cpu_backtrace(void);
			
 
				 #define arch_trigger_all_cpu_backtrace arch_trigger_all_cpu_backtrace
			
 
				-
			
 
				-static inline void localise_nmi_watchdog(void)
			
 
				-{
			
 
				-	if (nmi_watchdog == NMI_IO_APIC)
			
 
				-		nmi_watchdog = NMI_LOCAL_APIC;
			
 
				-}
			
 
				-
			
 
				-/* check if nmi_watchdog is active (ie was specified at boot) */
			
 
				-static inline int nmi_watchdog_active(void)
			
 
				-{
			
 
				-	/*
			
 
				-	 * actually it should be:
			
 
				-	 * 	return (nmi_watchdog == NMI_LOCAL_APIC ||
			
 
				-	 * 		nmi_watchdog == NMI_IO_APIC)
			
 
				-	 * but since they are power of two we could use a
			
 
				-	 * cheaper way --cvg
			
 
				-	 */
			
 
				-	return nmi_watchdog & (NMI_LOCAL_APIC | NMI_IO_APIC);
			
 
				-}
			
 
				 #endif
			
 
				 
			
 
				-void lapic_watchdog_stop(void);
			
 
				-int lapic_watchdog_init(unsigned nmi_hz);
			
 
				-int lapic_wd_event(unsigned nmi_hz);
			
 
				-unsigned lapic_adjust_nmi_hz(unsigned hz);
			
 
				-void disable_lapic_nmi_watchdog(void);
			
 
				-void enable_lapic_nmi_watchdog(void);
			
 
				 void stop_nmi(void);
			
 
				 void restart_nmi(void);
			
 
				 
			
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -125,7 +125,6 @@ union cpuid10_edx {
 
				 #define IBS_OP_MAX_CNT_EXT	0x007FFFFFULL	/* not a register bit mask */
			
 
				 
			
 
				 #ifdef CONFIG_PERF_EVENTS
			
 
				-extern void init_hw_perf_events(void);
			
 
				 extern void perf_events_lapic_init(void);
			
 
				 
			
 
				 #define PERF_EVENT_INDEX_OFFSET			0
			
@@ -156,7 +155,6 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 
				 }
			
 
				 
			
 
				 #else
			
 
				-static inline void init_hw_perf_events(void)		{ }
			
 
				 static inline void perf_events_lapic_init(void)	{ }
			
 
				 #endif
			
 
				 
			
--- a/arch/x86/include/asm/perf_event_p4.h
+++ b/arch/x86/include/asm/perf_event_p4.h
@@ -744,14 +744,6 @@ enum P4_ESCR_EMASKS {
 
				 };
			
 
				 
			
 
				 /*
			
 
				- * P4 PEBS specifics (Replay Event only)
			
 
				- *
			
 
				- * Format (bits):
			
 
				- *   0-6: metric from P4_PEBS_METRIC enum
			
 
				- *    7 : reserved
			
 
				- *    8 : reserved
			
 
				- * 9-11 : reserved
			
 
				- *
			
 
				  * Note we have UOP and PEBS bits reserved for now
			
 
				  * just in case if we will need them once
			
 
				  */
			
@@ -788,5 +780,60 @@ enum P4_PEBS_METRIC {
 	P4_PEBS_METRIC__max
 };
 
+/*
+ * Notes on internal configuration of ESCR+CCCR tuples
+ *
+ * Since P4 has quite a different architecture of
+ * performance registers compared with the "architectural"
+ * ones, and we have only 64 bits to keep the configuration
+ * of a performance event, the following trick is used.
+ *
+ * 1) Since both ESCR and CCCR registers have only low
+ *    32 bits valuable, we pack them into a single 64 bit
+ *    configuration. Low 32 bits of such config correspond
+ *    to low 32 bits of CCCR register and high 32 bits
+ *    correspond to low 32 bits of ESCR register.
+ *
+ * 2) The meaning of every bit of such config field can
+ *    be found in Intel SDM but it should be noted that
+ *    we "borrow" some reserved bits for own usage and
+ *    clean them or set to a proper value when we do
+ *    a real write to hardware registers.
+ *
+ * 3) The format of bits of config is the following
+ *    and should be either 0 or set to some predefined
+ *    values:
+ *
+ *    Low 32 bits
+ *    -----------
+ *      0-6: P4_PEBS_METRIC enum
+ *     7-11:                    reserved
+ *       12:                    reserved (Enable)
+ *    13-15:                    reserved (ESCR select)
+ *    16-17: Active Thread
+ *       18: Compare
+ *       19: Complement
+ *    20-23: Threshold
+ *       24: Edge
+ *       25:                    reserved (FORCE_OVF)
+ *       26:                    reserved (OVF_PMI_T0)
+ *       27:                    reserved (OVF_PMI_T1)
+ *    28-29:                    reserved
+ *       30:                    reserved (Cascade)
+ *       31:                    reserved (OVF)
+ *
+ *    High 32 bits
+ *    ------------
+ *        0:                    reserved (T1_USR)
+ *        1:                    reserved (T1_OS)
+ *        2:                    reserved (T0_USR)
+ *        3:                    reserved (T0_OS)
+ *        4: Tag Enable
+ *      5-8: Tag Value
+ *     9-24: Event Mask (may use P4_ESCR_EMASK_BIT helper)
+ *    25-30: enum P4_EVENTS
+ *       31:                    reserved (HT thread)
+ */
+
 #endif /* PERF_EVENT_P4_H */
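Note on the ESCR+CCCR layout described in perf_event_p4.h: the packing rule (CCCR in the low 32 bits, ESCR in the high 32 bits) can be illustrated with a small user-space sketch. The helper and macro names below are hypothetical and are not the kernel's own macros. Only the bit positions are taken from the comment in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; positions follow the "High 32 bits" table. */
#define ESCR_EVENT_SHIFT	25	/* bits 25-30: enum P4_EVENTS */
#define ESCR_EVENT_MASK		0x3Fu	/* 6 bits */
#define ESCR_EMASK_SHIFT	9	/* bits 9-24: Event Mask */
#define ESCR_EMASK_MASK		0xFFFFu	/* 16 bits */

/* Pack ESCR into the high word and CCCR into the low word. */
static uint64_t pack_config(uint32_t escr, uint32_t cccr)
{
	return ((uint64_t)escr << 32) | cccr;
}

static uint32_t unpack_escr(uint64_t config)
{
	return (uint32_t)(config >> 32);
}

static uint32_t unpack_cccr(uint64_t config)
{
	return (uint32_t)config;
}

/* Build an ESCR word from an event number and an event mask. */
static uint32_t make_escr(uint32_t event, uint32_t emask)
{
	assert(event <= ESCR_EVENT_MASK && emask <= ESCR_EMASK_MASK);
	return (event << ESCR_EVENT_SHIFT) | (emask << ESCR_EMASK_SHIFT);
}

/* Extract the event number (bits 25-30) from an ESCR word. */
static uint32_t escr_event(uint32_t escr)
{
	return (escr >> ESCR_EVENT_SHIFT) & ESCR_EVENT_MASK;
}
```

Keeping both halves in one 64-bit value lets a single config field carry the whole tuple. The reserved bits listed in the comment would still need to be cleared or set before a real register write, which this sketch does not attempt.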
			
 
				 
			
--- a/arch/x86/include/asm/smpboot_hooks.h
+++ b/arch/x86/include/asm/smpboot_hooks.h
@@ -48,7 +48,6 @@ static inline void __init smpboot_setup_io_apic(void)
 
				 		setup_IO_APIC();
			
 
				 	else {
			
 
				 		nr_ioapics = 0;
			
 
				-		localise_nmi_watchdog();
			
 
				 	}
			
 
				 #endif
			
 
				 }
			
--- a/arch/x86/include/asm/stacktrace.h
+++ b/arch/x86/include/asm/stacktrace.h
@@ -7,6 +7,7 @@
 
				 #define _ASM_X86_STACKTRACE_H
			
 
				 
			
 
				 #include <linux/uaccess.h>
			
 
				+#include <linux/ptrace.h>
			
 
				 
			
 
				 extern int kstack_depth_to_print;
			
 
				 
			
@@ -46,7 +47,7 @@ struct stacktrace_ops {
 
				 };
			
 
				 
			
 
				 void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
			
 
				-		unsigned long *stack, unsigned long bp,
			
 
				+		unsigned long *stack,
			
 
				 		const struct stacktrace_ops *ops, void *data);
			
 
				 
			
 
				 #ifdef CONFIG_X86_32
			
@@ -57,13 +58,39 @@ void dump_trace(struct task_struct *tsk, struct pt_regs *regs,
 #define get_bp(bp) asm("movq %%rbp, %0" : "=r" (bp) :)
 #endif
 
+#ifdef CONFIG_FRAME_POINTER
+static inline unsigned long
+stack_frame(struct task_struct *task, struct pt_regs *regs)
+{
+	unsigned long bp;
+
+	if (regs)
+		return regs->bp;
+
+	if (task == current) {
+		/* Grab bp right from our regs */
+		get_bp(bp);
+		return bp;
+	}
+
+	/* bp is the last reg pushed by switch_to */
+	return *(unsigned long *)task->thread.sp;
+}
+#else
+static inline unsigned long
+stack_frame(struct task_struct *task, struct pt_regs *regs)
+{
+	return 0;
+}
+#endif
+
 extern void
 show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		unsigned long *stack, unsigned long bp, char *log_lvl);
+		   unsigned long *stack, char *log_lvl);
 
 extern void
 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
-		unsigned long *sp, unsigned long bp, char *log_lvl);
+		   unsigned long *sp, char *log_lvl);
 
 extern unsigned int code_bytes;
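Note on stack_frame() in stacktrace.h: the helper picks the frame pointer from one of three sources, in order: the saved registers, the running CPU, or the sleeping task's stack. The selection order can be mimicked in user space as a rough sketch. The types mock_regs and mock_task and the function pick_frame_pointer are hypothetical stand-ins. __builtin_frame_address() is a GCC/Clang builtin used here in place of the kernel's get_bp().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for struct pt_regs and struct task_struct. */
struct mock_regs {
	unsigned long bp;
};

struct mock_task {
	unsigned long *saved_sp;	/* top of the saved stack, holds bp */
};

/*
 * Same decision order as stack_frame() in the patch:
 * 1. a register snapshot, if one was passed in;
 * 2. the live frame pointer, if the task is the current one;
 * 3. otherwise the word at the task's saved stack pointer.
 */
static unsigned long
pick_frame_pointer(const struct mock_task *task, const struct mock_task *cur,
		   const struct mock_regs *regs)
{
	assert(task != NULL);

	if (regs)
		return regs->bp;

	if (task == cur)
		return (unsigned long)(uintptr_t)__builtin_frame_address(0);

	return *task->saved_sp;
}
```

Moving this choice into one helper is what allows the patch to drop the explicit bp argument from dump_trace(), show_trace_log_lvl() and show_stack_log_lvl().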
			
 
				 
			
--- a/arch/x86/include/asm/timer.h
+++ b/arch/x86/include/asm/timer.h
@@ -10,12 +10,6 @@
 
				 unsigned long long native_sched_clock(void);
			
 
				 extern int recalibrate_cpu_khz(void);
			
 
				 
			
 
				-#if defined(CONFIG_X86_32) && defined(CONFIG_X86_IO_APIC)
			
 
				-extern int timer_ack;
			
 
				-#else
			
 
				-# define timer_ack (0)
			
 
				-#endif
			
 
				-
			
 
				 extern int no_timer_check;
			
 
				 
			
 
				 /* Accelerators for sched_clock()
			
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -591,17 +591,21 @@ static atomic_t stop_machine_first;
 
				 static int wrote_text;
			
 
				 
			
 
				 struct text_poke_params {
			
 
				-	void *addr;
			
 
				-	const void *opcode;
			
 
				-	size_t len;
			
 
				+	struct text_poke_param *params;
			
 
				+	int nparams;
			
 
				 };
			
 
				 
			
 
				 static int __kprobes stop_machine_text_poke(void *data)
			
 
				 {
			
 
				 	struct text_poke_params *tpp = data;
			
 
				+	struct text_poke_param *p;
			
 
				+	int i;
			
 
				 
			
 
				 	if (atomic_dec_and_test(&stop_machine_first)) {
			
 
				-		text_poke(tpp->addr, tpp->opcode, tpp->len);
			
 
				+		for (i = 0; i < tpp->nparams; i++) {
			
 
				+			p = &tpp->params[i];
			
 
				+			text_poke(p->addr, p->opcode, p->len);
			
 
				+		}
			
 
				 		smp_wmb();	/* Make sure other cpus see that this has run */
			
 
				 		wrote_text = 1;
			
 
				 	} else {
			
@@ -610,8 +614,12 @@ static int __kprobes stop_machine_text_poke(void *data)
 
				 		smp_mb();	/* Load wrote_text before following execution */
			
 
				 	}
			
 
				 
			
 
				-	flush_icache_range((unsigned long)tpp->addr,
			
 
				-			   (unsigned long)tpp->addr + tpp->len);
			
 
				+	for (i = 0; i < tpp->nparams; i++) {
			
 
				+		p = &tpp->params[i];
			
 
				+		flush_icache_range((unsigned long)p->addr,
			
 
				+				   (unsigned long)p->addr + p->len);
			
 
				+	}
			
 
				+
			
 
				 	return 0;
			
 
				 }
			
 
				 
			
@@ -631,10 +639,13 @@ static int __kprobes stop_machine_text_poke(void *data)
 
				 void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
			
 
				 {
			
 
				 	struct text_poke_params tpp;
			
 
				+	struct text_poke_param p;
			
 
				 
			
 
				-	tpp.addr = addr;
			
 
				-	tpp.opcode = opcode;
			
 
				-	tpp.len = len;
			
 
				+	p.addr = addr;
			
 
				+	p.opcode = opcode;
			
 
				+	p.len = len;
			
 
				+	tpp.params = &p;
			
 
				+	tpp.nparams = 1;
			
 
				 	atomic_set(&stop_machine_first, 1);
			
 
				 	wrote_text = 0;
			
 
				 	/* Use __stop_machine() because the caller already got online_cpus. */
			
@@ -642,6 +653,26 @@ void *__kprobes text_poke_smp(void *addr, const void *opcode, size_t len)
 	return addr;
 }
 
+/**
+ * text_poke_smp_batch - Update instructions on a live kernel on SMP
+ * @params: an array of text_poke parameters
+ * @n: the number of elements in params.
+ *
+ * Modify multi-byte instructions by using stop_machine() on SMP. Since
+ * stop_machine() is a heavy task, it is better to aggregate text_poke
+ * requests and apply them all at once if possible.
+ *
+ * Note: Must be called under get_online_cpus() and text_mutex.
+ */
+void __kprobes text_poke_smp_batch(struct text_poke_param *params, int n)
+{
+	struct text_poke_params tpp = {.params = params, .nparams = n};
+
+	atomic_set(&stop_machine_first, 1);
+	wrote_text = 0;
+	stop_machine(stop_machine_text_poke, (void *)&tpp, NULL);
+}
+
 #if defined(CONFIG_DYNAMIC_FTRACE) || defined(HAVE_JUMP_LABEL)
 
 #ifdef CONFIG_X86_64
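Note on text_poke_smp_batch(): the point of the batch interface is to queue several patch requests and apply them in a single pass. The user-space sketch below shows only that pattern. It writes to an ordinary buffer with memcpy() and does not involve stop_machine(), instruction-cache flushing or any locking. The names poke_param, poke and poke_batch are hypothetical. The struct fields mirror struct text_poke_param from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same fields as struct text_poke_param in the patch. */
struct poke_param {
	void *addr;
	const void *opcode;
	size_t len;
};

/* User-space stand-in for text_poke(): a plain copy. */
static void *poke(void *addr, const void *opcode, size_t len)
{
	return memcpy(addr, opcode, len);
}

/* Apply n queued requests in one pass; returns the bytes written. */
static size_t poke_batch(const struct poke_param *params, int n)
{
	size_t total = 0;
	int i;

	assert(params != NULL || n == 0);
	for (i = 0; i < n; i++) {
		poke(params[i].addr, params[i].opcode, params[i].len);
		total += params[i].len;
	}
	return total;
}
```

In the kernel version the expensive step is the stop_machine() call, so one batch of n requests pays that cost once instead of n times.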
			
--- a/arch/x86/kernel/apic/Makefile
+++ b/arch/x86/kernel/apic/Makefile
@@ -3,10 +3,7 @@
 
				 #
			
 
				 
			
 
				 obj-$(CONFIG_X86_LOCAL_APIC)	+= apic.o apic_noop.o probe_$(BITS).o ipi.o
			
 
				-ifneq ($(CONFIG_HARDLOCKUP_DETECTOR),y)
			
 
				-obj-$(CONFIG_X86_LOCAL_APIC)	+= nmi.o
			
 
				-endif
			
 
				-obj-$(CONFIG_HARDLOCKUP_DETECTOR)	+= hw_nmi.o
			
 
				+obj-y				+= hw_nmi.o
			
 
				 
			
 
				 obj-$(CONFIG_X86_IO_APIC)	+= io_apic.o
			
 
				 obj-$(CONFIG_SMP)		+= ipi.o
			
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -31,7 +31,6 @@
 
				 #include <linux/init.h>
			
 
				 #include <linux/cpu.h>
			
 
				 #include <linux/dmi.h>
			
 
				-#include <linux/nmi.h>
			
 
				 #include <linux/smp.h>
			
 
				 #include <linux/mm.h>
			
 
				 
			
@@ -799,11 +798,7 @@ void __init setup_boot_APIC_clock(void)
 
				 	 * PIT/HPET going.  Otherwise register lapic as a dummy
			
 
				 	 * device.
			
 
				 	 */
			
 
				-	if (nmi_watchdog != NMI_IO_APIC)
			
 
				-		lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY;
			
 
				-	else
			
 
				-		pr_warning("APIC timer registered as dummy,"
			
 
				-			" due to nmi_watchdog=%d!\n", nmi_watchdog);
			
 
				+	lapic_clockevent.features &= ~CLOCK_EVT_FEAT_DUMMY;
			
 
				 
			
 
				 	/* Setup the lapic or request the broadcast */
			
 
				 	setup_APIC_timer();
			
@@ -1387,7 +1382,6 @@ void __cpuinit end_local_APIC_setup(void)
 
				 	}
			
 
				 #endif
			
 
				 
			
 
				-	setup_apic_nmi_watchdog(NULL);
			
 
				 	apic_pm_activate();
			
 
				 
			
 
				 	/*
			
@@ -1758,17 +1752,10 @@ int __init APIC_init_uniprocessor(void)
 
				 		setup_IO_APIC();
			
 
				 	else {
			
 
				 		nr_ioapics = 0;
			
 
				-		localise_nmi_watchdog();
			
 
				 	}
			
 
				-#else
			
 
				-	localise_nmi_watchdog();
			
 
				 #endif
			
 
				 
			
 
				 	x86_init.timers.setup_percpu_clockev();
			
 
				-#ifdef CONFIG_X86_64
			
 
				-	check_nmi_watchdog();
			
 
				-#endif
			
 
				-
			
 
				 	return 0;
			
 
				 }
			
 
				 
			
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -17,20 +17,31 @@
 
				 #include <linux/nmi.h>
			
 
				 #include <linux/module.h>
			
 
				 
			
 
				+#ifdef CONFIG_HARDLOCKUP_DETECTOR
			
 
				 u64 hw_nmi_get_sample_period(void)
			
 
				 {
			
 
				 	return (u64)(cpu_khz) * 1000 * 60;
			
 
				 }
			
 
				+#endif
			
 
				 
			
 
				-#ifdef ARCH_HAS_NMI_WATCHDOG
			
 
				-
			
 
				+#ifdef arch_trigger_all_cpu_backtrace
			
 
				 /* For reliability, we're prepared to waste bits here. */
			
 
				 static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
			
 
				 
			
 
				+/* "in progress" flag of arch_trigger_all_cpu_backtrace */
			
 
				+static unsigned long backtrace_flag;
			
 
				+
			
 
				 void arch_trigger_all_cpu_backtrace(void)
			
 
				 {
			
 
				 	int i;
			
 
				 
			
 
				+	if (test_and_set_bit(0, &backtrace_flag))
			
 
				+		/*
			
 
				+		 * If there is already a trigger_all_cpu_backtrace() in progress
			
 
				+		 * (backtrace_flag == 1), don't print a duplicate set of CPU dumps.
			
 
				+		 */
			
 
				+		return;
			
 
				+
			
 
				 	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
			
 
				 
			
 
				 	printk(KERN_INFO "sending NMI to all CPUs:\n");
			
@@ -42,6 +53,9 @@ void arch_trigger_all_cpu_backtrace(void)
 
				 			break;
			
 
				 		mdelay(1);
			
 
				 	}
			
 
				+
			
 
				+	clear_bit(0, &backtrace_flag);
			
 
				+	smp_mb__after_clear_bit();
			
 
				 }
			
 
				 
			
 
				 static int __kprobes
			
@@ -50,7 +64,7 @@ arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self,
 
				 {
			
 
				 	struct die_args *args = __args;
			
 
				 	struct pt_regs *regs;
			
 
				-	int cpu = smp_processor_id();
			
 
				+	int cpu;
			
 
				 
			
 
				 	switch (cmd) {
			
 
				 	case DIE_NMI:
			
@@ -62,6 +76,7 @@ arch_trigger_all_cpu_backtrace_handler(struct notifier_block *self,
 
				 	}
			
 
				 
			
 
				 	regs = args->regs;
			
 
				+	cpu = smp_processor_id();
			
 
				 
			
 
				 	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
			
 
				 		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
			
@@ -91,18 +106,3 @@ static int __init register_trigger_all_cpu_backtrace(void)
 
				 }
			
 
				 early_initcall(register_trigger_all_cpu_backtrace);
			
 
				 #endif
			
 
				-
			
 
				-/* STUB calls to mimic old nmi_watchdog behaviour */
			
 
				-#if defined(CONFIG_X86_LOCAL_APIC)
			
 
				-unsigned int nmi_watchdog = NMI_NONE;
			
 
				-EXPORT_SYMBOL(nmi_watchdog);
			
 
				-void acpi_nmi_enable(void) { return; }
			
 
				-void acpi_nmi_disable(void) { return; }
			
 
				-#endif
			
 
				-atomic_t nmi_active = ATOMIC_INIT(0);           /* oprofile uses this */
			
 
				-EXPORT_SYMBOL(nmi_active);
			
 
				-int unknown_nmi_panic;
			
 
				-void cpu_nmi_set_wd_enabled(void) { return; }
			
 
				-void stop_apic_nmi_watchdog(void *unused) { return; }
			
 
				-void setup_apic_nmi_watchdog(void *unused) { return; }
			
 
				-int __init check_nmi_watchdog(void) { return 0; }
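
Note on the "in progress" flag added to arch_trigger_all_cpu_backtrace() above: test_and_set_bit() sets the flag atomically and returns its previous value, so a second caller arriving while a dump is running sees 1 and returns early. The sketch below illustrates the same guard in user space with C11 atomic_flag. It is an illustration under assumptions, not kernel code: dump_in_progress, dumps_run and trigger_dump are hypothetical names, and the recursion simply simulates a second trigger arriving mid-dump.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins for backtrace_flag and the dump itself. */
static atomic_flag dump_in_progress = ATOMIC_FLAG_INIT;
static int dumps_run;

/*
 * Run the dump unless one is already in flight. Returns true if this call
 * did the work. A nonzero reenter_depth simulates another trigger arriving
 * while the first is still running.
 */
static bool trigger_dump(int reenter_depth)
{
	/* Like test_and_set_bit(): atomically set the flag, return the old value. */
	if (atomic_flag_test_and_set(&dump_in_progress))
		return false;

	dumps_run++;
	if (reenter_depth > 0) {
		bool nested = trigger_dump(reenter_depth - 1);

		assert(!nested);	/* the nested request must be rejected */
	}

	/* Like clear_bit(): let the next request through. */
	atomic_flag_clear(&dump_in_progress);
	return true;
}
```

The flag is cleared only after the work finishes, so a later, non-overlapping request still runs normally.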
			
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -54,7 +54,6 @@
 
				 #include <asm/dma.h>
			
 
				 #include <asm/timer.h>
			
 
				 #include <asm/i8259.h>
			
 
				-#include <asm/nmi.h>
			
 
				 #include <asm/msidef.h>
			
 
				 #include <asm/hypertransport.h>
			
 
				 #include <asm/setup.h>
			
@@ -2642,24 +2641,6 @@ static void lapic_register_intr(int irq)
 
				 				      "edge");
			
 
				 }
			
 
				 
			
 
				-static void __init setup_nmi(void)
			
 
				-{
			
 
				-	/*
			
 
				-	 * Dirty trick to enable the NMI watchdog ...
			
 
				-	 * We put the 8259A master into AEOI mode and
			
 
				-	 * unmask on all local APICs LVT0 as NMI.
			
 
				-	 *
			
 
				-	 * The idea to use the 8259A in AEOI mode ('8259A Virtual Wire')
			
 
				-	 * is from Maciej W. Rozycki - so we do not have to EOI from
			
 
				-	 * the NMI handler or the timer interrupt.
			
 
				-	 */
			
 
				-	apic_printk(APIC_VERBOSE, KERN_INFO "activating NMI Watchdog ...");
			
 
				-
			
 
				-	enable_NMI_through_LVT0();
			
 
				-
			
 
				-	apic_printk(APIC_VERBOSE, " done.\n");
			
 
				-}
			
 
				-
			
 
				 /*
			
 
				  * This looks a bit hackish but it's about the only one way of sending
			
 
				  * a few INTA cycles to 8259As and any associated glue logic.  ICR does
			
@@ -2765,15 +2746,6 @@ static inline void __init check_timer(void)
 
				 	 */
			
 
				 	apic_write(APIC_LVT0, APIC_LVT_MASKED | APIC_DM_EXTINT);
			
 
				 	legacy_pic->init(1);
			
 
				-#ifdef CONFIG_X86_32
			
 
				-	{
			
 
				-		unsigned int ver;
			
 
				-
			
 
				-		ver = apic_read(APIC_LVR);
			
 
				-		ver = GET_APIC_VERSION(ver);
			
 
				-		timer_ack = (nmi_watchdog == NMI_IO_APIC && !APIC_INTEGRATED(ver));
			
 
				-	}
			
 
				-#endif
			
 
				 
			
 
				 	pin1  = find_isa_irq_pin(0, mp_INT);
			
 
				 	apic1 = find_isa_irq_apic(0, mp_INT);
			
@@ -2821,10 +2793,6 @@ static inline void __init check_timer(void)
 
				 				unmask_ioapic(cfg);
			
 
				 		}
			
 
				 		if (timer_irq_works()) {
			
 
				-			if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-				setup_nmi();
			
 
				-				legacy_pic->unmask(0);
			
 
				-			}
			
 
				 			if (disable_timer_pin_1 > 0)
			
 
				 				clear_IO_APIC_pin(0, pin1);
			
 
				 			goto out;
			
@@ -2850,11 +2818,6 @@ static inline void __init check_timer(void)
 
				 		if (timer_irq_works()) {
			
 
				 			apic_printk(APIC_QUIET, KERN_INFO "....... works.\n");
			
 
				 			timer_through_8259 = 1;
			
 
				-			if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-				legacy_pic->mask(0);
			
 
				-				setup_nmi();
			
 
				-				legacy_pic->unmask(0);
			
 
				-			}
			
 
				 			goto out;
			
 
				 		}
			
 
				 		/*
			
@@ -2866,15 +2829,6 @@ static inline void __init check_timer(void)
 
				 		apic_printk(APIC_QUIET, KERN_INFO "....... failed.\n");
			
 
				 	}
			
 
				 
			
 
				-	if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-		apic_printk(APIC_QUIET, KERN_WARNING "timer doesn't work "
			
 
				-			    "through the IO-APIC - disabling NMI Watchdog!\n");
			
 
				-		nmi_watchdog = NMI_NONE;
			
 
				-	}
			
 
				-#ifdef CONFIG_X86_32
			
 
				-	timer_ack = 0;
			
 
				-#endif
			
 
				-
			
 
				 	apic_printk(APIC_QUIET, KERN_INFO
			
 
				 		    "...trying to set up timer as Virtual Wire IRQ...\n");
			
 
				 
			
--- a/arch/x86/kernel/apic/nmi.c
+++ b/arch/x86/kernel/apic/nmi.c
@@ -1,567 +0,0 @@
 
				-/*
			
 
				- *  NMI watchdog support on APIC systems
			
 
				- *
			
 
				- *  Started by Ingo Molnar <mingo@redhat.com>
			
 
				- *
			
 
				- *  Fixes:
			
 
				- *  Mikael Pettersson	: AMD K7 support for local APIC NMI watchdog.
			
 
				- *  Mikael Pettersson	: Power Management for local APIC NMI watchdog.
			
 
				- *  Mikael Pettersson	: Pentium 4 support for local APIC NMI watchdog.
			
 
				- *  Pavel Machek and
			
 
				- *  Mikael Pettersson	: PM converted to driver model. Disable/enable API.
			
 
				- */
			
 
				-
			
 
				-#include <asm/apic.h>
			
 
				-
			
 
				-#include <linux/nmi.h>
			
 
				-#include <linux/mm.h>
			
 
				-#include <linux/delay.h>
			
 
				-#include <linux/interrupt.h>
			
 
				-#include <linux/module.h>
			
 
				-#include <linux/slab.h>
			
 
				-#include <linux/sysdev.h>
			
 
				-#include <linux/sysctl.h>
			
 
				-#include <linux/percpu.h>
			
 
				-#include <linux/kprobes.h>
			
 
				-#include <linux/cpumask.h>
			
 
				-#include <linux/kernel_stat.h>
			
 
				-#include <linux/kdebug.h>
			
 
				-#include <linux/smp.h>
			
 
				-
			
 
				-#include <asm/i8259.h>
			
 
				-#include <asm/io_apic.h>
			
 
				-#include <asm/proto.h>
			
 
				-#include <asm/timer.h>
			
 
				-
			
 
				-#include <asm/mce.h>
			
 
				-
			
 
				-#include <asm/mach_traps.h>
			
 
				-
			
 
				-int unknown_nmi_panic;
			
 
				-int nmi_watchdog_enabled;
			
 
				-
			
 
				-/* For reliability, we're prepared to waste bits here. */
			
 
				-static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
			
 
				-
			
 
				-/* nmi_active:
			
 
				- * >0: the lapic NMI watchdog is active, but can be disabled
			
 
				- * <0: the lapic NMI watchdog has not been set up, and cannot
			
 
				- *     be enabled
			
 
				- *  0: the lapic NMI watchdog is disabled, but can be enabled
			
 
				- */
			
 
				-atomic_t nmi_active = ATOMIC_INIT(0);		/* oprofile uses this */
			
 
				-EXPORT_SYMBOL(nmi_active);
			
 
				-
			
 
				-unsigned int nmi_watchdog = NMI_NONE;
			
 
				-EXPORT_SYMBOL(nmi_watchdog);
			
 
				-
			
 
				-static int panic_on_timeout;
			
 
				-
			
 
				-static unsigned int nmi_hz = HZ;
			
 
				-static DEFINE_PER_CPU(short, wd_enabled);
			
 
				-static int endflag __initdata;
			
 
				-
			
 
				-static inline unsigned int get_nmi_count(int cpu)
			
 
				-{
			
 
				-	return per_cpu(irq_stat, cpu).__nmi_count;
			
 
				-}
			
 
				-
			
 
				-static inline int mce_in_progress(void)
			
 
				-{
			
 
				-#if defined(CONFIG_X86_MCE)
			
 
				-	return atomic_read(&mce_entry) > 0;
			
 
				-#endif
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * Take the local apic timer and PIT/HPET into account. We don't
			
 
				- * know which one is active, when we have highres/dyntick on
			
 
				- */
			
 
				-static inline unsigned int get_timer_irqs(int cpu)
			
 
				-{
			
 
				-	return per_cpu(irq_stat, cpu).apic_timer_irqs +
			
 
				-		per_cpu(irq_stat, cpu).irq0_irqs;
			
 
				-}
			
 
				-
			
 
				-#ifdef CONFIG_SMP
			
 
				-/*
			
 
				- * The performance counters used by NMI_LOCAL_APIC don't trigger when
			
 
				- * the CPU is idle. To make sure the NMI watchdog really ticks on all
			
 
				- * CPUs during the test make them busy.
			
 
				- */
			
 
				-static __init void nmi_cpu_busy(void *data)
			
 
				-{
			
 
				-	local_irq_enable_in_hardirq();
			
 
				-	/*
			
 
				-	 * Intentionally don't use cpu_relax here. This is
			
 
				-	 * to make sure that the performance counter really ticks,
			
 
				-	 * even if there is a simulator or similar that catches the
			
 
				-	 * pause instruction. On a real HT machine this is fine because
			
 
				-	 * all other CPUs are busy with "useless" delay loops and don't
			
 
				-	 * care if they get somewhat less cycles.
			
 
				-	 */
			
 
				-	while (endflag == 0)
			
 
				-		mb();
			
 
				-}
			
 
				-#endif
			
 
				-
			
 
				-static void report_broken_nmi(int cpu, unsigned int *prev_nmi_count)
			
 
				-{
			
 
				-	printk(KERN_CONT "\n");
			
 
				-
			
 
				-	printk(KERN_WARNING
			
 
				-		"WARNING: CPU#%d: NMI appears to be stuck (%d->%d)!\n",
			
 
				-			cpu, prev_nmi_count[cpu], get_nmi_count(cpu));
			
 
				-
			
 
				-	printk(KERN_WARNING
			
 
				-		"Please report this to bugzilla.kernel.org,\n");
			
 
				-	printk(KERN_WARNING
			
 
				-		"and attach the output of the 'dmesg' command.\n");
			
 
				-
			
 
				-	per_cpu(wd_enabled, cpu) = 0;
			
 
				-	atomic_dec(&nmi_active);
			
 
				-}
			
 
				-
			
 
				-static void __acpi_nmi_disable(void *__unused)
			
 
				-{
			
 
				-	apic_write(APIC_LVT0, APIC_DM_NMI | APIC_LVT_MASKED);
			
 
				-}
			
 
				-
			
 
				-int __init check_nmi_watchdog(void)
			
 
				-{
			
 
				-	unsigned int *prev_nmi_count;
			
 
				-	int cpu;
			
 
				-
			
 
				-	if (!nmi_watchdog_active() || !atomic_read(&nmi_active))
			
 
				-		return 0;
			
 
				-
			
 
				-	prev_nmi_count = kmalloc(nr_cpu_ids * sizeof(int), GFP_KERNEL);
			
 
				-	if (!prev_nmi_count)
			
 
				-		goto error;
			
 
				-
			
 
				-	printk(KERN_INFO "Testing NMI watchdog ... ");
			
 
				-
			
 
				-#ifdef CONFIG_SMP
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		smp_call_function(nmi_cpu_busy, (void *)&endflag, 0);
			
 
				-#endif
			
 
				-
			
 
				-	for_each_possible_cpu(cpu)
			
 
				-		prev_nmi_count[cpu] = get_nmi_count(cpu);
			
 
				-	local_irq_enable();
			
 
				-	mdelay((20 * 1000) / nmi_hz); /* wait 20 ticks */
			
 
				-
			
 
				-	for_each_online_cpu(cpu) {
			
 
				-		if (!per_cpu(wd_enabled, cpu))
			
 
				-			continue;
			
 
				-		if (get_nmi_count(cpu) - prev_nmi_count[cpu] <= 5)
			
 
				-			report_broken_nmi(cpu, prev_nmi_count);
			
 
				-	}
			
 
				-	endflag = 1;
			
 
				-	if (!atomic_read(&nmi_active)) {
			
 
				-		kfree(prev_nmi_count);
			
 
				-		atomic_set(&nmi_active, -1);
			
 
				-		goto error;
			
 
				-	}
			
 
				-	printk("OK.\n");
			
 
				-
			
 
				-	/*
			
 
				-	 * now that we know it works we can reduce NMI frequency to
			
 
				-	 * something more reasonable; makes a difference in some configs
			
 
				-	 */
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		nmi_hz = lapic_adjust_nmi_hz(1);
			
 
				-
			
 
				-	kfree(prev_nmi_count);
			
 
				-	return 0;
			
 
				-error:
			
 
				-	if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-		if (!timer_through_8259)
			
 
				-			legacy_pic->mask(0);
			
 
				-		on_each_cpu(__acpi_nmi_disable, NULL, 1);
			
 
				-	}
			
 
				-
			
 
				-#ifdef CONFIG_X86_32
			
 
				-	timer_ack = 0;
			
 
				-#endif
			
 
				-	return -1;
			
 
				-}
			
 
				-
			
 
				-static int __init setup_nmi_watchdog(char *str)
			
 
				-{
			
 
				-	unsigned int nmi;
			
 
				-
			
 
				-	if (!strncmp(str, "panic", 5)) {
			
 
				-		panic_on_timeout = 1;
			
 
				-		str = strchr(str, ',');
			
 
				-		if (!str)
			
 
				-			return 1;
			
 
				-		++str;
			
 
				-	}
			
 
				-
			
 
				-	if (!strncmp(str, "lapic", 5))
			
 
				-		nmi_watchdog = NMI_LOCAL_APIC;
			
 
				-	else if (!strncmp(str, "ioapic", 6))
			
 
				-		nmi_watchdog = NMI_IO_APIC;
			
 
				-	else {
			
 
				-		get_option(&str, &nmi);
			
 
				-		if (nmi >= NMI_INVALID)
			
 
				-			return 0;
			
 
				-		nmi_watchdog = nmi;
			
 
				-	}
			
 
				-
			
 
				-	return 1;
			
 
				-}
			
 
				-__setup("nmi_watchdog=", setup_nmi_watchdog);
			
 
				-
			
 
				-/*
			
 
				- * Suspend/resume support
			
 
				- */
			
 
				-#ifdef CONFIG_PM
			
 
				-
			
 
				-static int nmi_pm_active; /* nmi_active before suspend */
			
 
				-
			
 
				-static int lapic_nmi_suspend(struct sys_device *dev, pm_message_t state)
			
 
				-{
			
 
				-	/* only CPU0 goes here, other CPUs should be offline */
			
 
				-	nmi_pm_active = atomic_read(&nmi_active);
			
 
				-	stop_apic_nmi_watchdog(NULL);
			
 
				-	BUG_ON(atomic_read(&nmi_active) != 0);
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-static int lapic_nmi_resume(struct sys_device *dev)
			
 
				-{
			
 
				-	/* only CPU0 goes here, other CPUs should be offline */
			
 
				-	if (nmi_pm_active > 0) {
			
 
				-		setup_apic_nmi_watchdog(NULL);
			
 
				-		touch_nmi_watchdog();
			
 
				-	}
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-static struct sysdev_class nmi_sysclass = {
			
 
				-	.name		= "lapic_nmi",
			
 
				-	.resume		= lapic_nmi_resume,
			
 
				-	.suspend	= lapic_nmi_suspend,
			
 
				-};
			
 
				-
			
 
				-static struct sys_device device_lapic_nmi = {
			
 
				-	.id	= 0,
			
 
				-	.cls	= &nmi_sysclass,
			
 
				-};
			
 
				-
			
 
				-static int __init init_lapic_nmi_sysfs(void)
			
 
				-{
			
 
				-	int error;
			
 
				-
			
 
				-	/*
			
 
				-	 * should really be a BUG_ON but b/c this is an
			
 
				-	 * init call, it just doesn't work.  -dcz
			
 
				-	 */
			
 
				-	if (nmi_watchdog != NMI_LOCAL_APIC)
			
 
				-		return 0;
			
 
				-
			
 
				-	if (atomic_read(&nmi_active) < 0)
			
 
				-		return 0;
			
 
				-
			
 
				-	error = sysdev_class_register(&nmi_sysclass);
			
 
				-	if (!error)
			
 
				-		error = sysdev_register(&device_lapic_nmi);
			
 
				-	return error;
			
 
				-}
			
 
				-
			
 
				-/* must come after the local APIC's device_initcall() */
			
 
				-late_initcall(init_lapic_nmi_sysfs);
			
 
				-
			
 
				-#endif	/* CONFIG_PM */
			
 
				-
			
 
				-static void __acpi_nmi_enable(void *__unused)
			
 
				-{
			
 
				-	apic_write(APIC_LVT0, APIC_DM_NMI);
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * Enable timer based NMIs on all CPUs:
			
 
				- */
			
 
				-void acpi_nmi_enable(void)
			
 
				-{
			
 
				-	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
			
 
				-		on_each_cpu(__acpi_nmi_enable, NULL, 1);
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * Disable timer based NMIs on all CPUs:
			
 
				- */
			
 
				-void acpi_nmi_disable(void)
			
 
				-{
			
 
				-	if (atomic_read(&nmi_active) && nmi_watchdog == NMI_IO_APIC)
			
 
				-		on_each_cpu(__acpi_nmi_disable, NULL, 1);
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * This function is called as soon the LAPIC NMI watchdog driver has everything
			
 
				- * in place and it's ready to check if the NMIs belong to the NMI watchdog
			
 
				- */
			
 
				-void cpu_nmi_set_wd_enabled(void)
			
 
				-{
			
 
				-	__get_cpu_var(wd_enabled) = 1;
			
 
				-}
			
 
				-
			
 
				-void setup_apic_nmi_watchdog(void *unused)
			
 
				-{
			
 
				-	if (__get_cpu_var(wd_enabled))
			
 
				-		return;
			
 
				-
			
 
				-	/* cheap hack to support suspend/resume */
			
 
				-	/* if cpu0 is not active neither should the other cpus */
			
 
				-	if (smp_processor_id() != 0 && atomic_read(&nmi_active) <= 0)
			
 
				-		return;
			
 
				-
			
 
				-	switch (nmi_watchdog) {
			
 
				-	case NMI_LOCAL_APIC:
			
 
				-		if (lapic_watchdog_init(nmi_hz) < 0) {
			
 
				-			__get_cpu_var(wd_enabled) = 0;
			
 
				-			return;
			
 
				-		}
			
 
				-		/* FALL THROUGH */
			
 
				-	case NMI_IO_APIC:
			
 
				-		__get_cpu_var(wd_enabled) = 1;
			
 
				-		atomic_inc(&nmi_active);
			
 
				-	}
			
 
				-}
			
 
				-
			
 
				-void stop_apic_nmi_watchdog(void *unused)
			
 
				-{
			
 
				-	/* only support LOCAL and IO APICs for now */
			
 
				-	if (!nmi_watchdog_active())
			
 
				-		return;
			
 
				-	if (__get_cpu_var(wd_enabled) == 0)
			
 
				-		return;
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		lapic_watchdog_stop();
			
 
				-	else
			
 
				-		__acpi_nmi_disable(NULL);
			
 
				-	__get_cpu_var(wd_enabled) = 0;
			
 
				-	atomic_dec(&nmi_active);
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * the best way to detect whether a CPU has a 'hard lockup' problem
			
 
				- * is to check it's local APIC timer IRQ counts. If they are not
			
 
				- * changing then that CPU has some problem.
			
 
				- *
			
 
				- * as these watchdog NMI IRQs are generated on every CPU, we only
			
 
				- * have to check the current processor.
			
 
				- *
			
 
				- * since NMIs don't listen to _any_ locks, we have to be extremely
			
 
				- * careful not to rely on unsafe variables. The printk might lock
			
 
				- * up though, so we have to break up any console locks first ...
			
 
				- * [when there will be more tty-related locks, break them up here too!]
			
 
				- */
			
 
				-
			
 
				-static DEFINE_PER_CPU(unsigned, last_irq_sum);
			
 
				-static DEFINE_PER_CPU(long, alert_counter);
			
 
				-static DEFINE_PER_CPU(int, nmi_touch);
			
 
				-
			
 
				-void touch_nmi_watchdog(void)
			
 
				-{
			
 
				-	if (nmi_watchdog_active()) {
			
 
				-		unsigned cpu;
			
 
				-
			
 
				-		/*
			
 
				-		 * Tell other CPUs to reset their alert counters. We cannot
			
 
				-		 * do it ourselves because the alert count increase is not
			
 
				-		 * atomic.
			
 
				-		 */
			
 
				-		for_each_present_cpu(cpu) {
			
 
				-			if (per_cpu(nmi_touch, cpu) != 1)
			
 
				-				per_cpu(nmi_touch, cpu) = 1;
			
 
				-		}
			
 
				-	}
			
 
				-
			
 
				-	/*
			
 
				-	 * Tickle the softlockup detector too:
			
 
				-	 */
			
 
				-	touch_softlockup_watchdog();
			
 
				-}
			
 
				-EXPORT_SYMBOL(touch_nmi_watchdog);
			
 
				-
			
 
				-notrace __kprobes int
			
 
				-nmi_watchdog_tick(struct pt_regs *regs, unsigned reason)
			
 
				-{
			
 
				-	/*
			
 
				-	 * Since current_thread_info()-> is always on the stack, and we
			
 
				-	 * always switch the stack NMI-atomically, it's safe to use
			
 
				-	 * smp_processor_id().
			
 
				-	 */
			
 
				-	unsigned int sum;
			
 
				-	int touched = 0;
			
 
				-	int cpu = smp_processor_id();
			
 
				-	int rc = 0;
			
 
				-
			
 
				-	sum = get_timer_irqs(cpu);
			
 
				-
			
 
				-	if (__get_cpu_var(nmi_touch)) {
			
 
				-		__get_cpu_var(nmi_touch) = 0;
			
 
				-		touched = 1;
			
 
				-	}
			
 
				-
			
 
				-	/* We can be called before check_nmi_watchdog, hence NULL check. */
			
 
				-	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
			
 
				-		static DEFINE_RAW_SPINLOCK(lock); /* Serialise the printks */
			
 
				-
			
 
				-		raw_spin_lock(&lock);
			
 
				-		printk(KERN_WARNING "NMI backtrace for cpu %d\n", cpu);
			
 
				-		show_regs(regs);
			
 
				-		dump_stack();
			
 
				-		raw_spin_unlock(&lock);
			
 
				-		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
			
 
				-
			
 
				-		rc = 1;
			
 
				-	}
			
 
				-
			
 
				-	/* Could check oops_in_progress here too, but it's safer not to */
			
 
				-	if (mce_in_progress())
			
 
				-		touched = 1;
			
 
				-
			
 
				-	/* if the none of the timers isn't firing, this cpu isn't doing much */
			
 
				-	if (!touched && __get_cpu_var(last_irq_sum) == sum) {
			
 
				-		/*
			
 
				-		 * Ayiee, looks like this CPU is stuck ...
			
 
				-		 * wait a few IRQs (5 seconds) before doing the oops ...
			
 
				-		 */
			
 
				-		__this_cpu_inc(alert_counter);
			
 
				-		if (__this_cpu_read(alert_counter) == 5 * nmi_hz)
			
 
				-			/*
			
 
				-			 * die_nmi will return ONLY if NOTIFY_STOP happens..
			
 
				-			 */
			
 
				-			die_nmi("BUG: NMI Watchdog detected LOCKUP",
			
 
				-				regs, panic_on_timeout);
			
 
				-	} else {
			
 
				-		__get_cpu_var(last_irq_sum) = sum;
			
 
				-		__this_cpu_write(alert_counter, 0);
			
 
				-	}
			
 
				-
			
 
				-	/* see if the nmi watchdog went off */
			
 
				-	if (!__get_cpu_var(wd_enabled))
			
 
				-		return rc;
			
 
				-	switch (nmi_watchdog) {
			
 
				-	case NMI_LOCAL_APIC:
			
 
				-		rc |= lapic_wd_event(nmi_hz);
			
 
				-		break;
			
 
				-	case NMI_IO_APIC:
			
 
				-		/*
			
 
				-		 * don't know how to accurately check for this.
			
 
				-		 * just assume it was a watchdog timer interrupt
			
 
				-		 * This matches the old behaviour.
			
 
				-		 */
			
 
				-		rc = 1;
			
 
				-		break;
			
 
				-	}
			
 
				-	return rc;
			
 
				-}
			
 
				-
			
 
				-#ifdef CONFIG_SYSCTL
			
 
				-
			
 
				-static void enable_ioapic_nmi_watchdog_single(void *unused)
			
 
				-{
			
 
				-	__get_cpu_var(wd_enabled) = 1;
			
 
				-	atomic_inc(&nmi_active);
			
 
				-	__acpi_nmi_enable(NULL);
			
 
				-}
			
 
				-
			
 
				-static void enable_ioapic_nmi_watchdog(void)
			
 
				-{
			
 
				-	on_each_cpu(enable_ioapic_nmi_watchdog_single, NULL, 1);
			
 
				-	touch_nmi_watchdog();
			
 
				-}
			
 
				-
			
 
				-static void disable_ioapic_nmi_watchdog(void)
			
 
				-{
			
 
				-	on_each_cpu(stop_apic_nmi_watchdog, NULL, 1);
			
 
				-}
			
 
				-
			
 
				-static int __init setup_unknown_nmi_panic(char *str)
			
 
				-{
			
 
				-	unknown_nmi_panic = 1;
			
 
				-	return 1;
			
 
				-}
			
 
				-__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
			
 
				-
			
 
				-static int unknown_nmi_panic_callback(struct pt_regs *regs, int cpu)
			
 
				-{
			
 
				-	unsigned char reason = get_nmi_reason();
			
 
				-	char buf[64];
			
 
				-
			
 
				-	sprintf(buf, "NMI received for unknown reason %02x\n", reason);
			
 
				-	die_nmi(buf, regs, 1); /* Always panic here */
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * proc handler for /proc/sys/kernel/nmi
			
 
				- */
			
 
				-int proc_nmi_enabled(struct ctl_table *table, int write,
			
 
				-			void __user *buffer, size_t *length, loff_t *ppos)
			
 
				-{
			
 
				-	int old_state;
			
 
				-
			
 
				-	nmi_watchdog_enabled = (atomic_read(&nmi_active) > 0) ? 1 : 0;
			
 
				-	old_state = nmi_watchdog_enabled;
			
 
				-	proc_dointvec(table, write, buffer, length, ppos);
			
 
				-	if (!!old_state == !!nmi_watchdog_enabled)
			
 
				-		return 0;
			
 
				-
			
 
				-	if (atomic_read(&nmi_active) < 0 || !nmi_watchdog_active()) {
			
 
				-		printk(KERN_WARNING
			
 
				-			"NMI watchdog is permanently disabled\n");
			
 
				-		return -EIO;
			
 
				-	}
			
 
				-
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC) {
			
 
				-		if (nmi_watchdog_enabled)
			
 
				-			enable_lapic_nmi_watchdog();
			
 
				-		else
			
 
				-			disable_lapic_nmi_watchdog();
			
 
				-	} else if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-		if (nmi_watchdog_enabled)
			
 
				-			enable_ioapic_nmi_watchdog();
			
 
				-		else
			
 
				-			disable_ioapic_nmi_watchdog();
			
 
				-	} else {
			
 
				-		printk(KERN_WARNING
			
 
				-			"NMI watchdog doesn't know what hardware to touch\n");
			
 
				-		return -EIO;
			
 
				-	}
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-#endif /* CONFIG_SYSCTL */
			
 
				-
			
 
				-int do_nmi_callback(struct pt_regs *regs, int cpu)
			
 
				-{
			
 
				-#ifdef CONFIG_SYSCTL
			
 
				-	if (unknown_nmi_panic)
			
 
				-		return unknown_nmi_panic_callback(regs, cpu);
			
 
				-#endif
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-void arch_trigger_all_cpu_backtrace(void)
			
 
				-{
			
 
				-	int i;
			
 
				-
			
 
				-	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
			
 
				-
			
 
				-	printk(KERN_INFO "sending NMI to all CPUs:\n");
			
 
				-	apic->send_IPI_all(NMI_VECTOR);
			
 
				-
			
 
				-	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
			
 
				-	for (i = 0; i < 10 * 1000; i++) {
			
 
				-		if (cpumask_empty(to_cpumask(backtrace_mask)))
			
 
				-			break;
			
 
				-		mdelay(1);
			
 
				-	}
			
 
				-}
			
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -894,7 +894,6 @@ void __init identify_boot_cpu(void)
 
				 #else
			
 
				 	vgetcpu_set_mode();
			
 
				 #endif
			
 
				-	init_hw_perf_events();
			
 
				 }
			
 
				 
			
 
				 void __cpuinit identify_secondary_cpu(struct cpuinfo_x86 *c)
			
--- a/arch/x86/kernel/cpu/perf_event.c
+++ b/arch/x86/kernel/cpu/perf_event.c
@@ -330,9 +330,6 @@ static bool reserve_pmc_hardware(void)
 
				 {
			
 
				 	int i;
			
 
				 
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		disable_lapic_nmi_watchdog();
			
 
				-
			
 
				 	for (i = 0; i < x86_pmu.num_counters; i++) {
			
 
				 		if (!reserve_perfctr_nmi(x86_pmu.perfctr + i))
			
 
				 			goto perfctr_fail;
			
@@ -355,9 +352,6 @@ perfctr_fail:
 
				 	for (i--; i >= 0; i--)
			
 
				 		release_perfctr_nmi(x86_pmu.perfctr + i);
			
 
				 
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		enable_lapic_nmi_watchdog();
			
 
				-
			
 
				 	return false;
			
 
				 }
			
 
				 
			
@@ -369,9 +363,6 @@ static void release_pmc_hardware(void)
 
				 		release_perfctr_nmi(x86_pmu.perfctr + i);
			
 
				 		release_evntsel_nmi(x86_pmu.eventsel + i);
			
 
				 	}
			
 
				-
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		enable_lapic_nmi_watchdog();
			
 
				 }
			
 
				 
			
 
				 #else
			
@@ -384,15 +375,53 @@ static void release_pmc_hardware(void) {}
 
				 static bool check_hw_exists(void)
			
 
				 {
			
 
				 	u64 val, val_new = 0;
			
 
				-	int ret = 0;
			
 
				+	int i, reg, ret = 0;
			
 
				+
			
 
				+	/*
			
 
				+	 * Check to see if the BIOS enabled any of the counters, if so
			
 
				+	 * complain and bail.
			
 
				+	 */
			
 
				+	for (i = 0; i < x86_pmu.num_counters; i++) {
			
 
				+		reg = x86_pmu.eventsel + i;
			
 
				+		ret = rdmsrl_safe(reg, &val);
			
 
				+		if (ret)
			
 
				+			goto msr_fail;
			
 
				+		if (val & ARCH_PERFMON_EVENTSEL_ENABLE)
			
 
				+			goto bios_fail;
			
 
				+	}
			
 
				 
			
 
				+	if (x86_pmu.num_counters_fixed) {
			
 
				+		reg = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
			
 
				+		ret = rdmsrl_safe(reg, &val);
			
 
				+		if (ret)
			
 
				+			goto msr_fail;
			
 
				+		for (i = 0; i < x86_pmu.num_counters_fixed; i++) {
			
 
				+			if (val & (0x03 << i*4))
			
 
				+				goto bios_fail;
			
 
				+		}
			
 
				+	}
			
 
				+
			
 
				+	/*
			
 
				+	 * Now write a value and read it back to see if it matches,
			
 
				+	 * this is needed to detect certain hardware emulators (qemu/kvm)
			
 
				+	 * that don't trap on the MSR access and always return 0s.
			
 
				+	 */
			
 
				 	val = 0xabcdUL;
			
 
				-	ret |= checking_wrmsrl(x86_pmu.perfctr, val);
			
 
				+	ret = checking_wrmsrl(x86_pmu.perfctr, val);
			
 
				 	ret |= rdmsrl_safe(x86_pmu.perfctr, &val_new);
			
 
				 	if (ret || val != val_new)
			
 
				-		return false;
			
 
				+		goto msr_fail;
			
 
				 
			
 
				 	return true;
			
 
				+
			
 
				+bios_fail:
			
 
				+	printk(KERN_CONT "Broken BIOS detected, using software events only.\n");
			
 
				+	printk(KERN_ERR FW_BUG "the BIOS has corrupted hw-PMU resources (MSR %x is %Lx)\n", reg, val);
			
 
				+	return false;
			
 
				+
			
 
				+msr_fail:
			
 
				+	printk(KERN_CONT "Broken PMU hardware detected, using software events only.\n");
			
 
				+	return false;
			
 
				 }
			
 
				 
			
 
				 static void reserve_ds_buffers(void);
			
@@ -451,7 +480,7 @@ static int x86_setup_perfctr(struct perf_event *event)
 
				 	struct hw_perf_event *hwc = &event->hw;
			
 
				 	u64 config;
			
 
				 
			
 
				-	if (!hwc->sample_period) {
			
 
				+	if (!is_sampling_event(event)) {
			
 
				 		hwc->sample_period = x86_pmu.max_period;
			
 
				 		hwc->last_period = hwc->sample_period;
			
 
				 		local64_set(&hwc->period_left, hwc->sample_period);
			
@@ -1362,7 +1391,7 @@ static void __init pmu_check_apic(void)
 
				 	pr_info("no hardware sampling interrupt available.\n");
			
 
				 }
			
 
				 
			
 
				-void __init init_hw_perf_events(void)
			
 
				+int __init init_hw_perf_events(void)
			
 
				 {
			
 
				 	struct event_constraint *c;
			
 
				 	int err;
			
@@ -1377,20 +1406,18 @@ void __init init_hw_perf_events(void)
 
				 		err = amd_pmu_init();
			
 
				 		break;
			
 
				 	default:
			
 
				-		return;
			
 
				+		return 0;
			
 
				 	}
			
 
				 	if (err != 0) {
			
 
				 		pr_cont("no PMU driver, software events only.\n");
			
 
				-		return;
			
 
				+		return 0;
			
 
				 	}
			
 
				 
			
 
				 	pmu_check_apic();
			
 
				 
			
 
				 	/* sanity check that the hardware exists or is emulated */
			
 
				-	if (!check_hw_exists()) {
			
 
				-		pr_cont("Broken PMU hardware detected, software events only.\n");
			
 
				-		return;
			
 
				-	}
			
 
				+	if (!check_hw_exists())
			
 
				+		return 0;
			
 
				 
			
 
				 	pr_cont("%s PMU driver.\n", x86_pmu.name);
			
 
				 
			
@@ -1438,9 +1465,12 @@ void __init init_hw_perf_events(void)
 
				 	pr_info("... fixed-purpose events:   %d\n",     x86_pmu.num_counters_fixed);
			
 
				 	pr_info("... event mask:             %016Lx\n", x86_pmu.intel_ctrl);
			
 
				 
			
 
				-	perf_pmu_register(&pmu);
			
 
				+	perf_pmu_register(&pmu, "cpu", PERF_TYPE_RAW);
			
 
				 	perf_cpu_notifier(x86_pmu_notifier);
			
 
				+
			
 
				+	return 0;
			
 
				 }
			
 
				+early_initcall(init_hw_perf_events);
			
 
				 
			
 
				 static inline void x86_pmu_read(struct perf_event *event)
			
 
				 {
			
@@ -1686,7 +1716,7 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, struct pt_regs *regs)
 
				 
			
 
				 	perf_callchain_store(entry, regs->ip);
			
 
				 
			
 
				-	dump_trace(NULL, regs, NULL, regs->bp, &backtrace_ops, entry);
			
 
				+	dump_trace(NULL, regs, NULL, &backtrace_ops, entry);
			
 
				 }
			
 
				 
			
 
				 #ifdef CONFIG_COMPAT
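Note on the x86_setup_perfctr() hunk above: the test changes from `!hwc->sample_period` to `!is_sampling_event(event)`. The helper is not shown in this section, so the sketch below assumes it checks the user-supplied `attr.sample_period`. All `_sketch` names are hypothetical stand-ins, not kernel definitions. The sketch shows why the two tests differ: once a counting event has been set up, its hardware period is non-zero even though the event was never asked to sample.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-ins for the kernel structures. */
struct perf_event_attr_sketch { uint64_t sample_period; };
struct hw_perf_event_sketch   { uint64_t sample_period; uint64_t last_period; };
struct perf_event_sketch {
	struct perf_event_attr_sketch attr; /* what the user requested */
	struct hw_perf_event_sketch hw;     /* what the PMU is programmed with */
};

/* Assumed behaviour: an event samples only if the user asked for a period. */
static bool is_sampling_event_sketch(const struct perf_event_sketch *event)
{
	return event->attr.sample_period != 0;
}

/* Mirrors the shape of the hunk: counting events get the maximum period. */
static void setup_period_sketch(struct perf_event_sketch *event,
				uint64_t max_period)
{
	assert(max_period != 0);

	if (!is_sampling_event_sketch(event)) {
		event->hw.sample_period = max_period;
		event->hw.last_period = event->hw.sample_period;
	}
}
```

After `setup_period_sketch()` runs on a counting event, `hw.sample_period` holds `max_period`, so a later check of the hardware field would wrongly classify the event as sampling. The attribute-based predicate gives the same answer before and after setup.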
			
--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -1,7 +1,5 @@
 
				 #ifdef CONFIG_CPU_SUP_AMD
			
 
				 
			
 
				-static DEFINE_RAW_SPINLOCK(amd_nb_lock);
			
 
				-
			
 
				 static __initconst const u64 amd_hw_cache_event_ids
			
 
				 				[PERF_COUNT_HW_CACHE_MAX]
			
 
				 				[PERF_COUNT_HW_CACHE_OP_MAX]
			
@@ -275,7 +273,7 @@ done:
 
				 	return &emptyconstraint;
			
 
				 }
			
 
				 
			
 
				-static struct amd_nb *amd_alloc_nb(int cpu, int nb_id)
			
 
				+static struct amd_nb *amd_alloc_nb(int cpu)
			
 
				 {
			
 
				 	struct amd_nb *nb;
			
 
				 	int i;
			
@@ -285,7 +283,7 @@ static struct amd_nb *amd_alloc_nb(int cpu, int nb_id)
 
				 	if (!nb)
			
 
				 		return NULL;
			
 
				 
			
 
				-	nb->nb_id = nb_id;
			
 
				+	nb->nb_id = -1;
			
 
				 
			
 
				 	/*
			
 
				 	 * initialize all possible NB constraints
			
@@ -306,7 +304,7 @@ static int amd_pmu_cpu_prepare(int cpu)
 
				 	if (boot_cpu_data.x86_max_cores < 2)
			
 
				 		return NOTIFY_OK;
			
 
				 
			
 
				-	cpuc->amd_nb = amd_alloc_nb(cpu, -1);
			
 
				+	cpuc->amd_nb = amd_alloc_nb(cpu);
			
 
				 	if (!cpuc->amd_nb)
			
 
				 		return NOTIFY_BAD;
			
 
				 
			
@@ -325,8 +323,6 @@ static void amd_pmu_cpu_starting(int cpu)
 
				 	nb_id = amd_get_nb_id(cpu);
			
 
				 	WARN_ON_ONCE(nb_id == BAD_APICID);
			
 
				 
			
 
				-	raw_spin_lock(&amd_nb_lock);
			
 
				-
			
 
				 	for_each_online_cpu(i) {
			
 
				 		nb = per_cpu(cpu_hw_events, i).amd_nb;
			
 
				 		if (WARN_ON_ONCE(!nb))
			
@@ -341,8 +337,6 @@ static void amd_pmu_cpu_starting(int cpu)
 
				 
			
 
				 	cpuc->amd_nb->nb_id = nb_id;
			
 
				 	cpuc->amd_nb->refcnt++;
			
 
				-
			
 
				-	raw_spin_unlock(&amd_nb_lock);
			
 
				 }
			
 
				 
			
 
				 static void amd_pmu_cpu_dead(int cpu)
			
@@ -354,8 +348,6 @@ static void amd_pmu_cpu_dead(int cpu)
 
				 
			
 
				 	cpuhw = &per_cpu(cpu_hw_events, cpu);
			
 
				 
			
 
				-	raw_spin_lock(&amd_nb_lock);
			
 
				-
			
 
				 	if (cpuhw->amd_nb) {
			
 
				 		struct amd_nb *nb = cpuhw->amd_nb;
			
 
				 
			
@@ -364,8 +356,6 @@ static void amd_pmu_cpu_dead(int cpu)
 
				 
			
 
				 		cpuhw->amd_nb = NULL;
			
 
				 	}
			
 
				-
			
 
				-	raw_spin_unlock(&amd_nb_lock);
			
 
				 }
			
 
				 
			
 
				 static __initconst const struct x86_pmu amd_pmu = {
			
--- a/arch/x86/kernel/cpu/perf_event_intel.c
+++ b/arch/x86/kernel/cpu/perf_event_intel.c
@@ -816,6 +816,32 @@ static int intel_pmu_hw_config(struct perf_event *event)
 
				 	if (ret)
			
 
				 		return ret;
			
 
				 
			
 
				+	if (event->attr.precise_ip &&
			
 
				+	    (event->hw.config & X86_RAW_EVENT_MASK) == 0x003c) {
			
 
				+		/*
			
 
				+		 * Use an alternative encoding for CPU_CLK_UNHALTED.THREAD_P
			
 
				+		 * (0x003c) so that we can use it with PEBS.
			
 
				+		 *
			
 
				+		 * The regular CPU_CLK_UNHALTED.THREAD_P event (0x003c) isn't
			
 
				+		 * PEBS capable. However we can use INST_RETIRED.ANY_P
			
 
				+		 * (0x00c0), which is a PEBS capable event, to get the same
			
 
				+		 * count.
			
 
				+		 *
			
 
				+		 * INST_RETIRED.ANY_P counts the number of cycles that retires
			
 
				+		 * CNTMASK instructions. By setting CNTMASK to a value (16)
			
 
				+		 * larger than the maximum number of instructions that can be
			
 
				+		 * retired per cycle (4) and then inverting the condition, we
			
 
				+		 * count all cycles that retire 16 or less instructions, which
			
 
				+		 * is every cycle.
			
 
				+		 *
			
 
				+		 * Thereby we gain a PEBS capable cycle counter.
			
 
				+		 */
			
 
				+		u64 alt_config = 0x108000c0; /* INST_RETIRED.TOTAL_CYCLES */
			
 
				+
			
 
				+		alt_config |= (event->hw.config & ~X86_RAW_EVENT_MASK);
			
 
				+		event->hw.config = alt_config;
			
 
				+	}
			
 
				+
			
 
				 	if (event->attr.type != PERF_TYPE_RAW)
			
 
				 		return 0;
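Note on the intel_pmu_hw_config() hunk above: the constant 0x108000c0 can be decomposed into event-select fields. This is a minimal sketch, assuming the usual architectural layout (event in bits 0-7, unit mask in bits 8-15, edge at bit 18, invert at bit 23, counter mask in bits 24-31). The `EVSEL_*` macros, `RAW_EVENT_MASK_SKETCH` and `pebs_cycles_config()` are hypothetical names for illustration. The mask is an assumed equivalent of X86_RAW_EVENT_MASK, which is not defined in this section.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed event-select field layout; names are illustrative only. */
#define EVSEL_EVENT(e)	((uint64_t)(e) & 0xffULL)
#define EVSEL_UMASK(u)	(((uint64_t)(u) & 0xffULL) << 8)
#define EVSEL_EDGE	(1ULL << 18)
#define EVSEL_INV	(1ULL << 23)
#define EVSEL_CMASK(c)	(((uint64_t)(c) & 0xffULL) << 24)

/* Assumed equivalent of X86_RAW_EVENT_MASK: event, umask, edge, inv, cmask. */
#define RAW_EVENT_MASK_SKETCH \
	(EVSEL_EVENT(0xff) | EVSEL_UMASK(0xff) | EVSEL_EDGE | EVSEL_INV | \
	 EVSEL_CMASK(0xff))

/*
 * If a precise (PEBS) cycles event is requested, swap in
 * INST_RETIRED.ANY_P with CMASK=16 and INV set, keeping every bit
 * outside the raw event mask (USR, OS, INT, ...) unchanged.
 */
static uint64_t pebs_cycles_config(uint64_t config, int precise_ip)
{
	if (precise_ip && (config & RAW_EVENT_MASK_SKETCH) == 0x003c) {
		uint64_t alt = EVSEL_EVENT(0xc0) | EVSEL_INV | EVSEL_CMASK(16);

		/* Same value as the literal used in the patch. */
		assert(alt == 0x108000c0ULL);

		alt |= config & ~RAW_EVENT_MASK_SKETCH;
		return alt;
	}
	return config;
}
```

With CMASK=16 the counter would tick only in cycles that retire at least 16 instructions, which never happens when at most 4 retire per cycle. Setting INV flips the comparison, so the counter ticks in every cycle.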
			
 
				 
			
--- a/arch/x86/kernel/cpu/perfctr-watchdog.c
+++ b/arch/x86/kernel/cpu/perfctr-watchdog.c
@@ -16,32 +16,12 @@
 
				 #include <linux/kernel.h>
			
 
				 #include <linux/bitops.h>
			
 
				 #include <linux/smp.h>
			
 
				-#include <linux/nmi.h>
			
 
				+#include <asm/nmi.h>
			
 
				 #include <linux/kprobes.h>
			
 
				 
			
 
				 #include <asm/apic.h>
			
 
				 #include <asm/perf_event.h>
			
 
				 
			
 
				-struct nmi_watchdog_ctlblk {
			
 
				-	unsigned int cccr_msr;
			
 
				-	unsigned int perfctr_msr;  /* the MSR to reset in NMI handler */
			
 
				-	unsigned int evntsel_msr;  /* the MSR to select the events to handle */
			
 
				-};
			
 
				-
			
 
				-/* Interface defining a CPU specific perfctr watchdog */
			
 
				-struct wd_ops {
			
 
				-	int (*reserve)(void);
			
 
				-	void (*unreserve)(void);
			
 
				-	int (*setup)(unsigned nmi_hz);
			
 
				-	void (*rearm)(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz);
			
 
				-	void (*stop)(void);
			
 
				-	unsigned perfctr;
			
 
				-	unsigned evntsel;
			
 
				-	u64 checkbit;
			
 
				-};
			
 
				-
			
 
				-static const struct wd_ops *wd_ops;
			
 
				-
			
 
				 /*
			
 
				  * this number is calculated from Intel's MSR_P4_CRU_ESCR5 register and it's
			
 
				  * offset from MSR_P4_BSU_ESCR0.
			
@@ -60,8 +40,6 @@ static const struct wd_ops *wd_ops;
 
				 static DECLARE_BITMAP(perfctr_nmi_owner, NMI_MAX_COUNTER_BITS);
			
 
				 static DECLARE_BITMAP(evntsel_nmi_owner, NMI_MAX_COUNTER_BITS);
			
 
				 
			
 
				-static DEFINE_PER_CPU(struct nmi_watchdog_ctlblk, nmi_watchdog_ctlblk);
			
 
				-
			
 
				 /* converts an msr to an appropriate reservation bit */
			
 
				 static inline unsigned int nmi_perfctr_msr_to_bit(unsigned int msr)
			
 
				 {
			
@@ -172,623 +150,3 @@ void release_evntsel_nmi(unsigned int msr)
 
				 	clear_bit(counter, evntsel_nmi_owner);
			
 
				 }
			
 
				 EXPORT_SYMBOL(release_evntsel_nmi);
			
 
				-
			
 
				-void disable_lapic_nmi_watchdog(void)
			
 
				-{
			
 
				-	BUG_ON(nmi_watchdog != NMI_LOCAL_APIC);
			
 
				-
			
 
				-	if (atomic_read(&nmi_active) <= 0)
			
 
				-		return;
			
 
				-
			
 
				-	on_each_cpu(stop_apic_nmi_watchdog, NULL, 1);
			
 
				-
			
 
				-	if (wd_ops)
			
 
				-		wd_ops->unreserve();
			
 
				-
			
 
				-	BUG_ON(atomic_read(&nmi_active) != 0);
			
 
				-}
			
 
				-
			
 
				-void enable_lapic_nmi_watchdog(void)
			
 
				-{
			
 
				-	BUG_ON(nmi_watchdog != NMI_LOCAL_APIC);
			
 
				-
			
 
				-	/* are we already enabled */
			
 
				-	if (atomic_read(&nmi_active) != 0)
			
 
				-		return;
			
 
				-
			
 
				-	/* are we lapic aware */
			
 
				-	if (!wd_ops)
			
 
				-		return;
			
 
				-	if (!wd_ops->reserve()) {
			
 
				-		printk(KERN_ERR "NMI watchdog: cannot reserve perfctrs\n");
			
 
				-		return;
			
 
				-	}
			
 
				-
			
 
				-	on_each_cpu(setup_apic_nmi_watchdog, NULL, 1);
			
 
				-	touch_nmi_watchdog();
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * Activate the NMI watchdog via the local APIC.
			
 
				- */
			
 
				-
			
 
				-static unsigned int adjust_for_32bit_ctr(unsigned int hz)
			
 
				-{
			
 
				-	u64 counter_val;
			
 
				-	unsigned int retval = hz;
			
 
				-
			
 
				-	/*
			
 
				-	 * On Intel CPUs with P6/ARCH_PERFMON only 32 bits in the counter
			
 
				-	 * are writable, with higher bits sign extending from bit 31.
			
 
				-	 * So, we can only program the counter with 31 bit values and
			
 
				-	 * 32nd bit should be 1, for 33.. to be 1.
			
 
				-	 * Find the appropriate nmi_hz
			
 
				-	 */
			
 
				-	counter_val = (u64)cpu_khz * 1000;
			
 
				-	do_div(counter_val, retval);
			
 
				-	if (counter_val > 0x7fffffffULL) {
			
 
				-		u64 count = (u64)cpu_khz * 1000;
			
 
				-		do_div(count, 0x7fffffffUL);
			
 
				-		retval = count + 1;
			
 
				-	}
			
 
				-	return retval;
			
 
				-}
			
 
				-
			
 
				-static void write_watchdog_counter(unsigned int perfctr_msr,
			
 
				-				const char *descr, unsigned nmi_hz)
			
 
				-{
			
 
				-	u64 count = (u64)cpu_khz * 1000;
			
 
				-
			
 
				-	do_div(count, nmi_hz);
			
 
				-	if (descr)
			
 
				-		pr_debug("setting %s to -0x%08Lx\n", descr, count);
			
 
				-	wrmsrl(perfctr_msr, 0 - count);
			
 
				-}
			
 
				-
			
 
				-static void write_watchdog_counter32(unsigned int perfctr_msr,
			
 
				-				const char *descr, unsigned nmi_hz)
			
 
				-{
			
 
				-	u64 count = (u64)cpu_khz * 1000;
			
 
				-
			
 
				-	do_div(count, nmi_hz);
			
 
				-	if (descr)
			
 
				-		pr_debug("setting %s to -0x%08Lx\n", descr, count);
			
 
				-	wrmsr(perfctr_msr, (u32)(-count), 0);
			
 
				-}
			
 
				-
			
 
				-/*
			
 
				- * AMD K7/K8/Family10h/Family11h support.
			
 
				- * AMD keeps this interface nicely stable so there is not much variety
			
 
				- */
			
 
				-#define K7_EVNTSEL_ENABLE	(1 << 22)
			
 
				-#define K7_EVNTSEL_INT		(1 << 20)
			
 
				-#define K7_EVNTSEL_OS		(1 << 17)
			
 
				-#define K7_EVNTSEL_USR		(1 << 16)
			
 
				-#define K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING	0x76
			
 
				-#define K7_NMI_EVENT		K7_EVENT_CYCLES_PROCESSOR_IS_RUNNING
			
 
				-
			
 
				-static int setup_k7_watchdog(unsigned nmi_hz)
			
 
				-{
			
 
				-	unsigned int perfctr_msr, evntsel_msr;
			
 
				-	unsigned int evntsel;
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-
			
 
				-	perfctr_msr = wd_ops->perfctr;
			
 
				-	evntsel_msr = wd_ops->evntsel;
			
 
				-
			
 
				-	wrmsrl(perfctr_msr, 0UL);
			
 
				-
			
 
				-	evntsel = K7_EVNTSEL_INT
			
 
				-		| K7_EVNTSEL_OS
			
 
				-		| K7_EVNTSEL_USR
			
 
				-		| K7_NMI_EVENT;
			
 
				-
			
 
				-	/* setup the timer */
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-	write_watchdog_counter(perfctr_msr, "K7_PERFCTR0", nmi_hz);
			
 
				-
			
 
				-	/* initialize the wd struct before enabling */
			
 
				-	wd->perfctr_msr = perfctr_msr;
			
 
				-	wd->evntsel_msr = evntsel_msr;
			
 
				-	wd->cccr_msr = 0;  /* unused */
			
 
				-
			
 
				-	/* ok, everything is initialized, announce that we're set */
			
 
				-	cpu_nmi_set_wd_enabled();
			
 
				-
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-	evntsel |= K7_EVNTSEL_ENABLE;
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				-static void single_msr_stop_watchdog(void)
			
 
				-{
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-
			
 
				-	wrmsr(wd->evntsel_msr, 0, 0);
			
 
				-}
			
 
				-
			
 
				-static int single_msr_reserve(void)
			
 
				-{
			
 
				-	if (!reserve_perfctr_nmi(wd_ops->perfctr))
			
 
				-		return 0;
			
 
				-
			
 
				-	if (!reserve_evntsel_nmi(wd_ops->evntsel)) {
			
 
				-		release_perfctr_nmi(wd_ops->perfctr);
			
 
				-		return 0;
			
 
				-	}
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				-static void single_msr_unreserve(void)
			
 
				-{
			
 
				-	release_evntsel_nmi(wd_ops->evntsel);
			
 
				-	release_perfctr_nmi(wd_ops->perfctr);
			
 
				-}
			
 
				-
			
 
				-static void __kprobes
			
 
				-single_msr_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
			
 
				-{
			
 
				-	/* start the cycle over again */
			
 
				-	write_watchdog_counter(wd->perfctr_msr, NULL, nmi_hz);
			
 
				-}
			
 
				-
			
 
				-static const struct wd_ops k7_wd_ops = {
			
 
				-	.reserve	= single_msr_reserve,
			
 
				-	.unreserve	= single_msr_unreserve,
			
 
				-	.setup		= setup_k7_watchdog,
			
 
				-	.rearm		= single_msr_rearm,
			
 
				-	.stop		= single_msr_stop_watchdog,
			
 
				-	.perfctr	= MSR_K7_PERFCTR0,
			
 
				-	.evntsel	= MSR_K7_EVNTSEL0,
			
 
				-	.checkbit	= 1ULL << 47,
			
 
				-};
			
 
				-
			
 
				-/*
			
 
				- * Intel Model 6 (PPro+,P2,P3,P-M,Core1)
			
 
				- */
			
 
				-#define P6_EVNTSEL0_ENABLE	(1 << 22)
			
 
				-#define P6_EVNTSEL_INT		(1 << 20)
			
 
				-#define P6_EVNTSEL_OS		(1 << 17)
			
 
				-#define P6_EVNTSEL_USR		(1 << 16)
			
 
				-#define P6_EVENT_CPU_CLOCKS_NOT_HALTED	0x79
			
 
				-#define P6_NMI_EVENT		P6_EVENT_CPU_CLOCKS_NOT_HALTED
			
 
				-
			
 
				-static int setup_p6_watchdog(unsigned nmi_hz)
			
 
				-{
			
 
				-	unsigned int perfctr_msr, evntsel_msr;
			
 
				-	unsigned int evntsel;
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-
			
 
				-	perfctr_msr = wd_ops->perfctr;
			
 
				-	evntsel_msr = wd_ops->evntsel;
			
 
				-
			
 
				-	/* KVM doesn't implement this MSR */
			
 
				-	if (wrmsr_safe(perfctr_msr, 0, 0) < 0)
			
 
				-		return 0;
			
 
				-
			
 
				-	evntsel = P6_EVNTSEL_INT
			
 
				-		| P6_EVNTSEL_OS
			
 
				-		| P6_EVNTSEL_USR
			
 
				-		| P6_NMI_EVENT;
			
 
				-
			
 
				-	/* setup the timer */
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-	nmi_hz = adjust_for_32bit_ctr(nmi_hz);
			
 
				-	write_watchdog_counter32(perfctr_msr, "P6_PERFCTR0", nmi_hz);
			
 
				-
			
 
				-	/* initialize the wd struct before enabling */
			
 
				-	wd->perfctr_msr = perfctr_msr;
			
 
				-	wd->evntsel_msr = evntsel_msr;
			
 
				-	wd->cccr_msr = 0;  /* unused */
			
 
				-
			
 
				-	/* ok, everything is initialized, announce that we're set */
			
 
				-	cpu_nmi_set_wd_enabled();
			
 
				-
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-	evntsel |= P6_EVNTSEL0_ENABLE;
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				-static void __kprobes p6_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
			
 
				-{
			
 
				-	/*
			
 
				-	 * P6 based Pentium M need to re-unmask
			
 
				-	 * the apic vector but it doesn't hurt
			
 
				-	 * other P6 variant.
			
 
				-	 * ArchPerfom/Core Duo also needs this
			
 
				-	 */
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-
			
 
				-	/* P6/ARCH_PERFMON has 32 bit counter write */
			
 
				-	write_watchdog_counter32(wd->perfctr_msr, NULL, nmi_hz);
			
 
				-}
			
 
				-
			
 
				-static const struct wd_ops p6_wd_ops = {
			
 
				-	.reserve	= single_msr_reserve,
			
 
				-	.unreserve	= single_msr_unreserve,
			
 
				-	.setup		= setup_p6_watchdog,
			
 
				-	.rearm		= p6_rearm,
			
 
				-	.stop		= single_msr_stop_watchdog,
			
 
				-	.perfctr	= MSR_P6_PERFCTR0,
			
 
				-	.evntsel	= MSR_P6_EVNTSEL0,
			
 
				-	.checkbit	= 1ULL << 39,
			
 
				-};
			
 
				-
			
 
				-/*
			
 
				- * Intel P4 performance counters.
			
 
				- * By far the most complicated of all.
			
 
				- */
			
 
				-#define MSR_P4_MISC_ENABLE_PERF_AVAIL	(1 << 7)
			
 
				-#define P4_ESCR_EVENT_SELECT(N)	((N) << 25)
			
 
				-#define P4_ESCR_OS		(1 << 3)
			
 
				-#define P4_ESCR_USR		(1 << 2)
			
 
				-#define P4_CCCR_OVF_PMI0	(1 << 26)
			
 
				-#define P4_CCCR_OVF_PMI1	(1 << 27)
			
 
				-#define P4_CCCR_THRESHOLD(N)	((N) << 20)
			
 
				-#define P4_CCCR_COMPLEMENT	(1 << 19)
			
 
				-#define P4_CCCR_COMPARE		(1 << 18)
			
 
				-#define P4_CCCR_REQUIRED	(3 << 16)
			
 
				-#define P4_CCCR_ESCR_SELECT(N)	((N) << 13)
			
 
				-#define P4_CCCR_ENABLE		(1 << 12)
			
 
				-#define P4_CCCR_OVF 		(1 << 31)
			
 
				-
			
 
				-#define P4_CONTROLS 18
			
 
				-static unsigned int p4_controls[18] = {
			
 
				-	MSR_P4_BPU_CCCR0,
			
 
				-	MSR_P4_BPU_CCCR1,
			
 
				-	MSR_P4_BPU_CCCR2,
			
 
				-	MSR_P4_BPU_CCCR3,
			
 
				-	MSR_P4_MS_CCCR0,
			
 
				-	MSR_P4_MS_CCCR1,
			
 
				-	MSR_P4_MS_CCCR2,
			
 
				-	MSR_P4_MS_CCCR3,
			
 
				-	MSR_P4_FLAME_CCCR0,
			
 
				-	MSR_P4_FLAME_CCCR1,
			
 
				-	MSR_P4_FLAME_CCCR2,
			
 
				-	MSR_P4_FLAME_CCCR3,
			
 
				-	MSR_P4_IQ_CCCR0,
			
 
				-	MSR_P4_IQ_CCCR1,
			
 
				-	MSR_P4_IQ_CCCR2,
			
 
				-	MSR_P4_IQ_CCCR3,
			
 
				-	MSR_P4_IQ_CCCR4,
			
 
				-	MSR_P4_IQ_CCCR5,
			
 
				-};
			
 
				-/*
			
 
				- * Set up IQ_COUNTER0 to behave like a clock, by having IQ_CCCR0 filter
			
 
				- * CRU_ESCR0 (with any non-null event selector) through a complemented
			
 
				- * max threshold. [IA32-Vol3, Section 14.9.9]
			
 
				- */
			
 
				-static int setup_p4_watchdog(unsigned nmi_hz)
			
 
				-{
			
 
				-	unsigned int perfctr_msr, evntsel_msr, cccr_msr;
			
 
				-	unsigned int evntsel, cccr_val;
			
 
				-	unsigned int misc_enable, dummy;
			
 
				-	unsigned int ht_num;
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-
			
 
				-	rdmsr(MSR_IA32_MISC_ENABLE, misc_enable, dummy);
			
 
				-	if (!(misc_enable & MSR_P4_MISC_ENABLE_PERF_AVAIL))
			
 
				-		return 0;
			
 
				-
			
 
				-#ifdef CONFIG_SMP
			
 
				-	/* detect which hyperthread we are on */
			
 
				-	if (smp_num_siblings == 2) {
			
 
				-		unsigned int ebx, apicid;
			
 
				-
			
 
				-		ebx = cpuid_ebx(1);
			
 
				-		apicid = (ebx >> 24) & 0xff;
			
 
				-		ht_num = apicid & 1;
			
 
				-	} else
			
 
				-#endif
			
 
				-		ht_num = 0;
			
 
				-
			
 
				-	/*
			
 
				-	 * performance counters are shared resources
			
 
				-	 * assign each hyperthread its own set
			
 
				-	 * (re-use the ESCR0 register, seems safe
			
 
				-	 * and keeps the cccr_val the same)
			
 
				-	 */
			
 
				-	if (!ht_num) {
			
 
				-		/* logical cpu 0 */
			
 
				-		perfctr_msr = MSR_P4_IQ_PERFCTR0;
			
 
				-		evntsel_msr = MSR_P4_CRU_ESCR0;
			
 
				-		cccr_msr = MSR_P4_IQ_CCCR0;
			
 
				-		cccr_val = P4_CCCR_OVF_PMI0 | P4_CCCR_ESCR_SELECT(4);
			
 
				-
			
 
				-		/*
			
 
				-		 * If we're on the kdump kernel or other situation, we may
			
 
				-		 * still have other performance counter registers set to
			
 
				-		 * interrupt and they'll keep interrupting forever because
			
 
				-		 * of the P4_CCCR_OVF quirk. So we need to ACK all the
			
 
				-		 * pending interrupts and disable all the registers here,
			
 
				-		 * before reenabling the NMI delivery. Refer to p4_rearm()
			
 
				-		 * about the P4_CCCR_OVF quirk.
			
 
				-		 */
			
 
				-		if (reset_devices) {
			
 
				-			unsigned int low, high;
			
 
				-			int i;
			
 
				-
			
 
				-			for (i = 0; i < P4_CONTROLS; i++) {
			
 
				-				rdmsr(p4_controls[i], low, high);
			
 
				-				low &= ~(P4_CCCR_ENABLE | P4_CCCR_OVF);
			
 
				-				wrmsr(p4_controls[i], low, high);
			
 
				-			}
			
 
				-		}
			
 
				-	} else {
			
 
				-		/* logical cpu 1 */
			
 
				-		perfctr_msr = MSR_P4_IQ_PERFCTR1;
			
 
				-		evntsel_msr = MSR_P4_CRU_ESCR0;
			
 
				-		cccr_msr = MSR_P4_IQ_CCCR1;
			
 
				-
			
 
				-		/* Pentium 4 D processors don't support P4_CCCR_OVF_PMI1 */
			
 
				-		if (boot_cpu_data.x86_model == 4 && boot_cpu_data.x86_mask == 4)
			
 
				-			cccr_val = P4_CCCR_OVF_PMI0;
			
 
				-		else
			
 
				-			cccr_val = P4_CCCR_OVF_PMI1;
			
 
				-		cccr_val |= P4_CCCR_ESCR_SELECT(4);
			
 
				-	}
			
 
				-
			
 
				-	evntsel = P4_ESCR_EVENT_SELECT(0x3F)
			
 
				-		| P4_ESCR_OS
			
 
				-		| P4_ESCR_USR;
			
 
				-
			
 
				-	cccr_val |= P4_CCCR_THRESHOLD(15)
			
 
				-		 | P4_CCCR_COMPLEMENT
			
 
				-		 | P4_CCCR_COMPARE
			
 
				-		 | P4_CCCR_REQUIRED;
			
 
				-
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-	wrmsr(cccr_msr, cccr_val, 0);
			
 
				-	write_watchdog_counter(perfctr_msr, "P4_IQ_COUNTER0", nmi_hz);
			
 
				-
			
 
				-	wd->perfctr_msr = perfctr_msr;
			
 
				-	wd->evntsel_msr = evntsel_msr;
			
 
				-	wd->cccr_msr = cccr_msr;
			
 
				-
			
 
				-	/* ok, everything is initialized, announce that we're set */
			
 
				-	cpu_nmi_set_wd_enabled();
			
 
				-
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-	cccr_val |= P4_CCCR_ENABLE;
			
 
				-	wrmsr(cccr_msr, cccr_val, 0);
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				-static void stop_p4_watchdog(void)
			
 
				-{
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-	wrmsr(wd->cccr_msr, 0, 0);
			
 
				-	wrmsr(wd->evntsel_msr, 0, 0);
			
 
				-}
			
 
				-
			
 
				-static int p4_reserve(void)
			
 
				-{
			
 
				-	if (!reserve_perfctr_nmi(MSR_P4_IQ_PERFCTR0))
			
 
				-		return 0;
			
 
				-#ifdef CONFIG_SMP
			
 
				-	if (smp_num_siblings > 1 && !reserve_perfctr_nmi(MSR_P4_IQ_PERFCTR1))
			
 
				-		goto fail1;
			
 
				-#endif
			
 
				-	if (!reserve_evntsel_nmi(MSR_P4_CRU_ESCR0))
			
 
				-		goto fail2;
			
 
				-	/* RED-PEN why is ESCR1 not reserved here? */
			
 
				-	return 1;
			
 
				- fail2:
			
 
				-#ifdef CONFIG_SMP
			
 
				-	if (smp_num_siblings > 1)
			
 
				-		release_perfctr_nmi(MSR_P4_IQ_PERFCTR1);
			
 
				- fail1:
			
 
				-#endif
			
 
				-	release_perfctr_nmi(MSR_P4_IQ_PERFCTR0);
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-static void p4_unreserve(void)
			
 
				-{
			
 
				-#ifdef CONFIG_SMP
			
 
				-	if (smp_num_siblings > 1)
			
 
				-		release_perfctr_nmi(MSR_P4_IQ_PERFCTR1);
			
 
				-#endif
			
 
				-	release_evntsel_nmi(MSR_P4_CRU_ESCR0);
			
 
				-	release_perfctr_nmi(MSR_P4_IQ_PERFCTR0);
			
 
				-}
			
 
				-
			
 
				-static void __kprobes p4_rearm(struct nmi_watchdog_ctlblk *wd, unsigned nmi_hz)
			
 
				-{
			
 
				-	unsigned dummy;
			
 
				-	/*
			
 
				-	 * P4 quirks:
			
 
				-	 * - An overflown perfctr will assert its interrupt
			
 
				-	 *   until the OVF flag in its CCCR is cleared.
			
 
				-	 * - LVTPC is masked on interrupt and must be
			
 
				-	 *   unmasked by the LVTPC handler.
			
 
				-	 */
			
 
				-	rdmsrl(wd->cccr_msr, dummy);
			
 
				-	dummy &= ~P4_CCCR_OVF;
			
 
				-	wrmsrl(wd->cccr_msr, dummy);
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-	/* start the cycle over again */
			
 
				-	write_watchdog_counter(wd->perfctr_msr, NULL, nmi_hz);
			
 
				-}
			
 
				-
			
 
				-static const struct wd_ops p4_wd_ops = {
			
 
				-	.reserve	= p4_reserve,
			
 
				-	.unreserve	= p4_unreserve,
			
 
				-	.setup		= setup_p4_watchdog,
			
 
				-	.rearm		= p4_rearm,
			
 
				-	.stop		= stop_p4_watchdog,
			
 
				-	/* RED-PEN this is wrong for the other sibling */
			
 
				-	.perfctr	= MSR_P4_BPU_PERFCTR0,
			
 
				-	.evntsel	= MSR_P4_BSU_ESCR0,
			
 
				-	.checkbit	= 1ULL << 39,
			
 
				-};
			
 
				-
			
 
				-/*
			
 
				- * Watchdog using the Intel architected PerfMon.
			
 
				- * Used for Core2 and hopefully all future Intel CPUs.
			
 
				- */
			
 
				-#define ARCH_PERFMON_NMI_EVENT_SEL	ARCH_PERFMON_UNHALTED_CORE_CYCLES_SEL
			
 
				-#define ARCH_PERFMON_NMI_EVENT_UMASK	ARCH_PERFMON_UNHALTED_CORE_CYCLES_UMASK
			
 
				-
			
 
				-static struct wd_ops intel_arch_wd_ops;
			
 
				-
			
 
				-static int setup_intel_arch_watchdog(unsigned nmi_hz)
			
 
				-{
			
 
				-	unsigned int ebx;
			
 
				-	union cpuid10_eax eax;
			
 
				-	unsigned int unused;
			
 
				-	unsigned int perfctr_msr, evntsel_msr;
			
 
				-	unsigned int evntsel;
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-
			
 
				-	/*
			
 
				-	 * Check whether the Architectural PerfMon supports
			
 
				-	 * Unhalted Core Cycles Event or not.
			
 
				-	 * NOTE: Corresponding bit = 0 in ebx indicates event present.
			
 
				-	 */
			
 
				-	cpuid(10, &(eax.full), &ebx, &unused, &unused);
			
 
				-	if ((eax.split.mask_length <
			
 
				-			(ARCH_PERFMON_UNHALTED_CORE_CYCLES_INDEX+1)) ||
			
 
				-	    (ebx & ARCH_PERFMON_UNHALTED_CORE_CYCLES_PRESENT))
			
 
				-		return 0;
			
 
				-
			
 
				-	perfctr_msr = wd_ops->perfctr;
			
 
				-	evntsel_msr = wd_ops->evntsel;
			
 
				-
			
 
				-	wrmsrl(perfctr_msr, 0UL);
			
 
				-
			
 
				-	evntsel = ARCH_PERFMON_EVENTSEL_INT
			
 
				-		| ARCH_PERFMON_EVENTSEL_OS
			
 
				-		| ARCH_PERFMON_EVENTSEL_USR
			
 
				-		| ARCH_PERFMON_NMI_EVENT_SEL
			
 
				-		| ARCH_PERFMON_NMI_EVENT_UMASK;
			
 
				-
			
 
				-	/* setup the timer */
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-	nmi_hz = adjust_for_32bit_ctr(nmi_hz);
			
 
				-	write_watchdog_counter32(perfctr_msr, "INTEL_ARCH_PERFCTR0", nmi_hz);
			
 
				-
			
 
				-	wd->perfctr_msr = perfctr_msr;
			
 
				-	wd->evntsel_msr = evntsel_msr;
			
 
				-	wd->cccr_msr = 0;  /* unused */
			
 
				-
			
 
				-	/* ok, everything is initialized, announce that we're set */
			
 
				-	cpu_nmi_set_wd_enabled();
			
 
				-
			
 
				-	apic_write(APIC_LVTPC, APIC_DM_NMI);
			
 
				-	evntsel |= ARCH_PERFMON_EVENTSEL_ENABLE;
			
 
				-	wrmsr(evntsel_msr, evntsel, 0);
			
 
				-	intel_arch_wd_ops.checkbit = 1ULL << (eax.split.bit_width - 1);
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				-static struct wd_ops intel_arch_wd_ops __read_mostly = {
			
 
				-	.reserve	= single_msr_reserve,
			
 
				-	.unreserve	= single_msr_unreserve,
			
 
				-	.setup		= setup_intel_arch_watchdog,
			
 
				-	.rearm		= p6_rearm,
			
 
				-	.stop		= single_msr_stop_watchdog,
			
 
				-	.perfctr	= MSR_ARCH_PERFMON_PERFCTR1,
			
 
				-	.evntsel	= MSR_ARCH_PERFMON_EVENTSEL1,
			
 
				-};
			
 
				-
			
 
				-static void probe_nmi_watchdog(void)
			
 
				-{
			
 
				-	switch (boot_cpu_data.x86_vendor) {
			
 
				-	case X86_VENDOR_AMD:
			
 
				-		if (boot_cpu_data.x86 == 6 ||
			
 
				-		    (boot_cpu_data.x86 >= 0xf && boot_cpu_data.x86 <= 0x15))
			
 
				-			wd_ops = &k7_wd_ops;
			
 
				-		return;
			
 
				-	case X86_VENDOR_INTEL:
			
 
				-		/* Work around where perfctr1 doesn't have a working enable
			
 
				-		 * bit as described in the following errata:
			
 
				-		 * AE49 Core Duo and Intel Core Solo 65 nm
			
 
				-		 * AN49 Intel Pentium Dual-Core
			
 
				-		 * AF49 Dual-Core Intel Xeon Processor LV
			
 
				-		 */
			
 
				-		if ((boot_cpu_data.x86 == 6 && boot_cpu_data.x86_model == 14) ||
			
 
				-		    ((boot_cpu_data.x86 == 6 && boot_cpu_data.x86_model == 15 &&
			
 
				-		     boot_cpu_data.x86_mask == 4))) {
			
 
				-			intel_arch_wd_ops.perfctr = MSR_ARCH_PERFMON_PERFCTR0;
			
 
				-			intel_arch_wd_ops.evntsel = MSR_ARCH_PERFMON_EVENTSEL0;
			
 
				-		}
			
 
				-		if (cpu_has(&boot_cpu_data, X86_FEATURE_ARCH_PERFMON)) {
			
 
				-			wd_ops = &intel_arch_wd_ops;
			
 
				-			break;
			
 
				-		}
			
 
				-		switch (boot_cpu_data.x86) {
			
 
				-		case 6:
			
 
				-			if (boot_cpu_data.x86_model > 13)
			
 
				-				return;
			
 
				-
			
 
				-			wd_ops = &p6_wd_ops;
			
 
				-			break;
			
 
				-		case 15:
			
 
				-			wd_ops = &p4_wd_ops;
			
 
				-			break;
			
 
				-		default:
			
 
				-			return;
			
 
				-		}
			
 
				-		break;
			
 
				-	}
			
 
				-}
			
 
				-
			
 
				-/* Interface to nmi.c */
			
 
				-
			
 
				-int lapic_watchdog_init(unsigned nmi_hz)
			
 
				-{
			
 
				-	if (!wd_ops) {
			
 
				-		probe_nmi_watchdog();
			
 
				-		if (!wd_ops) {
			
 
				-			printk(KERN_INFO "NMI watchdog: CPU not supported\n");
			
 
				-			return -1;
			
 
				-		}
			
 
				-
			
 
				-		if (!wd_ops->reserve()) {
			
 
				-			printk(KERN_ERR
			
 
				-				"NMI watchdog: cannot reserve perfctrs\n");
			
 
				-			return -1;
			
 
				-		}
			
 
				-	}
			
 
				-
			
 
				-	if (!(wd_ops->setup(nmi_hz))) {
			
 
				-		printk(KERN_ERR "Cannot setup NMI watchdog on CPU %d\n",
			
 
				-		       raw_smp_processor_id());
			
 
				-		return -1;
			
 
				-	}
			
 
				-
			
 
				-	return 0;
			
 
				-}
			
 
				-
			
 
				-void lapic_watchdog_stop(void)
			
 
				-{
			
 
				-	if (wd_ops)
			
 
				-		wd_ops->stop();
			
 
				-}
			
 
				-
			
 
				-unsigned lapic_adjust_nmi_hz(unsigned hz)
			
 
				-{
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-	if (wd->perfctr_msr == MSR_P6_PERFCTR0 ||
			
 
				-	    wd->perfctr_msr == MSR_ARCH_PERFMON_PERFCTR1)
			
 
				-		hz = adjust_for_32bit_ctr(hz);
			
 
				-	return hz;
			
 
				-}
			
 
				-
			
 
				-int __kprobes lapic_wd_event(unsigned nmi_hz)
			
 
				-{
			
 
				-	struct nmi_watchdog_ctlblk *wd = &__get_cpu_var(nmi_watchdog_ctlblk);
			
 
				-	u64 ctr;
			
 
				-
			
 
				-	rdmsrl(wd->perfctr_msr, ctr);
			
 
				-	if (ctr & wd_ops->checkbit) /* perfctr still running? */
			
 
				-		return 0;
			
 
				-
			
 
				-	wd_ops->rearm(wd, nmi_hz);
			
 
				-	return 1;
			
 
				-}
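Note on the removed adjust_for_32bit_ctr() and write_watchdog_counter32() above: the arithmetic can be reproduced in user space. This is a minimal sketch with hypothetical `_sketch` names. Plain 64-bit division stands in for do_div(), and the function returns the value that would be written to the low 32 bits of the counter instead of issuing a wrmsr.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Only 31 bits of the reload value are freely programmable, because
 * bit 31 must be 1 so that the upper bits sign-extend to 1. If one
 * watchdog period at the requested rate needs more than 0x7fffffff
 * cycles, raise the rate until the period fits.
 */
static unsigned int adjust_for_32bit_ctr_sketch(uint32_t cpu_khz,
						unsigned int hz)
{
	uint64_t cycles_per_sec = (uint64_t)cpu_khz * 1000;
	uint64_t counter_val;

	assert(hz != 0);

	counter_val = cycles_per_sec / hz;
	if (counter_val > 0x7fffffffULL)
		return (unsigned int)(cycles_per_sec / 0x7fffffffULL) + 1;

	return hz;
}

/* Low 32 bits of -count: the counter overflows after 'count' cycles. */
static uint32_t watchdog_counter32_sketch(uint32_t cpu_khz,
					  unsigned int nmi_hz)
{
	uint64_t count = (uint64_t)cpu_khz * 1000;

	assert(nmi_hz != 0);

	count /= nmi_hz;
	return (uint32_t)(-count);
}
```

For a 3 GHz CPU (`cpu_khz` = 3000000) and a requested rate of 1 Hz, one period would be 3,000,000,000 cycles, which exceeds 0x7fffffff, so the rate is raised to 2 Hz. The resulting period of 1,500,000,000 cycles fits in 31 bits, and its negation has bit 31 set.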
			
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -175,21 +175,21 @@ static const struct stacktrace_ops print_trace_ops = {
 
				 
			
 
				 void
			
 
				 show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
			
 
				-		unsigned long *stack, unsigned long bp, char *log_lvl)
			
 
				+		unsigned long *stack, char *log_lvl)
			
 
				 {
			
 
				 	printk("%sCall Trace:\n", log_lvl);
			
 
				-	dump_trace(task, regs, stack, bp, &print_trace_ops, log_lvl);
			
 
				+	dump_trace(task, regs, stack, &print_trace_ops, log_lvl);
			
 
				 }
			
 
				 
			
 
				 void show_trace(struct task_struct *task, struct pt_regs *regs,
			
 
				-		unsigned long *stack, unsigned long bp)
			
 
				+		unsigned long *stack)
			
 
				 {
			
 
				-	show_trace_log_lvl(task, regs, stack, bp, "");
			
 
				+	show_trace_log_lvl(task, regs, stack, "");
			
 
				 }
			
 
				 
			
 
				 void show_stack(struct task_struct *task, unsigned long *sp)
			
 
				 {
			
 
				-	show_stack_log_lvl(task, NULL, sp, 0, "");
			
 
				+	show_stack_log_lvl(task, NULL, sp, "");
			
 
				 }
			
 
				 
			
 
				 /*
			
@@ -210,7 +210,7 @@ void dump_stack(void)
 
				 		init_utsname()->release,
			
 
				 		(int)strcspn(init_utsname()->version, " "),
			
 
				 		init_utsname()->version);
			
 
				-	show_trace(NULL, NULL, &stack, bp);
			
 
				+	show_trace(NULL, NULL, &stack);
			
 
				 }
			
 
				 EXPORT_SYMBOL(dump_stack);
			
 
				 
			
--- a/arch/x86/kernel/dumpstack_32.c
+++ b/arch/x86/kernel/dumpstack_32.c
@@ -17,11 +17,12 @@
 
				 #include <asm/stacktrace.h>
			
 
				 
			
 
				 
			
 
				-void dump_trace(struct task_struct *task, struct pt_regs *regs,
			
 
				-		unsigned long *stack, unsigned long bp,
			
 
				+void dump_trace(struct task_struct *task,
			
 
				+		struct pt_regs *regs, unsigned long *stack,
			
 
				 		const struct stacktrace_ops *ops, void *data)
			
 
				 {
			
 
				 	int graph = 0;
			
 
				+	unsigned long bp;
			
 
				 
			
 
				 	if (!task)
			
 
				 		task = current;
			
@@ -34,18 +35,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
 
				 			stack = (unsigned long *)task->thread.sp;
			
 
				 	}
			
 
				 
			
 
				-#ifdef CONFIG_FRAME_POINTER
			
 
				-	if (!bp) {
			
 
				-		if (task == current) {
			
 
				-			/* Grab bp right from our regs */
			
 
				-			get_bp(bp);
			
 
				-		} else {
			
 
				-			/* bp is the last reg pushed by switch_to */
			
 
				-			bp = *(unsigned long *) task->thread.sp;
			
 
				-		}
			
 
				-	}
			
 
				-#endif
			
 
				-
			
 
				+	bp = stack_frame(task, regs);
			
 
				 	for (;;) {
			
 
				 		struct thread_info *context;
			
 
				 
			
@@ -65,7 +55,7 @@ EXPORT_SYMBOL(dump_trace);
 
				 
			
 
				 void
			
 
				 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
			
 
				-		   unsigned long *sp, unsigned long bp, char *log_lvl)
			
 
				+		   unsigned long *sp, char *log_lvl)
			
 
				 {
			
 
				 	unsigned long *stack;
			
 
				 	int i;
			
@@ -87,7 +77,7 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
 
				 		touch_nmi_watchdog();
			
 
				 	}
			
 
				 	printk(KERN_CONT "\n");
			
 
				-	show_trace_log_lvl(task, regs, sp, bp, log_lvl);
			
 
				+	show_trace_log_lvl(task, regs, sp, log_lvl);
			
 
				 }
			
 
				 
			
 
				 
			
@@ -112,8 +102,7 @@ void show_registers(struct pt_regs *regs)
 
				 		u8 *ip;
			
 
				 
			
 
				 		printk(KERN_EMERG "Stack:\n");
			
 
				-		show_stack_log_lvl(NULL, regs, &regs->sp,
			
 
				-				0, KERN_EMERG);
			
 
				+		show_stack_log_lvl(NULL, regs, &regs->sp, KERN_EMERG);
			
 
				 
			
 
				 		printk(KERN_EMERG "Code: ");
			
 
				 
			
--- a/arch/x86/kernel/dumpstack_64.c
+++ b/arch/x86/kernel/dumpstack_64.c
@@ -139,8 +139,8 @@ fixup_bp_irq_link(unsigned long bp, unsigned long *stack,
 
				  * severe exception (double fault, nmi, stack fault, debug, mce) hardware stack
			
 
				  */
			
 
				 
			
 
				-void dump_trace(struct task_struct *task, struct pt_regs *regs,
			
 
				-		unsigned long *stack, unsigned long bp,
			
 
				+void dump_trace(struct task_struct *task,
			
 
				+		struct pt_regs *regs, unsigned long *stack,
			
 
				 		const struct stacktrace_ops *ops, void *data)
			
 
				 {
			
 
				 	const unsigned cpu = get_cpu();
			
@@ -149,6 +149,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
 
				 	unsigned used = 0;
			
 
				 	struct thread_info *tinfo;
			
 
				 	int graph = 0;
			
 
				+	unsigned long bp;
			
 
				 
			
 
				 	if (!task)
			
 
				 		task = current;
			
@@ -160,18 +161,7 @@ void dump_trace(struct task_struct *task, struct pt_regs *regs,
 
				 			stack = (unsigned long *)task->thread.sp;
			
 
				 	}
			
 
				 
			
 
				-#ifdef CONFIG_FRAME_POINTER
			
 
				-	if (!bp) {
			
 
				-		if (task == current) {
			
 
				-			/* Grab bp right from our regs */
			
 
				-			get_bp(bp);
			
 
				-		} else {
			
 
				-			/* bp is the last reg pushed by switch_to */
			
 
				-			bp = *(unsigned long *) task->thread.sp;
			
 
				-		}
			
 
				-	}
			
 
				-#endif
			
 
				-
			
 
				+	bp = stack_frame(task, regs);
			
 
				 	/*
			
 
				 	 * Print function call entries in all stacks, starting at the
			
 
				 	 * current stack address. If the stacks consist of nested
			
@@ -235,7 +225,7 @@ EXPORT_SYMBOL(dump_trace);
 
				 
			
 
				 void
			
 
				 show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
			
 
				-		   unsigned long *sp, unsigned long bp, char *log_lvl)
			
 
				+		   unsigned long *sp, char *log_lvl)
			
 
				 {
			
 
				 	unsigned long *irq_stack_end;
			
 
				 	unsigned long *irq_stack;
			
@@ -279,7 +269,7 @@ show_stack_log_lvl(struct task_struct *task, struct pt_regs *regs,
 
				 	preempt_enable();
			
 
				 
			
 
				 	printk(KERN_CONT "\n");
			
 
				-	show_trace_log_lvl(task, regs, sp, bp, log_lvl);
			
 
				+	show_trace_log_lvl(task, regs, sp, log_lvl);
			
 
				 }
			
 
				 
			
 
				 void show_registers(struct pt_regs *regs)
			
@@ -308,7 +298,7 @@ void show_registers(struct pt_regs *regs)
 
				 
			
 
				 		printk(KERN_EMERG "Stack:\n");
			
 
				 		show_stack_log_lvl(NULL, regs, (unsigned long *)sp,
			
 
				-				regs->bp, KERN_EMERG);
			
 
				+				   KERN_EMERG);
			
 
				 
			
 
				 		printk(KERN_EMERG "Code: ");
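Note on the dumpstack hunks above: both dump_trace() variants drop the open-coded CONFIG_FRAME_POINTER block and call stack_frame(task, regs) instead. The helper's definition is not part of this excerpt, so the following is only a user-space sketch of the selection logic the removed lines performed. It assumes that the helper also prefers regs->bp when regs is supplied. The struct layouts, current_task, and read_frame_pointer() are hypothetical stand-ins, not kernel definitions, and __builtin_frame_address() is a GCC/Clang builtin.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; only the fields used here. */
struct pt_regs { unsigned long bp; };
struct thread_struct { unsigned long sp; };
struct task_struct { struct thread_struct thread; };

/* Stand-in for the kernel's `current` task pointer. */
static struct task_struct *current_task;

/* Stand-in for get_bp(): the kernel macro reads the frame-pointer register. */
static unsigned long read_frame_pointer(void)
{
	return (unsigned long)__builtin_frame_address(0);
}

/* Pick the frame pointer a stack walk should start from. */
static unsigned long stack_frame_sketch(struct task_struct *task,
					struct pt_regs *regs)
{
	/* Assumption: a saved register set, when present, takes priority. */
	if (regs)
		return regs->bp;

	assert(task != NULL);

	/* Walking our own stack: grab bp right from our registers. */
	if (task == current_task)
		return read_frame_pointer();

	/* Sleeping task: bp is the last register pushed by switch_to. */
	return *(unsigned long *)task->thread.sp;
}
```

With this shape, callers no longer need to thread a bp argument through show_trace(), show_stack_log_lvl(), and dump_trace(), which is what the signature changes in these hunks reflect.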
			
 
				 
			
--- a/arch/x86/kernel/kprobes.c
+++ b/arch/x86/kernel/kprobes.c
@@ -1184,6 +1184,10 @@ static void __kprobes optimized_callback(struct optimized_kprobe *op,
 
				 {
			
 
				 	struct kprobe_ctlblk *kcb = get_kprobe_ctlblk();
			
 
				 
			
 
				+	/* This is possible if op is under delayed unoptimizing */
			
 
				+	if (kprobe_disabled(&op->kp))
			
 
				+		return;
			
 
				+
			
 
				 	preempt_disable();
			
 
				 	if (kprobe_running()) {
			
 
				 		kprobes_inc_nmissed_count(&op->kp);
			
@@ -1401,10 +1405,16 @@ int __kprobes arch_prepare_optimized_kprobe(struct optimized_kprobe *op)
 
				 	return 0;
			
 
				 }
			
 
				 
			
 
				-/* Replace a breakpoint (int3) with a relative jump.  */
			
 
				-int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
			
 
				+#define MAX_OPTIMIZE_PROBES 256
			
 
				+static struct text_poke_param *jump_poke_params;
			
 
				+static struct jump_poke_buffer {
			
 
				+	u8 buf[RELATIVEJUMP_SIZE];
			
 
				+} *jump_poke_bufs;
			
 
				+
			
 
				+static void __kprobes setup_optimize_kprobe(struct text_poke_param *tprm,
			
 
				+					    u8 *insn_buf,
			
 
				+					    struct optimized_kprobe *op)
			
 
				 {
			
 
				-	unsigned char jmp_code[RELATIVEJUMP_SIZE];
			
 
				 	s32 rel = (s32)((long)op->optinsn.insn -
			
 
				 			((long)op->kp.addr + RELATIVEJUMP_SIZE));
			
 
				 
			
@@ -1412,16 +1422,79 @@ int __kprobes arch_optimize_kprobe(struct optimized_kprobe *op)
 
				 	memcpy(op->optinsn.copied_insn, op->kp.addr + INT3_SIZE,
			
 
				 	       RELATIVE_ADDR_SIZE);
			
 
				 
			
 
				-	jmp_code[0] = RELATIVEJUMP_OPCODE;
			
 
				-	*(s32 *)(&jmp_code[1]) = rel;
			
 
				+	insn_buf[0] = RELATIVEJUMP_OPCODE;
			
 
				+	*(s32 *)(&insn_buf[1]) = rel;
			
 
				+
			
 
				+	tprm->addr = op->kp.addr;
			
 
				+	tprm->opcode = insn_buf;
			
 
				+	tprm->len = RELATIVEJUMP_SIZE;
			
 
				+}
			
 
				+
			
 
				+/*
			
 
				+ * Replace breakpoints (int3) with relative jumps.
			
 
				+ * Caller must call with locking kprobe_mutex and text_mutex.
			
 
				+ */
			
 
				+void __kprobes arch_optimize_kprobes(struct list_head *oplist)
			
 
				+{
			
 
				+	struct optimized_kprobe *op, *tmp;
			
 
				+	int c = 0;
			
 
				+
			
 
				+	list_for_each_entry_safe(op, tmp, oplist, list) {
			
 
				+		WARN_ON(kprobe_disabled(&op->kp));
			
 
				+		/* Setup param */
			
 
				+		setup_optimize_kprobe(&jump_poke_params[c],
			
 
				+				      jump_poke_bufs[c].buf, op);
			
 
				+		list_del_init(&op->list);
			
 
				+		if (++c >= MAX_OPTIMIZE_PROBES)
			
 
				+			break;
			
 
				+	}
			
 
				 
			
 
				 	/*
			
 
				 	 * text_poke_smp doesn't support NMI/MCE code modifying.
			
 
				 	 * However, since kprobes itself also doesn't support NMI/MCE
			
 
				 	 * code probing, it's not a problem.
			
 
				 	 */
			
 
				-	text_poke_smp(op->kp.addr, jmp_code, RELATIVEJUMP_SIZE);
			
 
				-	return 0;
			
 
				+	text_poke_smp_batch(jump_poke_params, c);
			
 
				+}
			
 
				+
			
 
				+static void __kprobes setup_unoptimize_kprobe(struct text_poke_param *tprm,
			
 
				+					      u8 *insn_buf,
			
 
				+					      struct optimized_kprobe *op)
			
 
				+{
			
 
				+	/* Set int3 to first byte for kprobes */
			
 
				+	insn_buf[0] = BREAKPOINT_INSTRUCTION;
			
 
				+	memcpy(insn_buf + 1, op->optinsn.copied_insn, RELATIVE_ADDR_SIZE);
			
 
				+
			
 
				+	tprm->addr = op->kp.addr;
			
 
				+	tprm->opcode = insn_buf;
			
 
				+	tprm->len = RELATIVEJUMP_SIZE;
			
 
				+}
			
 
				+
			
 
				+/*
			
 
				+ * Recover original instructions and breakpoints from relative jumps.
			
 
				+ * Caller must call with locking kprobe_mutex.
			
 
				+ */
			
 
				+extern void arch_unoptimize_kprobes(struct list_head *oplist,
			
 
				+				    struct list_head *done_list)
			
 
				+{
			
 
				+	struct optimized_kprobe *op, *tmp;
			
 
				+	int c = 0;
			
 
				+
			
 
				+	list_for_each_entry_safe(op, tmp, oplist, list) {
			
 
				+		/* Setup param */
			
 
				+		setup_unoptimize_kprobe(&jump_poke_params[c],
			
 
				+					jump_poke_bufs[c].buf, op);
			
 
				+		list_move(&op->list, done_list);
			
 
				+		if (++c >= MAX_OPTIMIZE_PROBES)
			
 
				+			break;
			
 
				+	}
			
 
				+
			
 
				+	/*
			
 
				+	 * text_poke_smp doesn't support NMI/MCE code modifying.
			
 
				+	 * However, since kprobes itself also doesn't support NMI/MCE
			
 
				+	 * code probing, it's not a problem.
			
 
				+	 */
			
 
				+	text_poke_smp_batch(jump_poke_params, c);
			
 
				 }
			
 
				 
			
 
				 /* Replace a relative jump with a breakpoint (int3).  */
			
@@ -1453,11 +1526,35 @@ static int  __kprobes setup_detour_execution(struct kprobe *p,
 
				 	}
			
 
				 	return 0;
			
 
				 }
			
 
				+
			
 
				+static int __kprobes init_poke_params(void)
			
 
				+{
			
 
				+	/* Allocate code buffer and parameter array */
			
 
				+	jump_poke_bufs = kmalloc(sizeof(struct jump_poke_buffer) *
			
 
				+				 MAX_OPTIMIZE_PROBES, GFP_KERNEL);
			
 
				+	if (!jump_poke_bufs)
			
 
				+		return -ENOMEM;
			
 
				+
			
 
				+	jump_poke_params = kmalloc(sizeof(struct text_poke_param) *
			
 
				+				   MAX_OPTIMIZE_PROBES, GFP_KERNEL);
			
 
				+	if (!jump_poke_params) {
			
 
				+		kfree(jump_poke_bufs);
			
 
				+		jump_poke_bufs = NULL;
			
 
				+		return -ENOMEM;
			
 
				+	}
			
 
				+
			
 
				+	return 0;
			
 
				+}
			
 
				+#else	/* !CONFIG_OPTPROBES */
			
 
				+static int __kprobes init_poke_params(void)
			
 
				+{
			
 
				+	return 0;
			
 
				+}
			
 
				 #endif
			
 
				 
			
 
				 int __init arch_init_kprobes(void)
			
 
				 {
			
 
				-	return 0;
			
 
				+	return init_poke_params();
			
 
				 }
			
 
				 
			
 
				 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
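Note on the kprobes.c hunks above: arch_optimize_kprobes() fills a preallocated parameter array with one 5-byte relative jump per probe, stops at MAX_OPTIMIZE_PROBES entries, and hands the batch to text_poke_smp_batch() in one call. A minimal user-space sketch of that fill step follows. The names poke_param, probe, setup_jump, and fill_batch are hypothetical, plain arrays replace the kernel list, and the memcpy of the displacement assumes a little-endian host, as on x86.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define JUMP_OPCODE 0xe9	/* x86 near jmp with a 32-bit displacement */
#define JUMP_SIZE   5		/* opcode byte + rel32 */
#define MAX_BATCH   256		/* mirrors MAX_OPTIMIZE_PROBES in the patch */

/* Hypothetical user-space analogue of struct text_poke_param. */
struct poke_param {
	uintptr_t addr;
	const uint8_t *opcode;
	size_t len;
};

/* Hypothetical probe record: patch site and detour buffer address. */
struct probe {
	uintptr_t addr;
	uintptr_t detour;
};

/* Encode "jmp detour" as it would appear at pr->addr. */
static void setup_jump(struct poke_param *p, uint8_t *buf,
		       const struct probe *pr)
{
	/* Displacement is relative to the end of the jump instruction. */
	int32_t rel = (int32_t)((intptr_t)pr->detour -
				((intptr_t)pr->addr + JUMP_SIZE));

	buf[0] = JUMP_OPCODE;
	memcpy(&buf[1], &rel, sizeof(rel));

	p->addr = pr->addr;
	p->opcode = buf;
	p->len = JUMP_SIZE;
}

/* Fill at most MAX_BATCH entries; return how many were prepared. */
static size_t fill_batch(struct poke_param *params,
			 uint8_t (*bufs)[JUMP_SIZE],
			 const struct probe *probes, size_t n)
{
	size_t c = 0;

	assert(params != NULL && bufs != NULL);
	while (c < n && c < MAX_BATCH) {
		setup_jump(&params[c], bufs[c], &probes[c]);
		c++;
	}
	return c;
}
```

Probes beyond the cap stay queued for a later pass, which matches the patch leaving unprocessed entries on oplist when the loop breaks.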
			
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -91,8 +91,7 @@ void exit_thread(void)
 
				 void show_regs(struct pt_regs *regs)
			
 
				 {
			
 
				 	show_registers(regs);
			
 
				-	show_trace(NULL, regs, (unsigned long *)kernel_stack_pointer(regs),
			
 
				-		   regs->bp);
			
 
				+	show_trace(NULL, regs, (unsigned long *)kernel_stack_pointer(regs));
			
 
				 }
			
 
				 
			
 
				 void show_regs_common(void)
			
@@ -374,6 +373,7 @@ void default_idle(void)
 
				 {
			
 
				 	if (hlt_use_halt()) {
			
 
				 		trace_power_start(POWER_CSTATE, 1, smp_processor_id());
			
 
				+		trace_cpu_idle(1, smp_processor_id());
			
 
				 		current_thread_info()->status &= ~TS_POLLING;
			
 
				 		/*
			
 
				 		 * TS_POLLING-cleared state must be visible before we
			
@@ -444,6 +444,7 @@ EXPORT_SYMBOL_GPL(cpu_idle_wait);
 
				 void mwait_idle_with_hints(unsigned long ax, unsigned long cx)
			
 
				 {
			
 
				 	trace_power_start(POWER_CSTATE, (ax>>4)+1, smp_processor_id());
			
 
				+	trace_cpu_idle((ax>>4)+1, smp_processor_id());
			
 
				 	if (!need_resched()) {
			
 
				 		if (cpu_has(&current_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
			
 
				 			clflush((void *)&current_thread_info()->flags);
			
@@ -460,6 +461,7 @@ static void mwait_idle(void)
 
				 {
			
 
				 	if (!need_resched()) {
			
 
				 		trace_power_start(POWER_CSTATE, 1, smp_processor_id());
			
 
				+		trace_cpu_idle(1, smp_processor_id());
			
 
				 		if (cpu_has(&current_cpu_data, X86_FEATURE_CLFLUSH_MONITOR))
			
 
				 			clflush((void *)&current_thread_info()->flags);
			
 
				 
			
@@ -481,10 +483,12 @@ static void mwait_idle(void)
 
				 static void poll_idle(void)
			
 
				 {
			
 
				 	trace_power_start(POWER_CSTATE, 0, smp_processor_id());
			
 
				+	trace_cpu_idle(0, smp_processor_id());
			
 
				 	local_irq_enable();
			
 
				 	while (!need_resched())
			
 
				 		cpu_relax();
			
 
				-	trace_power_end(0);
			
 
				+	trace_power_end(smp_processor_id());
			
 
				+	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
			
 
				 }
			
 
				 
			
 
				 /*
			
--- a/arch/x86/kernel/process_32.c
+++ b/arch/x86/kernel/process_32.c
@@ -113,8 +113,8 @@ void cpu_idle(void)
 
				 			stop_critical_timings();
			
 
				 			pm_idle();
			
 
				 			start_critical_timings();
			
 
				-
			
 
				 			trace_power_end(smp_processor_id());
			
 
				+			trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
			
 
				 		}
			
 
				 		tick_nohz_restart_sched_tick();
			
 
				 		preempt_enable_no_resched();
			
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -142,6 +142,8 @@ void cpu_idle(void)
 
				 			start_critical_timings();
			
 
				 
			
 
				 			trace_power_end(smp_processor_id());
			
 
				+			trace_cpu_idle(PWR_EVENT_EXIT,
			
 
				+				       smp_processor_id());
			
 
				 
			
 
				 			/* In many cases the interrupt that ended idle
			
 
				 			   has already called exit_idle. But some idle
			
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -281,6 +281,13 @@ static void __cpuinit smp_callin(void)
 
				 	 */
			
 
				 	smp_store_cpu_info(cpuid);
			
 
				 
			
 
				+	/*
			
 
				+	 * This must be done before setting cpu_online_mask
			
 
				+	 * or calling notify_cpu_starting.
			
 
				+	 */
			
 
				+	set_cpu_sibling_map(raw_smp_processor_id());
			
 
				+	wmb();
			
 
				+
			
 
				 	notify_cpu_starting(cpuid);
			
 
				 
			
 
				 	/*
			
@@ -316,16 +323,6 @@ notrace static void __cpuinit start_secondary(void *unused)
 
				 	 */
			
 
				 	check_tsc_sync_target();
			
 
				 
			
 
				-	if (nmi_watchdog == NMI_IO_APIC) {
			
 
				-		legacy_pic->mask(0);
			
 
				-		enable_NMI_through_LVT0();
			
 
				-		legacy_pic->unmask(0);
			
 
				-	}
			
 
				-
			
 
				-	/* This must be done before setting cpu_online_mask */
			
 
				-	set_cpu_sibling_map(raw_smp_processor_id());
			
 
				-	wmb();
			
 
				-
			
 
				 	/*
			
 
				 	 * We need to hold call_lock, so there is no inconsistency
			
 
				 	 * between the time smp_call_function() determines number of
			
@@ -1061,8 +1058,6 @@ static int __init smp_sanity_check(unsigned max_cpus)
 
				 		printk(KERN_INFO "SMP mode deactivated.\n");
			
 
				 		smpboot_clear_io_apic();
			
 
				 
			
 
				-		localise_nmi_watchdog();
			
 
				-
			
 
				 		connect_bsp_APIC();
			
 
				 		setup_local_APIC();
			
 
				 		end_local_APIC_setup();
			
@@ -1196,7 +1191,6 @@ void __init native_smp_cpus_done(unsigned int max_cpus)
 
				 #ifdef CONFIG_X86_IO_APIC
			
 
				 	setup_ioapic_dest();
			
 
				 #endif
			
 
				-	check_nmi_watchdog();
			
 
				 	mtrr_aps_init();
			
 
				 }
			
 
				 
			
@@ -1341,8 +1335,6 @@ int native_cpu_disable(void)
 
				 	if (cpu == 0)
			
 
				 		return -EBUSY;
			
 
				 
			
 
				-	if (nmi_watchdog == NMI_LOCAL_APIC)
			
 
				-		stop_apic_nmi_watchdog(NULL);
			
 
				 	clear_local_APIC();
			
 
				 
			
 
				 	cpu_disable_common();
			
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -73,22 +73,22 @@ static const struct stacktrace_ops save_stack_ops_nosched = {
 
				  */
			
 
				 void save_stack_trace(struct stack_trace *trace)
			
 
				 {
			
 
				-	dump_trace(current, NULL, NULL, 0, &save_stack_ops, trace);
			
 
				+	dump_trace(current, NULL, NULL, &save_stack_ops, trace);
			
 
				 	if (trace->nr_entries < trace->max_entries)
			
 
				 		trace->entries[trace->nr_entries++] = ULONG_MAX;
			
 
				 }
			
 
				 EXPORT_SYMBOL_GPL(save_stack_trace);
			
 
				 
			
 
				-void save_stack_trace_bp(struct stack_trace *trace, unsigned long bp)
			
 
				+void save_stack_trace_regs(struct stack_trace *trace, struct pt_regs *regs)
			
 
				 {
			
 
				-	dump_trace(current, NULL, NULL, bp, &save_stack_ops, trace);
			
 
				+	dump_trace(current, regs, NULL, &save_stack_ops, trace);
			
 
				 	if (trace->nr_entries < trace->max_entries)
			
 
				 		trace->entries[trace->nr_entries++] = ULONG_MAX;
			
 
				 }
			
 
				 
			
 
				 void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
			
 
				 {
			
 
				-	dump_trace(tsk, NULL, NULL, 0, &save_stack_ops_nosched, trace);
			
 
				+	dump_trace(tsk, NULL, NULL, &save_stack_ops_nosched, trace);
			
 
				 	if (trace->nr_entries < trace->max_entries)
			
 
				 		trace->entries[trace->nr_entries++] = ULONG_MAX;
			
 
				 }
			
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -22,10 +22,6 @@
 
				 #include <asm/hpet.h>
			
 
				 #include <asm/time.h>
			
 
				 
			
 
				-#if defined(CONFIG_X86_32) && defined(CONFIG_X86_IO_APIC)
			
 
				-int timer_ack;
			
 
				-#endif
			
 
				-
			
 
				 #ifdef CONFIG_X86_64
			
 
				 volatile unsigned long __jiffies __section_jiffies = INITIAL_JIFFIES;
			
 
				 #endif
			
@@ -63,20 +59,6 @@ static irqreturn_t timer_interrupt(int irq, void *dev_id)
 
				 	/* Keep nmi watchdog up to date */
			
 
				 	inc_irq_stat(irq0_irqs);
			
 
				 
			
 
				-	/* Optimized out for !IO_APIC and x86_64 */
			
 
				-	if (timer_ack) {
			
 
				-		/*
			
 
				-		 * Subtle, when I/O APICs are used we have to ack timer IRQ
			
 
				-		 * manually to deassert NMI lines for the watchdog if run
			
 
				-		 * on an 82489DX-based system.
			
 
				-		 */
			
 
				-		raw_spin_lock(&i8259A_lock);
			
 
				-		outb(0x0c, PIC_MASTER_OCW3);
			
 
				-		/* Ack the IRQ; AEOI will end it automatically. */
			
 
				-		inb(PIC_MASTER_POLL);
			
 
				-		raw_spin_unlock(&i8259A_lock);
			
 
				-	}
			
 
				-
			
 
				 	global_clock_event->event_handler(global_clock_event);
			
 
				 
			
 
				 	/* MCA bus quirk: Acknowledge irq0 by setting bit 7 in port 0x61 */
			
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -83,6 +83,8 @@ EXPORT_SYMBOL_GPL(used_vectors);
 
				 
			
 
				 static int ignore_nmis;
			
 
				 
			
 
				+int unknown_nmi_panic;
			
 
				+
			
 
				 static inline void conditional_sti(struct pt_regs *regs)
			
 
				 {
			
 
				 	if (regs->flags & X86_EFLAGS_IF)
			
@@ -300,6 +302,13 @@ gp_in_kernel:
 
				 	die("general protection fault", regs, error_code);
			
 
				 }
			
 
				 
			
 
				+static int __init setup_unknown_nmi_panic(char *str)
			
 
				+{
			
 
				+	unknown_nmi_panic = 1;
			
 
				+	return 1;
			
 
				+}
			
 
				+__setup("unknown_nmi_panic", setup_unknown_nmi_panic);
			
 
				+
			
 
				 static notrace __kprobes void
			
 
				 mem_parity_error(unsigned char reason, struct pt_regs *regs)
			
 
				 {
			
@@ -342,9 +351,11 @@ io_check_error(unsigned char reason, struct pt_regs *regs)
 
				 	reason = (reason & 0xf) | 8;
			
 
				 	outb(reason, 0x61);
			
 
				 
			
 
				-	i = 2000;
			
 
				-	while (--i)
			
 
				-		udelay(1000);
			
 
				+	i = 20000;
			
 
				+	while (--i) {
			
 
				+		touch_nmi_watchdog();
			
 
				+		udelay(100);
			
 
				+	}
			
 
				 
			
 
				 	reason &= ~8;
			
 
				 	outb(reason, 0x61);
			
@@ -371,7 +382,7 @@ unknown_nmi_error(unsigned char reason, struct pt_regs *regs)
 
				 			reason, smp_processor_id());
			
 
				 
			
 
				 	printk(KERN_EMERG "Do you have a strange power saving mode enabled?\n");
			
 
				-	if (panic_on_unrecovered_nmi)
			
 
				+	if (unknown_nmi_panic || panic_on_unrecovered_nmi)
			
 
				 		panic("NMI: Not continuing");
			
 
				 
			
 
				 	printk(KERN_EMERG "Dazed and confused, but trying to continue\n");
			
@@ -397,20 +408,8 @@ static notrace __kprobes void default_do_nmi(struct pt_regs *regs)
 
				 		if (notify_die(DIE_NMI, "nmi", regs, reason, 2, SIGINT)
			
 
				 							== NOTIFY_STOP)
			
 
				 			return;
			
 
				-
			
 
				-#ifndef CONFIG_LOCKUP_DETECTOR
			
 
				-		/*
			
 
				-		 * Ok, so this is none of the documented NMI sources,
			
 
				-		 * so it must be the NMI watchdog.
			
 
				-		 */
			
 
				-		if (nmi_watchdog_tick(regs, reason))
			
 
				-			return;
			
 
				-		if (!do_nmi_callback(regs, cpu))
			
 
				-#endif /* !CONFIG_LOCKUP_DETECTOR */
			
 
				-			unknown_nmi_error(reason, regs);
			
 
				-#else
			
 
				-		unknown_nmi_error(reason, regs);
			
 
				 #endif
			
 
				+		unknown_nmi_error(reason, regs);
			
 
				 
			
 
				 		return;
			
 
				 	}
			
@@ -446,14 +445,12 @@ do_nmi(struct pt_regs *regs, long error_code)
 
				 
			
 
				 void stop_nmi(void)
			
 
				 {
			
 
				-	acpi_nmi_disable();
			
 
				 	ignore_nmis++;
			
 
				 }
			
 
				 
			
 
				 void restart_nmi(void)
			
 
				 {
			
 
				 	ignore_nmis--;
			
 
				-	acpi_nmi_enable();
			
 
				 }
			
 
				 
			
 
				 /* May run on IST stack. */
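Note on the io_check_error() hunk above: the loop changes from 2000 steps of udelay(1000) to 20000 steps of udelay(100) with touch_nmi_watchdog() in each step. The total wait stays at roughly two seconds; what changes is the longest interval without a watchdog touch, presumably so the lockup detector does not fire during the wait (the patch does not state the reason). The arithmetic can be checked with a small sketch; loop_delay_us is a hypothetical helper, not a kernel function.

```c
#include <assert.h>

/*
 * Total microseconds spent by a loop of the form
 *     i = start; while (--i) udelay(step_us);
 * The pre-decrement means the body runs start - 1 times.
 */
static unsigned long loop_delay_us(unsigned long start, unsigned long step_us)
{
	assert(start > 0);
	return (start - 1) * step_us;
}
```

For the old loop this gives 1999 * 1000 = 1999000 us, and for the new loop 19999 * 100 = 1999900 us, so the overall delay is effectively unchanged.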
			
--- a/arch/x86/mm/kmemcheck/error.c
+++ b/arch/x86/mm/kmemcheck/error.c
@@ -185,7 +185,7 @@ void kmemcheck_error_save(enum kmemcheck_shadow state,
 
				 	e->trace.entries = e->trace_entries;
			
 
				 	e->trace.max_entries = ARRAY_SIZE(e->trace_entries);
			
 
				 	e->trace.skip = 0;
			
 
				-	save_stack_trace_bp(&e->trace, regs->bp);
			
 
				+	save_stack_trace_regs(&e->trace, regs);
			
 
				 
			
 
				 	/* Round address down to nearest 16 bytes */
			
 
				 	shadow_copy = kmemcheck_shadow_lookup(address
			
--- a/arch/x86/oprofile/backtrace.c
+++ b/arch/x86/oprofile/backtrace.c
@@ -126,7 +126,7 @@ x86_backtrace(struct pt_regs * const regs, unsigned int depth)
 
				 	if (!user_mode_vm(regs)) {
			
 
				 		unsigned long stack = kernel_stack_pointer(regs);
			
 
				 		if (depth)
			
 
				-			dump_trace(NULL, regs, (unsigned long *)stack, 0,
			
 
				+			dump_trace(NULL, regs, (unsigned long *)stack,
			
 
				 				   &backtrace_ops, &depth);
			
 
				 		return;
			
 
				 	}
			
--- a/arch/x86/oprofile/nmi_int.c
+++ b/arch/x86/oprofile/nmi_int.c
@@ -732,6 +732,9 @@ int __init op_nmi_init(struct oprofile_operations *ops)
 
				 		case 0x14:
			
 
				 			cpu_type = "x86-64/family14h";
			
 
				 			break;
			
 
				+		case 0x15:
			
 
				+			cpu_type = "x86-64/family15h";
			
 
				+			break;
			
 
				 		default:
			
 
				 			return -ENODEV;
			
 
				 		}
			
--- a/arch/x86/oprofile/nmi_timer_int.c
+++ b/arch/x86/oprofile/nmi_timer_int.c
@@ -58,9 +58,6 @@ static void timer_stop(void)
 
				 
			
 
				 int __init op_nmi_timer_init(struct oprofile_operations *ops)
			
 
				 {
			
 
				-	if ((nmi_watchdog != NMI_IO_APIC) || (atomic_read(&nmi_active) <= 0))
			
 
				-		return -ENODEV;
			
 
				-
			
 
				 	ops->start = timer_start;
			
 
				 	ops->stop = timer_stop;
			
 
				 	ops->cpu_type = "timer";
			
--- a/arch/x86/oprofile/op_model_amd.c
+++ b/arch/x86/oprofile/op_model_amd.c
@@ -29,11 +29,12 @@
 
				 #include "op_x86_model.h"
			
 
				 #include "op_counter.h"
			
 
				 
			
 
				-#define NUM_COUNTERS 4
			
 
				+#define NUM_COUNTERS		4
			
 
				+#define NUM_COUNTERS_F15H	6
			
 
				 #ifdef CONFIG_OPROFILE_EVENT_MULTIPLEX
			
 
				-#define NUM_VIRT_COUNTERS 32
			
 
				+#define NUM_VIRT_COUNTERS	32
			
 
				 #else
			
 
				-#define NUM_VIRT_COUNTERS NUM_COUNTERS
			
 
				+#define NUM_VIRT_COUNTERS	0
			
 
				 #endif
			
 
				 
			
 
				 #define OP_EVENT_MASK			0x0FFF
			
@@ -41,7 +42,8 @@
 
				 
			
 
				 #define MSR_AMD_EVENTSEL_RESERVED	((0xFFFFFCF0ULL<<32)|(1ULL<<21))
			
 
				 
			
 
				-static unsigned long reset_value[NUM_VIRT_COUNTERS];
			
 
				+static int num_counters;
			
 
				+static unsigned long reset_value[OP_MAX_COUNTER];
			
 
				 
			
 
				 #define IBS_FETCH_SIZE			6
			
 
				 #define IBS_OP_SIZE			12
			
@@ -387,7 +389,7 @@ static void op_mux_switch_ctrl(struct op_x86_model_spec const *model,
 
				 	int i;
			
 
				 
			
 
				 	/* enable active counters */
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		int virt = op_x86_phys_to_virt(i);
			
 
				 		if (!reset_value[virt])
			
 
				 			continue;
			
@@ -406,7 +408,7 @@ static void op_amd_shutdown(struct op_msrs const * const msrs)
 
				 {
			
 
				 	int i;
			
 
				 
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		if (!msrs->counters[i].addr)
			
 
				 			continue;
			
 
				 		release_perfctr_nmi(MSR_K7_PERFCTR0 + i);
			
@@ -418,7 +420,7 @@ static int op_amd_fill_in_addresses(struct op_msrs * const msrs)
 
				 {
			
 
				 	int i;
			
 
				 
			
 
				-	for (i = 0; i < NUM_COUNTERS; i++) {
			
 
				+	for (i = 0; i < num_counters; i++) {
			
 
				 		if (!reserve_perfctr_nmi(MSR_K7_PERFCTR0 + i))
			
 
				 			goto fail;
			
 
				 		if (!reserve_evntsel_nmi(MSR_K7_EVNTSEL0 + i)) {
			
@@ -426,8 +428,13 @@ static int op_amd_fill_in_addresses(struct op_msrs * const msrs)
 
				 			goto fail;
			
 
				 		}
			
 
				 		/* both registers must be reserved */
			
 
				-		msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
			
 
				-		msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
			
 
				+		if (num_counters == NUM_COUNTERS_F15H) {
			
 
				+			msrs->counters[i].addr = MSR_F15H_PERF_CTR + (i << 1);
			
 
				+			msrs->controls[i].addr = MSR_F15H_PERF_CTL + (i << 1);
			
 
				+		} else {
			
 
				+			msrs->controls[i].addr = MSR_K7_EVNTSEL0 + i;
			
 
				+			msrs->counters[i].addr = MSR_K7_PERFCTR0 + i;
			
 
				+		}
			
 
				 		continue;
			
 
				 	fail:
			
 
				 		if (!counter_config[i].enabled)
			
@@ -447,7 +454,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
 
				 	int i;
			
 
				 
			
 
				 	/* setup reset_value */
			
 
				-	for (i = 0; i < NUM_VIRT_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < OP_MAX_COUNTER; ++i) {
			
 
				 		if (counter_config[i].enabled
			
 
				 		    && msrs->counters[op_x86_virt_to_phys(i)].addr)
			
 
				 			reset_value[i] = counter_config[i].count;
			
@@ -456,7 +463,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
 
				 	}
			
 
				 
			
 
				 	/* clear all counters */
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		if (!msrs->controls[i].addr)
			
 
				 			continue;
			
 
				 		rdmsrl(msrs->controls[i].addr, val);
			
@@ -472,7 +479,7 @@ static void op_amd_setup_ctrs(struct op_x86_model_spec const *model,
 
				 	}
			
 
				 
			
 
				 	/* enable active counters */
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		int virt = op_x86_phys_to_virt(i);
			
 
				 		if (!reset_value[virt])
			
 
				 			continue;
			
@@ -503,7 +510,7 @@ static int op_amd_check_ctrs(struct pt_regs * const regs,
 
				 	u64 val;
			
 
				 	int i;
			
 
				 
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		int virt = op_x86_phys_to_virt(i);
			
 
				 		if (!reset_value[virt])
			
 
				 			continue;
			
@@ -526,7 +533,7 @@ static void op_amd_start(struct op_msrs const * const msrs)
 
				 	u64 val;
			
 
				 	int i;
			
 
				 
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		if (!reset_value[op_x86_phys_to_virt(i)])
			
 
				 			continue;
			
 
				 		rdmsrl(msrs->controls[i].addr, val);
			
@@ -546,7 +553,7 @@ static void op_amd_stop(struct op_msrs const * const msrs)
 
				 	 * Subtle: stop on all counters to avoid race with setting our
			
 
				 	 * pm callback
			
 
				 	 */
			
 
				-	for (i = 0; i < NUM_COUNTERS; ++i) {
			
 
				+	for (i = 0; i < num_counters; ++i) {
			
 
				 		if (!reset_value[op_x86_phys_to_virt(i)])
			
 
				 			continue;
			
 
				 		rdmsrl(msrs->controls[i].addr, val);
			
@@ -706,18 +713,29 @@ static int setup_ibs_files(struct super_block *sb, struct dentry *root)
 
				 	return 0;
			
 
				 }
			
 
				 
			
 
				+struct op_x86_model_spec op_amd_spec;
			
 
				+
			
 
				 static int op_amd_init(struct oprofile_operations *ops)
			
 
				 {
			
 
				 	init_ibs();
			
 
				 	create_arch_files = ops->create_files;
			
 
				 	ops->create_files = setup_ibs_files;
			
 
				+
			
 
				+	if (boot_cpu_data.x86 == 0x15) {
			
 
				+		num_counters = NUM_COUNTERS_F15H;
			
 
				+	} else {
			
 
				+		num_counters = NUM_COUNTERS;
			
 
				+	}
			
 
				+
			
 
				+	op_amd_spec.num_counters = num_counters;
			
 
				+	op_amd_spec.num_controls = num_counters;
			
 
				+	op_amd_spec.num_virt_counters = max(num_counters, NUM_VIRT_COUNTERS);
			
 
				+
			
 
				 	return 0;
			
 
				 }
			
 
				 
			
 
				 struct op_x86_model_spec op_amd_spec = {
			
 
				-	.num_counters		= NUM_COUNTERS,
			
 
				-	.num_controls		= NUM_COUNTERS,
			
 
				-	.num_virt_counters	= NUM_VIRT_COUNTERS,
			
 
				+	/* num_counters/num_controls filled in at runtime */
			
 
				 	.reserved		= MSR_AMD_EVENTSEL_RESERVED,
			
 
				 	.event_mask		= OP_EVENT_MASK,
			
 
				 	.init			= op_amd_init,
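Note on the op_model_amd.c hunks above: family 15h gets six counters, and its control and counter MSRs are interleaved, hence the (i << 1) stride, whereas earlier families use two separate contiguous banks. The sketch below illustrates the address computation. The MSR base values are quoted from memory of the kernel's msr-index.h and should be treated as assumptions, amd_num_counters and amd_counter_msrs are hypothetical helpers, and the sketch keys off the CPU family directly where the patch compares num_counters against NUM_COUNTERS_F15H.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values; verify against arch/x86/include/asm/msr-index.h. */
#define MSR_K7_EVNTSEL0		0xc0010000u
#define MSR_K7_PERFCTR0		0xc0010004u
#define MSR_F15H_PERF_CTL	0xc0010200u
#define MSR_F15H_PERF_CTR	0xc0010201u

#define NUM_COUNTERS		4
#define NUM_COUNTERS_F15H	6

struct msr_pair {
	uint32_t ctl;	/* event select / control register */
	uint32_t ctr;	/* counter register */
};

/* Number of hardware counters for a given AMD family. */
static int amd_num_counters(unsigned int family)
{
	return family == 0x15 ? NUM_COUNTERS_F15H : NUM_COUNTERS;
}

/* MSR addresses for counter i on the given family. */
static struct msr_pair amd_counter_msrs(unsigned int family, int i)
{
	struct msr_pair p;

	assert(i >= 0 && i < amd_num_counters(family));

	if (family == 0x15) {
		/* Interleaved layout: CTL0, CTR0, CTL1, CTR1, ... */
		p.ctl = MSR_F15H_PERF_CTL + ((uint32_t)i << 1);
		p.ctr = MSR_F15H_PERF_CTR + ((uint32_t)i << 1);
	} else {
		/* Legacy layout: four EVNTSELs followed by four PERFCTRs. */
		p.ctl = MSR_K7_EVNTSEL0 + (uint32_t)i;
		p.ctr = MSR_K7_PERFCTR0 + (uint32_t)i;
	}
	return p;
}
```

Under these assumed bases, counter 2 on family 15h maps to control 0xc0010204 and counter 0xc0010205, while counter 3 on an older family maps to 0xc0010003 and 0xc0010007.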
			
--- a/arch/x86/oprofile/op_model_p4.c
+++ b/arch/x86/oprofile/op_model_p4.c
@@ -11,7 +11,7 @@
 
				 #include <linux/oprofile.h>
			
 
				 #include <linux/smp.h>
			
 
				 #include <linux/ptrace.h>
			
 
				-#include <linux/nmi.h>
			
 
				+#include <asm/nmi.h>
			
 
				 #include <asm/msr.h>
			
 
				 #include <asm/fixmap.h>
			
 
				 #include <asm/apic.h>
			
--- a/drivers/acpi/acpica/nsinit.c
+++ b/drivers/acpi/acpica/nsinit.c
@@ -577,9 +577,7 @@ acpi_ns_init_one_device(acpi_handle obj_handle,
 
				 	 * as possible (without an NMI being received in the middle of
			
 
				 	 * this) - so disable NMIs and initialize the device:
			
 
				 	 */
			
 
				-	acpi_nmi_disable();
			
 
				 	status = acpi_ns_evaluate(info);
			
 
				-	acpi_nmi_enable();
			
 
				 
			
 
				 	if (ACPI_SUCCESS(status)) {
			
 
				 		walk_info->num_INI++;
			
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -355,6 +355,7 @@ void cpufreq_notify_transition(struct cpufreq_freqs *freqs, unsigned int state)
 
				 		dprintk("FREQ: %lu - CPU: %lu", (unsigned long)freqs->new,
			
 
				 			(unsigned long)freqs->cpu);
			
 
				 		trace_power_frequency(POWER_PSTATE, freqs->new, freqs->cpu);
			
 
				+		trace_cpu_frequency(freqs->new, freqs->cpu);
			
 
				 		srcu_notifier_call_chain(&cpufreq_transition_notifier_list,
			
 
				 				CPUFREQ_POSTCHANGE, freqs);
			
 
				 		if (likely(policy) && likely(policy->cpu == freqs->cpu))
			
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -107,6 +107,7 @@ static void cpuidle_idle_call(void)
 
				 	if (cpuidle_curr_governor->reflect)
			
 
				 		cpuidle_curr_governor->reflect(dev);
			
 
				 	trace_power_end(smp_processor_id());
			
 
				+	trace_cpu_idle(PWR_EVENT_EXIT, smp_processor_id());
			
 
				 }
			
 
				 
			
 
				 /**
			
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -220,9 +220,8 @@ static int intel_idle(struct cpuidle_device *dev, struct cpuidle_state *state)
 
				 	kt_before = ktime_get_real();
			
 
				 
			
 
				 	stop_critical_timings();
			
 
				-#ifndef MODULE
			
 
				 	trace_power_start(POWER_CSTATE, (eax >> 4) + 1, cpu);
			
 
				-#endif
			
 
				+	trace_cpu_idle((eax >> 4) + 1, cpu);
			
 
				 	if (!need_resched()) {
			
 
				 
			
 
				 		__monitor((void *)&current_thread_info()->flags, 0, 0);
			
--- a/drivers/watchdog/hpwdt.c
+++ b/drivers/watchdog/hpwdt.c
@@ -642,19 +642,14 @@ static struct notifier_block die_notifier = {
 
				  */
			
 
				 
			
 
				 #ifdef CONFIG_HPWDT_NMI_DECODING
			
 
				-#ifdef ARCH_HAS_NMI_WATCHDOG
			
 
				+#ifdef CONFIG_X86_LOCAL_APIC
			
 
				 static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
			
 
				 {
			
 
				 	/*
			
 
				 	 * If nmi_watchdog is turned off then we can turn on
			
 
				 	 * our nmi decoding capability.
			
 
				 	 */
			
 
				-	if (!nmi_watchdog_active())
			
 
				-		hpwdt_nmi_decoding = 1;
			
 
				-	else
			
 
				-		dev_warn(&dev->dev, "NMI decoding is disabled. To enable this "
			
 
				-			"functionality you must reboot with nmi_watchdog=0 "
			
 
				-			"and load the hpwdt driver with priority=1.\n");
			
 
				+	hpwdt_nmi_decoding = 1;
			
 
				 }
			
 
				 #else
			
 
				 static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
			
@@ -662,7 +657,7 @@ static void __devinit hpwdt_check_nmi_decoding(struct pci_dev *dev)
 
				 	dev_warn(&dev->dev, "NMI decoding is disabled. "
			
 
				 		"Your kernel does not support a NMI Watchdog.\n");
			
 
				 }
			
 
				-#endif /* ARCH_HAS_NMI_WATCHDOG */
			
 
				+#endif /* CONFIG_X86_LOCAL_APIC */
			
 
				 
			
 
				 static int __devinit hpwdt_init_nmi_decoding(struct pci_dev *dev)
			
 
				 {
			
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -154,12 +154,14 @@ enum {
 
				 	TRACE_EVENT_FL_ENABLED_BIT,
			
 
				 	TRACE_EVENT_FL_FILTERED_BIT,
			
 
				 	TRACE_EVENT_FL_RECORDED_CMD_BIT,
			
 
				+	TRACE_EVENT_FL_CAP_ANY_BIT,
			
 
				 };
			
 
				 
			
 
				 enum {
			
 
				 	TRACE_EVENT_FL_ENABLED		= (1 << TRACE_EVENT_FL_ENABLED_BIT),
			
 
				 	TRACE_EVENT_FL_FILTERED		= (1 << TRACE_EVENT_FL_FILTERED_BIT),
			
 
				 	TRACE_EVENT_FL_RECORDED_CMD	= (1 << TRACE_EVENT_FL_RECORDED_CMD_BIT),
			
 
				+	TRACE_EVENT_FL_CAP_ANY		= (1 << TRACE_EVENT_FL_CAP_ANY_BIT),
			
 
				 };
			
 
				 
			
 
				 struct ftrace_event_call {
			
@@ -196,6 +198,14 @@ struct ftrace_event_call {
 
				 #endif
			
 
				 };
			
 
				 
			
 
				+#define __TRACE_EVENT_FLAGS(name, value)				\
			
 
				+	static int __init trace_init_flags_##name(void)			\
			
 
				+	{								\
			
 
				+		event_##name.flags = value;				\
			
 
				+		return 0;						\
			
 
				+	}								\
			
 
				+	early_initcall(trace_init_flags_##name);
			
 
				+
			
 
				 #define PERF_MAX_TRACE_SIZE	2048
			
 
				 
			
 
				 #define MAX_FILTER_PRED		32
			
@@ -215,6 +225,10 @@ enum {
 
				 	FILTER_PTR_STRING,
			
 
				 };
			
 
				 
			
 
				+#define EVENT_STORAGE_SIZE 128
			
 
				+extern struct mutex event_storage_mutex;
			
 
				+extern char event_storage[EVENT_STORAGE_SIZE];
			
 
				+
			
 
				 extern int trace_event_raw_init(struct ftrace_event_call *call);
			
 
				 extern int trace_define_field(struct ftrace_event_call *call, const char *type,
			
 
				 			      const char *name, int offset, int size,
			
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -275,7 +275,9 @@ extern int arch_prepared_optinsn(struct arch_optimized_insn *optinsn);
 
				 extern int arch_check_optimized_kprobe(struct optimized_kprobe *op);
			
 
				 extern int arch_prepare_optimized_kprobe(struct optimized_kprobe *op);
			
 
				 extern void arch_remove_optimized_kprobe(struct optimized_kprobe *op);
			
 
				-extern int  arch_optimize_kprobe(struct optimized_kprobe *op);
			
 
				+extern void arch_optimize_kprobes(struct list_head *oplist);
			
 
				+extern void arch_unoptimize_kprobes(struct list_head *oplist,
			
 
				+				    struct list_head *done_list);
			
 
				 extern void arch_unoptimize_kprobe(struct optimized_kprobe *op);
			
 
				 extern kprobe_opcode_t *get_optinsn_slot(void);
			
 
				 extern void free_optinsn_slot(kprobe_opcode_t *slot, int dirty);
			
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -14,22 +14,14 @@
 
				  * may be used to reset the timeout - for code which intentionally
			
 
				  * disables interrupts for a long time. This call is stateless.
			
 
				  */
			
 
				-#ifdef ARCH_HAS_NMI_WATCHDOG
			
 
				+#if defined(ARCH_HAS_NMI_WATCHDOG) || defined(CONFIG_HARDLOCKUP_DETECTOR)
			
 
				 #include <asm/nmi.h>
			
 
				 extern void touch_nmi_watchdog(void);
			
 
				-extern void acpi_nmi_disable(void);
			
 
				-extern void acpi_nmi_enable(void);
			
 
				 #else
			
 
				-#ifndef CONFIG_HARDLOCKUP_DETECTOR
			
 
				 static inline void touch_nmi_watchdog(void)
			
 
				 {
			
 
				 	touch_softlockup_watchdog();
			
 
				 }
			
 
				-#else
			
 
				-extern void touch_nmi_watchdog(void);
			
 
				-#endif
			
 
				-static inline void acpi_nmi_disable(void) { }
			
 
				-static inline void acpi_nmi_enable(void) { }
			
 
				 #endif
			
 
				 
			
 
				 /*
			
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -215,8 +215,9 @@ struct perf_event_attr {
 
				 				 */
			
 
				 				precise_ip     :  2, /* skid constraint       */
			
 
				 				mmap_data      :  1, /* non-exec mmap data    */
			
 
				+				sample_id_all  :  1, /* sample_type all events */
			
 
				 
			
 
				-				__reserved_1   : 46;
			
 
				+				__reserved_1   : 45;
			
 
				 
			
 
				 	union {
			
 
				 		__u32		wakeup_events;	  /* wakeup every n events */
			
@@ -327,6 +328,15 @@ struct perf_event_header {
 
				 enum perf_event_type {
			
 
				 
			
 
				 	/*
			
 
				+	 * If perf_event_attr.sample_id_all is set then all event types will
			
 
				+	 * have the sample_type selected fields related to where/when
			
 
				+	 * (identity) an event took place (TID, TIME, ID, CPU, STREAM_ID)
			
 
				+	 * described in PERF_RECORD_SAMPLE below, it will be stashed just after
			
 
				+	 * the perf_event_header and the fields already present for the existing
			
 
				+	 * fields, i.e. at the end of the payload. That way a newer perf.data
			
 
				+	 * file will be supported by older perf tools, with these new optional
			
 
				+	 * fields being ignored.
			
 
				+	 *
			
 
				 	 * The MMAP events record the PROT_EXEC mappings so that we can
			
 
				 	 * correlate userspace IPs to code. They have the following structure:
			
 
				 	 *
			
@@ -578,6 +588,10 @@ struct perf_event;
 
				 struct pmu {
			
 
				 	struct list_head		entry;
			
 
				 
			
 
				+	struct device			*dev;
			
 
				+	char				*name;
			
 
				+	int				type;
			
 
				+
			
 
				 	int * __percpu			pmu_disable_count;
			
 
				 	struct perf_cpu_context * __percpu pmu_cpu_context;
			
 
				 	int				task_ctx_nr;
			
@@ -758,6 +772,9 @@ struct perf_event {
 
				 	u64				shadow_ctx_time;
			
 
				 
			
 
				 	struct perf_event_attr		attr;
			
 
				+	u16				header_size;
			
 
				+	u16				id_header_size;
			
 
				+	u16				read_size;
			
 
				 	struct hw_perf_event		hw;
			
 
				 
			
 
				 	struct perf_event_context	*ctx;
			
@@ -903,7 +920,7 @@ struct perf_output_handle {
 
				 
			
 
				 #ifdef CONFIG_PERF_EVENTS
			
 
				 
			
 
				-extern int perf_pmu_register(struct pmu *pmu);
			
 
				+extern int perf_pmu_register(struct pmu *pmu, char *name, int type);
			
 
				 extern void perf_pmu_unregister(struct pmu *pmu);
			
 
				 
			
 
				 extern int perf_num_counters(void);
			
@@ -970,6 +987,11 @@ extern int perf_event_overflow(struct perf_event *event, int nmi,
 
				 				 struct perf_sample_data *data,
			
 
				 				 struct pt_regs *regs);
			
 
				 
			
 
				+static inline bool is_sampling_event(struct perf_event *event)
			
 
				+{
			
 
				+	return event->attr.sample_period != 0;
			
 
				+}
			
 
				+
			
 
				 /*
			
 
				  * Return 1 for a software event, 0 for a hardware event
			
 
				  */
			
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -316,6 +316,7 @@ extern int proc_dowatchdog_thresh(struct ctl_table *table, int write,
 
				 				  size_t *lenp, loff_t *ppos);
			
 
				 extern unsigned int  softlockup_panic;
			
 
				 extern int softlockup_thresh;
			
 
				+void lockup_detector_init(void);
			
 
				 #else
			
 
				 static inline void touch_softlockup_watchdog(void)
			
 
				 {
			
@@ -326,6 +327,9 @@ static inline void touch_softlockup_watchdog_sync(void)
 
				 static inline void touch_all_softlockup_watchdogs(void)
			
 
				 {
			
 
				 }
			
 
				+static inline void lockup_detector_init(void)
			
 
				+{
			
 
				+}
			
 
				 #endif
			
 
				 
			
 
				 #ifdef CONFIG_DETECT_HUNG_TASK
			
--- a/include/linux/stacktrace.h
+++ b/include/linux/stacktrace.h
@@ -2,6 +2,7 @@
 
				 #define __LINUX_STACKTRACE_H
			
 
				 
			
 
				 struct task_struct;
			
 
				+struct pt_regs;
			
 
				 
			
 
				 #ifdef CONFIG_STACKTRACE
			
 
				 struct task_struct;
			
@@ -13,7 +14,8 @@ struct stack_trace {
 
				 };
			
 
				 
			
 
				 extern void save_stack_trace(struct stack_trace *trace);
			
 
				-extern void save_stack_trace_bp(struct stack_trace *trace, unsigned long bp);
			
 
				+extern void save_stack_trace_regs(struct stack_trace *trace,
			
 
				+				  struct pt_regs *regs);
			
 
				 extern void save_stack_trace_tsk(struct task_struct *tsk,
			
 
				 				struct stack_trace *trace);
			
 
				 
			
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -127,8 +127,6 @@ extern struct trace_event_functions exit_syscall_print_funcs;
 
				 #define SYSCALL_TRACE_ENTER_EVENT(sname)				\
			
 
				 	static struct syscall_metadata					\
			
 
				 	__attribute__((__aligned__(4))) __syscall_meta_##sname;		\
			
 
				-	static struct ftrace_event_call					\
			
 
				-	__attribute__((__aligned__(4))) event_enter_##sname;		\
			
 
				 	static struct ftrace_event_call __used				\
			
 
				 	  __attribute__((__aligned__(4)))				\
			
 
				 	  __attribute__((section("_ftrace_events")))			\
			
@@ -137,13 +135,12 @@ extern struct trace_event_functions exit_syscall_print_funcs;
 
				 		.class			= &event_class_syscall_enter,	\
			
 
				 		.event.funcs            = &enter_syscall_print_funcs,	\
			
 
				 		.data			= (void *)&__syscall_meta_##sname,\
			
 
				-	}
			
 
				+	};								\
			
 
				+	__TRACE_EVENT_FLAGS(enter_##sname, TRACE_EVENT_FL_CAP_ANY)
			
 
				 
			
 
				 #define SYSCALL_TRACE_EXIT_EVENT(sname)					\
			
 
				 	static struct syscall_metadata					\
			
 
				 	__attribute__((__aligned__(4))) __syscall_meta_##sname;		\
			
 
				-	static struct ftrace_event_call					\
			
 
				-	__attribute__((__aligned__(4))) event_exit_##sname;		\
			
 
				 	static struct ftrace_event_call __used				\
			
 
				 	  __attribute__((__aligned__(4)))				\
			
 
				 	  __attribute__((section("_ftrace_events")))			\
			
@@ -152,7 +149,8 @@ extern struct trace_event_functions exit_syscall_print_funcs;
 
				 		.class			= &event_class_syscall_exit,	\
			
 
				 		.event.funcs		= &exit_syscall_print_funcs,	\
			
 
				 		.data			= (void *)&__syscall_meta_##sname,\
			
 
				-	}
			
 
				+	};								\
			
 
				+	__TRACE_EVENT_FLAGS(exit_##sname, TRACE_EVENT_FL_CAP_ANY)
			
 
				 
			
 
				 #define SYSCALL_METADATA(sname, nb)				\
			
 
				 	SYSCALL_TRACE_ENTER_EVENT(sname);			\
			
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -106,6 +106,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 
				 
			
 
				 #define TP_PROTO(args...)	args
			
 
				 #define TP_ARGS(args...)	args
			
 
				+#define TP_CONDITION(args...)	args
			
 
				 
			
 
				 #ifdef CONFIG_TRACEPOINTS
			
 
				 
			
@@ -119,12 +120,14 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 
				  * as "(void *, void)". The DECLARE_TRACE_NOARGS() will pass in just
			
 
				  * "void *data", where as the DECLARE_TRACE() will pass in "void *data, proto".
			
 
				  */
			
 
				-#define __DO_TRACE(tp, proto, args)					\
			
 
				+#define __DO_TRACE(tp, proto, args, cond)				\
			
 
				 	do {								\
			
 
				 		struct tracepoint_func *it_func_ptr;			\
			
 
				 		void *it_func;						\
			
 
				 		void *__data;						\
			
 
				 									\
			
 
				+		if (!(cond))						\
			
 
				+			return;						\
			
 
				 		rcu_read_lock_sched_notrace();				\
			
 
				 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
			
 
				 		if (it_func_ptr) {					\
			
@@ -142,7 +145,7 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 
				  * not add unwanted padding between the beginning of the section and the
			
 
				  * structure. Force alignment to the same alignment as the section start.
			
 
				  */
			
 
				-#define __DECLARE_TRACE(name, proto, args, data_proto, data_args)	\
			
 
				+#define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args)	\
			
 
				 	extern struct tracepoint __tracepoint_##name;			\
			
 
				 	static inline void trace_##name(proto)				\
			
 
				 	{								\
			
@@ -151,7 +154,8 @@ static inline void tracepoint_update_probe_range(struct tracepoint *begin,
 
				 do_trace:								\
			
 
				 			__DO_TRACE(&__tracepoint_##name,		\
			
 
				 				TP_PROTO(data_proto),			\
			
 
				-				TP_ARGS(data_args));			\
			
 
				+				TP_ARGS(data_args),			\
			
 
				+				TP_CONDITION(cond));			\
			
 
				 	}								\
			
 
				 	static inline int						\
			
 
				 	register_trace_##name(void (*probe)(data_proto), void *data)	\
			
@@ -186,7 +190,7 @@ do_trace:								\
 
				 	EXPORT_SYMBOL(__tracepoint_##name)
			
 
				 
			
 
				 #else /* !CONFIG_TRACEPOINTS */
			
 
				-#define __DECLARE_TRACE(name, proto, args, data_proto, data_args)	\
			
 
				+#define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args)	\
			
 
				 	static inline void trace_##name(proto)				\
			
 
				 	{ }								\
			
 
				 	static inline int						\
			
@@ -227,13 +231,20 @@ do_trace:								\
 
				  * "void *__data, proto" as the callback prototype.
			
 
				  */
			
 
				 #define DECLARE_TRACE_NOARGS(name)					\
			
 
				-		__DECLARE_TRACE(name, void, , void *__data, __data)
			
 
				+		__DECLARE_TRACE(name, void, , 1, void *__data, __data)
			
 
				 
			
 
				 #define DECLARE_TRACE(name, proto, args)				\
			
 
				-		__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args),	\
			
 
				+		__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), 1,	\
			
 
				 				PARAMS(void *__data, proto),		\
			
 
				 				PARAMS(__data, args))
			
 
				 
			
 
				+#define DECLARE_TRACE_CONDITION(name, proto, args, cond)		\
			
 
				+	__DECLARE_TRACE(name, PARAMS(proto), PARAMS(args), PARAMS(cond), \
			
 
				+			PARAMS(void *__data, proto),			\
			
 
				+			PARAMS(__data, args))
			
 
				+
			
 
				+#define TRACE_EVENT_FLAGS(event, flag)
			
 
				+
			
 
				 #endif /* DECLARE_TRACE */
			
 
				 
			
 
				 #ifndef TRACE_EVENT
			
@@ -347,11 +358,21 @@ do_trace:								\
 
				 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
			
 
				 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
			
 
				 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
			
 
				+#define DEFINE_EVENT_CONDITION(template, name, proto,		\
			
 
				+			       args, cond)			\
			
 
				+	DECLARE_TRACE_CONDITION(name, PARAMS(proto),		\
			
 
				+				PARAMS(args), PARAMS(cond))
			
 
				 
			
 
				 #define TRACE_EVENT(name, proto, args, struct, assign, print)	\
			
 
				 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
			
 
				 #define TRACE_EVENT_FN(name, proto, args, struct,		\
			
 
				 		assign, print, reg, unreg)			\
			
 
				 	DECLARE_TRACE(name, PARAMS(proto), PARAMS(args))
			
 
				+#define TRACE_EVENT_CONDITION(name, proto, args, cond,		\
			
 
				+			      struct, assign, print)		\
			
 
				+	DECLARE_TRACE_CONDITION(name, PARAMS(proto),		\
			
 
				+				PARAMS(args), PARAMS(cond))
			
 
				+
			
 
				+#define TRACE_EVENT_FLAGS(event, flag)
			
 
				 
			
 
				 #endif /* ifdef TRACE_EVENT (see note above) */
			
--- a/include/trace/define_trace.h
+++ b/include/trace/define_trace.h
@@ -26,6 +26,15 @@
 
				 #define TRACE_EVENT(name, proto, args, tstruct, assign, print)	\
			
 
				 	DEFINE_TRACE(name)
			
 
				 
			
 
				+#undef TRACE_EVENT_CONDITION
			
 
				+#define TRACE_EVENT_CONDITION(name, proto, args, cond, tstruct, assign, print) \
			
 
				+	TRACE_EVENT(name,						\
			
 
				+		PARAMS(proto),						\
			
 
				+		PARAMS(args),						\
			
 
				+		PARAMS(tstruct),					\
			
 
				+		PARAMS(assign),						\
			
 
				+		PARAMS(print))
			
 
				+
			
 
				 #undef TRACE_EVENT_FN
			
 
				 #define TRACE_EVENT_FN(name, proto, args, tstruct,		\
			
 
				 		assign, print, reg, unreg)			\
			
@@ -39,6 +48,10 @@
 
				 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
			
 
				 	DEFINE_TRACE(name)
			
 
				 
			
 
				+#undef DEFINE_EVENT_CONDITION
			
 
				+#define DEFINE_EVENT_CONDITION(template, name, proto, args, cond) \
			
 
				+	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
			
 
				+
			
 
				 #undef DECLARE_TRACE
			
 
				 #define DECLARE_TRACE(name, proto, args)	\
			
 
				 	DEFINE_TRACE(name)
			
@@ -75,9 +88,11 @@
 
				 
			
 
				 #undef TRACE_EVENT
			
 
				 #undef TRACE_EVENT_FN
			
 
				+#undef TRACE_EVENT_CONDITION
			
 
				 #undef DECLARE_EVENT_CLASS
			
 
				 #undef DEFINE_EVENT
			
 
				 #undef DEFINE_EVENT_PRINT
			
 
				+#undef DEFINE_EVENT_CONDITION
			
 
				 #undef TRACE_HEADER_MULTI_READ
			
 
				 #undef DECLARE_TRACE
			
 
				 
			
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -7,16 +7,67 @@
 
				 #include <linux/ktime.h>
			
 
				 #include <linux/tracepoint.h>
			
 
				 
			
 
				-#ifndef _TRACE_POWER_ENUM_
			
 
				-#define _TRACE_POWER_ENUM_
			
 
				-enum {
			
 
				-	POWER_NONE	= 0,
			
 
				-	POWER_CSTATE	= 1,	/* C-State */
			
 
				-	POWER_PSTATE	= 2,	/* Fequency change or DVFS */
			
 
				-	POWER_SSTATE	= 3,	/* Suspend */
			
 
				-};
			
 
				+DECLARE_EVENT_CLASS(cpu,
			
 
				+
			
 
				+	TP_PROTO(unsigned int state, unsigned int cpu_id),
			
 
				+
			
 
				+	TP_ARGS(state, cpu_id),
			
 
				+
			
 
				+	TP_STRUCT__entry(
			
 
				+		__field(	u32,		state		)
			
 
				+		__field(	u32,		cpu_id		)
			
 
				+	),
			
 
				+
			
 
				+	TP_fast_assign(
			
 
				+		__entry->state = state;
			
 
				+		__entry->cpu_id = cpu_id;
			
 
				+	),
			
 
				+
			
 
				+	TP_printk("state=%lu cpu_id=%lu", (unsigned long)__entry->state,
			
 
				+		  (unsigned long)__entry->cpu_id)
			
 
				+);
			
 
				+
			
 
				+DEFINE_EVENT(cpu, cpu_idle,
			
 
				+
			
 
				+	TP_PROTO(unsigned int state, unsigned int cpu_id),
			
 
				+
			
 
				+	TP_ARGS(state, cpu_id)
			
 
				+);
			
 
				+
			
 
				+/* This file can get included multiple times, TRACE_HEADER_MULTI_READ at top */
			
 
				+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING
			
 
				+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING
			
 
				+
			
 
				+#define PWR_EVENT_EXIT -1
			
 
				 #endif
			
 
				 
			
 
				+DEFINE_EVENT(cpu, cpu_frequency,
			
 
				+
			
 
				+	TP_PROTO(unsigned int frequency, unsigned int cpu_id),
			
 
				+
			
 
				+	TP_ARGS(frequency, cpu_id)
			
 
				+);
			
 
				+
			
 
				+TRACE_EVENT(machine_suspend,
			
 
				+
			
 
				+	TP_PROTO(unsigned int state),
			
 
				+
			
 
				+	TP_ARGS(state),
			
 
				+
			
 
				+	TP_STRUCT__entry(
			
 
				+		__field(	u32,		state		)
			
 
				+	),
			
 
				+
			
 
				+	TP_fast_assign(
			
 
				+		__entry->state = state;
			
 
				+	),
			
 
				+
			
 
				+	TP_printk("state=%lu", (unsigned long)__entry->state)
			
 
				+);
			
 
				+
			
 
				+/* This code will be removed after deprecation time exceeded (2.6.41) */
			
 
				+#ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
			
 
				+
			
 
				 /*
			
 
				  * The power events are used for cpuidle & suspend (power_start, power_end)
			
 
				  *  and for cpufreq (power_frequency)
			
@@ -75,6 +126,36 @@ TRACE_EVENT(power_end,
 
				 
			
 
				 );
			
 
				 
			
 
				+/* Deprecated dummy functions must be protected against multi-declartion */
			
 
				+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
			
 
				+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
			
 
				+
			
 
				+enum {
			
 
				+	POWER_NONE = 0,
			
 
				+	POWER_CSTATE = 1,
			
 
				+	POWER_PSTATE = 2,
			
 
				+};
			
 
				+#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
			
 
				+
			
 
				+#else /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
			
 
				+
			
 
				+#ifndef _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
			
 
				+#define _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED
			
 
				+enum {
			
 
				+       POWER_NONE = 0,
			
 
				+       POWER_CSTATE = 1,
			
 
				+       POWER_PSTATE = 2,
			
 
				+};
			
 
				+
			
 
				+/* These dummy declaration have to be ripped out when the deprecated
			
 
				+   events get removed */
			
 
				+static inline void trace_power_start(u64 type, u64 state, u64 cpuid) {};
			
 
				+static inline void trace_power_end(u64 cpuid) {};
			
 
				+static inline void trace_power_frequency(u64 type, u64 state, u64 cpuid) {};
			
 
				+#endif /* _PWR_EVENT_AVOID_DOUBLE_DEFINING_DEPRECATED */
			
 
				+
			
 
				+#endif /* CONFIG_EVENT_POWER_TRACING_DEPRECATED */
			
 
				+
			
 
				 /*
			
 
				  * The clock events are used for clock enable/disable and for
			
 
				  *  clock rate change
			
@@ -153,7 +234,6 @@ DEFINE_EVENT(power_domain, power_domain_target,
 
				 
			
 
				 	TP_ARGS(name, state, cpu_id)
			
 
				 );
			
 
				-
			
 
				 #endif /* _TRACE_POWER_H */
			
 
				 
			
 
				 /* This part must be outside protection */
			
--- a/include/trace/events/syscalls.h
+++ b/include/trace/events/syscalls.h
@@ -40,6 +40,8 @@ TRACE_EVENT_FN(sys_enter,
 
				 	syscall_regfunc, syscall_unregfunc
			
 
				 );
			
 
				 
			
 
				+TRACE_EVENT_FLAGS(sys_enter, TRACE_EVENT_FL_CAP_ANY)
			
 
				+
			
 
				 TRACE_EVENT_FN(sys_exit,
			
 
				 
			
 
				 	TP_PROTO(struct pt_regs *regs, long ret),
			
@@ -62,6 +64,8 @@ TRACE_EVENT_FN(sys_exit,
 
				 	syscall_regfunc, syscall_unregfunc
			
 
				 );
			
 
				 
			
 
				+TRACE_EVENT_FLAGS(sys_exit, TRACE_EVENT_FL_CAP_ANY)
			
 
				+
			
 
				 #endif /* CONFIG_HAVE_SYSCALL_TRACEPOINTS */
			
 
				 
			
 
				 #endif /* _TRACE_EVENTS_SYSCALLS_H */
			
--- a/include/trace/ftrace.h
+++ b/include/trace/ftrace.h
@@ -82,6 +82,10 @@
 
				 	TRACE_EVENT(name, PARAMS(proto), PARAMS(args),			\
			
 
				 		PARAMS(tstruct), PARAMS(assign), PARAMS(print))		\
			
 
				 
			
 
				+#undef TRACE_EVENT_FLAGS
			
 
				+#define TRACE_EVENT_FLAGS(name, value)					\
			
 
				+	__TRACE_EVENT_FLAGS(name, value)
			
 
				+
			
 
				 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
			
 
				 
			
 
				 
			
@@ -129,6 +133,9 @@
 
				 #define DEFINE_EVENT_PRINT(template, name, proto, args, print)	\
			
 
				 	DEFINE_EVENT(template, name, PARAMS(proto), PARAMS(args))
			
 
				 
			
 
				+#undef TRACE_EVENT_FLAGS
			
 
				+#define TRACE_EVENT_FLAGS(event, flag)
			
 
				+
			
 
				 #include TRACE_INCLUDE(TRACE_INCLUDE_FILE)
			
 
				 
			
 
				 /*
			
@@ -289,13 +296,19 @@ static struct trace_event_functions ftrace_event_type_funcs_##call = {	\
 
				 
			
 
				 #undef __array
			
 
				 #define __array(type, item, len)					\
			
 
				-	BUILD_BUG_ON(len > MAX_FILTER_STR_VAL);				\
			
 
				-	ret = trace_define_field(event_call, #type "[" #len "]", #item,	\
			
 
				+	do {								\
			
 
				+		mutex_lock(&event_storage_mutex);			\
			
 
				+		BUILD_BUG_ON(len > MAX_FILTER_STR_VAL);			\
			
 
				+		snprintf(event_storage, sizeof(event_storage),		\
			
 
				+			 "%s[%d]", #type, len);				\
			
 
				+		ret = trace_define_field(event_call, event_storage, #item, \
			
 
				 				 offsetof(typeof(field), item),		\
			
 
				 				 sizeof(field.item),			\
			
 
				 				 is_signed_type(type), FILTER_OTHER);	\
			
 
				-	if (ret)							\
			
 
				-		return ret;
			
 
				+		mutex_unlock(&event_storage_mutex);			\
			
 
				+		if (ret)						\
			
 
				+			return ret;					\
			
 
				+	} while (0);
			
 
				 
			
 
				 #undef __dynamic_array
			
 
				 #define __dynamic_array(type, item, len)				       \
			
--- a/init/main.c
+++ b/init/main.c
@@ -67,6 +67,7 @@
 
				 #include <linux/sfi.h>
			
 
				 #include <linux/shmem_fs.h>
			
 
				 #include <linux/slab.h>
			
 
				+#include <linux/perf_event.h>
			
 
				 
			
 
				 #include <asm/io.h>
			
 
				 #include <asm/bugs.h>
			
@@ -603,6 +604,8 @@ asmlinkage void __init start_kernel(void)
 
				 				"enabled *very* early, fixing it\n");
			
 
				 		local_irq_disable();
			
 
				 	}
			
 
				+	idr_init_cache();
			
 
				+	perf_event_init();
			
 
				 	rcu_init();
			
 
				 	radix_tree_init();
			
 
				 	/* init some links before init_ISA_irqs() */
			
@@ -658,7 +661,6 @@ asmlinkage void __init start_kernel(void)
 
				 	enable_debug_pagealloc();
			
 
				 	kmemleak_init();
			
 
				 	debug_objects_mem_init();
			
 
				-	idr_init_cache();
			
 
				 	setup_per_cpu_pageset();
			
 
				 	numa_policy_init();
			
 
				 	if (late_time_init)
			
@@ -882,6 +884,7 @@ static int __init kernel_init(void * unused)
 
				 	smp_prepare_cpus(setup_max_cpus);
			
 
				 
			
 
				 	do_pre_smp_initcalls();
			
 
				+	lockup_detector_init();
			
 
				 
			
 
				 	smp_init();
			
 
				 	sched_init_smp();
			
--- a/kernel/hw_breakpoint.c
+++ b/kernel/hw_breakpoint.c
@@ -641,7 +641,7 @@ int __init init_hw_breakpoint(void)
 
				 
			
 
				 	constraints_initialized = 1;
			
 
				 
			
 
				-	perf_pmu_register(&perf_breakpoint);
			
 
				+	perf_pmu_register(&perf_breakpoint, "breakpoint", PERF_TYPE_BREAKPOINT);
			
 
				 
			
 
				 	return register_die_notifier(&hw_breakpoint_exceptions_nb);
			
 
				 
			
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -354,13 +354,20 @@ static inline int kprobe_aggrprobe(struct kprobe *p)
 
				 	return p->pre_handler == aggr_pre_handler;
			
 
				 }
			
 
				 
			
 
				+/* Return true(!0) if the kprobe is unused */
			
 
				+static inline int kprobe_unused(struct kprobe *p)
			
 
				+{
			
 
				+	return kprobe_aggrprobe(p) && kprobe_disabled(p) &&
			
 
				+	       list_empty(&p->list);
			
 
				+}
			
 
				+
			
 
				 /*
			
 
				  * Keep all fields in the kprobe consistent
			
 
				  */
			
 
				-static inline void copy_kprobe(struct kprobe *old_p, struct kprobe *p)
			
 
				+static inline void copy_kprobe(struct kprobe *ap, struct kprobe *p)
			
 
				 {
			
 
				-	memcpy(&p->opcode, &old_p->opcode, sizeof(kprobe_opcode_t));
			
 
				-	memcpy(&p->ainsn, &old_p->ainsn, sizeof(struct arch_specific_insn));
			
 
				+	memcpy(&p->opcode, &ap->opcode, sizeof(kprobe_opcode_t));
			
 
				+	memcpy(&p->ainsn, &ap->ainsn, sizeof(struct arch_specific_insn));
			
 
				 }
			
 
				 
			
 
				 #ifdef CONFIG_OPTPROBES
			
@@ -384,6 +391,17 @@ void __kprobes opt_pre_handler(struct kprobe *p, struct pt_regs *regs)
 
				 	}
			
 
				 }
			
 
				 
			
 
				+/* Free optimized instructions and optimized_kprobe */
			
 
				+static __kprobes void free_aggr_kprobe(struct kprobe *p)
			
 
				+{
			
 
				+	struct optimized_kprobe *op;
			
 
				+
			
 
				+	op = container_of(p, struct optimized_kprobe, kp);
			
 
				+	arch_remove_optimized_kprobe(op);
			
 
				+	arch_remove_kprobe(p);
			
 
				+	kfree(op);
			
 
				+}
			
 
				+
			
 
				 /* Return true(!0) if the kprobe is ready for optimization. */
			
 
				 static inline int kprobe_optready(struct kprobe *p)
			
 
				 {
			
@@ -397,6 +415,33 @@ static inline int kprobe_optready(struct kprobe *p)
 
				 	return 0;
			
 
				 }
			
 
				 
			
 
				+/* Return true(!0) if the kprobe is disarmed. Note: p must be on hash list */
			
 
				+static inline int kprobe_disarmed(struct kprobe *p)
			
 
				+{
			
 
				+	struct optimized_kprobe *op;
			
 
				+
			
 
				+	/* If kprobe is not aggr/opt probe, just return kprobe is disabled */
			
 
				+	if (!kprobe_aggrprobe(p))
			
 
				+		return kprobe_disabled(p);
			
 
				+
			
 
				+	op = container_of(p, struct optimized_kprobe, kp);
			
 
				+
			
 
				+	return kprobe_disabled(p) && list_empty(&op->list);
			
 
				+}
			
 
				+
			
 
				+/* Return true(!0) if the probe is queued on (un)optimizing lists */
			
 
				+static int __kprobes kprobe_queued(struct kprobe *p)
			
 
				+{
			
 
				+	struct optimized_kprobe *op;
			
 
				+
			
 
				+	if (kprobe_aggrprobe(p)) {
			
 
				+		op = container_of(p, struct optimized_kprobe, kp);
			
 
				+		if (!list_empty(&op->list))
			
 
				+			return 1;
			
 
				+	}
			
 
				+	return 0;
			
 
				+}
			
 
				+
			
 
				 /*
			
 
				  * Return an optimized kprobe whose optimizing code replaces
			
 
				  * instructions including addr (exclude breakpoint).
			
@@ -422,30 +467,23 @@ static struct kprobe *__kprobes get_optimized_kprobe(unsigned long addr)
 
				 
			
 
				 /* Optimization staging list, protected by kprobe_mutex */
			
 
				 static LIST_HEAD(optimizing_list);
			
 
				+static LIST_HEAD(unoptimizing_list);
			
 
				 
			
 
				 static void kprobe_optimizer(struct work_struct *work);
			
 
				 static DECLARE_DELAYED_WORK(optimizing_work, kprobe_optimizer);
			
 
				+static DECLARE_COMPLETION(optimizer_comp);
			
 
				 #define OPTIMIZE_DELAY 5
			
 
				 
			
 
				-/* Kprobe jump optimizer */
			
 
				-static __kprobes void kprobe_optimizer(struct work_struct *work)
			
 
				+/*
			
 
				+ * Optimize (replace a breakpoint with a jump) kprobes listed on
			
 
				+ * optimizing_list.
			
 
				+ */
			
 
				+static __kprobes void do_optimize_kprobes(void)
			
 
				 {
			
 
				-	struct optimized_kprobe *op, *tmp;
			
 
				-
			
 
				-	/* Lock modules while optimizing kprobes */
			
 
				-	mutex_lock(&module_mutex);
			
 
				-	mutex_lock(&kprobe_mutex);
			
 
				-	if (kprobes_all_disarmed || !kprobes_allow_optimization)
			
 
				-		goto end;
			
 
				-
			
 
				-	/*
			
 
				-	 * Wait for quiesence period to ensure all running interrupts
			
 
				-	 * are done. Because optprobe may modify multiple instructions
			
 
				-	 * there is a chance that Nth instruction is interrupted. In that
			
 
				-	 * case, running interrupt can return to 2nd-Nth byte of jump
			
 
				-	 * instruction. This wait is for avoiding it.
			
 
				-	 */
			
 
				-	synchronize_sched();
			
 
				+	/* Optimization never be done when disarmed */
			
 
				+	if (kprobes_all_disarmed || !kprobes_allow_optimization ||
			
 
				+	    list_empty(&optimizing_list))
			
 
				+		return;
			
 
				 
			
 
				 	/*
			
 
				 	 * The optimization/unoptimization refers online_cpus via
			
@@ -459,17 +497,111 @@ static __kprobes void kprobe_optimizer(struct work_struct *work)
 
				 	 */
			
 
				 	get_online_cpus();
			
 
				 	mutex_lock(&text_mutex);
			
 
				-	list_for_each_entry_safe(op, tmp, &optimizing_list, list) {
			
 
				-		WARN_ON(kprobe_disabled(&op->kp));
			
 
				-		if (arch_optimize_kprobe(op) < 0)
			
 
				-			op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
			
 
				-		list_del_init(&op->list);
			
 
				+	arch_optimize_kprobes(&optimizing_list);
			
 
				+	mutex_unlock(&text_mutex);
			
 
				+	put_online_cpus();
			
 
				+}
			
 
				+
			
 
				+/*
			
 
				+ * Unoptimize (replace a jump with a breakpoint and remove the breakpoint
			
 
				+ * if need) kprobes listed on unoptimizing_list.
			
 
				+ */
			
 
				+static __kprobes void do_unoptimize_kprobes(struct list_head *free_list)
			
 
				+{
			
 
				+	struct optimized_kprobe *op, *tmp;
			
 
				+
			
 
				+	/* Unoptimization must be done anytime */
			
 
				+	if (list_empty(&unoptimizing_list))
			
 
				+		return;
			
 
				+
			
 
				+	/* Ditto to do_optimize_kprobes */
			
 
				+	get_online_cpus();
			
 
				+	mutex_lock(&text_mutex);
			
 
				+	arch_unoptimize_kprobes(&unoptimizing_list, free_list);
			
 
				+	/* Loop free_list for disarming */
			
 
				+	list_for_each_entry_safe(op, tmp, free_list, list) {
			
 
				+		/* Disarm probes if marked disabled */
			
 
				+		if (kprobe_disabled(&op->kp))
			
 
				+			arch_disarm_kprobe(&op->kp);
			
 
				+		if (kprobe_unused(&op->kp)) {
			
 
				+			/*
			
 
				+			 * Remove unused probes from hash list. After waiting
			
 
				+			 * for synchronization, these probes are reclaimed.
			
 
				+			 * (reclaiming is done by do_free_cleaned_kprobes.)
			
 
				+			 */
			
 
				+			hlist_del_rcu(&op->kp.hlist);
			
 
				+		} else
			
 
				+			list_del_init(&op->list);
			
 
				 	}
			
 
				 	mutex_unlock(&text_mutex);
			
 
				 	put_online_cpus();
			
 
				-end:
			
 
				+}
			
 
				+
			
 
				+/* Reclaim all kprobes on the free_list */
			
 
				+static __kprobes void do_free_cleaned_kprobes(struct list_head *free_list)
			
 
				+{
			
 
				+	struct optimized_kprobe *op, *tmp;
			
 
				+
			
 
				+	list_for_each_entry_safe(op, tmp, free_list, list) {
			
 
				+		BUG_ON(!kprobe_unused(&op->kp));
			
 
				+		list_del_init(&op->list);
			
 
				+		free_aggr_kprobe(&op->kp);
			
 
				+	}
			
 
				+}
			
 
				+
			
 
				+/* Start optimizer after OPTIMIZE_DELAY passed */
			
 
				+static __kprobes void kick_kprobe_optimizer(void)
			
 
				+{
			
 
				+	if (!delayed_work_pending(&optimizing_work))
			
 
				+		schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
			
 
				+}
			
 
				+
			
 
				+/* Kprobe jump optimizer */
			
 
				+static __kprobes void kprobe_optimizer(struct work_struct *work)
			
 
				+{
			
 
				+	LIST_HEAD(free_list);
			
 
				+
			
 
				+	/* Lock modules while optimizing kprobes */
			
 
				+	mutex_lock(&module_mutex);
			
 
				+	mutex_lock(&kprobe_mutex);
			
 
				+
			
 
				+	/*
			
 
				+	 * Step 1: Unoptimize kprobes and collect cleaned (unused and disarmed)
			
 
				+	 * kprobes before waiting for quiescence period.
			
 
				+	 */
			
 
				+	do_unoptimize_kprobes(&free_list);
			
 
				+
			
 
				+	/*
			
 
				+	 * Step 2: Wait for quiescence period to ensure all running interrupts
			
 
				+	 * are done. Because optprobe may modify multiple instructions
			
 
				+	 * there is a chance that Nth instruction is interrupted. In that
			
 
				+	 * case, running interrupt can return to 2nd-Nth byte of jump
			
 
				+	 * instruction. This wait is for avoiding it.
			
 
				+	 */
			
 
				+	synchronize_sched();
			
 
				+
			
 
				+	/* Step 3: Optimize kprobes after quiescence period */
			
 
				+	do_optimize_kprobes();
			
 
				+
			
 
				+	/* Step 4: Free cleaned kprobes after quiescence period */
			
 
				+	do_free_cleaned_kprobes(&free_list);
			
 
				+
			
 
				 	mutex_unlock(&kprobe_mutex);
			
 
				 	mutex_unlock(&module_mutex);
			
 
				+
			
 
				+	/* Step 5: Kick optimizer again if needed */
			
 
				+	if (!list_empty(&optimizing_list) || !list_empty(&unoptimizing_list))
			
 
				+		kick_kprobe_optimizer();
			
 
				+	else
			
 
				+		/* Wake up all waiters */
			
 
				+		complete_all(&optimizer_comp);
			
 
				+}
			
 
				+
			
 
				+/* Wait for completing optimization and unoptimization */
			
 
				+static __kprobes void wait_for_kprobe_optimizer(void)
			
 
				+{
			
 
				+	if (delayed_work_pending(&optimizing_work))
			
 
				+		wait_for_completion(&optimizer_comp);
			
 
				 }
			
 
				 
			
 
				 /* Optimize kprobe if p is ready to be optimized */
			
@@ -495,42 +627,99 @@ static __kprobes void optimize_kprobe(struct kprobe *p)
 
				 	/* Check if it is already optimized. */
			
 
				 	if (op->kp.flags & KPROBE_FLAG_OPTIMIZED)
			
 
				 		return;
			
 
				-
			
 
				 	op->kp.flags |= KPROBE_FLAG_OPTIMIZED;
			
 
				-	list_add(&op->list, &optimizing_list);
			
 
				-	if (!delayed_work_pending(&optimizing_work))
			
 
				-		schedule_delayed_work(&optimizing_work, OPTIMIZE_DELAY);
			
 
				+
			
 
				+	if (!list_empty(&op->list))
			
 
				+		/* This is under unoptimizing. Just dequeue the probe */
			
 
				+		list_del_init(&op->list);
			
 
				+	else {
			
 
				+		list_add(&op->list, &optimizing_list);
			
 
				+		kick_kprobe_optimizer();
			
 
				+	}
			
 
				+}
			
 
				+
			
 
				+/* Short cut to direct unoptimizing */
			
 
				+static __kprobes void force_unoptimize_kprobe(struct optimized_kprobe *op)
			
 
				+{
			
 
				+	get_online_cpus();
			
 
				+	arch_unoptimize_kprobe(op);
			
 
				+	put_online_cpus();
			
 
				+	if (kprobe_disabled(&op->kp))
			
 
				+		arch_disarm_kprobe(&op->kp);
			
 
				 }
			
 
				 
			
 
				 /* Unoptimize a kprobe if p is optimized */
			
 
				-static __kprobes void unoptimize_kprobe(struct kprobe *p)
			
 
				+static __kprobes void unoptimize_kprobe(struct kprobe *p, bool force)
			
 
				 {
			
 
				 	struct optimized_kprobe *op;
			
 
				 
			
 
				-	if ((p->flags & KPROBE_FLAG_OPTIMIZED) && kprobe_aggrprobe(p)) {
			
 
				-		op = container_of(p, struct optimized_kprobe, kp);
			
 
				-		if (!list_empty(&op->list))
			
 
				-			/* Dequeue from the optimization queue */
			
 
				+	if (!kprobe_aggrprobe(p) || kprobe_disarmed(p))
			
 
				+		return; /* This is neither an optprobe nor optimized */
			
 
				+
			
 
				+	op = container_of(p, struct optimized_kprobe, kp);
			
 
				+	if (!kprobe_optimized(p)) {
			
 
				+		/* Unoptimized or unoptimizing case */
			
 
				+		if (force && !list_empty(&op->list)) {
			
 
				+			/*
			
 
				+			 * Only if this is unoptimizing kprobe and forced,
			
 
				+			 * forcibly unoptimize it. (No need to unoptimize
			
 
				+			 * unoptimized kprobe again :)
			
 
				+			 */
			
 
				 			list_del_init(&op->list);
			
 
				-		else
			
 
				-			/* Replace jump with break */
			
 
				-			arch_unoptimize_kprobe(op);
			
 
				-		op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
			
 
				+			force_unoptimize_kprobe(op);
			
 
				+		}
			
 
				+		return;
			
 
				+	}
			
 
				+
			
 
				+	op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
			
 
				+	if (!list_empty(&op->list)) {
			
 
				+		/* Dequeue from the optimization queue */
			
 
				+		list_del_init(&op->list);
			
 
				+		return;
			
 
				+	}
			
 
				+	/* Optimized kprobe case */
			
 
				+	if (force)
			
 
				+		/* Forcibly update the code: this is a special case */
			
 
				+		force_unoptimize_kprobe(op);
			
 
				+	else {
			
 
				+		list_add(&op->list, &unoptimizing_list);
			
 
				+		kick_kprobe_optimizer();
			
 
				 	}
			
 
				 }
			
 
				 
			
 
				+/* Cancel unoptimizing for reusing */
			
 
				+static void reuse_unused_kprobe(struct kprobe *ap)
			
 
				+{
			
 
				+	struct optimized_kprobe *op;
			
 
				+
			
 
				+	BUG_ON(!kprobe_unused(ap));
			
 
				+	/*
			
 
				+	 * Unused kprobe MUST be on the way of delayed unoptimizing (means
			
 
				+	 * there is still a relative jump) and disabled.
			
 
				+	 */
			
 
				+	op = container_of(ap, struct optimized_kprobe, kp);
			
 
				+	if (unlikely(list_empty(&op->list)))
			
 
				+		printk(KERN_WARNING "Warning: found a stray unused "
			
 
				+			"aggrprobe@%p\n", ap->addr);
			
 
				+	/* Enable the probe again */
			
 
				+	ap->flags &= ~KPROBE_FLAG_DISABLED;
			
 
				+	/* Optimize it again (remove from op->list) */
			
 
				+	BUG_ON(!kprobe_optready(ap));
			
 
				+	optimize_kprobe(ap);
			
 
				+}
			
 
				+
			
 
				 /* Remove optimized instructions */
			
 
				 static void __kprobes kill_optimized_kprobe(struct kprobe *p)
			
 
				 {
			
 
				 	struct optimized_kprobe *op;
			
 
				 
			
 
				 	op = container_of(p, struct optimized_kprobe, kp);
			
 
				-	if (!list_empty(&op->list)) {
			
 
				-		/* Dequeue from the optimization queue */
			
 
				+	if (!list_empty(&op->list))
			
 
				+		/* Dequeue from the (un)optimization queue */
			
 
				 		list_del_init(&op->list);
			
 
				-		op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
			
 
				-	}
			
 
				-	/* Don't unoptimize, because the target code will be freed. */
			
 
				+
			
 
				+	op->kp.flags &= ~KPROBE_FLAG_OPTIMIZED;
			
 
				+	/* Don't touch the code, because it is already freed. */
			
 
				 	arch_remove_optimized_kprobe(op);
			
 
				 }
			
 
				 
			
@@ -543,16 +732,6 @@ static __kprobes void prepare_optimized_kprobe(struct kprobe *p)
 
				 	arch_prepare_optimized_kprobe(op);
			
 
				 }
			
 
				 
			
 
				-/* Free optimized instructions and optimized_kprobe */
			
 
				-static __kprobes void free_aggr_kprobe(struct kprobe *p)
			
 
				-{
			
 
				-	struct optimized_kprobe *op;
			
 
				-
			
 
				-	op = container_of(p, struct optimized_kprobe, kp);
			
 
				-	arch_remove_optimized_kprobe(op);
			
 
				-	kfree(op);
			
 
				-}
			
 
				-
			
 
				 /* Allocate new optimized_kprobe and try to prepare optimized instructions */
			
 
				 static __kprobes struct kprobe *alloc_aggr_kprobe(struct kprobe *p)
			
 
				 {
			
@@ -587,7 +766,8 @@ static __kprobes void try_to_optimize_kprobe(struct kprobe *p)
 
				 	op = container_of(ap, struct optimized_kprobe, kp);
			
 
				 	if (!arch_prepared_optinsn(&op->optinsn)) {
			
 
				 		/* If failed to setup optimizing, fallback to kprobe */
			
 
				-		free_aggr_kprobe(ap);
			
 
				+		arch_remove_optimized_kprobe(op);
			
 
				+		kfree(op);
			
 
				 		return;
			
 
				 	}
			
 
				 
			
@@ -631,21 +811,16 @@ static void __kprobes unoptimize_all_kprobes(void)
 
				 		return;
			
 
				 
			
 
				 	kprobes_allow_optimization = false;
			
 
				-	printk(KERN_INFO "Kprobes globally unoptimized\n");
			
 
				-	get_online_cpus();	/* For avoiding text_mutex deadlock */
			
 
				-	mutex_lock(&text_mutex);
			
 
				 	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
			
 
				 		head = &kprobe_table[i];
			
 
				 		hlist_for_each_entry_rcu(p, node, head, hlist) {
			
 
				 			if (!kprobe_disabled(p))
			
 
				-				unoptimize_kprobe(p);
			
 
				+				unoptimize_kprobe(p, false);
			
 
				 		}
			
 
				 	}
			
 
				-
			
 
				-	mutex_unlock(&text_mutex);
			
 
				-	put_online_cpus();
			
 
				-	/* Allow all currently running kprobes to complete */
			
 
				-	synchronize_sched();
			
 
				+	/* Wait for unoptimizing completion */
			
 
				+	wait_for_kprobe_optimizer();
			
 
				+	printk(KERN_INFO "Kprobes globally unoptimized\n");
			
 
				 }
			
 
				 
			
 
				 int sysctl_kprobes_optimization;
			
@@ -669,44 +844,60 @@ int proc_kprobes_optimization_handler(struct ctl_table *table, int write,
 
				 }
			
 
				 #endif /* CONFIG_SYSCTL */
			
 
				 
			
 
				+/* Put a breakpoint for a probe. Must be called with text_mutex locked */
			
 
				 static void __kprobes __arm_kprobe(struct kprobe *p)
			
 
				 {
			
 
				-	struct kprobe *old_p;
			
 
				+	struct kprobe *_p;
			
 
				 
			
 
				 	/* Check collision with other optimized kprobes */
			
 
				-	old_p = get_optimized_kprobe((unsigned long)p->addr);
			
 
				-	if (unlikely(old_p))
			
 
				-		unoptimize_kprobe(old_p); /* Fallback to unoptimized kprobe */
			
 
				+	_p = get_optimized_kprobe((unsigned long)p->addr);
			
 
				+	if (unlikely(_p))
			
 
				+		/* Fallback to unoptimized kprobe */
			
 
				+		unoptimize_kprobe(_p, true);
			
 
				 
			
 
				 	arch_arm_kprobe(p);
			
 
				 	optimize_kprobe(p);	/* Try to optimize (add kprobe to a list) */
			
 
				 }
			
 
				 
			
 
				-static void __kprobes __disarm_kprobe(struct kprobe *p)
			
 
				+/* Remove the breakpoint of a probe. Must be called with text_mutex locked */
			
 
				+static void __kprobes __disarm_kprobe(struct kprobe *p, bool reopt)
			
 
				 {
			
 
				-	struct kprobe *old_p;
			
 
				+	struct kprobe *_p;
			
 
				 
			
 
				-	unoptimize_kprobe(p);	/* Try to unoptimize */
			
 
				-	arch_disarm_kprobe(p);
			
 
				+	unoptimize_kprobe(p, false);	/* Try to unoptimize */
			
 
				 
			
 
				-	/* If another kprobe was blocked, optimize it. */
			
 
				-	old_p = get_optimized_kprobe((unsigned long)p->addr);
			
 
				-	if (unlikely(old_p))
			
 
				-		optimize_kprobe(old_p);
			
 
				+	if (!kprobe_queued(p)) {
			
 
				+		arch_disarm_kprobe(p);
			
 
				+		/* If another kprobe was blocked, optimize it. */
			
 
				+		_p = get_optimized_kprobe((unsigned long)p->addr);
			
 
				+		if (unlikely(_p) && reopt)
			
 
				+			optimize_kprobe(_p);
			
 
				+	}
			
 
				+	/* TODO: reoptimize others after unoptimizing this probe */
			
 
				 }
			
 
				 
			
 
				 #else /* !CONFIG_OPTPROBES */
			
 
				 
			
 
				 #define optimize_kprobe(p)			do {} while (0)
			
 
				-#define unoptimize_kprobe(p)			do {} while (0)
			
 
				+#define unoptimize_kprobe(p, f)			do {} while (0)
			
 
				 #define kill_optimized_kprobe(p)		do {} while (0)
			
 
				 #define prepare_optimized_kprobe(p)		do {} while (0)
			
 
				 #define try_to_optimize_kprobe(p)		do {} while (0)
			
 
				 #define __arm_kprobe(p)				arch_arm_kprobe(p)
			
 
				-#define __disarm_kprobe(p)			arch_disarm_kprobe(p)
			
 
				+#define __disarm_kprobe(p, o)			arch_disarm_kprobe(p)
			
 
				+#define kprobe_disarmed(p)			kprobe_disabled(p)
			
 
				+#define wait_for_kprobe_optimizer()		do {} while (0)
			
 
				+
			
 
				+/* Without optimization, there should be no unused kprobes to reuse */
			
 
				+static void reuse_unused_kprobe(struct kprobe *ap)
			
 
				+{
			
 
				+	printk(KERN_ERR "Error: There should be no unused kprobe here.\n");
			
 
				+	BUG_ON(kprobe_unused(ap));
			
 
				+}
			
 
				 
			
 
				 static __kprobes void free_aggr_kprobe(struct kprobe *p)
			
 
				 {
			
 
				+	arch_remove_kprobe(p);
			
 
				 	kfree(p);
			
 
				 }
			
 
				 
			
@@ -732,11 +923,10 @@ static void __kprobes arm_kprobe(struct kprobe *kp)
 
				 /* Disarm a kprobe with text_mutex */
			
 
				 static void __kprobes disarm_kprobe(struct kprobe *kp)
			
 
				 {
			
 
				-	get_online_cpus();	/* For avoiding text_mutex deadlock */
			
 
				+	/* Ditto */
			
 
				 	mutex_lock(&text_mutex);
			
 
				-	__disarm_kprobe(kp);
			
 
				+	__disarm_kprobe(kp, true);
			
 
				 	mutex_unlock(&text_mutex);
			
 
				-	put_online_cpus();
			
 
				 }
			
 
				 
			
 
				 /*
			
@@ -942,7 +1132,7 @@ static int __kprobes add_new_kprobe(struct kprobe *ap, struct kprobe *p)
 
				 	BUG_ON(kprobe_gone(ap) || kprobe_gone(p));
			
 
				 
			
 
				 	if (p->break_handler || p->post_handler)
			
 
				-		unoptimize_kprobe(ap);	/* Fall back to normal kprobe */
			
 
				+		unoptimize_kprobe(ap, true);	/* Fall back to normal kprobe */
			
 
				 
			
 
				 	if (p->break_handler) {
			
 
				 		if (ap->break_handler)
			
@@ -993,19 +1183,21 @@ static void __kprobes init_aggr_kprobe(struct kprobe *ap, struct kprobe *p)
 
				  * This is the second or subsequent kprobe at the address - handle
			
 
				  * the intricacies
			
 
				  */
			
 
				-static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
			
 
				+static int __kprobes register_aggr_kprobe(struct kprobe *orig_p,
			
 
				 					  struct kprobe *p)
			
 
				 {
			
 
				 	int ret = 0;
			
 
				-	struct kprobe *ap = old_p;
			
 
				+	struct kprobe *ap = orig_p;
			
 
				 
			
 
				-	if (!kprobe_aggrprobe(old_p)) {
			
 
				-		/* If old_p is not an aggr_kprobe, create new aggr_kprobe. */
			
 
				-		ap = alloc_aggr_kprobe(old_p);
			
 
				+	if (!kprobe_aggrprobe(orig_p)) {
			
 
				+		/* If orig_p is not an aggr_kprobe, create new aggr_kprobe. */
			
 
				+		ap = alloc_aggr_kprobe(orig_p);
			
 
				 		if (!ap)
			
 
				 			return -ENOMEM;
			
 
				-		init_aggr_kprobe(ap, old_p);
			
 
				-	}
			
 
				+		init_aggr_kprobe(ap, orig_p);
			
 
				+	} else if (kprobe_unused(ap))
			
 
				+		/* This probe is going to die. Rescue it */
			
 
				+		reuse_unused_kprobe(ap);
			
 
				 
			
 
				 	if (kprobe_gone(ap)) {
			
 
				 		/*
			
@@ -1039,23 +1231,6 @@ static int __kprobes register_aggr_kprobe(struct kprobe *old_p,
 
				 	return add_new_kprobe(ap, p);
			
 
				 }
			
 
				 
			
 
				-/* Try to disable aggr_kprobe, and return 1 if succeeded.*/
			
 
				-static int __kprobes try_to_disable_aggr_kprobe(struct kprobe *p)
			
 
				-{
			
 
				-	struct kprobe *kp;
			
 
				-
			
 
				-	list_for_each_entry_rcu(kp, &p->list, list) {
			
 
				-		if (!kprobe_disabled(kp))
			
 
				-			/*
			
 
				-			 * There is an active probe on the list.
			
 
				-			 * We can't disable aggr_kprobe.
			
 
				-			 */
			
 
				-			return 0;
			
 
				-	}
			
 
				-	p->flags |= KPROBE_FLAG_DISABLED;
			
 
				-	return 1;
			
 
				-}
			
 
				-
			
 
				 static int __kprobes in_kprobes_functions(unsigned long addr)
			
 
				 {
			
 
				 	struct kprobe_blackpoint *kb;
			
@@ -1098,34 +1273,33 @@ static kprobe_opcode_t __kprobes *kprobe_addr(struct kprobe *p)
 
				 /* Check passed kprobe is valid and return kprobe in kprobe_table. */
			
 
				 static struct kprobe * __kprobes __get_valid_kprobe(struct kprobe *p)
			
 
				 {
			
 
				-	struct kprobe *old_p, *list_p;
			
 
				+	struct kprobe *ap, *list_p;
			
 
				 
			
 
				-	old_p = get_kprobe(p->addr);
			
 
				-	if (unlikely(!old_p))
			
 
				+	ap = get_kprobe(p->addr);
			
 
				+	if (unlikely(!ap))
			
 
				 		return NULL;
			
 
				 
			
 
				-	if (p != old_p) {
			
 
				-		list_for_each_entry_rcu(list_p, &old_p->list, list)
			
 
				+	if (p != ap) {
			
 
				+		list_for_each_entry_rcu(list_p, &ap->list, list)
			
 
				 			if (list_p == p)
			
 
				 			/* kprobe p is a valid probe */
			
 
				 				goto valid;
			
 
				 		return NULL;
			
 
				 	}
			
 
				 valid:
			
 
				-	return old_p;
			
 
				+	return ap;
			
 
				 }
			
 
				 
			
 
				 /* Return error if the kprobe is being re-registered */
			
 
				 static inline int check_kprobe_rereg(struct kprobe *p)
			
 
				 {
			
 
				 	int ret = 0;
			
 
				-	struct kprobe *old_p;
			
 
				 
			
 
				 	mutex_lock(&kprobe_mutex);
			
 
				-	old_p = __get_valid_kprobe(p);
			
 
				-	if (old_p)
			
 
				+	if (__get_valid_kprobe(p))
			
 
				 		ret = -EINVAL;
			
 
				 	mutex_unlock(&kprobe_mutex);
			
 
				+
			
 
				 	return ret;
			
 
				 }
			
 
				 
			
@@ -1229,67 +1403,121 @@ fail_with_jump_label:
 
				 }
			
 
				 EXPORT_SYMBOL_GPL(register_kprobe);
			
 
				 
			
 
				+/* Check if all probes on the aggrprobe are disabled */
			
 
				+static int __kprobes aggr_kprobe_disabled(struct kprobe *ap)
			
 
				+{
			
 
				+	struct kprobe *kp;
			
 
				+
			
 
				+	list_for_each_entry_rcu(kp, &ap->list, list)
			
 
				+		if (!kprobe_disabled(kp))
			
 
				+			/*
			
 
				+			 * There is an active probe on the list.
			
 
				+			 * We can't disable this ap.
			
 
				+			 */
			
 
				+			return 0;
			
 
				+
			
 
				+	return 1;
			
 
				+}
			
 
				+
			
 
				+/* Disable one kprobe: must be called with kprobe_mutex locked */
			
 
				+static struct kprobe *__kprobes __disable_kprobe(struct kprobe *p)
			
 
				+{
			
 
				+	struct kprobe *orig_p;
			
 
				+
			
 
				+	/* Get an original kprobe for return */
			
 
				+	orig_p = __get_valid_kprobe(p);
			
 
				+	if (unlikely(orig_p == NULL))
			
 
				+		return NULL;
			
 
				+
			
 
				+	if (!kprobe_disabled(p)) {
			
 
				+		/* Disable probe if it is a child probe */
			
 
				+		if (p != orig_p)
			
 
				+			p->flags |= KPROBE_FLAG_DISABLED;
			
 
				+
			
 
				+		/* Try to disarm and disable this/parent probe */
			
 
				+		if (p == orig_p || aggr_kprobe_disabled(orig_p)) {
			
 
				+			disarm_kprobe(orig_p);
			
 
				+			orig_p->flags |= KPROBE_FLAG_DISABLED;
			
 
				+		}
			
 
				+	}
			
 
				+
			
 
				+	return orig_p;
			
 
				+}
			
 
				+
			
 
				 /*
			
 
				  * Unregister a kprobe without a scheduler synchronization.
			
 
				  */
			
 
				 static int __kprobes __unregister_kprobe_top(struct kprobe *p)
			
 
				 {
			
 
				-	struct kprobe *old_p, *list_p;
			
 
				+	struct kprobe *ap, *list_p;
			
 
				 
			
 
				-	old_p = __get_valid_kprobe(p);
			
 
				-	if (old_p == NULL)
			
 
				+	/* Disable kprobe. This will disarm it if needed. */
			
 
				+	ap = __disable_kprobe(p);
			
 
				+	if (ap == NULL)
			
 
				 		return -EINVAL;
			
 
				 
			
 
				-	if (old_p == p ||
			
 
				-	    (kprobe_aggrprobe(old_p) &&
			
 
				-	     list_is_singular(&old_p->list))) {
			
 
				+	if (ap == p)
			
 
				 		/*
			
 
				-		 * Only probe on the hash list. Disarm only if kprobes are
			
 
				-		 * enabled and not gone - otherwise, the breakpoint would
			
 
				-		 * already have been removed. We save on flushing icache.
			
 
				+		 * This probe is an independent (and non-optimized) kprobe
			
 
				+		 * (not an aggrprobe). Remove from the hash list.
			
 
				 		 */
			
 
				-		if (!kprobes_all_disarmed && !kprobe_disabled(old_p))
			
 
				-			disarm_kprobe(old_p);
			
 
				-		hlist_del_rcu(&old_p->hlist);
			
 
				-	} else {
			
 
				+		goto disarmed;
			
 
				+
			
 
				+	/* The following code expects this probe to be an aggrprobe */
			
 
				+	WARN_ON(!kprobe_aggrprobe(ap));
			
 
				+
			
 
				+	if (list_is_singular(&ap->list) && kprobe_disarmed(ap))
			
 
				+		/*
			
 
				+		 * !disarmed could happen if the probe is under delayed
			
 
				+		 * unoptimizing.
			
 
				+		 */
			
 
				+		goto disarmed;
			
 
				+	else {
			
 
				+		/* If disabling probe has special handlers, update aggrprobe */
			
 
				 		if (p->break_handler && !kprobe_gone(p))
			
 
				-			old_p->break_handler = NULL;
			
 
				+			ap->break_handler = NULL;
			
 
				 		if (p->post_handler && !kprobe_gone(p)) {
			
 
				-			list_for_each_entry_rcu(list_p, &old_p->list, list) {
			
 
				+			list_for_each_entry_rcu(list_p, &ap->list, list) {
			
 
				 				if ((list_p != p) && (list_p->post_handler))
			
 
				 					goto noclean;
			
 
				 			}
			
 
				-			old_p->post_handler = NULL;
			
 
				+			ap->post_handler = NULL;
			
 
				 		}
			
 
				 noclean:
			
 
				+		/*
			
 
				+		 * Remove from the aggrprobe: this path will do nothing in
			
 
				+		 * __unregister_kprobe_bottom().
			
 
				+		 */
			
 
				 		list_del_rcu(&p->list);
			
 
				-		if (!kprobe_disabled(old_p)) {
			
 
				-			try_to_disable_aggr_kprobe(old_p);
			
 
				-			if (!kprobes_all_disarmed) {
			
 
				-				if (kprobe_disabled(old_p))
			
 
				-					disarm_kprobe(old_p);
			
 
				-				else
			
 
				-					/* Try to optimize this probe again */
			
 
				-					optimize_kprobe(old_p);
			
 
				-			}
			
 
				-		}
			
 
				+		if (!kprobe_disabled(ap) && !kprobes_all_disarmed)
			
 
				+			/*
			
 
				+			 * Try to optimize this probe again, because post
			
 
				+			 * handler may have been changed.
			
 
				+			 */
			
 
				+			optimize_kprobe(ap);
			
 
				 	}
			
 
				 	return 0;
			
 
				+
			
 
				+disarmed:
			
 
				+	BUG_ON(!kprobe_disarmed(ap));
			
 
				+	hlist_del_rcu(&ap->hlist);
			
 
				+	return 0;
			
 
				 }
			
 
				 
			
 
				 static void __kprobes __unregister_kprobe_bottom(struct kprobe *p)
			
 
				 {
			
 
				-	struct kprobe *old_p;
			
 
				+	struct kprobe *ap;
			
 
				 
			
 
				 	if (list_empty(&p->list))
			
 
				+		/* This is an independent kprobe */
			
 
				 		arch_remove_kprobe(p);
			
 
				 	else if (list_is_singular(&p->list)) {
			
 
				-		/* "p" is the last child of an aggr_kprobe */
			
 
				-		old_p = list_entry(p->list.next, struct kprobe, list);
			
 
				+		/* This is the last child of an aggrprobe */
			
 
				+		ap = list_entry(p->list.next, struct kprobe, list);
			
 
				 		list_del(&p->list);
			
 
				-		arch_remove_kprobe(old_p);
			
 
				-		free_aggr_kprobe(old_p);
			
 
				+		free_aggr_kprobe(ap);
			
 
				 	}
			
 
				+	/* Otherwise, do nothing. */
			
 
				 }
			
 
				 
			
 
				 int __kprobes register_kprobes(struct kprobe **kps, int num)
			
@@ -1607,29 +1835,13 @@ static void __kprobes kill_kprobe(struct kprobe *p)
 
				 int __kprobes disable_kprobe(struct kprobe *kp)
			
 
				 {
			
 
				 	int ret = 0;
			
 
				-	struct kprobe *p;
			
 
				 
			
 
				 	mutex_lock(&kprobe_mutex);
			
 
				 
			
 
				-	/* Check whether specified probe is valid. */
			
 
				-	p = __get_valid_kprobe(kp);
			
 
				-	if (unlikely(p == NULL)) {
			
 
				+	/* Disable this kprobe */
			
 
				+	if (__disable_kprobe(kp) == NULL)
			
 
				 		ret = -EINVAL;
			
 
				-		goto out;
			
 
				-	}
			
 
				 
			
 
				-	/* If the probe is already disabled (or gone), just return */
			
 
				-	if (kprobe_disabled(kp))
			
 
				-		goto out;
			
 
				-
			
 
				-	kp->flags |= KPROBE_FLAG_DISABLED;
			
 
				-	if (p != kp)
			
 
				-		/* When kp != p, p is always enabled. */
			
 
				-		try_to_disable_aggr_kprobe(p);
			
 
				-
			
 
				-	if (!kprobes_all_disarmed && kprobe_disabled(p))
			
 
				-		disarm_kprobe(p);
			
 
				-out:
			
 
				 	mutex_unlock(&kprobe_mutex);
			
 
				 	return ret;
			
 
				 }
			
@@ -1927,36 +2139,27 @@ static void __kprobes disarm_all_kprobes(void)
 
				 	mutex_lock(&kprobe_mutex);
			
 
				 
			
 
				 	/* If kprobes are already disarmed, just return */
			
 
				-	if (kprobes_all_disarmed)
			
 
				-		goto already_disabled;
			
 
				+	if (kprobes_all_disarmed) {
			
 
				+		mutex_unlock(&kprobe_mutex);
			
 
				+		return;
			
 
				+	}
			
 
				 
			
 
				 	kprobes_all_disarmed = true;
			
 
				 	printk(KERN_INFO "Kprobes globally disabled\n");
			
 
				 
			
 
				-	/*
			
 
				-	 * Here we call get_online_cpus() for avoiding text_mutex deadlock,
			
 
				-	 * because disarming may also unoptimize kprobes.
			
 
				-	 */
			
 
				-	get_online_cpus();
			
 
				 	mutex_lock(&text_mutex);
			
 
				 	for (i = 0; i < KPROBE_TABLE_SIZE; i++) {
			
 
				 		head = &kprobe_table[i];
			
 
				 		hlist_for_each_entry_rcu(p, node, head, hlist) {
			
 
				 			if (!arch_trampoline_kprobe(p) && !kprobe_disabled(p))
			
 
				-				__disarm_kprobe(p);
			
 
				+				__disarm_kprobe(p, false);
			
 
				 		}
			
 
				 	}
			
 
				-
			
 
				 	mutex_unlock(&text_mutex);
			
 
				-	put_online_cpus();
			
 
				 	mutex_unlock(&kprobe_mutex);
			
 
				-	/* Allow all currently running kprobes to complete */
			
 
				-	synchronize_sched();
			
 
				-	return;
			
 
				 
			
 
				-already_disabled:
			
 
				-	mutex_unlock(&kprobe_mutex);
			
 
				-	return;
			
 
				+	/* Wait for disarming all kprobes by optimizer */
			
 
				+	wait_for_kprobe_optimizer();
			
 
				 }
			
 
				 
			
 
				 /*
			
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -13,6 +13,7 @@
 
				 #include <linux/mm.h>
			
 
				 #include <linux/cpu.h>
			
 
				 #include <linux/smp.h>
			
 
				+#include <linux/idr.h>
			
 
				 #include <linux/file.h>
			
 
				 #include <linux/poll.h>
			
 
				 #include <linux/slab.h>
			
@@ -21,7 +22,9 @@
 
				 #include <linux/dcache.h>
			
 
				 #include <linux/percpu.h>
			
 
				 #include <linux/ptrace.h>
			
 
				+#include <linux/reboot.h>
			
 
				 #include <linux/vmstat.h>
			
 
				+#include <linux/device.h>
			
 
				 #include <linux/vmalloc.h>
			
 
				 #include <linux/hardirq.h>
			
 
				 #include <linux/rculist.h>
			
@@ -133,6 +136,28 @@ static void unclone_ctx(struct perf_event_context *ctx)
 
				 	}
			
 
				 }
			
 
				 
			
 
				+static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
			
 
				+{
			
 
				+	/*
			
 
				+	 * only top level events have the pid namespace they were created in
			
 
				+	 */
			
 
				+	if (event->parent)
			
 
				+		event = event->parent;
			
 
				+
			
 
				+	return task_tgid_nr_ns(p, event->ns);
			
 
				+}
			
 
				+
			
 
				+static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
			
 
				+{
			
 
				+	/*
			
 
				+	 * only top level events have the pid namespace they were created in
			
 
				+	 */
			
 
				+	if (event->parent)
			
 
				+		event = event->parent;
			
 
				+
			
 
				+	return task_pid_nr_ns(p, event->ns);
			
 
				+}
			
 
				+
			
 
				 /*
			
 
				  * If we inherit events we want to return the parent event id
			
 
				  * to userspace.
			
@@ -312,9 +337,84 @@ list_add_event(struct perf_event *event, struct perf_event_context *ctx)
 
				 		ctx->nr_stat++;
			
 
				 }
			
 
				 
			
 
				+/*
			
 
				+ * Called at perf_event creation and when events are attached/detached from a
			
 
				+ * group.
			
 
				+ */
			
 
				+static void perf_event__read_size(struct perf_event *event)
			
 
				+{
			
 
				+	int entry = sizeof(u64); /* value */
			
 
				+	int size = 0;
			
 
				+	int nr = 1;
			
 
				+
			
 
				+	if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
			
 
				+		size += sizeof(u64);
			
 
				+
			
 
				+	if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
			
 
				+		size += sizeof(u64);
			
 
				+
			
 
				+	if (event->attr.read_format & PERF_FORMAT_ID)
			
 
				+		entry += sizeof(u64);
			
 
				+
			
 
				+	if (event->attr.read_format & PERF_FORMAT_GROUP) {
			
 
				+		nr += event->group_leader->nr_siblings;
			
 
				+		size += sizeof(u64);
			
 
				+	}
			
 
				+
			
 
				+	size += entry * nr;
			
 
				+	event->read_size = size;
			
 
				+}
			
 
				+
			
 
				+static void perf_event__header_size(struct perf_event *event)
			
 
				+{
			
 
				+	struct perf_sample_data *data;
			
 
				+	u64 sample_type = event->attr.sample_type;
			
 
				+	u16 size = 0;
			
 
				+
			
 
				+	perf_event__read_size(event);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_IP)
			
 
				+		size += sizeof(data->ip);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_ADDR)
			
 
				+		size += sizeof(data->addr);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_PERIOD)
			
 
				+		size += sizeof(data->period);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_READ)
			
 
				+		size += event->read_size;
			
 
				+
			
 
				+	event->header_size = size;
			
 
				+}
			
 
				+
			
 
				+static void perf_event__id_header_size(struct perf_event *event)
			
 
				+{
			
 
				+	struct perf_sample_data *data;
			
 
				+	u64 sample_type = event->attr.sample_type;
			
 
				+	u16 size = 0;
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TID)
			
 
				+		size += sizeof(data->tid_entry);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TIME)
			
 
				+		size += sizeof(data->time);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_ID)
			
 
				+		size += sizeof(data->id);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_STREAM_ID)
			
 
				+		size += sizeof(data->stream_id);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_CPU)
			
 
				+		size += sizeof(data->cpu_entry);
			
 
				+
			
 
				+	event->id_header_size = size;
			
 
				+}
			
 
				+
			
 
				 static void perf_group_attach(struct perf_event *event)
			
 
				 {
			
 
				-	struct perf_event *group_leader = event->group_leader;
			
 
				+	struct perf_event *group_leader = event->group_leader, *pos;
			
 
				 
			
 
				 	/*
			
 
				 	 * We can have double attach due to group movement in perf_event_open.
			
@@ -333,6 +433,11 @@ static void perf_group_attach(struct perf_event *event)
 
				 
			
 
				 	list_add_tail(&event->group_entry, &group_leader->sibling_list);
			
 
				 	group_leader->nr_siblings++;
			
 
				+
			
 
				+	perf_event__header_size(group_leader);
			
 
				+
			
 
				+	list_for_each_entry(pos, &group_leader->sibling_list, group_entry)
			
 
				+		perf_event__header_size(pos);
			
 
				 }
			
 
				 
			
 
				 /*
			
@@ -391,7 +496,7 @@ static void perf_group_detach(struct perf_event *event)
 
				 	if (event->group_leader != event) {
			
 
				 		list_del_init(&event->group_entry);
			
 
				 		event->group_leader->nr_siblings--;
			
 
				-		return;
			
 
				+		goto out;
			
 
				 	}
			
 
				 
			
 
				 	if (!list_empty(&event->group_entry))
			
@@ -410,6 +515,12 @@ static void perf_group_detach(struct perf_event *event)
 
				 		/* Inherit group flags from the previous leader */
			
 
				 		sibling->group_flags = event->group_flags;
			
 
				 	}
			
 
				+
			
 
				+out:
			
 
				+	perf_event__header_size(event->group_leader);
			
 
				+
			
 
				+	list_for_each_entry(tmp, &event->group_leader->sibling_list, group_entry)
			
 
				+		perf_event__header_size(tmp);
			
 
				 }
			
 
				 
			
 
				 static inline int
			
@@ -1073,7 +1184,7 @@ static int perf_event_refresh(struct perf_event *event, int refresh)
 
				 	/*
			
 
				 	 * not supported on inherited events
			
 
				 	 */
			
 
				-	if (event->attr.inherit)
			
 
				+	if (event->attr.inherit || !is_sampling_event(event))
			
 
				 		return -EINVAL;
			
 
				 
			
 
				 	atomic_add(refresh, &event->event_limit);
			
@@ -2289,31 +2400,6 @@ static int perf_release(struct inode *inode, struct file *file)
 
				 	return perf_event_release_kernel(event);
			
 
				 }
			
 
				 
			
 
				-static int perf_event_read_size(struct perf_event *event)
			
 
				-{
			
 
				-	int entry = sizeof(u64); /* value */
			
 
				-	int size = 0;
			
 
				-	int nr = 1;
			
 
				-
			
 
				-	if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_ENABLED)
			
 
				-		size += sizeof(u64);
			
 
				-
			
 
				-	if (event->attr.read_format & PERF_FORMAT_TOTAL_TIME_RUNNING)
			
 
				-		size += sizeof(u64);
			
 
				-
			
 
				-	if (event->attr.read_format & PERF_FORMAT_ID)
			
 
				-		entry += sizeof(u64);
			
 
				-
			
 
				-	if (event->attr.read_format & PERF_FORMAT_GROUP) {
			
 
				-		nr += event->group_leader->nr_siblings;
			
 
				-		size += sizeof(u64);
			
 
				-	}
			
 
				-
			
 
				-	size += entry * nr;
			
 
				-
			
 
				-	return size;
			
 
				-}
			
 
				-
			
 
				 u64 perf_event_read_value(struct perf_event *event, u64 *enabled, u64 *running)
			
 
				 {
			
 
				 	struct perf_event *child;
			
@@ -2428,7 +2514,7 @@ perf_read_hw(struct perf_event *event, char __user *buf, size_t count)
 
				 	if (event->state == PERF_EVENT_STATE_ERROR)
			
 
				 		return 0;
			
 
				 
			
 
				-	if (count < perf_event_read_size(event))
			
 
				+	if (count < event->read_size)
			
 
				 		return -ENOSPC;
			
 
				 
			
 
				 	WARN_ON_ONCE(event->ctx->parent_ctx);
			
@@ -2514,7 +2600,7 @@ static int perf_event_period(struct perf_event *event, u64 __user *arg)
 
				 	int ret = 0;
			
 
				 	u64 value;
			
 
				 
			
 
				-	if (!event->attr.sample_period)
			
 
				+	if (!is_sampling_event(event))
			
 
				 		return -EINVAL;
			
 
				 
			
 
				 	if (copy_from_user(&value, arg, sizeof(value)))
			
@@ -3305,6 +3391,73 @@ __always_inline void perf_output_copy(struct perf_output_handle *handle,
 
				 	} while (len);
			
 
				 }
			
 
				 
			
 
				+static void __perf_event_header__init_id(struct perf_event_header *header,
			
 
				+					 struct perf_sample_data *data,
			
 
				+					 struct perf_event *event)
			
 
				+{
			
 
				+	u64 sample_type = event->attr.sample_type;
			
 
				+
			
 
				+	data->type = sample_type;
			
 
				+	header->size += event->id_header_size;
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TID) {
			
 
				+		/* namespace issues */
			
 
				+		data->tid_entry.pid = perf_event_pid(event, current);
			
 
				+		data->tid_entry.tid = perf_event_tid(event, current);
			
 
				+	}
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TIME)
			
 
				+		data->time = perf_clock();
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_ID)
			
 
				+		data->id = primary_event_id(event);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_STREAM_ID)
			
 
				+		data->stream_id = event->id;
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_CPU) {
			
 
				+		data->cpu_entry.cpu	 = raw_smp_processor_id();
			
 
				+		data->cpu_entry.reserved = 0;
			
 
				+	}
			
 
				+}
			
 
				+
			
 
				+static void perf_event_header__init_id(struct perf_event_header *header,
			
 
				+				       struct perf_sample_data *data,
			
 
				+				       struct perf_event *event)
			
 
				+{
			
 
				+	if (event->attr.sample_id_all)
			
 
				+		__perf_event_header__init_id(header, data, event);
			
 
				+}
			
 
				+
			
 
				+static void __perf_event__output_id_sample(struct perf_output_handle *handle,
			
 
				+					   struct perf_sample_data *data)
			
 
				+{
			
 
				+	u64 sample_type = data->type;
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TID)
			
 
				+		perf_output_put(handle, data->tid_entry);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_TIME)
			
 
				+		perf_output_put(handle, data->time);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_ID)
			
 
				+		perf_output_put(handle, data->id);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_STREAM_ID)
			
 
				+		perf_output_put(handle, data->stream_id);
			
 
				+
			
 
				+	if (sample_type & PERF_SAMPLE_CPU)
			
 
				+		perf_output_put(handle, data->cpu_entry);
			
 
				+}
			
 
				+
			
 
				+static void perf_event__output_id_sample(struct perf_event *event,
			
 
				+					 struct perf_output_handle *handle,
			
 
				+					 struct perf_sample_data *sample)
			
 
				+{
			
 
				+	if (event->attr.sample_id_all)
			
 
				+		__perf_event__output_id_sample(handle, sample);
			
 
				+}
			
 
				+
			
 
				 int perf_output_begin(struct perf_output_handle *handle,
			
 
				 		      struct perf_event *event, unsigned int size,
			
 
				 		      int nmi, int sample)
			
@@ -3312,6 +3465,7 @@ int perf_output_begin(struct perf_output_handle *handle,
 
				 	struct perf_buffer *buffer;
			
 
				 	unsigned long tail, offset, head;
			
 
				 	int have_lost;
			
 
				+	struct perf_sample_data sample_data;
			
 
				 	struct {
			
 
				 		struct perf_event_header header;
			
 
				 		u64			 id;
			
@@ -3338,8 +3492,12 @@ int perf_output_begin(struct perf_output_handle *handle,
 
				 		goto out;
			
 
				 
			
 
				 	have_lost = local_read(&buffer->lost);
			
 
				-	if (have_lost)
			
 
				-		size += sizeof(lost_event);
			
 
				+	if (have_lost) {
			
 
				+		lost_event.header.size = sizeof(lost_event);
			
 
				+		perf_event_header__init_id(&lost_event.header, &sample_data,
			
 
				+					   event);
			
 
				+		size += lost_event.header.size;
			
 
				+	}
			
 
				 
			
 
				 	perf_output_get_handle(handle);
			
 
				 
			
@@ -3370,11 +3528,11 @@ int perf_output_begin(struct perf_output_handle *handle,
 
				 	if (have_lost) {
			
 
				 		lost_event.header.type = PERF_RECORD_LOST;
			
 
				 		lost_event.header.misc = 0;
			
 
				-		lost_event.header.size = sizeof(lost_event);
			
 
				 		lost_event.id          = event->id;
			
 
				 		lost_event.lost        = local_xchg(&buffer->lost, 0);
			
 
				 
			
 
				 		perf_output_put(handle, lost_event);
			
 
				+		perf_event__output_id_sample(event, handle, &sample_data);
			
 
				 	}
			
 
				 
			
 
				 	return 0;
			
@@ -3407,28 +3565,6 @@ void perf_output_end(struct perf_output_handle *handle)
 
				 	rcu_read_unlock();
			
 
				 }
			
 
				 
			
 
				-static u32 perf_event_pid(struct perf_event *event, struct task_struct *p)
			
 
				-{
			
 
				-	/*
			
 
				-	 * only top level events have the pid namespace they were created in
			
 
				-	 */
			
 
				-	if (event->parent)
			
 
				-		event = event->parent;
			
 
				-
			
 
				-	return task_tgid_nr_ns(p, event->ns);
			
 
				-}
			
 
				-
			
 
				-static u32 perf_event_tid(struct perf_event *event, struct task_struct *p)
			
 
				-{
			
 
				-	/*
			
 
				-	 * only top level events have the pid namespace they were created in
			
 
				-	 */
			
 
				-	if (event->parent)
			
 
				-		event = event->parent;
			
 
				-
			
 
				-	return task_pid_nr_ns(p, event->ns);
			
 
				-}
			
 
				-
			
 
				 static void perf_output_read_one(struct perf_output_handle *handle,
			
 
				 				 struct perf_event *event,
			
 
				 				 u64 enabled, u64 running)
			
@@ -3603,61 +3739,16 @@ void perf_prepare_sample(struct perf_event_header *header,
 
				 {
			
 
				 	u64 sample_type = event->attr.sample_type;
			
 
				 
			
 
				-	data->type = sample_type;
			
 
				-
			
 
				 	header->type = PERF_RECORD_SAMPLE;
			
 
				-	header->size = sizeof(*header);
			
 
				+	header->size = sizeof(*header) + event->header_size;
			
 
				 
			
 
				 	header->misc = 0;
			
 
				 	header->misc |= perf_misc_flags(regs);
			
 
				 
			
 
				-	if (sample_type & PERF_SAMPLE_IP) {
			
 
				-		data->ip = perf_instruction_pointer(regs);
			
 
				-
			
 
				-		header->size += sizeof(data->ip);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_TID) {
			
 
				-		/* namespace issues */
			
 
				-		data->tid_entry.pid = perf_event_pid(event, current);
			
 
				-		data->tid_entry.tid = perf_event_tid(event, current);
			
 
				-
			
 
				-		header->size += sizeof(data->tid_entry);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_TIME) {
			
 
				-		data->time = perf_clock();
			
 
				-
			
 
				-		header->size += sizeof(data->time);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_ADDR)
			
 
				-		header->size += sizeof(data->addr);
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_ID) {
			
 
				-		data->id = primary_event_id(event);
			
 
				-
			
 
				-		header->size += sizeof(data->id);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_STREAM_ID) {
			
 
				-		data->stream_id = event->id;
			
 
				-
			
 
				-		header->size += sizeof(data->stream_id);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_CPU) {
			
 
				-		data->cpu_entry.cpu		= raw_smp_processor_id();
			
 
				-		data->cpu_entry.reserved	= 0;
			
 
				-
			
 
				-		header->size += sizeof(data->cpu_entry);
			
 
				-	}
			
 
				-
			
 
				-	if (sample_type & PERF_SAMPLE_PERIOD)
			
 
				-		header->size += sizeof(data->period);
			
 
				+	__perf_event_header__init_id(header, data, event);
			
 
				 
			
 
				-	if (sample_type & PERF_SAMPLE_READ)
			
 
				-		header->size += perf_event_read_size(event);
			
 
				+	if (sample_type & PERF_SAMPLE_IP)
			
 
				+		data->ip = perf_instruction_pointer(regs);
			
 
				 
			
 
				 	if (sample_type & PERF_SAMPLE_CALLCHAIN) {
			
 
				 		int size = 1;
			
@@ -3722,23 +3813,26 @@ perf_event_read_event(struct perf_event *event,
 
				 			struct task_struct *task)
			
 
				 {
			
 
				 	struct perf_output_handle handle;
			
 
				+	struct perf_sample_data sample;
			
 
				 	struct perf_read_event read_event = {
			
 
				 		.header = {
			
 
				 			.type = PERF_RECORD_READ,
			
 
				 			.misc = 0,
			
 
				-			.size = sizeof(read_event) + perf_event_read_size(event),
			
 
				+			.size = sizeof(read_event) + event->read_size,
			
 
				 		},
			
 
				 		.pid = perf_event_pid(event, task),
			
 
				 		.tid = perf_event_tid(event, task),
			
 
				 	};
			
 
				 	int ret;
			
 
				 
			
 
				+	perf_event_header__init_id(&read_event.header, &sample, event);
			
 
				 	ret = perf_output_begin(&handle, event, read_event.header.size, 0, 0);
			
 
				 	if (ret)
			
 
				 		return;
			
 
				 
			
 
				 	perf_output_put(&handle, read_event);
			
 
				 	perf_output_read(&handle, event);
			
 
				+	perf_event__output_id_sample(event, &handle, &sample);
			
 
				 
			
 
				 	perf_output_end(&handle);
			
 
				 }
			
@@ -3768,14 +3862,16 @@ static void perf_event_task_output(struct perf_event *event,
 
				 				     struct perf_task_event *task_event)
			
 
				 {
			
 
				 	struct perf_output_handle handle;
			
 
				+	struct perf_sample_data	sample;
			
 
				 	struct task_struct *task = task_event->task;
			
 
				-	int size, ret;
			
 
				+	int ret, size = task_event->event_id.header.size;
			
 
				 
			
 
				-	size  = task_event->event_id.header.size;
			
 
				-	ret = perf_output_begin(&handle, event, size, 0, 0);
			
 
				+	perf_event_header__init_id(&task_event->event_id.header, &sample, event);
			
 
				 
			
 
				+	ret = perf_output_begin(&handle, event,
			
 
				+				task_event->event_id.header.size, 0, 0);
			
 
				 	if (ret)
			
 
				-		return;
			
 
				+		goto out;
			
 
				 
			
 
				 	task_event->event_id.pid = perf_event_pid(event, task);
			
 
				 	task_event->event_id.ppid = perf_event_pid(event, current);
			
@@ -3785,7 +3881,11 @@ static void perf_event_task_output(struct perf_event *event,
 
				 
			
 
				 	perf_output_put(&handle, task_event->event_id);
			
 
				 
			
 
				+	perf_event__output_id_sample(event, &handle, &sample);
			
 
				+
			
 
				 	perf_output_end(&handle);
			
 
				+out:
			
 
				+	task_event->event_id.header.size = size;
			
 
				 }
			
 
				 
			
 
				 static int perf_event_task_match(struct perf_event *event)
			
@@ -3900,11 +4000,16 @@ static void perf_event_comm_output(struct perf_event *event,
 
				 				     struct perf_comm_event *comm_event)
			
 
				 {
			
 
				 	struct perf_output_handle handle;
			
 
				+	struct perf_sample_data sample;
			
 
				 	int size = comm_event->event_id.header.size;
			
 
				-	int ret = perf_output_begin(&handle, event, size, 0, 0);
			
 
				+	int ret;
			
 
				+
			
 
				+	perf_event_header__init_id(&comm_event->event_id.header, &sample, event);
			
 
				+	ret = perf_output_begin(&handle, event,
			
 
				+				comm_event->event_id.header.size, 0, 0);
			
 
				 
			
 
				 	if (ret)
			
 
				-		return;
			
 
				+		goto out;
			
 
				 
			
 
				 	comm_event->event_id.pid = perf_event_pid(event, comm_event->task);
			
 
				 	comm_event->event_id.tid = perf_event_tid(event, comm_event->task);
			
@@ -3912,7 +4017,12 @@ static void perf_event_comm_output(struct perf_event *event,
 
				 	perf_output_put(&handle, comm_event->event_id);
			
 
				 	perf_output_copy(&handle, comm_event->comm,
			
 
				 				   comm_event->comm_size);
			
 
				+
			
 
				+	perf_event__output_id_sample(event, &handle, &sample);
			
 
				+
			
 
				 	perf_output_end(&handle);
			
 
				+out:
			
 
				+	comm_event->event_id.header.size = size;
			
 
				 }
			
 
				 
			
 
				 static int perf_event_comm_match(struct perf_event *event)
			
@@ -3957,7 +4067,6 @@ static void perf_event_comm_event(struct perf_comm_event *comm_event)
 
				 	comm_event->comm_size = size;
			
 
				 
			
 
				 	comm_event->event_id.header.size = sizeof(comm_event->event_id) + size;
			
 
				-
			
 
				 	rcu_read_lock();
			
 
				 	list_for_each_entry_rcu(pmu, &pmus, entry) {
			
 
				 		cpuctx = get_cpu_ptr(pmu->pmu_cpu_context);
			
@@ -4038,11 +4147,15 @@ static void perf_event_mmap_output(struct perf_event *event,
 
				 				     struct perf_mmap_event *mmap_event)
			
 
				 {
			
 
				 	struct perf_output_handle handle;
			
 
				+	struct perf_sample_data sample;
			
 
				 	int size = mmap_event->event_id.header.size;
			
 
				-	int ret = perf_output_begin(&handle, event, size, 0, 0);
			
 
				+	int ret;
			
 
				 
			
 
				+	perf_event_header__init_id(&mmap_event->event_id.header, &sample, event);
			
 
				+	ret = perf_output_begin(&handle, event,
			
 
				+				mmap_event->event_id.header.size, 0, 0);
			
 
				 	if (ret)
			
 
				-		return;
			
 
				+		goto out;
			
 
				 
			
 
				 	mmap_event->event_id.pid = perf_event_pid(event, current);
			
 
				 	mmap_event->event_id.tid = perf_event_tid(event, current);
			
@@ -4050,7 +4163,12 @@ static void perf_event_mmap_output(struct perf_event *event,
 
				 	perf_output_put(&handle, mmap_event->event_id);
			
 
				 	perf_output_copy(&handle, mmap_event->file_name,
			
 
				 				   mmap_event->file_size);
			
 
				+
			
 
				+	perf_event__output_id_sample(event, &handle, &sample);
			
 
				+
			
 
				 	perf_output_end(&handle);
			
 
				+out:
			
 
				+	mmap_event->event_id.header.size = size;
			
 
				 }
			
 
				 
			
 
				 static int perf_event_mmap_match(struct perf_event *event,
			
@@ -4205,6 +4323,7 @@ void perf_event_mmap(struct vm_area_struct *vma)
 
				 static void perf_log_throttle(struct perf_event *event, int enable)
			
 
				 {
			
 
				 	struct perf_output_handle handle;
			
 
				+	struct perf_sample_data sample;
			
 
				 	int ret;
			
 
				 
			
 
				 	struct {
			
@@ -4226,11 +4345,15 @@ static void perf_log_throttle(struct perf_event *event, int enable)
 
				 	if (enable)
			
 
				 		throttle_event.header.type = PERF_RECORD_UNTHROTTLE;
			
 
				 
			
 
				-	ret = perf_output_begin(&handle, event, sizeof(throttle_event), 1, 0);
			
 
				+	perf_event_header__init_id(&throttle_event.header, &sample, event);
			
 
				+
			
 
				+	ret = perf_output_begin(&handle, event,
			
 
				+				throttle_event.header.size, 1, 0);
			
 
				 	if (ret)
			
 
				 		return;
			
 
				 
			
 
				 	perf_output_put(&handle, throttle_event);
			
 
				+	perf_event__output_id_sample(event, &handle, &sample);
			
 
				 	perf_output_end(&handle);
			
 
				 }
			
 
				 
			
@@ -4246,6 +4369,13 @@ static int __perf_event_overflow(struct perf_event *event, int nmi,
 
				 	struct hw_perf_event *hwc = &event->hw;
			
 
				 	int ret = 0;
			
 
				 
			
 
				+	/*
			
 
				+	 * Non-sampling counters might still use the PMI to fold short
			
 
				+	 * hardware counters, ignore those.
			
 
				+	 */
			
 
				+	if (unlikely(!is_sampling_event(event)))
			
 
				+		return 0;
			
 
				+
			
 
				 	if (!throttle) {
			
 
				 		hwc->interrupts++;
			
 
				 	} else {
			
@@ -4391,7 +4521,7 @@ static void perf_swevent_event(struct perf_event *event, u64 nr,
 
				 	if (!regs)
			
 
				 		return;
			
 
				 
			
 
				-	if (!hwc->sample_period)
			
 
				+	if (!is_sampling_event(event))
			
 
				 		return;
			
 
				 
			
 
				 	if (nr == 1 && hwc->sample_period == 1 && !event->attr.freq)
			
@@ -4554,7 +4684,7 @@ static int perf_swevent_add(struct perf_event *event, int flags)
 
				 	struct hw_perf_event *hwc = &event->hw;
			
 
				 	struct hlist_head *head;
			
 
				 
			
 
				-	if (hwc->sample_period) {
			
 
				+	if (is_sampling_event(event)) {
			
 
				 		hwc->last_period = hwc->sample_period;
			
 
				 		perf_swevent_set_period(event);
			
 
				 	}
			
@@ -4811,15 +4941,6 @@ static int perf_tp_event_init(struct perf_event *event)
 
				 	if (event->attr.type != PERF_TYPE_TRACEPOINT)
			
 
				 		return -ENOENT;
			
 
				 
			
 
				-	/*
			
 
				-	 * Raw tracepoint data is a severe data leak, only allow root to
			
 
				-	 * have these.
			
 
				-	 */
			
 
				-	if ((event->attr.sample_type & PERF_SAMPLE_RAW) &&
			
 
				-			perf_paranoid_tracepoint_raw() &&
			
 
				-			!capable(CAP_SYS_ADMIN))
			
 
				-		return -EPERM;
			
 
				-
			
 
				 	err = perf_trace_init(event);
			
 
				 	if (err)
			
 
				 		return err;
			
@@ -4842,7 +4963,7 @@ static struct pmu perf_tracepoint = {
 
				 
			
 
				 static inline void perf_tp_register(void)
			
 
				 {
			
 
				-	perf_pmu_register(&perf_tracepoint);
			
 
				+	perf_pmu_register(&perf_tracepoint, "tracepoint", PERF_TYPE_TRACEPOINT);
			
 
				 }
			
 
				 
			
 
				 static int perf_event_set_filter(struct perf_event *event, void __user *arg)
			
@@ -4932,31 +5053,33 @@ static enum hrtimer_restart perf_swevent_hrtimer(struct hrtimer *hrtimer)
 
				 static void perf_swevent_start_hrtimer(struct perf_event *event)
			
 
				 {
			
 
				 	struct hw_perf_event *hwc = &event->hw;
			
 
				+	s64 period;
			
 
				+
			
 
				+	if (!is_sampling_event(event))
			
 
				+		return;
			
 
				 
			
 
				 	hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
			
 
				 	hwc->hrtimer.function = perf_swevent_hrtimer;
			
 
				-	if (hwc->sample_period) {
			
 
				-		s64 period = local64_read(&hwc->period_left);
			
 
				 
			
 
				-		if (period) {
			
 
				-			if (period < 0)
			
 
				-				period = 10000;
			
 
				+	period = local64_read(&hwc->period_left);
			
 
				+	if (period) {
			
 
				+		if (period < 0)
			
 
				+			period = 10000;
			
 
				 
			
 
				-			local64_set(&hwc->period_left, 0);
			
 
				-		} else {
			
 
				-			period = max_t(u64, 10000, hwc->sample_period);
			
 
				-		}
			
 
				-		__hrtimer_start_range_ns(&hwc->hrtimer,
			
 
				+		local64_set(&hwc->period_left, 0);
			
 
				+	} else {
			
 
				+		period = max_t(u64, 10000, hwc->sample_period);
			
 
				+	}
			
 
				+	__hrtimer_start_range_ns(&hwc->hrtimer,
			
 
				 				ns_to_ktime(period), 0,
			
 
				 				HRTIMER_MODE_REL_PINNED, 0);
			
 
				-	}
			
 
				 }
			
 
				 
			
 
				 static void perf_swevent_cancel_hrtimer(struct perf_event *event)
			
 
				 {
			
 
				 	struct hw_perf_event *hwc = &event->hw;
			
 
				 
			
 
				-	if (hwc->sample_period) {
			
 
				+	if (is_sampling_event(event)) {
			
 
				 		ktime_t remaining = hrtimer_get_remaining(&hwc->hrtimer);
			
 
				 		local64_set(&hwc->period_left, ktime_to_ns(remaining));
			
 
				 
			
@@ -5184,8 +5307,61 @@ static void free_pmu_context(struct pmu *pmu)
 
				 out:
			
 
				 	mutex_unlock(&pmus_lock);
			
 
				 }
			
 
				+static struct idr pmu_idr;
			
 
				+
			
 
				+static ssize_t
			
 
				+type_show(struct device *dev, struct device_attribute *attr, char *page)
			
 
				+{
			
 
				+	struct pmu *pmu = dev_get_drvdata(dev);
			
 
				+
			
 
				+	return snprintf(page, PAGE_SIZE-1, "%d\n", pmu->type);
			
 
				+}
			
 
				+
			
 
				+static struct device_attribute pmu_dev_attrs[] = {
			
 
				+       __ATTR_RO(type),
			
 
				+       __ATTR_NULL,
			
 
				+};
			
 
				+
			
 
				+static int pmu_bus_running;
			
 
				+static struct bus_type pmu_bus = {
			
 
				+	.name		= "event_source",
			
 
				+	.dev_attrs	= pmu_dev_attrs,
			
 
				+};
			
 
				+
			
 
				+static void pmu_dev_release(struct device *dev)
			
 
				+{
			
 
				+	kfree(dev);
			
 
				+}
			
 
				+
			
 
				+static int pmu_dev_alloc(struct pmu *pmu)
			
 
				+{
			
 
				+	int ret = -ENOMEM;
			
 
				+
			
 
				+	pmu->dev = kzalloc(sizeof(struct device), GFP_KERNEL);
			
 
				+	if (!pmu->dev)
			
 
				+		goto out;
			
 
				+
			
 
				+	device_initialize(pmu->dev);
			
 
				+	ret = dev_set_name(pmu->dev, "%s", pmu->name);
			
 
				+	if (ret)
			
 
				+		goto free_dev;
			
 
				+
			
 
				+	dev_set_drvdata(pmu->dev, pmu);
			
 
				+	pmu->dev->bus = &pmu_bus;
			
 
				+	pmu->dev->release = pmu_dev_release;
			
 
				+	ret = device_add(pmu->dev);
			
 
				+	if (ret)
			
 
				+		goto free_dev;
			
 
				+
			
 
				+out:
			
 
				+	return ret;
			
 
				+
			
 
				+free_dev:
			
 
				+	put_device(pmu->dev);
			
 
				+	goto out;
			
 
				+}
			
 
				 
			
 
				-int perf_pmu_register(struct pmu *pmu)
			
 
				+int perf_pmu_register(struct pmu *pmu, char *name, int type)
			
 
				 {
			
 
				 	int cpu, ret;
			
 
				 
			
@@ -5195,13 +5371,38 @@ int perf_pmu_register(struct pmu *pmu)
 
				 	if (!pmu->pmu_disable_count)
			
 
				 		goto unlock;
			
 
				 
			
 
				+	pmu->type = -1;
			
 
				+	if (!name)
			
 
				+		goto skip_type;
			
 
				+	pmu->name = name;
			
 
				+
			
 
				+	if (type < 0) {
			
 
				+		int err = idr_pre_get(&pmu_idr, GFP_KERNEL);
			
 
				+		if (!err)
			
 
				+			goto free_pdc;
			
 
				+
			
 
				+		err = idr_get_new_above(&pmu_idr, pmu, PERF_TYPE_MAX, &type);
			
 
				+		if (err) {
			
 
				+			ret = err;
			
 
				+			goto free_pdc;
			
 
				+		}
			
 
				+	}
			
 
				+	pmu->type = type;
			
 
				+
			
 
				+	if (pmu_bus_running) {
			
 
				+		ret = pmu_dev_alloc(pmu);
			
 
				+		if (ret)
			
 
				+			goto free_idr;
			
 
				+	}
			
 
				+
			
 
				+skip_type:
			
 
				 	pmu->pmu_cpu_context = find_pmu_context(pmu->task_ctx_nr);
			
 
				 	if (pmu->pmu_cpu_context)
			
 
				 		goto got_cpu_context;
			
 
				 
			
 
				 	pmu->pmu_cpu_context = alloc_percpu(struct perf_cpu_context);
			
 
				 	if (!pmu->pmu_cpu_context)
			
 
				-		goto free_pdc;
			
 
				+		goto free_dev;
			
 
				 
			
 
				 	for_each_possible_cpu(cpu) {
			
 
				 		struct perf_cpu_context *cpuctx;
			
@@ -5245,6 +5446,14 @@ unlock:
 
				 
			
 
				 	return ret;
			
 
				 
			
 
				+free_dev:
			
 
				+	device_del(pmu->dev);
			
 
				+	put_device(pmu->dev);
			
 
				+
			
 
				+free_idr:
			
 
				+	if (pmu->type >= PERF_TYPE_MAX)
			
 
				+		idr_remove(&pmu_idr, pmu->type);
			
 
				+
			
 
				 free_pdc:
			
 
				 	free_percpu(pmu->pmu_disable_count);
			
 
				 	goto unlock;
			
@@ -5264,6 +5473,10 @@ void perf_pmu_unregister(struct pmu *pmu)
 
				 	synchronize_rcu();
			
 
				 
			
 
				 	free_percpu(pmu->pmu_disable_count);
			
 
				+	if (pmu->type >= PERF_TYPE_MAX)
			
 
				+		idr_remove(&pmu_idr, pmu->type);
			
 
				+	device_del(pmu->dev);
			
 
				+	put_device(pmu->dev);
			
 
				 	free_pmu_context(pmu);
			
 
				 }
			
 
				 
			
@@ -5273,6 +5486,13 @@ struct pmu *perf_init_event(struct perf_event *event)
 
				 	int idx;
			
 
				 
			
 
				 	idx = srcu_read_lock(&pmus_srcu);
			
 
				+
			
 
				+	rcu_read_lock();
			
 
				+	pmu = idr_find(&pmu_idr, event->attr.type);
			
 
				+	rcu_read_unlock();
			
 
				+	if (pmu)
			
 
				+		goto unlock;
			
 
				+
			
 
				 	list_for_each_entry_rcu(pmu, &pmus, entry) {
			
 
				 		int ret = pmu->event_init(event);
			
 
				 		if (!ret)
			
@@ -5737,6 +5957,12 @@ SYSCALL_DEFINE5(perf_event_open,
 
				 	list_add_tail(&event->owner_entry, &current->perf_event_list);
			
 
				 	mutex_unlock(&current->perf_event_mutex);
			
 
				 
			
 
				+	/*
			
 
				+	 * Precalculate sample_data sizes
			
 
				+	 */
			
 
				+	perf_event__header_size(event);
			
 
				+	perf_event__id_header_size(event);
			
 
				+
			
 
				 	/*
			
 
				 	 * Drop the reference on the group_event after placing the
			
 
				 	 * new event on the sibling_list. This ensures destruction
			
@@ -6089,6 +6315,12 @@ inherit_event(struct perf_event *parent_event,
 
				 	child_event->ctx = child_ctx;
			
 
				 	child_event->overflow_handler = parent_event->overflow_handler;
			
 
				 
			
 
				+	/*
			
 
				+	 * Precalculate sample_data sizes
			
 
				+	 */
			
 
				+	perf_event__header_size(child_event);
			
 
				+	perf_event__id_header_size(child_event);
			
 
				+
			
 
				 	/*
			
 
				 	 * Link it up in the child's context:
			
 
				 	 */
			
@@ -6320,7 +6552,7 @@ static void __cpuinit perf_event_init_cpu(int cpu)
 
				 	mutex_unlock(&swhash->hlist_mutex);
			
 
				 }
			
 
				 
			
 
				-#ifdef CONFIG_HOTPLUG_CPU
			
 
				+#if defined CONFIG_HOTPLUG_CPU || defined CONFIG_KEXEC
			
 
				 static void perf_pmu_rotate_stop(struct pmu *pmu)
			
 
				 {
			
 
				 	struct perf_cpu_context *cpuctx = this_cpu_ptr(pmu->pmu_cpu_context);
			
@@ -6374,6 +6606,26 @@ static void perf_event_exit_cpu(int cpu)
 
				 static inline void perf_event_exit_cpu(int cpu) { }
			
 
				 #endif
			
 
				 
			
 
				+static int
			
 
				+perf_reboot(struct notifier_block *notifier, unsigned long val, void *v)
			
 
				+{
			
 
				+	int cpu;
			
 
				+
			
 
				+	for_each_online_cpu(cpu)
			
 
				+		perf_event_exit_cpu(cpu);
			
 
				+
			
 
				+	return NOTIFY_OK;
			
 
				+}
			
 
				+
			
 
				+/*
			
 
				+ * Run the perf reboot notifier at the very last possible moment so that
			
 
				+ * the generic watchdog code runs as long as possible.
			
 
				+ */
			
 
				+static struct notifier_block perf_reboot_notifier = {
			
 
				+	.notifier_call = perf_reboot,
			
 
				+	.priority = INT_MIN,
			
 
				+};
			
 
				+
			
 
				 static int __cpuinit
			
 
				 perf_cpu_notify(struct notifier_block *self, unsigned long action, void *hcpu)
			
 
				 {
			
@@ -6402,14 +6654,45 @@ void __init perf_event_init(void)
 
				 {
			
 
				 	int ret;
			
 
				 
			
 
				+	idr_init(&pmu_idr);
			
 
				+
			
 
				 	perf_event_init_all_cpus();
			
 
				 	init_srcu_struct(&pmus_srcu);
			
 
				-	perf_pmu_register(&perf_swevent);
			
 
				-	perf_pmu_register(&perf_cpu_clock);
			
 
				-	perf_pmu_register(&perf_task_clock);
			
 
				+	perf_pmu_register(&perf_swevent, "software", PERF_TYPE_SOFTWARE);
			
 
				+	perf_pmu_register(&perf_cpu_clock, NULL, -1);
			
 
				+	perf_pmu_register(&perf_task_clock, NULL, -1);
			
 
				 	perf_tp_register();
			
 
				 	perf_cpu_notifier(perf_cpu_notify);
			
 
				+	register_reboot_notifier(&perf_reboot_notifier);
			
 
				 
			
 
				 	ret = init_hw_breakpoint();
			
 
				 	WARN(ret, "hw_breakpoint initialization failed with: %d", ret);
			
 
				 }
			
 
				+
			
 
				+static int __init perf_event_sysfs_init(void)
			
 
				+{
			
 
				+	struct pmu *pmu;
			
 
				+	int ret;
			
 
				+
			
 
				+	mutex_lock(&pmus_lock);
			
 
				+
			
 
				+	ret = bus_register(&pmu_bus);
			
 
				+	if (ret)
			
 
				+		goto unlock;
			
 
				+
			
 
				+	list_for_each_entry(pmu, &pmus, entry) {
			
 
				+		if (!pmu->name || pmu->type < 0)
			
 
				+			continue;
			
 
				+
			
 
				+		ret = pmu_dev_alloc(pmu);
			
 
				+		WARN(ret, "Failed to register pmu: %s, reason %d\n", pmu->name, ret);
			
 
				+	}
			
 
				+	pmu_bus_running = 1;
			
 
				+	ret = 0;
			
 
				+
			
 
				+unlock:
			
 
				+	mutex_unlock(&pmus_lock);
			
 
				+
			
 
				+	return ret;
			
 
				+}
			
 
				+device_initcall(perf_event_sysfs_init);
			
--- a/kernel/power/suspend.c
+++ b/kernel/power/suspend.c
@@ -22,6 +22,7 @@
 
				 #include <linux/mm.h>
			
 
				 #include <linux/slab.h>
			
 
				 #include <linux/suspend.h>
			
 
				+#include <trace/events/power.h>
			
 
				 
			
 
				 #include "power.h"
			
 
				 
			
@@ -201,6 +202,7 @@ int suspend_devices_and_enter(suspend_state_t state)
 
				 	if (!suspend_ops)
			
 
				 		return -ENOSYS;
			
 
				 
			
 
				+	trace_machine_suspend(state);
			
 
				 	if (suspend_ops->begin) {
			
 
				 		error = suspend_ops->begin(state);
			
 
				 		if (error)
			
@@ -229,6 +231,7 @@ int suspend_devices_and_enter(suspend_state_t state)
 
				  Close:
			
 
				 	if (suspend_ops->end)
			
 
				 		suspend_ops->end();
			
 
				+	trace_machine_suspend(PWR_EVENT_EXIT);
			
 
				 	return error;
			
 
				 
			
 
				  Recover_platform:
			
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -8293,8 +8293,6 @@ void __init sched_init(void)
 
				 		zalloc_cpumask_var(&cpu_isolated_map, GFP_NOWAIT);
			
 
				 #endif /* SMP */
			
 
				 
			
 
				-	perf_event_init();
			
 
				-
			
 
				 	scheduler_running = 1;
			
 
				 }
			
 
				 
			
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -745,21 +745,21 @@ static struct ctl_table kern_table[] = {
 
				 		.extra1		= &zero,
			
 
				 		.extra2		= &one,
			
 
				 	},
			
 
				-#endif
			
 
				-#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86) && !defined(CONFIG_LOCKUP_DETECTOR)
			
 
				 	{
			
 
				-		.procname       = "unknown_nmi_panic",
			
 
				-		.data           = &unknown_nmi_panic,
			
 
				+		.procname       = "nmi_watchdog",
			
 
				+		.data           = &watchdog_enabled,
			
 
				 		.maxlen         = sizeof (int),
			
 
				 		.mode           = 0644,
			
 
				-		.proc_handler   = proc_dointvec,
			
 
				+		.proc_handler   = proc_dowatchdog_enabled,
			
 
				 	},
			
 
				+#endif
			
 
				+#if defined(CONFIG_X86_LOCAL_APIC) && defined(CONFIG_X86)
			
 
				 	{
			
 
				-		.procname       = "nmi_watchdog",
			
 
				-		.data           = &nmi_watchdog_enabled,
			
 
				+		.procname       = "unknown_nmi_panic",
			
 
				+		.data           = &unknown_nmi_panic,
			
 
				 		.maxlen         = sizeof (int),
			
 
				 		.mode           = 0644,
			
 
				-		.proc_handler   = proc_nmi_enabled,
			
 
				+		.proc_handler   = proc_dointvec,
			
 
				 	},
			
 
				 #endif
			
 
				 #if defined(CONFIG_X86)
			
--- a/kernel/sysctl_binary.c
+++ b/kernel/sysctl_binary.c
@@ -136,7 +136,6 @@ static const struct bin_table bin_kern_table[] = {
 
				 	{ CTL_INT,	KERN_IA64_UNALIGNED,		"ignore-unaligned-usertrap" },
			
 
				 	{ CTL_INT,	KERN_COMPAT_LOG,		"compat-log" },
			
 
				 	{ CTL_INT,	KERN_MAX_LOCK_DEPTH,		"max_lock_depth" },
			
 
				-	{ CTL_INT,	KERN_NMI_WATCHDOG,		"nmi_watchdog" },
			
 
				 	{ CTL_INT,	KERN_PANIC_ON_NMI,		"panic_on_unrecovered_nmi" },
			
 
				 	{}
			
 
				 };
			
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -69,6 +69,21 @@ config EVENT_TRACING
 
				 	select CONTEXT_SWITCH_TRACER
			
 
				 	bool
			
 
				 
			
 
				+config EVENT_POWER_TRACING_DEPRECATED
			
 
				+	depends on EVENT_TRACING
			
 
				+	bool "Deprecated power event trace API, to be removed"
			
 
				+	default y
			
 
				+	help
			
 
				+	  Provides old power event types:
			
 
				+	  C-state/idle accounting events:
			
 
				+	  power:power_start
			
 
				+	  power:power_end
			
 
				+	  and old cpufreq accounting event:
			
 
				+	  power:power_frequency
			
 
				+	  This is for userspace compatibility
			
 
				+	  and will vanish after 5 kernel iterations,
			
 
				+	  namely 2.6.41.
			
 
				+
			
 
				 config CONTEXT_SWITCH_TRACER
			
 
				 	bool
			
 
				 
			
--- a/kernel/trace/power-traces.c
+++ b/kernel/trace/power-traces.c
@@ -13,5 +13,8 @@
 
				 #define CREATE_TRACE_POINTS
			
 
				 #include <trace/events/power.h>
			
 
				 
			
 
				-EXPORT_TRACEPOINT_SYMBOL_GPL(power_frequency);
			
 
				+#ifdef CONFIG_EVENT_POWER_TRACING_DEPRECATED
			
 
				+EXPORT_TRACEPOINT_SYMBOL_GPL(power_start);
			
 
				+#endif
			
 
				+EXPORT_TRACEPOINT_SYMBOL_GPL(cpu_idle);
			
 
				 
			
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -21,17 +21,46 @@ typedef typeof(unsigned long [PERF_MAX_TRACE_SIZE / sizeof(unsigned long)])
 /* Count the events in use (per event id, not per instance) */
 static int	total_ref_count;
 
+static int perf_trace_event_perm(struct ftrace_event_call *tp_event,
+				 struct perf_event *p_event)
+{
+	/* No tracing, just counting, so no obvious leak */
+	if (!(p_event->attr.sample_type & PERF_SAMPLE_RAW))
+		return 0;
+
+	/* Some events are ok to be traced by non-root users... */
+	if (p_event->attach_state == PERF_ATTACH_TASK) {
+		if (tp_event->flags & TRACE_EVENT_FL_CAP_ANY)
+			return 0;
+	}
+
+	/*
+	 * ...otherwise raw tracepoint data can be a severe data leak,
+	 * only allow root to have these.
+	 */
+	if (perf_paranoid_tracepoint_raw() && !capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	return 0;
+}
+
 static int perf_trace_event_init(struct ftrace_event_call *tp_event,
 				 struct perf_event *p_event)
 {
 	struct hlist_head __percpu *list;
-	int ret = -ENOMEM;
+	int ret;
 	int cpu;
 
+	ret = perf_trace_event_perm(tp_event, p_event);
+	if (ret)
+		return ret;
+
 	p_event->tp_event = tp_event;
 	if (tp_event->perf_refcount++ > 0)
 		return 0;
 
+	ret = -ENOMEM;
+
 	list = alloc_percpu(struct hlist_head);
 	if (!list)
 		goto fail;
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -27,6 +27,12 @@
 
 DEFINE_MUTEX(event_mutex);
 
+DEFINE_MUTEX(event_storage_mutex);
+EXPORT_SYMBOL_GPL(event_storage_mutex);
+
+char event_storage[EVENT_STORAGE_SIZE];
+EXPORT_SYMBOL_GPL(event_storage);
+
 LIST_HEAD(ftrace_events);
 LIST_HEAD(ftrace_common_fields);
 
--- a/kernel/trace/trace_export.c
+++ b/kernel/trace/trace_export.c
@@ -83,13 +83,19 @@ static void __always_unused ____ftrace_check_##name(void)	\
 
 #undef __array
 #define __array(type, item, len)					\
-	BUILD_BUG_ON(len > MAX_FILTER_STR_VAL);				\
-	ret = trace_define_field(event_call, #type "[" #len "]", #item,	\
+	do {								\
+		BUILD_BUG_ON(len > MAX_FILTER_STR_VAL);			\
+		mutex_lock(&event_storage_mutex);			\
+		snprintf(event_storage, sizeof(event_storage),		\
+			 "%s[%d]", #type, len);				\
+		ret = trace_define_field(event_call, event_storage, #item, \
 				 offsetof(typeof(field), item),		\
 				 sizeof(field.item),			\
 				 is_signed_type(type), FILTER_OTHER);	\
-	if (ret)							\
-		return ret;
+		mutex_unlock(&event_storage_mutex);			\
+		if (ret)						\
+			return ret;					\
+	} while (0);
 
 #undef __array_desc
 #define __array_desc(type, container, item, len)			\
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -57,6 +57,8 @@ static int __init hardlockup_panic_setup(char *str)
 {
 	if (!strncmp(str, "panic", 5))
 		hardlockup_panic = 1;
+	else if (!strncmp(str, "0", 1))
+		no_watchdog = 1;
 	return 1;
 }
 __setup("nmi_watchdog=", hardlockup_panic_setup);
@@ -548,13 +550,13 @@ static struct notifier_block __cpuinitdata cpu_nfb = {
 	.notifier_call = cpu_callback
 };
 
-static int __init spawn_watchdog_task(void)
+void __init lockup_detector_init(void)
 {
 	void *cpu = (void *)(long)smp_processor_id();
 	int err;
 
 	if (no_watchdog)
-		return 0;
+		return;
 
 	err = cpu_callback(&cpu_nfb, CPU_UP_PREPARE, cpu);
 	WARN_ON(notifier_to_errno(err));
@@ -562,6 +564,5 @@ static int __init spawn_watchdog_task(void)
 	cpu_callback(&cpu_nfb, CPU_ONLINE, cpu);
 	register_cpu_notifier(&cpu_nfb);
 
-	return 0;
+	return;
 }
-early_initcall(spawn_watchdog_task);
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -173,7 +173,8 @@ config LOCKUP_DETECTOR
 	  An NMI is generated every 60 seconds or so to check for hardlockups.
 
 config HARDLOCKUP_DETECTOR
-	def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI
+	def_bool LOCKUP_DETECTOR && PERF_EVENTS && HAVE_PERF_EVENTS_NMI && \
+		 !ARCH_HAS_NMI_WATCHDOG
 
 config BOOTPARAM_SOFTLOCKUP_PANIC
 	bool "Panic (Reboot) On Soft Lockups"
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -214,17 +214,22 @@ ifdef BUILD_C_RECORDMCOUNT
 # The empty.o file is created in the make process in order to determine
 #  the target endianness and word size. It is made before all other C
 #  files, including recordmcount.
-cmd_record_mcount = if [ $(@) != "scripts/mod/empty.o" ]; then			\
-			$(objtree)/scripts/recordmcount "$(@)";			\
-		    fi;
+sub_cmd_record_mcount =					\
+	if [ $(@) != "scripts/mod/empty.o" ]; then	\
+		$(objtree)/scripts/recordmcount "$(@)";	\
+	fi;
 else
-cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
+sub_cmd_record_mcount = set -e ; perl $(srctree)/scripts/recordmcount.pl "$(ARCH)" \
 	"$(if $(CONFIG_CPU_BIG_ENDIAN),big,little)" \
 	"$(if $(CONFIG_64BIT),64,32)" \
 	"$(OBJDUMP)" "$(OBJCOPY)" "$(CC) $(KBUILD_CFLAGS)" \
 	"$(LD)" "$(NM)" "$(RM)" "$(MV)" \
 	"$(if $(part-of-module),1,0)" "$(@)";
 endif
+cmd_record_mcount = 						\
+	if [ "$(findstring -pg,$(_c_flags))" = "-pg" ]; then	\
+		$(sub_cmd_record_mcount)			\
+	fi;
 endif
 
 define rule_cc_o_c
--- a/tools/perf/Documentation/perf-annotate.txt
+++ b/tools/perf/Documentation/perf-annotate.txt
@@ -24,12 +24,47 @@ OPTIONS
 --input=::
         Input file name. (default: perf.data)
 
+-d::
+--dsos=<dso[,dso...]>::
+        Only consider symbols in these dsos.
+-s::
+--symbol=<symbol>::
+        Symbol to annotate.
+
+-f::
+--force::
+        Don't complain, do it.
+
+-v::
+--verbose::
+        Be more verbose. (Show symbol address, etc)
+
+-D::
+--dump-raw-trace::
+        Dump raw trace in ASCII.
+
+-k::
+--vmlinux=<file>::
+        vmlinux pathname.
+
+-m::
+--modules::
+        Load module symbols. WARNING: use only with -k and LIVE kernel.
+
+-l::
+--print-line::
+        Print matching source lines (may be slow).
+
+-P::
+--full-paths::
+        Don't shorten the displayed pathnames.
+
 --stdio:: Use the stdio interface.
 
 --tui:: Use the TUI interface Use of --tui requires a tty, if one is not
 	present, as when piping to other commands, the stdio interface is
 	used. This interfaces starts by centering on the line with more
-	samples, TAB/UNTAB cycles thru the lines with more samples.
+	samples, TAB/UNTAB cycles through the lines with more samples.
 
 SEE ALSO
 --------
--- a/tools/perf/Documentation/perf-buildid-list.txt
+++ b/tools/perf/Documentation/perf-buildid-list.txt
@@ -18,6 +18,9 @@ perf report.
 
 OPTIONS
 -------
+-H::
+--with-hits::
+        Show only DSOs with hits.
 -i::
 --input=::
         Input file name. (default: perf.data)
--- a/tools/perf/Documentation/perf-diff.txt
+++ b/tools/perf/Documentation/perf-diff.txt
@@ -19,6 +19,18 @@ If no parameters are passed it will assume perf.data.old and perf.data.
 
 OPTIONS
 -------
+-M::
+--displacement::
+        Show position displacement relative to baseline.
+
+-D::
+--dump-raw-trace::
+        Dump raw trace in ASCII.
+
+-m::
+--modules::
+        Load module symbols. WARNING: use only with -k and LIVE kernel
+
 -d::
 --dsos=::
 	Only consider symbols in these dsos. CSV that understands
@@ -42,7 +54,7 @@ OPTIONS
 --field-separator=::
 
 	Use a special separator character and don't pad with spaces, replacing
-	all occurances of this separator in symbol names (and other output)
+	all occurrences of this separator in symbol names (and other output)
 	with a '.' character, that thus it's the only non valid separator.
 
 -v::
@@ -50,6 +62,13 @@ OPTIONS
 	Be verbose, for instance, show the raw counts in addition to the
 	diff.
 
+-f::
+--force::
+       Don't complain, do it.
+
+--symfs=<directory>::
+        Look for files with symbols relative to this directory.
+
 SEE ALSO
 --------
 linkperf:perf-record[1]
--- a/tools/perf/Documentation/perf-kvm.txt
+++ b/tools/perf/Documentation/perf-kvm.txt
@@ -22,7 +22,7 @@ There are a couple of variants of perf kvm:
   a performance counter profile of guest os in realtime
   of an arbitrary workload.
 
-  'perf kvm record <command>' to record the performance couinter profile
+  'perf kvm record <command>' to record the performance counter profile
   of an arbitrary workload and save it into a perf data file. If both
   --host and --guest are input, the perf data file name is perf.data.kvm.
   If there is  no --host but --guest, the file name is perf.data.guest.
@@ -40,6 +40,12 @@ There are a couple of variants of perf kvm:
 
 OPTIONS
 -------
+-i::
+--input=::
+        Input file name.
+-o::
+--output::
+        Output file name.
 --host=::
         Collect host side performance profile.
 --guest=::