Sfoglia il codice sorgente

xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region.

I have observed cases where the implicit stop_machine_destroy() done by
stop_machine() hangs while destroying the workqueues, specifically in
kthread_stop(). This seems to be because timer ticks are not restarted
until after stop_machine() returns.

Fortunately stop_machine provides a facility to pre-create/post-destroy
the workqueues so use this to ensure that workqueues are only destroyed
after everything is really up and running again.

I only actually observed this failure with 2.6.30. It seems that newer
kernels are somehow more robust against doing kthread_stop() without timer
interrupts (I tried some backports of some likely looking candidates but
did not track down the commit which added this robustness). However this
change seems like a reasonable belt&braces thing to do.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com>
Cc: Stable Kernel <stable@kernel.org>
Ian Campbell 15 anni fa
parent
commit
b4606f2165
1 ha cambiato i file con 11 aggiunte e 1 eliminazioni
  1. 11 1
      drivers/xen/manage.c

+ 11 - 1
drivers/xen/manage.c

@@ -79,6 +79,12 @@ static void do_suspend(void)
 
 	shutting_down = SHUTDOWN_SUSPEND;
 
+	err = stop_machine_create();
+	if (err) {
+		printk(KERN_ERR "xen suspend: failed to setup stop_machine %d\n", err);
+		goto out;
+	}
+
 #ifdef CONFIG_PREEMPT
 	/* If the kernel is preemptible, we need to freeze all the processes
 	   to prevent them from being in the middle of a pagetable update
@@ -86,7 +92,7 @@ static void do_suspend(void)
 	err = freeze_processes();
 	if (err) {
 		printk(KERN_ERR "xen suspend: freeze failed %d\n", err);
-		goto out;
+		goto out_destroy_sm;
 	}
 #endif
 
@@ -129,7 +135,11 @@ out_resume:
 out_thaw:
 #ifdef CONFIG_PREEMPT
 	thaw_processes();
+
+out_destroy_sm:
 #endif
+	stop_machine_destroy();
+
 out:
 	shutting_down = SHUTDOWN_INVALID;
 }