xen: explicitly create/destroy stop_machine workqueues outside suspend/resume region.
I have observed cases where the implicit stop_machine_destroy() done by stop_machine() hangs while destroying the workqueues, specifically in kthread_stop(). This seems to be because timer ticks are not restarted until after stop_machine() returns. Fortunately stop_machine provides a facility to pre-create/post-destroy the workqueues so use this to ensure that workqueues are only destroyed after everything is really up and running again. I only actually observed this failure with 2.6.30. It seems that newer kernels are somehow more robust against doing kthread_stop() without timer interrupts (I tried some backports of some likely looking candidates but did not track down the commit which added this robustness). However this change seems like a reasonable belt&braces thing to do. Signed-off-by: Ian Campbell <ian.campbell@citrix.com> Signed-off-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@citrix.com> Cc: Stable Kernel <stable@kernel.org>
parent
65f63384
Please register or sign in to comment