xen-devel

RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug

To: "Graham, Simon" <Simon.Graham@xxxxxxxxxxx>, "Keir Fraser" <Keir.Fraser@xxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug
From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>
Date: Wed, 31 Jan 2007 13:42:17 +0800
>From: Graham, Simon [mailto:Simon.Graham@xxxxxxxxxxx]
>Sent: January 31, 2007 3:29
>> On 30/1/07 09:54, "Tian, Kevin" <kevin.tian@xxxxxxxxx> wrote:
>>
>> > Another simple approach to trigger such warning is to let
>> > __xen_suspend() jumps to smp_resume immediately after
>> > smp_suspend, as a test case for suspend cancel. People can
>> > observe all vcpus except vcpu0 fall into that warning frequently.
>>
>> Do you know if this problem has been observed across many versions
>> of Xen or, e.g., only after the upgrade to 2.6.18?
>>
>
>I'm not sure but I think that we've been seeing something very similar
>when live migrating domains (with 3.0.3/2.6.16.29) -- my understanding is
>that the live migration code takes the domain down to UP, does the
>migration and then restores SMP -- we VERY often see soft lockup
>messages following this (several times per night in our regression
>testing) with stack traces identical to those posted by Kevin.
>
>I also added some instrumentation and in every single case, the 'stolen'
>time is > 5s when we see the soft lockup.
>
>Simon

Hi, Simon,
Your case should be different from what I saw. It may be fixed by the
original patch I posted, which however doesn't apply to the latest
tree. In the 2.6.16 version, it is do_timer that calls softlockup_tick,
rather than run_local_timers. So the check on "stolen > 5s" comes a bit
too late: the warning can still be printed even though the watchdog is
adjusted afterwards. Could you try the attached patch to see whether
it fixes your live migration case?
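
To illustrate the ordering issue, here is a minimal user-space sketch.
It is not the attached patch and not the kernel code: every name in it
(vcpu_watchdog, watchdog_tick, timer_interrupt, simulate) and both
threshold values are assumptions made up for the illustration. It only
shows that if the watchdog tick runs before the stolen-time check
touches the watchdog, a warning still fires once, whereas checking
stolen time first suppresses it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical values for this sketch only; not taken from the kernel. */
enum { LOCKUP_THRESHOLD_S = 10, STOLEN_LIMIT_S = 5 };

/* Hypothetical per-vcpu watchdog state. */
struct vcpu_watchdog {
    long last_touch_s;  /* time the watchdog was last touched */
    int  warnings;      /* soft lockup warnings printed so far */
};

static void watchdog_touch(struct vcpu_watchdog *w, long now_s)
{
    w->last_touch_s = now_s;
}

/* Stand-in for the per-tick soft lockup check. */
static void watchdog_tick(struct vcpu_watchdog *w, long now_s)
{
    if (now_s - w->last_touch_s > LOCKUP_THRESHOLD_S) {
        w->warnings++;              /* the warning is printed here */
        watchdog_touch(w, now_s);   /* avoid repeating it every tick */
    }
}

/*
 * Stand-in for the timer interrupt path. With check_stolen_first ==
 * false the tick runs before the stolen-time check (the late-check
 * ordering); with true the check runs before the tick.
 */
static void timer_interrupt(struct vcpu_watchdog *w, long now_s,
                            long stolen_s, bool check_stolen_first)
{
    if (check_stolen_first && stolen_s > STOLEN_LIMIT_S)
        watchdog_touch(w, now_s);

    watchdog_tick(w, now_s);

    if (!check_stolen_first && stolen_s > STOLEN_LIMIT_S)
        watchdog_touch(w, now_s);   /* too late, warning already counted */
}

/*
 * A vcpu is offline for offline_s seconds, all accounted as stolen
 * time, then takes its first timer interrupt. Returns the number of
 * warnings raised.
 */
int simulate(long offline_s, bool check_stolen_first)
{
    struct vcpu_watchdog w = { .last_touch_s = 0, .warnings = 0 };

    assert(offline_s >= 0);
    timer_interrupt(&w, offline_s, offline_s, check_stolen_first);
    return w.warnings;
}
```

With these made-up numbers, simulate(30, false) counts one warning and
simulate(30, true) counts none, which is the difference the ordering
makes in this toy model.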

Thanks,
Kevin

Attachment: fix_softlockup_2616.patch
