xen-devel

[Xen-devel] blktap race against xenstore startup

To: "xen-devel@xxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxx>
Subject: [Xen-devel] blktap race against xenstore startup
From: "Stephen C. Tweedie" <sct@xxxxxxxxxx>
Date: Thu, 28 Sep 2006 23:17:54 +0100
Cc: Julian Chesterfield <jac90@xxxxxxxxx>, Steven Rostedt <rostedt@xxxxxxxxxxx>, Andrew Warfield <andrew.warfield@xxxxxxxxxxxx>
Hi all,

With the various blktap fixes I've recently posted, blktap runs
reliably... the *second* time we start xend.  First time, blktapctrl
just dies on init.

It turns out that get_dom_domid() is SEGVing.  It calls

        e = xs_directory(h, xth, "/local/domain", &num);

and then iterates over the results to find the domain with the right
name (in this case, "Domain-0", which should be easy to find!)  Trouble
is, it's racing with xenstore startup: the first time it makes this
call, it gets back ENOENT (easily seen in an strace).  xs_directory()
then returns e=NULL, the loop indexes into it, and everything falls apart.

I have "fixed" it locally with the following terrible hack:

+       for (i = 0; i < 10; i++) {
+               e = xs_directory(h, xth, "/local/domain", &num);
+               if (e)
+                       break;
+               sleep(1);
+       }
        
-       e = xs_directory(h, xth, "/local/domain", &num);
-       
-       for (i = 0; (i < num) && (domid == NULL); i++) {
+       for (i = 0; e && (i < num) && (domid == NULL); i++) {

which just loops calling xs_directory() with a 1-second pause in between
until it returns something sensible, giving up after ten attempts.
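
For illustration only, the same bounded-retry idea can be factored out
of get_dom_domid() into a small helper.  This is a minimal sketch, not
the blktap code: retry_list(), list_fn, struct fake_store and
fake_directory() are all hypothetical names, and fake_directory() is a
stand-in that fails with ENOENT a fixed number of times so the example
builds without libxenstore.  A real caller would wrap xs_directory()
in a function of the same shape.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical callback shape: returns a non-NULL listing on success and
 * NULL on failure, the way xs_directory() does while xenstore is starting. */
typedef char **(*list_fn)(void *ctx, unsigned int *num);

/* Call fn up to max_tries times, sleeping delay_s seconds between failed
 * attempts.  Returns the listing, or NULL if every attempt failed; on
 * failure *num is forced to 0 so a later "i < num" loop cannot run. */
static char **retry_list(list_fn fn, void *ctx, unsigned int *num,
                         int max_tries, unsigned int delay_s)
{
    char **e = NULL;
    int i;

    for (i = 0; i < max_tries; i++) {
        e = fn(ctx, num);
        if (e)
            break;
        if (i + 1 < max_tries)   /* no point sleeping after the last try */
            sleep(delay_s);
    }
    if (!e)
        *num = 0;
    return e;
}

/* Stand-in for the store (an assumption for this sketch only). */
struct fake_store {
    int failures_left;   /* how many calls should still fail */
    int calls;           /* total calls made so far */
};

/* Fails with ENOENT until failures_left reaches zero, then returns a
 * one-entry listing containing "0" (the domain id of Domain-0). */
static char **fake_directory(void *ctx, unsigned int *num)
{
    static char *entries[] = { "0" };
    struct fake_store *s = ctx;

    s->calls++;
    if (s->failures_left > 0) {
        s->failures_left--;
        errno = ENOENT;
        return NULL;
    }
    *num = 1;
    return entries;
}
```

With a store that fails twice, retry_list(fake_directory, &s, &num, 10, 1)
succeeds on the third call.  This is still polling, so it shares the
basic weakness of the hack above: it only narrows the race window
rather than synchronising with xenstore.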

Ugh.  There has got to be a better way to synchronise with the initial
population of the dom0 information into xenstore, surely?  Has no other
component of the Xen stack ever seen this before?

--Stephen


