AgileDelta CTO John Schneider discusses Efficient XML™ with InfoWorld Columnist Jon Udell

Jon Udell: Hi, this is Jon Udell. My guest for today's podcast is John Schneider. He's the CTO of AgileDelta, was one of the driving forces behind E4X, or ECMAScript for XML, and he's currently evangelizing Efficient XML, a proposal for an alternative binary representation of XML. In this conversation we discussed the motivations for Efficient XML, its theoretical foundations, and its practical applications. (start of interview) Hey, John, how are you doing?

John Schneider: I'm doing well, how are you doing, Jon?

Udell: I'm pretty good. I thought that we could use this time to clarify the principles of Efficient XML. I've kind of read through your white paper and the stuff that's out there, and I realized that I'm vague on some of the core concepts here, and so, if I am, then I am sure a lot of other people are too. It's probably a good idea to just review, for people who don't have the context, where this comes from, and what the motivation for it was, and what the near-term application areas that you see are.

Schneider: The core motivation for Efficient XML is really to expand the XML community, and to extend the reach of XML to a wide range of new applications -- applications where it couldn't be used before. XML has been so phenomenally successful, and there's this great thriving community, and tons of tools in the commercial marketplace. There's lots of competition which drives quality and drives prices. It's a really good thing for everybody who can tap into it, but there are a wide range of applications that can't tap into it, that require a certain level of efficiency. Wireless applications, a lot of times, can't really tap into XML because of the bandwidth requirements. Unlike processing power, bandwidth is actually quite expensive, and doesn't evolve at the same pace, so it's somewhat of a persistent problem. It doesn't follow Moore's Law like a lot of other things do.

Udell: Right, but you should probably elaborate a little bit on what you mean by "wireless applications", because I think it's not quite the same as what a lot of people will think, which is "I'm in Starbucks and I have pretty good bandwidth."

Schneider: I'm focused really on all wireless applications, but some of them feel the pain more than others. So, we're talking about hand-held mobile devices and the like -- PDAs, and Smartphones, and mass market consumer phones. Connection points where people would like to get at a wide range of information. If I'm in the airport, I want to know what's happening with my flights while I'm sitting in the restaurant, I don't want to have to get up and go look at the kiosk every time.

Udell: But isn't it the case that, often in that scenario, you might have sufficient bandwidth but the constraint is the small memory of the device, and its lack of the kind of power that's needed to process the XML efficiently?

Schneider: When we talk about Efficient XML, we're talking about several different kinds of efficiency. Fundamentally, we're reducing the resource requirements for XML. So, a big one for people is reducing the requirement for bandwidth, but at the same time, we're reducing the requirement for processing power, we're reducing the requirement for memory, and CPU power. And something that's extremely important to mobile device vendors is we're reducing the requirements for battery life. The fewer CPU cycles you use, the less battery life you use, and more importantly, the less time that radio is on, communicating back and forth, the less battery life you use. The processing is certainly important, but actually when you look at the applications, and where they're constrained, more often we find out it's actually bandwidth over processing power.

So, as an example, I mentioned battery life. There were some studies that occurred at Helsinki University to look at how much battery life a radio uses, versus how much battery life the CPU uses. And what they found is, on average, every byte sent over the radio used the same amount of battery life as 100,000 to one million processor instructions. So, the big thing when you want to preserve battery life is actually to keep that radio off, and have it on less of the time, and send less data. Similarly, it's interesting, there's been a lot of little studies in the past about binary XML and exactly how much it gets you, and most of them seem to be conducted in a lab, where you have a high-bandwidth network connecting two computers, side-by-side. In that environment, you're not going to be network-bound. Your CPUs are going to be operating at full speed and not blocked on a network. But as soon as you add a wireless node, or a node that's remote, or as soon as you add more than one person hitting the server, you find that all of a sudden your applications get network-bound. And when they're network-bound, it doesn't matter how fast you make the CPU, or how fast you make the process, it's going to be sitting there, waiting idle, while it's trying to send things over the network.

Size matters, so if you can make the data really, really small, you're going to get a lot more efficiency, instead of getting something two times as fast, you can actually get something 50 times faster. Several of our customers are getting results right now where systems are 100 times faster, especially on a wireless network.
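To put the battery figure quoted above into numbers, here is a small Python sketch. The 100,000 to one million instructions-per-byte range comes from the interview; the two message sizes are hypothetical and chosen only for illustration.

```python
# Figures quoted in the interview: one byte sent over the radio costs about
# as much battery as 100,000 to 1,000,000 processor instructions.
INSTRUCTIONS_PER_BYTE_LOW = 100_000
INSTRUCTIONS_PER_BYTE_HIGH = 1_000_000


def radio_cost_in_instructions(num_bytes):
    """Return the (low, high) instruction-equivalents of sending num_bytes."""
    return (num_bytes * INSTRUCTIONS_PER_BYTE_LOW,
            num_bytes * INSTRUCTIONS_PER_BYTE_HIGH)


# Hypothetical message sizes (assumptions, not measurements).
text_size = 10_000      # bytes as text XML
compact_size = 1_000    # bytes in a more compact encoding

bytes_saved = text_size - compact_size
saved_low, saved_high = radio_cost_in_instructions(bytes_saved)

print(f"Bytes not sent: {bytes_saved}")
print(f"Battery saved is comparable to {saved_low:,} to {saved_high:,} instructions")
```

Under these assumptions, an encoder could spend a very large number of CPU cycles per message and still come out ahead on battery life, which is the trade-off Schneider describes.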

Udell: So, let's talk about ways of making things smaller, then. I'm looking at your proposal right now and reading through the part about schema-based compression. As I understand it, you will leverage an XML schema if available, but you'll still compress very effectively if there isn't one. Since an awful lot of XML is schema-less, I'd be interested to hear, sort of, on both sides of that. What the story is, first with and then without.

Schneider: Sure. To start with, we did a lot of listening about people's experiences with prior attempts at binary XML. You know from other things that I've worked on, I'm a big fan of things that are late-bound, and I'm not a huge fan of things that are completely early-bound, because a lot of times they tend to be brittle. Early-bound works for some applications, but you need to be flexible. And, so, Efficient XML actually follows that model. You can be as early-bound or as late-bound as you like. You make the decision on how much schema information you want to leverage. But, unlike prior binary XML attempts, everything still works if the data doesn't follow the schemas that you're using. It will handle arbitrary deviations from the schema. If you drop in some new chunks of XML, it's OK, and your applications can leverage those things, or ignore those things, or whatever they choose to do with them.

So, you can have an environment where it's completely schema-less, or you can have an environment where you're leveraging big, hefty enterprise schemas that some people have, or you can have an environment where you're leveraging some schemas, and then some information doesn't have schema about it. So, maybe you choose to just use the SOAP schema for your envelope, but the payload doesn't have any schema at all. Or vice versa, if that's what you want to do.

Udell: OK, in the case where you are completely schema-less, are you then no more effective than Zip, or if you are, how are you?

Schneider: It's actually quite a bit more effective than Zip. The reason for that goes back to the roots of our research in Efficient XML. Information theory, which is what compression, and Zip, and things like that are all based off of, something formed by Claude Shannon back in the '50s, tells us that the minimum size of a piece of information is a function of what you know about the data. Zip, and algorithms like that, actually learn about the data by analyzing it, which of course takes time and processor resources. So, they'll go across the data, look for certain patterns, they'll find things that are occurring frequently, and come up with smaller-bit representations of those frequently occurring things, which on average gives you a big size reduction. But the minimum size of a piece of information is a function of what you know about the data.

We have a head start on Zip, even without the schemas, because we know it's XML, and the XML grammar tells us a lot about what's likely to occur in a document. So, without analyzing the data, we already know what's going to happen. We know, for example, that the first thing that's going to happen in a document is, the XML grammar tells us it's going to be a comment, or a processing instruction, or a doc type, or a start element -- those kinds of things. Where Zip doesn't know anything about that, it actually has to learn about all these concepts by analyzing the data, and if the data's not very big, it doesn't actually learn enough of anything to have any effect on it.
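The point about small documents can be tried with Python's standard zlib module, which implements DEFLATE, the algorithm most Zip tools use. This is only a sketch with a made-up one-element message; it says nothing about how Efficient XML itself encodes the same data.

```python
import zlib

# A hypothetical, very small XML message.
doc = b"<t>21.5</t>"

# DEFLATE has to learn everything from the data itself, and a message this
# short gives it nothing to learn from.
packed = zlib.compress(doc, 9)

print(len(doc), "bytes as text")
print(len(packed), "bytes after zlib")
```

For a message this short, the compressor's fixed overhead outweighs anything it can learn, so the output is larger than the input.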

Udell: Just the fact you know you've seen a start tag and you're kind of expecting the companion end tag gives you a little bit of a leg-up.

Schneider: Right. There are some other things that we do as well. One thing that's particularly surprising to people, in fact they just don't believe it when they hear it, if they have a background in information theory, is with Efficient XML we can actually beat the information theoretic lower bound on the size of a piece of information. It sounds like you're breaking the laws of physics, but you're actually not. The reason for that is because with information theory, they define the information theoretic lower bound, and I know we're going kind of deep here, based on a channel. A channel is just a chunk of bits. You can analyze that channel and there's a theoretical minimum bound on how small you can make it. With XML, there are clever ways that we can look, because we understand the structure of the data, we can look at the data and say: "Well, we know what we have here is not really one channel, but it's several channels, and each of those independent channels are more regular than the whole channel taken together."

Udell: So, what would be an example of a sub-channel within a document that somebody might be familiar with?

Schneider: Well, let's take a real simple example. Let's just say we've got a couple of elements in there, and some of them have numbers in them, and some of them have letters in them. And they're just interstrewn throughout the documents, you know, you get some numbers and some letters, and then some letters and some numbers, etc., and they're all in separate elements. So, first of all, the letters and the numbers won't necessarily compress well together, because they are not very similar.

Udell: Right.

Schneider: OK, so if you take the whole thing as one channel, then you will actually find that you do not get very good compression on it, depending on the size and how much data you have. But if you separate those two channels, I have numbers on one channel and I have letters on another channel, both of those things will compress, independently, better than they will together.
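The numbers-and-letters example can be approximated with a general-purpose compressor. The Python sketch below uses zlib as a stand-in (it is not the Efficient XML algorithm) and randomly generated values as hypothetical document content. It compresses the same values once as a single interleaved channel and once as two separate channels.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable

DIGITS = "0123456789"
LETTERS = "abcdefghijklmnopqrstuvwxyz"


def random_value(alphabet, length=8):
    """Build one element value from the given alphabet."""
    return "".join(rng.choice(alphabet) for _ in range(length))


# Hypothetical content: numeric and alphabetic element values alternate.
numbers = [random_value(DIGITS) for _ in range(5000)]
words = [random_value(LETTERS) for _ in range(5000)]

# One channel: values in document order, numbers and letters interleaved.
mixed = "".join(n + w for n, w in zip(numbers, words)).encode("ascii")

# Two channels: all the numbers together, all the letters together.
number_channel = "".join(numbers).encode("ascii")
letter_channel = "".join(words).encode("ascii")

one_channel_size = len(zlib.compress(mixed, 9))
two_channel_size = (len(zlib.compress(number_channel, 9))
                    + len(zlib.compress(letter_channel, 9)))

print("one channel: ", one_channel_size, "bytes")
print("two channels:", two_channel_size, "bytes")
```

Both runs carry exactly the same characters. The two-channel total comes out smaller because each channel draws on a narrower alphabet, so the compressor can assign shorter codes within it.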

Udell: OK. So if you had, for example, namespaces within a document, and what was in one namespace tended to be numeric, then that would be really helpful, right?

Schneider: Exactly. You can divide things up by namespaces, you can divide them up by QName. There are lots of interesting ways you can chop things up to keep much higher compression than if you treat it as a single channel.

Udell: OK, OK.

Schneider: So as a result, what we found is that sometimes not only can we beat the information theoretic lower bound, but we can actually beat it by like five times sometimes.

Udell: Wow.

Schneider: So even in cases where -- usually when you have really large, regular documents, very repetitive large documents, WinZip will just tear those things apart, right?

Udell: Mm-hmm.

Schneider: But we can still get about five times smaller or three times smaller even than that, on those documents. And WinZip, in those cases, is actually getting close to the information theoretic lower bound, because of the way the algorithms are structured. So that is very surprising to people.

Udell: That is very interesting. Then we add in schematization, and what then becomes possible, and why?

Schneider: Let me jump back for a second. Efficient XML has a characteristic that none of the other binary XML attempts have had, which is that it is completely data-driven, and it is completely grammar-driven in particular. So schema-based encoding is exactly the same algorithm and exactly the same thing as schema-less encoding. The difference is that in one case you have the XML grammar, and in the other case you have a grammar that is more informed by schema. So it is just some additional information about what is likely to occur in the stream.
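One way to picture "same algorithm, different grammar" is a toy encoder that is handed a grammar as data and spends just enough bits at each step to say which of the currently allowed events came next. The Python sketch below is an assumption-laden illustration, not the Efficient XML format: the event labels (`SE(...)`, `CH`, `EE`, `CM`, `PI`), the state names, and both grammars are hypothetical, and it counts only the bits that identify events, not the bits for the values themselves.

```python
from math import ceil, log2


def bits_for_choice(num_alternatives):
    """Bits needed to pick one of num_alternatives options."""
    return ceil(log2(num_alternatives)) if num_alternatives > 1 else 0


def encoded_event_bits(events, grammar):
    """Total bits spent identifying events under the given grammar.

    grammar maps a state name to the list of (event_label, next_state)
    pairs that are allowed in that state.
    """
    state = "start"
    total = 0
    for event in events:
        alternatives = grammar[state]
        labels = [label for label, _ in alternatives]
        index = labels.index(event)  # raises ValueError if not allowed
        total += bits_for_choice(len(alternatives))
        state = alternatives[index][1]
    return total


# Toy generic grammar: any of seven events may occur at any point.
GENERIC_EVENTS = ["SE(order)", "SE(id)", "SE(qty)", "CH", "EE", "CM", "PI"]
generic_grammar = {"start": [(label, "start") for label in GENERIC_EVENTS]}

# Toy schema-informed grammar: <order> holds <id>, then an optional <qty>.
schema_grammar = {
    "start": [("SE(order)", "order")],
    "order": [("SE(id)", "id")],
    "id": [("CH", "id_text")],
    "id_text": [("EE", "after_id")],
    "after_id": [("SE(qty)", "qty"), ("EE", "done")],
    "qty": [("CH", "qty_text")],
    "qty_text": [("EE", "after_qty")],
    "after_qty": [("EE", "done")],
    "done": [],
}

# Events for <order><id>...</id><qty>...</qty></order>
events = ["SE(order)", "SE(id)", "CH", "EE", "SE(qty)", "CH", "EE", "EE"]

generic_bits = encoded_event_bits(events, generic_grammar)
schema_bits = encoded_event_bits(events, schema_grammar)
print("generic grammar:", generic_bits, "bits")
print("schema grammar: ", schema_bits, "bits")
```

The encoder function is identical in both calls; only the grammar table changes. With the schema-informed table, most steps have a single allowed event and cost nothing, and the only bit spent is at the point where `<qty>` is optional.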

Udell: You don't really treat those as different cases.

Schneider: No you don't. They are the same. What we have done is we have unified a whole bunch of different techniques, and that is very important for mobile devices because you can have a very simple, small, data-driven algorithm that you can put on devices that do not have much memory.

If I have to go out and build a separate parser for the schema case, and the schema-less case, and the fragment case, and the case where I have schema deviations, then I am going to end up with this huge pile of complex code that will not ever fit on a mobile device. Effectively, what you end up doing is implementing four or five formats instead of one format. In fact, that is the approach that a lot of people have tried to take in the past and say, "Okay, we need a schema-less encoding. We need a schema encoding. What happens when we get schema deviations? How do we handle that?" And you end up with kind of a big pile of code that will not fit on a mobile device.

Fundamentally, what we are trying to achieve, which is also different from what others were trying to achieve, is to say, "There is no application anymore where XML cannot be used. We want to achieve the efficiency of a hand-optimized binary format with XML." So any application that needs that level of efficiency now has no excuse not to start using the open XML standards, and the tools, and tapping into the XML community. Everything we have been focused on has been aimed in that direction.

It actually works out particularly well for several of our customers that have invested huge amounts of dollars in proprietary binary formats that they needed because they need that level of efficiency; and now they can start using inexpensive open standards in those places, and they can start getting interoperability with all the rest of their XML tools.

Udell: I have a sense of what you mean when you say that this stuff can be internally binary but look to XML processors like what they expect. If you could sort of unpack that for me, I would really appreciate it, and I bet a bunch of other people would, too.

Schneider: I imagine so. It is a common question that we get, is how can it be binary and XML both? Really, what we have done with Efficient XML is to provide an alternate syntax for expressing XML data. There is a binary way to express the data or a text way to express the data, and you can always convert from one to the other and back if needed, and there is a straightforward transform for doing that. There is a one-to-one correspondence between what is an element in one form and what is an element in the other form.

We try to make this as invisible as possible, so we embed it in the lowest possible layer. So what you have is now a parser, or a serializer, or we tend to call it a codec, that knows XML, and knows Efficient XML, and your application gets exactly the same stream of events it always did. So if your application sits on top of the standard DOM interface, the Document Object Model interface, then it will get a DOM tree, like it always did, and it does not necessarily know or care whether that data came across in binary form or text form. It gets exactly the same DOM tree.

It's the same thing with SAX. So, it sits underneath the SAX interface, and you get a stream of SAX events, or you generate a stream of SAX events from your program, and you do not necessarily care whether it is coming or going in more efficient form or not.

There is also support for StAX, there is support for JAXP. On the .NET platform there is support for the APIs there as well. The idea is that the programmers have already climbed the learning curve and learned all about XML, all of the XML technologies, all the XML APIs, and so they should just be able to read and write the same way that they always did. It just happens to work more efficiently.
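The layering described here, where a codec sits underneath SAX and the application sees the same events either way, can be sketched in Python with the standard `xml.sax` module. The "binary" format below is a made-up opcode/length/payload encoding invented purely for this illustration; it is not Efficient XML, and the names `toy_encode` and `parse_toy_binary` are hypothetical.

```python
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl

START, TEXT, END = 1, 2, 3
OPCODES = {"start": START, "text": TEXT, "end": END}


class EventRecorder(ContentHandler):
    """Stands in for application code: records the SAX events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def characters(self, content):
        # A parser may deliver text in pieces, so merge adjacent chunks.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))

    def endElement(self, name):
        self.events.append(("end", name))


def toy_encode(events):
    """Pack events as opcode byte, length byte, UTF-8 payload (toy format)."""
    out = bytearray()
    for kind, value in events:
        payload = value.encode("utf-8")  # assumes payloads under 256 bytes
        out += bytes([OPCODES[kind], len(payload)]) + payload
    return bytes(out)


def parse_toy_binary(data, handler):
    """Decode the toy format and drive the same SAX handler interface."""
    pos = 0
    while pos < len(data):
        opcode, length = data[pos], data[pos + 1]
        value = data[pos + 2:pos + 2 + length].decode("utf-8")
        pos += 2 + length
        if opcode == START:
            handler.startElement(value, AttributesImpl({}))
        elif opcode == TEXT:
            handler.characters(value)
        elif opcode == END:
            handler.endElement(value)


# Source 1: ordinary text XML through the standard parser.
from_text = EventRecorder()
xml.sax.parseString(b"<flight><gate>B12</gate></flight>", from_text)

# Source 2: the same content arriving in the toy binary form.
from_binary = EventRecorder()
parse_toy_binary(toy_encode(from_text.events), from_binary)

print(from_text.events == from_binary.events)
```

The `EventRecorder` class never learns which source produced its events, which is the property Schneider is describing for DOM and SAX applications.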

Udell: So the trick here is the "appropriately modified implementation" of DOM or SAX or what have you.

Schneider: That's right. With something like JAXP, for example, in Java, there is a pluggable infrastructure, so you can just say, "I am going to plug in my new parser, and now my application knows Efficient XML and XML."

Udell: OK.

Schneider: Now, there is a really cool case that we are able to do with web services. A lot of our customers are using standard SOAP web services. The nice thing there is, because of the standards that have been put in place with deployment descriptors and WSDLs and schemas and the like, you can actually just drop in a SOAP handler, and the SOAP handler plugs itself in, and all you really have to do is drop in a library, modify a configuration file, and now your web services can use either XML or Efficient XML.

Our efficient web services solution also has content negotiation built into it. So what happens is, you plug this in, it is invisible, your application logic is the same - you don't have to make a single change to it, and it will automatically go out and figure out what other clients or servers support Efficient XML. It will talk Efficient XML with them, it will talk regular XML with everybody else, and if a new one comes online, it will see that and it will actually switch to Efficient XML.

So the more nodes you have on your network that support Efficient XML, the more scalable that network becomes, and the faster things go; but it is not required by any node in the network, and it switches invisibly.
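The interview does not say how the negotiation is actually carried out, so the following Python sketch only illustrates the general idea: prefer the compact form when the peer advertises it, and fall back to text XML otherwise. The format label `application/x-efficient-xml` and the function `choose_format` are hypothetical names introduced for this example.

```python
TEXT_XML = "application/xml"
COMPACT_XML = "application/x-efficient-xml"  # hypothetical label for this sketch


def choose_format(peer_supported, local_preference=(COMPACT_XML, TEXT_XML)):
    """Pick the first locally preferred format the peer also supports.

    Falls back to text XML, which every node is assumed to understand.
    """
    for fmt in local_preference:
        if fmt in peer_supported:
            return fmt
    return TEXT_XML


# An older peer that only advertises text XML.
print(choose_format({TEXT_XML}))

# A peer that has been upgraded and advertises both forms.
print(choose_format({TEXT_XML, COMPACT_XML}))
```

Because the choice is made per peer, a node that comes online with support for the compact form is picked up the next time it is contacted, with no change to application logic.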

Udell: Where do Efficient XML-aware parsers and processors come from? You are supplying some. Other people will be able to supply others at some point. This gets to sort of the standards question and the evolution of this along that dimension. Where are things at the moment, where do you hope things will be in a year or two?

Schneider: Well, in a year or two I hope that we will find it everywhere. The idea is that any XML content, and a year or two is probably a little bit optimistic, but the idea is that you would like to have any XML content to be available in this form, so that you can access that content from any kind of device, or any kind of wireless network, or any kind of embedded system, or automotive system, or set-top box, or for Pete's sake, the refrigerators nowadays are connected.

But from any device you want to, you can access that data, and you can access it with the efficiency needed to support those mobile and wireless applications.

Udell: OK.

Schneider: We are a software company, and we sell software, so we do have several different implementations of Efficient XML that our customers are using now, for Java and Java mobile devices, and .NET and .NET mobile devices. We are heavily involved in the standards process. We proposed Efficient XML a few years ago, so we are about a few years down the road, we have got probably a couple of years left to go before we have a standard in hand.

What we are doing in our implementations is tracking the standard as it progresses, so that our customers can start using the software now if they would like to. They can make sure that it actually works for their applications; they can give us feedback that we drive into the standards process, so that it is based on implementation experience as opposed to a paper standard that has never been implemented.

Udell: To your knowledge, who outside of AgileDelta has so far done an implementation of Efficient XML?

Schneider: Nobody outside AgileDelta has done an implementation yet.

Udell: OK.

Schneider: They have not actually had the wherewithal to do that yet, but that is what we are working towards in the World Wide Web Consortium.

Udell: OK. So right now, the folks who are using say, the .NET implementation, are using it as a licensed binary from you.

Schneider: Yes, that's right.

Udell: OK.

Schneider: It is a license. It is a perpetual license, so they buy it and they own it forever. We also realize that people want to track the standards, so they generally come with free updates to the software for the first year. Then if people want to go beyond that, they can choose to do that as well.

Udell: OK. The uses to which it has been put so far are, it sounds, quite broad, right? -- ranging from optimizing the flow of web services traffic to intermittently connected handheld devices in a military environment.

Schneider: Yes. Our objective with this was really to achieve interoperability. I mean, really, the whole objective is to expand the group of applications, or maximize the group of applications that can use XML to communicate or to interoperate. So we wanted to make sure, if you are going to maximize the number of applications that can actually use this, you need to make sure that it actually works for the wide variety of use cases.

So it has been used for web services, it has been used in high-volume message routers, it has been put on aircraft, it has been put in vehicles, it has been used for satellite broadcast. It has been put in a whole bunch of different mobile devices, PDAs and the like. It has been put in application servers and web services. There are simple HTTP proxies you can actually point your browser at that will now enable those browsers to do Efficient XML or XML.

In fact, we did that on some aircraft with some customers, where they wanted a real quick way to get this into their system. They had some large pieces of data that they wanted to access, so we just put in an HTTP proxy on the aircraft and back in their control center, and now all of a sudden they are browsing megabyte documents over an airborne wireless network, but they are going across the network in about 11k, so about 100 times smaller in that case.

Udell: Right, and then sort of on the other end of the spectrum, there is this very interesting possibility. You and I have probably talked about this, but I know I have also discussed it with John Shewchuk at Microsoft a few times. He is very big on this idea that if you can kind of abstract away the differences between inter-process communication on a single box, and inter-process communication across the network, then that starts to get really interesting.

Schneider: Yeah, you have got this very interesting distributed system at that point. A distributed system like that is almost always network-bound, and so you have really got to address the points of pain there.

Udell: Now I am trying to scope out who else is involved.

Schneider: We have got a good list. Some of them are customers of a binary XML format, like Chevron/Texaco is in there. Some of them are application server vendors, like IBM is on the group. Some of them are mobile device vendors, so we have got Nokia and Siemens on the group, for example. I am not going to remember all of them, but we have got a good-sized group.

This is the second group we have stood up to do this in the W3C, and I know in the previous group we had about 27 different companies, which is about as big as you want the group to be.

That group was responsible for identifying our requirements and use cases and the like. Then the second group, that we have got set up now, is to actually define the standard.

So the first group was to really look at it in detail, look at a lot of use cases, look at all the requirements, and make a few decisions. The decisions were: is this really valuable, is it something that the W3C should do, and is it possible? -- because there were a lot of people that thought that you could not really come up with a single binary format that actually served all the diverse needs of the XML community and all the different things that people wanted to do. That was actually our objective from the start, was to come up with a single, general-purpose format that was actually very optimal for a wide range of use cases.

Udell: Right.

Schneider: Not easy, by the way. We did have to break from the traditional approaches that people have taken, and take this approach that is kind of a marriage between information theory and formal language theory. Efficient XML is actually the only binary format that has actually gone down that route, and the only one that actually achieves the results it does as well.

Udell: Is Microsoft in this or not in this? I can't find any...

Schneider: Microsoft is not a member of that group. They have been interested, and they are watching very closely; I will say that.

Udell: Right. Well my hat is off to anyone who can navigate the standards process. I know you have been there before, really a number of times.

Schneider: Yes, that is true.

Udell: That is a world that I have observed from the outside. Never been on the inside of it. I really can scarcely imagine the combination of skills, let's just put it that way, that is required.

Schneider: Well thank you, Jon. It does require a lot of different kinds of hats in that group, and what I call the "coopetition model" is always very interesting and dynamic, where there are a lot of different interests from different parties that are getting thrown into the mix. And bringing everybody to a point where they are all nodding up and down and going, "Yes, that is what we want," and then what they want is actually something good is sometimes challenging.

I was very proud of what we were able to accomplish with -- you know about the E4X group -- the ECMAScript for XML group.

Udell: Yup.

Schneider: That was a case where we had all the big vendors and browser folks involved, and at the end of it they were all nodding up and down, and I was really, really happy with the result. So to me that was kind of an existence proof that says you can have a standards process that works, that everybody gets kind of what they want, and still come up with a really good technology out the backside.

Udell: So from your company's perspective, you are going to be providing this in a variety of forms, or you already are, actually.

Schneider: Yes, we are.

Udell: Maybe you should talk a little bit about what those are, and the different footprints and configurations and so on.

Schneider: Sure. There are a whole bunch of different places where this applies. We have had a bunch of different implementations that are already completed, some that you can come to the web site and download today, there is a trial that you can go and download; and some that you will be able to see pretty soon. There is Java Enterprise Edition support, there is Java Standard Edition support, so that will fit into any application server or Java application.

You can plug it in as a codec, which basically means, there is a whole collection of APIs, all the standard XML APIs that you can just call directly.

Udell: Underneath all the Java normal XML stuff.

Schneider: That is exactly right. So if you are already doing DOM, or you are already doing SAX, or you are already doing StAX or JAXP, it is all in there. So if you already know all those interfaces, you can go in and start using this right away, plugging it into your application. You can access it directly through the APIs.

There is also a little command line interface that you can use as well. It is just kind of fun to play with, get an idea how it works, so you can encode documents, try different options. And it is a very quick way; within a few minutes you can be up and running with it, play with it and see how well it works.

There are also .NET releases that are going to be out very shortly, both for the server side and .NET compact framework. On the Java side, also, there is support for J2ME, very small footprint implementations that you can put on a wide variety of PDAs or Smartphones or whatever you might have that needs to get connected to the network.

Udell: In browsers? I mean, it seems like the DOM overhead is a huge drain on a lot of resources now that everything has gone Ajax.

Schneider: Yes, I certainly think that you will see it built into browsers. In the meantime, I mentioned that we have HTTP proxies. So any HTTP application you have, including SOAP or web browsing or whatever, you can put an HTTP proxy up; and with your browser, most people know how to configure a proxy nowadays. You just go up and configure a standard HTTP proxy that drives all your traffic through there.

Udell: Yup.

Schneider: If you are accessing Efficient XML files, or if the server sends you an Efficient XML file, it will automatically unpack that and it looks just like the XML. So the proxies, and that kind of integration, really drive home for people the fact that it is just a separate way to transport the data, because when you click on an Efficient XML file or you click on the XML file, you see exactly the same thing in the browser.

Udell: Is well-formedness the ante here? I mean, in other words, if I was shuttling XHTML around, and I had this proxy, would I be getting the benefit, and if it was not well formed, then I would not?

Schneider: Right now that is the case, yes.

Udell: OK.

Schneider: But it wouldn't be that difficult, actually, to support the other kinds. It is just a matter of time and resources to do it. As I mentioned before, it is a data-driven strategy that is driven off of grammars. So since that is declarative, they are very easy to change and modify to describe HTML just as well as XHTML. But obviously, XML is where most people are headed, so that is the first thing you see.

Udell: The John Shewchuk vision is that everything kind of looks like SOAP and everything plays with the same toolkits. And that if you want to talk to your USB thumb drive you are going to be just regarding it as a peer set of services in the same way that something that was out there in the cloud would look like a peer set of services, and that that unification of programming models would be a big benefit because we still think about things in very different ways. It's a kind of a lofty and somewhat remote idea, I suppose, but we could actually get there.

Schneider: That is exactly the vision that we are chasing after. The tag line that we run around with is: "Any data, any device, anywhere," and what it says is exactly that unification you are talking about there.

Udell: Yeah.

Schneider: There is absolutely no reason why any device should ever use anything other than XML. If you can make Efficient XML, which is what we have done, as efficient as something somebody can do by hand, then you look around and you go, "Why would I build something myself? It is really expensive to build and maintain. I do not have this huge user community and developer community out there. I don't have all the libraries that I can go download for free. I can't use standard web services." Why would you do it?

You no longer really have to sit down and hand-design your assembly language anymore. Now you can get a compiler to do it for you. It is the same thing with Efficient XML. You no longer would have to sit down and hand-design your own binary format; you have got a machine that will do it for you, and it is in general better than you are at it anyway. Machines are good at packing bits.

Udell: A lot of people would say, "Well yeah, but why do we just have to have this one hammer for every nail, and why does everything have to be XML?" Well, tell me what you think about this: I have this one slightly speculative but to me still quite interesting example of how that could matter.

So let's assume that we are in a world where things are nicely service-oriented, and we have started to do a pretty good job of factoring out things like policy from applications, so that let's say you have a certain policy in your organization that is going to control the flow of a certain type of XML document, subject to some compliance regulation.

Schneider: Right.

Udell: Now, you are in pretty good shape to the extent that you have got your desktop and server applications kind of flowing through these XML gateways, and policy application and enforcement. That is all starting to look pretty good. Then one day you say, "Oh crap, there are these USB thumb drives." Right? So now that policy, which I have factored out and made completely general in this other domain, needs to be special-case implemented for that, for this device that can get stuck into the computer and create this covert channel. On the other hand, if the mode of communication to that device is the same style of XML web services traffic, and is subject to the exact same methods of intermediation, then at least in principle, that policy can also govern that channel.

Schneider: That is absolutely right, yes.

Udell: I mean, I have been thinking about this for a couple of years. I have yet to see an example of something like that that I could point to. I don't know if you have, I would be interested to hear about it. But it certainly seems like a good goal.

Schneider: Yes. You think about that problem: "Now I have got this USB thing, and if I can't use the things that I have already invested in and I have to develop something new, Number One, it is going to take a long time; Number Two, it is going to be expensive; Number Three, it is not going to work very well with everything else."

Udell: Yup.

Schneider: And I might have to build a little gateway between it, to communicate with other things, right? And whenever I build a gateway I have got a versioning problem, because versions change on both sides, and I am going to have to continue to maintain that thing, to go from version one to two, and two to one, and three to two, etc., and I might drop some data if one version evolves faster than the other.

So having everything using a common format, and a common way to communicate, is very valuable. It is one language for all devices. A lot of the incentives are really economic, and time-based, right? I can do things a lot faster and I can do things a lot less expensively if I can use XML, because I can use that community -- re-use, or build on top of that community, stand on top of the shoulders of the giants that built XML. Why not leverage all that work? It's fantastic.

So yeah, that is really the core philosophy behind Efficient XML, is to get it out there and say, "There is really no reason not to use XML." It is like I said, we have actually got it plugged into aircraft, and vehicles, and all these things that just used to use proprietary binary formats, because it's all they could do.

Now they can build things so much faster. We have got a couple of customers that are working on airborne web services right now, where they are sharing a lot of different kinds of data off of aircraft, and they do not have to build it all themselves. It used to be a huge, expensive process, and now they are using standard web services engines. Things move a lot faster and they are a lot less expensive when you can leverage that XML community.

Udell: Well I know you have had a fascinating history, working with a variety of military applications. It is probably the case that you can say a lot less about those than you would like to; but it would sure be interesting to know as much as can be known about that.

Schneider: We are very fortunate to have several military and government customers. This is kind of the quintessential massive enterprise, if you will. They use XML all over the place, and they use it for all the same things everybody else does -- SOAP web services and the like. But they do have a lot of mobile devices and a lot of wireless devices, and they do have a lot of needs for getting information very quickly, and for tracking events very closely.

They also operate a large number of wireless networks, which is very expensive when you think about all the radios you have to put out there to create the bandwidth, and all the frequency management and everything else.

So they kind of play the role of the large enterprise. They kind of play the role of a carrier. They kind of play the role of a device manufacturer. They have got the whole ecosystem there. So it is not that surprising that we have had them knocking down our door to get this, if you will.

What I can say about the implementations that we have done with them is that, if you think about location-based services, you have a map that shows you where a whole bunch of different things are and shows you where you are. That is an important kind of application to our military customers, real-time situation awareness.

You can get that on an aircraft using XML. You can get that out at a vehicle at the end of a wireless network using XML. You can get it over a very low-bandwidth satellite communications link using XML. And you can put it in the enterprise, in the web servers, and in the web services, and in the pub-sub systems, and high-speed message routers and the like. So this is really something that they can plug in all over the place. They have spent enormous amounts of money on proprietary binary formats.

Udell: You cooked up a few of them for them at one time yourself, didn't you?

Schneider: Well I used to work with this customer quite a bit, and yes, before XML, I had worked with them to understand what their requirements were for all these different binary formats. So what kind of binary format do you need for a real-time system, for example? What kind of binary format do you need for aircraft? What kind of binary format do you need for vehicles? And at the time, they came up with a whole bunch of different ones, one for vehicles, one for aircraft. It was the same thing everybody has done. "I need a binary format that works for my environment. Let's define one that works there."

A couple of problems come up. Number one, it is just very, very expensive to build those formats and put them out there, and then have to build all your own tools that sit on top of it.

Udell: Oh, sure.

Schneider: I mean, how do you extract data from a binary format? Geez, I'd love to use XQuery or I would love to use E4X but I can't. I have to build that stuff myself. Number one, they are very expensive; and number two, since they are all different, they do not talk to one another without gateways, and then you have very expensive gateways.

Now, as soon as XML came out, I turned around and said, "Well I need to educate my customers on this," and I became kind of the XML evangelist. So I went around talking, I probably gave 100 talks in the first year, saying, "Everybody has got to move to XML. The economic benefits are just huge, and it is going to move a lot faster than anything you have ever done before." And that actually worked. A lot of people did move to XML. The people that could, did.

Udell: Right, and then there were those that couldn't.

Schneider: There were those that couldn't, and then there were also some that did, that were going, "Hey, we moved like you told us to, and now we are going ten times slower than we were before. We don't like you very much anymore."

Probably, in those 100 talks in the first year, no kidding, in 99 percent of them somebody said, "Hey, what about bandwidth? This stuff looks a lot bigger than what we were doing before." And the truth is, everybody was doing something before they were doing XML, and the thing that they were doing was generally more efficient than XML. So the people that have moved have lost some of that efficiency, and that is the price you pay to get the economic benefits of the standard, and to get the kind of tempo of evolution that we have in the XML community.

Udell: And a lot of us can just kind of write that off because a lot of us are not in a situation, maybe the vast majority are not in the situation, where that is going to be an issue.

Schneider: That's right. So a lot of people can write it off, and a lot of people can say, "We did it. We bit the bullet, and now we are moving on." There is another group of people that say, "We did it. We bit the bullet, but we would really like some way to get back to our old performance." And then there were some that just could not move at all.

And in the commercial market there are actually more of those than people who did make the move. If you look at the mobile devices, for example. I just saw the recent IDC report that said this last quarter there were 250 million mobile devices sold. So they are on track to be a billion mobile devices sold by the end of the year. This has the potential to expand XML to five-to-ten times more nodes on the network, and if you think about Metcalfe's Law, and how the value of the network increases exponentially with the number of nodes on the network, that applies. The more people you get connected on their mobile devices, and the more people are able to access the content, the more valuable things get.

Udell: This has been really helpful. I think I have a reasonably good grasp of the technical foundations at this point. I will never have much of a grasp of the political nuances.

(laughter)

You know, I'm not sure, really, if I care about that. In broad strokes, this is, to me, obviously a good idea, and I am glad you are on the case, because I do not know too many people that, like I said, have the kind of combination of skills to sort of pull something like this off. It sounds like it is going well.

Schneider: It is going very well. Again, we are very proud of what we have been able to accomplish, and I appreciate you spending the time to talk and understand what we are doing. Thanks very much.

Udell: Sure.

About AgileDelta, Inc.
AgileDelta is the leading provider of software that pushes information to the edge of networks, elevating mobile devices to true 'smart clients'. Our products bridge the gap between high-speed business networks and mobile networks by dramatically improving the efficiency of XML data and reducing the footprint required to run standard web services, increasing the value and capability of mobile devices. AgileDelta is a privately held company with headquarters in Bellevue, Washington.

AgileDelta and Efficient XML are trademarks of AgileDelta, Inc. All other brand and product names are trademarks or registered trademarks of their respective holders.