ONX : Open Node Syntax

The original idea behind ONX was to create an effective solution for some of the issues surrounding the current XML (Extensible Markup Language) recommendation while retaining many of XML's strengths. When XML was originally released, it was as a subset of SGML (Standard Generalized Markup Language) that could be more easily adopted by the Internet community. Since the primary use of SGML today is document-oriented markup (papers, books, and other publications), it was reasonable to assume that XML would have a similar goal in mind. Just one of many examples of this usage today is XHTML, which is designed to replace HTML.

What was not expected was that the world would find uses for XML that its creators had not fully anticipated. Some of the most promising (most hyped, at any rate) uses include standards such as XML-RPC and SOAP. Both of these standards have a common goal: define a standardized way to send requests and receive responses between applications over a network. Using XML as the basis for these standards solves several problems that the techniques traditionally used in the past suffered from. For instance, since XML is a text-based standard, incompatibilities between operating systems that tend to revolve around binary format differences are all but totally removed. Since XML is human-readable, development, debugging, and implementation are easier. Since XML is “self-describing”, such concepts as incremental revisions and version control can be implemented without breaking or having to rewrite applications that understand older versions of the standard.

However, there are a few problems with XML that make it a limiting technology in certain areas. The first problem is that XML is not a very compact way of representing information. For instance, it would be possible to take an XML document and rework it into a proprietary binary format that would be much smaller even without the use of modern compression techniques. The second problem is that, while XML is easy to read, it is not in an optimal format for a parser to process and interpret. This means that more processing time is spent than may be necessary to get to the information stored in the document. Certainly, these weaknesses can be a non-issue when editing a document such as a resume or book. However, when XML is used in a standard like SOAP, both of these weaknesses can be problems when considering issues such as bandwidth and processor usage (to name only a few).
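
To make the compactness point concrete, here is a minimal sketch that encodes the same small record once as XML and once as a packed binary structure, then compares the sizes. The record, its field names, and the binary layout are all hypothetical and chosen only for illustration; this is not the ONX format, just an example of the kind of proprietary binary encoding described above.

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical record, invented for this illustration.
RECORD = {"id": 1042, "quantity": 3, "price": 19.99, "sku": "AB-1234"}

# Assumed binary layout: uint32 id, uint16 quantity, float64 price,
# uint8 sku length, followed by the sku bytes.
# "<" selects little-endian byte order with no padding.
HEADER = "<IHdB"


def to_xml(rec):
    """Serialize the record as a flat XML element, one child per field."""
    root = ET.Element("order")
    for key, value in rec.items():
        ET.SubElement(root, key).text = str(value)
    return ET.tostring(root, encoding="utf-8")


def to_binary(rec):
    """Serialize the record using the assumed fixed header plus sku bytes."""
    sku = rec["sku"].encode("utf-8")
    header = struct.pack(HEADER, rec["id"], rec["quantity"], rec["price"], len(sku))
    return header + sku


def from_binary(data):
    """Decode a record produced by to_binary()."""
    rec_id, quantity, price, sku_len = struct.unpack_from(HEADER, data, 0)
    offset = struct.calcsize(HEADER)
    sku = data[offset:offset + sku_len].decode("utf-8")
    return {"id": rec_id, "quantity": quantity, "price": price, "sku": sku}


if __name__ == "__main__":
    xml_bytes = to_xml(RECORD)
    bin_bytes = to_binary(RECORD)
    print("XML size in bytes:   ", len(xml_bytes))
    print("Binary size in bytes:", len(bin_bytes))
```

The trade-off is visible in the code itself: the binary form is smaller, but a reader must already know the layout in `HEADER` to decode it, whereas the XML form names every field inline.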

For example, when one computer uses SOAP to communicate with another computer over the Internet, a certain amount of bandwidth is used for the transmission. If the recipient is a server that handles several requests at a time, there is only so much bandwidth available to receive all of the requests. Had those same requests arrived in a more compact format, that same bandwidth would potentially be better utilized, which in turn would allow the server to handle more requests over any given period of time.

Further, once the recipient gets the request, processing must be done. The more processing that must go into the parsing of the XML request, the less processing is available for the actual contents of the request. As with the bandwidth issue, a more compact format designed for optimized parsing would free up more processing time to handle the actual request. Again, the server would potentially be able to handle more requests over any given period of time.

So what can be done to solve these issues? Traditionally, most implementations have chosen compact binary formats that were proprietary in nature (e.g. Win32 RPC calls, EDI, custom TCP messages). It would still be possible to go this route, but then we would lose many of the advantages of XML that were mentioned above. We could stay with XML and improve the hardware that supports it, but that can be very costly. We could instead take the middle ground: use a markup language that has some of the advantages of XML and some of the advantages of the more compact (and proprietary) binary formats. This concept is what ONX is all about.

Links

ONX Specification : This is the working draft of the ONX specification. At this point, I think the remaining changes mostly concern making the document clearer and more readable. As for the technical parts, I don't think any more major changes are in store (with the exceptions mentioned at the end of the document).

ONX Event Parser : This is a C++ parser written to process ONX infoblocks. It is similar (in concept) to SAX and James Clark's expat, but is much simpler. Note that I wrote it in an evening, hence its simplicity. There are also a few example programs that show how it is used. Try it out.
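
For readers unfamiliar with the event-driven model that SAX and expat use, the sketch below shows the general idea using Python's standard binding to expat. It parses XML, not ONX, and does not reflect the ONX parser's actual API; the function name `collect_events` and the sample document are assumptions made for illustration. The parser never builds a tree. It simply calls a handler each time it encounters the start of an element, the end of an element, or a run of text.

```python
import xml.parsers.expat


def collect_events(document):
    """Parse an XML string and return the sequence of parser events."""
    events = []

    def on_start(name, attrs):
        # Called when an opening tag is read.
        events.append(("start", name, dict(attrs)))

    def on_end(name):
        # Called when a closing tag is read.
        events.append(("end", name))

    def on_text(data):
        # Called for character data; whitespace-only runs are skipped.
        if data.strip():
            events.append(("text", data))

    parser = xml.parsers.expat.ParserCreate()
    parser.buffer_text = True  # deliver contiguous text in a single call
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(document, True)  # True marks this as the final chunk
    return events


if __name__ == "__main__":
    for event in collect_events('<order id="7"><sku>AB-1234</sku></order>'):
        print(event)
```

Because the application only reacts to events as they arrive, a parser of this kind can stay small and needs little memory, which fits the simplicity described above.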