XML and its Weight Problem

Because XML is heavy and expensive to process, JSON offers an alternative for transporting data.

I know. I’m playing with fire today. I am going to tell you why I don’t like XML and all the reasons it’s not a good way of transporting data on the factory floor.

XML has a weight problem, but then again, so do I. And just like XML, my weight problem is not because I can’t gain weight. I seem to be really good at that. I wish it were an Olympic sport because I’ve reached a point in my life where I gain weight listening to a Burger King commercial on the radio.

XML has a similar weight problem. It is overly complex, expensive in both bandwidth and processing, and not easily integrated with a lot of today’s programming languages. I know this sounds like heresy, but my defense, as always, is the truth.

XML has been touted as THE solution for a long time now. And in the IT world, XML solved a very difficult problem that a lot of people struggled with for a long time. One of the biggest problems was, of course, byte ordering. In the early days of computers there were two giants, and each one ordered bytes differently. Motorola ordered bytes Most Significant Byte (MSB) first, while Intel ordered bytes Least Significant Byte (LSB) first. That made it impossible to know if you could send data from one device to another without swapping bytes.
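
To make the byte-ordering problem concrete, here is a minimal Python sketch using the standard struct module. The 32-bit value is an arbitrary example of mine, not something from a real device. It packs the same number MSB-first and LSB-first, then shows what a receiver sees when it assumes the wrong order.

```python
import struct

value = 0x12345678  # arbitrary 32-bit example value (an assumption for illustration)

big_endian = struct.pack(">I", value)     # Most Significant Byte first
little_endian = struct.pack("<I", value)  # Least Significant Byte first

print(big_endian.hex())     # 12345678
print(little_endian.hex())  # 78563412

# A receiver that assumes the wrong byte order reads a different number.
misread = struct.unpack("<I", big_endian)[0]
print(hex(misread))  # 0x78563412
```

The bytes on the wire are identical in both cases. Only the agreement between sender and receiver decides what number they mean.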

On top of that, you had different standards for how the data was encoded. For example, some systems encoded Floating Point data one way, and others used a completely different format. There was no way to be sure how any data type was encoded, so everyone was writing very time-consuming and costly drivers to pass data from one system to another.

In the 1990s, with the rise of the Internet, this became a huge problem. It became impossible to code drivers fast enough for all the interfacing that needed to be done. A few innovators at Sun Microsystems led a group development effort to solve this problem, and XML was born.

XML is based on plain-text character transmission, ASCII in the simplest case. Every computer in the world understands the ASCII character set. By transmitting data as text characters in a well-defined format, you now had a universal way of moving data between any two systems.

Of course, XML is much more than that. XML is a document markup language that includes mechanisms to add attributes to data, structure parent-child relationships, define consistent names across applications, and much more. It is an almost-perfect solution to the data exchange problem in the IT world. And when you add the companion specifications to it like the Query Language, Style Sheets, Simple Object Access Protocol (SOAP) and the rest, you have a very powerful way of moving data between systems.

Unfortunately, all of that power comes with a price, and in Industrial Automation, the bill is as large as President Obama’s green fees and growing larger all the time. People like to disregard the deficiencies of XML by saying, “But you can send anything to anything because everything can parse XML.” That is true, but we’ve moved beyond that. Today, there are more and more requirements for moving data between systems. And not just log data, but sometimes some pretty high performance data. And that data needs to be assimilated into the destination application quickly. XML kind of falls apart when you start talking about those things.

It’s expensive to encode and decode XML files. All that ASCII requires a lot of processing power to create, and lots of buffer space. It was easy to pay that price when it was just different IT systems moving data, but now that embedded systems for Industrial Automation are also moving a lot of data, that price just got heftier. Decoding and encoding XML and processing all those ASCII characters isn’t a problem for a big Windows server, but it’s really processor- and memory-intensive for an embedded system. It’s a price that embedded system designers are unwilling to pay, and, luckily for us, there’s an alternative.
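
As a rough illustration of the buffer-space cost, the Python sketch below compares one reading stored as a raw 32-bit float against the same reading wrapped in an XML element. The reading value and the element name "temperature" are hypothetical, chosen only for this example.

```python
import struct

reading = 1234.5  # hypothetical sensor value

# The reading as a raw 32-bit float, the way a device might hold it in memory.
binary = struct.pack("<f", reading)

# The same reading as ASCII text inside an assumed <temperature> element.
as_xml = f"<temperature>{reading}</temperature>".encode("ascii")

print(len(binary))  # 4
print(len(as_xml))  # 33
```

That is one value. Multiply it by every tag in a real document and you can see where the buffer space goes on a small embedded processor.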

That alternative is JSON, and it stands for JavaScript Object Notation. It was designed out of a need for a stateful (able to understand current operating states) way to send data between a Browser and a Server device. In the early 2000s, a group of people at State Software Inc. developed JSON, basing it on a subset of the JavaScript language. But since that time, it’s been tweaked to be a completely language-independent data format.

Here’s the quick comparison between XML and JSON:

  • Both are Open standards. XML is older and more established, but JSON is growing in popularity. JSON (in my opinion) is going to become the de facto standard for moving embedded data to server and browser applications.
  • Both are interoperable. Virtually any system can support either data format.
  • XML is a document markup language. The biggest difference between JSON and XML is in how data is structured. In XML, your data has to be mapped to the XML document structure. The receiver has to decode and map that data back to its internal structure. That not only takes time, it’s also inefficient. In JSON, data is structured as arrays and records, the standard way that data is structured in most programming languages. JSON is much closer to how data is normally structured.
  • XML is really a document exchange language. It is really good at describing documents in an open, structured way. JSON is more of a data exchange language. It is much better at exchanging data between applications.
  • JSON is simpler, requires fewer constructs, and in JavaScript applications, is easier to implement than XML. JSON data can be converted to JavaScript objects with the language’s built-in JSON.parse function, without a separate parsing library.
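
The structural difference in the comparison above can be sketched in a few lines of Python, with the standard json and xml.etree.ElementTree modules standing in for whatever your platform provides. The record, the "scores" field, and the element names are all made up for illustration.

```python
import json
import xml.etree.ElementTree as ET

# The same hypothetical record in both formats.
json_text = '{"firstname": "Emily", "scores": [90, 85]}'
xml_text = (
    "<person><firstname>Emily</firstname>"
    "<scores><score>90</score><score>85</score></scores></person>"
)

# JSON: one call yields records (dicts), arrays (lists), strings and numbers.
person = json.loads(json_text)

# XML: the receiver walks the document tree and converts each text node itself.
root = ET.fromstring(xml_text)
person_from_xml = {
    "firstname": root.findtext("firstname"),
    "scores": [int(score.text) for score in root.find("scores")],
}

print(person == person_from_xml)  # True
```

Both paths end at the same record, but the XML path needs mapping code that knows the document layout and the type of every field.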

Both JSON and XML use plain text that is self-describing and human-readable. JSON files use the extension .json, while XML files use the extension .xml. Both can model parent-child kinds of relationships, and an HTTP request can be used to retrieve a JSON or XML file. And both can use a schema to communicate the structure of a data file. But there are also some things that are somewhat different.

JSON is more data-centric and less verbose than XML. Where XML uses start-tag / end-tag notation:

<firstname>Emily</firstname>

JSON uses name / value pairs:

"firstname":"Emily"

Both communicate the same kind of information, but JSON’s notation is a bit simpler and easier to parse. Multiple name / value pairs are simply separated by commas:

"firstname":"Emily","Type":"PHS"

Even multiple-object notation is simpler and more straightforward:

"disbursementlist":[
    {"firstName":"Kristi", "lastName":"Elativ"}, 
    {"firstName":"Emily", "lastName":"Draw"}, 
    {"firstName":"Megan", "lastName":"Snewo"}
]
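
For a rough sense of the difference in weight, this Python sketch serializes the list above both ways and compares the lengths. A few things here are my assumptions: the fragment is wrapped in outer braces to make it a complete JSON document, and the XML layout, including the "person" element name, is just one plausible way to carry the same data.

```python
import json
import xml.etree.ElementTree as ET

people = [
    {"firstName": "Kristi", "lastName": "Elativ"},
    {"firstName": "Emily", "lastName": "Draw"},
    {"firstName": "Megan", "lastName": "Snewo"},
]

# Compact JSON, wrapped in an outer object so it is a complete document.
json_text = json.dumps({"disbursementlist": people}, separators=(",", ":"))

# One possible XML layout for the same data ("person" is an assumed name).
root = ET.Element("disbursementlist")
for person in people:
    entry = ET.SubElement(root, "person")
    for key, value in person.items():
        ET.SubElement(entry, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

print(len(json_text), len(xml_text))

# Reading the JSON back gives the original list with no mapping code.
restored = json.loads(json_text)["disbursementlist"]
print(restored[1]["firstName"])  # Emily
```

The exact numbers depend on the names you choose, but every XML field carries its name twice, once in the start tag and once in the end tag.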

The beauty of JSON notation for JavaScript programmers is that it can be easily captured by a JavaScript program without the element-by-element mapping that XML requires. There are standard constructs in the JavaScript language that can natively capture this notation and store it in JavaScript objects.

Like XML, JSON data can be requested from a web server using an HTTP request. This is one of the easiest ways of displaying data in a web browser and one of the most common uses of JSON. 

You can expect that a lot more embedded applications on the factory floor will use JSON in the future as one of the mechanisms for exchanging data.

As for XML, at this point in its life, it’s not going to lose all the extra weight. (That makes two of us!)

John S. Rinaldi is president of Real Time Automation, which provides industrial networking technology to system integrators, machine builders and product designers in a variety of industries. He is author of four books, including two technology books: Industrial Ethernet and OPC UA: The Basics: An OPC UA Overview For Those Who May Not Have a Degree in Embedded Programming. There are a limited number of free copies for Machine Design readers. To request a free copy, visit the “Contact Us” link at http://www.rtaautomation.com/.
