Tuesday, June 19, 2012

JSON versus XML: Is JSON Really Better than XML?

JSON is a relatively new, human-readable data format that has become very popular in the last few years, especially in web development.

JSON is very similar to XML. Both try to solve the same problem: providing a simple, human-readable format for storing data. Until recently, XML was the usual choice for any system that needed to send small pieces of data quickly without a database attached. Think API calls that ask the server for information. For the most part, XML does the job just fine. So what’s the need for JSON?

JSON was designed with the web in mind, so it works really well with JavaScript. Using a method like eval(), or the safer JSON.parse(), which does not execute the text as code (with jQuery's helpers making this call considerably easier), you can easily fill a web page with all of your JSON information.

JSON is claimed to have many benefits over XML, including:

  • “Easier” to read
  • Faster to parse
  • Takes up less space

Although “‘easier’ to read” is a point that’s difficult to measure, the other two points are not.

It’s very easy to see that JSON does indeed require less space to store the same information. The JSON website has several examples that compare the two formats side by side. Just by looking at the page, it’s easy to see that the JSON representation takes far fewer characters than its XML counterpart. For instance, the first example (a glossary data structure) requires 502 characters in XML but only 345 characters in JSON (about 30% less space).
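As a quick check of that arithmetic, here is a minimal sketch that computes the saving from the two character counts quoted above. The class and method names are my own, chosen only for illustration.

```java
final class SizeComparison {
    /** Percentage of characters saved by the smaller encoding, relative to the larger one. */
    static double percentSaved(int largerChars, int smallerChars) {
        return 100.0 * (largerChars - smallerChars) / largerChars;
    }

    public static void main(String[] args) {
        // Character counts quoted above for the glossary example.
        int xmlChars = 502;
        int jsonChars = 345;
        System.out.printf("JSON saves %.1f%% of the characters%n",
                percentSaved(xmlChars, jsonChars));
    }
}
```

With these counts the saving works out to a little over 31%, which is where the "about 30%" figure comes from.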

Now for the “faster to parse” point, which is a bit harder to test. For this point, I wrote a quick test to measure how fast I could parse an XML string and a JSON string into a Java object.

For XML parsing, I used Java's built-in SAX parser. The SAX parser lets me iterate through the XML file and assign each XML value to the appropriate field in the object. This method is a bit more cumbersome than what I used for JSON parsing, but certainly not unreasonable.
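To give a feel for this approach, here is a minimal sketch of a SAX handler that fills in a book object. It is not the code used in the test; the Book class, the handler, and the choice to map only three fields are assumptions made for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class BookSaxSketch {
    /** Hypothetical target class; only three fields are mapped in this sketch. */
    static final class Book {
        String title;
        String author;
        int pages;
    }

    /** Collects element text and copies it into the Book when each element closes. */
    static final class BookHandler extends DefaultHandler {
        final Book book = new Book();
        private final StringBuilder text = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            text.setLength(0); // start collecting text for the new element
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length); // may be called several times per element
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            String value = text.toString().trim();
            switch (qName) {
                case "title": book.title = value; break;
                case "author": book.author = value; break;
                case "pages": book.pages = Integer.parseInt(value); break;
                default: break; // ignore elements this sketch does not map
            }
        }
    }

    /** Parses an XML string into a Book; checked exceptions are rethrown unchecked. */
    static Book parse(String xml) {
        try {
            BookHandler handler = new BookHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return handler.book;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse book XML", e);
        }
    }

    public static void main(String[] args) {
        Book b = parse("<book><pages>256</pages><title>Programming Pearls 2nd Edition</title>"
                + "<author>Jon Bentley</author></book>");
        System.out.println(b.title + " by " + b.author + ", " + b.pages + " pages");
    }
}
```

The cumbersome part is visible in endElement: every field needs its own case, and any type conversion has to be written by hand.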

For JSON parsing, I loaded up the GSON library, which converts between JSON and Java objects with a one-liner. All that was needed was the class definition itself (i.e. the Book class with properly named fields). This does, however, couple the class's field names to the JSON field names. If either the class field names or the JSON field names change, problems will arise.
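The coupling between field names can be illustrated without any JSON library at all. The sketch below is a stand-in, not GSON's implementation: it binds string values to fields by name using plain reflection, and the bind method, the Book class, and the "writer" key are all hypothetical.

```java
import java.lang.reflect.Field;
import java.util.Map;

final class NameCouplingSketch {
    /** Hypothetical target class with two String fields. */
    static final class Book {
        String title;
        String author;
    }

    /**
     * Copies each value into the String field whose name equals the key.
     * Returns how many keys found a matching field.
     */
    static int bind(Object target, Map<String, String> values) {
        int bound = 0;
        for (Map.Entry<String, String> e : values.entrySet()) {
            try {
                Field f = target.getClass().getDeclaredField(e.getKey());
                f.setAccessible(true);
                f.set(target, e.getValue());
                bound++;
            } catch (NoSuchFieldException ex) {
                // No field with this name: the value is dropped in this sketch.
            } catch (IllegalAccessException ex) {
                throw new IllegalStateException(ex);
            }
        }
        return bound;
    }

    public static void main(String[] args) {
        Book book = new Book();
        // "writer" does not match the field name "author", so it is not bound.
        int bound = bind(book, Map.of(
                "title", "Programming Pearls 2nd Edition",
                "writer", "Jon Bentley"));
        System.out.println(bound + " field(s) bound, author = " + book.author);
    }
}
```

Renaming one side without the other leaves the author field empty, which is the kind of problem described above.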

To begin, I took a fairly simple data structure and created an XML and a JSON representation of it. The following XML and JSON files were created using information about the book Programming Pearls.

XML Version

    <book>
        <type>textbook</type>
        <pages>256</pages>
        <title>Programming Pearls 2nd Edition</title>
        <description>The first edition of Programming Pearls was one of the most influential books I read early in my career...</description>
        <rating>4.5</rating>
        <coverType>paperback</coverType>
        <genre>Computer Science</genre>
        <author>Jon Bentley</author>
        <publisher>Addison-Wesley Professional</publisher>
        <copyright>1999</copyright>
    </book>

JSON Version

    {
        "book": {
            "type": "textbook",
            "pages": "256",
            "title": "Programming Pearls 2nd Edition",
            "description": "The first edition of Programming Pearls was one of the most influential books I read early in my career...",
            "rating": "4.5",
            "coverType": "paperback",
            "genre": "Computer Science",
            "author": "Jon Bentley",
            "publisher": "Addison-Wesley Professional",
            "copyright": "1999"
        }
    }

Results

The parsing test was run on both the above XML and JSON files 10,000,000 times. The results are not surprising. JSON is parsed and converted into a Java object in roughly 30% less time than XML (about 27% less by the averages below; put the other way, XML takes about 37% longer).

  • Average JSON Run time:  3.647208974029518E-5
  • Average XML Run time: 5.011537916910817E-5
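For anyone who wants to reproduce the comparison, here is a minimal sketch of how such averages can be collected and compared. It is not the harness used for the numbers above: the method names are my own, and the workload in main is a placeholder standing in for a real parse call.

```java
final class TimingSketch {
    /** Average wall-clock time of one run, in nanoseconds, over the given number of iterations. */
    static double averageNanos(Runnable task, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            task.run();
        }
        return (System.nanoTime() - start) / (double) iterations;
    }

    /** How much less time the faster run takes, as a percentage of the slower run. */
    static double percentLessTime(double slower, double faster) {
        return 100.0 * (slower - faster) / slower;
    }

    /** How much more time the slower run takes, as a percentage of the faster run. */
    static double percentMoreTime(double slower, double faster) {
        return 100.0 * (slower - faster) / faster;
    }

    public static void main(String[] args) {
        // Averages from the results list above, in the units reported there.
        double json = 3.647208974029518E-5;
        double xml = 5.011537916910817E-5;
        System.out.printf("JSON takes %.1f%% less time than XML%n", percentLessTime(xml, json));
        System.out.printf("XML takes %.1f%% more time than JSON%n", percentMoreTime(xml, json));

        // Placeholder workload standing in for a parse call.
        double avg = averageNanos(() -> Integer.parseInt("256"), 1_000_000);
        System.out.printf("Average per run: %.1f ns%n", avg);
    }
}
```

The two percentages differ because they use different baselines, so it matters which one is meant by "30% faster". A simple loop like this also ignores JIT warm-up, so the first iterations can skew the average for short runs.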

My findings are that JSON parses roughly 30% faster and takes up roughly 30% less space than XML. These results seem to be in line with what much of the development community believes about the two formats. Switching to JSON for data handling can net a fairly large increase in performance while also reducing the amount of space required.

11 comments:

  1. This comment has been removed by a blog administrator.

  2. What I find interesting is that you wrote a custom SAX parser (probably one of the fastest and most efficient ways to parse an XML document) to compare to GSON, which not only parses but also goes through all the reflection to build objects. Not really a fair test.

    A better test would have been using one of the JAXB implementations to bind XML to a Java object.

    Replies
    1. That would just give JSON a bigger edge anyways as it's already 30% faster.

  3. JSON looks great for storing arrays, but XML can store attributes, not only values within tags. Seems to me there are some tasks where JSON is better, but there are some where XML is the best.

  4. This comment has been removed by the author.

  5. JSON can handle XML attributes pretty well really.

    "aObject": {
        "attribute": "x",
        "value": 1
    }

    versus

    <aObject attribute="x">1</aObject>

    Which one is more readable to you?

  6. What do you think the results will be using Google's GSON? I am a newbie and have not tested Gson's performance.

  7. Hi - your findings are very interesting, but can you post your code so we can check how you are parsing the documents? Thanks!

  8. Hey guys.

    @Fidelio, the JSON parser used above was Google's GSON.

    Stay tuned for the code itself. I'll get it up on Github shortly.

  9. XML, I think, supports XSD (XML Schema Definition), with automatic validation of XML documents against the schema.

    XML supports namespaces, which help a lot when you have very complex data to handle.

    XML supports mixing text and tags and combining effects, like in HTML: <b>bold <i>bold and italic</i> bold</b>.

    Overall (but this might change), XML has far more libraries and tooling available, and that can make a big difference.

    So if you just need to transmit simple, self-documenting, directly readable data, you can use JSON.

    If you want more flexibility and still want the data to be readable, you will benefit from using XML. But this is for very complex, plugin-aware or extensible data formats. If you want to have a lot of tooling available to help you, XML is also a good bet.

    If you have any performance or space requirements, you might take a look at a non-self-describing binary format.

  10. Instead of using those two tiny documents, try with a much larger file (1MB+). I'm quite certain that, for larger files, a JSON DOM parser is going to be slower than XML SAX.
