
Why is JSON replacing XML as a data format? [closed]

Tags: json, xml

When I first saw XML, I thought it was basically a representation of trees. Then I thought: the important thing isn't that it's a particularly good representation of trees, but that it is one that everyone agrees on. Just like ASCII. And once established, it's hard to displace due to network effects. The new alternative would have to be much better (maybe 10 times better) to displace it. Of course, ASCII has been (mostly) replaced by Unicode, for internationalization.

According to Google Trends, XML has a 43x lead but is declining, while JSON grows.

[edited] How and why will JSON replace XML as a data format?

  1. for which tasks?
  2. for which programmers/industries?

NOTES: S-expressions (from Lisp) are another representation of trees, but one that has not gained mainstream adoption. There are many, many other proposals, such as YAML and Protocol Buffers (for binary formats).

I can see JSON dominating the space of communicating with client-side AJAX (AJAJ?), and this possibly could back-spread into other systems transitively.

XML, being based on SGML, is better than JSON as a document format. I'm interested in XML as a data format.

XML has an established ecosystem that JSON lacks, especially ways of defining formats (XML Schema) and transforming them (XSLT). XML also has many other standards, especially for web services, but their weight and complexity can arguably count against XML and make people want a fresh start (similar to "web services" beginning as a fresh start over CORBA).

[edited Mar2010] Like NoSQL, JSON is schemaless.

13ren asked Mar 26 '09 04:03



3 Answers

Short answer: yes and no (EDITED as per comments below)

There are fundamental differences and trade-offs. XML is a markup language, particularly suitable for textual documents (XHTML, DocBook, various kinds of office documents), and good enough for many other tasks. Problems mostly arise from its hierarchical model (as opposed to the relational model of SQL, or the object graphs of OO languages).

JSON is an object notation, meaning it is a somewhat more natural fit for data-oriented use cases: cases where XML sort of works, but where there is more cost in overcoming the impedance mismatch between the object and hierarchical models. JSON is not a perfect fit -- it's still data, not objects (no identity, can't do full graphs) -- but it is more natural than XML. And as such, it is easier to build tools that do decent, simple data binding.
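To illustrate the data-binding point, here is a minimal sketch in Python (not from the original answer; the `Person` and `Address` classes and both helper functions are hypothetical names chosen for the example). It shows how little glue is needed to move between plain objects and JSON, and also that the JSON side only ever holds data, not the objects themselves.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Address:  # hypothetical example type
    city: str
    zip_code: str


@dataclass
class Person:  # hypothetical example type
    name: str
    age: int
    address: Address


def person_to_json(person: Person) -> str:
    # asdict() recursively turns nested dataclasses into plain dicts,
    # which map directly onto JSON objects.
    return json.dumps(asdict(person), sort_keys=True)


def person_from_json(text: str) -> Person:
    # JSON carries no class names, so the binding code has to know
    # which class each nested object is supposed to become.
    raw = json.loads(text)
    return Person(
        name=raw["name"],
        age=raw["age"],
        address=Address(**raw["address"]),
    )


original = Person("Ada", 36, Address("London", "N1"))
encoded = person_to_json(original)
restored = person_from_json(encoded)
```

The round trip gives back an equal value but not the same object, which is the "no identity" limitation mentioned above.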

So: there's plenty of room for both, and I would expect both to be used for a long time to come. Not always in the optimal way, but both can handle plenty of use cases well enough.

For what it is worth, since writing my original answer, I have seen JSON absolutely annihilate XML for data-oriented/data-interchange use cases for companies I have worked for. SOAP (etc) will start significantly shrinking, and "plain old JSON" data interchange (esp. with RESTish frameworks, JAX-RS for Java for example) will take over.

And yet XML is much better for textual markup.

StaxMan answered Sep 19 '22 14:09


My bold thesis is that such a replacement is impossible after all, since these data formats (JSON and XML) are different.

Short version: XML is not equivalent to JSON (or similar formats), because XML nodes (tags) support attributes and namespaces. This turns out to be crucial.

So, the best way to answer this question is actually to show how these formats differ, i.e. to complete the comparison. Forgive me for stating the obvious; I only hope this will be interesting or even useful. It will help if we first agree on some simple terminology:

  1. A data format is a formal language that governs how data can be recognized in its representation (i.e. how to "read/write" it from memory according to the way it is stored there).
  2. A data structure is an abstract way of modeling (describing) how this data is organized or linked.

So the two concepts address different aspects of data maintenance (e.g. IO). For example, an indexed array of a particular data type is a (homogeneous) structure, and it can be accessed (read/written) as a serial sequence (a contiguous format).

Wikipedia has a great article about JSON that lists a lot of alternatives, such as the (already mentioned) Lisp S-expressions, Python nested structures, PHP arrays, YAML, etc. (note that we are not considering dictionaries like .ini files, since they lack multiple levels of nesting). All these formats can be seen as representations of a certain data structure: a tree. We can state that they are isomorphic in that sense. Each representation can be mapped to a tree in such a manner that no extra processing needs to be done (e.g. the grammar of the formal language is not changed), and a reverse mapping also exists.
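A small sketch of that isomorphism, in Python (my own illustration; `to_sexpr` and `parse_sexpr` are hypothetical helpers, and I assume atoms are plain strings containing no whitespace or parentheses). The same nested list is written out as an S-expression and as JSON, and both can be read back into the identical tree:

```python
import json


def to_sexpr(node):
    # A list becomes a parenthesised group; anything else is an atom.
    if isinstance(node, list):
        return "(" + " ".join(to_sexpr(child) for child in node) + ")"
    return str(node)


def parse_sexpr(text):
    # Tokenise by padding parentheses with spaces, then read recursively.
    tokens = text.replace("(", " ( ").replace(")", " ) ").split()

    def read(pos):
        if tokens[pos] == "(":
            items = []
            pos += 1
            while tokens[pos] != ")":
                item, pos = read(pos)
                items.append(item)
            return items, pos + 1  # skip the closing parenthesis
        return tokens[pos], pos + 1

    tree, _ = read(0)
    return tree


tree = ["config", ["server", "localhost"], ["ports", "80", "443"]]
sexpr = to_sexpr(tree)      # "(config (server localhost) (ports 80 443))"
as_json = json.dumps(tree)  # the same tree in JSON notation
```

Neither direction needs any convention beyond the grammar itself, which is the sense in which these formats are interchangeable.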

Well, you may say that's "some" theory, but what does it mean in practice? The implication is that if we compare XML and JSON by:

  • design purpose and motivation
  • application domain - the set of tasks a format is used to solve
  • syntactical complexity (well, simplicity - to what extent the format is readable/writable/human friendly/etc)
  • maturity (e.g. how many versions the format has been through)
  • and so on

we will discover further practical differences. The major one is that XML is a MARKUP language (as has been mentioned). Yes, to do folding it is able to mix namespaces and attributes, which results in a higher order of "parallel" nesting.

For the past two years I have been busy transforming XML representations into Python nested structures and back. My only, bitter, conclusion is that they are very poorly compatible. To represent attributes and namespaces, one has to escape this information (e.g. with prefixes) in the tree representation. So once again, XML is definitely not a tree ;-) Thanks to its "markup" capabilities it can represent, immediately (without the need to encode, encapsulate or escape), much more sophisticated structures than trees, i.e. typed trees: trees with specialized types of vertices (again, via namespaces and attributes).
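To make the escaping concrete, here is a rough sketch using Python's standard `xml.etree.ElementTree` (this is not the code I actually used; the `@` prefix for attributes and the `#text` key are just one common convention, assumed here, and `element_to_dict` is a hypothetical helper):

```python
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    node = {}
    # Attributes have no natural place in a plain tree, so they are
    # smuggled in as keys with an "@" prefix (an assumed convention).
    for name, value in elem.attrib.items():
        node["@" + name] = value
    # Children are grouped by tag; ElementTree already expands a
    # namespaced tag to the form "{namespace-uri}localname".
    for child in elem:
        node.setdefault(child.tag, []).append(element_to_dict(child))
    text = (elem.text or "").strip()
    if text:
        node["#text"] = text  # another assumed convention
    return node


xml_source = (
    '<book xmlns:dc="http://purl.org/dc/elements/1.1/" id="42">'
    '<dc:title lang="en">Trees</dc:title>'
    "</book>"
)
root = ET.fromstring(xml_source)
as_dict = {root.tag: element_to_dict(root)}
```

Every consumer of the resulting dict has to know the `@` / `#text` / `{uri}` conventions, whereas in the XML they are part of the format itself.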

There are other difficulties and dangers like parsing and mapping

<body>The <strong>marked up</strong> text</body>

into a tree without some pre-decided convention (How to break "The .. text"?) or preserving the order of elements in the XML.
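As a sketch of what such a convention might look like (Python, standard `xml.etree.ElementTree`; the ordered-list encoding and the `mixed_to_list` helper are assumptions of mine, one choice among many), note that even the XML library has to split the text into a `text` part before the child and a `tail` part after it:

```python
import xml.etree.ElementTree as ET

body = ET.fromstring("<body>The <strong>marked up</strong> text</body>")
strong = body[0]

# ElementTree's own convention: text before the first child lives on the
# parent (.text), text after a child lives on that child (.tail).
pieces = [body.text, strong.text, strong.tail]


def mixed_to_list(elem):
    # Assumed convention: an ordered list mixing strings and
    # single-key dicts, so that document order survives.
    items = []
    if elem.text:
        items.append(elem.text)
    for child in elem:
        items.append({child.tag: mixed_to_list(child)})
        if child.tail:
            items.append(child.tail)
    return items


as_tree = mixed_to_list(body)
flattened = "".join(body.itertext())
```

A plain dict keyed by tag name would lose both the surrounding text and the position of `<strong>` within it, which is why some list-based convention is needed.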

Obviously, things that are not equivalent have trouble substituting for each other. In that sense XML is more complex than nested structures.

The part of the question regarding industries seems pretty well answered by the prognosis that XML will stay a server-side and document-oriented technology, mainly because of its superior data-typing abilities. Also, a lot of research has been motivated by XML purely as a markup language.

Excuse me for straying further off topic to discuss the popularity of JSON, but it seems partially relevant ;)

I want to emphasize that JSON (being an object notation) by design carries no custom type information (it is JavaScript: it enumerates a fixed set of types without providing a "runtime" reference or a context), and hence fails to pass highly coupled, objectified data. Type information is always reduced to JSON's native types. This limits type-oriented development (type checking, constraining, casting, delegation, etc.). But IMHO this crucial problem is shared by most modern programming languages (that I know), which lack the sophisticated nested custom data typing that XML has (objects or functions are not documents). It seems that XML itself does this only by accident and not by design.
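A quick Python sketch of that reduction to native types (my own illustration; the `record` contents and the `encode_custom` hook are made up for the example). A date, a decimal and a tuple go in; only strings and a list come back:

```python
import json
from datetime import date
from decimal import Decimal

# Hypothetical record using three types JSON has no notion of.
record = {
    "due": date(2010, 3, 26),
    "amount": Decimal("19.90"),
    "tags": ("a", "b"),
}


def encode_custom(value):
    # Called by json.dumps only for values it cannot serialise itself.
    if isinstance(value, date):
        return value.isoformat()
    if isinstance(value, Decimal):
        return str(value)
    raise TypeError(f"cannot encode {type(value).__name__}")


encoded = json.dumps(record, default=encode_custom)
decoded = json.loads(encoded)
# decoded["due"] and decoded["amount"] are now plain strings, and the
# tuple has come back as a list; nothing in the JSON text says otherwise.
```

Getting the original types back requires knowledge that lives outside the JSON document, e.g. in the receiving code or in a separately agreed schema.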

As a result, when working with JSON one applies tactics similar to those used for processing "duck"-typed data in popular dynamic languages. So this is another characteristic of JSON: it allows fast coding but risks getting bulky when the data grows too big (nested and complex).

JSON is more of a Swiss Army knife than XML, since it is simpler.

So, JSON does not help with interoperating with strongly typed languages like Java, but on the other hand it lowers coupling by encouraging abstract decomposition. Since losing type information can sometimes be a good thing (a reduction factor), it allows simpler architectures. ActionScript de facto prefers to communicate in JSON (although they have also proposed their own AMF). Finally, JSON works great with KISS (e.g. RESTful) designs. JSON wins on speed and simplicity. But what one usually tends to ignore is that when KISS is impossible and the domain logic is too complicated, someone still has to do the work of designing DTDs and XSDs, thinking formats through and so on (often later on, after the cool KISS approach has failed for lack of design competence and experience). The point is that JSON is a great tool which lacks application scale.

Yauhen Yakimovich answered Sep 19 '22 14:09


I think JSON has already largely replaced XML for client-side communications with a web server, but that will likely be the extent of its dominance. As you stated, XML provides advantages that are appropriate for server-to-server interactions.

Beep beep answered Sep 19 '22 14:09