What are the relative merits of CSV, JSON and XML for a REST API?
Question
We're currently planning a new API for an application and debating the various data formats we should use for interchange. There's a fairly intense discussion going on about the relative merits of CSV, JSON and XML.
Basically, the crux of the argument is whether we should support CSV at all because of the lack of recursion (i.e. having a document which has multiple authors and multiple references would require multiple API calls to obtain all the information).
What has your experience been when working with information from Web APIs, and what can we do to make life easier for the developers working with our API?
Our decision:
We've decided to provide XML and JSON due to the difficulty in recursion in CSV needing multiple calls for a single logical operation. JSON doesn't have a parser in Qt and Protocol Buffers doesn't seem to have a non-alpha PHP implementation so they are out for the moment too but will probably be supported eventually.
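To make the recursion problem concrete, here is a minimal Python sketch. The document, its field names, and the three-table CSV layout are assumptions made up for illustration, not our actual API. It shows one JSON body carrying the whole hierarchy, while CSV needs one flat table per repeating field, which is what would force the extra calls.

```python
import csv
import io
import json

# Hypothetical document with multiple authors and references (illustrative data).
document = {
    "id": 1,
    "title": "Example document",
    "authors": ["A. Author", "B. Author"],
    "references": ["Ref one", "Ref two", "Ref three"],
}

# JSON: a single response carries the nested lists.
json_body = json.dumps(document)


def to_csv(header, rows):
    """Serialise one flat table to CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


# CSV: each repeating field needs its own table, keyed back to the document id.
csv_bodies = {
    "documents": to_csv(["id", "title"], [[document["id"], document["title"]]]),
    "authors": to_csv(
        ["document_id", "author"],
        [[document["id"], a] for a in document["authors"]],
    ),
    "references": to_csv(
        ["document_id", "reference"],
        [[document["id"], r] for r in document["references"]],
    ),
}

print(len(csv_bodies), "CSV responses versus 1 JSON response")
```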
Solution
CSV is right out. JSON is a more compact object notation than XML, so if you're looking for high volumes it has the advantage. XML has wider market penetration (I love that phrase) and is supported by all programming languages and their core frameworks. JSON is getting there (if not already there).
Personally, I like the brackets. I would bet more devs are comfortable working with XML data than with JSON.
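As a rough illustration of the compactness point, the sketch below serialises the same record both ways with Python's standard library and compares the lengths. The record and the element names are made up, and the gap will vary with the data (compression on the wire would likely narrow it), so treat this as indicative only.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical flat record used for both encodings.
record = {"id": "42", "title": "Example", "author": "A. Author"}

# JSON without optional whitespace.
json_text = json.dumps(record, separators=(",", ":"))

# XML with one child element per field; every field name appears twice
# (opening and closing tag), which is where most of the extra bytes come from.
root = ET.Element("document")
for key, value in record.items():
    ET.SubElement(root, key).text = value
xml_text = ET.tostring(root, encoding="unicode")

print(len(json_text), json_text)
print(len(xml_text), xml_text)
```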
OTHER TIPS
Advantages:
- XML - Lots of libraries, Devs are familiar with it, XSLT, Can be easily Validated by both client and server (XSD, DTD), Hierarchical Data
- JSON - easily interpreted on client side, compact notation, Hierarchical Data
- CSV - Opens in Excel(?)
Disadvantages:
- XML - Bloated, harder to interpret in JavaScript than JSON
- JSON - If used improperly can pose a security hole (don't use eval), Not all languages have libraries to interpret it.
- CSV - Does not support hierarchical data, you'd be the only one doing it, it's actually much harder than most devs think to parse valid CSV files (CSV values can contain new lines as long as they are between quotes, etc).
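The quoted-newline case from the CSV disadvantages can be sketched in Python. The sample data is made up; the point is only that splitting on line breaks and commas miscounts the rows, while a real CSV parser (here the standard `csv` module) keeps the embedded newline inside the field.

```python
import csv
import io

# One header row and one data row; the "notes" value contains a quoted newline.
raw = 'id,notes\r\n1,"first line\nsecond line"\r\n'

# Naive approach: split on line breaks, then on commas.
naive_rows = [line.split(",") for line in raw.splitlines()]

# csv module: understands quoting, so the embedded newline stays in the field.
parsed_rows = list(csv.reader(io.StringIO(raw, newline="")))

print(len(naive_rows))   # the naive split sees three "rows"
print(len(parsed_rows))  # the parser sees the header plus one record
print(parsed_rows[1])
```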
Given the above, I wouldn't even bother supporting CSV. The client can generate it from either XML or JSON if it's really needed.
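For example, a client that really needs CSV could derive it from a JSON response along these lines. This is a minimal sketch assuming the response is an array of flat objects with identical keys; the payload and the `json_to_csv` helper are hypothetical.

```python
import csv
import io
import json

# Hypothetical JSON response: a list of flat records.
json_body = '[{"id": 1, "title": "First"}, {"id": 2, "title": "Second, revised"}]'


def json_to_csv(text):
    """Convert a JSON array of flat objects into CSV text."""
    records = json.loads(text)
    buf = io.StringIO()
    # Column order is taken from the first record (an assumption of this sketch).
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


csv_text = json_to_csv(json_body)
print(csv_text)
```

Nested fields would first have to be flattened or split into separate tables, which is the same limitation noted in the list above.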
CSV has so many problems as a complex data model that I wouldn't use it. XML is very flexible and easy to program with - clients will have no problem coding XML generators and parsers, you can even provide sample parsers using SAX.
Have you checked out Google's network data format? It's called Protocol Buffers. Don't know if it is useful for a REST service however as it skips that whole HTTP layer too.
XML can be a bit heavyweight at times. JSON is quite nice, though, has good language support, and JSON data can be translated directly to native objects on many platforms.
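As one illustration of that direct translation, Python's standard `json` module maps a JSON document onto plain dicts, lists, numbers and booleans with a single call. The payload below is made up for the example.

```python
import json

# Hypothetical API response body.
payload = (
    '{"title": "Example", "authors": ["A. Author", "B. Author"], '
    '"pages": 12, "draft": false}'
)

doc = json.loads(payload)

# The result is ordinary native data: dict, list, int, bool.
print(type(doc).__name__, type(doc["authors"]).__name__)
print(doc["authors"][1], doc["pages"] + 1, doc["draft"])
```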
I don't have any experience with JSON. CSV works up to a point when your data is very tabular and evenly structured. XML can become unwieldy very quickly, especially if you don't have a tool that creates the bindings to your objects automatically.
I have not tried this either, but Google's Protocol Buffers look really good: a simple format that creates automatic bindings to C++, Java and Python and implements serialisation and deserialisation of the created objects.
Aside from what Allain Lalonde already said, one additional advantage of CSV is that it tends to be more compact than XML or even JSON. So, if your data is strictly tabular, with a completely flat hierarchy, CSV may be a correct choice. An additional disadvantage of CSV is that it may use different delimiters and decimal separators, depending on which tool (and even country!) generated it.
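The delimiter and decimal-separator problem can be sketched as follows. The two sample exports and the `read_prices` helper are hypothetical; the point is that the reader has to be told both conventions up front, because nothing in the file declares them.

```python
import csv
import io

# The same row as exported by two hypothetical tools.
us_style = "item,price\r\nwidget,1234.5\r\n"
eu_style = "item;price\r\nwidget;1234,5\r\n"


def read_prices(text, delimiter, decimal):
    """Parse CSV text given an explicit delimiter and decimal separator."""
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter=delimiter)
    return [
        (row["item"], float(row["price"].replace(decimal, ".")))
        for row in reader
    ]


print(read_prices(us_style, ",", "."))
print(read_prices(eu_style, ";", ","))

# Reading the second export with default (comma) settings splits the
# price in the wrong place instead of raising an error.
wrong = list(csv.reader(io.StringIO(eu_style, newline="")))
print(wrong[1])
```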