Question

While writing a JSON parser in Java I ran into a "cosmetic" problem:

The JSON specification says that the escape sequences for control characters, such as \n or \t, are the same as in C and Java. The problem is that when a JSON string (the part within quotes, as in "property":"value") contains actual control characters, the printed JSON is messed up because those characters change the output: \n starts a new line and \t inserts a tab.

An example:

String s = "{\n\t\"property1\": \"The quick brown fox\njumps over the lazy dog\",\n\t\"property2\":\"value2\"\n}";

Printing as:

{
    "property1": "The quick brown fox
jumps over the lazy dog",
    "property2":"value2"
}

The solution would look like this:

String s = "{\n\t\"property1\": \"The quick brown fox\\njumps over the lazy dog\",\n\t\"property2\": \"value2\"\n}";

Printing "correctly" as:

{
    "property1": "The quick brown fox\njumps over the lazy dog",
    "property2": "value2"
}

So my question: Is it correct to treat control characters outside strings differently than control characters within strings? And is it correct to add another backslash \ before any control character within a JSON string, creating sequences like "\n" or "\t" that won't have any effect on the look of JSON strings?


Solution

Is it correct to treat control characters outside strings differently than control characters within strings?

The JSON specification states:

A JSON text is a sequence of tokens. The set of tokens includes six structural characters, strings, numbers, and three literal names.

The six structural characters are {, [, }, ], :, and ,. The specification then states:

Insignificant whitespace is allowed before or after any of the six structural characters.

Your \n and \t are whitespace in this sense. The specification defines four whitespace characters: space, horizontal tab (\t), line feed (\n), and carriage return (\r). You can put as many of them as you want around the structural characters above.

There is no notion of control characters outside JSON strings. These are just whitespace characters. Yes, they are treated differently.
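
As a minimal sketch of how a parser might handle this, the helper below skips the four whitespace characters between tokens. The class and method names (JsonWhitespace, isJsonWhitespace, skipWhitespace) are hypothetical and chosen for illustration only; they are not part of any library.

```java
final class JsonWhitespace {

    // The four characters the JSON grammar treats as insignificant
    // whitespace between tokens: space, tab, line feed, carriage return.
    static boolean isJsonWhitespace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }

    // Returns the index of the first non-whitespace character at or
    // after pos, or json.length() if only whitespace remains.
    static int skipWhitespace(String json, int pos) {
        while (pos < json.length() && isJsonWhitespace(json.charAt(pos))) {
            pos++;
        }
        return pos;
    }
}
```

A tokenizer would typically call skipWhitespace before reading each token, and would not call it while it is inside a string, which is why the same characters are handled differently in the two places.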

And is it correct to add another backslash \ before any control character within a JSON string, creating sequences like "\n" or "\t" that won't have any effect on the look of JSON strings?

In your example, you are writing Java String literals. If you want the JSON string to contain the two characters \n literally, you need to write \\n in the Java String literal, and similarly for the other escape sequences. A JSON generator must find every control character (U+0000 through U+001F), as well as every quotation mark and backslash, in the Java String it is converting to a JSON string and escape it accordingly. A JSON parser must find the escape sequence \n (or any other) in the JSON string it parses and convert it to the corresponding character in the Java String it creates.
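
The two directions described above can be sketched roughly as follows. This is an illustrative sketch, not a complete implementation: the class JsonStrings and its methods escape and unescape are hypothetical names, both methods work on the body of a JSON string without the surrounding quotes, and unescape is lenient (it does not reject raw control characters, which a strict parser should).

```java
final class JsonStrings {

    // Generator side: Java String -> body of a JSON string.
    // Escapes the quotation mark, the backslash and all control
    // characters below U+0020.
    static String escape(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b");  break;
                case '\f': out.append("\\f");  break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the four-digit hex form.
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Parser side: body of a JSON string -> Java String.
    static String unescape(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i >= s.length()) {
                throw new IllegalArgumentException("dangling backslash");
            }
            switch (s.charAt(i)) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                case 'u':
                    if (i + 4 >= s.length()) {
                        throw new IllegalArgumentException("truncated unicode escape");
                    }
                    int code = 0;
                    for (int k = 1; k <= 4; k++) {
                        int d = Character.digit(s.charAt(i + k), 16);
                        if (d < 0) {
                            throw new IllegalArgumentException("bad hex digit in unicode escape");
                        }
                        code = code * 16 + d;
                    }
                    out.append((char) code);
                    i += 4; // skip the four hex digits just consumed
                    break;
                default:
                    throw new IllegalArgumentException("unknown escape: \\" + s.charAt(i));
            }
        }
        return out.toString();
    }
}
```

With this sketch, passing the Java String "The quick brown fox\njumps over the lazy dog" (which contains a real line feed) to escape yields text containing a backslash followed by n, so the serialized JSON stays on one line, and unescape turns it back into the original String.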

Licensed under: CC-BY-SA with attribution