Why do escape codes vary with language?

https://stackoverflow.com/questions/8221194

06-03-2021

Question

It looks like escape characters depend on the language: HTML and JavaScript, for example, do not use the same escape characters. Why not? It is sometimes difficult to know whether you are dealing with HTML or JavaScript, so it would be simpler if escape sequences were the same across many languages.

Thank you for any comment or answer

Solution

It's mostly a matter of using a character that isn't needed for much in the rest of the language, combined with legacies arising from languages that had no idea they would eventually coexist. SGML, the predecessor of HTML, is older than JavaScript. JavaScript, in turn, wants to look like Java/C++/C, so it wants to use & to mean "and". &tc...

When something new comes along, there's a very high premium placed on it being similar to whatever's already in use, to reduce the learning curve, thereby increasing adoption rate, thereby avoiding new-thing infant mortality. It's also easier to implement a language that's similar to one that already has implementations.

OTHER TIPS

Generally it has to do with the history of the language. HTML began life as an application of SGML. In SGML, an "entity reference" (like HTML's &amp; or &lt;) isn't just a way of escaping a single character; it can actually be a whole block of text. The same is true of XML, which is a newer subset (technically an "application profile") of SGML. In programming languages, as opposed to document languages, that sort of notation is less convenient; \n, for example, occurs so much in C (a language from which JavaScript ultimately takes much of its syntax) that it would be very inconvenient to have to define a special "entity" for it.
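To make the contrast concrete, here is a minimal sketch that escapes the same text once for an HTML context and once for a JavaScript-style string literal. It is written in Python only for illustration; the sample text is made up, and `json.dumps` is used as a stand-in for JavaScript string escaping on the assumption that a JSON string literal is close enough to a JavaScript one for this purpose.

```python
import html
import json

# Hypothetical sample text containing characters that are special
# in HTML (&, <, >, ") and in JavaScript strings (", newline).
text = 'Tom & Jerry say "hi"\n<b>'

# HTML context: &, <, > and quotes become entity references;
# the newline is left alone because HTML does not treat it as special.
as_html = html.escape(text)

# JavaScript-style string context (approximated with JSON):
# the quote and the newline get backslash escapes; & and < are left alone.
as_js = json.dumps(text)

print(as_html)
print(as_js)
```

Each context escapes only the characters that are special in that context, which is why a single shared set of escape sequences would not fit both.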

Licensed under: CC-BY-SA with attribution
