Question

I need a character to separate two or more URIs in one string. Later I will split the string to get each URI separately.

The problem is I'm not sure what character to pick. Is there a good character that definitely can't be part of a URI itself? Or are pretty much all characters ultimately allowed in a URI?

I know certain characters are illegal in certain parts of the URI, but I'm talking about a URI as a whole, like this:

scheme://username:password@domain.tld/path/to/file.ext?key=value#blah

I'm thinking maybe space, although technically I suppose that could be part of the password, or would it be escaped as %20 in that case?

Solution

Any of the control characters should be good for this, such as TAB, FF and so on.

RFC 3986 (a) defines the URI syntax, and Appendix A of that RFC limits the characters to:

ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
0123456789-._~:/?#[]@!$&'()*+,;=

(and the % encoding character, of course, for all other characters not listed above).
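
As a rough illustration of that character set, here is a minimal Python sketch that checks whether a string uses only those characters. The names URI_CHARS and uses_only_uri_chars are made up for this example, and the check looks at characters only; it does not validate the URI grammar itself.

```python
import string

# Characters listed above (RFC 3986, Appendix A), plus "%" for percent-encoding.
URI_CHARS = frozenset(
    string.ascii_letters + string.digits + "-._~:/?#[]@!$&'()*+,;=%"
)


def uses_only_uri_chars(candidate: str) -> bool:
    """Return True if every character of candidate is in the allowed set.

    Hypothetical helper: it checks characters only, not URI structure.
    """
    return all(ch in URI_CHARS for ch in candidate)
```

With this, a string containing a raw space or a TAB is rejected, while the same string with the space written as %20 passes.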

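So, basically, any other character should be okay as a delimiter. That includes a literal space: in a valid URI it has to be percent-encoded as %20, even inside the password.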
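
A minimal Python sketch of the join-and-split approach, assuming TAB as the delimiter and assuming the inputs are valid URIs. The names DELIMITER, join_uris and split_uris are hypothetical, chosen only for this example.

```python
# TAB is a control character, so it cannot appear unencoded in a valid URI.
DELIMITER = "\t"


def join_uris(uris):
    """Pack several URIs into one string (hypothetical helper)."""
    for uri in uris:
        # Guard against malformed input that already contains the delimiter.
        if DELIMITER in uri:
            raise ValueError(f"URI contains the delimiter: {uri!r}")
    return DELIMITER.join(uris)


def split_uris(packed):
    """Recover the individual URIs from a packed string (hypothetical helper)."""
    return packed.split(DELIMITER) if packed else []
```

The guard in join_uris is there because real-world input is not always a valid URI. If a raw TAB does show up, it is safer to fail loudly than to split the string in the wrong place later.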


(a) This has actually been augmented by RFC 6874, which changes the IPv6 part of the URI by adding a zone identifier. Since the zone ID consists of % and "unreserved" characters already included above, it doesn't change the set of characters allowed.

Licensed under: CC-BY-SA with attribution