
How do you produce a regex that matches only valid URIs? The description for URIs can be found here: It doesn't need to extract any parts, just test whether a URI is valid.

(preferred format is .NET RegularExpression) (.NET version 1.1)

  • Doesn't need to check for a known protocol, just a valid one.

Current Solution:



This site looks promising:

They propose the following regex:



Does Uri.IsWellFormedUriString work for you?

The URI specification says:

The following line is the regular expression for breaking-down a well-formed URI reference into its components.


(I guess that's the same regex as in the STD66 link given in another answer.)

But breaking-down is not validating. To correctly validate a URI, one would have to translate the BNF for URIs to a regex. While some BNFs cannot be expressed as regular expressions, I think with this one it could be done. But it shouldn't be done - it would be a huge mess. It's better to use a library function.
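To make the "breaking-down is not validating" point concrete, here is a minimal Python sketch using the component-splitting regex printed in RFC 3986, Appendix B. The helper name `split_uri` is my own and purely illustrative. Because every part of that pattern is optional, it matches any input at all, so a successful match says nothing about validity.

```python
import re

# Component-splitting regex as given in RFC 3986, Appendix B.
SPLIT_RE = re.compile(
    r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?"
)

def split_uri(text):
    """Split text into URI components (hypothetical helper name).

    Every part of the pattern is optional, so match() never fails here.
    """
    m = SPLIT_RE.match(text)
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }

good = split_uri("http://www.ics.uci.edu/pub/ietf/uri/#Related")
junk = split_uri("not a uri %%%")  # still "matches": everything lands in path
```

The second call succeeds even though the input contains spaces and malformed percent-escapes, which is why this regex can split a URI that is already known to be well-formed but cannot be used to decide whether it is.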

The best and most definitive guide to this I have found is here: (In answer to your question, see the URI table entry)

All of these rules from RFC3986 are reproduced in Table 2 along with a regular expression implementation for each rule.

A JavaScript implementation of this is available here:

For reference, the URI regex is repeated below:

# RFC-3986 URI component:  URI
# (free-spacing syntax: compile with RegexOptions.IgnorePatternWhitespace)
[A-Za-z][A-Za-z0-9+\-.]* :                                      # scheme ":"
(?: //                                                          # hier-part
  (?: (?:[A-Za-z0-9\-._~!$&'()*+,;=:]|%[0-9A-Fa-f]{2})* @)?
  (?:
    \[
    (?:
      (?:
        (?:                                                    (?:[0-9A-Fa-f]{1,4}:)    {6}
        |                                                   :: (?:[0-9A-Fa-f]{1,4}:)    {5}
        | (?:                            [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:)    {4}
        | (?: (?:[0-9A-Fa-f]{1,4}:){0,1} [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:)    {3}
        | (?: (?:[0-9A-Fa-f]{1,4}:){0,2} [0-9A-Fa-f]{1,4})? :: (?:[0-9A-Fa-f]{1,4}:)    {2}
        | (?: (?:[0-9A-Fa-f]{1,4}:){0,3} [0-9A-Fa-f]{1,4})? ::    [0-9A-Fa-f]{1,4}:
        | (?: (?:[0-9A-Fa-f]{1,4}:){0,4} [0-9A-Fa-f]{1,4})? ::
        ) (?:
            [0-9A-Fa-f]{1,4} : [0-9A-Fa-f]{1,4}
          | (?: (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?) \.){3}
                (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
          )
      |   (?: (?:[0-9A-Fa-f]{1,4}:){0,5} [0-9A-Fa-f]{1,4})? ::    [0-9A-Fa-f]{1,4}
      |   (?: (?:[0-9A-Fa-f]{1,4}:){0,6} [0-9A-Fa-f]{1,4})? ::
      )
    | [Vv][0-9A-Fa-f]+\.[A-Za-z0-9\-._~!$&'()*+,;=:]+
    )
    \]
  | (?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}
       (?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)
  | (?:[A-Za-z0-9\-._~!$&'()*+,;=]|%[0-9A-Fa-f]{2})*
  )
  (?: : [0-9]* )?
  (?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
| /
  (?:    (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+
    (?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
  )?
|        (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})+
    (?:/ (?:[A-Za-z0-9\-._~!$&'()*+,;=:@]|%[0-9A-Fa-f]{2})* )*
|
)
(?:\? (?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})* )?   # [ "?" query ]
(?:\# (?:[A-Za-z0-9\-._~!$&'()*+,;=:@/?]|%[0-9A-Fa-f]{2})* )?   # [ "#" fragment ]
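The "one regular expression per grammar rule" idea can be sketched by assembling the pattern from small named fragments. The Python below is only an illustration under my own assumptions: the constant names, the `_opt_h16s` helper and `is_valid_uri` are hypothetical and not taken from the linked article, and `dec-octet` follows the RFC's stricter definition rather than the looser form in the regex above.

```python
import re

# Character-level rules (RFC 3986, section 2).
UNRESERVED = r"[A-Za-z0-9\-._~]"
PCT_ENCODED = r"%[0-9A-Fa-f]{2}"
SUB_DELIMS = r"[!$&'()*+,;=]"
PCHAR = "(?:" + "|".join([UNRESERVED, PCT_ENCODED, SUB_DELIMS, "[:@]"]) + ")"

# Host rules (section 3.2.2).
H16 = r"[0-9A-Fa-f]{1,4}"
DEC_OCTET = r"(?:25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"
IPV4 = "(?:" + DEC_OCTET + r"\.){3}" + DEC_OCTET
LS32 = "(?:" + H16 + ":" + H16 + "|" + IPV4 + ")"

def _opt_h16s(n):
    # ABNF "[ *n( h16 ":" ) h16 ]": optional run of groups before "::".
    return "(?:(?:" + H16 + ":){0," + str(n) + "}" + H16 + ")?"

IPV6 = "(?:" + "|".join([
    "(?:" + H16 + ":){6}" + LS32,
    "::(?:" + H16 + ":){5}" + LS32,
    _opt_h16s(0) + "::(?:" + H16 + ":){4}" + LS32,
    _opt_h16s(1) + "::(?:" + H16 + ":){3}" + LS32,
    _opt_h16s(2) + "::(?:" + H16 + ":){2}" + LS32,
    _opt_h16s(3) + "::" + H16 + ":" + LS32,
    _opt_h16s(4) + "::" + LS32,
    _opt_h16s(5) + "::" + H16,
    _opt_h16s(6) + "::",
]) + ")"
IPVFUTURE = r"[Vv][0-9A-Fa-f]+\.(?:" + UNRESERVED + "|" + SUB_DELIMS + "|:)+"
IP_LITERAL = r"\[(?:" + IPV6 + "|" + IPVFUTURE + r")\]"
REG_NAME = "(?:" + UNRESERVED + "|" + PCT_ENCODED + "|" + SUB_DELIMS + ")*"
HOST = "(?:" + IP_LITERAL + "|" + IPV4 + "|" + REG_NAME + ")"

# Authority and path rules (sections 3.2 and 3.3).
USERINFO = "(?:" + UNRESERVED + "|" + PCT_ENCODED + "|" + SUB_DELIMS + "|:)*"
AUTHORITY = "(?:" + USERINFO + "@)?" + HOST + "(?::[0-9]*)?"
PATH_ABEMPTY = "(?:/" + PCHAR + "*)*"
PATH_ABSOLUTE = "/(?:" + PCHAR + "+" + PATH_ABEMPTY + ")?"
PATH_ROOTLESS = PCHAR + "+" + PATH_ABEMPTY
HIER_PART = ("(?://" + AUTHORITY + PATH_ABEMPTY + "|" + PATH_ABSOLUTE
             + "|" + PATH_ROOTLESS + "|)")  # last alternative: path-empty

# Top-level rule: URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
SCHEME = r"[A-Za-z][A-Za-z0-9+\-.]*"
QUERY_OR_FRAGMENT = "(?:" + PCHAR + "|[/?])*"
URI_RE = re.compile(SCHEME + ":" + HIER_PART
                    + r"(?:\?" + QUERY_OR_FRAGMENT + ")?"
                    + "(?:#" + QUERY_OR_FRAGMENT + ")?")

def is_valid_uri(text):
    """Return True if the whole string matches the assembled URI rule."""
    return URI_RE.fullmatch(text) is not None
```

Using `fullmatch` plays the role of the `^...$` anchors that the standalone regex needs when it is used for validation rather than searching. Keeping each rule as its own fragment also makes it easier to compare the pattern against the RFC grammar line by line.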

Are there some specific URIs you care about or are you trying to find a single regex that validates STD66?

I was going to point you to this regex for parsing a URI. You could then, in theory, check to see if all of the elements you care about are there.

But I think bdukes' answer is better.

The best regex I came up with according to RFC 3986 was the following:

Flow diagram of regex using

// named groups

// unnamed groups

capture groups

  1. scheme
  2. authority
  3. userinfo
  4. host
  5. port
  6. path
  7. query
  8. fragment
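
To show how the eight groups listed above line up with a real URI, here is a simplified Python sketch with named groups. It is not the regex from this answer: the character classes are deliberately loose, the pattern only splits and does not validate, and the name `URI_PARTS` is hypothetical.

```python
import re

# Simplified, non-validating pattern; group names follow the list above.
URI_PARTS = re.compile(r"""
    ^(?P<scheme>[A-Za-z][A-Za-z0-9+.\-]*):
    (?://(?P<authority>
        (?:(?P<userinfo>[^@/?#]*)@)?
        (?P<host>\[[^\]/?#]*\]|[^:/?#]*)
        (?::(?P<port>[0-9]*))?
    ))?
    (?P<path>[^?#]*)
    (?:\?(?P<query>[^#]*))?
    (?:\#(?P<fragment>.*))?$
""", re.VERBOSE)

m = URI_PARTS.match("https://user:pw@example.com:8080/a/b?x=1#frag")
parts = m.groupdict()
# parts["host"] is "example.com", parts["port"] is "8080", and so on.
```

Named groups keep the calling code readable (`m.group("host")` instead of `m.group(4)`), and the numbered order still follows the list above if positional access is preferred.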
Licensed under: CC-BY-SA with attribution