How should this regular expression mentioned in the App Engine documentation be interpreted?

StackOverflow https://stackoverflow.com/questions/23674074

  •  23-07-2023

Question

While reading through the App Engine documentation for Java, I came across this regular expression: [0-9A-Za-z._-]{0,100}. I read the Wikipedia page for regular expressions but still could not properly decode this one.

The App Engine documentation mentions the following about valid strings for namespaces:

If you do not specify a value for namespace, the namespace is set to an empty string. The namespace string is arbitrary, but also limited to a maximum of 100 alphanumeric characters, periods, underscores, and hyphens. More explicitly, namespace strings must match the regular expression [0-9A-Za-z._-]{0,100}.

Can someone please break down the regular expression so that I can understand how the pattern satisfies the namespace requirements described above?

As always, thanks a lot for helping out!!

Solution 2

Square brackets indicate that any of the characters inside the brackets can be used. This is called a character class.

[abc] would match "a", "b" or "c" but not "d".

You can also specify a range within a character class to indicate that any of the characters in the range should match.

[a-e] means the same as [abcde]

In your regular expression, [0-9A-Za-z._-] matches an alphanumeric character, period, underscore or hyphen. The three ranges 0-9, A-Z and a-z cover the numerals, uppercase letters and lowercase letters respectively.
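A minimal Java sketch of the character class on its own, in case it helps to see it run. The class name `CharacterClassDemo` and the helper `isAllowedChar` are made up for illustration; only `java.util.regex.Pattern` comes from the standard library. Note that `matches()` requires the whole input to match, so a two-character string is rejected by a pattern that describes exactly one character.

```java
import java.util.regex.Pattern;

final class CharacterClassDemo {
    // A single character from the class used in the namespace pattern.
    private static final Pattern ALLOWED_CHAR = Pattern.compile("[0-9A-Za-z._-]");

    // True only when s is exactly one character and that character is in the class.
    static boolean isAllowedChar(String s) {
        return ALLOWED_CHAR.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"a", "Z", "7", ".", "_", "-", "@", " ", "ab"};
        for (String s : samples) {
            System.out.println("\"" + s + "\" -> " + isAllowedChar(s));
        }
    }
}
```

Letters, digits, ".", "_" and "-" should come out as true; "@", a space and the two-character "ab" should come out as false.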

Curly brackets indicate how many times the preceding character can be repeated.

a{3,5} means "the character 'a', repeated 3-5 times".

I.e. it matches "aaa" and "aaaaa" but not "aa" or "aaaaaa".
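The same four strings can be checked in Java. This is only a sketch: `QuantifierDemo` and `isThreeToFiveAs` are hypothetical names, and "matches" is taken to mean that the whole string matches, which is what `Matcher.matches()` tests.

```java
import java.util.regex.Pattern;

final class QuantifierDemo {
    // The character 'a' repeated between 3 and 5 times, both bounds inclusive.
    private static final Pattern THREE_TO_FIVE_A = Pattern.compile("a{3,5}");

    // True when the whole string consists of 3, 4 or 5 'a' characters.
    static boolean isThreeToFiveAs(String s) {
        return THREE_TO_FIVE_A.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aa", "aaa", "aaaaa", "aaaaaa"}) {
            System.out.println(s + " -> " + isThreeToFiveAs(s));
        }
    }
}
```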

We can combine the curly braces with the character class to indicate we want to match any character in the character class multiple times.

[ab]{0,5} means "a mix of 'a' and 'b', between zero and five characters long".

I.e. it matches "aa", "bbb", "ababa" and "" but not "ababab" or "abc".
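Here is a small Java sketch of that combination, written without whitespace inside the braces. The names `ClassWithQuantifierDemo`, `matchesWhole` and `containsMatch` are invented for this example. It also shows why "does not match" above has to be read as "the whole string does not match": a substring search with `find()` succeeds on almost anything, because the pattern is allowed to match zero characters.

```java
import java.util.regex.Pattern;

final class ClassWithQuantifierDemo {
    // Zero to five characters, each of which is 'a' or 'b'.
    private static final Pattern UP_TO_FIVE_AB = Pattern.compile("[ab]{0,5}");

    // Whole-string match: the sense used in the explanation above.
    static boolean matchesWhole(String s) {
        return UP_TO_FIVE_AB.matcher(s).matches();
    }

    // Substring search: true if any part of s (even an empty part) matches.
    static boolean containsMatch(String s) {
        return UP_TO_FIVE_AB.matcher(s).find();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"aa", "bbb", "ababa", "", "ababab", "abc"}) {
            System.out.println("\"" + s + "\" whole=" + matchesWhole(s)
                    + " find=" + containsMatch(s));
        }
    }
}
```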

Combining these two concepts, we can see how the regex matches the text description:

[0-9A-Za-z._-]{0,100} means "a mix of 0-9, A-Z, a-z, ., _ and -, between zero and a hundred characters long"
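To tie this back to namespaces, here is a sketch of a validator built on the documented pattern. It is not part of the App Engine API: `NamespaceCheck`, `isValidNamespace` and `repeat` are hypothetical helpers, and treating `null` as invalid is an assumption made for this example.

```java
import java.util.Arrays;
import java.util.regex.Pattern;

final class NamespaceCheck {
    // The pattern quoted in the App Engine documentation.
    private static final Pattern NAMESPACE = Pattern.compile("[0-9A-Za-z._-]{0,100}");

    // True when the whole string is 0 to 100 allowed characters.
    // Assumption: null is treated as invalid here.
    static boolean isValidNamespace(String s) {
        return s != null && NAMESPACE.matcher(s).matches();
    }

    // Helper for building test strings of a given length.
    static String repeat(char c, int count) {
        char[] chars = new char[count];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(isValidNamespace(""));                 // empty namespace
        System.out.println(isValidNamespace("my-namespace_1.0")); // allowed characters only
        System.out.println(isValidNamespace("bad namespace"));    // contains a space
        System.out.println(isValidNamespace(repeat('a', 100)));   // at the length limit
        System.out.println(isValidNamespace(repeat('a', 101)));   // one over the limit
    }
}
```

The empty string passes because the lower bound of the quantifier is 0, which lines up with the documentation's statement that an unspecified namespace is the empty string.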

Other tips

Teach a man how to fish

Everyone here will probably tell you to dump this expression into a tool such as regex101.

You will not only learn what your expression means, but also see how tweaking parts of it changes the result.

regex101

Another popular online tool here is Debuggex, which draws a visualization of the expression.

Debuggex Demo

Generally, square brackets mean "one of the characters inside".

0-9, A-Z and a-z are ranges: digits, uppercase letters and lowercase letters. You can choose the endpoints yourself (so if you wanted, you could write 3-7 to match any digit from 3 to 7).

._- means "period, underscore, or hyphen"

So [0-9A-Za-z._-] means "one alphanumeric character, period, underscore, or hyphen".
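One detail worth checking by hand is why "." and "-" are taken literally here: inside square brackets a period stands only for itself, and a hyphen placed last cannot form a range. A quick Java sketch, where `LiteralsInClassDemo` and `fullMatch` are names made up for this example:

```java
import java.util.regex.Pattern;

final class LiteralsInClassDemo {
    // Whole-string match of input against regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // Outside brackets, "." matches any single character such as "x".
        System.out.println(fullMatch(".", "x"));
        // Inside brackets, "." matches only a literal period.
        System.out.println(fullMatch("[._-]", "x"));
        System.out.println(fullMatch("[._-]", "."));
        // A hyphen between two characters forms a range: a, b or c.
        System.out.println(fullMatch("[a-c]", "b"));
        // A hyphen at the end is a literal hyphen: a, c or "-".
        System.out.println(fullMatch("[ac-]", "b"));
        System.out.println(fullMatch("[ac-]", "-"));
    }
}
```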

{0,100} is a quantifier: it gives the number of times the preceding element (here, the character class) can appear, so in this case 0 to 100 times, inclusive.

Edit: Take a look at @zx81's answer too! His suggestion will be a lot more useful in the long run than my answer.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow