Question

Given a regexp, I would like to generate random matching data some number of times in order to test something.

e.g.

>>> print generate_data('\d{2,3}')
13
>>> print generate_data('\d{2,3}')
422

Of course, the objective is to do something a bit more complicated than that, such as generating phone numbers and email addresses.

Does something like this exist? If it does, does it exist for Python? If not, is there any clue or theory I could use to do that?


Solution

Pyparsing includes a regex inverter (invRegex.py, one of its example scripts), which returns a generator of all permutations for simple regexes. Here are some of the test cases from that module:

[A-C]{2}\d{2}
@|TH[12]
@(@|TH[12])?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9]))?
@(@|TH[12]|AL[12]|SP[123]|TB(1[0-9]?|20?|[3-9])|OH(1[0-9]?|2[0-9]?|30?|[4-9]))?
(([ECMP]|HA|AK)[SD]|HS)T
[A-CV]{2}
A[cglmrstu]|B[aehikr]?|C[adeflmorsu]?|D[bsy]|E[rsu]|F[emr]?|G[ade]|H[efgos]?|I[nr]?|Kr?|L[airu]|M[dgnot]|N[abdeiop]?|Os?|P[abdmortu]?|R[abefghnu]|S[bcegimnr]?|T[abcehilm]|Uu[bhopqst]|U|V|W|Xe|Yb?|Z[nr]
(a|b)|(x|y)

Edit:

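To do your random selection, create a list (once!) of your permutations, and then call random.choice on the list each time you want a random string that matches the regex. Note that this only works when the regex matches a finite, reasonably small set of strings. Something like this (untested):

import random
import invRegex  # invRegex.py from the pyparsing examples, assumed to be on the import path

class RandomString(object):
    def __init__(self, regex):
        # Enumerate every matching string once, up front
        self.possible_strings = list(invRegex.invert(regex))
    def random_string(self):
        return random.choice(self.possible_strings)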
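
To see the enumerate-once-then-choose idea without depending on invRegex, here is a minimal sketch that builds the list by hand for one of the test patterns above, [A-C]{2}\d{2}. The names possible_strings and random_string are made up for illustration, and the itertools.product call is only a stand-in for what the inverter would produce:

```python
import itertools
import random
import re
import string

# Hand-built stand-in for list(invRegex.invert(r'[A-C]{2}\d{2}')):
# two letters from A-C followed by two digits.
letters = "ABC"
possible_strings = [
    "".join(parts)
    for parts in itertools.product(letters, letters, string.digits, string.digits)
]

def random_string():
    # The list is built once above; each call is just a cheap random pick.
    return random.choice(possible_strings)

if __name__ == "__main__":
    sample = random_string()
    # Sanity check: every generated value should match the original pattern.
    print(sample, bool(re.fullmatch(r"[A-C]{2}\d{2}", sample)))
```

The list here has 3 * 3 * 10 * 10 = 900 entries, which is fine. For patterns with wide ranges or large repeat counts the list grows very quickly, so this approach is best kept to small patterns.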

OTHER TIPS

There is a post on the Python mailing list about a module that generates all permutations of a regex. I'm not so sure how you might go about randomising it though. I'll keep checking.

I will probably be flogged for suggesting this, but Perl has a module that does exactly this. You might want to take a look at its code to see how to implement it in Python:

http://p3rl.org/String::Random
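
As a rough idea of what a Python version might look like, here is a sketch that generates one random match directly instead of enumerating all of them. It is not a port of String::Random. It assumes you are willing to lean on the parser that the standard re module uses internally (sre_parse, or re._parser on Python 3.11+), which is a private, undocumented interface and may change between versions. The names random_match and MAX_UNBOUNDED are hypothetical, and only a small subset of regex syntax is handled:

```python
import random
import re
import string

try:
    import re._parser as sre_parse   # Python 3.11+ (private module)
except ImportError:
    import sre_parse                 # older Python versions

MAX_UNBOUNDED = 8  # hypothetical cap on extra repeats for *, + and {n,}

def _pick_from_set(items):
    """Pick one character from a parsed character class such as [A-C] or \\d."""
    chars = []
    for op, av in items:
        if op == sre_parse.LITERAL:
            chars.append(chr(av))
        elif op == sre_parse.RANGE:
            lo, hi = av
            chars.extend(chr(c) for c in range(lo, hi + 1))
        elif op == sre_parse.CATEGORY and av == sre_parse.CATEGORY_DIGIT:
            chars.extend(string.digits)
        else:
            # Negated classes, \w, \s and so on are left out of this sketch.
            raise NotImplementedError("unsupported set item: %r" % (op,))
    return random.choice(chars)

def _generate(parsed):
    """Walk the parsed pattern and build one matching string."""
    out = []
    for op, av in parsed:
        if op == sre_parse.LITERAL:
            out.append(chr(av))
        elif op == sre_parse.IN:
            out.append(_pick_from_set(av))
        elif op == sre_parse.MAX_REPEAT:
            lo, hi, sub = av
            hi = min(hi, lo + MAX_UNBOUNDED)  # keep unbounded repeats finite
            for _ in range(random.randint(lo, hi)):
                out.append(_generate(sub))
        elif op == sre_parse.SUBPATTERN:
            # The group's contents are the last element of the tuple.
            out.append(_generate(av[-1]))
        elif op == sre_parse.BRANCH:
            # av is (None, [alternative, ...]); pick one alternative.
            out.append(_generate(random.choice(av[1])))
        else:
            raise NotImplementedError("unsupported construct: %r" % (op,))
    return "".join(out)

def random_match(pattern):
    """Return one random string intended to match `pattern`."""
    return _generate(sre_parse.parse(pattern))

if __name__ == "__main__":
    for _ in range(3):
        print(random_match(r"\d{2,3}"))
```

Because nothing is enumerated up front, this also copes with patterns whose set of matches is too large to list. Note that the choice is not uniform over all matching strings: each alternative and repeat count is picked independently. For real test data it is worth checking each result with re.fullmatch.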

Licensed under: CC-BY-SA with attribution