Match urlencoded data from file

https://stackoverflow.com/questions/21441314

04-10-2022

Question

Is there a way to match HTTP POST urlencoded data (Content-Type: application/x-www-form-urlencoded) in a file? The matched strings contain only printable characters, urlencoded escapes (a % followed by two hex digits), the & separator between variables in HTTP POST/GET data, and of course the = between a variable name and its value. As an example, here is some random text containing the data I need to match:

Death there mirth way the noisy merit. Piqued shy spring nor six though mutual living ask extent. Replying of dashwood advanced ladyship smallest disposal or. Attempt offices own improve now see. Called person are around county talked her esteem. Those fully these way nay thing seems. website=http%3A%2F%2Fwww.test.com%2F&number=1037319821&comment=Test+mea&gender=male&submit=Submit Ye on properly handsome returned throwing am no whatever. In without wishing he of picture no exposed talking minutes. Curiosity continual belonging offending so explained it exquisite. Do remember to followed yourself material mr recurred carriage. High drew west we no or at john. About or given on witty event. Or sociable up material bachelor bringing landlord confined. Busy so many in hung easy find well up. So of exquisite my an explained remainder. Dashwood denoting securing be on perceive my laughing so. id=1234&variable=test&firstname=John&lastname=Doe&gender=male&submit=Submit

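The data to match are the two key=value strings, the one starting with website= and the one starting with id= (shown in bold in the original post). I tried many approaches but couldn't come up with a regex built from %[A-F]{2}, & and = (or something similar) that matches them generically.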

Solution

This ought to get you most of the way there.

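import re

# A key made of URL-safe characters, then "=", then everything up to the next whitespace.
x = re.compile(r"([A-Za-z0-9%./]+=\S+)")
out = x.findall(input_str)  # input_str is assumed to hold the text read from the file

# out = ['website=http%3A%2F%2Fwww.test.com%2F&number=1037319821&comment=Test+mea&gender=male&submit=Submit', 'id=1234&variable=test&firstname=John&lastname=Doe&gender=male&submit=Submit']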
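If the pattern above turns out to be too permissive (it accepts anything after the = up to the next whitespace), a stricter variant can spell out the structure described in the question: key=value pairs joined by &, where each part consists only of unreserved characters, + or %XX escapes. The following is a minimal sketch under that assumption, not part of the original answer; the names TOKEN, FORM_DATA and find_form_data are made up for illustration.

```python
import re
from urllib.parse import parse_qsl

# One urlencoded character: an unreserved character, '+' (encoded space),
# or a percent escape with two hex digits.
TOKEN = r"(?:[A-Za-z0-9._~+-]|%[0-9A-Fa-f]{2})"

# One or more key=value pairs joined by '&'; a value may be empty.
FORM_DATA = re.compile(rf"{TOKEN}+={TOKEN}*(?:&{TOKEN}+={TOKEN}*)*")


def find_form_data(text):
    """Return every substring of text that looks like urlencoded form data."""
    return FORM_DATA.findall(text)


sample = "way nay thing seems. id=1234&comment=Test+mea&submit=Submit Ye on properly"
matches = find_form_data(sample)
print(matches)
# ['id=1234&comment=Test+mea&submit=Submit']

# The standard library can then decode each match into (name, value) pairs.
print(parse_qsl(matches[0]))
# [('id', '1234'), ('comment', 'Test mea'), ('submit', 'Submit')]
```

Because . is a legal unescaped character, a sentence-ending period directly after the data would still be included in the match, so some post-processing may be needed depending on the input.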

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow