How do you parse and match the keyword in a search engine URL using the Python re module?

StackOverflow https://stackoverflow.com/questions/12831537

  •  06-07-2021

Question

Example from Google:

http://www.google.com.co/url?sa=t&rct=j&q=pedro%20gomez%20proyecto%20en%20la%20ciudad%20de%20valledupar&source=web&cd=10&ved=0CFsQFjAJ&url=http%3A%2F%2Fwww.21molino.com%2F1410%2F8911.html

or from Bing search:

http://www.bing.com/search?q=10%2F30+Sand&src=IE-SearchBox&FORM=IE8SRC

I want to parse and match the ?q= or q= keywords, using (?<=)? with the Python re module. How can you parse the multiple parameters and decode the ASCII-encoded URL to UTF-8 so that it can be read?

Need some help here, thanks very much : )


Solution

Try this:

[?&]q=([^&#]*)
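
A minimal sketch of how that pattern might be applied with the re module, assuming Python 3. The helper name extract_q is made up for illustration, and the two URLs are the ones from the question. The regex only captures the raw value, so the %-escapes and "+" signs still have to be decoded separately, here with urllib.parse.unquote_plus:

```python
import re
from urllib.parse import unquote_plus

# Capture everything after "?q=" or "&q=" up to the next "&" or "#".
Q_PATTERN = re.compile(r'[?&]q=([^&#]*)')

def extract_q(url):
    """Return the decoded value of the q parameter, or None if it is absent.

    extract_q is a hypothetical helper name, not part of any library.
    """
    match = Q_PATTERN.search(url)
    if match is None:
        return None
    # unquote_plus turns %XX escapes into characters and "+" into a space.
    return unquote_plus(match.group(1))

google = ("http://www.google.com.co/url?sa=t&rct=j"
          "&q=pedro%20gomez%20proyecto%20en%20la%20ciudad%20de%20valledupar"
          "&source=web&cd=10&ved=0CFsQFjAJ"
          "&url=http%3A%2F%2Fwww.21molino.com%2F1410%2F8911.html")
bing = "http://www.bing.com/search?q=10%2F30+Sand&src=IE-SearchBox&FORM=IE8SRC"

print(extract_q(google))  # pedro gomez proyecto en la ciudad de valledupar
print(extract_q(bing))    # 10/30 Sand
```

The leading [?&] keeps the pattern from matching a parameter whose name merely ends in "q", and no lookbehind is needed because the capture group already isolates the value.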

Or, better yet:

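import urlparse  # Python 2 module; on Python 3 these functions live in urllib.parse
pr = urlparse.urlparse(url)  # url holds the full search URL as a string
qs = urlparse.parse_qs(pr.query)['q']  # list of decoded values for q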

The latter automatically decodes %-escapes, too.
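
A minimal Python 3 sketch of the same approach, using urllib.parse and the two URLs from the question. The helper name search_keyword and its param argument are assumptions made for illustration. Since parse_qs returns a list of values for each parameter, the sketch takes the first one and falls back to None when the parameter is missing:

```python
from urllib.parse import urlparse, parse_qs

def search_keyword(url, param="q"):
    """Return the first decoded value of `param` in the URL's query string.

    search_keyword is a hypothetical helper name, not a library function.
    """
    query = urlparse(url).query          # the part between "?" and "#"
    values = parse_qs(query).get(param)  # list of decoded values, or None
    return values[0] if values else None

google = ("http://www.google.com.co/url?sa=t&rct=j"
          "&q=pedro%20gomez%20proyecto%20en%20la%20ciudad%20de%20valledupar"
          "&source=web&cd=10&ved=0CFsQFjAJ"
          "&url=http%3A%2F%2Fwww.21molino.com%2F1410%2F8911.html")
bing = "http://www.bing.com/search?q=10%2F30+Sand&src=IE-SearchBox&FORM=IE8SRC"

print(search_keyword(google))         # pedro gomez proyecto en la ciudad de valledupar
print(search_keyword(bing))           # 10/30 Sand
print(search_keyword(google, "url"))  # http://www.21molino.com/1410/8911.html
```

Using .get() instead of indexing with ['q'] avoids a KeyError for URLs that carry no q parameter.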

Licensed under: CC-BY-SA with attribution