The a in your loop is not a string; it's a dictionary-like object (specifically, a BeautifulSoup Tag). In your if statement you correctly pull the href string out of it for the comparison, but when matching the regex you pass the whole Tag. Passing the string a['href'] instead of the Tag a to the regex match will fix your runtime error:
for a in soup.findAll('a'):
    if 'http://sport.detik.com/sepakbola/read/' in a['href']:
        # Pass the href string, not the Tag itself, to re.findall
        urls = re.findall(r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+', a['href'])
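
If you want to see the difference in isolation, here is a minimal sketch that runs without BeautifulSoup installed. FakeTag is a hypothetical stand-in that only mimics the tag['href'] lookup of a real Tag, and the example URL path is made up for illustration. It shows that handing the object itself to the regex raises a TypeError, while handing it the href string works:

```python
import re

# Same pattern as above, written as a raw string and compiled once.
URL_PATTERN = re.compile(
    r'http[s]?://(?:[a-zA-Z]|[0-9]|[$-_@.&+]|[!*\(\),]|(?:%[0-9a-fA-F][0-9a-fA-F]))+'
)


class FakeTag:
    """Hypothetical stand-in for a BeautifulSoup Tag (assumption, not the real class)."""

    def __init__(self, attrs):
        self.attrs = attrs

    def __getitem__(self, key):
        # Mimics tag['href'] style attribute access.
        return self.attrs[key]


def extract_urls(tag):
    # Pass the attribute string, never the tag object, to the regex.
    return URL_PATTERN.findall(tag['href'])


# Made-up URL, used only for this illustration.
tag = FakeTag({'href': 'http://sport.detik.com/sepakbola/read/example'})

try:
    URL_PATTERN.findall(tag)  # wrong: tag is not a string
    error = None
except TypeError as exc:
    error = exc  # the regex engine only accepts str or bytes-like input

urls = extract_urls(tag)  # right: the href string is matched
print(type(error).__name__, urls)
```

With a real soup you would keep your loop as it is and only change the second argument of re.findall. Anchors without an href attribute raise a KeyError on a['href'], so soup.findAll('a', href=True) may be worth using if your pages contain any.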