Question

I'm doing some web scraping in Python and I want to remove the character "." from each element of a list. I have two approaches, but only one gives the correct output. The code is below.

import urllib2
from bs4 import BeautifulSoup
first=urllib2.urlopen("http://www.admision.unmsm.edu.pe/res20130914/A.html").read()
soup=BeautifulSoup(first)
w=[]
for q in soup.find_all('tr'):
    for link in q.find_all('a'):
        w.append(link["href"])

s = [ i.replace(".","") for i in w ]

l=[]

for t in w:
    l=t.replace(".","")

If I run print s, the output is correct, but if I run print l, it isn't.

I would like to know why s gives the correct output and l doesn't.

Solution

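In the loop, l = t.replace(".","") rebinds the name l to a new string on every iteration instead of appending to the list, so after the loop l holds only the last cleaned string. The list comprehension, by contrast, collects every result into a new list.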

Instead, try:

for t in w:
    l.append(t.replace(".",""))
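
To see the difference without hitting the network, here is a minimal sketch. The three href strings are made-up placeholders standing in for the scraped list w, not values from the actual page. print(...) is called with a single argument so the snippet should run on both Python 2 and Python 3.

```python
# Hypothetical sample hrefs standing in for the scraped list w.
w = ["./A/011/0.html", "./A/012/0.html", "./A/013/0.html"]

# Original loop: each pass rebinds l to a single string,
# so the empty list created before the loop is discarded.
l = []
for t in w:
    l = t.replace(".", "")
rebound = l  # only the last cleaned string survives

# Corrected loop: each pass appends to the same list object.
l = []
for t in w:
    l.append(t.replace(".", ""))

print(rebound)  # a single string
print(l)        # a list with one cleaned string per href
```

After the first loop, l is a str rather than a list, which is why print l in the question shows only one value.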

OTHER TIPS

You are replacing the list on each iteration, so it keeps getting overwritten. As a result, you end up with only the last element after the loop. Append to the list instead. Hope it helps!

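import urllib2
from bs4 import BeautifulSoup

# Download the page and parse it
first = urllib2.urlopen("http://www.admision.unmsm.edu.pe/res20130914/A.html").read()
soup = BeautifulSoup(first)

# Collect the href of every link found inside a table row
w = []
for q in soup.find_all('tr'):
    for link in q.find_all('a'):
        w.append(link["href"])

# Approach 1: list comprehension builds a new list of cleaned strings
s = [i.replace(".", "") for i in w]
print s

# Approach 2: explicit loop, appending each cleaned string to l
l = []
for t in w:
    l.append(t.replace(".", ""))
print l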

Cheers!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow