Question

Background:

  • I'm on Mac OS X 10.8.5

  • python -V says I'm running 2.7.2

  • pip freeze says I have beautifulsoup4==4.3.2 installed.

I'm trying to use Beautiful Soup 4 to scrape a web page, per this tutorial: http://www.pythonforbeginners.com/python-on-the-web/web-scraping-with-beautifulsoup/

I followed the instructions on my work laptop, and everything worked as intended. So I've done it successfully once.

But this isn't a work project, so I tried it again on my personal laptop. Same script, but on my personal laptop (and also on my wife's identically-configured laptop) here's what happens:

Melissas-MacBook:scripts Melissa$ ./spider2.py 
from: can't read /var/mail/bs4
./spider2.py: line 3: import: command not found
./spider2.py: line 4: import: command not found
./spider2.py: line 6: syntax error near unexpected token `('
./spider2.py: line 6: `for i in range(1,10): '

Here's my script:

from bs4 import BeautifulSoup

import requests
import time

for i in range(1,10): 
    url = "http://memegenerator.net/Futurama-Fry/images/popular/alltime/page/%d" % (i)
    r = requests.get(url)
    data = r.text
    soup = BeautifulSoup(data)
    results = ""
    for link in soup.find_all('img'):
        print(link.get('alt'))

I tried uninstalling via pip, and reinstalling with easy_install. Again, the installation appeared to work (according to pip freeze) but the script threw the same error again.

The error does say "can't read /var/mail/bs4". Why would it expect to find bs4 there? I confirmed with "ls" that /var/mail/ is indeed empty. Getting desperate, I tried "sudo find / -atime +1 | grep bs4", but that didn't reveal anything interesting (or even the location of bs4, for that matter).

Is the error saying that Python doesn't understand what the import command is? If so, how would that happen? Is import not standard, or does it depend on some library?

What am I missing? Where should I look next? Is this an easy answer? (It usually is, but I just can't see it.) I am a relative newb to Python, and eager but not too knowledgeable with bash yet. This is also my first time posting a Stack Overflow question, so thanks in advance for any suggestions/help.


Solution

The script should be executed as:

python spider2.py

instead of:

./spider2.py

OTHER TIPS

To execute the script directly from the terminal with ./spider2.py, you have to specify an interpreter for it using the so-called shebang line at the very start of the script. For Python, that would be:

#!/usr/bin/env python

from bs4 import BeautifulSoup
# ...

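Without the interpreter being specified, the script is run by the shell, probably bash in this case, which of course cannot run Python code. That also explains the odd messages: the shell runs "from" as the Unix from utility, which lists mail for a user and so tries to read /var/mail/bs4, and then it cannot find any command called "import".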

The file also has to be marked as executable, for example with chmod +x spider2.py.

Or you can run the script with the Python interpreter directly, which needs no shebang line, as @theharshest recommended:

python spider2.py

I myself prefer the latter option.
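
If you do want to keep running the script as ./spider2.py, the two requirements mentioned above (a shebang line and the executable bit) can be checked from Python itself. This is only a minimal sketch, not something from the original answers: the helper name check_direct_execution is made up for illustration, and it only looks for the word "python" in the first line instead of resolving the interpreter path.

```python
import os
import stat


def check_direct_execution(path):
    """Return a list of reasons why running `./script` may not start Python.

    Hypothetical helper for illustration: it inspects only the first line
    of the file and the permission bits, nothing else.
    """
    problems = []

    # The shebang must be the very first bytes of the file.
    with open(path, "rb") as f:
        first_line = f.readline()
    if not first_line.startswith(b"#!"):
        problems.append("no shebang line")
    elif b"python" not in first_line:
        problems.append("shebang does not mention python")

    # At least one execute bit (user, group, or other) must be set.
    mode = os.stat(path).st_mode
    if not mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH):
        problems.append("not executable")

    return problems
```

Calling check_direct_execution("spider2.py") on the script from the question would be expected to report the missing shebang line; an empty list means neither of the two problems was found.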

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow