Question

I'm having a problem in Python with BeautifulSoup. I need to extract all files on the page that end in ".php", but they also have to be local files. They can't be from another website. This is what I have so far:

    from bs4 import BeautifulSoup
    import mechanize
    import sys

    url = sys.argv[1]

    br = mechanize.Browser()
    code = br.open(url)
    html = code.read()
    soup = BeautifulSoup(html)

This is where I get stuck. I imagine using soup.find_all to get all the "a href" tags.

Solution

Try something like this:

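    import urllib2
    from bs4 import BeautifulSoup

    # url is the page address, e.g. sys.argv[1] as in the question
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())

    # href=True skips anchors that have no href attribute (avoids a KeyError)
    for a in soup.findAll('a', href=True):
        if a['href'].endswith('.php'):
            print a['href']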
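
The loop above matches any link ending in ".php", including links to other sites, and it misses links with a query string such as "login.php?next=1". A minimal sketch of the "local only" check follows, assuming "local" means "same host as the page that was fetched". The helper names is_local_php and local_php_links are hypothetical, and the sketch is written for Python 3 (in Python 2 the same two functions live in the urlparse module).

```python
from urllib.parse import urljoin, urlparse


def is_local_php(href, base_url):
    """Return True if href points to a .php path on the same host as base_url."""
    # Resolve relative links such as "page.php" or "../a.php" against the page URL.
    absolute = urlparse(urljoin(base_url, href))
    base = urlparse(base_url)
    # Assumption: "local" means the host (netloc) matches the page's host.
    # Only the path is tested, so "?id=1" or "#top" does not hide the extension.
    return absolute.netloc == base.netloc and absolute.path.endswith('.php')


def local_php_links(hrefs, base_url):
    """Keep only the hrefs that are local .php links."""
    return [h for h in hrefs if is_local_php(h, base_url)]


if __name__ == '__main__':
    # Made-up sample links, for illustration only.
    sample = ['index.php', '/admin/login.php?next=1',
              'http://other.example/x.php', 'style.css']
    print(local_php_links(sample, 'http://example.com/dir/page.html'))
```

With the soup from the question, the list of hrefs could be built with [a['href'] for a in soup.find_all('a', href=True)] and passed in together with url.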

Other tips

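    import glob
    import os

    # raw_input returns the text as typed, so the path needs no quotes around it
    path = raw_input("Enter your path: ")
    print path
    # List the .php files in that directory on the local filesystem
    for i in glob.glob(os.path.join(path, "*.php")):
        print i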
License: CC-BY-SA with attribution
Not affiliated with StackOverflow