Problem

I'm trying to verify that all my page links are valid, and also, similarly, that every page has a specific link such as "Contact". I use Python unit testing and the Selenium IDE to record the actions that need to be tested. So my question is: can I verify the links in a loop, or do I need to try every link on my own? I tried to do this with __iter__ but it didn't get anywhere close (there may be a reason for that: I'm poor at OOP), but I still think there must be another way of testing links than clicking and recording them one by one.


Solution

I would just use standard shell commands for this:

  • You can use wget to detect broken links
  • If you use wget to download the pages, you can then scan the resulting files with grep --files-without-match to find those that don't have a contact link.

If you're on Windows, you can install Cygwin or the Win32 ports of these tools.
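The second bullet above can be sketched in Python as well: after mirroring the site with wget, a small script can replicate `grep --files-without-match` to flag pages that lack a contact link. This is a minimal sketch; the `*.html` glob and the literal string "contact" are assumptions about your site's layout.

```python
import pathlib

def files_without_contact(root):
    """Return downloaded pages under `root` that lack a 'contact' link,
    mimicking `grep --files-without-match contact *.html`."""
    missing = []
    for path in pathlib.Path(root).rglob("*.html"):
        text = path.read_text(errors="ignore").lower()
        if "contact" not in text:
            missing.append(path)
    return missing
```

Run it on the directory wget created, e.g. `files_without_contact("yourdomain.com")`.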

EDIT: Embedded info from the "use wget to detect broken links" link above:

Whenever we release a public site, it's always a good idea to run a spider on it; this way we can check for broken pages and bad URLs. wget has a recursive download mode, and combined with the --spider option it will just crawl the site.

1) Download WGET

    Mac:
    http://www.statusq.org/archives/2008/07/30/1954/
    Or use MacPorts to install wget.

    Windows:
    http://gnuwin32.sourceforge.net/packages/wget.htm

    Linux:
    Comes built in
    ----------------------------------------

2) In your console / terminal, run (without the $):

    $ wget --spider -r -o log.txt http://yourdomain.com

3) After that, just locate your "log.txt" file; at the very bottom of the file will be a list of broken links, how many links there are, etc.
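If you want to pull those broken URLs out of the log programmatically, here is a rough Python sketch. The exact wording of wget's summary varies by version, so the "broken link" marker used here is an assumption; adjust it to match your log.

```python
def broken_links_from_log(log_text):
    """Extract URLs listed after wget's broken-link summary line.
    Assumes the spider log ends with a 'Found N broken links' section
    followed by one URL per line (wget versions differ -- verify yours)."""
    urls = []
    in_summary = False
    for line in log_text.splitlines():
        if "broken link" in line.lower():
            in_summary = True
            continue
        if in_summary and line.startswith("http"):
            urls.append(line.strip())
    return urls
```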

Other tips

Though the tool is in Perl, have you checked out linklint? It's a tool which should fit your needs exactly. It will parse links in an HTML doc and will tell you when they are broken.

If you're trying to automate this from a Python script, you'd need to run it as a subprocess and get the results, but I think it would get you what you're looking for.
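A minimal sketch of driving an external checker like linklint from Python with `subprocess`. The linklint flags mentioned in the comment are illustrative, not verified against its manual; the helper just runs whatever command list you hand it and returns the result.

```python
import subprocess

def run_checker(cmd):
    """Run an external link checker (e.g. linklint) as a subprocess
    and return (exit code, stdout). `cmd` is the full argument list,
    e.g. ["linklint", "-root", "site/", "/@"] -- flags shown here are
    illustrative; check linklint's docs for the real ones."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout
```

You would then parse the returned output (or linklint's report files) inside your unit test and assert that no broken links were reported.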

What exactly does "testing links" mean?

If it means verifying that they lead to non-4xx URIs, I'm afraid you have to visit them.

As for the existence of specific links (like "Contact"), you can look for them using XPath.
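For example, here is a stdlib sketch of that XPath-style check, using ElementTree's limited XPath support. It assumes well-formed markup; real-world HTML usually needs a forgiving parser such as lxml's HTML mode instead.

```python
from xml.etree import ElementTree

def has_contact_link(html):
    """Return True if the page contains an <a> whose text is 'Contact'.
    Uses ElementTree's limited XPath (.//a); assumes well-formed markup."""
    root = ElementTree.fromstring(html)
    return any((a.text or "").strip().lower() == "contact"
               for a in root.iterfind(".//a"))
```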

You could, as yet another alternative, use BeautifulSoup to parse the links on your page and try to retrieve them via urllib2.
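A stdlib approximation of that approach, using `html.parser` in place of BeautifulSoup (which is a third-party package) and Python 3's `urllib.request` in place of the Python 2 urllib2 mentioned above:

```python
from html.parser import HTMLParser
from urllib.request import urlopen
from urllib.error import HTTPError, URLError

class LinkCollector(HTMLParser):
    """Collect every href on a page -- roughly what BeautifulSoup's
    soup.find_all('a') would give you."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

def check_links(html):
    """Return the list of hrefs found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links

def is_alive(url):
    """Try to retrieve a URL; False on HTTP errors or network failure."""
    try:
        urlopen(url, timeout=10)
        return True
    except (HTTPError, URLError):
        return False
```

In a unit test you would fetch each page, run `check_links` on it, and assert `is_alive(link)` for every collected link (resolving relative URLs first, e.g. with `urllib.parse.urljoin`).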

License: CC-BY-SA with attribution
Not affiliated with StackOverflow