Question

I am using Selenium RC to cycle through a long list of URLs, sequentially writing the HTML from each URL to a CSV file. Problem: the program frequently exits at various points in the list because of "Timed out after 30000ms" exceptions. Instead of stopping when a URL times out, I would like the program to write a note of the timeout in the CSV file (in the row where the HTML for that URL would have gone) and move on to the next URL in the list. I tried adding an 'else' clause to my program (see below), but it doesn't help: the program still stops every time it hits a timeout. I also get 30000ms timeout exceptions even when I start selenium-server with a longer timeout window, e.g. "java -jar selenium-server.jar -timeout 600000".

Any advice would be much appreciated. Thank you.

from selenium import selenium
import unittest, time, re, csv, logging

class Untitled(unittest.TestCase):
    def setUp(self):
        self.verificationErrors = []
        self.selenium = selenium("localhost", 4444, "*firefox", "http://www.MainDomain.com")
        self.selenium.start()

    def test_untitled(self):
        sel = self.selenium
        spamReader = csv.reader(open('SubDomainList.csv', 'rb'))
        for row in spamReader:
            sel.open(row[0])
            sel.wait_for_page_to_load("400000")
            time.sleep(5)
            html = sel.get_html_source()
            ofile = open('output4001-5000.csv', 'ab')
            ofile.write(html + '\n')
            ofile.close
        else:
            ofile = open('outputTest.csv', 'ab')
            ofile.write("URL Timeout" + '\n')
            ofile.close

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()

Solution

Try the following:

from selenium import selenium
import unittest, time, re, csv, logging

class Untitled(unittest.TestCase):
    def setUp(self):
        self.verificationErrors = []
        self.selenium = selenium("localhost", 4444, "*firefox", "http://example.com")
        self.selenium.start()
        self.selenium.set_timeout("60000")

    def test_untitled(self):
        sel = self.selenium
        spamReader = csv.reader(open('SubDomainList.csv', 'rb'))
        for row in spamReader:
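            try:
                # open() already waits for the page to load and raises on timeout
                sel.open(row[0])
            except Exception as e:
                # Log the failure and carry on with the next URL
                ofile = open('outputTest.csv', 'ab')
                ofile.write("error on %s: %s\n" % (row[0], e))
            else:
                # Only reached when open() succeeded
                time.sleep(5)
                html = sel.get_html_source()
                ofile = open('output4001-5000.csv', 'ab')
                ofile.write(html.encode('utf-8') + '\n')
            # Whichever file was opened above gets closed here
            ofile.close()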

    def tearDown(self):
        self.selenium.stop()
        self.assertEqual([], self.verificationErrors)

if __name__ == "__main__":
    unittest.main()
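
As a side note on why the 'else' clause in the question had no effect: in Python, an else attached to a for loop runs once when the loop finishes without hitting break. It is not an error handler, so an exception raised inside the loop body skips it and propagates out of the loop. A small sketch of that behaviour, where `run` and `flaky` are made-up names standing in for the test method and the Selenium calls:

```python
def run(urls, fetch):
    log = []
    for url in urls:
        log.append(fetch(url))
    else:
        # Runs once, only after the loop finishes without hitting `break`.
        log.append("loop finished")
    return log


def flaky(url):
    # Hypothetical fetcher: fails on one particular URL.
    if url == "bad":
        raise RuntimeError("Timed out after 30000ms")
    return "ok " + url


# No failures: the else block runs after the last iteration.
completed = run(["a", "b"], flaky)

# One failure: the exception leaves the loop and the else block never runs.
try:
    run(["a", "bad", "b"], flaky)
    propagated = False
except RuntimeError:
    propagated = True
```

This is why the error handling has to go in a try-except inside the loop body, as in the code above.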

Some comments:

  • You don't need wait_for_page_to_load after open. open already waits for the page to load, so the extra call starts waiting after loading has finished; since no new page load happens, it times out.
  • Most of the failures you get from Selenium (timeouts, object not found) can be caught with try-except statements.
  • Set the timeout within the test itself (using set_timeout). That way it doesn't depend on how the server was started, and the test always waits for the time you chose.
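
The try-except point can be tried out without a Selenium server. The following is only a sketch, assuming a current Python: `scrape_all` and `fake_fetch` are hypothetical names, and `fetch` stands in for the sel.open plus sel.get_html_source pair. Unlike the code above, it writes the failure note into the same CSV, in the row where the HTML would have gone, which is what the question asked for.

```python
import csv
import io


def scrape_all(urls, fetch, out):
    """Write one CSV row per URL: the page source, or a note if fetching failed."""
    writer = csv.writer(out)
    failures = 0
    for url in urls:
        try:
            # `fetch` is a stand-in for sel.open(url) + sel.get_html_source()
            html = fetch(url)
        except Exception as e:
            # Record the failure in this URL's row, then move on to the next one.
            writer.writerow([url, "error: %s" % e])
            failures += 1
        else:
            writer.writerow([url, html])
    return failures


def fake_fetch(url):
    # Simulated fetcher for demonstration only; raises like a timed-out page would.
    if "slow" in url:
        raise Exception("Timed out after 30000ms")
    return "<html>%s</html>" % url


buf = io.StringIO()
urls = ["http://example.com/a", "http://example.com/slow", "http://example.com/b"]
failed = scrape_all(urls, fake_fetch, buf)
rows = buf.getvalue().splitlines()
```

Using csv.writer rather than a raw write also keeps HTML that contains commas or newlines inside a single cell.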
Licensed under: CC-BY-SA with attribution