Question

This is my first time posting a question, so please forgive me if I do something incorrectly.

I am trying to write a Python Selenium script that gets the source code of multiple web pages.

I run the script from the command line on Windows 7 like this:

python program.py < input.txt > output.htm

This does create the result. However, because the script runs in a loop, the output of every page ends up appended to the same file.

Is there a way to create a new file for each result/print?

Thanks in advance.

Here is my code:

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
from selenium.webdriver.common.action_chains import ActionChains

path_to_chromedriver = '/Users/office/Desktop/chromedriver' # change path as needed
browser = webdriver.Chrome(executable_path = path_to_chromedriver)
while True:
    url = raw_input("")
    url2 = raw_input("")
    browser.get(url)
    time.sleep(10)
    browser.get(url2)
    time.sleep(10)

    element_to_hover_over = browser.find_element_by_xpath('//*[@id="personSummaryTable"]/tbody/tr/td[2]/div[5]/div/span[1]/a')
    hover = ActionChains(browser).move_to_element(element_to_hover_over)
    hover.perform()
    time.sleep(5)
    stuff = browser.page_source.encode('ascii', 'ignore')
    print stuff

Jan's idea worked great.

All it needed was to let Python decide on a file name. Thanks, Jan.

import datetime
suffix = ".html"
basename = datetime.datetime.now().strftime("%y%m%d_%H%M%S")
fname = basename + suffix # e.g. '120508_171442.html'
print fname
with open(fname, "w") as f:
    f.write(stuff)

Solution

Welcome to SO.

You have two options:

Let the Python code decide on the name of the output file

The name will likely be based on the current time, e.g.:

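import time
# here, obtain your page content somehow; the value below is only a placeholder
page_content = "<html></html>"  # in the question: browser.page_source.encode('ascii', 'ignore')
prefix = "output-"
suffix = ".html"
# int() is needed because the "d" format code rejects the float returned by time.time()
fname = "{prefix}{now:d}{suffix}".format(prefix=prefix, now=int(time.time()), suffix=suffix)
print fname
with open(fname, "w") as f:
    f.write(page_content)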
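
Inside a loop, a name built only from a one-second timestamp could repeat if two pages are saved within the same second. As a rough sketch of one way around that, the snippet below adds a running index to the name. The helper names make_output_name and save_pages are hypothetical, not part of the question or of Selenium, and the pages are assumed to be byte strings such as the result of browser.page_source.encode('ascii', 'ignore').

```python
import os
import time


def make_output_name(index, prefix="output-", suffix=".html", now=None):
    """Build a name from a whole-second timestamp plus a running index."""
    # 'now' can be passed in explicitly; otherwise the current time is used.
    stamp = int(time.time() if now is None else now)
    return "{prefix}{stamp:d}-{index:03d}{suffix}".format(
        prefix=prefix, stamp=stamp, index=index, suffix=suffix)


def save_pages(pages, directory="."):
    """Write each page (a byte string) to its own file; return the paths."""
    written = []
    for index, content in enumerate(pages, start=1):
        path = os.path.join(directory, make_output_name(index))
        # Binary mode, because the content is assumed to be already encoded.
        with open(path, "wb") as f:
            f.write(content)
        written.append(path)
    return written
```

In the question's loop, the same idea would amount to keeping a counter and calling make_output_name(counter) each time a page has been fetched, instead of printing the page to standard output.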

Let your external loop (outside of Python) create the file name

On Linux, for example, the file name can be generated with some form of the date command.
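
For this second option the script has to receive the name chosen by the external loop somehow. One possibility, which is an assumption and not something the answer spells out, is to pass the name as the first command-line argument and read the URLs from standard input as before. The function write_result below is a hypothetical helper, and the content is assumed to be an already encoded byte string.

```python
import sys


def write_result(argv, content):
    """Write content to the file named by the first command-line argument."""
    if len(argv) < 2:
        # No output name was supplied by the caller.
        raise SystemExit("usage: program.py OUTPUT_FILE < input.txt")
    out_path = argv[1]
    with open(out_path, "wb") as f:
        f.write(content)
    return out_path


# In the script itself this would be called once per run, for example:
#     write_result(sys.argv, stuff)
# where 'stuff' is the encoded page source from the question.
```

The external loop would then call the script once per input with a different file name each time, so the script no longer needs the output redirection.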

Licensed under: CC-BY-SA with attribution