Question

I want to access the translation results at the following URL:

http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http%3A%2F%2Fwww.saltycrane.com%2Fblog%2F2008%2F10%2Fhow-escape-percent-encode-url-python%2F

The translation is displayed in the bottom content frame, the lower of the page's two frames. I am only interested in retrieving that bottom frame to get the translation.

Selenium for Python lets us fetch page contents via web automation:

browser.get('http://translate.google.com/#en/ar/'+hurl)
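For reference, here is a minimal sketch of how the percent-encoded address above could be built with the standard library. The helper name `build_translate_url` and the constant `TRANSLATE_BASE` are made up for illustration; the query parameters (`hl`, `sl`, `tl`, `u`) are copied from the URL in the question.

```python
from urllib.parse import quote

# Base address taken from the URL in the question.
TRANSLATE_BASE = "http://translate.google.com/translate"


def build_translate_url(page_url, source_lang="en", target_lang="ar"):
    """Return the translate URL for page_url (hypothetical helper).

    safe="" makes quote() escape ':' and '/' as well, so the whole
    page address fits into the single query parameter `u`.
    """
    encoded = quote(page_url, safe="")
    return f"{TRANSLATE_BASE}?hl=en&sl={source_lang}&tl={target_lang}&u={encoded}"


page = "http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/"
translate_url = build_translate_url(page)
```

The resulting `translate_url` can then be passed to `browser.get(...)`.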

The required frame is an iframe:

<div id="contentframe" style="top:160px"><iframe src="/translate_p?hl=en&am... name=c frameborder="0" style="height:100%;width:100%;position:absolute;top:0px;bottom:0px;"></iframe></div>

But how do I get the bottom content frame element, so that I can retrieve the translations using web automation?

I also learned that PyQuery lets us browse the contents using jQuery-style syntax.

Update:

An answer mentioned that Selenium provides a method for this:

frame = browser.find_element_by_tag_name('iframe')
browser.switch_to_frame(frame)
# get page source
browser.page_source

But it does not work for the example above: it returns an empty page.


Solution

You can use driver.switchTo().frame(1); here the argument to frame() is the index of the frame within the page. Indexes start at 0, so 1 selects the second frame, which is the one you want.

The code above is Java. In Python, you can use the line below (newer Selenium releases spell it driver.switch_to.frame(1)).

driver.switch_to_frame(1)

UPDATE

 driver.get("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/");
 driver.switchTo().frame(0);
 System.out.println(driver.findElement(By.xpath("/html/body/div/div/div[3]/h1/span/a")).getText());

Output: SaltyCrane ???????

I just tried printing the title, SaltyCrane, which is inside the iframe. It worked for me except for the ? symbols after SaltyCrane: that part of the title is Arabic, and the console was unable to display it.

The above code is in Java. The same logic should also work in Python.
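A rough Python counterpart of the Java snippet might look like the sketch below. It assumes a Selenium 4 style driver (`driver.switch_to.frame(...)` and `driver.find_element(by, value)`); the function name `text_inside_frame` is made up for illustration. The locator strategy is passed as the plain string `"xpath"`, which is the value of `By.XPATH`, so the function itself needs no imports.

```python
def text_inside_frame(driver, url, frame_index, xpath):
    """Load url, switch into the frame at frame_index and return the
    text of the element found at xpath (hypothetical helper)."""
    driver.get(url)
    driver.switch_to.frame(frame_index)  # frame indexes start at 0
    try:
        # "xpath" is the same string as selenium's By.XPATH constant.
        return driver.find_element("xpath", xpath).text
    finally:
        # Return to the top-level document so later lookups behave normally.
        driver.switch_to.default_content()


# Possible usage (needs Selenium and a browser driver installed):
#
#   from selenium import webdriver
#   driver = webdriver.Firefox()
#   url = ("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u="
#          "http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/")
#   print(text_inside_frame(driver, url, 0, "/html/body/div/div/div[3]/h1/span/a"))
#   driver.quit()
```

The frame index 0 and the XPath in the usage comment are the ones from the Java code above; whether they still match the live page is not something this sketch can guarantee.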

OTHER TIPS

Selenium provides a method for this:

frame = browser.find_element_by_tag_name('iframe')
browser.switch_to_frame(frame)
# get page source
browser.page_source
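If this returns an empty page, one possible cause (an assumption, not something confirmed for this site) is that the frame's content had not finished loading when page_source was read. The sketch below polls until the frame contains some text. The names `frame_source_when_loaded` and `visible_text_length` are made up for illustration, and the tag-stripping check is only a rough heuristic for "not empty".

```python
import re
import time


def visible_text_length(html):
    """Rough heuristic: length of what remains after removing tags."""
    return len(re.sub(r"<[^>]*>", "", html).strip())


def frame_source_when_loaded(driver, frame, timeout=10.0, poll=0.5):
    """Switch into frame, then re-read page_source until it contains
    some text or timeout seconds have passed (hypothetical helper)."""
    driver.switch_to.frame(frame)
    deadline = time.monotonic() + timeout
    while True:
        source = driver.page_source
        if visible_text_length(source) > 0:
            return source
        if time.monotonic() >= deadline:
            raise TimeoutError("frame still empty after %.1f s" % timeout)
        time.sleep(poll)


# Possible usage with a real driver (Selenium 4 style lookup):
#
#   frame = browser.find_element("tag name", "iframe")
#   html = frame_source_when_loaded(browser, frame)
```

Selenium also ships its own waiting tools, WebDriverWait together with expected_conditions.frame_to_be_available_and_switch_to_it, which wait for the frame itself to become available before switching into it.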
Licensed under: CC-BY-SA with attribution