Question

I'm in the very early stages of attempting to automate data entry and collection from a website. I have a 16,000 line CSV file. For each line, I'd like to enter data from that line into a textarea on a webpage. The webpage can then perform some calculations with that data and spit out an answer that I'd collect. Specifically, on the webpage http://www.mirbase.org/search.shtml, I'd like to enter a sequence in the sequence text box at the bottom, press the "Search miRNAs" button and then collect results on the next page.

My plan as of right now is to use a C# WebBrowser. My understanding is that I can access the individual elements in the HtmlDocument by id, name, or coordinate. The last option is not ideal, because if I distribute this program to other people I can't be sure the elements would sit at the same coordinates on their screens. As for the other two options, the textarea has a name, but it's the same as the form name, so I don't know how to access it. The button I'd like to click has neither a name nor an id.

Does anyone have any ideas as to how to access the elements I need? I am by no means set on this method, so if there's an easier/better way I'm certainly open to suggestions.


Solution

The WebBrowser class is not designed for this, which is why you are running into these problems.

You need to look into a tool that is designed for web automation.

Since you are using C#, Selenium has a wonderful set of C# bindings, and it can solve your problems because you'll be able to use different locators (specifically, locating an element by a CSS selector or XPath).

http://docs.seleniumhq.org/
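
As a rough illustration of the locator approach, here is a minimal sketch using the Selenium C# bindings (the Selenium.WebDriver package plus a ChromeDriver install). The file name sequences.csv, the textarea name "sequence", and the assumption that the button is an input of type submit whose value is "Search miRNAs" are all guesses made for the example; inspect the page source and adjust the selectors to match the real markup.

```csharp
using System;
using System.IO;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;

class MirbaseSearch
{
    static void Main(string[] args)
    {
        // Hypothetical input file: the sequence is assumed to be the first CSV column.
        string csvPath = args.Length > 0 ? args[0] : "sequences.csv";

        using (IWebDriver driver = new ChromeDriver())
        {
            foreach (string line in File.ReadLines(csvPath))
            {
                if (string.IsNullOrWhiteSpace(line)) continue;
                string sequence = line.Split(',')[0].Trim();

                driver.Navigate().GoToUrl("http://www.mirbase.org/search.shtml");

                // CSS selector: matches only a <textarea>, so a form that shares
                // the same name is not a problem. The name value is an assumption.
                IWebElement box = driver.FindElement(
                    By.CssSelector("textarea[name='sequence']"));
                box.Clear();
                box.SendKeys(sequence);

                // XPath: finds a button that has no id or name by its visible label.
                // Assumes the button is an <input type="submit">.
                driver.FindElement(
                    By.XPath("//input[@type='submit' and @value='Search miRNAs']")).Click();

                // Placeholder for real result parsing: grab the page text.
                string resultText = driver.FindElement(By.TagName("body")).Text;
                Console.WriteLine(sequence + "\t" + resultText.Length + " characters of result text");
            }
        }
    }
}
```

If the results page loads slowly, an explicit wait before reading the results may be needed. For 16,000 rows it is also worth pausing between requests so the site is not overloaded.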

OTHER TIPS

Check out mshtml (see the mshtml documentation on MSDN).

You can use it with the WebBrowser object.

Add a reference to Microsoft.mshtml to your project and a using mshtml; directive at the top of your class file.

Using mshtml you can easily get and set element properties.
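
For the WebBrowser route, a sketch along these lines might work in a Windows Forms project with the Microsoft.mshtml reference added. It finds the textarea and the button by tag name rather than by id or name. It assumes the sequence box is the first textarea on the page and that the button is an input of type submit labelled "Search miRNAs"; both are unverified guesses, and the sequence value is only a placeholder.

```csharp
using System;
using System.Windows.Forms;
using mshtml;

class SearchForm : Form
{
    private readonly WebBrowser browser = new WebBrowser { Dock = DockStyle.Fill };
    private const string Sequence = "UGAGGUAGUAGGUUGUAUAGUU"; // placeholder input
    private bool submitted;

    public SearchForm()
    {
        Controls.Add(browser);
        browser.DocumentCompleted += OnDocumentCompleted;
        browser.Navigate("http://www.mirbase.org/search.shtml");
    }

    private void OnDocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
    {
        if (submitted)
        {
            // Results page: placeholder for real parsing.
            if (browser.Document != null && browser.Document.Body != null)
                Console.WriteLine(browser.Document.Body.InnerText);
            return;
        }

        var doc = (IHTMLDocument3)browser.Document.DomDocument;

        // Assumption: the first <textarea> on the page is the sequence box.
        foreach (IHTMLElement el in doc.getElementsByTagName("textarea"))
        {
            ((IHTMLTextAreaElement)el).value = Sequence;
            break;
        }

        // The button has no id or name, so match it by type and label instead.
        foreach (IHTMLElement el in doc.getElementsByTagName("input"))
        {
            var input = (IHTMLInputElement)el;
            if (input.type == "submit" && input.value == "Search miRNAs")
            {
                submitted = true;
                el.click();
                break;
            }
        }
    }

    [STAThread]
    static void Main()
    {
        Application.Run(new SearchForm());
    }
}
```

Note that DocumentCompleted can fire more than once per page when frames are involved, so a real implementation would likely need a stricter check than the single submitted flag used here.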

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow