Question

I have a page, say abc.html, that contains a small form with some fields.

<form name="form" method="post" action="abc.html">.......................</form>

When the form is submitted, the data is posted back to abc.html, and the page then shows the resulting names produced by processing the posted data.

Throughout this process the page URL stays the same. I now want to parse abc.html as it appears after the form has been submitted. I have parsed pages before where the original URL already contains all the data, but not a page like this one, where the data only appears after submission. How can I parse such a page?


Solution

Well, to get the correct HTML from the server, you have to send a POST request containing the form data. Then you can parse the server response.
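As a rough illustration of the POST step, here is a minimal sketch using PHP's HTTP stream wrapper. It assumes `allow_url_fopen` is enabled; the function name `postForm`, the URL, and the field names in the usage comment are placeholders, not details of the real page.

```php
<?php
// Sketch: submit form fields with an HTTP POST and return the response HTML.
// postForm() is a hypothetical helper name; requires allow_url_fopen = On.
function postForm(string $url, array $fields): string
{
    $context = stream_context_create([
        'http' => [
            'method'  => 'POST',
            'header'  => "Content-Type: application/x-www-form-urlencoded\r\n",
            // Encode the fields the same way a browser encodes a simple form
            'content' => http_build_query($fields),
            'timeout' => 10,
        ],
    ]);

    $html = file_get_contents($url, false, $context);
    if ($html === false) {
        throw new RuntimeException("POST to $url failed");
    }
    return $html;
}

// Usage (placeholder URL and field values):
// $html = postForm('http://example.com/abc.html', ['foo' => 'bar', 'submit' => 'Press Me']);
```

The returned string is the page as the server renders it after submission, which can then be handed to whatever parser you already use.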

OTHER TIPS

Parsing the HTML is the same as reading what we see in the browser. The page rendered after the data is posted will contain some HTML element in which the additional text is displayed. When you parse the page, check whether this element or its container exists; if it does, read the rest of the data from it. The page displayed without posted data will not contain this additional element or container.

Edit: Look at this question: PHP Screen Scraping and Sessions
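The container check described in this answer could look roughly like the sketch below. The id `results` and the function name `extractResults` are assumptions made for illustration; substitute whatever element the real page emits after a submission.

```php
<?php
// Sketch: return the text of the result container, or null if the page has none.
// The id 'results' is an assumed name, not something known about the real page.
function extractResults(string $html, string $containerId = 'results'): ?string
{
    $dom = new DOMDocument();
    // Real-world HTML is often malformed; collect parse warnings instead of printing them
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $nodes = $xpath->query(sprintf('//*[@id="%s"]', $containerId));
    if ($nodes === false || $nodes->length === 0) {
        return null; // page was rendered without posted data
    }
    return trim($nodes->item(0)->textContent);
}
```

A `null` return means the page is the plain form; a string means the posted data was processed and the container is present.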

First of all, your page should be abc.php; otherwise the server will not parse any PHP in it (unless it has been configured to run PHP in .html files).

Second, here is some code that will hopefully help you out. Copy and paste this example into abc.php:

<html>
<head></head>
<body>
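<?php
// Runs only when the form was submitted via POST
if (isset($_POST['submit'])) {
  // Escape the value before echoing it back to avoid HTML injection
  echo 'You posted the following value: ' . htmlspecialchars($_POST['foo'] ?? '');
}
?>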
<form name="form" action="abc.php" method="post">
<input type="text" name="foo" value="" />
<input type="submit" name="submit" value="Press Me" />
</form>
</body>
</html>

If this is not the case and you want to parse HTML the way you would parse XML, you should use PHP's DOMDocument class:

$oDom = new DOMDocument();
// load the markup from a string...
$oDom->loadHTML($sHTMLstring);
// ...or from a file
$oDom->loadHTMLFile($sFileName);
// now you can walk the DOM, e.g. fetch all <form> elements (returns a DOMNodeList)
$oForms = $oDom->getElementsByTagName('form');

http://nl.php.net/manual/en/domdocument.loadhtml.php
http://nl.php.net/manual/en/domdocument.loadhtmlfile.php
http://nl.php.net/manual/en/domdocument.getelementsbytagname.php

Hope this helps

Good question, but I don't think it is possible with PHP. My company does this with a very advanced tool written in C: it grabs any page, submits any form, and gets the response HTML back. You may be able to find some existing tools for this, but I don't know of any.

I think the point here is that you can't just open the URL and read the HTML that comes back. You will have to play the part of the browser in order to interact with the server side form. To do this, you'll have to write your own code to HTTP POST the form input data. The HTTP response to your POST will contain the generated HTML, which you can then parse for the processed results.

If you want to send the form to the web server (i.e. "fill" it first), you need something similar to Perl's WWW::Mechanize. See this question for possible solutions to do this. Afterwards, you need to parse the resulting page, and that depends heavily on the site in question: one site might use named elements you can easily retrieve using regular expressions, while a different site might not, making it much harder to get the values you're interested in.
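To give an idea of the "fill it first" step without a Mechanize-style library, the sketch below reads the default values of a form's inputs so they can be overridden and posted back. The function name `formDefaults` is hypothetical, and the sketch only looks at `<input>` elements of the first form; selects, textareas, and checkbox state are deliberately ignored.

```php
<?php
// Sketch: collect name => value pairs from the <input> elements of the first form.
// formDefaults() is a hypothetical helper, not part of any library.
function formDefaults(string $html): array
{
    $dom = new DOMDocument();
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $form = $dom->getElementsByTagName('form')->item(0);
    if ($form === null) {
        return []; // no form on the page
    }

    $fields = [];
    foreach ($form->getElementsByTagName('input') as $input) {
        $name = $input->getAttribute('name');
        if ($name !== '') {
            $fields[$name] = $input->getAttribute('value');
        }
    }
    return $fields;
}

// Usage: start from the defaults, then set the values you want to submit.
// $fields = formDefaults($pageHtml);
// $fields['foo'] = 'bar';
```

Starting from the page's own defaults keeps hidden fields and the submit button's name in the request, which some server-side scripts check for.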

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow