
I am trying to retrieve data that a site loads through a POST request. I have tried many combinations of Nokogiri, URI, and Mechanize, but I only ever get the data from the GET request, so I never see the content of the div I am interested in.

Below is the body returned by the GET request to this site. I am looking for the content of the div with id="list2", which should hold a table of users and their telephone numbers.

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
<html xmlns="">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<meta name="Description" content="Wyszukiwarka"  />
<meta name="Author" content="LR"  />
<link href="styleblue.css" rel="stylesheet" type="text/css" />
<script type="text/javascript" src="includes/scripts.js"></script>
<script type="text/javascript" src="includes/jquery-1.6.1.min.js"></script>
<script type="text/javascript" src="includes/jquery.form.js"></script>
<link rel="stylesheet" type="text/css" href="img/themes/blue/style.css" />
<link rel="stylesheet" type="text/css" href="img/themes/ui/smoothness/jquery-ui-1.8.13.custom.css" media="screen"/>
<script type="text/javascript" src="includes/jquery-ui-1.8.13.custom.min.js"></script>
<script type="text/javascript" src="includes/ui.datepicker-pl.js"></script>

<script type="text/javascript">
<body><table style="width: 100%; margin: 0px; padding: 0px; vertical-align:top" cellpadding="0" cellspacing="0">
  <tr class="hideen">
    <td style="width: 100%"><table cellpadding="0" cellspacing="0" style="width:100%; margin:0px; padding:0px;">
          <td id="top_left_login" style="height: 101px"></td>
          <td style="height: 101px"><img alt="" src="img/top.jpg" /></td>
          <td id="top_right_login" style="height: 101px"><div style="position:relative; width:194px; left:-207px; bottom:36px; text-align:right ">Czwartek&nbsp;&nbsp;&nbsp;<span style="color:#FFFFFF;">03-04-2014</span></div></td>
  <tr  class="hideen">
    <td id="menu"><div >
        <img src="img/blue/mline.jpg" border="0" alt="" /><a href="index.php">Wyszukiwarka</a><img src="img/blue/mline.jpg" border="0" alt="" /><a href="aktualizacja.php">Aktualizacja danych</a><img src="img/blue/mline.jpg" border="0" alt="" /><a href="pomoc.php">Pomoc</a><img src="img/blue/mline.jpg" border="0" alt="" />     

      <br /><br />
        <div id="list2">I LOOKING FOR THIS DIV</div>

        <br />
      <blockquote style="font-size:10px ">
        * aktualizacje <br/>
        <img src="img/plus.gif" width="18" height="18" /> 

  <tr class="hideen">
    <td style="width: 100%"><div id="bottom" align="center"><img src="img/bzit.jpg" width="225" height="42" border="0" alt="" /></div></td>

When I inspect the site in Firebug, I see GET url/index.php and POST url/grids/search.php. The site is on a local network. In the XHR tab, the POST to search.php shows these headers:

Response headers:

    Connection: Keep-Alive
    Content-Type: text/html
    Date: Thu, 03 Apr 2014 05:31:44 GMT
    Keep-Alive: timeout=15, max=100
    Server: Apache
    Transfer-Encoding: chunked
    X-Powered-By: PHP/5.2.5

Request headers:

    Accept: */*
    Accept-Encoding: gzip, deflate
    Accept-Language: pl,en-US;q=0.7,en;q=0.3
    Cache-Control: no-cache
    Connection: keep-alive
    Content-Length: 99
    Content-Type: application/x-www-form-urlencoded; charset=UTF-8
    Host: url
    Pragma: no-cache
    Referer: url/index.php
    User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:28.0) Gecko/20100101 Firefox/28.0
    X-Requested-With: XMLHttpRequest

The Response tab then shows the response I am interested in:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "">
    <html xmlns="">
    <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
    <meta name="Description" content="Wyszukiwarka telefonów"  />
    <meta name="Author" content="LR"  />
    <link rel="stylesheet" type="text/css" href="/img/themes/blue/style.css" />


 <div id="contenttable">
    <table class="scroll" cellpadding="0" cellspacing="0" width="100%" >
      <thead >
          <td colspan="11">Lista wyników *</td>

      <tbody >

    <table class="scroll" cellpadding="0" cellspacing="0" width="100%" >
      <tbody >
      <tfoot align="center">
          <td colspan="11" style="text-align:left"><img src="img/themes/blue/images/first.png"  onclick="jQuery('#page').val(1);gridReloadTel()" /> <img src="img/themes/blue/images/prev.png" onclick="jQuery('#page').val(1);gridReloadTel()" />
            <input id="page" type="text" value="2" size="3" maxlength="5"  onkeydown="doSearchTel(arguments[0]||event)" />
            / 802 <img src="img/themes/blue/images/next.png" onclick="jQuery('#page').val(3);gridReloadTel()" /> <img src="img/themes/blue/images/last.png" onclick="jQuery('#page').val(802);gridReloadTel()" /> | wyświetl
            <select id="rows" name="rows" onchange="gridReloadTel()">
              <option value="15" selected >15</option>
              <option value="25"  >25</option>
              <option value="50"  >50</option>
              <option value="200"  >200</option> 
            | 12016 wierszy</td>

    <div style="position:absolute; top:140px; right:20px;"  class="hideen"><form action="export.php" method="post" target="_blank" id="exportform" name="exportform" >
        <a href="javascript:document.exportform.submit();" onmouseout="MM_swapImgRestore()" onmouseover="MM_swapImage('xlsex','','img/xls_down.jpg',1)"><img src="img/xls_up.jpg" name="xlsex"  border="0" id="xlsex" title="Wygeneruj spis wyb" /></a>
        <input name="sord" type="hidden" value="PRNazwa asc" /><input name="where" type="hidden" value=" 1=1 " />
        <input type="hidden" name="start" value="15" />
        <input type="hidden" name="limit" value="15" />

    <script type="text/javascript">

      var _gaq = _gaq || [];
      _gaq.push(['_setAccount', '']);

      (function() {
        var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true;
        ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '';
        var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s);


How can I retrieve the data from the div with id='contenttable'? Any answer or idea would be very helpful.


Solution

Try Mechanize:

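    require 'mechanize'
    require 'logger'

    @agent = Mechanize.new do |a|
      a.user_agent_alias = 'Windows Chrome'
      a.log = Logger.new('activity.log')
    end
    @agent.get 'url/index.php' # load the page first, as the browser does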

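Now you can submit a POST request with @agent.post('url/grids/search.php', { "foo" => "bar" }, headers), where the second argument holds the form fields and the third is a hash of request headers.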

To find the query parameters and headers to send, look at the request headers and POST body in your browser's developer tools.
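If you would rather avoid Mechanize, the same XHR can probably be reproduced with Ruby's standard library alone. The following is only a rough sketch under assumptions: `build_search_request` and `fetch_search_html` are made-up helper names, `http://intranet.example/` stands in for the real host, and the `page` and `rows` fields are guesses based on the inputs visible in the response. Take the real field names from the POST body shown in Firebug.

```ruby
require 'net/http'
require 'uri'

# Build (but do not send) a POST request that mimics the browser's XHR.
# `base_url` and the keys in `form_fields` are placeholders (assumptions);
# copy the real ones from the request shown in Firebug.
def build_search_request(base_url, form_fields)
  uri = URI.join(base_url, 'grids/search.php')
  request = Net::HTTP::Post.new(uri)
  # set_form_data URL-encodes the fields into the request body.
  request.set_form_data(form_fields)
  # Headers copied from the request headers listed in the question.
  request['Content-Type'] = 'application/x-www-form-urlencoded; charset=UTF-8'
  request['X-Requested-With'] = 'XMLHttpRequest'
  request['Referer'] = URI.join(base_url, 'index.php').to_s
  request
end

# Send the request and return the response body as a string.
# Presumably this is the HTML the page would place inside div#list2.
def fetch_search_html(base_url, form_fields)
  request = build_search_request(base_url, form_fields)
  uri = request.uri
  Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request).body
  end
end
```

Calling `fetch_search_html('http://intranet.example/', 'page' => '2', 'rows' => '15')` should then return the HTML of the search result, which you can hand to Nokogiri, for example `Nokogiri::HTML(html).at_css('#contenttable')`. If the server relies on a session cookie set by index.php, this bare request may not be enough, and Mechanize (which keeps cookies for you) is the simpler route.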

Licensed under: CC-BY-SA with attribution