How can I scrape information from ASP.NET websites when pagination and JavaScript links are used?

StackOverflow https://stackoverflow.com/questions/2449328

Question

I have been given a staff list that is supposed to be up to date, but it does not match an intranet People Finder that is written in ASP.NET.

As the information is sensitive, I am not able to access the database that the People Finder uses, so the only way I can get the information is by scraping the structure, starting with the senior staff at the top and then working through each level in turn.

Each person has a staff number that forms the URL http://intranet/peoplefinder/index.aspx?srn=ABC1234, and everyone who reports to them is listed underneath in the format <a id="gvEmployees_ctl03_lnkFullName" href="index.aspx?srn=ABC4321" target="_self">, where each URL contains the staff number and links to that person's team.

The problem arises when teams are large, because pagination is then implemented in the GridView with a URL such as <a href="javascript:__doPostBack('gvEmployees','Page$2')">2</a>.

How can I scrape this page, capturing the SRN and other details as well as the people who report to that person across all pages of the GridView, then loop through each direct report and repeat the same process until the whole list is complete?

Example HTML of a result

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

<html xmlns="http://www.w3.org/1999/xhtml" >
<head><title>
    People Finder: Name Surname
</title><link rel="stylesheet" href="/path/to/style.css" type="text/css" /><link rel="stylesheet" href="/path/to/anotherStyle.css" type="text/css" />
    <script type="text/javascript" src="/path/to/peoplefinder.js"></script>
</head>
<body>
    <form name="form1" method="post" action="/path/to/index.aspx" id="form1">
<div>
<input type="hidden" name="__EVENTTARGET" id="__EVENTTARGET" value="" />
<input type="hidden" name="__EVENTARGUMENT" id="__EVENTARGUMENT" value="" />
<input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="### ViewState ###" />
</div>

<script type="text/javascript">
<!--
var theForm = document.forms['form1'];
if (!theForm) {
    theForm = document.form1;
}
function __doPostBack(eventTarget, eventArgument) {
    if (!theForm.onsubmit || (theForm.onsubmit() != false)) {
        theForm.__EVENTTARGET.value = eventTarget;
        theForm.__EVENTARGUMENT.value = eventArgument;
        theForm.submit();
    }
}
// -->
</script>


<script src="/path/to/WebResource.axd?d=AueXWrgAf8xSxMTAt1Q4AA2&amp;t=633311832634916698" type="text/javascript"></script>

        <div class="HP3CHeader">
            <div id="LWHPBanner">
                <h1><span id="lblName">Name Surname</span></h1>
            </div>
        </div>

        <div id='CPMain'>
            <div id="mainBox">

            <div id="pnlEmployeeDetails">

                <div id='basicData'>
                    <img id="imgPhoto" class="photo" src="/path/to/photo.jpg" style="height:69px;width:69px;border-width:0px;" />
                    <span id="lblBusinessUnit">Business Unit</span>
                    <span id="lblCostCentreName">Cost Centre</span>
                    <span id="lblLocation">Location</span>

                    <a href='/path/to/checkcontactdetails.htm' target='_blank' onclick='return OpenCheckContactDetails();' >Find out how to change your details/photo.</a>
                    <div id="manager">
        <strong>Reports to: </strong><a id="hlManager" href="/path/to/index.aspx?srn=ABC1234">Name Surname</a>
    </div>
                </div>

                <div id='contactData'>

                    <div id="pnlSrn">
        <strong>Staff number:</strong> <span id="lblSrn">ABC1234</span>
    </div>


                    <div id="pnlEmailAddress">
        <strong>Email Address:</strong> <span id="lblEmailAddress">Email</span>
    </div>
                    <div style="clear: both"></div>
                </div>

</div>

            <div id="pnlGrid">

                <h3><span id="lblGridTitle">Name's team</span></h3>
            <div>
        <table class="subordinates" cellspacing="0" cellpadding="2" rules="cols" border="1" id="gvEmployees" style="border-style:None;border-collapse:collapse;">
            <tr style="color:Black;background-color:#EFF3FB;border-style:None;font-weight:bold;">
                <th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$SRN')" style="color:Black;">SRN</a></th><th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$FullName')" style="color:Black;">Full name</a></th><th scope="col"><a href="javascript:__doPostBack('gvEmployees','Sort$RACFID')" style="color:Black;">RACFID</a></th>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl02_lnkFullName" href="index.aspx?srn=1K5932" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl03_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl04_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl05_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl06_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl07_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl08_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl09_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:White;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl10_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="reports" style="background-color:#EFF3FB;border-style:None;">
                <td style="width:70px;">ABC1234</td><td>
                            <a id="gvEmployees_ctl11_lnkFullName" href="/path/to/index.aspx?srn=ABC1234" target="_self">Name Surname</a> 
                        </td><td>ABCD</td>
            </tr><tr class="PagerStyle" style="color:#000039;border-style:None;">
                <td colspan="3"><table border="0">
                    <tr>
                        <td><span>1</span></td><td><a href="javascript:__doPostBack('gvEmployees','Page$2')" style="color:#000039;">2</a></td>
                    </tr>
                </table></td>
            </tr>
        </table>
    </div>

</div>
            </div>

            <div id="searchBox">
                <strong>Search People Finder:</strong>
                <br /><br />
                <span>Forename:</span><br/>
                <span><input name="txtFirstname" type="text" id="txtFirstname" /></span><br/>
                <span>Surname:</span><br/>
                <span><input name="txtSurname" type="text" id="txtSurname" /></span><br/>
                <span>RACFID:</span><br/>
                <span><input name="txtRacfid" type="text" id="txtRacfid" /></span><br/>
                <span>Staff number:</span><br/>
                <span><input name="txtSrn" type="text" id="txtSrn" /></span><br/>
                <div class="searchBoxItem" style="text-align:center;width:100%"><input type="submit" name="btnFind" value="Search" onclick="javascript:WebForm_DoPostBackWithOptions(new WebForm_PostBackOptions(&quot;btnFind&quot;, &quot;&quot;, false, &quot;&quot;, &quot;index.aspx&quot;, false, false))" id="btnFind" title="Search for employees member" class="button" style="border-style:Outset;" /></div><br/> 
                <div>People Finder searches only UK staff.</div> 
               <!-- <div><a class="execBoardLink" href="/path/to/index.aspx?srn=ABC1234">Show Executive Board</a></div> -->
                <div style="margin-top:5px;"><a href="/path/to/phonebook" target="phoneBook" onclick='return OpenPhonebook();' title="Open Phonebook in new window">Open Phonebook</a></div>
            </div>
        </div>

        <div class="contentFooter"  style="text-align:center;">
            <table width="100%" cellpadding="0" cellspacing="0" border="0" summary="Navigation layout table">
                <tr>
                    <td align="left"><span class="linkArrow">&lt;</span> <a href="javascript:history.back();">Back</a></td>
                    <td align="center"></td>
                    <td align="right"><span class="linkArrow">^ </span><a href="#top">Top</a></td>
                </tr>
            </table>
        </div> 

<div>

    <input type="hidden" name="__PREVIOUSPAGE" id="__PREVIOUSPAGE" value="vy066Txz34y1E515UsTSTDabHKEmdBRCsq7xM0lpJls1" />
    <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="/wEWCgKM3uTTAgLP/83pDwLfwaTTAQKNguzjCAKt98LeCwLZh62pDwKKqdGpBwLd2q7jAwKa+5aMBAL5zb65C42zY4GBEUKujhjtZ/hZ8sLESfiF" />
</div></form>
</body>
</html>

Solution

You can POST a variable to the page to step through the pagination.

using System.IO;
using System.Net;
using System.Text;
using System.Web;   // HttpUtility

string lcUrl = "http://www.mysite.com/page.aspx";
HttpWebRequest loHttp = (HttpWebRequest) WebRequest.Create(lcUrl);

// *** Send any POST data
string lcPostData = "gvEmployees=" + HttpUtility.UrlEncode("Page$2");
loHttp.Method = "POST";
// Form posts need this content type, otherwise the server will not parse the body as form fields
loHttp.ContentType = "application/x-www-form-urlencoded";

byte[] lbPostBuffer = Encoding.GetEncoding(1252).GetBytes(lcPostData);
loHttp.ContentLength = lbPostBuffer.Length;

Stream loPostData = loHttp.GetRequestStream();
loPostData.Write(lbPostBuffer, 0, lbPostBuffer.Length);
loPostData.Close();

// *** Read the response
HttpWebResponse loWebResponse = (HttpWebResponse) loHttp.GetResponse();
Encoding enc = Encoding.GetEncoding(1252);
StreamReader loResponseStream = new StreamReader(loWebResponse.GetResponseStream(), enc);
string lcHtml = loResponseStream.ReadToEnd();

loResponseStream.Close();
loWebResponse.Close();

Then parse the data you need out of the string.
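
As a rough illustration of that parsing step, here is a minimal sketch that pulls the staff number and name out of each report link in the grid. It assumes the anchors look exactly like the ones in the sample HTML from the question (an id of the form gvEmployees_ctlNN_lnkFullName followed by an href containing srn=...). The ReportParser class and ExtractReports method are hypothetical names made up for this example, and a regex is used only to keep the sketch dependency-free; a real HTML parser would be more robust.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class ReportParser   // hypothetical helper, not part of any library
{
    // Assumes anchors rendered like the sample page:
    // <a id="gvEmployees_ctl03_lnkFullName" href="index.aspx?srn=ABC4321" target="_self">Name</a>
    private static readonly Regex ReportLink = new Regex(
        @"<a\s+id=""gvEmployees_ctl\d+_lnkFullName""\s+href=""[^""]*[?&]srn=([^""&]+)""[^>]*>([^<]*)</a>",
        RegexOptions.IgnoreCase);

    // Returns (staff number, display name) pairs found in the grid on one page of HTML.
    public static List<KeyValuePair<string, string>> ExtractReports(string html)
    {
        var reports = new List<KeyValuePair<string, string>>();
        foreach (Match m in ReportLink.Matches(html))
        {
            reports.Add(new KeyValuePair<string, string>(
                m.Groups[1].Value, m.Groups[2].Value.Trim()));
        }
        return reports;
    }
}
```

Calling ReportParser.ExtractReports(lcHtml) on the response string would give one entry per row of the current grid page; the sort and pager links are ignored because they have no matching id.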

- EDIT -

Here is what I would try (or something similar), where all the POST data is sent:

string lcPostData =
    "__EVENTTARGET=" + HttpUtility.UrlEncode("gvEmployees") + "&" +
    "__EVENTARGUMENT=" + HttpUtility.UrlEncode("Page$2") + "&" +
    "__VIEWSTATE=" + HttpUtility.UrlEncode("<Value of __VIEWSTATE>");
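
The __VIEWSTATE value has to come from the page you fetched previously, so it may help to read every hidden input from that HTML and post them all back. The sketch below is one way to do that, assuming the hidden inputs are rendered as in the sample page (type, then name, then value, all double-quoted). PostBackForm, ReadHiddenFields and BuildPagePostData are hypothetical names for this example, and WebUtility.UrlEncode from System.Net is swapped in for HttpUtility.UrlEncode only to avoid a System.Web reference.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Text.RegularExpressions;

public static class PostBackForm   // hypothetical helper, not part of any library
{
    // Assumes hidden inputs rendered like the sample page:
    // <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="..." />
    private static readonly Regex HiddenInput = new Regex(
        @"<input\s+type=""hidden""\s+name=""([^""]+)""[^>]*?\svalue=""([^""]*)""",
        RegexOptions.IgnoreCase);

    // Collects every hidden field (name -> decoded value) from one page of HTML.
    public static Dictionary<string, string> ReadHiddenFields(string html)
    {
        var fields = new Dictionary<string, string>();
        foreach (Match m in HiddenInput.Matches(html))
        {
            fields[m.Groups[1].Value] = WebUtility.HtmlDecode(m.Groups[2].Value);
        }
        return fields;
    }

    // Builds a form-encoded body that mimics __doPostBack(eventTarget, eventArgument).
    public static string BuildPagePostData(string html, string eventTarget, string eventArgument)
    {
        var fields = ReadHiddenFields(html);
        fields["__EVENTTARGET"] = eventTarget;      // e.g. "gvEmployees"
        fields["__EVENTARGUMENT"] = eventArgument;  // e.g. "Page$2"

        var parts = new List<string>();
        foreach (var pair in fields)
        {
            parts.Add(WebUtility.UrlEncode(pair.Key) + "=" + WebUtility.UrlEncode(pair.Value));
        }
        return string.Join("&", parts);
    }
}
```

With this in place, lcPostData could be set to PostBackForm.BuildPagePostData(previousHtml, "gvEmployees", "Page$2"), where previousHtml is the HTML of the page that contained the pager link. Whether the server needs every hidden field (for example __EVENTVALIDATION or __PREVIOUSPAGE) depends on how the site is configured, so treat that as something to verify.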

Other tips

Open Fiddler and load the second page of the ASP.NET site's table. Go to the WebForms tab in Fiddler for that particular page session and check in the body which variables are being posted. Concatenate all of those variables in the same order and format, then POST the data using HttpWebRequest. In my case it was:

string PostData = "__EVENTTARGET=" 
    + HttpUtility.UrlEncode("ctl00$ContentPlaceHolder2$grdDirectory") 
    + "&"
    + "__EVENTARGUMENT="+HttpUtility.UrlEncode("Page$2") 
    + "&"
    + "__VIEWSTATE="+ HttpUtility.UrlEncode(view_state)
    + "&"
    + "__VIEWSTATEGENERATOR=" 
    + HttpUtility.UrlEncode(viewstategenerator)
    + "&"
    + "__VIEWSTATEENCRYPTED=" 
    + HttpUtility.UrlEncode(viewstateencrypted) 
    + "&" 
    + "__EVENTVALIDATION=" + HttpUtility.UrlEncode(eventvalidation);

Hopefully this will work.
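
To tie this back to the original question (walk every page of the grid, then repeat for each person found), the overall loop could be sketched roughly as below. This is only an outline under several assumptions: the grid is called gvEmployees and its links look like the sample HTML in the question, the pager links are not HTML-encoded, and the actual HTTP work is supplied by the caller through a fetchPage delegate. OrgCrawler, Crawl and fetchPage are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class OrgCrawler   // hypothetical helper, not part of any library
{
    // Report links as rendered in the sample page from the question.
    private static readonly Regex ReportSrn = new Regex(
        @"id=""gvEmployees_ctl\d+_lnkFullName""\s+href=""[^""]*[?&]srn=([^""&]+)""",
        RegexOptions.IgnoreCase);

    // Pager links such as javascript:__doPostBack('gvEmployees','Page$2').
    private static readonly Regex PagerLink = new Regex(
        @"__doPostBack\('gvEmployees','Page\$(\d+)'\)");

    // fetchPage(srn, pageNumber, previousHtml) must return the HTML for that grid page.
    // For page 1 it would do a GET; for later pages it would POST the hidden fields
    // taken from previousHtml. previousHtml is null for the first page.
    public static Dictionary<string, List<string>> Crawl(
        string rootSrn, Func<string, int, string, string> fetchPage)
    {
        var reportsByManager = new Dictionary<string, List<string>>();
        var pending = new Queue<string>();
        pending.Enqueue(rootSrn);

        while (pending.Count > 0)
        {
            string srn = pending.Dequeue();
            if (reportsByManager.ContainsKey(srn)) continue;   // already visited

            var reports = new List<string>();
            reportsByManager[srn] = reports;

            int page = 1;
            string html = fetchPage(srn, page, null);
            while (true)
            {
                foreach (Match m in ReportSrn.Matches(html))
                {
                    reports.Add(m.Groups[1].Value);
                    pending.Enqueue(m.Groups[1].Value);
                }

                // Continue only if the pager offers a link to the next page number.
                bool hasNext = false;
                foreach (Match m in PagerLink.Matches(html))
                {
                    if (int.Parse(m.Groups[1].Value) == page + 1) hasNext = true;
                }
                if (!hasNext) break;

                page++;
                html = fetchPage(srn, page, html);
            }
        }
        return reportsByManager;
    }
}
```

The result maps each staff number to the staff numbers that report to it. Keeping the HTTP call behind a delegate means the traversal can be tried against saved HTML files before pointing it at the intranet.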

Licensed under: CC-BY-SA with attribution