How to convert a link that has javascript:__doPostBack in href to a normal URL that wget/curl/lynx can understand?

https://stackoverflow.com/questions/23210403

07-07-2023

Question

I searched SO first and found the question "How to click a link that has javascript:__doPostBack in href?", but its answer covers Python only.

What I need is this: while going through a website, some pages (2, 3, 4, etc.) are reached through links like the ones below:

javascript:__doPostBack('AspNetPager1','2')
javascript:__doPostBack('AspNetPager1','3')
javascript:__doPostBack('AspNetPager1','4')

If I click one, the next page is displayed, but the real URL isn't shown in the browser.

So my question is: how can I convert the JavaScript link into a real, traceable URL that I can feed to wget/curl/lynx?

My goal is to script these tools (wget/curl/lynx) to download the pages one by one. But because of the javascript:__doPostBack links, I can't find a good way to do it.

Solution

You can't really do it analytically: __doPostBack is a JavaScript function defined by the page, so it could be arbitrarily complex.

What you should do instead is install Firebug (assuming you are using Firefox), open the Network tab, enable "Persist", and then click the link. The Network tab will show you the actual network traffic, and you can deduce the real requests from that. In fact, you can just right-click the particular network request that interests you and select "Copy as cURL", which puts the complete curl command, including things like cookies and headers, on your clipboard.

Chrome has a similar function built in.
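
On many ASP.NET WebForms pages, the request captured this way turns out to be a POST back to the same page, carrying the two __doPostBack arguments in the __EVENTTARGET and __EVENTARGUMENT fields together with the page's hidden inputs (such as __VIEWSTATE). The sketch below rebuilds such a POST body from a saved copy of the page. It assumes the site uses the stock __doPostBack behaviour, which should be checked against the captured traffic first. The names parse_postback, build_postback_body and HiddenInputParser are illustrative and do not come from any library.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlencode

# Matches href values such as javascript:__doPostBack('AspNetPager1','2')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)','([^']*)'\)")


class HiddenInputParser(HTMLParser):
    """Collects name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def parse_postback(href):
    """Return (event_target, event_argument) from a __doPostBack href."""
    match = POSTBACK_RE.search(href)
    if match is None:
        raise ValueError("not a __doPostBack link: " + href)
    return match.group(1), match.group(2)


def build_postback_body(page_html, href):
    """Build the URL-encoded POST body, assuming the stock __doPostBack."""
    parser = HiddenInputParser()
    parser.feed(page_html)
    fields = dict(parser.fields)  # e.g. __VIEWSTATE, __EVENTVALIDATION
    target, argument = parse_postback(href)
    # Assumption: the page's __doPostBack only sets these two fields
    # and submits the form; a customised page may do more.
    fields["__EVENTTARGET"] = target
    fields["__EVENTARGUMENT"] = argument
    return urlencode(fields)
```

The resulting string can be saved to a file and sent to the page's own URL with curl's --data @file option, reusing the cookies from the first request. Because the hidden fields usually change with every response, the body has to be rebuilt from each newly downloaded page before requesting the next one.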

Licensed under: CC-BY-SA with attribution
