Question

I'm currently using CyberNeko to try to extract information from a website. However, I believe the website checks the user agent/browser version to prevent clients from simply fetching the URL content.

I know htmlunit lets you change the browser version, but I'm not sure whether I can do the same with CyberNeko.

Does anyone know if it's possible to do such a thing?

Solution

I've never used CyberNeko, but I thought it was just an HTML parser, i.e. I didn't think you could use it to issue the HTTP requests and actually download the web page.

The problem could be that the HTTP request issued on CyberNeko's behalf is missing headers that a browser normally sends, such as the User-Agent header. An easy way to make the request look like it came from a browser is to use HttpClient instead of CyberNeko to download the web page, setting those headers yourself. There's some example code available here.
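As a rough sketch of the idea, the snippet below sets a browser-like User-Agent on a request. Note the assumptions: it uses the JDK's built-in `java.net.http.HttpClient` (Java 11+), not necessarily the HttpClient library meant above, and the class name `BrowserLikeFetch`, the header values, and the `example.com` URL are placeholders made up for illustration. Which headers a given site actually checks is something you would have to find out by experiment.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class BrowserLikeFetch {
    // Illustrative value only; copy the string your own browser sends if the site is picky.
    static final String USER_AGENT =
            "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0";

    // Build a GET request carrying headers a browser would typically send.
    static HttpRequest buildRequest(String url) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", USER_AGENT)
                .header("Accept", "text/html,application/xhtml+xml")
                .header("Accept-Language", "en-US,en;q=0.5")
                .GET()
                .build();
    }

    // Send the request and return the response body as a string.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
                .followRedirects(HttpClient.Redirect.NORMAL)
                .build();
        HttpResponse<String> response =
                client.send(buildRequest(url), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            // No URL given: just show the headers that would be sent (no network access).
            buildRequest("https://example.com/").headers().map()
                    .forEach((name, values) -> System.out.println(name + ": " + values));
        } else {
            // URL given: download the page and print the HTML.
            System.out.println(fetch(args[0]));
        }
    }
}
```

The string returned by `fetch` is plain HTML, so it can be wrapped in a `StringReader` and handed to whatever parser you prefer.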

Once you've successfully downloaded the page, use CyberNeko to parse out the bits you're interested in.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow