Question

I'm currently using CyberNeko to try to grab information from a website. However, I believe the website checks the user agent/browser version to prevent clients from simply fetching the URL content.

I know that HtmlUnit lets you change the browser version, but I'm not sure whether the same can be done with CyberNeko.

Does anyone know if it's possible to do such a thing?

Solution

I've never used CyberNeko, but I thought it was just an HTML parser, i.e. I didn't think it was meant for issuing the HTTP requests and actually downloading the web page.

The problem may be that the HTTP request issued by CyberNeko is missing headers that a browser would normally send, such as the User-Agent header. An easy way to ensure that the HTTP request looks like one sent from a browser is to use HttpClient instead of CyberNeko to download the web page. There's some example code available here.
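As a rough sketch of that idea, the snippet below sets a browser-style User-Agent header before downloading a page. It uses the JDK's built-in `java.net.http.HttpClient` (Java 11+) rather than Apache HttpClient, purely to stay dependency-free. The class name `BrowserLikeFetch`, the User-Agent string, the extra `Accept` header, and the `http://example.com/` URL are all placeholders I've assumed, not anything from the question.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class BrowserLikeFetch {
    // Assumed example of a browser-style User-Agent string; substitute your own.
    static final String USER_AGENT =
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0";

    // Build a GET request that carries the given User-Agent header.
    static HttpRequest buildRequest(String url, String userAgent) {
        return HttpRequest.newBuilder(URI.create(url))
            .header("User-Agent", userAgent)
            .header("Accept", "text/html,application/xhtml+xml")
            .GET()
            .build();
    }

    // Send the request and return the response body as a string.
    static String fetch(String url) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newBuilder()
            .followRedirects(HttpClient.Redirect.NORMAL)
            .build();
        HttpResponse<String> response =
            client.send(buildRequest(url, USER_AGENT), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // http://example.com/ is a placeholder URL.
        String html = fetch(args.length > 0 ? args[0] : "http://example.com/");
        System.out.println(html.length() + " characters downloaded");
    }
}
```

The string returned by `fetch` can then be handed to whatever parser you like. If the site still rejects the request, it may be checking other headers or cookies as well, so comparing against what a real browser sends is a reasonable next step.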

Once you've successfully downloaded the page, use CyberNeko to parse out the bits you're interested in.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow