Question

I started using Crawler4j and played around with the BasicCrawler example for a while. I deleted all output from the BasicCrawler.visit() method and then added some URL processing I already had. When I start the program now, it suddenly prints an enormous amount of internal processing information I don't really need. See the example below:

Auth cache not set in the context
Target auth state: UNCHALLENGED
Proxy auth state: UNCHALLENGED
Attempt 1 to execute request
Sending request: GET /section.aspx?cat=7 HTTP/1.1
"GET /section.aspx?cat=7 HTTP/1.1[\r][\n]"
>> "Accept-Encoding: gzip[\r][\n]"
>> "Host: www.dailytech.com[\r][\n]"
>> "Connection: Keep-Alive[\r][\n]"
>> "User-Agent: crawler4j (http://code.google.com/p/crawler4j/)[\r][\n]"
>> "Cookie: DTLASTVISITED=11/20/2013 6:16:52 AM; DTLASTVISITEDSYS=11/20/2013 6:16:48 AM;     MF2=vaxc1b832fex; dtusession=dcef3fc0-dc04-4f13-8028-186aea942c3f[\r][\n]"
>> "[\r][\n]"
>> GET /section.aspx?cat=7 HTTP/1.1
>> Accept-Encoding: gzip
>> Host: www.dailytech.com
>> Connection: Keep-Alive
>> User-Agent: crawler4j (http://code.google.com/p/crawler4j/)
>> Cookie: DTLASTVISITED=11/20/2013 6:16:52 AM; DTLASTVISITEDSYS=11/20/2013 6:16:48 AM;     MF2=vaxc1b832fex; dtusession=dcef3fc0-dc04-4f13-8028-186aea942c3f
<< "HTTP/1.1 200 OK[\r][\n]"
<< "Cache-Control: private[\r][\n]"
<< "Content-Type: text/html; charset=utf-8[\r][\n]"
<< "Content-Encoding: gzip[\r][\n]"
<< "Vary: Accept-Encoding[\r][\n]"
<< "Server: Microsoft-IIS/7.5[\r][\n]"
<< "X-AspNet-Version: 4.0.30319[\r][\n]"
<< "Set-Cookie: DTLASTVISITED=11/20/2013 6:16:54 AM; domain=dailytech.com; expires=Tue,     20-Nov-2018 11:16:54 GMT; path=/[\r][\n]"
<< "Set-Cookie: DTLASTVISITEDSYS=11/20/2013 6:16:48 AM; domain=dailytech.com; path=/[\r][\n]"
<< "X-UA-Compatible: IE=EmulateIE7[\r][\n]"
<< "Date: Wed, 20 Nov 2013 11:16:54 GMT[\r][\n]"
<< "Content-Length: 8235[\r][\n]"
<< "[\r][\n]"
Receiving response: HTTP/1.1 200 OK
<< HTTP/1.1 200 OK
<< Cache-Control: private
<< Content-Type: text/html; charset=utf-8
<< Content-Encoding: gzip
<< Vary: Accept-Encoding
<< Server: Microsoft-IIS/7.5
<< X-AspNet-Version: 4.0.30319
<< Set-Cookie: DTLASTVISITED=11/20/2013 6:16:54 AM; domain=dailytech.com;
expires=Tue,20-Nov-2018 11:16:54 GMT; path=/
<< Set-Cookie: DTLASTVISITEDSYS=11/20/2013 6:16:48 AM; domain=dailytech.com; path=/
<< X-UA-Compatible: IE=EmulateIE7
<< Date: Wed, 20 Nov 2013 11:16:54 GMT
<< Content-Length: 8235
Cookie accepted: "[version: 0][name: DTLASTVISITED][value: 11/20/2013 6:16:5
AM][domain:dailytech.com][path: /][expiry: Tue Nov 20 12:16:54 CET 2018]".
Cookie accepted: "[version: 0][name: DTLASTVISITEDSYS][value: 11/20/2013 6:16:48
AM][domain: dailytech.com][path: /][expiry: null]". 
Connection can be kept alive indefinitely
<< "[0x1f]"
<< "[0x8b]"
<< "[0x8]"
<< "[0x0]"
<< "[0x0][0x0][0x0][0x0][0x4][0x0]"
<< "[0xed][0xbd][0x7]`[0x1c]I[0x96]%&/m[0xca]{J[0xf5]J[0xd7][0xe0]t[0xa1]
[0x8][0x80]`[0x13]$[0xd8][0x90]@[0x10][0xec][0xc1][0x88][0xcd][0xe6][0x92][0xec]
[0x1d]iG#)[0xab]*[0x81][0xca]eVe]f[0x16]@[0xcc][0xed][0x9d][0xbc][0xf7][0xde]{[0xef]
[0xbd][0xf7][0xde]{[0xef][0xbd][0xf7][0xba];[0x9d]N'[0xf7][0xdf][0xff]?\fd[0x1]l[0xf6]
[0xce]J[0xda][0xc9][0x9e]![0x80][0xaa][0xc8][0x1f]?~|[0x1f]?"~[0xe3][0xe4]7N[0x1e]
[0xff][0xae]O[0xbf]<y[0xf3][0xfb][0xbc]<M[0xe7][0xed][0xa2]L_~[0xf5][0xe4][0xf9]

Is there a way to disable all of this output? Or does anyone know what causes it? Might this even be a bug I should report as an issue to the community?

Thanks for your time

Was it helpful?

Solution

I found an answer to my question. I had changed the method name from main(String[] args) to crawl(), and after that crawler4j started printing out this debug output. When I changed my log4j.properties, the messages disappeared.
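The verbose `>>`/`<<` lines come from Apache HttpClient's wire and header logging, which crawler4j uses under the hood. As a sketch, assuming a log4j 1.x setup (the logger names below are HttpClient 4.x's standard ones, not anything crawler4j-specific), raising those loggers to WARN in log4j.properties silences the noise:

```properties
# Root logger: INFO level, console appender
log4j.rootLogger=INFO, stdout
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.PatternLayout
log4j.appender.stdout.layout.ConversionPattern=%d %p [%c] - %m%n

# Silence HttpClient's context/connection debug messages
log4j.logger.org.apache.http=WARN
# Silence the raw ">>"/"<<" wire dump specifically
log4j.logger.org.apache.http.wire=WARN
```

If no log4j.properties is found on the classpath at all, log4j typically falls back to a default that can surface DEBUG-level output, which would explain why the messages appeared only after the project setup changed.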

License: CC-BY-SA with attribution
Not affiliated with StackOverflow