Question

I'm using some crawler code from http://code.google.com/p/crawler4j/.

Now, what I'm trying to do is access every URL found in the MyCrawler class from another class.

I start the crawler with:

// * Start the crawl. This is a blocking operation, meaning that your code
// * will reach the line after this only when crawling is finished.
controller.start(MyCrawler.class, numberOfCrawlers); 

When I try to use "return" to get my URLs, I get this error:

The return type is incompatible with WebCrawler.visit(Page)

and it asks me to change the type to 'void' but, of course, I don't want to.

Here's the function that I'm having trouble with:

@Override
public String visit(Page page) {
    url = page.getWebURL().getURL();
    System.out.println("URL: " + url);

    if (page.getParseData() instanceof HtmlParseData) {
        HtmlParseData htmlParseData = (HtmlParseData) page.getParseData();
        String text = htmlParseData.getText();
        String html = htmlParseData.getHtml();
        List<WebURL> links = htmlParseData.getOutgoingUrls();

        System.out.println("Text length: " + text.length());
        System.out.println("Html length: " + html.length());
        System.out.println("Number of outgoing links: " + links.size());

        return url;
    }
}

I also tried to use a getter but since it is a "blocking operation", it doesn't work. I'm running out of ideas.


Solution

An overriding method must keep a return type compatible with the method it overrides. WebCrawler.visit(Page) returns void, so a visit(Page) that returns String is not a valid override, and that is what the compiler error is telling you. If all you want is the list of URLs you visited, keep visit as void, store each URL in an ArrayList instead of returning it, and add a getter that returns the list.
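A minimal sketch of that idea follows. It does not use crawler4j itself: CrawlerBase is a hypothetical stand-in for WebCrawler so the example compiles on its own, and its visit takes a plain String where the real method takes a Page. The names UrlCollectingCrawler, VISITED, and getVisitedUrls are also made up for illustration. The list is static and thread-safe on the assumption that controller.start(MyCrawler.class, numberOfCrawlers) creates the crawler instances itself and may run several of them at once, so the calling code never holds a reference to any single crawler.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical stand-in for crawler4j's WebCrawler, so this sketch is self-contained.
// The real visit method takes a Page; a String plays the role of the page URL here.
abstract class CrawlerBase {
    abstract void visit(String url);
}

class UrlCollectingCrawler extends CrawlerBase {
    // Shared by every crawler instance; thread-safe because several crawlers may add to it.
    private static final List<String> VISITED = new CopyOnWriteArrayList<>();

    @Override
    void visit(String url) {
        // The return type stays void, so this is a valid override.
        VISITED.add(url);
    }

    // Static getter: returns a snapshot copy of the URLs collected so far.
    static List<String> getVisitedUrls() {
        return new ArrayList<>(VISITED);
    }
}

class CollectUrlsDemo {
    public static void main(String[] args) {
        CrawlerBase crawler = new UrlCollectingCrawler();
        crawler.visit("http://example.com/a");
        crawler.visit("http://example.com/b");

        // In the real program this call would come after the blocking
        // controller.start(...) line, once crawling has finished.
        System.out.println(UrlCollectingCrawler.getVisitedUrls());
    }
}
```

Because the start call blocks until crawling is finished, reading the list on the line after it should see every URL that was recorded. In the real crawler, the body of visit would add page.getWebURL().getURL() to the list.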

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow