Question

I have two machines. One runs HBase 0.92.2 in pseudo-distributed mode, and the other runs the Nutch 2.x crawler. How can I configure them so that the HBase 0.92.2 machine acts as back-end storage and the Nutch 2.x machine acts as the crawler?


Solution

I finally did it, and it was easy to do. I am sharing my experience here; maybe it can help someone.

1- Configure hbase-site.xml on the HBase machine for pseudo-distributed mode.

2- MOST IMPORTANT THING: on the HBase machine, replace the localhost IP in /etc/hosts with the machine's real network IP, like this:

10.11.22.189 master localhost

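Here 10.11.22.189 is the HBase machine's IP. (Note: if you don't change the localhost IP on the HBase machine, the remote Nutch crawler won't be able to connect to it. The likely reason is that HBase advertises its address to clients through ZooKeeper, and an address that resolves to the loopback interface is unreachable from another machine.)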

4- Copy or symlink the HBase machine's hbase-site.xml into $NUTCH_HOME/conf on the Nutch machine.

5- Start your crawler and check that it works.
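
For steps 1 and 4, a minimal hbase-site.xml might look like the sketch below. It is not taken from the original setup: the hostname master comes from the /etc/hosts line above, while the HDFS port 9000 and the /hbase path are assumptions that you should adjust to your own installation.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Run the HBase daemons as separate processes (pseudo-distributed mode). -->
  <property>
    <name>hbase.cluster.distributed</name>
    <value>true</value>
  </property>
  <!-- Assumption: HDFS NameNode listens on master:9000; change to match your setup. -->
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://master:9000/hbase</value>
  </property>
  <!-- ZooKeeper host that the remote Nutch client contacts to locate HBase. -->
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master</value>
  </property>
</configuration>
```

Because the file names the host as master, the Nutch machine presumably also needs to resolve that name, for example through a line such as "10.11.22.189 master" in its own /etc/hosts.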

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow