Duplicates when using nutch -> elasticsearch solution
12-11-2019
Question
I have crawled some data using Nutch and managed to inject it into Elasticsearch. But I have one problem: if I inject the crawled data again, it creates duplicates. Is there any way to prevent this?
Has anyone managed to solve this or have any suggestions on how to solve it?
/Samus
No correct solution
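One approach that may help, offered as a sketch rather than a confirmed fix for this setup: Elasticsearch replaces an existing document when a new one is indexed with the same _id, so giving every crawled page a deterministic ID (here, a hash of its URL) should make re-indexing overwrite rather than duplicate. The helper names doc_id, bulk_body and apply_to_fake_index, the index name "nutch", the sample URLs, and the in-memory dict standing in for an index are all assumptions for illustration. How Nutch's own indexer assigns IDs depends on its configuration and is not shown here.

```python
import hashlib
import json


def doc_id(url: str) -> str:
    """Derive a stable document ID from the page URL (assumed unique per page)."""
    return hashlib.sha1(url.encode("utf-8")).hexdigest()


def bulk_body(docs, index_name="nutch"):
    """Build a newline-delimited body for the Elasticsearch _bulk endpoint.

    Every document carries an explicit _id, so sending the same crawl twice
    overwrites the existing documents instead of adding new ones.
    Older Elasticsearch versions may also expect a _type in the action line.
    """
    lines = []
    for doc in docs:
        action = {"index": {"_index": index_name, "_id": doc_id(doc["url"])}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"


def apply_to_fake_index(store, docs):
    """Toy stand-in for an index: a dict keyed by _id, so same ID means overwrite."""
    for doc in docs:
        store[doc_id(doc["url"])] = doc


# Hypothetical crawl output; real Nutch documents have more fields.
crawl = [
    {"url": "http://example.com/a", "title": "Page A"},
    {"url": "http://example.com/b", "title": "Page B"},
]

store = {}
apply_to_fake_index(store, crawl)
apply_to_fake_index(store, crawl)  # second run of the same crawl
print(len(store))  # number of distinct documents after indexing twice
```

The string returned by bulk_body(crawl) could be sent to the _bulk endpoint with any HTTP client. If the documents are currently indexed without an explicit ID, Elasticsearch generates a fresh one on each request, which would explain the duplicates.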