Question

I need your expert help to figure out what is wrong with my Elasticsearch (ES) and Logstash setup.

I'm using Elasticsearch 1.1.0 and Logstash 1.4.0 to push logs to ES and Kibana.

My servers are hosted on AWS. The master has 4 vCores, 8 ECUs, and 15 GB of RAM; the node has the same specification.

My Logstash configuration: [image not available]

I download the log files from S3 and store them locally on the server; Logstash then picks them up and pushes them to the ES cluster. Logstash, Kibana, and the ES master all run on a single server. Each file is about 12 MB, and I have more than 20,000 files.

My ES configuration (master):

cluster.name: MY-CLUSTER-NAME
node.name: MY-NODE-NAME
node.master: true
node.data: true
path.data: /PATH_TO_DATA/data
path.logs: /PATH_TO_LOGS/logs

My ES configuration (node):

cluster.name: MY-CLUSTER-NAME  # same name as on the master
node.name: MY-NODE-NAME  # a different name from the master's
node.master: false  # this is a node, not the master
node.data: true
path.data: /PATH_TO_DATA/data
path.logs: /PATH_TO_LOGS/logs

To check the cluster status:

http://MASTER_IP:9200/_cluster/health

This is the result:

{
  "cluster_name": "es-cluster-onetagv2",
  "status": "green",
  "timed_out": false,
  "number_of_nodes": 2,
  "number_of_data_nodes": 2,
  "active_primary_shards": 5,
  "active_shards": 10,
  "relocating_shards": 0,
  "initializing_shards": 0,
  "unassigned_shards": 0
}
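For anyone who wants to poll this endpoint from a script instead of a browser, here is a minimal Python sketch. The function names `parse_health` and `fetch_health` are hypothetical helpers, not part of any Elasticsearch client, and the sketch assumes the endpoint returns the JSON fields shown above.

```python
import json
from urllib.request import urlopen


def parse_health(body: str) -> dict:
    """Extract the fields of interest from a _cluster/health JSON response."""
    health = json.loads(body)
    return {
        "status": health["status"],
        "nodes": health["number_of_nodes"],
        "unassigned_shards": health["unassigned_shards"],
    }


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """Fetch and summarize cluster health, e.g. base_url='http://MASTER_IP:9200'."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return parse_health(resp.read().decode("utf-8"))
```

Calling `fetch_health("http://MASTER_IP:9200")` (with the real master address substituted) would return a small dictionary that a monitoring loop could check for anything other than `"green"`.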

My Java version (I don't know whether this matters):

java version "1.7.0_51"
OpenJDK Runtime Environment (amzn-2.4.4.1.36.amzn1-x86_64 u51-b02)
OpenJDK 64-Bit Server VM (build 24.45-b08, mixed mode)

My issue is that I need to push more than 400 million hits per day, but in 24 hours I can only push about 60 million, so I'm always behind.
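To put those figures in per-second terms, here is a rough calculation in Python. It assumes the load is spread evenly over the 24 hours, which real traffic may not be, so treat the numbers as averages rather than peaks.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def events_per_second(events_per_day: int) -> float:
    """Average event rate, assuming a uniform load over the whole day."""
    return events_per_day / SECONDS_PER_DAY


required = events_per_second(400_000_000)  # target volume from the question
achieved = events_per_second(60_000_000)   # volume currently indexed per day
speedup = required / achieved              # factor by which indexing must improve

print(f"required: {required:.0f}/s, achieved: {achieved:.0f}/s, speedup needed: {speedup:.1f}x")
# prints "required: 4630/s, achieved: 694/s, speedup needed: 6.7x"
```

In other words, the pipeline would need to index roughly 6.7 times faster than it does now just to keep up.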

I can also see that ES is using 100% CPU, but I don't know whether that is the problem.

Could you tell me what I'm doing wrong and how I can push large logs to ES quickly?


Solution

Split each input file into several files with different names so that multiple inputs can work on them. This helps Logstash read multiple files in parallel across different nodes.
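A minimal Python sketch of the splitting step follows. The helper name `split_log`, the `.partNNNN` naming scheme, and the chunk size are illustrative assumptions, not something Logstash requires. It splits on line boundaries so that no log event is cut in half.

```python
from pathlib import Path
from typing import List


def split_log(src: Path, out_dir: Path, lines_per_part: int) -> List[Path]:
    """Split src into differently named parts of at most lines_per_part lines each."""
    if lines_per_part <= 0:
        raise ValueError("lines_per_part must be positive")
    out_dir.mkdir(parents=True, exist_ok=True)
    parts: List[Path] = []
    out = None
    # Binary mode copies the bytes unchanged, whatever the log encoding is.
    with src.open("rb") as fh:
        for i, line in enumerate(fh):
            if i % lines_per_part == 0:
                # Start a new part, e.g. access.part0000.log, access.part0001.log
                if out is not None:
                    out.close()
                part = out_dir / f"{src.stem}.part{len(parts):04d}{src.suffix}"
                out = part.open("wb")
                parts.append(part)
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

For example, `split_log(Path("access.log"), Path("parts"), 100000)` would write `parts/access.part0000.log`, `parts/access.part0001.log`, and so on. How those parts are then assigned to Logstash inputs or instances depends on your own configuration.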

Also increase the cluster to 4 ES data nodes and 2 ES masters. You can achieve this by running two Logstash instances (which provide 2 masters and 2 data nodes) and 2 Elasticsearch nodes (which provide 2 data nodes).

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow