Question

I have an Elasticsearch index which I am using to index a set of documents.

These documents are originally in CSV format, and I am looking to parse them using Logstash, since it has powerful regular-expression tools such as grok.

My problem is that I have something along the following lines:

field1,field2,field3,number@number#number@number#number@number

In the last column I have key-value pairs of the form key@value, separated by #, and there can be any number of them.

Is there a way for me to use Logstash to parse this and have it store the last column as the following JSON in Elasticsearch (or some other searchable format), so that I am able to search it?

[
  {"key" : number, "value" : number},
  {"key" : number, "value" : number},
  ...
]

Solution

First, you can use the csv filter to parse out the last column. Then you can use the ruby filter to write your own code to do what you need.
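A sketch of that first stage might look like the following; the column names (field1, field2, field3, pairs) are assumptions about your data, not something the csv filter can guess for you:

filter {
    csv {
        separator => ","
        columns => ["field1", "field2", "field3", "pairs"]
    }
}

The minimal example below skips the csv stage and reads the last column directly from stdin, so the ruby filter works on the message field: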

input {
    stdin {
    }
}

filter {
    ruby {
        # Split the last column on "#", split each piece on "@" into key and
        # value, and store the results as an array on the event.
        # (On Logstash 5+ use event.get("message") / event.set("lastColumn", ary)
        # instead of the event["..."] accessors shown here.)
        code => '
            pairs = event["message"].split("#")
            ary = Array.new
            for pair in pairs
                key = pair.split("@")[0]
                value = pair.split("@")[1]
                ary.push("{key : " << key << ", value : " << value << "}")
            end
            event["lastColumn"] = ary
        '
    }
}


output {
    stdout {debug => true}
}

With this filter, when I input

1@10#2@20

The output is

    "message" => "1@10#2@20",
  "@version" => "1",
"@timestamp" => "2014-03-25T01:53:56.338Z",
 "lastColum" => [
    [0] "{key : 1, value : 10}",
    [1] "{key : 2, value : 20}"
]
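If you want Elasticsearch to index the keys and values as real fields rather than as opaque strings, a variant of the same ruby filter could build hashes instead of strings; this is only a sketch, using the same pre-5.x event API as above, and the to_i calls assume the keys and values really are numbers:

filter {
    ruby {
        code => '
            # Build an array of {"key" => ..., "value" => ...} hashes
            # instead of preformatted strings.
            ary = event["message"].split("#").map { |pair|
                key, value = pair.split("@")
                { "key" => key.to_i, "value" => value.to_i }
            }
            event["lastColumn"] = ary
        '
    }
}

With an array of objects like this, you may also want to map lastColumn as a nested field in Elasticsearch so that each key stays paired with its own value when you query.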

Hope this helps.

Licensed under: CC-BY-SA with attribution