Question

I have an Elasticsearch index which I am using to index a set of documents.

These documents are originally in CSV format, and I am looking to parse them using Logstash, since it has powerful regular-expression tools such as grok.

My problem is that I have something along the following lines:

field1,field2,field3,number@number#number@number#number@number

In the last column I have key-value pairs of the form key@value, separated by #, and there can be any number of them.

Is there a way for me to use Logstash to parse this and have it store the last column as the following JSON in Elasticsearch (or some other searchable format), so that I am able to search it?

[
  {"key" : number, "value" : number},
  {"key" : number, "value" : number},
  ...
]

Solution

First, you can use the csv filter to parse out the last column. Then you can use the ruby filter to write your own code to do what you need.
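A sketch of that first stage might look like the following; the column names (field1, field2, field3, pairs) are assumptions about your data, not something the csv filter can guess for you:

filter {
    csv {
        separator => ","
        columns => ["field1", "field2", "field3", "pairs"]
    }
}

The minimal example below skips the csv stage and reads the last column directly from stdin, so the ruby filter works on the message field: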

input {
    stdin {
    }
}

filter {
    ruby {
        # Split the last column on "#", split each piece on "@" into key and
        # value, and store the results as an array on the event.
        # (On Logstash 5+ use event.get("message") / event.set("lastColumn", ary)
        # instead of the event["..."] accessors shown here.)
        code => '
            pairs = event["message"].split("#")
            ary = Array.new
            for pair in pairs
                key = pair.split("@")[0]
                value = pair.split("@")[1]
                ary.push("{key : " << key << ", value : " << value << "}")
            end
            event["lastColumn"] = ary
        '
    }
}


output {
    stdout {debug => true}
}

With this filter, when I input

1@10#2@20

The output is

    "message" => "1@10#2@20",
  "@version" => "1",
"@timestamp" => "2014-03-25T01:53:56.338Z",
 "lastColum" => [
    [0] "{key : 1, value : 10}",
    [1] "{key : 2, value : 20}"
]
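If you want Elasticsearch to index the keys and values as real fields rather than as opaque strings, a variant of the same ruby filter could build hashes instead of strings; this is only a sketch, using the same pre-5.x event API as above, and the to_i calls assume the keys and values really are numbers:

filter {
    ruby {
        code => '
            # Build an array of {"key" => ..., "value" => ...} hashes
            # instead of preformatted strings.
            ary = event["message"].split("#").map { |pair|
                key, value = pair.split("@")
                { "key" => key.to_i, "value" => value.to_i }
            }
            event["lastColumn"] = ary
        '
    }
}

With an array of objects like this, you may also want to map lastColumn as a nested field in Elasticsearch so that each key stays paired with its own value when you query.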

Hope this helps.

Licensed under: CC-BY-SA with attribution