Question

Logstash's grok is a string parsing tool built on top of regular expressions. It provides many predefined patterns that make string parsing much easier, and I fell in love with it the first time I used it. Unfortunately, it is written in Ruby, so I can't use it in my Python projects. Is there a Python implementation of grok, or a Python alternative that simplifies string parsing the way grok does?

Solution

I'm not aware of any Python ports of grok, but this functionality seems pretty straightforward to implement:

import re

# Map grok-style type names to the regex fragments they stand for.
types = {
    'WORD': r'\w+',
    'NUMBER': r'\d+',
    # todo: extend me
}


def compile(pat):
    # Expand each %{TYPE:name} placeholder into a named group (?P<name>...).
    return re.sub(r'%\{(\w+):(\w+)\}',
        lambda m: "(?P<" + m.group(2) + ">" + types[m.group(1)] + ")", pat)


rr = compile("%{WORD:method} %{NUMBER:bytes} %{NUMBER:duration}")

print(re.search(rr, "hello 123 456").groupdict())
# {'method': 'hello', 'bytes': '123', 'duration': '456'}
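
The same idea can probably be stretched a little further. The sketch below is one possible extension, not a port of grok: it assumes patterns may refer to other patterns, that a placeholder without a name should match without capturing, and that an optional `:int` or `:float` suffix converts the captured string. The names `PATTERNS`, `CASTS`, `expand` and `grok_match` are made up for this illustration, and the pattern table is deliberately tiny.

```python
import re

# Hypothetical pattern table; an entry may reference other entries via %{NAME}.
PATTERNS = {
    "WORD": r"\w+",
    "INT": r"\d+",
    "NUMBER": r"\d+(?:\.\d+)?",
    "IPV4": r"(?:\d{1,3}\.){3}\d{1,3}",
    "REQUEST": r"%{WORD:method} %{INT:bytes:int}",  # built from other patterns
}

# Assumed conversion suffixes: %{TYPE:name:int} or %{TYPE:name:float}.
CASTS = {"int": int, "float": float}

# Matches %{TYPE}, %{TYPE:name} and %{TYPE:name:cast}.
PLACEHOLDER = re.compile(r"%\{(\w+)(?::(\w+))?(?::(\w+))?\}")


def expand(pattern, casts=None):
    """Return (regex_source, casts) for a grok-style pattern string."""
    casts = {} if casts is None else casts

    def replace(m):
        kind, name, cast = m.groups()
        # Recurse so nested references are expanded too.
        # Note: a cyclic pattern table would recurse forever.
        body, _ = expand(PATTERNS[kind], casts)
        if name is None:
            return "(?:" + body + ")"  # unnamed placeholder: match, don't capture
        if cast is not None:
            casts[name] = CASTS[cast]
        return "(?P<" + name + ">" + body + ")"

    return PLACEHOLDER.sub(replace, pattern), casts


def grok_match(pattern, text):
    """Search text and return a dict of converted fields, or None if no match."""
    regex, casts = expand(pattern)
    m = re.search(regex, text)
    if m is None:
        return None
    return {
        key: casts[key](value) if key in casts and value is not None else value
        for key, value in m.groupdict().items()
    }


result = grok_match(
    "%{IPV4:client} %{REQUEST} %{NUMBER:duration:float}",
    "10.0.0.7 GET 512 0.25",
)
print(result)
```

Fields without a conversion suffix stay strings, so `client` and `method` come back as text while `bytes` and `duration` come back as numbers. An unknown type name raises `KeyError`, which seems preferable to silently leaving the placeholder in the regex.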

Other tips

I built a project on GitHub called pygrok, based on @georg's answer, to meet my log pattern parsing requirements in Python code. I think pygrok may be helpful to you. Let me introduce it briefly:

pygrok

A Python library to parse strings and extract information from structured/unstructured data

What can I use Grok for?

  • parsing and matching patterns in a string (log, message, etc.)
  • relieving you from writing complex regular expressions
  • extracting information from structured/unstructured data

You can find it here.

Licensed under: CC-BY-SA with attribution