What are these lines of log-parsing Perl doing and how can I come up with something that might work?

StackOverflow https://stackoverflow.com//questions/9660941

  •  12-12-2019
  •  | 
  •  

Question

This problem comes under the context of pop-before-smtp / Postfix / Dovecot, but if I knew Perl string parsing, I could come up with an answer myself. However, I'm so lost I don't even know the precise question. To wit:

We've been using Postfix for a LONG time now and are kind of hooked on it. Now we need to "move into the modern era" and let people SEND email from our SMTP server(s) even when they're outside our network. So, tasked with this job, I've found pop-before-smtp.

You can find it here.

So, I've got it all configured but it fails in testing. I've troubleshot it using the directions here, and determined that the Perl that's trying to parse the log appears to be incorrect. We're using Dovecot as our IMAP / POP server, and there are three choices given in the configuration file. Here is an excerpt from the config file showing the three sets:

# For Dovecot POP3/IMAP when using syslog.
#$pat = '^[LOGTIME] \S+ (?:dovecot: )?(?:imap|pop3)-login: ' .
#    'Login: .*? (?:\[|rip=)[:f]*(\d+\.\d+\.\d+\.\d+)[],]';
#$out_pat = '^[LOGTIME] \S+ (?:dovecot: )?(?:imap|pop3)-login: ' .
#    'Disconnected.*? (?:\[|rip=)[:f]*(\d+\.\d+\.\d+\.\d+)[],]';

# For Dovecot POP3/IMAP when it does its own logging.
##$logtime_pat = '(\d\d\d\d-\d+-\d+ \d+:\d+:\d+)';
#$pat = '^dovecot: [LOGTIME] Info: (?:imap|pop3)-login: ' .
#    'Login: .+? rip=[:f]*(\d+\.\d+\.\d+\.\d+),';
#$out_pat = '^dovecot: [LOGTIME] Info: (?:imap|pop3)-login: ' .
#    'Disconnected.*? rip=[:f]*(\d+\.\d+\.\d+\.\d+),';

# For older Dovecot POP3/IMAP when it does its own logging.
#$pat = '^(?:imap|pop3)-login: [LOGTIME] Info: ' .
#    'Login: \S+ \[[:f]*(\d+\.\d+\.\d+\.\d+)\]';
#$out_pat = '^(?:imap|pop3)-login: [LOGTIME] Info: ' .
#    'Disconnected.*? \[[:f]*(\d+\.\d+\.\d+\.\d+)\]';

One is supposed to uncomment the ones that apply, however, none of them work.

I surmise that 'pat' is the pattern for login, and out-pat is the pattern for logging out or otherwise disconnecting.

The actual log record format is clearly different than any of these three, but they're close. Here are an example pair:

Mar 11 17:53:55 imap-login: Info: Login: user=<username>, method=PLAIN, rip=208.54.4.205, lip=192.168.1.1, TLS

Mar 11 17:59:10 IMAP(username): Info: Disconnected: Logged out bytes=352/43743

When using POP, 'imap-login' is replaced by 'pop-login', and on log-out, 'POP' replaces 'IMAP' - why the changes in capitalization I can't say!

Importand data are: The timestamp, the username, and, when logging in, the "remote" ip ("rip").

Given enough time, I may be able to piece together something that works, but since I don't actually know Perl, this is kind of tough. Please help me write new rules to parse the logging output used with our Dovecot package.

Was it helpful?

Solution

The (:?.. portion of a Perl regular expression asks for clustering but not capturing; this allows entire groups to be matched or ignored as as group without influencing the capture group numbers; all the lines capture exactly one field, the IP to allow. (Which is a little odd, I might have expected both username and IP, but this might be easier in the long run.)

# For Dovecot POP3/IMAP when using syslog.
$pat = '^[LOGTIME] \S+ (?:imap|pop3)-login: Info: ' .
    'Login: .*? (?:\[|rip=)[:f]*(\d+\.\d+\.\d+\.\d+)[],]';
# not necessary? see comment header START OF PATTERNS
# $out_pat = '^[LOGTIME] \S+ (?:IMAP|POP3)\(\S+\): Info: ' .
#    'Disconnected.*';

I've removed the dovecot pieces since they weren't in your input. I added the Info: to both lines. I've modified the $out_pat to use IMAP(username) instead of the no-longer-there imap-login from the original. (The use of \S+ will break if usernames have spaces. Since this assumption was made elsewhere in the file, I hope it's fine.)

Since there is no longer any IP address to capture for the logout line, it is probably best to not define $out_pat -- the START OF PATTERNS comment block includes the phrase If the entry of your choice also provides $out_pat, you should uncomment that variable as well, which allows us to keep track of users who are still connected to the server (e.g. Thunderbird caches open IMAP connections).

I haven't tested this but I have good feelings about it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top