Question

I am trying to find spammers in the exim mainlog. The mainlog has mail IDs and subjects, something like the lines below.

username1@example.com S==thi#s i $s @a Su~bJec%t
username2@example2.com S==thi#s i ^s an*ot+her Su~bj)ec%t

What I am trying to do is take the subject, remove all the symbols and spaces using sed, and grep for keywords. If a keyword matches, print the mail ID. I can already remove the symbols and spaces and grep for the keywords, but the problem is that the symbols in the mail IDs (@ and .) are removed as well. So my question is: how do I apply sed and grep only to the subject (for example S==thi#s i ^s an*ot+her Su~bj)ec%t) and, if it matches, print the mail ID with its symbols intact? Thanks in advance.

Solution

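This would be tricky with sed, if even possible. If you're OK with awk instead (each command below reads the log on standard input; you can also append the log file name as the last argument):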

awk -F' S==' -v k1=this '{gsub("[][()#$@~%^*+ ]", "", $2); if ($2 ~ k1) print $1}'

If you want to remove all non-alphanumeric characters, then it's better to write like this:

awk -F' S==' -v k1=this '{gsub("[^[:alnum:]]", "", $2); if ($2 ~ k1) print $1}'

If your version of awk doesn't support [:alnum:] then you can write like this instead:

awk -F' S==' -v k1=this '{gsub("[^a-zA-Z0-9]", "", $2); if ($2 ~ k1) print $1}'

Explanation:

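  • Using " S==" (with the leading space) as the field separator to split each line into the mail ID and the subject
  • Passing the keyword "this" in the k1 variable. You can use any other keyword. For several keywords, add more -v parameters in the same format, for example -v k2=something, and extend the condition to test them as well, for example ($2 ~ k1 || $2 ~ k2)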
  • Remove all the symbols from the 2nd field with gsub
  • If the 2nd field matches the keyword in k1, then print the first field (= the mail ID)
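
The steps above can be wrapped in a small shell function. This is only a sketch: find_senders is a hypothetical name, the keywords are joined into a single alternation instead of separate k1, k2 variables, and the match is made case-insensitive with tolower, which the commands above do not do. It assumes each log line has the simplified "mail ID, space, S==subject" shape shown in the question.

```shell
#!/bin/sh
# find_senders is an illustrative helper, not part of the original answer.
# Usage: find_senders KEYWORD [KEYWORD...] < logfile
# Prints the mail ID of every line whose cleaned-up subject contains
# at least one of the keywords (compared case-insensitively).
find_senders() {
    [ "$#" -gt 0 ] || return 1

    # Join the keywords into one regex alternation, e.g. "this|another".
    pattern=$(printf '%s|' "$@")
    pattern=${pattern%|}

    awk -F' S==' -v pat="$pattern" '
        NF >= 2 {
            subject = $2
            gsub(/[^a-zA-Z0-9]/, "", subject)   # strip symbols and spaces
            if (tolower(subject) ~ tolower(pat))
                print $1                        # mail ID, left untouched
        }'
}

# Example with the two sample lines from the question.
printf '%s\n' \
    'username1@example.com S==thi#s i $s @a Su~bJec%t' \
    'username2@example2.com S==thi#s i ^s an*ot+her Su~bj)ec%t' |
    find_senders another
# prints: username2@example2.com
```

Because the cleanup is applied to a copy of the second field only, the @ and . in the mail ID are never touched. On a real log you would redirect the file into the function, for example find_senders keyword < mainlog, with the path adjusted to wherever your exim installation writes its main log.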

I hope this helps.

Other tips

Before your grep/sed processing (this can go in the same sed script, as long as it runs before your own commands), replace the @ and the dots of the mail ID with placeholders:

sed 's/@/(at)/1
: dot
   s/^\([^ ]*\)\.\([^ ]*\) /\1(dot)\2 /
   t dot'

After your grep/sed processing (again, this can go in the same sed script, as long as it runs after your own commands), restore them:

sed 's/(dot)/./g;s/(at)/@/g'

This assumes there is no (dot) or (at) in your subject. Nearly any other pattern could be used instead, like #at# or §1§ or :a: (just do not use special sed characters such as + . { [ $ ^).
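
The protect and restore steps can be checked as a round trip. The sketch below wraps the two sed commands in shell functions; protect_id and restore_id are hypothetical names chosen for illustration. It assumes the mail ID is the first space-delimited word of the line, and that whatever runs between the two steps leaves the placeholders intact (a cleanup that strips parentheses would turn (at) into plain "at", so a different placeholder would be needed in that case).

```shell
#!/bin/sh
# protect_id and restore_id are illustrative names, not part of the original answer.

# Replace the first "@" on the line and every "." in the first
# space-delimited word (the mail ID) with placeholders.
# The subject after the first space is left as it is.
protect_id() {
    sed -e 's/@/(at)/' \
        -e ':dot' \
        -e 's/^\([^ ]*\)\.\([^ ]*\) /\1(dot)\2 /' \
        -e 't dot'
}

# Turn the placeholders back into the original characters.
restore_id() {
    sed -e 's/(dot)/./g' -e 's/(at)/@/g'
}

line='username1@example.com S==thi#s i $s @a Su~bJec%t'

printf '%s\n' "$line" | protect_id
# prints: username1(at)example(dot)com S==thi#s i $s @a Su~bJec%t

printf '%s\n' "$line" | protect_id | restore_id
# prints: username1@example.com S==thi#s i $s @a Su~bJec%t
```

The loop (label dot plus t dot) repeats the dot substitution until the first word has no dots left, so mail IDs with several dots are handled as well.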

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow