Question

I'm running an Apache Pig script and I'd like to search a field for a full string of words. For example, I would like to look for "following updates are downloaded and ready for installation".

The issue I'm having is that matches with a regex only seems to let me search for single words. So I end up looking for "following", "updates", "downloaded", "ready" and "installation", and it doesn't matter how far apart they are. I've resorted to including more words to narrow down what I'm looking for, but I wanted to see whether searching for an entire string of consecutive words is possible.

Here is an example of my current filter.

downloadFilter = FILTER windowsLog BY ($16 matches '^(?=.*?(following))(?=.*?(updates))(?=.*?(downloaded))(?=.*?(ready))(?=.*?(installation)).*$');

Here is an example record I'm trying to match.

3/7/2014    19:15:54:141    972 13c0    Report  REPORT EVENT: {EF338545-61FB-434A-ACB6-F9D17A986677}    2014-03-07 19:15:49:141-0600    1   188 102 {00000000-0000-0000-0000-000000000000}  0   0   AutomaticUpdates    Success Content Install Installation Ready: The following updates are downloaded and ready for installation. This computer is currently scheduled to install these updates on ‎Saturday, ‎March ‎08, ‎2014 at 3:00 AM:  - Update for Microsoft .NET Framework 3.5.1 on Windows 7 and Windows Server 2008 R2 SP1 for x64-based Systems (KB2836943) - Security Update for Microsoft .NET Framework 3.5.1 on Windows 7 and Windows Server 2008 R2 SP1 for x64-based Systems (KB2863240) - Update for User-Mode Driver Framework version 1.11 for Windows 7 for x64-based Systems (KB2685813) - Update for Windows 7 for x64-based Systems (KB2791765)

Solution

Why not simply search for the literal text, with '.*' on either side so that the pattern covers the whole field? For example:

'.*Download Completed.*'

Demo: http://regex101.com/r/oR7aP3
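
Applied to the phrase from the question, the same idea might look like the sketch below. It assumes, as in the question's filter, that windowsLog is already loaded and that field $16 holds the message text. Pig's matches operator uses Java regular expressions and has to match the entire field value, which is why the phrase is wrapped in '.*' on both sides. The spaces in the pattern are matched literally, so the words must appear consecutively and in this order.

```pig
-- Sketch only: windowsLog and $16 are taken from the question's filter.
-- The leading and trailing .* let the phrase appear anywhere in the field,
-- because matches must cover the whole field value.
downloadFilter = FILTER windowsLog BY ($16 matches '.*following updates are downloaded and ready for installation.*');
```

If the capitalization in the log can vary, prefixing the pattern with the Java inline flag (?i) should make the match case-insensitive.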

Licensed under: CC-BY-SA with attribution