Question

Say, I have a collection of text files I need to process (e.g. search for a certain label and extract the value). What would be the general way to tackle the problem?

I also read "Retrieve Variable Values from Python", but it doesn't seem applicable to some of the cases I face (for example, where a tab is used as the delimiter instead of a colon).

I just want to know the most appropriate way to tackle the problem regardless of the language used.

Say I have something like:

Name: Backup Operators  SID: S-1-5-32-551   Caption: COMMSVR21\Backup Operators Description: Backup Operators can override security restrictions for the sole purpose of backing up or restoring files  Domain: COMMSVR21   
COMMERCE/cabackup
COMMSVR21/sys5erv1c3

I want to be able to access/retrieve the values of Backup Operators and get COMMERCE/cabackup & COMMSVR21/sys5erv1c3 in return.

How would you do it?

What I thought of is to read the whole text file, do a regex search, and probably add some if/else statements. Is this effective? Or should I parse the text file into some kind of array and retrieve the values from that? I'm not sure.

As another example, say:

        GPO: xxx & yyy Servers
            Policy:            MaximumPasswordAge
            Computer Setting:  45

How would you check the text file for Policy = MaximumPasswordAge and return the value 45?

Thanks!

p/s -- I might be doing this in Python (zero knowledge, so picking it up on the fly) or Java

pp/s -- I just realised that there's no spoiler tag. Hmm

--

Examples of the logs. A log with directory permissions:

C:\:
    BUILTIN\Administrators  Allowed:    Full Control
    NT AUTHORITY\SYSTEM Allowed:    Full Control
    BUILTIN\Users   Allowed:    Read & Execute
    BUILTIN\Users   Allowed:    Special Permissions: 
            Create Folders
    BUILTIN\Users   Allowed:    Special Permissions: 
            Create Files
    \Everyone   Allowed:    Read & Execute
    (No auditing)

C:\WINDOWS:
    BUILTIN\Users   Allowed:    Read & Execute
    BUILTIN\Power Users Allowed:    Modify
    BUILTIN\Power Users Allowed:    Special Permissions: 
            Delete
    BUILTIN\Administrators  Allowed:    Full Control
    NT AUTHORITY\SYSTEM Allowed:    Full Control
    (No auditing)

Another one with the following:

    Audit Policy
    ------------
        GPO: xxx & yyy Servers
            Policy:            AuditPolicyChange
            Computer Setting:  Success

        GPO: xxx & yyy Servers
            Policy:            AuditPrivilegeUse
            Computer Setting:  Failure

        GPO: xxx & yyy Servers
            Policy:            AuditDSAccess
            Computer Setting:  No Auditing

This is the tab delimited one:

User Name   Full Name   Description Account Type    SID Domain  PasswordIsChangeable    PasswordExpires PasswordRequired    AccountDisabled AccountLocked   Last Login
53cuR1ty        Built-in account for administering the computer/domain  512 S-1-5-21-2431866339-2595301809-2847141052-500   COMMSVR21   True    False   True    False   False   09/11/2010 7:14:27 PM
ASPNET  ASP.NET Machine Account Account used for running the ASP.NET worker process (aspnet_wp.exe) 512 

Solution

I always shove Python into people's faces ;)

I recommend looking at regular expressions (http://docs.python.org/howto/regex.html), as they might fit your needs. I won't do it for you (because I can't), but this should work if your files are colon-delimited key/value pairs, one per line. Here's a quick start (which might work):

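regex = r'([^:\n]*):( *)(.*)\n'

This captures three groups (hopefully): the text before the first colon (group 1), the spaces after it (group 2, which can be thrown away), and the rest of the line up to the newline (group 3). Group 1 deliberately excludes colons; a greedy .* there would run up to the last colon on the line, which breaks on values that contain a colon themselves.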
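To make that a bit more concrete, here is a minimal sketch of how such a pattern could be used to answer the "Policy = MaximumPasswordAge" lookup from the question. The names SAMPLE, PAIR and find_setting are made up for illustration, and the sketch assumes that each "Computer Setting" line belongs to the most recent "Policy" line above it, which is how the sample logs look but is not guaranteed for every file.

```python
import re

# Assumed sample, copied from the audit-policy snippet in the question.
SAMPLE = """\
        GPO: xxx & yyy Servers
            Policy:            MaximumPasswordAge
            Computer Setting:  45
"""

# Key = text before the first colon, value = rest of the line.
# Leading indentation and the padding after the colon are skipped.
PAIR = re.compile(r"^[ \t]*([^:\n]+):[ \t]*(.*)$", re.MULTILINE)


def find_setting(text, policy_name):
    """Return the 'Computer Setting' that follows the given 'Policy', or None."""
    current_policy = None
    for key, value in PAIR.findall(text):
        key, value = key.strip(), value.strip()
        if key == "Policy":
            # Remember which policy the following lines belong to.
            current_policy = value
        elif key == "Computer Setting" and current_policy == policy_name:
            return value
    return None


print(find_setting(SAMPLE, "MaximumPasswordAge"))
```

Tracking the "current" policy while walking the pairs in order is what replaces the pile of if/else statements: the regex only splits lines, and the small loop supplies the context.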

Play with that (I don't want to have a regex aneurysm, so this is as far as I can help for now). Good luck!
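For the tab-delimited file, a regex is probably not needed at all. The following is only a sketch using Python's standard csv module; SAMPLE is a shortened, hypothetical version of the user table from the question (with real tab characters written as \t), and lookup is a helper name invented here.

```python
import csv
import io

# Assumed sample: a few columns from the question's user table, tab-separated.
SAMPLE = (
    "User Name\tFull Name\tDomain\tAccountDisabled\n"
    "53cuR1ty\t\tCOMMSVR21\tFalse\n"
)


def lookup(text, user_name, column):
    """Return the value of `column` for the row whose 'User Name' matches, or None."""
    # DictReader uses the first line as the header, so each row is a dict
    # keyed by column name.
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        if row["User Name"] == user_name:
            return row[column]
    return None


print(lookup(SAMPLE, "53cuR1ty", "Domain"))
```

With a real file, io.StringIO(text) would be replaced by an open file object (opened with newline="" as the csv documentation suggests).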

Licensed under: CC-BY-SA with attribution