Regular Expression to parse Common Name from Distinguished Name

https://stackoverflow.com/questions/11582584

22-06-2021

Question

I am attempting to parse (with sed) just First Last from the following DN(s) returned by the DSCL command in OSX terminal bash environment...

CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu

I have tried multiple regexes from this site and others, taken from questions very close to what I wanted, and I have tried following the advice to the best of my ability (I don't necessarily consider myself a newbie, but I am definitely a newbie to regex).

DSCL returns a list of DNs, and I would like to only have First Last printed to a text file. I have attempted using sed, but I can't seem to get the correct function. I am open to other commands to parse the output. Every line begins with CN= and then there is a comma between Last and OU=.

Thank you very much for your help!

Solution

Using sed:

sed 's/^CN=\([^,]*\).*/\1/' input_file

^           matches start of line 
CN=         literal string match
\([^,]*\)   everything until a comma
.*          rest
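
For anyone who would rather do this outside sed, the same pattern can be sketched in Python. This is only an illustration of the regex above: the function name common_name is made up here, and unlike sed (which prints non-matching lines unchanged) it returns None when a line does not start with CN=.

```python
import re

# Same idea as the sed expression: anchor at the start of the line,
# match the literal "CN=", then capture everything up to the first comma.
CN_PATTERN = re.compile(r"^CN=([^,]*)")

def common_name(dn):
    """Return the text between 'CN=' and the first comma, or None if absent."""
    match = CN_PATTERN.match(dn)
    return match.group(1) if match else None

print(common_name("CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu"))
```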

Other suggestions

I think all of the regular expression answers provided so far are buggy, insofar as they do not properly handle escaped ',' characters in the common name. For example, consider a distinguishedName like:

CN=Doe\, John,CN=Users,DC=example,DC=local

Better to use a real library able to parse the components of a distinguishedName. If you're looking for something quick on the command line, try piping your DN to a command like this:

    echo 'CN=Doe\, John,CN=Users,DC=activedir,DC=local' | python3 -c 'import sys, ldap.dn; print(ldap.dn.explode_dn(sys.stdin.read().strip(), notypes=1)[0])'

(depends on having the python-ldap library installed). You could cook up something similar with PHP's built-in ldap_explode_dn() function.
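
If installing a library is not an option, a regex can at least be taught about backslash-escaped commas. The following is a rough standard-library sketch, not a full DN parser: the name common_name_escaped is invented for this example, it assumes the line starts with CN=, and it does not decode hex escapes such as \2C or handle quoted values.

```python
import re

# "CN=" followed by any run of either an escaped character (backslash plus
# one character) or a character that is neither a comma nor a backslash.
ESCAPED_CN = re.compile(r"^CN=((?:\\.|[^,\\])*)")

def common_name_escaped(dn):
    """Return the leading CN value with backslash escapes removed, or None."""
    match = ESCAPED_CN.match(dn)
    if match is None:
        return None
    # Drop the backslash in front of each escaped character.
    return re.sub(r"\\(.)", r"\1", match.group(1))

print(common_name_escaped(r"CN=Doe\, John,CN=Users,DC=example,DC=local"))
```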

Two cut commands is probably the simplest (although not necessarily the best):

DSCL | cut -d, -f1 | cut -d= -f2

First, split the output from DSCL on commas and print the first field ("CN=First Last"); then split that on equal signs and print the second field.
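
The same two-step split can be written in Python if the output is being post-processed in a script anyway. This is just a sketch of the idea; the function name first_field_value is hypothetical, and like the cut version it knows nothing about escaped commas.

```python
def first_field_value(line):
    """Mimic `cut -d, -f1 | cut -d= -f2` for a single line of text."""
    first = line.split(",")[0]   # text before the first comma
    parts = first.split("=")     # split that piece on equal signs
    # cut prints the whole line when the delimiter is absent, so do the same.
    return parts[1] if len(parts) > 1 else first

print(first_field_value("CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu"))
```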

http://www.gnu.org/software/gawk/manual/gawk.html#Field-Separators

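awk -F'[,=]' '$1=="CN"{print $2}' foo.txt

Both ',' and '=' act as field separators here, so on every line $1 is the first attribute type and $2 is its value.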

I like awk too, so I print the substring from the fourth char:

DSCL | awk -F, '{print substr($1,4)}' > filterednames.txt

This regex will parse a distinguished name, giving name and val capture groups for each match.

When DN strings contain commas, they are meant to be quoted. This regex correctly handles both quoted and unquoted strings, and also handles escaped (doubled) quotes in quoted strings:

(?:^|,\s?)(?:(?<name>[A-Z]+)=(?<val>"(?:[^"]|"")+"|[^,]+))+

Here it is nicely formatted:

(?:^|,\s?)
(?:
    (?<name>[A-Z]+)=
    (?<val>"(?:[^"]|"")+"|[^,]+)
)+

Here's a link so you can see it in action: https://regex101.com/r/zfZX3f/2

If you want a regex to get only the CN, then this adapted version will do it:

(?:^|,\s?)(?:CN=(?<val>"(?:[^"]|"")+"|[^,]+))
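
To try the CN-only version from Python, note that Python's re module spells named groups as (?P<val>...) rather than (?<val>...). A minimal sketch under that one change follows; the function name first_cn is made up for the example, and the surrounding quotes of a quoted value are left in the result.

```python
import re

# The CN-only regex above, with the named group written in Python syntax.
CN_REGEX = re.compile(r'(?:^|,\s?)(?:CN=(?P<val>"(?:[^"]|"")+"|[^,]+))')

def first_cn(dn):
    """Return the first CN value found in dn (quotes included), or None."""
    match = CN_REGEX.search(dn)
    return match.group("val") if match else None

print(first_cn('CN="Doe, John",CN=Users,DC=example,DC=local'))
print(first_cn("CN=First Last,OU=PCS,OU=guests,DC=domain,DC=edu"))
```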

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow