Question

The Problem

I am trying to create an import feature for a VB.NET desktop application (Visual Studio 2012) that would analyze a vCard and distribute all the data throughout a class. The class has been created and the data is being analyzed correctly via regex apart from the name element. Below is the vCard text I am using (this was exported from Microsoft Outlook).

BEGIN:VCARD
VERSION:2.1
N;LANGUAGE=en-gb:Test;Johnny;Stewart;Mr.
FN:Mr. Johnny Stewart Test
ORG:Test Company
TITLE:Software Development
TEL;WORK;VOICE:01210000000
TEL;HOME;VOICE:01211111111
TEL;WORK;FAX:01212222222
ADR;WORK;PREF:;;10 Test St;Teston;Testville;T0 0TT;United Kingdom
LABEL;WORK;PREF;ENCODING=QUOTED-PRINTABLE:10 Test St=0D=0A=
Teston=0D=0A=
Testville=0D=0A=
T0 0TT
X-MS-OL-DEFAULT-POSTAL-ADDRESS:2
URL;WORK:www.webpageaddress.co.uk
EMAIL;PREF;INTERNET:Johnny.Test@TestCo.co.uk
X-MS-IMADDRESS:example.IMAddress@webpageaddress.co.uk
X-MS-CARDPICTURE;TYPE=JPEG;ENCODING=BASE64:
 /9j/4AAQSkZJRgABAQEAYABgAAD/2wBDAAcFBQYFBAcGBQYIBwcIChELCgkJChUPEAwRGBUa
 GRgVGBcbHichGx0lHRcYIi4iJSgpKywrGiAvMy8qMicqKyr/2wBDAQcICAoJChQLCxQqHBgc
 KioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKioqKir/wAAR
 CACUACcDASIAAhEBAxEB/8QAHwAAAQUBAQEBAQEAAAAAAAAAAAECAwQFBgcICQoL/8QAtRAA
 AgEDAwIEAwUFBAQAAAF9AQIDAAQRBRIhMUEGE1FhByJxFDKBkaEII0KxwRVS0fAkM2JyggkK
 FhcYGRolJicoKSo0NTY3ODk6Q0RFRkdISUpTVFVWV1hZWmNkZWZnaGlqc3R1dnd4eXqDhIWG
 h4iJipKTlJWWl5iZmqKjpKWmp6ipqrKztLW2t7i5usLDxMXGx8jJytLT1NXW19jZ2uHi4+Tl
 5ufo6erx8vP09fb3+Pn6/8QAHwEAAwEBAQEBAQEBAQAAAAAAAAECAwQFBgcICQoL/8QAtREA
 AgECBAQDBAcFBAQAAQJ3AAECAxEEBSExBhJBUQdhcRMiMoEIFEKRobHBCSMzUvAVYnLRChYk
 NOEl8RcYGRomJygpKjU2Nzg5OkNERUZHSElKU1RVVldYWVpjZGVmZ2hpanN0dXZ3eHl6goOE
 hYaHiImKkpOUlZaXmJmaoqOkpaanqKmqsrO0tba3uLm6wsPExcbHyMnK0tPU1dbX2Nna4uPk
 5ebn6Onq8vP09fb3+Pn6/9oADAMBAAIRAxEAPwD2gOMYx+tOBHXFVg3Hf8qXf6gGkMsGTsDn
 /gVG8e4/4FVfzDjhBigyHHUj8aAJvNI7kj6UVX3jPUfnRQBEGJ5LH8Vpd59QaqCRc9c/WnBh
 6/rQInL56AUK2OuKi8z0Ofqc00sc/eFAE5f0/nRVcsfY/Q0UAQKx7Gl8zHcGoRJn1P0FBZv7
 2PqDQBKJAT0x9ad5hx1H51Bu4+9+tN389zQBPvJ9B+FFQE/5zRQBCXPf+dLv47VXDgH/ABpS
 /wBBQBNv98fjQJCe5P0NQh89efxoLL6/pQBLu9/zoqHPpiigCLcfb8qUNz2qDf7GnZHr+lAE
 pYetG7jjNRZI7fpTGc+lAE5bnkUVAHPfAooAj3j3o3jHWosg/wD6qNw+v4UAS7qXcfQ/hUBJ
 Pt+FAbtn9KAJt3PP60VDu9/1ooAjyexpdw71CT64pAwFAE2eKTdj/wDXUe8UoYHuaAJN5PUN
 RUJY0UAR0ob1qPPrn86Nw9aAJN2O9G4dzUe4duaTJoAl3j+8KKjz9aKAGfjSE+1R5HelDEdP
 5UAPD47Ub/eoyfUil3jFADi3qP1oqPdz1NFADec9TRRRTAWkyaKKAE3H1ooooA//2Q==

X-MS-OL-DESIGN;CHARSET=utf-8:<card xmlns="http://schemas.microsoft.com/office/outlook/12/electronicbusinesscards" ver="1.0" layout="left" bgcolor="ffffff"><img 

xmlns="" align="fit" area="16" use="cardpicture"/><fld xmlns="" prop="name" align="left" dir="ltr" style="b" color="000000" size="10"/><fld xmlns="" prop="org" align="left" 

dir="ltr" color="000000" size="8"/><fld xmlns="" prop="title" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="email" 

align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="addrwork" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" 

prop="addrhome" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="webhome" align="left" dir="ltr" color="000000" 

size="8"/><fld xmlns="" prop="webwork" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="blank" size="8"/><fld xmlns="" prop="telwork" align="left" dir="ltr" 

color="000000" size="8"/><fld xmlns="" prop="telhome" align="left" dir="ltr" color="000000" size="8"/><fld xmlns="" prop="faxwork" align="left" dir="ltr" color="000000" 

size="8"/><fld xmlns="" prop="im" align="left" dir="ltr" color="000000" size="8"/></card>
REV:20140318T153016Z
END:VCARD

And below is the line that I want to match up with regex (line 3):

N;LANGUAGE=en-gb:Test;Johnny;Stewart;Mr.

The Attempt

Now I am not great at regex but I did give it a go using online cheat sheets. I got close but I am getting a bit frustrated with it now as I feel I have tried everything. Below is the regex I am using:

(\n(?<strElement>(N))) (;(?<strLang>(LANGUAGE)))* ([^:]*)*  (:(?<strSurname>([^;]*))) (;(?<strGivenName>([^;]*)))  ?(;(?<strMidName>([^\n|^;]*))) ?(;(?<strPrefix>([^\n]*))) ?(;(?<strSuffix>([^\n]*)))

This is close but it puts the prefix (in this case "Mr.") into the suffix group which is obviously incorrect.

Notes

  • As far as I can tell from my research on vCards, the LANGUAGE parameter on the name element may be optional (I think I have catered for this in the above regex).
  • If data is missing, such as the suffix, the export does not add semicolons to mark the empty fields.

Summary

If anyone can give me suggestions I would greatly appreciate it. An explanation would help as well, as I am trying to get used to regex.


Solution

A pattern such as this should give you an idea of how to match what you're looking for:

(?<1>\w+);(?<2>.*\w{2}-\w{2}):(?<3>\w+);(?<4>\w+);(?<5>\w+);(?<6>\w+.)

example: http://regex101.com/r/jX6lA3

In VB.NET you might code it something like this:

Imports System
Imports System.Text.RegularExpressions

Module Example
   Public Sub Main()
      Dim pattern As String = "(?<1>\w+);(?<2>.*\w{2}-\w{2}):(?<3>\w+);(?<4>\w+);(?<5>\w+);(?<6>\w+.)"
      ' Sample N line from the question; substitute the text of your own vCard here.
      Dim input As String = "N;LANGUAGE=en-gb:Test;Johnny;Stewart;Mr."
      Dim matches As MatchCollection = Regex.Matches(input, pattern)

      For Each match As Match In matches
         ' VB.NET indexes Groups with parentheses; {0} is where the value is inserted.
         Console.WriteLine("1: {0}", match.Groups("1").Value)
         Console.WriteLine("2: {0}", match.Groups("2").Value)
         Console.WriteLine("3: {0}", match.Groups("3").Value)
         Console.WriteLine("4: {0}", match.Groups("4").Value)
         Console.WriteLine("5: {0}", match.Groups("5").Value)
         Console.WriteLine("6: {0}", match.Groups("6").Value)
         Console.WriteLine()
      Next
      Console.WriteLine()
   End Sub
End Module

If you would rather keep the regex pattern you already have, it is simple to adapt the code: refer to each group by name and output the groups in whatever order you want. For example, the prefix (the "strPrefix" group in your pattern) can be read like this:

Console.WriteLine("Prefix: {0}", match.Groups("strPrefix").Value)
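
As for why your own pattern puts "Mr." into strSuffix: assuming it is run with RegexOptions.IgnorePatternWhitespace, each stray ? makes the group before it optional, while the final strSuffix group is not optional. With only four fields on the line, the engine skips the optional strPrefix group so that the mandatory strSuffix group can still match. Below is a minimal sketch of one way around this, where every field after the surname is optional so the groups fill from left to right in the vCard order (family name, given name, additional names, prefix, suffix). The module name NameLineSketch and the function ParseNameLine are made up for illustration, and the pattern is written with the sample N line in mind, not as a complete vCard parser.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NameLineSketch
   ' Hypothetical helper (not from the original answer): finds the N line
   ' in a block of vCard text and splits it into named groups.
   Public Function ParseNameLine(vCardText As String) As Match
      ' ^N               the N element at the start of a line (Multiline)
      ' (?:;[^:\r\n]*)?  optional parameters such as ;LANGUAGE=en-gb
      ' :                separator before the values
      ' Each later field is wrapped in (?:;...)? so it may be absent,
      ' and [^;\r\n]* stops at the next semicolon or the end of the line.
      Dim pattern As String =
         "^N(?:;[^:\r\n]*)?:" &
         "(?<strSurname>[^;\r\n]*)" &
         "(?:;(?<strGivenName>[^;\r\n]*))?" &
         "(?:;(?<strMidName>[^;\r\n]*))?" &
         "(?:;(?<strPrefix>[^;\r\n]*))?" &
         "(?:;(?<strSuffix>[^;\r\n]*))?"
      Return Regex.Match(vCardText, pattern, RegexOptions.Multiline)
   End Function

   Public Sub Main()
      ' A few lines of the sample card from the question.
      Dim sample As String =
         "VERSION:2.1" & Environment.NewLine &
         "N;LANGUAGE=en-gb:Test;Johnny;Stewart;Mr." & Environment.NewLine &
         "FN:Mr. Johnny Stewart Test"

      Dim m As Match = ParseNameLine(sample)
      If m.Success Then
         Console.WriteLine("Surname: {0}", m.Groups("strSurname").Value)
         Console.WriteLine("Given:   {0}", m.Groups("strGivenName").Value)
         Console.WriteLine("Middle:  {0}", m.Groups("strMidName").Value)
         Console.WriteLine("Prefix:  {0}", m.Groups("strPrefix").Value)
         Console.WriteLine("Suffix:  {0}", m.Groups("strSuffix").Value)
      End If
   End Sub
End Module
```

For the sample card this should give Test, Johnny, Stewart and Mr. in the first four groups and an empty strSuffix. If you need to tell a missing field from an empty one, m.Groups("strSuffix").Success is False when the field was not on the line at all.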
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow