
I've got a bunch of web page content in my database with links like this:

<a href="/11ecfdc5-d28d-4121-b1c9-1f898ac0b72e">Link</a>

That Guid unique identifier is the ID of another page in the same database.

I'd like to crawl those pages and check for broken links.

To do that I need a function that can return a list of all the Guids on a page:

Function FindGuids(ByVal Text As String) As Collections.Generic.List(Of Guid)
End Function

I figure that this is a job for a regular expression, but I don't know the syntax.



Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Function FindGuids(ByVal Text As String) As List(Of Guid)
    Dim Guids As New List(Of Guid)
    ' 8-4-4-4-12 hexadecimal digits, upper or lower case
    Dim Pattern As String = "[a-fA-F0-9]{8}-([a-fA-F0-9]{4}-){3}[a-fA-F0-9]{12}"
    For Each m As Match In Regex.Matches(Text, Pattern)
        Guids.Add(New Guid(m.Value))
    Next
    Return Guids
End Function



I suggest you grab a free copy of Expresso and learn to build them!

Here's a 10-second attempt with no optimization; it checks upper and lower case and creates a numbered capture group:


Then you just have to iterate through the matched groups...
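
For what it's worth, iterating the groups could look roughly like the sketch below. The pattern here is a stand-in I made up for illustration (it is not the one from the answer above), and FindLinkedGuids and GuidLinkSketch are hypothetical names. It assumes the links always look like href="/<guid>" with double quotes.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module GuidLinkSketch
    ' Hypothetical pattern: capture group 1 holds the GUID that follows href="/
    Private Const Pattern As String = "href=""/([0-9a-fA-F]{8}-(?:[0-9a-fA-F]{4}-){3}[0-9a-fA-F]{12})"""

    Function FindLinkedGuids(ByVal Text As String) As List(Of Guid)
        Dim Guids As New List(Of Guid)
        For Each m As Match In Regex.Matches(Text, Pattern)
            ' Groups(0) is the whole match; Groups(1) is the first numbered capture group
            Guids.Add(New Guid(m.Groups(1).Value))
        Next
        Return Guids
    End Function

    Sub Main()
        Dim Html As String = "<a href=""/11ecfdc5-d28d-4121-b1c9-1f898ac0b72e"">Link</a>"
        For Each g As Guid In FindLinkedGuids(Html)
            Console.WriteLine(g)
        Next
    End Sub
End Module
```

Anchoring on href means a GUID that merely appears in the page text is not picked up as a link, which is probably what you want for a broken-link check.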

There are easier ways to check for broken links.... for example I think will do it :D

This could also help

static Regex isGuid = 
    new Regex(@"^(\{){0,1}[0-9a-fA-F]{8}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{4}\-[0-9a-fA-F]{12}(\}){0,1}$", RegexOptions.Compiled);

and then

static bool IsGuid(string candidate, out Guid output)
{
    // An out parameter must be assigned on every path
    output = Guid.Empty;
    if (candidate != null && isGuid.IsMatch(candidate))
    {
        output = new Guid(candidate);
        return true;
    }
    return false;
}


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow