Question

I have a string (example below) and I am trying to use VB.NET to grab all the HotelName values from it.

Here is my code:

    Dim thepage As String = postreqreader.ReadToEnd

    Dim r As New Regex("""HotelName"":"".*""")
    Dim matches As MatchCollection = r.Matches(thepage)
    For Each hotelname As Match In matches
        ListBox1.Items.Add(hotelname.Value.Split("""").GetValue(3))
    Next

However, this only adds the first HotelName to my ListBox. Any ideas how I can get each HotelName match into my ListBox? Thanks for your time!

Shortened string:

"HotelName":"Homewood Suites by Hilton Dallas-Arlington","HotelNameShort":"Homewood Suites by Hilton...","HotelNameShortAvailabilityRecommendations":"Homewood Suites by H...","HotelName":"Knights Inn Arlington","HotelNameShort":"Knights Inn Arlington","HotelNameShortAvailabilityRecommendations":"Knights Inn Arlingto...","HotelName":"Howard Johnson Express Inn - Arlington Ballpark / Six Flags","HotelNameShort":"Howard Johnson Express In...","HotelNameShortAvailabilityRecommendations":"Howard Johnson Expre...","HotelName":"Super 8 Arlington/SW","HotelNameShort":"Super 8 Arlington/SW","HotelNameShortAvailabilityRecommendations":"Super 8 Arlington/SW...",

Solution

Your .* is greedy. That means it consumes as much as possible, so your first match runs from the first "HotelName" all the way to the very last " in the string. You then split that single match (which actually contains all the hotel names) at " and take the element at index 3, which is the first hotel name. (Check hotelname.Value.Split("""").Length; it will be huge.)
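
To see the greedy behaviour in isolation, here is a minimal sketch. The module name, the function name and the two-hotel sample string are made up for illustration; only the pattern is taken from the question.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module GreedyDemo
    ' Hypothetical shortened input containing two HotelName entries.
    Public Const Sample As String =
        """HotelName"":""Knights Inn Arlington""," &
        """HotelNameShort"":""Knights Inn Arlington""," &
        """HotelName"":""Super 8 Arlington/SW"","

    ' Counts how many matches the greedy pattern from the question produces.
    Public Function CountGreedyMatches(input As String) As Integer
        Dim greedy As New Regex("""HotelName"":"".*""")
        Return greedy.Matches(input).Count
    End Function

    Sub Main()
        ' Prints 1: the single match swallows both hotel entries.
        Console.WriteLine(CountGreedyMatches(Sample))
    End Sub
End Module
```

Even though the sample contains two HotelName entries, the greedy pattern yields one match that spans both of them.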

The trick is to either make the repetition non-greedy or, even better, prevent the repetition from consuming " at all:

    """HotelName"":""([^""]*)"""

Now between " and " we are only repeating non-quote characters, so we can never go past the first closing quote. This alone should solve your problem, but I also added those parentheses. They will not match any actual parentheses, but instead create a capturing group, which makes retrieval of the hotel name even easier:

    For Each hotelname As Match In matches
        ListBox1.Items.Add(hotelname.Groups(1).Value)
    Next

For every set of parentheses, whatever is matched inside them is stored in one element of match.Groups, numbered from 1 by counting opening parentheses from left to right (Groups(0) holds the entire match). The regex matching already does everything you need to get the individual values, so there is no need for a second splitting step.
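
Putting the pieces together, a self-contained sketch might look like the following. It collects the names into a List(Of String) instead of a ListBox so that it can run as a console program; the module name, the function name and the sample string are assumptions made for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module HotelNameDemo
    ' Returns every HotelName value found in the input, in order of appearance.
    Public Function ExtractHotelNames(input As String) As List(Of String)
        ' [^""]* repeats only non-quote characters, so each match
        ' stops at the first closing quote.
        Dim r As New Regex("""HotelName"":""([^""]*)""")
        Dim names As New List(Of String)
        For Each m As Match In r.Matches(input)
            ' Groups(0) is the whole match; Groups(1) is the first capturing group.
            names.Add(m.Groups(1).Value)
        Next
        Return names
    End Function

    Sub Main()
        ' Hypothetical shortened input with two hotels.
        Dim sample As String =
            """HotelName"":""Knights Inn Arlington""," &
            """HotelNameShort"":""Knights Inn Arlington""," &
            """HotelName"":""Super 8 Arlington/SW"","
        For Each hotel As String In ExtractHotelNames(sample)
            Console.WriteLine(hotel)
        Next
    End Sub
End Module
```

In the original code you would call ListBox1.Items.Add on each returned name. The non-greedy form of the pattern, """HotelName"":""(.*?)""", should give the same results for input like this, since the lazy repetition also stops at the first closing quote.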

Licensed under: CC-BY-SA with attribution