Question

I'm using Objective-C to write a program that pulls data out of an HTML file using regexes. The only lines that matter to the program contain the text popupName, and I need to strip all HTML tags from them as well. Can this be done with one regex?

So far I have been using popupName to find the line I am looking for and then deleting everything matching <[^>]*>.

Could these two operations be combined into one?

Here's example input:

            <div>
                <div class="popupName"> Bob Smith</div>
                <div class="popupTitle">
                    <i></i>
                </div>
                <br />
                <div class="popupTitle"></div>
                <div class="popupLink"><a href="mailto:"></a></div>
            </div>

From that I would like to extract only "Bob Smith". The real input, however, contains multiple occurrences of lines like that one.

Solution

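Your pattern is close to what you want. Combine it with an alternative that matches popupName and has a capture group:

"popupName">(.*)|<[^>]*>

Replacing every match with $1, the contents of the capture group, removes the tags and keeps the text between them. Note that the engine tries the pattern at each position from left to right, so on a line such as <div class="popupName"> Bob Smith</div> the <[^>]*> alternative consumes the whole opening tag before "popupName"> gets a chance to match. The name survives because it sits outside any tag, which also means text inside other elements would survive. The pattern gives the expected result here because the popupName element is the only one containing text.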

In Objective-C:

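// Sample input from the question, flattened onto one line.
NSString *searchText = @"<div><div class=\"popupName\"> Bob Smith</div><div class=\"popupTitle\"><i></i></div><br /><div class=\"popupTitle\"></div><div class=\"popupLink\"><a href=\"mailto:\"></a></div></div>";
NSString *pattern = @"\"popupName\">(.*)|<[^>]*>";
NSRange searchRange = NSMakeRange(0, [searchText length]);

NSError *error = nil;
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
// Replace each match with capture group 1, which is empty whenever a plain tag matched.
NSString *results = [regex stringByReplacingMatchesInString:searchText options:0 range:searchRange withTemplate:@"$1"];
// Trim the leading space that preceded the name in the markup.
results = [results stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]];

NSLog(@"results: %@",results);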

Result:

results: Bob Smith
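
Since the question mentions multiple popupName lines, it may help to collect each name separately instead of rewriting the whole string. The following is a minimal sketch, not part of the original answer: the helper name ExtractPopupNames is hypothetical, and it assumes every name sits in a <div class="popupName"> element that contains plain text only, with no nested tags.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an illustration, not from the original answer):
// returns the trimmed text of every <div class="popupName">...</div> in html.
static NSArray<NSString *> *ExtractPopupNames(NSString *html) {
    // Group 1 captures everything up to the next '<'.
    // Assumption: the div contains plain text only, no nested tags.
    NSString *pattern = @"<div class=\"popupName\">([^<]*)</div>";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *names = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // rangeAtIndex:1 is the range of the first capture group.
        NSString *raw = [html substringWithRange:[match rangeAtIndex:1]];
        [names addObject:[raw stringByTrimmingCharactersInSet:whitespace]];
    }
    return names;
}
```

Called on the sample input from the question, this sketch should return an array holding the single string "Bob Smith". Text in other elements, such as a non-empty popupTitle, is ignored because only the popupName div is matched.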

OTHER TIPS

I've been playing with this for a bit, but I'm using JavaScript, where I can't use a positive lookbehind. If the regex engine you use from Objective-C supports positive lookbehind and positive lookahead, you should be able to do this with a single pattern.
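
NSRegularExpression follows ICU regex syntax, which supports both lookahead and bounded lookbehind, so the idea in this tip can be tried directly. The following is a minimal sketch under assumptions: the helper name PopupNamesUsingLookaround is hypothetical, and it assumes the name is plain text directly between the opening popupName tag and its closing </div>.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (an illustration of the lookaround idea): the match
// itself is only the text between the opening tag and the closing </div>.
static NSArray<NSString *> *PopupNamesUsingLookaround(NSString *html) {
    // (?<=...) requires the popupName opening tag right before the match;
    // (?=...) requires </div> right after it. Neither is part of the match.
    NSString *pattern = @"(?<=class=\"popupName\">)[^<]+(?=</div>)";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern options:0 error:&error];
    if (regex == nil) {
        NSLog(@"Invalid pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *names = [NSMutableArray array];
    NSCharacterSet *whitespace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray<NSTextCheckingResult *> *matches =
        [regex matchesInString:html options:0 range:NSMakeRange(0, html.length)];
    for (NSTextCheckingResult *match in matches) {
        // No capture group is needed: the whole match is the name.
        NSString *raw = [html substringWithRange:match.range];
        [names addObject:[raw stringByTrimmingCharactersInSet:whitespace]];
    }
    return names;
}
```

Compared with a capture group, the lookaround version keeps the tags out of the match entirely, at the cost of a pattern that is somewhat harder to read.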

Licensed under: CC-BY-SA with attribution