Question

I am new to regular expressions. I am trying to find the images that don't have a BORDER attribute, so the result should be the second image. The text I am trying to match with the regex is below.

<IMG onerror="this.errored=true;" USEMAP="#Map-43" BORDER="0"/>
<IMG onerror="this.errored=true;" USEMAP="#Map-43" />
<IMG onerror="this.errored=true;" USEMAP="#Map-43" BORDER="0"/>    

I tried the following regex, but it didn't work:

<IMG\\s[^((>)&(?!BORDER)]*>

Can anyone help with this, please?


Solution

You can use HtmlAgilityPack to parse the HTML:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.LoadHtml(html);

var imgs = doc.DocumentNode.Descendants("img")
    .Where(n => n.Attributes["border"] == null)
    .ToList();

PS: See also this: RegEx match open tags except XHTML self-contained tags

Other tips

The better choice would be to use an HTML parser for such a problem.

But the main problem with your regex is that you put the lookahead inside a character class, so all of its characters were treated as literal characters.

<IMG\s(?:(?!BORDER)[^>])*>

should work better. See it on Regexr.
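To make the difference concrete, here is a minimal sketch that runs both patterns against the sample markup from the question. The class name `LookaheadDemo` and the constant names are made up for this illustration; only `Regex.Matches` and the two patterns come from the posts above.

```csharp
using System;
using System.Text.RegularExpressions;

public class LookaheadDemo
{
    // Pattern from the question: inside [...] the lookahead is not a lookahead,
    // it is just a list of characters in a negated character class.
    public const string Broken = @"<IMG\s[^((>)&(?!BORDER)]*>";

    // Pattern from this answer: the lookahead is checked before each consumed character.
    public const string Fixed = @"<IMG\s(?:(?!BORDER)[^>])*>";

    public static void Main()
    {
        string html =
            "<IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" BORDER=\"0\"/>\n" +
            "<IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" />\n" +
            "<IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" BORDER=\"0\"/>";

        // The negated class excludes the letters B, O, R, D and E,
        // so matching stops at the 'E' in USEMAP and no tag matches.
        Console.WriteLine(Regex.Matches(html, Broken).Count); // prints 0

        // Only the second tag (the one without BORDER) is printed.
        foreach (Match m in Regex.Matches(html, Fixed))
            Console.WriteLine(m.Value);
    }
}
```

Note that both patterns are case-sensitive as written, and the fixed one also rejects a tag that merely contains the text BORDER inside an attribute value. Passing `RegexOptions.IgnoreCase` would probably be needed for lowercase markup, which is one more reason to prefer a parser.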

But that's only to explain your regex problem. To solve your programming task, please use L.B's answer.

Working example:

// Requires: using System; using System.Text.RegularExpressions;
String html = "<IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" BORDER=\"0\"/><IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" /><IMG onerror=\"this.errored=true;\" USEMAP=\"#Map-43\" BORDER=\"0\"/>";
// Print the first match, which is the only <IMG> tag without a BORDER attribute.
Console.WriteLine(Regex.Matches(html, @"<IMG\s(?:(?!BORDER)[^>])*>")[0].Value);
Console.ReadLine();

Another way is to get the images without a border attribute client-side, using jQuery and CSS selectors:

var $img = $('img').not('[border]');


Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow