Question

I have the following code, which extracts the URLs of the img tags from an XML document and works correctly:

Pattern p = Pattern.compile("<img[^>]+src\\s*=\\s*['\"]([^'\"]+)['\"][^>]*>");
Matcher m = p.matcher(xmlString);
while (m.find())
    imagesURLs.add(m.group(1));

My xml looks like the following:

<item>
    <desc>
       txt txt txt txt <img src="htttp://mysite.com/images/img.png"> txt txt
       <img src="htttp://mysite.com/images/img.png"> ...
    </desc>
</item>
<item>
    <desc>
       txt txt txt txt <img src="htttp://mysite.com/images/img.png"> txt txt
       <img src="htttp://mysite.com/images/img.png"><img src="htttp://mysite.com/images/img.png">
    </desc>
</item>

I want to modify the code so that it gets only the first img tag URL from each desc tag.

Solution

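Instead of trying to solve this with a regex (which is a very poor fit for nested markup), you should parse the XML with an XML parsing library, such as XmlPullParser on Android or one of the parsers that ship with Java (DOM, SAX, StAX). With a parser you can track when you are inside a desc element and take only the first img you encounter there.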
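As a rough illustration of the parser approach, here is a minimal sketch using StAX (javax.xml.stream), which ships with the JDK. It is not the only way to do this, and it rests on some assumptions: the class name FirstImagePerDesc and the method firstImageUrls are made up for the example, and the input must be well-formed XML (a single root element, and img tags that are closed, e.g. <img ... />). The sample in the question is not well-formed as posted, so it would need to be cleaned up first, or the desc content handled as HTML by a lenient parser.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper class, named only for this example.
class FirstImagePerDesc {

    // Returns the src attribute of the first <img> inside each <desc> element.
    // Assumes the input is well-formed XML; throws IllegalArgumentException otherwise.
    static List<String> firstImageUrls(String xml) {
        List<String> urls = new ArrayList<>();
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            boolean inDesc = false;      // currently inside a <desc> element
            boolean foundInDesc = false; // an image was already taken from this <desc>
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.START_ELEMENT) {
                        String name = reader.getLocalName();
                        if (name.equals("desc")) {
                            inDesc = true;
                            foundInDesc = false;
                        } else if (name.equals("img") && inDesc && !foundInDesc) {
                            String src = reader.getAttributeValue(null, "src");
                            if (src != null) {
                                urls.add(src);
                                foundInDesc = true; // skip later images in this <desc>
                            }
                        }
                    } else if (event == XMLStreamConstants.END_ELEMENT
                            && reader.getLocalName().equals("desc")) {
                        inDesc = false;
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Input is not well-formed XML", e);
        }
        return urls;
    }

    public static void main(String[] args) {
        // Made-up sample: wrapped in a root element, with self-closed img tags.
        String xml = "<items>"
                + "<item><desc>txt <img src=\"http://example.com/a.png\"/> txt "
                + "<img src=\"http://example.com/b.png\"/></desc></item>"
                + "<item><desc><img src=\"http://example.com/c.png\"/>"
                + "<img src=\"http://example.com/d.png\"/></desc></item>"
                + "</items>";
        System.out.println(firstImageUrls(xml));
    }
}
```

The two boolean flags do the work the regex could not: the parser reports where each desc starts and ends, so "first img per desc" becomes a simple state check. On Android, the same loop should translate to XmlPullParser with its START_TAG and END_TAG events.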

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow