Question

Using PHP and a regular expression, I need to grab the value between <rdf:li xml:lang="x-default"> and </rdf:li>.

So the string I need a value from will have this line in it...

<rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>

I need to get Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5 into a PHP variable. I'm not good with regex. Could someone help me get this value?

$str = '<rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>';

My string comes from reading the contents of an .AI file:

%PDF-1.5
%âãÏÓ
1 0 obj
<</Metadata 2 0 R/OCProperties<</D<</ON[7 0 R]/Order 8 0 R/RBGroups[]>>/OCGs[7 0 R]>>/Pages 3 0 R/Type/Catalog>>
endobj
2 0 obj
<</Length 67315/Subtype/XML/Type/Metadata>>stream
<?xpacket begin="" id="W5M0MpCehiHzreSzNTczkc9d"?>
<x:xmpmeta xmlns:x="adobe:ns:meta/" x:xmptk="Adobe XMP Core 5.3-c011 66.145661, 2012/02/06-14:56:27        ">
   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about=""
            xmlns:dc="http://purl.org/dc/elements/1.1/">
         <dc:format>application/pdf</dc:format>
         <dc:title>
            <rdf:Alt>
               <rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>
            </rdf:Alt>
         </dc:title>
      </rdf:Description>
      <rdf:Description rdf:about=""
            xmlns:xmp="http://ns.adobe.com/xap/1.0/"
            xmlns:xmpGImg="http://ns.adobe.com/xap/1.0/g/img/">
         <xmp:MetadataDate>2014-04-01T16:13-05:00</xmp:MetadataDate>
         <xmp:ModifyDate>2014-04-01T16:13-05:00</xmp:ModifyDate>
         <xmp:CreateDate>2014-04-01T16:13-05:00</xmp:CreateDate>
         <xmp:CreatorTool>Adobe Illustrator CS6 (Windows)</xmp:CreatorTool>
         <xmp:Thumbnails>
            <rdf:Alt>
               <rdf:li rdf:parseType="Resource">....

Solution

Jason, all reservations aside, since you asked for a regex solution, here's a simple regex that matches what you want:

<rdf:li xml:lang="x-default">\K[^<]+(?=</rdf:li>)

How to use it:

$str = '<rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>';

$regex = '~<rdf:li xml:lang="x-default">\K[^<]+(?=</rdf:li>)~';

if (preg_match($regex, $str, $m)) {
    // \K drops the opening tag from the match, so $m[0] is just the text
    $myvariable = $m[0];
    echo $myvariable . "<br />";
}

The output, as rendered in a browser:

Yuengling Americas Oldest Brewery Huffmans Pub & Grub 60x30 5
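
The matched text is raw XML, so $myvariable actually holds &amp; and it is the browser that displays it as &. Here is a minimal sketch, assuming you want the decoded character in the variable itself. The $title name is just for illustration, and html_entity_decode with ENT_XML1 assumes PHP 5.4 or later:

```php
<?php
$str = '<rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>';
$regex = '~<rdf:li xml:lang="x-default">\K[^<]+(?=</rdf:li>)~';

$title = null; // hypothetical variable name, not from the question
if (preg_match($regex, $str, $m)) {
    // $m[0] is the raw XML text and still contains the entity &amp;
    // ENT_XML1 limits decoding to the predefined XML entities
    $title = html_entity_decode($m[0], ENT_QUOTES | ENT_XML1, 'UTF-8');
}

echo $title, PHP_EOL;
```

If the value is going back into an HTML page, you would normally leave it encoded, or re-encode it with htmlspecialchars when printing.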

How does it work?

We start by matching the entire left delimiter, <rdf:li xml:lang="x-default">. The \K then tells the engine to drop everything matched so far from the returned match. Next, [^<]+ matches one or more characters that are not <, which eats up the text you want. Finally, the lookahead (?=</rdf:li>) checks that the closing delimiter immediately follows the matched text, without including it in the match.
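
To see what \K and the lookahead buy you, it may help to compare against a plain capture group. This is only a sketch: $contents is a hypothetical excerpt standing in for the file contents, and the variable names are made up for the comparison.

```php
<?php
// Hypothetical excerpt standing in for what was read from the .AI file
$contents = <<<'XMP'
         <dc:title>
            <rdf:Alt>
               <rdf:li xml:lang="x-default">Yuengling Americas Oldest Brewery Huffmans Pub &amp; Grub 60x30 5</rdf:li>
            </rdf:Alt>
         </dc:title>
XMP;

// Version from the answer: \K and a lookahead keep the tags out of the match
$withK = '~<rdf:li xml:lang="x-default">\K[^<]+(?=</rdf:li>)~';

// Equivalent version with a capture group: the tags are part of the match,
// and the text lands in group 1 instead
$withGroup = '~<rdf:li xml:lang="x-default">([^<]+)</rdf:li>~';

preg_match($withK, $contents, $a);
preg_match($withGroup, $contents, $b);

// $a[0] is only the text; $b[0] is the whole element and $b[1] is the text
var_dump($a[0] === $b[1]);
```

Either form works here. The \K version simply lets you read the result from index 0 without a group.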

Licensed under: CC-BY-SA with attribution