Question

PHP has built-in support for reading EXIF and IPTC metadata, but I can't find any way to read XMP. Is there one?

Solution

XMP data is literally embedded into the image file, so you can extract it from the file itself with PHP's string functions.

The following demonstrates this procedure (I'm using SimpleXML, but any other XML API, or even simple and clever string parsing, may give you equal results):

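$content = file_get_contents($image);
$xmp_data_start = strpos($content, '<x:xmpmeta');
$xmp_data_end   = strpos($content, '</x:xmpmeta>');
$xmp            = false;
// strpos() returns false when a tag is missing, so check before slicing
if ($xmp_data_start !== false && $xmp_data_end !== false) {
    // 12 is strlen('</x:xmpmeta>'), so the closing tag is included
    $xmp_length = $xmp_data_end - $xmp_data_start + 12;
    $xmp_data   = substr($content, $xmp_data_start, $xmp_length);
    $xmp        = simplexml_load_string($xmp_data);
}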

Just two remarks:

  • XMP makes heavy use of XML namespaces, so keep an eye on them when parsing the XMP data with XML tools.
  • Considering the possible size of image files, you may not be able to use file_get_contents(), as this function loads the whole image into memory. Using fopen() to open a file stream resource and checking chunks of data for the key sequences <x:xmpmeta and </x:xmpmeta> will significantly reduce the memory footprint.
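
To illustrate the namespace remark, here is a minimal sketch of reading a value with SimpleXML and XPath. The XMP packet is a made-up sample, and which properties a real file contains depends on the software that wrote it; the namespace URIs for x, rdf and dc are the standard ones.

```php
<?php
// Hypothetical XMP packet, for illustration only.
$xmp_data = <<<XML
<x:xmpmeta xmlns:x="adobe:ns:meta/">
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
    <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
      <dc:creator>
        <rdf:Seq>
          <rdf:li>Jane Doe</rdf:li>
        </rdf:Seq>
      </dc:creator>
    </rdf:Description>
  </rdf:RDF>
</x:xmpmeta>
XML;

$xmp = simplexml_load_string($xmp_data);

// XPath only resolves prefixes that have been registered explicitly.
$xmp->registerXPathNamespace('rdf', 'http://www.w3.org/1999/02/22-rdf-syntax-ns#');
$xmp->registerXPathNamespace('dc', 'http://purl.org/dc/elements/1.1/');

// Collect every creator entry as a plain string.
$creators = array();
foreach ($xmp->xpath('//dc:creator/rdf:Seq/rdf:li') as $li) {
    $creators[] = (string) $li;
}
```

Without the registerXPathNamespace() calls the query would not match anything useful, which is the usual stumbling block with XMP.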

OTHER TIPS

I'm only replying to this after so much time because this seems to be the best result when searching Google for how to parse XMP data. I've seen this nearly identical snippet used in code a few times and it's a terrible waste of memory. Here is an example of the fopen() method Stefan mentions after his example.

<?php

function getXmpData($filename, $chunkSize)
{
    if (!is_int($chunkSize)) {
        throw new RuntimeException('Expected integer value for argument #2 (chunkSize)');
    }

    if ($chunkSize < 12) {
        throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunkSize)');
    }

    // 'rb' keeps the read binary-safe on every platform
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $startTag = '<x:xmpmeta';
    $endTag = '</x:xmpmeta>';
    $buffer = NULL;
    $hasXmp = FALSE;

    while (($chunk = fread($file_pointer, $chunkSize)) !== FALSE) {

        if ($chunk === "") {
            break;
        }

        $buffer .= $chunk;
        $startPosition = strpos($buffer, $startTag);
        $endPosition = strpos($buffer, $endTag);

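        if ($startPosition !== FALSE && $endPosition !== FALSE) {
            // Both tags are in the buffer: cut out the packet, closing tag included.
            $buffer = substr($buffer, $startPosition, $endPosition - $startPosition + strlen($endTag));
            $hasXmp = TRUE;
            break;
        } elseif ($startPosition !== FALSE) {
            // Start tag found: drop everything before it and keep reading until the end tag shows up.
            $buffer = substr($buffer, $startPosition);
            $hasXmp = TRUE;
        } elseif (strlen($buffer) >= strlen($startTag)) {
            // No start tag yet: keep only the tail that could hold the beginning of a split start tag.
            $buffer = substr($buffer, 1 - strlen($startTag));
        }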
    }

    fclose($file_pointer);
    return ($hasXmp) ? $buffer : NULL;
}

A simple way on Linux is to call the exiv2 program, available on Debian in a package of the same name.

$ exiv2 -e X extract image.jpg

will produce image.xmp containing embedded XMP which is now yours to parse.
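
If you want to drive this from PHP, something like the following sketch could work. Both function names are made up for illustration, and it assumes exiv2 is on the PATH, returns a non-zero exit status on failure, and writes the sidecar next to the image with the extension swapped for .xmp (as in the image.jpg example above).

```php
<?php
// Hypothetical helper: builds the exiv2 command line for a given image path.
function buildExiv2Command($image_path)
{
    // escapeshellarg() quotes the path so spaces and shell metacharacters are safe.
    return 'exiv2 -e X extract ' . escapeshellarg($image_path);
}

// Hypothetical helper: runs exiv2 and returns the sidecar contents, or null on failure.
function extractXmpWithExiv2($image_path)
{
    exec(buildExiv2Command($image_path), $output, $status);
    if ($status !== 0) {
        return null;
    }
    // Assumption: the sidecar is the image path with its extension replaced by .xmp.
    $sidecar = preg_replace('/\.[^.\/]+$/', '', $image_path) . '.xmp';
    return is_readable($sidecar) ? file_get_contents($sidecar) : null;
}
```

Note that a stale .xmp file left over from an earlier run would be picked up as well, so you may want to delete the sidecar after reading it.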

I know... this is kind of an old thread, but it was helpful to me when I was looking for a way to do this, so I figured this might be helpful to someone else.

I took this basic solution and modified it so it handles the case where the tag is split between chunks. This allows the chunk size to be as large or small as you want.

<?php
function getXmpData($filename, $chunk_size = 1024)
{
	if (!is_int($chunk_size)) {
		throw new RuntimeException('Expected integer value for argument #2 (chunk_size)');
	}

	if ($chunk_size < 12) {
		throw new RuntimeException('Chunk size cannot be less than 12 argument #2 (chunk_size)');
	}

	if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
		throw new RuntimeException('Could not open file for reading');
	}

	$tag = '<x:xmpmeta';
	$buffer = false;

	// find open tag
	while ($buffer === false && ($chunk = fread($file_pointer, $chunk_size)) !== false) {
		if(strlen($chunk) <= 10) {
			break;
		}
		if(($position = strpos($chunk, $tag)) === false) {
			// if open tag not found, back up just in case the open tag is on the split.
			fseek($file_pointer, -10, SEEK_CUR);
		} else {
			$buffer = substr($chunk, $position);
		}
	}

	if($buffer === false) {
		fclose($file_pointer);
		return false;
	}

	$tag = '</x:xmpmeta>';
	$offset = 0;
	while (($position = strpos($buffer, $tag, $offset)) === false && ($chunk = fread($file_pointer, $chunk_size)) !== FALSE && $chunk !== '') {
		// Resume the search slightly before the old end of the buffer in case the tag is split between chunks.
		$offset = max(0, strlen($buffer) - 12);
		$buffer .= $chunk;
	}

	fclose($file_pointer);

	if($position === false) {
		// this would mean the open tag was found, but the close tag was not.  Maybe file corruption?
		throw new RuntimeException('No close tag found.  Possibly corrupted file.');
	} else {
		$buffer = substr($buffer, 0, $position + 12);
	}

	return $buffer;
}
?>

I've developed the Xmp Php Toolkit extension: it's a PHP 5 extension based on the Adobe XMP Toolkit, which provides the main classes and methods to read, write, and parse XMP metadata from JPEG, PSD, PDF, video, audio... This extension is under the GPL license. A new release will be available soon for PHP 5.3 (it is currently only compatible with PHP 5.2.x), and should be available on Windows and Mac OS X (currently only FreeBSD and Linux). http://xmpphptoolkit.sourceforge.net/

Bryan's solution was the best one so far, but it had a few issues so I modified it to simplify it, and remove some functionality.

There were three issues I found with his solution:

A) If the chunk extracted falls right in between one of the strings we're searching for, it won't find it. Small chunk sizes are more likely to cause this issue.

B) If the chunk contains both the start AND the end, it won't find it. This is an easy one to fix with an extra if statement to recheck the chunk that the start is found in to see if the end is also found.

C) The else statement added to the end to break the while loop if it doesn't find the xmp data has a side effect that if the start element isn't found on the first pass, it will not check anymore chunks. This is likely easy to fix too, but with the first issue it's not worth it.

My solution below isn't as powerful, but it's more robust. It checks only one chunk and extracts the data from that, so it works only if both the start and the end are in that chunk; the chunk size needs to be large enough to always capture that data. From my experience with Adobe Photoshop/Lightroom exported files, the XMP data typically starts at around 20 kB and ends at around 45 kB. My chunk size of 50,000 bytes seems to work nicely for my images. It could be much smaller if you strip some of that data on export, such as the CRS block, which holds a lot of develop settings.

function getXmpData($filename)
{
    $chunk_size = 50000;
    $buffer = NULL;

    // 'rb' keeps the read binary-safe on every platform
    if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
        throw new RuntimeException('Could not open file for reading');
    }

    $chunk = fread($file_pointer, $chunk_size);
    fclose($file_pointer);

    if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
        $posEnd = strpos($chunk, '</x:xmpmeta>', $posStart);
        // Only return a packet when the closing tag is inside the chunk too
        if ($posEnd !== FALSE) {
            // 12 is strlen('</x:xmpmeta>')
            $buffer = substr($chunk, $posStart, $posEnd - $posStart + 12);
        }
    }
    return $buffer;
}

Thank you Sebastien B. for that shortened version :). If you want to avoid the problem of chunk_size being too small for some files, just add recursion.

function getXmpData($filename, $chunk_size = 50000){
  if (($file_pointer = fopen($filename, 'rb')) === FALSE) {
    throw new RuntimeException('Could not open file for reading');
  }

  $chunk = fread($file_pointer, $chunk_size);
  fclose($file_pointer);

  if (($posStart = strpos($chunk, '<x:xmpmeta')) !== FALSE) {
      $posEnd = strpos($chunk, '</x:xmpmeta>', $posStart);
      if ($posEnd !== FALSE) {
          // 12 is strlen('</x:xmpmeta>')
          return substr($chunk, $posStart, $posEnd - $posStart + 12);
      }
  }

  // a short read means the whole file was searched: stop instead of recursing forever
  if (strlen($chunk) < $chunk_size) {
    return NULL;
  }

// recursion here
  return getXmpData($filename, $chunk_size*2);
}

If you have ExifTool available (a very useful tool) and can run external commands, you can use its option to extract XMP data (-xmp:all) and output it in JSON format (-json), which you can then easily convert to a PHP object:

// escapeshellarg() quotes the path safely, even if it contains spaces or quotes
$command = 'exiftool -g -json -struct -xmp:all ' . escapeshellarg($image_path);
exec($command, $output, $return_var);
$metadata = implode('', $output);
$metadata = json_decode($metadata);
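
As a rough idea of what you can do with the decoded result, here is a small sketch. The JSON string is a made-up sample of the kind of shape to expect (one entry per file, tags nested under their group); the actual tag names depend on your files and ExifTool version, so treat Title and Subject as placeholders.

```php
<?php
// Hypothetical sample output, for illustration only.
$metadata = '[{"SourceFile":"image.jpg","XMP":{"Title":"Sunset","Subject":["beach","evening"]}}]';

$decoded = json_decode($metadata);

// Take the entry for the first (here: only) file, if decoding worked.
$first = (is_array($decoded) && isset($decoded[0])) ? $decoded[0] : null;

$title = null;
$subjects = array();
if ($first !== null && isset($first->XMP)) {
    // Guard every lookup: a tag is simply absent when the file does not carry it.
    $title = isset($first->XMP->Title) ? $first->XMP->Title : null;
    // Cast to array so a single value and a list are handled the same way.
    $subjects = isset($first->XMP->Subject) ? (array) $first->XMP->Subject : array();
}
```

It is also worth checking $return_var from exec() before decoding, since json_decode() just returns null on empty or invalid input.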
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow