Some formats can be handled directly in PHP. DOCX and PPTX are easy, because both are ZIP archives that store their document statistics in docProps/app.xml:
For Word files:
function PageCount_DOCX($file) {
    $pageCount = 0;
    $zip = new ZipArchive();
    if ($zip->open($file) === true) {
        // app.xml holds the page count saved by the authoring application
        if (($index = $zip->locateName('docProps/app.xml')) !== false) {
            $data = $zip->getFromIndex($index);
            $xml = new SimpleXMLElement($data);
            $pageCount = (int) $xml->Pages; // cast the XML node to an integer
        }
        $zip->close(); // close exactly once, whether or not app.xml was found
    }
    return $pageCount;
}
And for PowerPoint:
function PageCount_PPTX($file) {
    $pageCount = 0;
    $zip = new ZipArchive();
    if ($zip->open($file) === true) {
        // app.xml holds the slide count saved by the authoring application
        if (($index = $zip->locateName('docProps/app.xml')) !== false) {
            $data = $zip->getFromIndex($index);
            $xml = new SimpleXMLElement($data);
            $pageCount = (int) $xml->Slides; // cast the XML node to an integer
        }
        $zip->close(); // close exactly once, whether or not app.xml was found
    }
    return $pageCount;
}
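The two functions above differ only in the element they read, so they could be folded into one helper. What follows is a minimal sketch, not part of the original answer: the name `officeAppProperty()` and its return-null-on-failure convention are assumptions made for illustration. Keep in mind that app.xml only contains what the last saving application wrote, so the value can be missing or stale.

```php
<?php
// Hypothetical helper (name and signature are illustrative assumptions):
// read one integer property such as "Pages" or "Slides" from
// docProps/app.xml inside a DOCX/PPTX file. Returns null when unavailable.
function officeAppProperty(string $file, string $property): ?int
{
    $zip = new ZipArchive();
    if ($zip->open($file) !== true) {
        return null; // not a readable ZIP archive
    }

    // getFromName() returns false if the entry does not exist
    $data = $zip->getFromName('docProps/app.xml');
    $zip->close();
    if ($data === false) {
        return null;
    }

    $xml = simplexml_load_string($data);
    if ($xml === false || !isset($xml->{$property})) {
        return null; // malformed XML, or the property was never written
    }

    return (int) $xml->{$property};
}
```

With this in place, `officeAppProperty($file, 'Pages')` would cover Word files and `officeAppProperty($file, 'Slides')` would cover PowerPoint files. Returning null instead of 0 lets the caller tell "no count stored" apart from a real count.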
Older Office documents are a different story. You'll find some discussion about doing that here: How to get the number of pages in a Word Document on linux?
As for PDF files, I prefer to use FPDI, even though it requires a license to parse newer PDF file formats. You can do it simply like this:
function PageCount_PDF($file) {
    $pageCount = 0;
    if (file_exists($file)) {
        // Paths assume FPDF and FPDI 1.x are unpacked next to this script
        require_once 'fpdf/fpdf.php';
        require_once 'fpdi/fpdi.php';
        $pdf = new FPDI(); // FPDI 1.x class name; FPDI 2 uses setasign\Fpdi\Fpdi
        $pageCount = $pdf->setSourceFile($file); // returns the number of pages
    }
    return $pageCount;
}