I am trying to get a word count from an uploaded Word document (.doc, .docx, .rtf), but the result always includes the annoying Word formatting markup.

Has anybody tackled this issue before and knows how to solve it? Thanks :)


Solution

You will need to:

  1. Distinguish the file type

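    // 'image' is the name of the upload field in the form
    $file_name = $_FILES['image']['name'];
    // pathinfo() avoids the "Only variables should be passed by reference"
    // notice that end(explode(...)) raises
    $file_extn = strtolower(pathinfo($file_name, PATHINFO_EXTENSION));
    
    // docx2text() and rtf2text() are placeholders for the converters from step 2
    if ($file_extn == "doc" || $file_extn == "docx") {
        docx2text();
    } elseif ($file_extn == "rtf") {
        rtf2text();
    }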
    
  2. Convert the document to text

    For doc or docx: https://stackoverflow.com/a/7371315/2512934
    For rtf: http://webcheatsheet.com/php/reading_the_clean_text_from_rtf.php

  3. Count the words with str_word_count(): http://php.net/manual/en/function.str-word-count.php
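
To make steps 2 and 3 concrete, here is a minimal sketch for the .docx case only. It is not the code from the linked answers: `docx_to_text()` and `count_words()` are hypothetical helper names, and the sketch assumes the zip extension (`ZipArchive`) is enabled. A .docx file is a zip archive whose main text lives in `word/document.xml`, so stripping the XML tags leaves the plain text. Legacy .doc and .rtf files are different formats and still need the converters linked above. For counting, it splits on whitespace with `preg_split()` instead of using `str_word_count()`, which is an assumption on my part that the documents may contain non-ASCII text.

```php
<?php
// Hypothetical helper: pull the plain text out of a .docx file.
// Assumes ext-zip is available; does not handle .doc or .rtf.
function docx_to_text(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a zip archive");
    }
    // The main document body is stored in this entry of the archive
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();
    if ($xml === false) {
        throw new RuntimeException("No word/document.xml found in $path");
    }
    // Turn paragraph ends, line breaks and tabs into spaces so that
    // words from adjacent paragraphs do not get glued together
    $xml = preg_replace('~</w:p>|<w:(?:br|tab)\b[^>]*/>~', ' ', $xml);
    // Drop the remaining tags and decode entities such as &amp;
    return html_entity_decode(strip_tags($xml), ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: count whitespace-separated tokens.
function count_words(string $text): int
{
    // The /u flag makes the pattern UTF-8 aware; empty pieces are skipped
    $words = preg_split('/\s+/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return $words === false ? 0 : count($words);
}
```

With an upload you would call something like `count_words(docx_to_text($_FILES['image']['tmp_name']))`. Note that this counts any whitespace-separated token as a word, so a standalone "&" or a number is included in the total.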

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow