Question

So, I found this code from here by Nigel Garvey, and I want to add an ignore list, something like set wordsToIgnore to {"and", "the", "a", "for", "in", "is"}. The problem is that I'm generally incompetent when it comes to these things. Can the powers that be take pity and tell me how to add an ignore list? I've tried various kinds of frequency counts. This one gives correctly styled output in TextEdit and can cut the output down to a given number of words, but it lacks the ability to ignore certain words. Best regards.

Edit: I did post earlier with similar tags, but because I was working with a different script I thought it best to begin a new post. If that was wrong, my apologies.

Solution

I didn't test this, but after a quick look here's what I think. Change this section in the "on main(pdfFile)" handler to the following:

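-- Go through the sorted list, counting the instances of each word. Store each word and its score in a list in the 'scores' list in the script object above.
-- Words in wordsToIgnore are skipped entirely, including when one is the first item of the sorted list.
set wordsToIgnore to {"and", "the", "a", "for", "in", "is"}
set currentWord to missing value -- no word is being counted yet
set c to 0
repeat with i from 1 to (count o's wrds)
    set thisWord to item i of o's wrds
    if thisWord is not in wordsToIgnore then
        if (thisWord is currentWord) then
            set c to c + 1
        else
            -- A new word starts: store the finished word (if any) and reset the count.
            if currentWord is not missing value then set end of o's scores to {currentWord, c}
            set currentWord to thisWord
            set c to 1
        end if
    end if
end repeat
-- Store the last counted word, unless every word was ignored.
if currentWord is not missing value then set end of o's scores to {currentWord, c}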
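
To check the ignore-list logic on its own, without a PDF, here is a minimal sketch that can be pasted into Script Editor. The names sortedWords and scores are hypothetical stand-ins for o's wrds and o's scores in the original script, and the sample words are made up. It starts with no current word, so an ignored word at the head of the sorted list is never recorded.

```applescript
-- Hypothetical stand-in for o's wrds: a small list of words, already sorted.
set sortedWords to {"a", "a", "apple", "apple", "for", "pear", "the", "the", "zebra", "zebra", "zebra"}
set wordsToIgnore to {"and", "the", "a", "for", "in", "is"}

-- Hypothetical stand-in for o's scores.
set scores to {}
set currentWord to missing value -- no word is being counted yet
set c to 0

repeat with i from 1 to (count sortedWords)
	set thisWord to item i of sortedWords
	if thisWord is not in wordsToIgnore then
		if thisWord is currentWord then
			set c to c + 1
		else
			-- A new word starts: store the finished word (if any) and reset the count.
			if currentWord is not missing value then set end of scores to {currentWord, c}
			set currentWord to thisWord
			set c to 1
		end if
	end if
end repeat

-- Store the last counted word, unless every word was ignored.
if currentWord is not missing value then set end of scores to {currentWord, c}

return scores
-- Result: {{"apple", 2}, {"pear", 1}, {"zebra", 3}}
```

AppleScript string comparisons ignore case by default, so "The" is treated as "the" unless the loop is wrapped in a considering case block.
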
Licensed under: CC-BY-SA with attribution