Question

For a programming lab, my assignment is to write a program that checks the spelling of words. I am doing this all on my own, so this is basically my last resort. The program should work like this: iterate through all lines of the document you want to check. If a word is not in the dictionary, print the word and the line where you found it.

I have to use a dictionary file in which all words are capitalized. The file that I'm checking for correct spelling is not. So somewhere I have to capitalize the words, but I cannot figure out where. Every time I run this code, it just prints every line in AliceInWonderLand200.txt.

My code:

import re
def split_line(line):
    return re.findall('[A-Za-z]+(?:\'[A-Za-z]+)?',line)

file = open("dictionary.txt")
dictionary = []
for line in file:
    line = line.strip()
    dictionary.append(line)
file.close()
print("----Linear search-----")
file2 = open("AliceInWonderLand200.txt")
i = 0
for line in file2:
    words = []
    words.append(split_line(line))
    for word in line:
        i+= 1
        word = word.upper()
        if word not in dictionary:
            print("Line ",i,": probably misspelled: ", word)
file.close()

What I have tried:

I have tried using words.append(split_line(line.upper())), but that didn't work. I have also tried assigning word.upper() to word, which didn't work either. Every time I run this code, it just prints every line in AliceInWonderLand200.txt.

I have looked everywhere for a satisfying answer. I found the same question here on Stack Overflow ("Python Spell Checker Linear Search"), but I didn't really understand the answer.

edit

I have added the task and the expected output to make it easier to help.

What my output should be:

--- Linear Search ---
Line 3  possible misspelled word: Lewis
Line 3  possible misspelled word: Carroll
Line 46  possible misspelled word: labelled
Line 46  possible misspelled word: MARMALADE
Line 58  possible misspelled word: centre
Line 59  possible misspelled word: learnt
Line 69  possible misspelled word: Antipathies
Line 73  possible misspelled word: curtsey
Line 73  possible misspelled word: CURTSEYING
Line 79  possible misspelled word: Dinah'll
Line 80  possible misspelled word: Dinah
Line 81  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 89  possible misspelled word: Dinah
Line 149  possible misspelled word: flavour
Line 150  possible misspelled word: toffee
Line 186  possible misspelled word: croquet

The task: http://programarcadegames.com/index.php?chapter=lab_spell_check

Solution

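First of all, you're better off using a set to hold your dictionary words: a membership test on a set takes constant time on average, while `in` on a list scans the elements one by one. It also helps to lowercase all the words in your dictionary, and each word you check, so that both sides of the comparison use the same case. Finally, loop over the list that split_line returns rather than over the line itself.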

with open('dictionary.txt') as infile:
    dictionary = {line.strip().lower() for line in infile}

print("----Linear search-----")
with open('AliceInWonderLand200.txt') as infile:
    for i,line in enumerate(infile, 1):
        line = line.strip()
        words = split_line(line) # your split_line function
        for word in words:
            if word.lower() not in dictionary:
                print("Line ", i, ": probably misspelled: ", word)

Hope this helps
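
As for why the original loop seemed to print everything: `for word in line` walks over the string one character at a time, so every character (spaces and punctuation included) is looked up on its own, and `i` is bumped once per character instead of once per line. On top of that, `words.append(split_line(line))` stores the whole returned list as a single element of `words`. Here is a small sketch of the difference. I'm assuming split_line uses a pattern with an optional apostrophe part, which is my reading of the lab's helper, so check it against your handout.

```python
import re

def split_line(line):
    # Assumed pattern: letters, optionally followed by an apostrophe and more letters
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)

line = "Alice was beginning\n"

# Looping over a string yields single characters, not words.
chars = [word for word in line]

# append() adds the returned list as ONE element, giving a nested list.
nested = []
nested.append(split_line(line))

# Using the returned list directly gives the words themselves.
words = split_line(line)

print(chars[:5])  # ['A', 'l', 'i', 'c', 'e']
print(nested)     # [['Alice', 'was', 'beginning']]
print(words)      # ['Alice', 'was', 'beginning']
```

So the fix is to iterate over `split_line(line)` (or assign its result to `words`), not over `line`.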

OTHER TIPS

You can lowercase the words in the dictionary:

for line in file:
    line = line.strip().lower()
    dictionary.append(line)

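and lowercase the word that you are checking. Loop over the words returned by split_line, not over the line itself, and count lines once per line:

i += 1  # one increment per line, outside the word loop
for word in split_line(line):  # words, not the characters of line
    word = word.lower()
    ...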
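
Note that the expected output in the question prints each word in its original case (Lewis, MARMALADE), so you may want to change case only for the comparison and keep the original word for printing. A minimal sketch of that idea follows, using in-memory sample data instead of the real files. The function name `find_misspelled`, the sample lists, and the split_line pattern are my own assumptions, not part of the lab.

```python
import re

def split_line(line):
    # Assumed pattern: letters, optionally followed by an apostrophe and more letters
    return re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)?", line)

def find_misspelled(lines, dictionary_words):
    """Return (line_number, word) pairs for words not found in the dictionary."""
    # Normalise the dictionary once; upper() would work equally well,
    # as long as both sides of the comparison use the same case.
    known = {entry.strip().lower() for entry in dictionary_words}
    results = []
    for line_number, line in enumerate(lines, 1):
        for word in split_line(line):
            # Compare in lowercase, but keep the original spelling for output.
            if word.lower() not in known:
                results.append((line_number, word))
    return results

# Hypothetical sample data standing in for dictionary.txt and the Alice text
sample_dictionary = ["ALICE", "WAS", "VERY", "TIRED", "BY"]
sample_text = ["Alice was very tired", "", "by Lewis Carroll"]

for line_number, word in find_misspelled(sample_text, sample_dictionary):
    print("Line", line_number, " possible misspelled word:", word)
```

With the real files you could pass the open file objects directly, since iterating over a file yields its lines.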
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow