Question

I am currently developing a program that produces a simple spelling test for students. It uses two lists to hold the spellings and the definitions separately, e.g.

spelling = [('pen', 'chair')]

definitions = [('a writing instrument', 'something you can sit on')]

These lists can be appended to by the program if a user so desires by taking their input and adding it to the appropriate lists.

I am OK with coding most of the program, but I need to mark the answer that a student types against the word stored in the list. The student is shown a definition on screen (chosen randomly from the list, with 20 questions per test) and is then expected to type the spelling of the matching word. The part I am stuck on is that the mark needs to depend on how close the answer is to the correct spelling. If a student gets the spelling exactly right they should get 5 marks, if they get it mostly right with a minor error they get 2 marks, and if they make a major error they get 0 marks.

Can anybody help me with how to go about marking the spelling please? I think I would need the list function to break the word down but then I am a little uncertain what to do from there, as the system needs to mark words that can change (as a user can add/remove them from the lists).


Solution 3

To compare two sequences (strings are sequences of characters) and measure how close they are to each other, you could use SequenceMatcher from difflib and its ratio() method. You will probably need to run some tests to see whether the ratio is representative enough for your use case, and then decide on a threshold for each of your marks (e.g. between 0.99 and 0.75 they get 2 marks, under 0.75 they get no marks).
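As a minimal sketch of that idea: the function name mark_answer, the 0.75 cut-off and the lower-casing of both words are assumptions for illustration, not something the question specifies, so tune them against real student answers.

```python
from difflib import SequenceMatcher


def mark_answer(answer, correct, minor_threshold=0.75):
    """Return 5 for an exact match, 2 for a near miss, 0 otherwise.

    minor_threshold is an assumed cut-off; adjust it after testing.
    """
    # Assumption: surrounding spaces and letter case should not cost marks.
    answer = answer.strip().lower()
    correct = correct.strip().lower()

    if answer == correct:
        return 5

    # ratio() is 2 * (matching characters) / (total characters in both strings).
    ratio = SequenceMatcher(None, answer, correct).ratio()
    return 2 if ratio >= minor_threshold else 0


if __name__ == "__main__":
    for attempt in ("chair", "chiar", "table"):
        print(attempt, mark_answer(attempt, "chair"))
```

Because the function takes the correct word as an argument, it keeps working when words are added to or removed from the lists.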

OTHER TIPS

Maybe Peter Norvig's spell checker in Python can help you.

I don't know what advice people can give you, because the rules for "mostly correct with minor error" are up to you.

But looking at Norvig's statistical approach could be instructive.

What you need to compute is called the Levenshtein distance between the word entered by the student and the correct one. Following the link you will find ample exposition of the topic, including pointers to derived algorithms such as the Damerau-Levenshtein distance.
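A minimal sketch of the plain Levenshtein distance, assuming you want to avoid a third-party package. The helper mark_by_distance and its max_minor_errors parameter are hypothetical names, and treating one edit as a "minor error" is an assumption you would need to adjust.

```python
def levenshtein(a, b):
    """Count the single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b.
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def mark_by_distance(answer, correct, max_minor_errors=1):
    """Hypothetical marking rule: 5 if exact, 2 if within
    max_minor_errors edits, 0 otherwise."""
    distance = levenshtein(answer.strip().lower(), correct.strip().lower())
    if distance == 0:
        return 5
    return 2 if distance <= max_minor_errors else 0
```

Note that plain Levenshtein counts two swapped adjacent letters ("chiar" for "chair") as two edits, whereas Damerau-Levenshtein counts such a swap as one, which may suit spelling mistakes better.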

In addition to those standard algorithms, you might want to consider whether all character insertions, deletions, substitutions and swaps should be assigned the same penalty in your application. For example, using -ize instead of -ise could be considered a minor or null error.
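One way to sketch unequal penalties is an edit distance whose substitution cost comes from a function you supply. Everything named here (weighted_distance, lenient_cost, LENIENT_PAIRS) is hypothetical, and giving s/z a cost of zero is only an assumption to illustrate the -ise/-ize case.

```python
def weighted_distance(a, b, substitution_cost):
    """Edit distance where insertions and deletions cost 1 and the cost of
    replacing one character with another comes from substitution_cost."""
    previous = [float(j) for j in range(len(b) + 1)]
    for i, ch_a in enumerate(a, start=1):
        current = [float(i)]
        for j, ch_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1.0,                             # deletion
                current[j - 1] + 1.0,                          # insertion
                previous[j - 1] + substitution_cost(ch_a, ch_b),
            ))
        previous = current
    return previous[-1]


# Assumed penalty table: swapping 's' and 'z' costs nothing.
LENIENT_PAIRS = {frozenset("sz"): 0.0}


def lenient_cost(x, y):
    """Return 0 for identical characters, the table value for a lenient
    pair, and 1 for any other substitution."""
    if x == y:
        return 0.0
    return LENIENT_PAIRS.get(frozenset((x, y)), 1.0)


if __name__ == "__main__":
    print(weighted_distance("organise", "organize", lenient_cost))
```

This simple rule forgives s/z anywhere in the word, so "zip" and "sip" would also compare as equal; restricting the leniency to word endings would need extra logic.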

Licensed under: CC-BY-SA with attribution