Question

This is a beginner OO Python question. I wish there were a Stack Overflow for beginners where I could ask this without getting negative votes. So, here we go.

When I run this code:

from nltk import NaiveBayesClassifier,classify
import USSSALoader
import random

class genderPredictor():

    def getFeatures(self):
        # Load the names once and reuse the result.
        names = self._loadNames()
        if names is None:
            print "There is no training file."
            return

        maleNames,femaleNames = names

        featureset = list()
        for nameTuple in maleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'M'))

        for nameTuple in femaleNames:
            features = self._nameFeatures(nameTuple[0])
            featureset.append((features,'F'))

        return featureset

    def trainAndTest(self,trainingPercent=0.80):
        featureset = self.getFeatures()
        random.shuffle(featureset)

        name_count = len(featureset)

        cut_point=int(name_count*trainingPercent)

        train_set = featureset[:cut_point]
        test_set  = featureset[cut_point:]

        self.train(train_set)

        return self.test(test_set)

    def classify(self,name):
        feats=self._nameFeatures(name)
        return self.classifier.classify(feats)

    def train(self,train_set):
        self.classifier = NaiveBayesClassifier.train(train_set)
        return self.classifier

    def test(self,test_set):
        return classify.accuracy(self.classifier,test_set)

    def getMostInformativeFeatures(self,n=5):
        return self.classifier.most_informative_features(n)

    def _loadNames(self):
        return USSSALoader.getNameList()

    def _nameFeatures(self,name):
        name=name.upper()
        return {
            'last_letter': name[-1],
            'last_two' : name[-2:],
            'last_is_vowel' : (name[-1] in 'AEIOUY')
        }

if __name__ == "__main__":
    gp = genderPredictor()
    accuracy=gp.trainAndTest()

and self._loadNames() returns None, I get this error (raised from the imported random module):

shuffle C:\Python27\lib\random.py   285     
TypeError: object of type 'NoneType' has no len()

This happened because, despite the return statement I put in getFeatures(self), the flow jumps into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset)).

So, I'd like to know: how do I stop the program flow not only in the getFeatures(self) method, but in the entire class that contains it?

By the way, thanks Stephen Holiday for sharing the code.


Solution

"This happened because, despite the return statement I put in getFeatures(self), the flow jumps into the next class method (trainAndTest(self,trainingPercent=0.80)), which calls the random module (random.shuffle(featureset))."

An important thing to remember is that None is a perfectly valid value. The bare return statement in your getFeatures() is doing exactly what it is told: it hands None back to the caller, and the caller (trainAndTest) carries on with that value. Only an exception, or an explicit check that you write yourself, will stop that flow.
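
To see this concretely, here is a minimal sketch. The names get_features and train_and_test are hypothetical stand-ins for the methods in the question, not the original code. It only shows that a bare return gives None to the caller, which then keeps running:

```python
def get_features(names):
    # A bare `return` is exactly the same as `return None`.
    if names is None:
        return
    return [(name[-1], name) for name in names]


def train_and_test(names):
    featureset = get_features(names)
    # Execution continues here regardless of what get_features returned.
    return featureset


result = train_and_test(None)
print(result is None)
```

Nothing "jumps" between methods here: train_and_test called get_features, received None, and simply went on to its next line.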

Instead of asking how you can "return from the class", what you might want to look into is checking the return values of the functions you call and making sure they are what you expect before you proceed. There are two places you could do this:

def trainAndTest(self,trainingPercent=0.80):
    featureset = self.getFeatures()

...

def _loadNames(self):
    return USSSALoader.getNameList()

In the first spot, you could check if featureset is None, and react if it is None.
In the second spot, instead of blindly returning, you could check it first and react there.
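
As a rough sketch of both checks, assuming a stand-in get_name_list() in place of USSSALoader.getNameList() (it is hypothetical and simply simulates the missing-file case), the two spots might look like this:

```python
import random


def get_name_list():
    # Hypothetical stand-in for USSSALoader.getNameList();
    # it simulates the case where no training file exists.
    return None


def load_names():
    names = get_name_list()
    if names is None:
        # Second spot: check the loader's result instead of blindly returning it.
        print("There is no training file.")
    return names


def get_features():
    names = load_names()
    if names is None:
        return None
    male_names, female_names = names
    return ([(name, 'M') for name in male_names] +
            [(name, 'F') for name in female_names])


def train_and_test(training_percent=0.80):
    featureset = get_features()
    # First spot: react to None before it ever reaches random.shuffle().
    if featureset is None:
        return None
    random.shuffle(featureset)
    cut_point = int(len(featureset) * training_percent)
    return featureset[:cut_point], featureset[cut_point:]


outcome = train_and_test()
```

With the check in place, train_and_test returns None cleanly instead of failing inside random.shuffle. Its own caller then has to check for None as well, which is one reason exceptions are often more convenient.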

Secondly, you have the option of raising exceptions. An exception signals that the code has encountered an error and can't continue. It is then the responsibility of the calling function to either handle it or let it propagate up the call chain. If nothing handles the exception, your application will terminate with a traceback. As you can see, you are already getting an exception raised from the random module because you are allowing a None to make its way into the shuffle call.

names = USSSALoader.getNameList()
if names is None:
    # raise an exception?
    # do something else?
    # ask the user to do something?
    pass  # placeholder so the block is valid Python

The question at that point is: what do you want your program to do when it gets a None instead of a valid list? Do you want an exception similar to the one being raised by random, but more helpful and specific to your application? Or maybe you just want to call some other method that gets a default list. Is not having the names list even a situation where your application can do anything other than exit? If not, that would be an unrecoverable situation.

names = USSSALoader.getNameList()
if names is None:
    raise ValueError("USSSALoader didn't return any "
                     "valid names! Can't continue!")

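If you would rather have an error that is specific to your application, one possible sketch is a small custom exception class. MissingTrainingDataError and get_name_list() are hypothetical names invented for this illustration. The example also shows the exception passing through an intermediate function untouched until a caller handles it:

```python
class MissingTrainingDataError(Exception):
    """Hypothetical application-specific error for a missing names list."""


def get_name_list():
    # Hypothetical stand-in for USSSALoader.getNameList().
    return None


def load_names():
    names = get_name_list()
    if names is None:
        raise MissingTrainingDataError(
            "USSSALoader didn't return any valid names")
    return names


def get_features():
    # No try/except here, so the exception passes straight through
    # to whoever called get_features().
    male_names, female_names = load_names()
    return ([(name, 'M') for name in male_names] +
            [(name, 'F') for name in female_names])


try:
    featureset = get_features()
except MissingTrainingDataError as error:
    # The caller decides how to recover; here it falls back to an empty list.
    featureset = []
    message = str(error)
    print("Could not build features: " + message)
```

Falling back to an empty list is only one option; re-raising or exiting may suit your program better.
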
Update

From your comment, I wanted to add the specific handling you asked for. Python has a handful of built-in exception types to represent various circumstances. The one you would most likely want to raise is an IOError, indicating that the file could not be found. I assume "file" means whatever file USSSALoader.getNameList() needs to use and can't find.

names = USSSALoader.getNameList()
if names is None:
    raise IOError("No USSSALoader file found")

At this point, unless some function higher up the calling chain handles it, your program will terminate with a traceback.
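
If a short message is preferable to a full traceback, the exception can be handled once at the top of the program. This is only a sketch: get_name_list() is a hypothetical stand-in for USSSALoader.getNameList(), and main() is an invented entry point that returns an exit status:

```python
import sys


def get_name_list():
    # Hypothetical stand-in for USSSALoader.getNameList().
    return None


def load_names():
    names = get_name_list()
    if names is None:
        raise IOError("No USSSALoader file found")
    return names


def main():
    try:
        load_names()
    except IOError as error:
        # Handle the error at the top level: report it and signal failure.
        sys.stderr.write("Error: %s\n" % error)
        return 1
    return 0


# A real script would typically pass this value to sys.exit().
exit_code = main()
```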

OTHER TIPS

There is no such thing as "returning from the entire class". You need to organize your code so that return values are valid in the functions that receive them. Those functions can test the value to determine what to do next. Class boundaries have no effect on program flow; they only affect the namespacing of methods.

Generally what you would do here is check for validity after you call the function, e.g.:

featureset = self.getFeatures()
if not featureset:
    # You could log an error message if you expected to get something, etc.
    return
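
One thing to keep in mind with "if not featureset" is that it is true for an empty list as well as for None. If the two cases need different handling, a sketch like the following (describe is a hypothetical helper written only for this illustration) tells them apart:

```python
def describe(featureset):
    # `is None` matches only the missing value.
    if featureset is None:
        return "missing"
    # `not featureset` also matches an empty list.
    if not featureset:
        return "empty"
    return "ok"


print(describe(None))
print(describe([]))
print(describe([({'last_letter': 'A'}, 'F')]))
```
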
Licensed under: CC-BY-SA with attribution