How can I optimize this nested for loop?

https://stackoverflow.com/questions/3426769

26-09-2019
Question

The program is supposed to go through each word in the array created from the word text file and, if the word is longer than 8 characters, add it to the goodWords array. The caveat is that I only want the root word to be in the goodWords array. For example:

If greet is added to the array, I don't want greets or greetings or greeters, etc.

    NSString *string = [NSString stringWithContentsOfFile:@"/Users/james/dev/WordParser/word.txt" encoding:NSUTF8StringEncoding error:NULL];
    NSArray *words = [string componentsSeparatedByString:@"\r\n"];
    NSMutableArray *goodWords = [NSMutableArray array];
    BOOL shouldAddToGoodWords = YES;

    for (NSString *word in words)
    {
        NSLog(@"Word: %@", word);

        if ([word length] > 8)
        {
            NSLog(@"Word is greater than 8");

            for (NSString *existingWord in [goodWords reverseObjectEnumerator])
            {
                NSLog(@"Existing Word: %@", existingWord);
                if ([word rangeOfString:existingWord].location != NSNotFound)
                {
                    NSLog(@"Not adding...");
                    shouldAddToGoodWords = NO;
                    break;
                }
            }

            if (shouldAddToGoodWords)
            {
                NSLog(@"Adding word: %@", word);
                [goodWords addObject:word];
            }
        }

        shouldAddToGoodWords = YES;
    }

Solution

How about something like this?

    //load the words from wherever
    NSString * allWords = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
    //create a mutable array of the words
    NSMutableArray * words = [[allWords componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]] mutableCopy];
    //remove any words that are shorter than 8 characters
    [words filterUsingPredicate:[NSPredicate predicateWithFormat:@"length >= 8"]];
    //sort the words in ascending order, so derived words directly follow their root
    [words sortUsingSelector:@selector(caseInsensitiveCompare:)];

    //create a set of indexes (these will be the non-root words)
    NSMutableIndexSet * badIndexes = [NSMutableIndexSet indexSet];
    //remember our current root word
    NSString * currentRoot = nil;
    NSUInteger count = [words count];
    //loop through the words
    for (NSUInteger i = 0; i < count; ++i) {
        NSString * word = [words objectAtIndex:i];
        if (currentRoot == nil) {
            //base case
            currentRoot = word;
        } else if ([word hasPrefix:currentRoot]) {
            //word is a non-root word.  remember this index to remove it later
            [badIndexes addIndex:i];
        } else {
            //no match. this word is our new root
            currentRoot = word;
        }
    }
    //remove the non-root words
    [words removeObjectsAtIndexes:badIndexes];
    NSLog(@"%@", words);
    [words release];

This runs very, very quickly on my machine (2.8GHz MBP).

Other tips

A trie seems suitable for your purpose. It is similar to a hash in that it is keyed by strings, and it is useful for detecting whether a given string is a prefix of a string you have already seen.
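
As a rough illustration of the trie idea, here is a minimal sketch in plain C (which also compiles inside an Objective-C file). The names `TrieNode`, `trie_node_new`, `trie_add_if_root` and `trie_free` are hypothetical, and the sketch assumes words made only of the letters a to z, compared case-insensitively. It detects prefixes only, not arbitrary substrings as `rangeOfString:` does, and it assumes that root words are inserted before the words derived from them, for example by feeding it a sorted list.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

#define ALPHABET 26  /* assumption: words contain only the letters a-z */

/* One trie node: one child slot per letter, plus a flag that marks
   the end of a stored word. */
typedef struct TrieNode {
    struct TrieNode *children[ALPHABET];
    int is_word;
} TrieNode;

/* Allocates an empty node; calloc zeroes it, so every child starts
   as NULL and is_word starts as 0. Returns NULL on allocation failure. */
TrieNode *trie_node_new(void) {
    return calloc(1, sizeof(TrieNode));
}

/* Stores `word` unless an already stored word is a prefix of it
   (or equal to it). Returns 1 if the word was stored, 0 if it was
   rejected, and -1 for an empty word, a character outside a-z,
   or an allocation failure. */
int trie_add_if_root(TrieNode *root, const char *word) {
    assert(root != NULL && word != NULL);
    if (*word == '\0') return -1;

    TrieNode *node = root;
    for (const char *p = word; *p != '\0'; ++p) {
        int c = tolower((unsigned char)*p);
        if (c < 'a' || c > 'z') return -1;

        int idx = c - 'a';
        if (node->children[idx] == NULL) {
            node->children[idx] = trie_node_new();
            if (node->children[idx] == NULL) return -1;
        }
        node = node->children[idx];

        /* A stored word ends here, so it is a prefix of `word`. */
        if (node->is_word) return 0;
    }
    node->is_word = 1;
    return 1;
}

/* Frees a node and everything below it. */
void trie_free(TrieNode *node) {
    if (node == NULL) return;
    for (int i = 0; i < ALPHABET; ++i) trie_free(node->children[i]);
    free(node);
}
```

Each insertion takes time proportional to the length of the word, regardless of how many words are already stored, which is what makes the inner loop over the existing words unnecessary.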

I used an NSSet to ensure that you only ever have one copy of each added word. A word is added only if the NSSet does not already contain it. The code then checks whether any word that has already been added is a substring of the new word; if so, the new word is not added. The comparison is case-insensitive as well.

What I've written is a refactoring of your code. It is probably not much faster. If you want lookups of already-added words to be much faster, you really want a tree data structure.

Take a look at red-black trees or B-trees.

words.txt

    objective
    objectively
    cappucin
    cappucino
    cappucine
    programme
    programmer
    programmatic
    programmatically

Source code

    - (void)addRootWords {

        NSString        *textFile = [[NSBundle mainBundle] pathForResource:@"words" ofType:@"txt"];
        NSString        *string = [NSString stringWithContentsOfFile:textFile encoding:NSUTF8StringEncoding error:NULL];
        NSArray         *wordFile = [string componentsSeparatedByString:@"\n"];
        NSMutableSet    *goodWords = [[NSMutableSet alloc] init];

        for (NSString *newWord in wordFile)
        {
            NSLog(@"Word: %@", newWord);
            if ([newWord length] > 8)
            {
                NSLog(@"Word '%@' contains more than 8 characters", newWord);
                // Only consider words that are not already in the set
                BOOL shouldAddWord = ![goodWords containsObject:newWord];
                for (NSString *existingWord in goodWords)
                {
                    // Case-insensitive search for existingWord inside newWord
                    NSRange textRange = [newWord rangeOfString:existingWord options:NSCaseInsensitiveSearch];
                    if (textRange.location != NSNotFound) {
                        // newWord contains existingWord as a substring, so it is not a root word
                        shouldAddWord = NO;
                        break;
                    }
                    NSLog(@"(word:%@) does not contain (substring:%@)", newWord, existingWord);
                }
                if (shouldAddWord) {
                    NSLog(@"Adding word: %@", newWord);
                    [goodWords addObject:newWord];
                }
            }
        }

        // Print the result; an NSSet is unordered, so the order may vary
        NSLog(@"***Added words***");
        int count = 1;
        for (NSString *word in goodWords) {
            NSLog(@"%d: %@", count, word);
            count++;
        }

        [goodWords release];
    }

Output:

    ***Added words***
    1: cappucino
    2: programme
    3: objective
    4: programmatic
    5: cappucine

Licensed under: CC-BY-SA with attribution
