How can I optimize this nested loop?

https://stackoverflow.com/questions/3426769

26-09-2019
Question

The program should go through every word in the array created from the text file and, if a word is longer than 8 characters, add it to the goodWords array. The caveat is that I only want the root word to end up in goodWords. For example:

If greet has been added to the array, I don't want greets, greetings, greeters, etc.

    NSString *string = [NSString stringWithContentsOfFile:@"/Users/james/dev/WordParser/word.txt" encoding:NSUTF8StringEncoding error:NULL];
    NSArray *words = [string componentsSeparatedByString:@"\r\n"];
    NSMutableArray *goodWords = [NSMutableArray array];
    BOOL shouldAddToGoodWords = YES;

    for (NSString *word in words)
    {
        NSLog(@"Word: %@", word);

        if ([word length] > 8)
        {
            NSLog(@"Word is greater than 8");

            for (NSString *existingWord in [goodWords reverseObjectEnumerator])
            {
                NSLog(@"Existing Word: %@", existingWord);
                if ([word rangeOfString:existingWord].location != NSNotFound)
                {
                    NSLog(@"Not adding...");
                    shouldAddToGoodWords = NO;
                    break;
                }
            }

            if (shouldAddToGoodWords)
            {
                NSLog(@"Adding word: %@", word);
                [goodWords addObject:word];
            }
        }

        shouldAddToGoodWords = YES;
    }

Solution

How about something like this?

    // Load the words from wherever they live
    NSString *allWords = [NSString stringWithContentsOfFile:@"/usr/share/dict/words" encoding:NSUTF8StringEncoding error:NULL];
    // Create a mutable array of the words
    NSMutableArray *words = [[allWords componentsSeparatedByCharactersInSet:[NSCharacterSet newlineCharacterSet]] mutableCopy];
    // Keep only words longer than 8 characters (the same check as in the question)
    [words filterUsingPredicate:[NSPredicate predicateWithFormat:@"length > 8"]];
    // Sort the words in ascending order, so derived words follow their root
    [words sortUsingSelector:@selector(caseInsensitiveCompare:)];

    // Create a set of indexes (these will be the non-root words)
    NSMutableIndexSet *badIndexes = [NSMutableIndexSet indexSet];
    // Remember our current root word
    NSString *currentRoot = nil;
    NSUInteger count = [words count];
    // The sort ignores case, so the prefix test ignores case as well
    NSStringCompareOptions prefixOptions = NSCaseInsensitiveSearch | NSAnchoredSearch;
    // Loop through the words
    for (NSUInteger i = 0; i < count; ++i) {
        NSString *word = [words objectAtIndex:i];
        if (currentRoot == nil) {
            // Base case: the first word is always a root
            currentRoot = word;
        } else if ([word rangeOfString:currentRoot options:prefixOptions].location != NSNotFound) {
            // Word starts with the current root, so it is a non-root word. Remember this index to remove it later
            [badIndexes addIndex:i];
        } else {
            // No match. This word is our new root
            currentRoot = word;
        }
    }
    // Remove the non-root words
    [words removeObjectsAtIndexes:badIndexes];
    NSLog(@"%@", words);
    // mutableCopy returns an owned object; release it under manual reference counting (omit this line under ARC)
    [words release];

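This runs very quickly on my machine (a 2.8 GHz MBP). Note that it only treats a word as a non-root when an earlier word is a prefix of it, whereas the loop in the question rejects a word that contains an earlier word anywhere.

Other tips

A trie seems suitable for your purpose. It is like a hash, and is useful for detecting whether a given string is a prefix of a string that has already been seen.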
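The trie idea might be sketched as follows. This is only a minimal illustration, assuming ARC and a reasonably recent clang; `TrieNode` and `TrieInsertIfRoot` are hypothetical names, not part of Foundation or of any answer here. It rejects a word when a previously inserted word is a prefix of it (or an exact duplicate), so it matches prefixes only, not arbitrary substrings, and like the question's loop it depends on the order in which words arrive.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper class: one trie node per character of a stored word.
@interface TrieNode : NSObject
@property (nonatomic, strong) NSMutableDictionary<NSNumber *, TrieNode *> *children;
@property (nonatomic, assign) BOOL endsWord; // YES if a stored word ends at this node
@end

@implementation TrieNode
- (instancetype)init {
    if ((self = [super init])) {
        _children = [[NSMutableDictionary alloc] init];
    }
    return self;
}
@end

// Inserts `word` unless an already inserted word is a prefix of it.
// Returns YES when the word was stored as a new root, NO otherwise.
static BOOL TrieInsertIfRoot(TrieNode *root, NSString *word) {
    if (word.length == 0) {
        return NO; // ignore empty strings (e.g. a trailing blank line)
    }
    TrieNode *node = root;
    for (NSUInteger i = 0; i < word.length; i++) {
        if (node.endsWord) {
            return NO; // an earlier word is a proper prefix of this one
        }
        NSNumber *key = @([word characterAtIndex:i]);
        TrieNode *next = node.children[key];
        if (next == nil) {
            next = [[TrieNode alloc] init];
            node.children[key] = next;
        }
        node = next;
    }
    if (node.endsWord) {
        return NO; // exact duplicate of an earlier word
    }
    node.endsWord = YES;
    return YES;
}

int main(void) {
    @autoreleasepool {
        NSArray<NSString *> *words = @[@"objective", @"objectively", @"programme",
                                       @"programmer", @"programmatic", @"programmatically"];
        TrieNode *root = [[TrieNode alloc] init];
        NSMutableArray<NSString *> *goodWords = [NSMutableArray array];
        for (NSString *word in words) {
            // Same length rule as the question: only words longer than 8 characters
            if (word.length > 8 && TrieInsertIfRoot(root, word)) {
                [goodWords addObject:word];
            }
        }
        NSLog(@"%@", goodWords);
    }
    return 0;
}
```

With this input, goodWords ends up holding objective, programme and programmatic. Each insertion costs time proportional to the length of the word rather than to the number of words already stored, which is the appeal over scanning every existing word.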

I used an NSSet to make sure that only one copy of any word is ever added. A word is added only if the NSSet does not already contain it. The code then checks whether any word that has already been added is a substring of the new word; if so, the new word is not added. The comparison is case-insensitive as well.

What I have written is a rework of your code. It is probably not much faster; if you want searching among the words you have already added to be much faster, you really want a tree data structure.

Take a look at red-black trees or B-trees.

Words

objective
objectively
cappucin
cappucino
cappucine
programme
programmer
programmatic
programmatically

Source code

- (void)addRootWords {

    NSString        *textFile = [[NSBundle mainBundle] pathForResource:@"words" ofType:@"txt"];
    NSString        *string = [NSString stringWithContentsOfFile:textFile encoding:NSUTF8StringEncoding error:NULL];
    NSArray         *wordFile = [string componentsSeparatedByString:@"\n"];
    NSMutableSet    *goodWords = [[NSMutableSet alloc] init];

    for (NSString *newWord in wordFile)
    {
        NSLog(@"Word: %@", newWord);
        if ([newWord length] > 8)
        {
            NSLog(@"Word '%@' contains more than 8 characters", newWord);
            BOOL shouldAddWord = NO;
            if ( [goodWords containsObject:newWord] == NO) {
                shouldAddWord = YES;
            }
            for (NSString *existingWord in goodWords)
            {
                NSRange textRange = [[newWord lowercaseString] rangeOfString:[existingWord lowercaseString]];
                if( textRange.location != NSNotFound ) {
                    // newWord contains existingWord as a substring
                    shouldAddWord = NO;
                    break;
                }
                NSLog(@"(word:%@) does not contain (substring:%@)", newWord, existingWord);
                shouldAddWord = YES;
            }
            if (shouldAddWord) {
                NSLog(@"Adding word: %@", newWord);
                [goodWords addObject:newWord];
            }
        }
    }

    NSLog(@"***Added words***");
    int count = 1;
    for (NSString *word in goodWords) {
        NSLog(@"%d: %@", count, word);
        count++;
    }

    [goodWords release];
}

Output:

***Added words***
1: cappucino
2: programme
3: objective
4: programmatic
5: cappucine

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow