Question

Does anyone know how I can use NSScanner to split a string on commas into an array, EXCEPT when a comma is embedded within quotes?

Until now I had been using:

  NSArray *arrData = [strData componentsSeparatedByString:@","];

However, the string contains quoted sections with commas inside them, and I don't want those to be split. I have never worked with NSScanner before and I am struggling to come to terms with the documentation. Has anyone done something similar before?

Solution

If you absolutely have to use an NSScanner, you could do something like this:

        NSScanner *scanner = [[NSScanner alloc] initWithString:@"\"Foo, Inc\",0.00,1.00,\"+1.5%\",\"+0.2%\",\"Foo"];
        NSCharacterSet *characters = [NSCharacterSet characterSetWithCharactersInString:@"\","];
        [scanner setCharactersToBeSkipped:nil];
        NSMutableArray *words = [[NSMutableArray alloc]init];
        NSMutableString *word = [[NSMutableString alloc] init];
        BOOL inQuotes = NO;
        while(scanner.isAtEnd == NO)
        {
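            // Start each pass at nil: the scan below leaves subString
            // untouched when it scans nothing.
            NSString *subString = nil;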
            [scanner scanUpToCharactersFromSet:characters intoString:&subString];
            NSUInteger currentLocation = [scanner scanLocation];
            if(currentLocation >= scanner.string.length)
            {
                if(subString.length > 0)
                    [words addObject:subString];
                break;
            }
            if([scanner.string characterAtIndex:currentLocation] == '"')
            {
                inQuotes = !inQuotes;
                if(subString == nil)
                {
                    [scanner setScanLocation:currentLocation + 1];
                    continue;
                }
                [word appendFormat:@"%@",subString];
                if(word.length > 0)
                   [words addObject:word.copy];
                [word deleteCharactersInRange:NSMakeRange(0, word.length)];
            }
            if([scanner.string characterAtIndex:currentLocation] == ',')
            {
                if(subString == nil)
                {
                    [scanner setScanLocation:currentLocation + 1];
                    continue;
                }
                if(inQuotes == NO)
                    [words addObject:subString];
                else
                    [word appendFormat:@"%@,",subString];
            }
            [scanner setScanLocation:currentLocation + 1];
        }

EDIT: This gives the following output:

Foo, Inc

0.00

1.00

+1.5%

+0.2%

Foo

Hope this is what you want.

As you can see, it gets complicated and very error-prone. I would recommend using a regex for this.
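For comparison, here is a minimal sketch of what the regex route might look like with NSRegularExpression. The helper name SplitOutsideQuotes and the pattern are my own assumptions, not something from the question. The pattern treats a field as a run of quoted segments and/or characters that are neither a comma nor a quote. Like the scanner code above, it drops the quote characters and skips empty fields.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (name and pattern are illustrative assumptions).
static NSArray<NSString *> *SplitOutsideQuotes(NSString *input)
{
    // One field = one or more of: a "quoted segment", or a single
    // character that is neither a comma nor a quote.
    NSString *pattern = @"(?:\"[^\"]*\"|[^,\"])+";
    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:0
                                                    error:&error];
    if (regex == nil) {
        NSLog(@"Bad pattern: %@", error);
        return @[];
    }

    NSMutableArray<NSString *> *fields = [NSMutableArray array];
    NSRange whole = NSMakeRange(0, input.length);
    for (NSTextCheckingResult *match in
         [regex matchesInString:input options:0 range:whole]) {
        NSString *field = [input substringWithRange:match.range];
        // Remove the quote characters themselves from the field.
        field = [field stringByReplacingOccurrencesOfString:@"\""
                                                 withString:@""];
        [fields addObject:field];
    }
    return fields;
}
```

Called with the same input string as the scanner example, this should produce the same six fields. An unmatched quote (as in the trailing "Foo) is simply skipped, so whether that is acceptable depends on your data.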

OTHER TIPS

This is the same general procedure as lead_the_zeppelin's answer, but I think it's considerably more straightforward. We use an NSMutableString, accum, to build up each comma-delimited piece, which may contain any number of quoted segments.

Each iteration, we scan up to a comma, a quote, or the end of the string, whichever comes first. If we find a comma, whatever's been accumulated so far is saved as one piece. For a quote, we pick up everything up to the closing quote mark -- this is the key that avoids interpreting commas inside quotations as split points. Note that this won't work if quotes are not always balanced: after an unmatched quote, the rest of the string is treated as quoted. If it's neither of those, we've reached the end of the string, so we save the accumulated string and quit.

// Demo data
NSArray * commaStrings = @[@"This, here, \"has\" no quoted, commas.",
                           @"This \"has, no\" unquoted \"commas,\"",
                           @"This, has,no,quotes",
                           @"This has no commas",
                           @"This has, \"a quoted\" \"phrase, followed\", by a, quoted phrase",
                           @"\"This\", one, \"went,\", to, \"mar,ket\"",
                           @"This has neither commas nor quotes",
                           @"This ends with a comma,"];

NSCharacterSet * commaQuoteSet = [NSCharacterSet characterSetWithCharactersInString:@",\""];

for( NSString * commaString in commaStrings ){

    NSScanner * scanner = [NSScanner scannerWithString:commaString];
    // Scanner ignores whitespace by default; turn that off.
    [scanner setCharactersToBeSkipped:nil];

    NSMutableArray * splitStrings = [NSMutableArray new];

    NSMutableString * accum = [NSMutableString new];

    while( YES ){

        // Set to an empty string for the case where the scanner is
        // at the end of the string and won't scan anything;
        // appendString: will die if its argument is nil.
        NSString * currScan = @"";
        // Scan up to a comma or a quote; this will go all the way to
        // the end of the string if neither of those exists.
        [scanner scanUpToCharactersFromSet:commaQuoteSet
                                intoString:&currScan];
        // Add the just-scanned material to whatever we've already got.
        [accum appendString:currScan];

        if( [scanner scanString:@"," intoString:NULL] ){
            // If it's a comma, save the accumulated string,
            [splitStrings addObject:accum];
            // clear it out,
            accum = [NSMutableString new];
            // and keep scanning.
            continue;
        }
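        else if( [scanner scanString:@"\"" intoString:NULL] ) {
            // If a quote, append the quoted segment to the accumulation.
            // Reset currScan first: scanUpToString: leaves it untouched
            // when the segment is empty (""), which would otherwise
            // re-append the previously scanned text.
            currScan = @"";
            [scanner scanUpToString:@"\""
                         intoString:&currScan];
            [accum appendFormat:@"\"%@\"", currScan];
            // Consume the closing quote and continue, appending until
            // the next comma.
            [scanner scanString:@"\"" intoString:NULL];
            continue;
        }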
        else {
            // Otherwise, there's nothing else to split;
            // just save the remainder of the string
            [splitStrings addObject:accum];
            break;
        }

    }
    NSLog(@"%@", splitStrings);
}
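
Since the loop above assumes balanced quotes, one cheap guard is to count the quote characters before splitting. This is only a sketch; HasBalancedQuotes is a hypothetical helper name, and it assumes quotes are never escaped inside a quoted segment.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns YES when the string contains an even
// number of double-quote characters. Assumes no escaped quotes.
static BOOL HasBalancedQuotes(NSString *string)
{
    NSUInteger quoteCount = 0;
    for (NSUInteger i = 0; i < string.length; i++) {
        if ([string characterAtIndex:i] == '"') {
            quoteCount++;
        }
    }
    return (quoteCount % 2) == 0;
}
```

You could call this on each input first and fall back to a plain split, or report an error, when it returns NO.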

Also, as Chuck suggested, you might want to just get a CSV parser.

Licensed under: CC-BY-SA with attribution