Question

I have this csv file, plain text here: http://pastie.org/1425970

What it looks like in excel: http://cl.ly/3qXk

An example of what I would like it to look like (just using the first row as example): http://cl.ly/3qYT

Plain text of first row: http://pastie.org/1425979

I need to create a CSV file to import all of the information into a database table.

I could create the CSV manually, but I wanted to see whether it is possible to accomplish this using regular expressions in TextWrangler's (grep) find and replace.

Solution

Regular expressions aren't really the best way to accomplish this. As others have noted, you're better off writing some code to parse the file into the format you want.

With that said, this ugly regex should get you halfway there. It matches an ID followed by up to seven values per row:

Find:

(\d+),"?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?"?

Replace:

\1,\2\r\1,\3\r\1,\4\r\1,\5\r\1,\6\r\1,\7\r\1,\8

This will leave you with some extra rows, like below:

1,1
1,8
1,11
1,13
1,
1,
1,
2,10
2,11
2,12
2,
2,
...

You can clean up the extra rows by hand, or with the following regex:

Find:

\d+,\r

Replace:

(empty string)
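
To check the two passes outside the editor, here is a minimal Python sketch that applies the same Find, Replace and cleanup patterns with `re.sub`. The sample input is an assumption: the layout `1,"1, 8, 11, 13"` is inferred from the output shown above, not copied from the linked file. `\r` is swapped for `\n`, and the name `expand_rows` is made up for illustration.

```python
import re

# The Find and Replace patterns from the answer, with \r swapped for \n.
FIND = (r'(\d+),"?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?'
        r'(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?(?:(\d+),? ?)?"?')
REPLACE = r'\1,\2\n\1,\3\n\1,\4\n\1,\5\n\1,\6\n\1,\7\n\1,\8'
# The cleanup pattern: a row that has an ID but no value.
CLEANUP = r'\d+,\n'


def expand_rows(text):
    """Run the expansion pass, then the cleanup pass (hypothetical helper)."""
    # Unmatched groups are replaced with empty strings (Python 3.5+),
    # which produces the extra "1," rows, just as in the editor.
    expanded = re.sub(FIND, REPLACE, text)
    return re.sub(CLEANUP, '', expanded)


# Assumed input layout; the real file may differ.
sample = '1,"1, 8, 11, 13"\n2,"10, 11, 12"\n'
print(expand_rows(sample), end='')
```

Note that the cleanup pattern only removes an empty row when it is followed by a line break, so a file without a trailing newline would keep its last empty row.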

OTHER TIPS

Using Perl, you could do something like this:

open(my $read, "<", "input.csv") or die("Gah, couldn't read input.csv!\n");
open(my $write, ">", "output.csv") or die("WHAAAARGARBL!\n");
while (<$read>) {
    chomp;
    # $1 is the ID, $2 is the quoted list of values
    if (/(\d+),"(.*)"/) {
        my $id = $1;
        # split on commas, dropping the space after each one
        foreach my $value (split(/,\s*/, $2)) {
            print $write "$id,$value\n";
        }
    }
}
close($read);
close($write);

I don't know TextWrangler, but in general I can describe what it takes to do this in pseudo-code.

loop, read each line
   strip off the newline
   split into an array using /[, "]+/ as the delimiter regex
   loop over the result, an array slice from element 1 to the last element
       print element 0, a comma, then the iterator value
   end loop
end loop

In Perl, something like this:

while (my $line = <DATA>) {
    chomp $line;
    # element 0 is the ID; the remaining elements are the values
    my @data_array = split /[, "]+/, $line;
    for my $otherfield (@data_array[1 .. $#data_array]) {
        print "$data_array[0],$otherfield\n";
    }
}

It should be easy if you have a split capability.
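
For comparison, the same split-based idea can be sketched in Python, letting the standard `csv` module handle the quoting. This is only an illustration under assumptions: the input layout (an ID, then a quoted comma-separated list) is inferred from the examples in this thread, and the function name `flatten` and the sample rows are hypothetical.

```python
import csv
import io


def flatten(src, dst):
    """Write one 'id,value' row for each value in an input row's second column."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        if len(row) < 2:
            continue  # skip blank lines and rows with no values
        row_id, values = row[0], row[1]
        for value in values.split(","):
            value = value.strip()  # drop the space after each comma
            if value:
                writer.writerow([row_id, value])


# Assumed sample rows; the real file may be laid out differently.
src = io.StringIO('1,"1, 8, 11, 13"\n2,"10, 11, 12"\n3,5\n')
out = io.StringIO()
flatten(src, out)
print(out.getvalue(), end="")
```

To run it on real files, open them with `open("input.csv", newline="")` and `open("output.csv", "w", newline="")` and pass the file objects in place of the `StringIO` buffers.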

Licensed under: CC-BY-SA with attribution