Question

I have a csv file I need to import into my db.

Sample input:

122545;bmwx3;new;red,black,white,pink

I want the final output to be like this:

INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "red");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "black");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "white");
INSERT INTO myTable VALUES ("122545", "bmwx3", "new", "pink");

The 4th element is a "sub-csv" with an unknown number of entries, but it is always in that format (no " characters).

Ideally I would like to do this in Notepad++ using regex; if that is not possible, I will have to cook up a script.

I think that first I need to make this:

122545;bmwx3;new;red,black,white,pink

Look like this:

122545;bmwx3;new;red

122545;bmwx3;new;black

122545;bmwx3;new;white

122545;bmwx3;new;pink

My problem is that I don't know how to match the sub-csv. Is it even possible to do this in pure regex (no programming needed)?


Solution

If the 122545;bmwx3;new; part is not fixed

In three steps:

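  • Get to red,black,white,pink#LIMIT#122545;bmwx3;new;: replace ^(.*;)([^;\r\n]*)$ with \2#LIMIT#\1 (line breaks are excluded from the class so that, in a multi-line file, the match cannot run on into the next line)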

  • Create the 122545;bmwx3;new;red strings: replace

    (\w+)(?:,|(?=#LIMIT#))(?=.*#LIMIT#(.*))

    with \2\1\n

  • Remove the #LIMIT#... lines: replace ^#LIMIT#.* with an empty string
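If you end up scripting it instead, the three steps above can be replayed almost verbatim with Python's re module. This is only a sketch under a few assumptions: the function name split_sub_csv is made up for illustration, it handles one line at a time, and the first pattern excludes line breaks from the character class as a precaution.

```python
import re

def split_sub_csv(line: str) -> list[str]:
    """Expand one 'a;b;c;x,y,z' line into one line per sub-csv value."""
    # Step 1: move the prefix behind a #LIMIT# marker.
    swapped = re.sub(r'^(.*;)([^;\r\n]*)$', r'\2#LIMIT#\1', line)
    # Step 2: rewrite every value as prefix + value + newline.
    # Group 2 is captured inside the lookahead, so the prefix is not consumed.
    expanded = re.sub(
        r'(\w+)(?:,|(?=#LIMIT#))(?=.*#LIMIT#(.*))',
        r'\2\1\n',
        swapped,
    )
    # Step 3: drop the leftover '#LIMIT#prefix' line.
    cleaned = re.sub(r'^#LIMIT#.*', '', expanded, flags=re.MULTILINE)
    return cleaned.splitlines()

if __name__ == "__main__":
    for row in split_sub_csv("122545;bmwx3;new;red,black,white,pink"):
        print(row)
```

Like the Notepad++ version, this relies on the values being plain \w characters and on #LIMIT# never occurring in the data.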


If the 122545;bmwx3;new; part is fixed

@hjpotter's idea seems pretty cool: you just need to replace , with

\n122545;bmwx3;new;
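For comparison, the same fixed-prefix replacement as a tiny Python sketch. The names PREFIX and expand_fixed_prefix are hypothetical, and it assumes commas only ever appear in the last field.

```python
PREFIX = "122545;bmwx3;new;"  # the fixed part, taken from the sample input

def expand_fixed_prefix(line: str, prefix: str = PREFIX) -> str:
    # Every comma becomes a line break followed by the fixed prefix.
    return line.replace(",", "\n" + prefix)

if __name__ == "__main__":
    print(expand_fixed_prefix("122545;bmwx3;new;red,black,white,pink"))
```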

What's left

Replace

^(\w*);(\w*);(\w*);(\w*)$

with

INSERT INTO myTable VALUES ("\1", "\2", "\3", "\4");

You're good to go!
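The final replacement can also be sketched in Python, in case you want to check the generated statements before running them. The helper name to_insert is an assumption, and the trailing semicolon follows the output asked for in the question.

```python
import re

# Same pattern as above: four ;-separated fields of word characters.
ROW = re.compile(r'^(\w*);(\w*);(\w*);(\w*)$')

def to_insert(line: str) -> str:
    # Lines that do not match the pattern are returned unchanged.
    return ROW.sub(r'INSERT INTO myTable VALUES ("\1", "\2", "\3", "\4");', line)

if __name__ == "__main__":
    print(to_insert("122545;bmwx3;new;red"))
```

Since \w does not match quotes or spaces, values containing them would be left untouched rather than turned into broken SQL.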

OTHER TIPS

Certainly not the simplest way, but it works:

Find what: ^([^,]+;)(.+),([^,]+)$
Replace with: $1$2\n$1$3

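Then click Replace All as many times as needed: each pass splits the last value of a line onto its own line, so repeat until no more matches are found.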
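The "repeat until nothing changes" idea, sketched in Python. The function name split_until_stable is made up, and the character classes additionally exclude \n (an assumption on my part) so that a match never spans two lines.

```python
import re

# Same idea as the Find what pattern, applied line by line.
PATTERN = re.compile(r'^([^,\n]+;)(.+),([^,\n]+)$', re.MULTILINE)

def split_until_stable(text: str) -> str:
    # Each pass peels the last value off every line that still has a comma;
    # stop once a pass leaves the text unchanged.
    while True:
        new_text = PATTERN.sub(r'\1\2\n\1\3', text)
        if new_text == text:
            return text
        text = new_text

if __name__ == "__main__":
    print(split_until_stable("122545;bmwx3;new;red,black,white,pink"))
```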

Licensed under: CC-BY-SA with attribution