Vim: regular expression to delete all lines except those starting with a given list of numbers

Question 1

Use a global match:

:v/^\(subject\|1\|6\|12\),/ delete

For every line that does not match that regular expression, delete it.

It yields:

subject,parameter1,parameter2,parameter3
1,blah,blah,blah
12,blah,blah,blah

EDIT: I just realised that you were already using the global command. Your error was in the character class: it matches any single character listed inside it, and repeating a character makes no difference. In your case that is the digits 1, 2 and 6 and a space. You must put each number in its own branch, as I did above.
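The difference between the character class and separate branches can be checked outside Vim. The following is only a sketch using grep as a stand-in: the sample rows are made up for illustration, and grep -E writes the alternation as (a|b) where Vim writes \(a\|b\).

```shell
# Hypothetical sample data standing in for the CSV file from the question.
csv='subject,parameter1,parameter2,parameter3
1,blah,blah,blah
2,blah,blah,blah
12,blah,blah,blah
26,blah,blah,blah'

# Character class: keeps every line whose FIRST CHARACTER is 1, 2, 6 or a space.
class_hits=$(printf '%s\n' "$csv" | grep '^[1 6 12]')

# Separate branches: keeps only lines whose first field is subject, 1, 6 or 12.
branch_hits=$(printf '%s\n' "$csv" | grep -E '^(subject|1|6|12),')

printf 'character class:\n%s\n\nbranches:\n%s\n' "$class_hits" "$branch_hits"
```

With this sample, the character class lets the 2 and 26 rows through and drops the header, while the branches keep exactly the header, 1 and 12.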

Question 2

The character class [1 6 12] means "any single character that is in this class",
i.e. any one of ' ', 1, 2, 6 (the repeated 1 is ignored).

You could use

:g!/^1,\|^6,\|^12,\|^subject/d

which is close to your original syntax - but it works (tested with vim on Mac OS X).

Note - it is important to include the comma, so that the line starting with 1 doesn't "protect" 11, 12345, etc.
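To see what the comma buys you, here is a small sketch with grep standing in for Vim's matcher. The sample rows are invented for the demonstration.

```shell
# Hypothetical rows: 11 and 12345 must NOT survive a whitelist of 1 and 12.
csv='1,a
11,b
12345,c
12,d'

# Without the trailing comma, "^1" matches every line that starts with 1.
without_comma=$(printf '%s\n' "$csv" | grep -E '^(1|12)')

# With the comma, the whole first field has to equal 1 or 12.
with_comma=$(printf '%s\n' "$csv" | grep -E '^(1|12),')

printf 'without comma:\n%s\n\nwith comma:\n%s\n' "$without_comma" "$with_comma"
```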

You might want to do this differently though - using grep.

Put all the "white listed" numbers in a file, one per line, like so:

^subject
^1,
^2,
^6,
^12,

then do

grep -f whitelist csvFile

and the output will be your "edited" file (which you can pipe to a new file).
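A self-contained way to try this out is sketched below. The temporary directory, the sample rows and the name filtered.csv are assumptions made for the example; only grep -f itself comes from the command above.

```shell
# Work in a throwaway directory so nothing real gets overwritten.
workdir=$(mktemp -d)

# The whitelist of anchored patterns, one per line.
cat > "$workdir/whitelist" <<'EOF'
^subject
^1,
^2,
^6,
^12,
EOF

# Hypothetical CSV data; rows 3 and 120 are not whitelisted.
cat > "$workdir/csvFile" <<'EOF'
subject,parameter1,parameter2,parameter3
1,blah,blah,blah
3,blah,blah,blah
12,blah,blah,blah
120,blah,blah,blah
EOF

# One pass over the file; a line is kept if ANY pattern in the whitelist matches.
grep -f "$workdir/whitelist" "$workdir/csvFile" > "$workdir/filtered.csv"

filtered=$(cat "$workdir/filtered.csv")
printf '%s\n' "$filtered"

rm -rf "$workdir"
```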

If you are even more interested in "efficiency", you could make your text file (let's continue to call it whitelist) just

subject
1
2
6
12

and use the following command:

cat whitelist | xargs -I {} grep "^"{}"," csvFile

This needs a bit of explaining.

xargs            - take the input one line at a time
-I {}            - and insert that line in the command that follows, at the {}

This means that the grep command will be run n times (once per line in the whitelist file), and each time the regular expression that is fed into grep will be the concatenation of

"^"              - start of line
{}               - contents of one line of the input file (whitelist)
","              - comma that follows the number

So this is a compact way of writing

grep "^subject," csvFile; grep "^1," csvFile; grep "^2," csvFile;

etc.

It has the advantage that you can now generate your whitelist any way you want: as long as it ends up in a file, one entry per line, you can use it. The disadvantage is that you are essentially running grep n times. If your files get very large and you have a large number of items in your whitelist, that may start to be a problem; but since your OS is likely to put the file into cache after the first read-through, it is really quite fast. The use of the ^ anchor makes the regular expression very efficient: as soon as a line fails to match at its start, grep goes on to the next line.
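If the n separate grep runs ever become a concern, one possible variation (not part of the approach above, just a sketch) is to turn the plain whitelist into anchored patterns with sed and run grep once. The file names patterns and csvFile and the sample rows are assumptions for the example. The sketch also shows a side effect of the xargs form: its output is grouped in whitelist order, whereas a single grep keeps the order of the CSV file.

```shell
workdir=$(mktemp -d)

# Plain whitelist: first-field values only, no regex syntax.
printf '%s\n' subject 1 2 6 12 > "$workdir/whitelist"

# Hypothetical CSV data, deliberately not sorted by number.
cat > "$workdir/csvFile" <<'EOF'
subject,parameter1
12,b
1,a
3,c
EOF

# The xargs form: one grep per whitelist entry, output in whitelist order.
per_entry=$(xargs -I {} grep "^"{}"," "$workdir/csvFile" < "$workdir/whitelist")

# Single-pass variation: wrap each entry as ^entry, and let grep -f read them all.
sed 's/.*/^&,/' "$workdir/whitelist" > "$workdir/patterns"
single_pass=$(grep -f "$workdir/patterns" "$workdir/csvFile")

printf 'xargs, one grep per entry:\n%s\n\nsingle grep -f:\n%s\n' "$per_entry" "$single_pass"

rm -rf "$workdir"
```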

Question 3

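A "functional" alternative (note that it also deletes the header line, since str2nr("subject") is 0, which is not in the list):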

:g/./if index([1,12,6],str2nr(split(getline("."),",")[0]))<0|exec 'normal! dd'|endif
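The same idea (take the first comma-separated field and test whether it is in a list) can be sketched outside Vim with awk. This is an approximation rather than an exact equivalent: it compares the field as a string, whereas str2nr() would also accept a field like 1abc as 1. The sample rows are made up.

```shell
# Hypothetical sample data.
csv='subject,parameter1
1,a
6,b
7,c
12,d'

# Build a lookup table from the allowed IDs, then keep lines whose first field is in it.
kept=$(printf '%s\n' "$csv" | awk -F, '
  BEGIN { n = split("1 12 6", ids, " "); for (i = 1; i <= n; i++) keep[ids[i]] = 1 }
  ($1 in keep)
')

printf '%s\n' "$kept"
```

As with the Vim command, the header line is dropped because subject is not in the list.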