I am writing a Node module that takes a CSV file and turns it into a JavaScript object. Because I let the user specify the delimiter and I support text qualifiers, I need to build the regex dynamically.
Here is how I create the regex:
settings.dilemeter = escapeForRegex(settings.dilemeter);
settings.textQualifier = escapeForRegex(settings.textQualifier);
var d = settings.dilemeter;
var tq = settings.textQualifier;
///////////////////////////////////////////////////////////////
/// This appears to be glitched
///////////////////////////////////////////////////////////////
var searchArray = [
"(" + tq + d + tq + ")", // First case to search for, eg: ","
"(" + tq + d + ")", // Second case to search for, eg: ",
"(" + d + tq + ")", // Third case to search for, eg: ,"
"(" + d + ")", // Last case to search for, eg: ,
"(" + tq + "$)", // if the text qualifier is the very last thing
];
var regexString = "(" + searchArray.join('|') + ')';
console.log(regexString);
var regex = new RegExp(regexString);
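For context, escapeForRegex is not shown above. A minimal sketch of what such a helper might look like follows, assuming it only needs to backslash-escape regex metacharacters (the body is a hypothetical stand-in, not necessarily the version in my module):

```javascript
// Hypothetical stand-in for the escapeForRegex helper used above.
// Assumption: it only has to make a string safe to embed in new RegExp(...).
function escapeForRegex(text) {
  // Prefix each regex metacharacter with a backslash so it matches literally.
  return String(text).replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

console.log(escapeForRegex("|")); // \|
console.log(escapeForRegex('"')); // " (not a metacharacter, so unchanged)
```

With | as the delimiter this gives \| inside the pattern, which matches the regex string logged above.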
With | as the delimiter and " as the text qualifier, this produces the following regular expression: (("\|")|("\|)|(\|")|(\|)|("$))
It seems to match the strings I want it to match here: http://regexpal.com/?flags=gm&regex=((%22%5C%7C%22)%7C(%22%5C%7C)%7C(%5C%7C%22)%7C(%5C%7C)%7C(%22%24))&input=h1%7Ch2%7Ch3%7Ch4%0Avalue%201%7C%22Value%202%22%7Cvalue%203%7C%22value%20-%205%22%7Csomething%7C%22Else%22
However, when I run this using string.split(regex)
I get really strange results.
var testString = [
'h1|h2|h3|h4', // The first line will be the headers
'value 1|"Value 2"|value 3|"value - 5"' // This is the first row of data
];
console.log(testString[1].split(regex));
produces:
["value 1",
"|"",
undefined,
undefined,
"|"",
undefined,
undefined,
"Value 2",
""|",
undefined,
""|",
undefined,
undefined,
undefined,
"value 3",
"|"",
undefined,
undefined,
"|"",
undefined,
undefined,
"value - 5",
""",
undefined,
undefined,
undefined,
undefined,
""",
""]
I can't figure out why there are so many undefined entries, or why split is returning the separators that I am trying to split on.
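A smaller reproduction may help narrow this down. As far as I understand String.prototype.split, when the separator regex contains capturing groups, the text captured by each group is spliced into the result array, and a group that did not take part in the match shows up as undefined. The sketch below compares that with a variant that has no capturing groups; the names withCapture, withAlternation, and plainRegex are only for illustration:

```javascript
var row = 'value 1|"Value 2"|value 3|"value - 5"';

// One capturing group: the matched separator is included in the output.
var withCapture = "a|b".split(/(\|)/);
console.log(withCapture); // [ 'a', '|', 'b' ]

// Two alternated groups: the group that did not match contributes undefined.
var withAlternation = "a|b".split(/(x)|(\|)/);
console.log(withAlternation); // [ 'a', undefined, '|', 'b' ]

// Same alternatives as the generated regex, but without any capturing groups.
var plainRegex = /"\|"|"\||\|"|\||"$/;
var fields = row.split(plainRegex);
console.log(fields); // [ 'value 1', 'Value 2', 'value 3', 'value - 5', '' ]
```

If this reading is right, building the pattern with (?: ... ) groups, or with no groups at all, should avoid the extra entries. The trailing empty string comes from the "$ alternative matching the final text qualifier and would still need to be dropped.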
I created a Plunker with a more complete demonstration: http://plnkr.co/edit/hn2GUFYodYQeuQLqqwVD?p=preview