How can I properly parse an email address with name?
13-11-2019
Question
I'm reading email headers (in Node.js, for those keeping score) and they are VERY varied. E-mail addresses in the To field look like:
"Jake Smart" <jake@smart.com>, jack@smart.com, "Development, Business" <bizdev@smart.com>
and a variety of other formats. Is there any way to parse all of this out?
Here's my first stab:
- Run a split() on , to break up the different people into an array
- For each item, see if there's a < or a "
- If there's a <, then parse out the email
- If there's a ", then parse out the name
- For the name, if there's a , then split to get Last, First names
If I first do a split on the comma, then the "Development, Business" name will cause a split error. Spaces are also inconsistent. Plus, there may be more e-mail address formats that come through in headers that I haven't seen before. Is there any way (or maybe an awesome Node.js library) that will do all of this for me?
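To make the split problem concrete, here is a minimal sketch of the steps above with one change: the split only happens on commas that sit outside double quotes. The function names `splitAddressList` and `parseItem` are made up for illustration, and the code only covers the formats shown in the question (no escaped quotes, comments, or group syntax).

```javascript
// Hypothetical helper: split a header value on commas, but ignore
// commas that appear inside a double-quoted display name.
function splitAddressList(header) {
  const items = [];
  let current = "";
  let inQuotes = false;
  for (const ch of header) {
    if (ch === '"') {
      inQuotes = !inQuotes;
      current += ch;
    } else if (ch === "," && !inQuotes) {
      if (current.trim() !== "") items.push(current.trim());
      current = "";
    } else {
      current += ch;
    }
  }
  if (current.trim() !== "") items.push(current.trim());
  return items;
}

// Hypothetical helper: pull the name and the address out of one item.
function parseItem(item) {
  const angle = item.match(/<([^>]*)>/);
  const address = angle ? angle[1].trim() : item.trim();
  // Whatever precedes "<" is the display name; strip surrounding quotes.
  let name = angle ? item.slice(0, angle.index).trim() : "";
  name = name.replace(/^"(.*)"$/, "$1");
  return { name, address };
}

const header =
  '"Jake Smart" <jake@smart.com>, jack@smart.com, "Development, Business" <bizdev@smart.com>';

// A plain split cuts the quoted name in two, giving 4 pieces instead of 3.
const naive = header.split(",");

const parsed = splitAddressList(header).map(parseItem);
console.log(naive.length, parsed);
```

Splitting the resulting name on its comma to get Last, First would be a further step on top of `parseItem`, and only makes sense when the name really is a person.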
Solution
There's an npm module for this: mimelib (or mimelib-noiconv if you are on Windows or don't want to compile node-iconv)
npm install mimelib-noiconv
And the usage would be:
var mimelib = require("mimelib-noiconv");
var addressStr = 'jack@smart.com, "Development, Business" <bizdev@smart.com>';
var addresses = mimelib.parseAddresses(addressStr);
console.log(addresses);
// [{ address: 'jack@smart.com', name: '' },
// { address: 'bizdev@smart.com', name: 'Development, Business' }]
OTHER TIPS
The actual format is pretty complicated (see http://tools.ietf.org/html/rfc2822#page-15), but here is a regex that works for the addresses in the question. I can't promise it will always work, though.
const str = "...";
const pat = /(?:"([^"]+)")? ?<?(.*?@[^>,]+)>?,? ?/g;
let m;
while ((m = pat.exec(str)) !== null) {
    const name = m[1]; // undefined when the address has no quoted name
    const mail = m[2];
    // Do whatever you need.
}
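As a usage sketch, the loop above can be wrapped in a function and run against the header from the question. The function name `parseWithRegex` and the shape of the returned objects are assumptions for illustration. The pattern is the one from this answer, unchanged, so the same caveat applies: it copes with the sample input but is not a full RFC 2822 parser.

```javascript
// Hypothetical wrapper around the pattern from the answer above.
function parseWithRegex(str) {
  // Create the regex per call so its lastIndex state never leaks between calls.
  const pat = /(?:"([^"]+)")? ?<?(.*?@[^>,]+)>?,? ?/g;
  const result = [];
  let m;
  while ((m = pat.exec(str)) !== null) {
    // m[1] is undefined when the address has no quoted name.
    result.push({ name: m[1] || "", mail: m[2] });
  }
  return result;
}

const header =
  '"Jake Smart" <jake@smart.com>, jack@smart.com, "Development, Business" <bizdev@smart.com>';
const addresses = parseWithRegex(header);
console.log(addresses);
```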
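I'd try to do it all in one pass over the string, for performance. I just threw this together, with limited testing:
var header = "\"Jake Smart\" <jake@smart.com>, jack@smart.com, \"Development, Business\" <bizdev@smart.com>";
console.log(header);
var info = [];    // one [name, address] pair per recipient
var current = []; // parts of the recipient being parsed
var state = -1;   // -1: outside quotes and brackets, 0: inside "...", 1: inside <...>
var temp = "";
// Go one step past the end and treat it as a comma, so a trailing bare address is flushed.
for (var i = 0; i <= header.length; i++) {
    var c = i < header.length ? header[i] : ",";
    if (state == 0) {
        if (c == "\"") {
            current.push(temp);
            temp = "";
            state = -1;
        } else {
            temp += c;
        }
    } else if (state == 1) {
        if (c == ">") {
            current.push(temp);
            info.push(current);
            current = [];
            temp = "";
            state = -1;
        } else {
            temp += c;
        }
    } else {
        if (c == "<") {
            // Text before "<" is an unquoted name (empty if the name was quoted or missing).
            if (current.length == 0) {
                current.push(temp.trim());
            }
            temp = "";
            state = 1;
        } else if (c == "\"") {
            temp = "";
            state = 0;
        } else if (c == ",") {
            // A bare address such as jack@smart.com has no brackets and ends at the comma.
            if (temp.trim() != "") {
                info.push(["", temp.trim()]);
            }
            temp = "";
        } else {
            temp += c;
        }
    }
}
console.log("INFO:", info);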
For something complete, you should port this to JS: http://cpansearch.perl.org/src/RJBS/Email-Address-1.895/lib/Email/Address.pm
It gives you all the parts you need. The tricky bit is just the set of regexps at the start.