Question

I am trying to write a Thunderbird extension that lets you compose a message as usual but processes the message text before it is sent. To do that, I need access to the plain-text content of the email body.

Here is what I have so far, just as some test code in the Extension Developer JavaScript console.

var composer = document.getElementById('msgcomposeWindow');
var frame = composer.getElementsByAttribute('id', 'content-frame').item(0);
if(frame.editortype != 'textmail') {
  print('Sorry, you are not composing in plain text.');
  return;
}

var doc = frame.contentDocument.documentElement;

// XXX: This does not work because newlines are not in the string!
var text = doc.textContent;
print('Message content:');
print(text);
print('');

// Do a TreeWalker through the composition window DOM instead.
var body = doc.getElementsByTagName('body').item(0);
var acceptAllNodes = function(node) { return NodeFilter.FILTER_ACCEPT; };
var walker = document.createTreeWalker(body, NodeFilter.SHOW_TEXT | NodeFilter.SHOW_ELEMENT, { acceptNode: acceptAllNodes }, false);

var lines = [];

var justDidNewline = false;
while(walker.nextNode()) {
  if(walker.currentNode.nodeName == '#text') {
    lines.push(walker.currentNode.nodeValue);
    justDidNewline = false;
  }
  else if(walker.currentNode.nodeName == 'BR') {
    if(justDidNewline)
      // This indicates back-to-back newlines in the message text.
      lines.push('');
    justDidNewline = true;
  }
}

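// Use an indexed loop with a declared counter; the original for-in loop
// leaked an implicit global and is not meant for iterating arrays.
for(var i = 0; i < lines.length; i++) {
  print(i + ': ' + lines[i]);
}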

I would appreciate any feedback as to whether I'm on the right track. I also have some specific questions:

  • Does doc.textContent really not have newlines? How stupid is that? I'm hoping it's just a bug in the JavaScript console, but I suspect not.
  • Is the TreeWalker correct? I first tried NodeFilter.SHOW_TEXT but it did not traverse into the <SPAN>s which contain the quoted material in a reply. Similarly, it seems funny to FILTER_ACCEPT every node and then manually cherry-pick it later, but I had the same problem where if I rejected a SPAN node, the walker would not step inside.
  • Consecutive <BR>s break the naive implementation because there is no #text node in between them. So I manually detect them and push empty lines on my array. Is it really necessary to do that much manual work to access the message content?

Solution

Well, don't everybody chime in at once!

I posted this as a thread on mozilla.dev.extensions and there was some fruitful discussion. After playing around in Venkman, I found that the solution is to throw away my DOM/DHTML habits and write to the correct API.

var editor = window.gMsgCompose.editor;

// 'text/html' works here too
var text = editor.outputToString('text/plain', editor.eNone);

Now text holds the plain-text version of the email body being composed, newlines included.
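
Since the original goal was to process the message line by line, here is a minimal sketch of how the call above could be wrapped. The helper name getBodyLines and the fakeCompose stand-in are hypothetical and only there so the snippet runs outside Thunderbird; the stub's eNone value of 0 is likewise an assumption of the stub. Inside a compose window you would pass window.gMsgCompose instead.

```javascript
// Hypothetical helper: returns the compose body as an array of lines.
// `msgCompose` is expected to look like window.gMsgCompose, i.e. to expose
// an `editor` object with outputToString(mimeType, flags).
function getBodyLines(msgCompose) {
  var editor = msgCompose.editor;
  // Same call as in the answer; editor.eNone is passed as the flags argument.
  var text = editor.outputToString('text/plain', editor.eNone);
  // Split on LF or CRLF so that blank lines survive as empty strings.
  return text.split(/\r?\n/);
}

// Stand-in for gMsgCompose (an assumption, not the real object) so the
// sketch can be tried outside a compose window.
var fakeCompose = {
  editor: {
    eNone: 0,
    outputToString: function (mimeType, flags) {
      return 'Hello,\n\nSecond paragraph.';
    }
  }
};

var lines = getBodyLines(fakeCompose);
```

With this approach consecutive line breaks simply show up as empty strings in the array, so there is no need to detect back-to-back <BR> elements by hand.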

Licensed under: CC-BY-SA with attribution