Question

I am really having trouble getting my head around cross-browser recursion in the DOM. I want to get only the text content of a node, but not any HTML tags or other information. Through trial and error, I found that the textContent and innerText properties aren't supported consistently across browsers, so I have to use the data property of the text nodes instead.

Now the function I have so far is this:

    getTextContentXBrowser: function(nodeIn) {
        // Currently goes down two levels. Need to abstract further to handle arbitrary number of levels
        var tempString = '';
        for (i=0, len=nodeIn.childNodes.length; i < len; i++) {
            if (nodeIn.childNodes[i].firstChild !== null) {
                tempString += nodeIn.childNodes[i].firstChild.data;
            } else {
                if (nodeIn.childNodes[i].data && nodeIn.childNodes[i].data !== '\n') {
                    tempString += nodeIn.childNodes[i].data;
                }
            }
        }
        return tempString;
    },

It's written in object notation, but otherwise it's a pretty standard unremarkable function. It goes down two levels, which is almost good enough for what I want to do, but I want to "set it and forget it" if possible.

I've been at it for four hours and I haven't been able to abstract this to an arbitrary number of levels. Is recursion even my best choice here? Am I missing a better option? How would I convert the above function to recurse?

Thanks for any help!

Update: I rewrote it per dsfq's model, but for some reason, it goes one level down and is unable to go back up afterwards. I realized that my problem previously was that I wasn't concatenating in the second if clause, but this seems to have stopped me short of the goal. Here is my updated function:

    getTextContentXBrowser: function(nodeIn) {
        var tempString = '';
        for (i=0, len=nodeIn.childNodes.length; i < len; i++) {
            if (nodeIn.childNodes[i].data) {
                tempString += nodeIn.childNodes[i].data;
            } else if (nodeIn.childNodes[i].firstChild) {
                tempString += this.getTextContentXBrowser(nodeIn.childNodes[i]);
            }
        }
        return tempString.replace(/ /g,'').replace(/\n/g,'');
    },

Anyone see what I'm missing?

No correct solution

OTHER TIPS

Have you considered doing this with jQuery?

    getTextContentXBrowser: function(nodeIn) {
        return $(nodeIn).text();
    }

As simple as that!

It can be a really simple function that calls itself to replace nodes with their contents. For example:

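    function flatten(node) {
        // Walk the child nodes from last to first.
        for (var c = node.childNodes, i = c.length; i--;) {
            if (c[i].nodeType == 1) { // element node
                // Flatten the child first; afterwards it contains no element children.
                c[i].parentNode.replaceChild(document.createTextNode(flatten(c[i]).innerHTML), c[i]);
            }
        }
        return node; // returned so the recursive call above can read .innerHTML
    }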

It looks like in your case getTextContentXBrowser is a method of some object, so you will need to call it properly from inside itself (in my example I just use a plain function).
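For the method form, here is a minimal sketch of how the recursive call could look. The container name `domUtils` is hypothetical and only there to make the snippet self-contained. One thing worth checking in the updated function from the question: `i` and `len` are assigned without `var`, so they are shared globals, and the recursive call overwrites the counters of the loop that called it. That would explain going one level down and never coming back up. This sketch also tests `nodeType` rather than the truthiness of `data`, and leaves out the whitespace stripping from the question.

```javascript
// Hypothetical container object; only the method name comes from the question.
var domUtils = {
    getTextContentXBrowser: function (nodeIn) {
        var text = '';
        // `var` keeps i and len local to each call, so a recursive call
        // cannot overwrite the counters of the loop that invoked it.
        for (var i = 0, len = nodeIn.childNodes.length; i < len; i++) {
            var child = nodeIn.childNodes[i];
            if (child.nodeType === 3) {          // text node: take its data
                text += child.data;
            } else if (child.nodeType === 1) {   // element node: descend into it
                text += this.getTextContentXBrowser(child);
            }
        }
        return text;
    }
};
```

Note that `this` only points at the object when the function is invoked as a method, e.g. `domUtils.getTextContentXBrowser(someNode)`. If you pass the function around as a callback, you would have to bind it or refer to the object by name instead.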

Demo: http://jsfiddle.net/7tyYA/

Note that this function replaces nodes with text in place. If you want a function that just returns the text without modifying the actual nodes, consider this other version of the script:

Demo 2: http://jsfiddle.net/7tyYA/1/
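In case the demo link is unavailable, below is a rough sketch of a non-mutating version. It is not the code from the fiddle, and the name `collectText` is made up for illustration. Since the question also asks whether recursion is the best choice, this one walks the tree with an explicit stack instead, which gives the same result without the function calling itself.

```javascript
// Hypothetical helper; an illustration, not the script from the linked demo.
function collectText(root) {
    var parts = [];
    var stack = [root];
    while (stack.length) {
        var node = stack.pop();
        if (node.nodeType === 3) {
            // Text node: keep its data.
            parts.push(node.data);
        } else if (node.childNodes) {
            // Push children in reverse so they are popped in document order.
            for (var i = node.childNodes.length - 1; i >= 0; i--) {
                stack.push(node.childNodes[i]);
            }
        }
    }
    return parts.join('');
}
```

The nodes themselves are only read, never replaced, so the document stays as it was.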

Licensed under: CC-BY-SA with attribution