Javascript - regex - how to remove words with specified length

Question 1

The problem with Greek characters is caused by \b. See Javascript - regex - word boundary (\b) issue, where @Casimir et Hippolyte proposes the following solution:

Since Javascript doesn't have the lookbehind feature and since word boundaries work only with members of the \w character class, the only way is to use groups (and capturing groups if you want to make a replacement):

// example: remove 2-letter words (group 1 is put back with $1)
txt = txt.replace(/(^|[^a-zA-ZΆΈ-ώἀ-ῼ\n])([a-zA-ZΆΈ-ώἀ-ῼ]{2})(?![a-zA-ZΆΈ-ώἀ-ῼ])/gm, '$1');

I also added 0-9 to the first and third character classes, because otherwise it was removing parts of words like "2TB" or "mp3".
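A minimal sketch of that variant with 0-9 included, wrapped in a function so the word length can be changed. The name removeWordsOfLength and the WORD_CHARS constant are my own, not from the linked answer, and n is assumed to be a positive integer:

```javascript
// Hypothetical helper: removes every "word" of exactly n characters.
// WORD_CHARS is the letter class from the snippet above plus 0-9,
// so digits count as part of a word ("mp3", "2TB" stay intact).
var WORD_CHARS = '0-9a-zA-ZΆΈ-ώἀ-ῼ';

function removeWordsOfLength(txt, n) {
  var re = new RegExp(
    '(^|[^' + WORD_CHARS + '\\n])' +     // group 1: line start or a non-word character
    '[' + WORD_CHARS + ']{' + n + '}' +  // the word to drop
    '(?![' + WORD_CHARS + '])',          // must not be followed by a word character
    'gm'
  );
  // Put group 1 back so the character before the word is not lost.
  return txt.replace(re, '$1');
}

console.log(removeWordsOfLength('mp3 to 2TB', 2)); // "mp3  2TB" (both spaces around "to" remain)
```

The spaces around a removed word are left in place, so you may want to collapse repeated spaces afterwards.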

Question 2

Why use a regex? I think your problem can be solved without one.

See the example below; it should give you a hint on how to start.

var text = 'English: the on in to of \n Greek: πως θα το πω';
var tokens = text.split(/\s+/);
// keep only tokens longer than 2 characters; line breaks are replaced by single spaces
text = tokens.filter(function(token){ return token.length > 2; }).join(' ');
alert(text);

Question 3

JavaScript regular expressions have limited Unicode support. To make this work, I'd suggest using the XRegExp library, which has stable Unicode support.

MORE: http://xregexp.com/plugins/#unicode
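If adding a library is not an option, a similar result may be possible natively, assuming your engine supports ES2018 Unicode property escapes (\p{...} with the u flag) and lookbehind. This is a swapped-in technique, not XRegExp, and the function name below is hypothetical:

```javascript
// Hypothetical helper using native ES2018 features instead of XRegExp.
// \p{L} matches any Unicode letter and \p{N} any Unicode number,
// so no hand-written Greek ranges are needed. n is assumed to be a positive integer.
function removeWordsOfLengthUnicode(text, n) {
  var wordChar = '[\\p{L}\\p{N}]';
  var re = new RegExp(
    '(?<!' + wordChar + ')' +   // not preceded by a letter or number
    wordChar + '{' + n + '}' +  // exactly n letters or numbers
    '(?!' + wordChar + ')',     // not followed by a letter or number
    'gu'
  );
  return text.replace(re, '');
}

console.log(removeWordsOfLengthUnicode('mp3 to 2TB', 2)); // "mp3  2TB"
```

Older browsers will throw a SyntaxError on this pattern, which is where XRegExp is still the safer choice.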

Question 4

Try this:

text = 'English: the on in to of \n Greek: πως θα το πω';
text = text.replace(/\b[0-9a-zA-ZΆ-ώἀ-ῼ]{2}\b/g, '');
alert(text);
text2 = text.split(' ');
text = text2.filter(function(word){ return word.length != 2; }).join(' ');
alert(text);

Edit-------------------

Try this,

text = 'English: the on in to of \n Greek: πως θα το πω';
// replace() returns a new string, so the result has to be assigned back
text = text.replace(/\b[\n]\b/g, '\n ').replace(/\b[\t]\b/g, '\t ');
text2 = text.split(' ');
text = text2.filter(function(word){ return word.length != 2; }).join(' ');
alert(text);

This keeps \t and \n, and also removes a 2-letter word that sits between two tabs or two line feeds.
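Another way to keep tabs and line feeds, sketched here as an alternative rather than a fix of the code above: split on whitespace with a capturing group, so the separators stay in the array and can be joined back unchanged. The function name is made up for the example:

```javascript
// Hypothetical helper: drops tokens of exactly n characters and keeps
// every whitespace run (spaces, \t, \n) exactly as it was.
function dropWordsKeepWhitespace(text, n) {
  // A capturing group in split() keeps the separators in the result array.
  return text.split(/(\s+)/).filter(function (part) {
    var isWhitespace = /^\s+$/.test(part);
    return isWhitespace || part.length !== n;
  }).join('');
}

console.log(JSON.stringify(dropWordsKeepWhitespace('a\tto\tb\nin\nc', 2))); // "a\t\tb\n\nc"
```

Note that length counts every character in the token, so "to," with a trailing comma is 3 characters long and is kept.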