Latin char in Javascript regexp

https://stackoverflow.com/questions/17949820

04-06-2022

Question

How can I include Latin characters like ČčĆćŠšĐđ in this JavaScript regexp?

var regex = new RegExp('\\b' + this.value, "i");

UPDATE:

I have this code for filtering checkbox labels, but it doesn't work well when the input contains Č, č or ć:

function listFilter(list, input) {
    var $lbs = list.find('.css-label');

    function filter(){
        var regex = new RegExp('\\b' + this.value);
        var $els = $lbs.filter(function(){
            return regex.test($(this).text());
        });
        $lbs.not($els).hide().prev().hide();
        $els.show().prev().show();
    };

    input.keyup(filter).change(filter)
}

jQuery(function($){
    listFilter($('#list'), $('.search-filter'))
})

Here is a fiddle: DEMO

Solution

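The problem in your regexp is that the word boundary \b isn't properly detected around those characters. In JavaScript, \b is defined in terms of \w, and \w only matches [A-Za-z0-9_], so letters such as Č or š are treated as non-word characters (\w and \W are not Unicode-aware either).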
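A minimal sketch of that behaviour, using made-up sample strings (the variable names are illustrative only). It shows \b missing a real word start before "č" and reporting a boundary in the middle of a word:

```javascript
// \b is defined via \w, and \w only covers [A-Za-z0-9_].
var asciiStart = /\bc/.test('a c');  // true: boundary between ' ' and 'c'
var latinStart = /\bč/.test('a č');  // false: ' ' and 'č' are both non-word characters
var midWord = /\bč/.test('ač');      // true: a "boundary" is seen between 'a' and 'č'

console.log(asciiStart, latinStart, midWord); // true false true
```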

I'd suggest starting with

new RegExp('(^|[\\s\\.])' + this.value, "i")

and adding to [\\s\\.] any other characters you need to treat as word boundaries.
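The explicit-boundary idea could be wrapped up roughly as follows. This is a sketch, not the answerer's code: escapeRegExp and startsWordRegex are hypothetical helper names, the set of boundary characters is an assumption to adjust, and escaping the user's input is an addition so that typing "(" or "." does not break the pattern.

```javascript
// Hypothetical helper: escape regex metacharacters in user-typed text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: match `term` only at the start of the text or
// right after one of the listed boundary characters (assumed set).
function startsWordRegex(term) {
  return new RegExp('(^|[\\s.,;:\\-(])' + escapeRegExp(term), 'i');
}

var re = startsWordRegex('ča');
console.log(re.test('Čačak'));      // true: match at the start of the text
console.log(re.test('grad Čačak')); // true: match after whitespace
console.log(re.test('Mačak'));      // false: "ča" only occurs mid-word
```

In the question's filter function, the regex line would then become something like var regex = startsWordRegex(this.value);.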

If you can't define the expected possible word boundaries, you'd better use a library to produce "Unicode compatible" regular expressions. Some are listed in this related question.
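As an alternative to a library, engines that support ES2018 (Unicode property escapes and lookbehind) allow a Unicode-aware "start of word" check to be written directly. The sketch below assumes such an engine; unicodeWordStart and escapeRegExp are hypothetical names, and treating letters, digits and underscore as word characters is a choice made here, not something the answer prescribes.

```javascript
// Hypothetical helper: escape regex metacharacters in user-typed text.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper (assumes ES2018 support): match `term` only when it
// is NOT preceded by a Unicode letter, a Unicode digit, or an underscore.
function unicodeWordStart(term) {
  return new RegExp('(?<![\\p{L}\\p{N}_])' + escapeRegExp(term), 'iu');
}

var re = unicodeWordStart('ča');
console.log(re.test('Čačak'));       // true
console.log(re.test('stari Čačak')); // true
console.log(re.test('Mačak'));       // false
```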

OTHER TIPS

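Try this regular expression:

/^[A-Za-z\u00C0-\u017F\s'.,\/#!$%^&*;:{}=\-_`~()]+$/

It accepts ASCII letters, accented Latin letters, whitespace and common punctuation. The range runs to \u017F rather than \u00FF because Č, č, Ć, ć, Š, š, Đ and đ sit in Latin Extended-A (U+0100 to U+017F), and the letters are written A-Za-z because A-z would also match [, \, ], ^, _ and `.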

See the examples below:

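// No g flag: test() on a global regex keeps state in lastIndex between calls.
// The range reaches \u017F so that Latin Extended-A letters (Č, š, đ, ...) match too.
var regexp = /[A-Za-z\u00C0-\u017F]+/,
  ascii = ' hello !@#$%^&*())_+=',
  latin = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏàáâãäåæçèéêëìíîïÐÑÒÓÔÕÖØÙÚÛÜÝÞßðñòóôõöøùúûüýþÿ',
  chinese = ' 你 好 ';

console.log(regexp.test(ascii)); // true ("hello" matches)
console.log(regexp.test(latin)); // true
console.log(regexp.test(chinese)); // false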
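One thing worth checking against the original question: Č, č, Ć, ć, Š, š, Đ and đ are not in the \u00C0-\u00FF block but in Latin Extended-A (U+0100 to U+017F). A small sketch that prints their code points and compares the two ranges (variable names are illustrative only):

```javascript
// normalize('NFC') guards against the letters being stored in decomposed form.
var letters = 'ČčĆćŠšĐđ'.normalize('NFC');

var codePoints = letters.split('').map(function (ch) {
  return 'U+' + ch.charCodeAt(0).toString(16).toUpperCase().padStart(4, '0');
});
console.log(codePoints.join(' '));
// U+010C U+010D U+0106 U+0107 U+0160 U+0161 U+0110 U+0111

var latin1Only = /^[\u00C0-\u00FF]+$/;     // Latin-1 Supplement only
var withExtendedA = /^[\u00C0-\u017F]+$/;  // also covers Latin Extended-A

console.log(latin1Only.test(letters));    // false
console.log(withExtendedA.test(letters)); // true
```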

Gist: https://gist.github.com/germanattanasio/84cd25395688b7935182

Licensed under: CC-BY-SA with attribution
