Question

I want to calculate how many times the caret moves when going from the beginning to the end of a string.

Explanations:
Look at the string "" in this fiddle: http://jsfiddle.net/RFuQ3/
If you put the caret before the first quote and then press the right arrow key, you have to press it 3 times to arrive after the second quote (instead of 2 times for an empty string).

The first and easiest way to get the length of a string is <string>.length.
But here, it returns 2.

The second way, from the question "JavaScript Get real length of a string (without entities)", also gives 2.

How can I get 1?


1- I thought of putting the string in a text input and then running a while loop with try{setCaret}catch(){}.
2- It's just for fun.


Solution

The character in your question "󠀁" is the Unicode Character 'LANGUAGE TAG' (U+E0001).

From related Stack Overflow questions, we learn that

JavaScript strings are sequences of 16-bit code units (UCS-2/UTF-16) but can represent Unicode code points outside the Basic Multilingual Plane (U+0000-U+D7FF and U+E000-U+FFFF) using two 16-bit numbers (a UTF-16 surrogate pair): the first (high surrogate) must be in the range U+D800-U+DBFF and the second (low surrogate) in the range U+DC00-U+DFFF.

The UTF-16 surrogate pair representing "󠀁" is U+DB40 and U+DC01. In decimal U+DB40 is 56128, and U+DC01 is 56321.

console.log("󠀁".length); // 2
console.log("󠀁".charCodeAt(0)); // 56128
console.log("󠀁".charCodeAt(1)); // 56321
console.log("\uDB40\uDC01" === "󠀁"); // true
console.log(String.fromCharCode(0xDB40, 0xDC01) === "󠀁"); // true
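The surrogate pair above can also be derived arithmetically from the code point. The following is a minimal sketch of the standard UTF-16 encoding step; the helper name toSurrogatePair is hypothetical and not part of any library.

```javascript
// Hypothetical helper: split a supplementary code point (above U+FFFF)
// into its UTF-16 high and low surrogate code units.
function toSurrogatePair(codePoint) {
    var offset = codePoint - 0x10000;      // remaining 20-bit value
    var high = 0xD800 + (offset >> 10);    // top 10 bits
    var low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
    return [high, low];
}

var pair = toSurrogatePair(0xE0001);
console.log(pair[0].toString(16), pair[1].toString(16)); // db40 dc01
console.log(String.fromCharCode(pair[0], pair[1]).length); // 2
```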

Adapting the code from https://stackoverflow.com/a/4885062/788324, we just need to count the number of code points to arrive at the correct answer:

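var getNumCodePoints = function(str) {
    var numCodePoints = 0;
    for (var i = 0; i < str.length; i++) {
        var charCode = str.charCodeAt(i);
        // A high surrogate (U+D800-U+DBFF) followed by a low surrogate
        // (U+DC00-U+DFFF) forms one code point, so skip the second unit.
        // Unpaired surrogates are counted on their own.
        if ((charCode & 0xFC00) === 0xD800 && i + 1 < str.length &&
                (str.charCodeAt(i + 1) & 0xFC00) === 0xDC00) {
            i++;
        }
        numCodePoints++;
    }
    return numCodePoints;
};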

console.log(getNumCodePoints("󠀁")); // 1
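In engines that support ECMAScript 2015 or later (an assumption; the answer above does not rely on it), the built-in string iterator already steps through code points, so the same count can be sketched without manual surrogate checks. The name countCodePoints is only illustrative.

```javascript
// Assumes an ES2015+ engine. countCodePoints is an illustrative name.
function countCodePoints(str) {
    // Array.from uses the string iterator, which yields one element per
    // code point (a surrogate pair becomes a single element).
    return Array.from(str).length;
}

var tag = "\uDB40\uDC01"; // U+E0001, the character from the question
console.log(tag.length);                        // 2
console.log(countCodePoints(tag));              // 1
console.log(tag.codePointAt(0).toString(16));   // e0001
console.log(countCodePoints("a" + tag + "b"));  // 3
```

This counts code points, not caret stops: a base letter followed by a combining mark, for example, is still two code points.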


Other tips

function realLength(str) {
    var i = 1;
    while (str.substring(i,i+1) != "") i++;
    return (i-1);
}

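I haven't tested this code. Note that substring works on UTF-16 code units, so this loop returns str.length - 1 for any non-empty string; it gives 1 for the string in the question only by coincidence.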

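JavaScript string methods work on UTF-16 code units rather than Unicode code points. You can try

yourstring.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "0").length

which replaces each surrogate pair with a single character before measuring the length, for what it's worth.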
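As a rough sketch, the replace approach can be wrapped in a function. The version below uses a pattern that matches a high surrogate followed by a low surrogate, so unpaired surrogates are still counted individually; the name lengthViaReplace is made up for illustration.

```javascript
// Illustrative wrapper around the replace approach.
function lengthViaReplace(str) {
    // Collapse each high+low surrogate pair into one placeholder
    // character, then measure the result. Unpaired surrogates are left
    // alone and still count as one unit each.
    return str.replace(/[\uD800-\uDBFF][\uDC00-\uDFFF]/g, "_").length;
}

console.log(lengthViaReplace("\uDB40\uDC01"));   // 1
console.log(lengthViaReplace("a\uDB40\uDC01b")); // 3
console.log(lengthViaReplace("abc"));            // 3
```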

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow