Question

I'm looking for the easiest way to determine whether a character in Rust falls between two Unicode values.

For example, I want to know if a character s is between [#x1-#x8] or [#x10FFFE-#x10FFFF]. Is there a function that does this already?


Solution

The simplest way for me to match a character was this:

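fn match_char(data: &char) -> bool {
    match *data {
        // Inclusive range patterns are written `..=`; the older `...` form is rejected in the 2021 edition.
        '\x01'..='\x08' |
        '\u{10FFFE}'..='\u{10FFFF}' => true,
        _ => false,
    }
}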

Pattern matching a character was the easiest route for me, compared to a bunch of if statements. It might not be the most performant solution, but it served me very well.
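On current Rust the same pattern can be written more compactly with the standard matches! macro. The following is only a sketch: the name in_ranges is hypothetical, and it takes the char by value instead of by reference because char is Copy.

```rust
/// Returns true if `c` lies in U+0001..=U+0008 or U+10FFFE..=U+10FFFF.
/// `in_ranges` is a hypothetical name chosen for this sketch.
fn in_ranges(c: char) -> bool {
    // `matches!` expands to a `match` that yields true for the listed
    // patterns and false for everything else.
    matches!(c, '\x01'..='\x08' | '\u{10FFFE}'..='\u{10FFFF}')
}
```

With this definition, in_ranges('\x05') is true and in_ranges('a') is false.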

OTHER TIPS

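The simplest way, assuming the ranges are not Unicode categories (in which case the methods on char such as is_alphabetic or is_control, or a dedicated Unicode crate, are a better fit; the old std::unicode module no longer exists), is to use the regular comparison operators: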

(s >= '\x01' && s <= '\x08') || s == '\u{10FFFE}' || s == '\u{10FFFF}'
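If the explicit comparisons get noisy, an inclusive range value with its contains method expresses the same check. A minimal sketch, assuming a hypothetical helper named in_ranges_cmp:

```rust
/// Same check as the comparison expression, written with range values.
/// `in_ranges_cmp` is a hypothetical name used only for illustration.
fn in_ranges_cmp(s: char) -> bool {
    // `RangeInclusive::contains` compares against both bounds,
    // i.e. it is equivalent to `s >= lo && s <= hi`.
    ('\x01'..='\x08').contains(&s) || ('\u{10FFFE}'..='\u{10FFFF}').contains(&s)
}
```

This is a range value checked at run time, not a pattern, so the bounds could also come from variables.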

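(In case you weren't aware of the literal forms of these things: current Rust has \xXX escapes, which in a char literal are limited to \x7F, and \u{X...} escapes, which take one to six hexadecimal digits and cover any Unicode scalar value; the older \uXXXX and \UXXXXXXXX forms are no longer accepted. Converting from a number works too, but only a u8 can be cast directly with as char. For a larger value use char::from_u32(0x10FFFE), which returns an Option<char> because not every u32 is a valid character. Either way it is less readable than a literal.)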
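When the bounds are only available as numbers, the conversion can go in either direction. The sketch below assumes two hypothetical helpers, char_from_code and in_ranges_numeric; char::from_u32 and u32::from are the standard library calls.

```rust
/// Converts a numeric code point to a char, if it is a valid scalar value.
/// `char_from_code` is a hypothetical wrapper used only for illustration.
fn char_from_code(code: u32) -> Option<char> {
    // Returns None for surrogates (0xD800..=0xDFFF) and values above 0x10FFFF.
    char::from_u32(code)
}

/// Checks the ranges numerically by widening the char to its code point.
/// `in_ranges_numeric` is likewise a hypothetical name.
fn in_ranges_numeric(c: char) -> bool {
    let code = u32::from(c);
    (0x1..=0x8).contains(&code) || (0x10FFFE..=0x10FFFF).contains(&code)
}
```

Going from char to u32 always succeeds, so comparing code points avoids dealing with the Option at all.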

Licensed under: CC-BY-SA with attribution