Question

I'm trying to build an implementation of <ctype.h> with a lookup table and bit masks (this method). But I've seen in the C11 standard that I need some information about the current locale:

7.4 Character handling <ctype.h>

[...]

2 The behavior of these functions is affected by the current locale.

So, do I need a <locale.h> implementation? How can I manage my <ctype.h> implementation with the C standard library?

Solution

A fast, simple way to do this for 8-bit characters is to keep one bitmask table for each supported value of LC_CTYPE. For wide character types, you can reduce the size of the tables with a technique such as a two-stage lookup. To be efficient, this needs to be designed separately for each character encoding. Selecting the table dynamically based on LC_CTYPE also makes it easier to add new locales later.
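As a rough illustration of the 8-bit case, here is a minimal sketch with one 256-entry table per locale and a pointer that selects the active one. Everything in it is hypothetical: the `MY_*` bit names, the `my_*` functions, and the locale names "C" and "latin1" are made up for the example, and the Latin-1 table is deliberately simplified. It also assumes an ASCII-compatible execution character set.

```c
#include <string.h>

/* Hypothetical class bits; the names are illustrative only. */
enum {
    MY_UPPER = 1 << 0,
    MY_LOWER = 1 << 1,
    MY_DIGIT = 1 << 2,
    MY_SPACE = 1 << 3
};

static unsigned char c_table[256];       /* "C" locale */
static unsigned char latin1_table[256];  /* simplified ISO-8859-1 */

/* The table consulted by every my_isxxx() call. */
static const unsigned char *my_ctype_table = c_table;

/* Fill the ASCII part of a table (assumes ASCII-compatible encoding). */
static void fill_ascii(unsigned char *t)
{
    int c;
    memset(t, 0, 256);
    for (c = 'A'; c <= 'Z'; c++) t[c] |= MY_UPPER;
    for (c = 'a'; c <= 'z'; c++) t[c] |= MY_LOWER;
    for (c = '0'; c <= '9'; c++) t[c] |= MY_DIGIT;
    for (c = '\t'; c <= '\r'; c++) t[c] |= MY_SPACE;
    t[' '] |= MY_SPACE;
}

/* Build both tables once; a real libc would ship them precomputed. */
static void my_ctype_init(void)
{
    int c;
    fill_ascii(c_table);
    fill_ascii(latin1_table);
    /* Latin-1 letters, skipping 0xD7 (multiplication sign)
       and 0xF7 (division sign). */
    for (c = 0xC0; c <= 0xDE; c++)
        if (c != 0xD7) latin1_table[c] |= MY_UPPER;
    for (c = 0xDF; c <= 0xFF; c++)
        if (c != 0xF7) latin1_table[c] |= MY_LOWER;
}

/* Hypothetical hook that a setlocale(LC_CTYPE, ...) implementation
   could call. Returns 0 on success, -1 for an unknown locale name. */
static int my_set_ctype_locale(const char *name)
{
    if (strcmp(name, "C") == 0)      { my_ctype_table = c_table;      return 0; }
    if (strcmp(name, "latin1") == 0) { my_ctype_table = latin1_table; return 0; }
    return -1;
}

/* Out-of-range arguments, including a negative EOF, classify as nothing. */
static int my_class(int c)
{
    return (c < 0 || c > 255) ? 0 : my_ctype_table[c];
}

static int my_isupper(int c) { return my_class(c) & MY_UPPER; }
static int my_islower(int c) { return my_class(c) & MY_LOWER; }
static int my_isdigit(int c) { return my_class(c) & MY_DIGIT; }
static int my_isspace(int c) { return my_class(c) & MY_SPACE; }
```

With this layout, the classification functions never look at the locale name themselves. Switching locale only swaps one pointer, so the only thing <ctype.h> needs from the locale machinery is a way to be told which table is current.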

It looks to me like you could cover the Western languages with 16 or so 1-byte tables. Covering everything would take about 50 tables, some of them quite tedious.
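The two-stage lookup mentioned for wide characters might look something like the sketch below. It assumes the wide character value is a Unicode code point, splits it into a block index (high bits) and an offset (low 8 bits), and lets all unclassified blocks share a single all-zero block. The names (`w_init`, `w_class`, `W_UPPER`, `W_LOWER`) are hypothetical, and only ASCII letters and part of the Cyrillic block are filled in, purely for illustration.

```c
#include <stdint.h>

#define BLOCK_SHIFT 8
#define BLOCK_SIZE  (1u << BLOCK_SHIFT)
#define MAX_CP      0x10FFFFu
#define NUM_BLOCKS  ((MAX_CP >> BLOCK_SHIFT) + 1)

enum { W_UPPER = 1, W_LOWER = 2 };  /* hypothetical class bits */

/* Stage 2: only the distinct blocks are stored.
   Block 0 is all zeros and is shared by every unclassified range. */
static uint8_t stage2[3][BLOCK_SIZE];

/* Stage 1: maps (code point >> 8) to an index into stage2.
   Zero-initialized, so every entry starts out pointing at block 0. */
static uint8_t stage1[NUM_BLOCKS];

static void w_init(void)
{
    uint32_t c;

    /* Code points 0x0000..0x00FF use block 1 (ASCII letters only here). */
    stage1[0x00] = 1;
    for (c = 'A'; c <= 'Z'; c++) stage2[1][c] |= W_UPPER;
    for (c = 'a'; c <= 'z'; c++) stage2[1][c] |= W_LOWER;

    /* Code points 0x0400..0x04FF use block 2. Only U+0400..U+045F are
       classified; the rest of the block is left empty in this sketch. */
    stage1[0x04] = 2;
    for (c = 0x00; c <= 0x2F; c++) stage2[2][c] |= W_UPPER;
    for (c = 0x30; c <= 0x5F; c++) stage2[2][c] |= W_LOWER;
}

/* Two array reads per lookup, whatever the code point. */
static int w_class(uint32_t cp)
{
    if (cp > MAX_CP)
        return 0;
    return stage2[stage1[cp >> BLOCK_SHIFT]][cp & (BLOCK_SIZE - 1)];
}
```

A flat table covering every code point up to 0x10FFFF would need over a million entries. Here the first stage has 4352 one-byte entries and the second stage grows only by 256 bytes per distinct block, which is where the saving comes from.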

Licensed under: CC-BY-SA with attribution