How do I open a file named 𤭢.txt with the CreateFile() API function?
04-07-2021
Question
The code points of some Unicode characters (like 𤭢) need more than 2 bytes. How do I use Win32 API functions like CreateFile() with these characters?
WinBase.h
WINBASEAPI
__out
HANDLE
WINAPI
CreateFileA(
__in LPCSTR lpFileName,
__in DWORD dwDesiredAccess,
__in DWORD dwShareMode,
__in_opt LPSECURITY_ATTRIBUTES lpSecurityAttributes,
__in DWORD dwCreationDisposition,
__in DWORD dwFlagsAndAttributes,
__in_opt HANDLE hTemplateFile
);
WINBASEAPI
__out
HANDLE
WINAPI
CreateFileW(
__in LPCWSTR lpFileName,
__in DWORD dwDesiredAccess,
__in DWORD dwShareMode,
__in_opt LPSECURITY_ATTRIBUTES lpSecurityAttributes,
__in DWORD dwCreationDisposition,
__in DWORD dwFlagsAndAttributes,
__in_opt HANDLE hTemplateFile
);
#ifdef UNICODE
#define CreateFile CreateFileW
#else
#define CreateFile CreateFileA
#endif // !UNICODE
LPCSTR and LPCWSTR are defined in WinNT.h as:
typedef __nullterminated CONST CHAR *LPCSTR, *PCSTR;
typedef __nullterminated CONST WCHAR *LPCWSTR, *PCWSTR;
CHAR and WCHAR are defined in WinNT.h as:
typedef char CHAR;
#ifndef _MAC
typedef wchar_t WCHAR; // wc, 16-bit UNICODE character
#else
// some Macintosh compilers don't define wchar_t in a convenient location, or define it as a char
typedef unsigned short WCHAR; // wc, 16-bit UNICODE character
#endif
CreateFileA() accepts LPCSTR file names, which are stored in an 8-bit char array internally.
CreateFileW() accepts LPCWSTR file names, which are stored in a 16-bit wchar_t array internally.
I have created a file at C:\𤭢.txt. It looks like it is not possible to open this file using CreateFile(), because its name contains the character 𤭢, whose Unicode code point is U+24B62, and that value does not fit in a single WCHAR array element.
But the file exists on my hard disk and Windows handles it normally. How do I open this file with a Win32 API function, the way Windows does internally?
Solution
Such characters are represented by UTF-16 surrogate pairs: it takes two wide character elements to represent that code point (for U+24B62, the pair is 0xD852 0xDF62). So you just need to call CreateFile, passing the necessary surrogate pair in the file name. Naturally, you need to use the wide variant, CreateFileW.
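As a minimal sketch of that idea, the code below splits U+24B62 into its surrogate pair and, on Windows, passes the resulting name to CreateFileW. The helper name to_surrogates is made up for this illustration, and the path C:\𤭢.txt is simply the one from the question; treat this as one possible way to do it, not a definitive implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: split a supplementary code point
   (U+10000..U+10FFFF) into a UTF-16 surrogate pair.
   Returns 1 on success, 0 if cp is outside that range. */
static int to_surrogates(uint32_t cp, uint16_t *high, uint16_t *low)
{
    assert(high != NULL && low != NULL);
    if (cp < 0x10000u || cp > 0x10FFFFu)
        return 0;
    cp -= 0x10000u;                               /* 20 bits remain   */
    *high = (uint16_t)(0xD800u + (cp >> 10));     /* upper 10 bits    */
    *low  = (uint16_t)(0xDC00u + (cp & 0x3FFu));  /* lower 10 bits    */
    return 1;
}

#ifdef _WIN32
#include <windows.h>

int main(void)
{
    uint16_t hi, lo;
    WCHAR path[] = L"C:\\XX.txt";   /* XX are placeholders for the pair */
    HANDLE h;

    if (!to_surrogates(0x24B62u, &hi, &lo))
        return 1;
    path[3] = (WCHAR)hi;            /* WCHAR is 16 bits on Windows */
    path[4] = (WCHAR)lo;

    h = CreateFileW(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                    OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        printf("CreateFileW failed, error %lu\n",
               (unsigned long)GetLastError());
        return 1;
    }
    printf("opened using surrogates %04X %04X\n",
           (unsigned)hi, (unsigned)lo);
    CloseHandle(h);
    return 0;
}
#endif
```

The arithmetic is portable; only the part inside the _WIN32 guard touches the Win32 API. Writing the pair directly into a wide string literal with hex escapes should work as well, but computing it makes the encoding step explicit.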
Presumably you won't be hard-coding such a filename in your code, in which case you'll be getting it from a file dialog, FindFirstFile, etc. The wide versions of those APIs will give you the appropriately UTF-16 encoded buffer for the file name.
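A rough sketch of that flow, assuming the files of interest are the *.txt files directly under C:\ as in the question: FindFirstFileW fills in a wide file name, and that buffer is handed to CreateFileW unchanged. The helper has_surrogate_pair is a hypothetical name introduced only to show that the returned buffer already contains the pair.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: return 1 if the NUL-terminated UTF-16 string
   contains a high surrogate immediately followed by a low surrogate. */
static int has_surrogate_pair(const uint16_t *s)
{
    assert(s != NULL);
    for (; s[0] != 0; ++s) {
        /* s[1] is at worst the terminating NUL, so reading it is safe */
        if (s[0] >= 0xD800u && s[0] <= 0xDBFFu &&
            s[1] >= 0xDC00u && s[1] <= 0xDFFFu)
            return 1;
    }
    return 0;
}

#ifdef _WIN32
#include <windows.h>
#include <wchar.h>

int main(void)
{
    const WCHAR dir[] = L"C:\\";   /* directory taken from the question */
    WIN32_FIND_DATAW fd;
    HANDLE find = FindFirstFileW(L"C:\\*.txt", &fd);
    if (find == INVALID_HANDLE_VALUE)
        return 1;

    do {
        WCHAR full[MAX_PATH];
        size_t dlen = wcslen(dir);
        size_t nlen = wcslen(fd.cFileName);
        HANDLE h;

        if (fd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY)
            continue;              /* skip directories */
        if (dlen + nlen >= MAX_PATH)
            continue;              /* name would not fit in the buffer */

        /* Build "C:\<name>", copying the UTF-16 units untouched */
        memcpy(full, dir, dlen * sizeof(WCHAR));
        memcpy(full + dlen, fd.cFileName, (nlen + 1) * sizeof(WCHAR));

        h = CreateFileW(full, GENERIC_READ, FILE_SHARE_READ, NULL,
                        OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
        if (h != INVALID_HANDLE_VALUE) {
            printf("opened a file%s\n",
                   has_surrogate_pair((const uint16_t *)fd.cFileName)
                       ? " (name contains a surrogate pair)" : "");
            CloseHandle(h);
        }
    } while (FindNextFileW(find, &fd));

    FindClose(find);
    return 0;
}
#endif
```

The calling code never has to decode the name: it only copies 16-bit units from one wide API to another, which is why surrogate pairs need no special handling here.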