How to open an std::fstream (ofstream or ifstream) with a Unicode filename?
Question
You wouldn't imagine that something as basic as opening a file with the C++ standard library would be tricky in a Windows application, but it appears to be. By Unicode I mean UTF-8 here, but I can convert to UTF-16 or whatever is needed; the point is getting an ofstream instance from a Unicode filename. Before I hack up my own solution, is there a preferred route here? Especially a cross-platform one?
Solution
The C++ standard library is not Unicode-aware. char and wchar_t are not required to be Unicode encodings.
On Windows, wchar_t is UTF-16, but there's no direct support for UTF-8 filenames in the standard library (the char datatype is not Unicode on Windows).
With MSVC (and thus the Microsoft STL), a constructor for filestreams is provided which takes a const wchar_t* filename, allowing you to create the stream as:
wchar_t const name[] = L"filename.txt";
std::fstream file(name);
However, this overload is not specified by the C++11 standard (it only guarantees the presence of the char-based version). It is also not present in alternative standard library implementations such as GCC's libstdc++ for MinGW(-w64), as of g++ 4.8.x.
Note that just as char on Windows is not UTF-8, wchar_t on other operating systems may not be UTF-16. So overall, this isn't likely to be portable. Opening a stream given a wchar_t filename isn't defined by the standard, and specifying the filename in chars may be difficult because the encoding used by char varies between operating systems.
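To make the trade-off above concrete, here is a minimal sketch of the kind of hand-rolled wrapper the question alludes to. The helper names utf8_to_wide and open_utf8_for_write are hypothetical, not part of any library. The sketch assumes that MSVC builds can use the non-standard wchar_t overload together with the Win32 function MultiByteToWideChar, and that on other platforms the narrow file name is handed to the operating system unchanged (so UTF-8 works wherever the OS treats names as UTF-8 bytes).

```cpp
#include <cstddef>
#include <fstream>
#include <string>

#if defined(_MSC_VER)
#include <windows.h>

// Hypothetical helper: convert UTF-8 bytes to UTF-16 with the Win32 API.
// Returns an empty string if the input is empty or is not valid UTF-8.
inline std::wstring utf8_to_wide(const std::string& utf8) {
    if (utf8.empty()) return std::wstring();
    const int len = static_cast<int>(utf8.size());
    // First call: ask how many UTF-16 code units are required.
    const int needed = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                           utf8.data(), len, nullptr, 0);
    if (needed <= 0) return std::wstring();
    std::wstring wide(static_cast<std::size_t>(needed), L'\0');
    // Second call: perform the conversion into the sized buffer.
    MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                        utf8.data(), len, &wide[0], needed);
    return wide;
}
#endif

// Hypothetical helper: open an output stream from a UTF-8 file name.
inline std::ofstream open_utf8_for_write(const std::string& utf8_name) {
#if defined(_MSC_VER)
    // Microsoft extension: the wchar_t const* overload described above.
    return std::ofstream(utf8_to_wide(utf8_name).c_str(), std::ios::binary);
#else
    // Assumption: the platform passes narrow file names through as bytes.
    return std::ofstream(utf8_name.c_str(), std::ios::binary);
#endif
}
```

A caller would write something like std::ofstream out = open_utf8_for_write(name); and then check out.is_open() before writing. The preprocessor split is exactly the non-portability described above, which is why a library-provided route is usually preferable when one is available.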
OTHER TIPS
In current versions of Visual C++, std::basic_fstream has an open() method that takes a const wchar_t*, according to http://msdn.microsoft.com/en-us/library/4dx08bh4.aspx.
Since C++17, there is a cross-platform way to open an std::fstream with a Unicode filename using the std::filesystem::path overload. Until C++20, you can create a path from a UTF-8 string with std::filesystem::u8path. Example:
std::ofstream out(std::filesystem::u8path(u8"こんにちは"));
out << "hello";
From C++20 on, you can create the path by passing a UTF-8 (char8_t) string to the constructor: std::filesystem::path(u8"こんにちは") (u8path is deprecated in C++20).
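When the UTF-8 name arrives at run time in a std::string rather than as a literal, the two variants above can be hidden behind one function. The following is only a sketch: path_from_utf8 and round_trip are hypothetical names, and the sketch assumes the feature-test macro __cpp_lib_char8_t is a reasonable way to choose between the C++20 constructor and the C++17 u8path call.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: build a path from UTF-8 bytes held in a std::string.
inline fs::path path_from_utf8(const std::string& utf8) {
#if defined(__cpp_lib_char8_t)
    // C++20: char8_t input to the path constructor is treated as UTF-8.
    return fs::path(std::u8string(utf8.begin(), utf8.end()));
#else
    // C++17: u8path interprets the char sequence as UTF-8.
    return fs::u8path(utf8);
#endif
}

// Hypothetical helper: write one line to the file, then read it back.
inline std::string round_trip(const fs::path& file, const std::string& text) {
    {
        std::ofstream out(file, std::ios::binary);  // path overload (C++17)
        out << text;
    }  // the stream is closed here, before the file is reopened
    std::ifstream in(file, std::ios::binary);
    std::string line;
    std::getline(in, line);
    return line;
}
```

For example, round_trip(path_from_utf8(name), "hello") writes "hello" to the named file and returns what was read back; the path object takes care of converting the name to whatever native encoding the platform uses.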
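With MSVC, you can also use std::wofstream, std::wifstream and std::wfstream, which accept a wide filename. The filename has to be a wstring, an array of wchar_t, or a literal wrapped in the _T() macro or prefixed with L. Note that the wide stream types change the character type of the file's contents, not of its name; the wchar_t filename overload is the same Microsoft extension that the narrow streams have.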
Have a look at Boost.Nowide:
#include <boost/nowide/fstream.hpp>
#include <boost/nowide/cout.hpp>
using boost::nowide::ifstream;
using boost::nowide::cout;
// #include <fstream>
// #include <iostream>
// using std::ifstream;
// using std::cout;
#include <string>
int main() {
    ifstream f("UTF-8 (e.g. ß).txt");
    std::string line;
    std::getline(f, line);
    cout << "UTF-8 content: " << line;
}
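If you're using Qt mixed with std::ifstream, you can convert the QString to a std::wstring first (this cast assumes a 16-bit wchar_t, as on Windows):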
return std::wstring(reinterpret_cast<const wchar_t*>(qString.utf16()));