"#include" a text file in a C program as a char[]

https://stackoverflow.com/questions/410980

03-07-2019

Question

Is there a way to include an entire text file as a string in a C program at compile time?

Something like:

file.txt:
```
This is
a little
text file
```

main.c:

#include <stdio.h>
int main(void) {
   #blackmagicinclude("file.txt", content)
   /*
   equiv: char[] content = "This is\na little\ntext file";
   */
   printf("%s", content);
}

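obtaining a little program that prints "This is a little text file" on stdout.

At the moment I use a hackish Python script, but it's butt-ugly and limited to only one variable name. Can you tell me another way to do it?

Solution

I'd suggest using the Unix utility xxd for this. You can use it like so: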

$ echo hello world > a
$ xxd -i a

Output:

unsigned char a[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64, 0x0a
};
unsigned int a_len = 12;
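
Note that the array xxd emits is not NUL-terminated, so it cannot be handed to printf("%s") as it stands. The following is a minimal sketch of one way to turn it into a C string. The helper name blob_to_cstring is hypothetical, and the array is pasted inline only so the example stands alone; in practice it would come from the generated header.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Pasted from the `xxd -i a` output above. */
unsigned char a[] = {
  0x68, 0x65, 0x6c, 0x6c, 0x6f, 0x20, 0x77, 0x6f, 0x72, 0x6c, 0x64, 0x0a
};
unsigned int a_len = 12;

/* Hypothetical helper: returns a heap-allocated, NUL-terminated copy of an
   embedded blob, or NULL if allocation fails. The caller frees the result. */
char *blob_to_cstring(const unsigned char *data, unsigned int len)
{
    assert(data != NULL);
    char *s = malloc((size_t)len + 1);
    if (s == NULL)
        return NULL;
    memcpy(s, data, len);
    s[len] = '\0';
    return s;
}
```

If a copy is not wanted, fwrite(a, 1, a_len, stdout) prints the bytes directly without needing a terminator.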

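Other tips

The question was about C, but in case someone tries to do this with C++11, it can be done with only small changes to the included text file, thanks to the new raw string literals:

In C++ do this: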

const char *s =
#include "test.txt"
;

In the text file do this:

R"(Line 1
Line 2
Line 3
Line 4
Line 5
Line 6)"

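So there only has to be a prefix at the top of the file and a suffix at the end of it. In between you can do what you want; no special escaping is necessary as long as you don't need the character sequence )". But even this can work if you specify your own custom delimiter: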

R"=====(Line 1
Line 2
Line 3
Now you can use "( and )" in the text file, too.
Line 5
Line 6)====="
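
When the prefix and suffix are added by a tool rather than by hand, it may be worth checking that the chosen delimiter really cannot occur in the text. The sketch below is one way to do that in C; the function name raw_delimiter_is_safe is hypothetical. It relies on the C++ rule that a raw string delimiter is at most 16 characters long, and it does not validate which characters the delimiter contains.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `text` can be wrapped as
   R"delim( ... )delim" without the raw string ending early, else 0. */
int raw_delimiter_is_safe(const char *text, const char *delim)
{
    assert(text != NULL && delim != NULL);
    size_t dlen = strlen(delim);
    if (dlen > 16)                  /* C++ allows at most 16 delimiter chars */
        return 0;

    /* The literal ends at the first occurrence of  )delim"  so look at
       every ')' and test what follows it. */
    for (const char *p = strchr(text, ')'); p != NULL; p = strchr(p + 1, ')')) {
        if (strncmp(p + 1, delim, dlen) == 0 && p[1 + dlen] == '"')
            return 0;
    }
    return 1;
}
```

With an empty delimiter the check rejects any text containing )", which matches the restriction described above.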

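You have two possibilities:

Make use of a compiler/linker extension to convert a file into a binary file, with proper symbols pointing to the beginning and end of the binary data. See this answer: Include binary file with GNU ld linker script.

Convert your file into a sequence of character constants that can initialize an array. Note that you can't just write "" and span multiple lines. You would need a line continuation character (\), escaped " characters, and more to make that work. It is easier to write a little program that converts the bytes into a sequence like '\xFF', '\xAB', ...., '\0' (or to use the Unix tool xxd described in another answer, if you have it available!):

Code: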

#include <stdio.h>

int main() {
    int c;
    while((c = fgetc(stdin)) != EOF) {
        printf("'\\x%X',", (unsigned)c);
    }
    printf("'\\0'"); // put terminating zero
}

(Not tested.) Then do:

char my_file[] = {
#include "data.h"
};

where data.h is generated by:

cat file.bin | ./bin2c > data.h

OK, inspired by Daemin's post, I tested the following simple example:

a.data:

"this is test\n file\n"

test.c:

int main(void)
{
    char *test = 
#include "a.data"
    ;
    return 0;
}

gcc -E test.c outputs:

# 1 "test.c"
# 1 "<built-in>"
# 1 "<command line>"
# 1 "test.c"

int main(void)
{
    char *test =
# 1 "a.data" 1
"this is test\n file\n"
# 6 "test.c" 2
    ;
    return 0;
}

So it works, but it requires the data to be surrounded by quotation marks.
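
Adding the quotation marks can itself be automated. The following is a rough sketch, not a complete tool: quote_text is a hypothetical name, and it assumes the input is plain ASCII text, handling only quotes, backslashes, tabs and newlines. It turns each input line into one C string literal so the result can be #included as shown above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of `src` rewritten as
   C string literals, one per input line. The caller frees the result. */
char *quote_text(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src);
    /* Worst case is 5 output bytes per input byte (opening quote plus the
       4 bytes written for a newline); +4 covers the tail and the NUL. */
    char *out = malloc(5 * len + 4);
    if (out == NULL)
        return NULL;

    char *p = out;
    int open = 0;                       /* inside an open "..." literal? */
    for (size_t i = 0; i < len; i++) {
        char c = src[i];
        if (!open) {
            *p++ = '"';
            open = 1;
        }
        switch (c) {
        case '"':
        case '\\':
            *p++ = '\\';
            *p++ = c;
            break;
        case '\t':
            *p++ = '\\';
            *p++ = 't';
            break;
        case '\n':                      /* emit \n, close the literal, break the line */
            memcpy(p, "\\n\"\n", 4);
            p += 4;
            open = 0;
            break;
        default:
            *p++ = c;
            break;
        }
    }
    if (open) {                         /* last line had no trailing newline */
        *p++ = '"';
        *p++ = '\n';
    } else if (len == 0) {              /* empty input still needs a literal */
        memcpy(p, "\"\"\n", 3);
        p += 3;
    }
    *p = '\0';
    return out;
}
```

Adjacent string literals are concatenated by the compiler, so the multi-line output still initializes a single string.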

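I like kayahr's answer. If you don't want to touch the input files, however, and you are using CMake, you can add the delimiter character sequences to the file at build time. The following CMake code, for instance, copies the input files and wraps their content accordingly: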

function(make_includable input_file output_file)
    file(READ ${input_file} content)
    set(delim "for_c++_include")
    set(content "R\"${delim}(\n${content})${delim}\"")
    file(WRITE ${output_file} "${content}")
endfunction(make_includable)

# Use like
make_includable(external/shaders/cool.frag generated/cool.frag)

Then include it in C++ like this:

constexpr const char *test =
#include "generated/cool.frag"
;

What might work is if you do something like:

int main()
{
    const char* text = "
#include "file.txt"
";
    printf("%s", text);
    return 0;
}

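Of course you'll have to be careful with what is actually in the file, making sure there are no double quotes, that all appropriate characters are escaped, and so on.

Therefore it might be easier to just load the text from a file at run time, or to embed the text straight into the code.

If you still want the text in another file you can have it there, but it has to be represented there as a string. You would use the code as above but without the double quotes in it. For example (the first two lines are the contents of file.txt, followed by the program that includes it):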

"Something evil\n"\
"this way comes!"

int main()
{
    const char* text =
#include "file.txt"
;
    printf("%s", text);
    return 0;
}

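You need my xtr utility, but you can do it with a bash script. This is a script I call bin2inc. The first parameter is the name of the resulting char[] variable. The second parameter is the name of the file. The output is a C include file with the file content encoded (in lowercase hex) under the variable name given. The char array is zero terminated, and the length of the data is stored in $variableName_length.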

#!/bin/bash

fileSize ()
{
    [ -e "$1" ]  && {
        set -- `ls -l "$1"`;
        echo $5;
    }
}

echo unsigned char $1'[] = {'
./xtr -fhex -p 0x -s ', ' < "$2";
echo '0x00'
echo '};';
echo '';
echo unsigned long int ${1}_length = $(fileSize "$2")';'

You can get xtr here. xtr (character eXTRapolator) is GPLv3.

You can do this using objcopy:

objcopy --input binary --output elf64-x86-64 myfile.txt myfile.o

Now you have an object file you can link into your executable, which contains symbols for the beginning, end, and size of the content of myfile.txt.
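
On the C side, the symbols can be declared as extern arrays. The sketch below assumes the names GNU objcopy derives from the input file name myfile.txt (_binary_myfile_txt_start and _binary_myfile_txt_end); blob_length and MYFILE_LENGTH are hypothetical helpers. The embedded data is not NUL-terminated, so the length has to be computed from the two markers.

```c
#include <assert.h>
#include <stddef.h>

/* Symbols created by objcopy for an input file named myfile.txt.
   They only resolve when myfile.o is part of the link. */
extern const char _binary_myfile_txt_start[];
extern const char _binary_myfile_txt_end[];

/* Hypothetical helper: number of bytes between a start and an end marker. */
size_t blob_length(const char *start, const char *end)
{
    assert(start != NULL && end != NULL && end >= start);
    return (size_t)(end - start);
}

/* Only expand this macro in a program that links myfile.o, because it
   references the linker-provided symbols. */
#define MYFILE_LENGTH() \
    blob_length(_binary_myfile_txt_start, _binary_myfile_txt_end)
```

A program that links myfile.o could then print the file with fwrite(_binary_myfile_txt_start, 1, MYFILE_LENGTH(), stdout).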

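I reimplemented xxd in python3, fixing all of xxd's annoyances:

Const correctness
String length datatype: int → size_t
Null termination (in case you might want that)
C string compatible: drop unsigned on the array.
Smaller, readable output, as you would have written it: printable ASCII is output as-is; other bytes are hex-encoded.

Here is the script, filtered by itself, so you can see what it does: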

pyxxd.c

#include <stddef.h>

extern const char pyxxd[];
extern const size_t pyxxd_len;

const char pyxxd[] =
"#!/usr/bin/env python3\n"
"\n"
"import sys\n"
"import re\n"
"\n"
"def is_printable_ascii(byte):\n"
"    return byte >= ord(' ') and byte <= ord('~')\n"
"\n"
"def needs_escaping(byte):\n"
"    return byte == ord('\\\"') or byte == ord('\\\\')\n"
"\n"
"def stringify_nibble(nibble):\n"
"    if nibble < 10:\n"
"        return chr(nibble + ord('0'))\n"
"    return chr(nibble - 10 + ord('a'))\n"
"\n"
"def write_byte(of, byte):\n"
"    if is_printable_ascii(byte):\n"
"        if needs_escaping(byte):\n"
"            of.write('\\\\')\n"
"        of.write(chr(byte))\n"
"    elif byte == ord('\\n'):\n"
"        of.write('\\\\n\"\\n\"')\n"
"    else:\n"
"        of.write('\\\\x')\n"
"        of.write(stringify_nibble(byte >> 4))\n"
"        of.write(stringify_nibble(byte & 0xf))\n"
"\n"
"def mk_valid_identifier(s):\n"
"    s = re.sub('^[^_a-z]', '_', s)\n"
"    s = re.sub('[^_a-z0-9]', '_', s)\n"
"    return s\n"
"\n"
"def main():\n"
"    # `xxd -i` compatibility\n"
"    if len(sys.argv) != 4 or sys.argv[1] != \"-i\":\n"
"        print(\"Usage: xxd -i infile outfile\")\n"
"        exit(2)\n"
"\n"
"    with open(sys.argv[2], \"rb\") as infile:\n"
"        with open(sys.argv[3], \"w\") as outfile:\n"
"\n"
"            identifier = mk_valid_identifier(sys.argv[2]);\n"
"            outfile.write('#include <stddef.h>\\n\\n');\n"
"            outfile.write('extern const char {}[];\\n'.format(identifier));\n"
"            outfile.write('extern const size_t {}_len;\\n\\n'.format(identifier));\n"
"            outfile.write('const char {}[] =\\n\"'.format(identifier));\n"
"\n"
"            while True:\n"
"                byte = infile.read(1)\n"
"                if byte == b\"\":\n"
"                    break\n"
"                write_byte(outfile, ord(byte))\n"
"\n"
"            outfile.write('\";\\n\\n');\n"
"            outfile.write('const size_t {}_len = sizeof({}) - 1;\\n'.format(identifier, identifier));\n"
"\n"
"if __name__ == '__main__':\n"
"    main()\n"
"";

const size_t pyxxd_len = sizeof(pyxxd) - 1;

Usage (this extracts the script):

#include <stdio.h>

extern const char pyxxd[];
extern const size_t pyxxd_len;

int main()
{
    fwrite(pyxxd, 1, pyxxd_len, stdout);
}

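Even if it can be done at compile time (I don't think it can in general), the text would likely be the preprocessed header rather than the file's contents verbatim. I expect you'll have to load the text from the file at run time or do a nasty cut-n-paste job.

In x.h: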
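
For the run-time route, a minimal sketch of reading a whole file into a string could look like this. The function name read_text_file and the initial buffer size are illustrative choices, not anything prescribed by the answer above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads the whole file at `path` into a heap-allocated,
   NUL-terminated buffer. Returns NULL on failure; the caller frees the
   result. If `len_out` is not NULL it receives the number of bytes read. */
char *read_text_file(const char *path, size_t *len_out)
{
    assert(path != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(f);
        return NULL;
    }

    size_t n;
    /* Always leave one byte free for the terminating NUL. */
    while ((n = fread(buf + len, 1, cap - len - 1, f)) > 0) {
        len += n;
        if (cap - len - 1 == 0) {       /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(f)) {
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);

    buf[len] = '\0';
    if (len_out != NULL)
        *len_out = len;
    return buf;
}
```

The trade-off is that the text file must be shipped alongside the executable and found at run time, which is exactly what embedding avoids.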

"this is a "
"buncha text"

In main.c:

#include <stdio.h>
int main(void)
{
    char *textFileContents =
#include "x.h"
    ;

    printf("%s\n", textFileContents);

    return 0;
}

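ought to do the job.

Hasturkun's answer using xxd's -i option is excellent. If you want to incorporate the conversion process (text -> hex include file) directly into your build, the hexdump.c tool/library recently added a capability similar to xxd's -i option (it doesn't give you the full header, so you need to provide the char array definition yourself, but that has the advantage of letting you pick the name of the char array):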
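
This works because the compiler concatenates adjacent string literals. A small variation, sketched below with hypothetical names and with the contents of x.h pasted inline so the example stands alone, declares an array instead of a pointer so that sizeof yields the text length at compile time.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for:  static const char text_file_contents[] =
                  #include "x.h"
                  ;                                         */
static const char text_file_contents[] =
    "this is a "
    "buncha text";

/* Length of the embedded text, not counting the terminating NUL. */
size_t text_file_length(void)
{
    return sizeof text_file_contents - 1;
}

/* Bounds-checked access to one byte of the embedded text. */
char text_file_byte(size_t i)
{
    assert(i < sizeof text_file_contents);
    return text_file_contents[i];
}
```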

http://25thandclement.com/~william/projects/hexdump.c.html

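Its license is a lot more "standard" than xxd's and is very liberal. An example of using it to embed an init file in a program can be seen in the CMakeLists.txt and scheme.c files here: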

https://github.com/starseeker/tinyscheme-cmake

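There are pros and cons both to including generated files in source trees and to bundling utilities; how to handle it will depend on the specific goals and needs of your project. hexdump.c opens up the bundling option for this application.

I think it is not possible with the compiler and preprocessor alone. gcc allows this: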

#define _STRGF(x) # x
#define STRGF(x) _STRGF(x)

    printk ( MODULE_NAME " built " __DATE__ " at " __TIME__ " on host "
            STRGF(
#               define hostname my_dear_hostname
                hostname
            )
            "\n" );

But unfortunately not this:

#define _STRGF(x) # x
#define STRGF(x) _STRGF(x)

    printk ( MODULE_NAME " built " __DATE__ " at " __TIME__ " on host "
            STRGF(
#               include "/etc/hostname"
            )
            "\n" );

The error is:

/etc/hostname: In function ‘init_module’:
/etc/hostname:1:0: error: unterminated argument list invoking macro "STRGF"
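
For reference, the two-level stringification that STRGF relies on can be sketched in standard C as follows. The macro names STR_RAW and STR and the function is_build_host are hypothetical (the leading underscore plus capital letter in _STRGF is reserved for the implementation, so different names are used here), and the #define of hostname is placed before the macro call rather than inside its argument list.

```c
#include <assert.h>
#include <string.h>

#define STR_RAW(x) #x           /* stringifies the argument exactly as written */
#define STR(x) STR_RAW(x)       /* expands the argument first, then stringifies */

#define hostname my_dear_hostname

/* Hypothetical helper: returns 1 if `s` equals the host name that was
   baked in at compile time. */
int is_build_host(const char *s)
{
    assert(s != NULL);
    return strcmp(s, STR(hostname)) == 0;
}
```

STR(hostname) yields "my_dear_hostname", whereas STR_RAW(hostname) would yield "hostname", which is why the extra level of indirection is needed.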

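Why not link the text into the program and use it as a global variable? Here is an example. I'm considering using this to include OpenGL shader files within an executable, since GL shaders need to be compiled for the GPU at run time.

I had similar issues, and for small files the aforementioned solution by Johannes Schaub worked like a charm for me.

However, for files that are a bit larger, it ran into issues with the character array limit of the compiler. Therefore, I wrote a small encoder application that converts the file content into a 2D character array of equally sized chunks (possibly padded with zeros). It produces output text files with 2D array data like this: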

const char main_js_file_data[8][4]= {
    {'\x69','\x73','\x20','\0'},
    {'\x69','\x73','\x20','\0'},
    {'\x61','\x20','\x74','\0'},
    {'\x65','\x73','\x74','\0'},
    {'\x20','\x66','\x6f','\0'},
    {'\x72','\x20','\x79','\0'},
    {'\x6f','\x75','\xd','\0'},
    {'\xa','\0','\0','\0'}};

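where 4 is actually the variable MAX_CHARS_PER_ARRAY in the encoder. The file with the resulting C code, called, for example, "main_js_file_data.h", can then easily be inlined into the C++ application, for example like this: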

#include "main_js_file_data.h"
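
To get the original text back, the chunks have to be joined again. The sketch below shows one way to do that in C; join_chunks is a hypothetical helper, the sample array is pasted from the output above so the example stands alone, and it assumes the original file contains no zero bytes, since each row is treated as a NUL-terminated string.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CHARS_PER_ARRAY 4   /* matches the sample output above */

/* Sample data copied from the encoder output shown above. */
const char main_js_file_data[8][MAX_CHARS_PER_ARRAY] = {
    {'\x69','\x73','\x20','\0'},
    {'\x69','\x73','\x20','\0'},
    {'\x61','\x20','\x74','\0'},
    {'\x65','\x73','\x74','\0'},
    {'\x20','\x66','\x6f','\0'},
    {'\x72','\x20','\x79','\0'},
    {'\x6f','\x75','\xd','\0'},
    {'\xa','\0','\0','\0'}};

/* Hypothetical helper: concatenates all chunks into one heap-allocated,
   NUL-terminated string. The caller frees the result. */
char *join_chunks(const char chunks[][MAX_CHARS_PER_ARRAY], size_t num_chunks)
{
    assert(chunks != NULL);
    /* Each row holds at most MAX_CHARS_PER_ARRAY - 1 payload bytes. */
    char *out = malloc(num_chunks * (MAX_CHARS_PER_ARRAY - 1) + 1);
    if (out == NULL)
        return NULL;

    size_t len = 0;
    for (size_t i = 0; i < num_chunks; i++) {
        size_t n = strlen(chunks[i]);   /* rows are zero-padded and terminated */
        memcpy(out + len, chunks[i], n);
        len += n;
    }
    out[len] = '\0';
    return out;
}
```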

Here is the source code of the encoder:

#include <fstream>
#include <iterator>
#include <vector>
#include <algorithm>
#include <cmath>


#define MAX_CHARS_PER_ARRAY 2048


int main(int argc, char * argv[])
{
    // three parameters: input filename, output filename, variable name
    if (argc < 4)
    {
        return 1;
    }

    // buffer data, packaged into chunks
    std::vector<char> bufferedData;

    // open input file, in binary mode
    {    
        std::ifstream fStr(argv[1], std::ios::binary);
        if (!fStr.is_open())
        {
            return 1;
        }

        bufferedData.assign(std::istreambuf_iterator<char>(fStr), 
                            std::istreambuf_iterator<char>()     );
    }

    // write output text file, containing a variable declaration,
    // which will be a fixed-size two-dimensional plain array
    {
        std::ofstream fStr(argv[2]);
        if (!fStr.is_open())
        {
            return 1;
        }
        const std::size_t numChunks = std::size_t(std::ceil(double(bufferedData.size()) / (MAX_CHARS_PER_ARRAY - 1)));
        fStr << "const char " << argv[3] << "[" << numChunks           << "]"    <<
                                            "[" << MAX_CHARS_PER_ARRAY << "]= {" << std::endl;
        std::size_t count = 0;
        fStr << std::hex;
        while (count < bufferedData.size())
        {
            std::size_t n = 0;
            fStr << "{";
            for (; n < MAX_CHARS_PER_ARRAY - 1 && count < bufferedData.size(); ++n)
            {
                fStr << "'\\x" << int(static_cast<unsigned char>(bufferedData[count++])) << "',";
            }
            // fill missing part to reach fixed chunk size with zero entries
            for (std::size_t j = 0; j < (MAX_CHARS_PER_ARRAY - 1) - n; ++j)
            {
                fStr << "'\\0',";
            }
            fStr << "'\\0'}";
            if (count < bufferedData.size())
            {
                fStr << ",\n";
            }
        }
        fStr << "};\n";
    }

    return 0;
}

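If you're willing to resort to some dirty tricks, you can get creative with raw string literals and #include for certain types of files.

For example, say I want to include some SQL scripts for SQLite in my project and I want to get syntax highlighting but don't want any special build infrastructure. I can have this file test.sql, which is valid SQL for SQLite, where -- starts a comment: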

--x, R"(--
SELECT * from TestTable
WHERE field = 5
--)"

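And then in my C++ code I can have:

#include <iostream>
using namespace std;

int main()
{
    auto x = 0;   // the "--x" at the top of test.sql decrements this
    const char* mysql = (
#include "test.sql"
    );

    cout << mysql << endl;
}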

The output is:

--
SELECT * from TestTable
WHERE field = 5
--

Or to include some Python code from a file test.py that is a valid Python script (because # starts a comment in Python and pass is a no-op):

#define pass R"(
pass
def myfunc():
    print("Some Python code")

myfunc()
#undef pass
#define pass )"
pass

And then in the C++ code:

#include <iostream>
using namespace std;

int main()
{
    const char* mypython = (
#include "test.py"
    );

    cout << mypython << endl;
}

Which will output:

pass
def myfunc():
    print("Some Python code")

myfunc()
#undef pass
#define pass

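It should be possible to play similar tricks for various other types of code you might want to include as a string. Whether or not it is a good idea I am not sure. It's kind of a neat hack, but probably not something you would want in real production code. It might be okay for a weekend hack project, though.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow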