Simple data serialization in C

https://stackoverflow.com/questions/6382626

28-10-2019

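Question

I am currently redesigning an application and have run into a problem serializing some data.

Say I have an array of size m x n

double **data;

that I want to serialize into a

char *dataSerialized

using simple delimiters (one for rows, one for elements).

Deserialization is fairly straightforward: count the delimiters and allocate space for the data to be stored. But what about the serialize function, say

serialize_matrix(double **data, int m, int n, char **dataSerialized);

What would be the best strategy to determine the size the char array needs and to allocate the appropriate memory for it?

Perhaps use some fixed-width exponential representation of each double in the string? Is it possible to simply convert all the bytes of a double into chars and build a char array aligned to sizeof(double)? How would I keep the precision of the numbers intact?

Notes:

I need the data in a char array, not in binary form and not in a file.

The serialized data will be sent over the network with ZeroMQ, between a C server and a Java client. Given the array dimensions and sizeof(double), can the data always be reconstructed exactly between those two?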

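Solution

Java has pretty good support for reading raw bytes and converting them into whatever you want. You can decide on a simple wire format, serialize to it in C, and unserialize it in Java.

Here is an example of an extremely simple format, with code to unserialize and serialize.

I have written a slightly larger test program that I can dump somewhere if you want. It creates a random data array in C, serializes it, and writes the serialized string, base64-encoded, to stdout. A much smaller Java program then reads, decodes, and deserializes it.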
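The test program itself is not shown, so the base64 step can only be sketched here. The function below is a hypothetical helper (the name `base64_encode` is an assumption, not taken from the answer) that applies the standard base64 alphabet with `=` padding to an arbitrary byte buffer, which is roughly what would be needed to print the serialized message to stdout as text.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not from the original answer: standard base64
   (RFC 4648 alphabet, '=' padding). Returns a malloc'd, NUL-terminated
   string that the caller must free, or NULL if allocation fails. */
char *base64_encode(const unsigned char *src, size_t len)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    size_t out_len = 4 * ((len + 2) / 3);   /* 4 output chars per 3 input bytes */
    char *out = malloc(out_len + 1);
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    /* Full 3-byte groups */
    for (i = 0; i + 2 < len; i += 3) {
        *p++ = tbl[src[i] >> 2];
        *p++ = tbl[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
        *p++ = tbl[((src[i + 1] & 0x0f) << 2) | (src[i + 2] >> 6)];
        *p++ = tbl[src[i + 2] & 0x3f];
    }
    /* Trailing 1 or 2 bytes, padded with '=' */
    if (i < len) {
        *p++ = tbl[src[i] >> 2];
        if (i + 1 < len) {
            *p++ = tbl[((src[i] & 0x03) << 4) | (src[i + 1] >> 4)];
            *p++ = tbl[(src[i + 1] & 0x0f) << 2];
        } else {
            *p++ = tbl[(src[i] & 0x03) << 4];
            *p++ = '=';
        }
        *p++ = '=';
    }
    *p = '\0';
    return out;
}
```

Because the serialized message contains zero bytes, the length has to be passed explicitly rather than taken from strlen(). As a quick check, the three bytes of "Man" encode to "TWFu".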

C code to serialize:

/* 
I'm using this format:
32 bit signed int                   32 bit signed int                   See below
[number of elements in outer array] [number of elements in inner array] [elements]

[elements] is built like
[element(0,0)][element(0,1)]...[element(0,y)][element(1,0)]...

Each element is sent as a 64-bit IEEE 754 "double". If your C compiler/architecture is doing something different with its "double"s, look forward to hours of fun :)

I'm using a couple of non-standard functions for byte-swapping here, originally from a BSD, but present in glibc>=2.9.
*/

#include <endian.h>   /* htobe32, htobe64 (non-standard; glibc >= 2.9) */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* int32_t, uint64_t */
#include <string.h>   /* memcpy */

/* Calculate the bytes required to store a message of x*y doubles */
size_t calculate_size(size_t x, size_t y)
{
    /* The two dimensions in the array  - each in 32 bits - (2 * 4)*/
    size_t sz = 8;  
    /* a 64-bit IEEE 754 double is by definition 8 bytes long :) */
    sz += ((x * y) * 8);    
    /* and a NUL */
    sz++;
    return sz;
}

/* Helpers */
static char* write_int32(int32_t, char*);
static char* write_double(double, char*);
/* Actual conversion. That wasn't so hard, was it? */
void convert_data(double** src, size_t x, size_t y, char* dst)
{

    dst = write_int32((int32_t) x, dst);    
    dst = write_int32((int32_t) y, dst);    

    /* size_t indices avoid a signed/unsigned comparison against x and y */
    for(size_t i = 0; i < x; i++) {
        for(size_t j = 0; j < y; j++) {
            dst = write_double(src[i][j], dst);
        }
    }
    *dst = '\0';
}


static char* write_int32(int32_t num,  char* c)
{
    char* byte; 
    int i = sizeof(int32_t); 
    /* Convert to network byte order */
    num = htobe32(num);
    byte = (char*) (&num);
    while(i--) {
        *c++ = *byte++;
    }
    return c;
}

static char* write_double(double d, char* c)
{
    /* Here I'm assuming your C programs use IEEE 754 'double' precision natively.
    If you don't, you should be able to convert into this format. A helper library most likely already exists for your platform.
    Note that IEEE 754 endianness isn't defined, but in practice, normal platforms use the same byte order as they do for integers.
*/
    char* byte; 
    int i = sizeof(uint64_t);
    uint64_t num;
    /* Copy the bits with memcpy (<string.h>); casting &d to uint64_t* would violate strict aliasing */
    memcpy(&num, &d, sizeof num);
    /* convert to network byte order */
    num = htobe64(num);
    byte = (char*) (&num);
    while(i--) {
        *c++ = *byte++; 
    }
    return c;
}
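
For testing the format on the C side without a Java client, a reader for the same layout can be handy. The sketch below is not part of the original answer: `read_int32`, `read_double`, and `parse_matrix` are hypothetical names, the output is a flat row-major array rather than a double**, and it assumes the platform's double is IEEE 754 binary64. It assembles the big-endian values with shifts, so it does not depend on the non-standard htobe/be64toh functions.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers, not from the original answer. */

/* Read a big-endian 32-bit signed integer and return the advanced cursor. */
static const unsigned char *read_int32(const unsigned char *c, int32_t *out)
{
    uint32_t u = ((uint32_t)c[0] << 24) | ((uint32_t)c[1] << 16) |
                 ((uint32_t)c[2] << 8)  |  (uint32_t)c[3];
    memcpy(out, &u, sizeof *out);
    return c + 4;
}

/* Read a big-endian 64-bit pattern and reinterpret it as a double
   (assumes IEEE 754 binary64). */
static const unsigned char *read_double(const unsigned char *c, double *out)
{
    uint64_t u = 0;
    for (int i = 0; i < 8; i++)
        u = (u << 8) | c[i];
    memcpy(out, &u, sizeof *out);
    return c + 8;
}

/* Decode [x][y][elements] into dst (row-major, capacity cap doubles).
   Returns 0 on success, -1 if the header is truncated, a dimension is
   negative, or the elements do not fit in the message or in dst. */
int parse_matrix(const unsigned char *msg, size_t msg_len,
                 int32_t *x, int32_t *y, double *dst, size_t cap)
{
    const unsigned char *p = msg;
    uint64_t count;

    if (msg_len < 8)
        return -1;
    p = read_int32(p, x);
    p = read_int32(p, y);
    if (*x < 0 || *y < 0)
        return -1;

    count = (uint64_t)*x * (uint64_t)*y;   /* cannot overflow: both < 2^31 */
    if (count > cap || count > (msg_len - 8) / 8)
        return -1;

    for (uint64_t i = 0; i < count; i++)
        p = read_double(p, &dst[i]);
    return 0;
}
```

The bounds checks matter because the dimensions come from the wire: a reader that trusts them blindly would read past the end of a truncated message.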

Java code to unserialize:

import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

/* The raw char array from C is now read into the byte[] `bytes` in Java */
DataInputStream stream = new DataInputStream(new ByteArrayInputStream(bytes));

int dim_x; int dim_y;
double[][] data;

try {   
    dim_x = stream.readInt();
    dim_y = stream.readInt();
    data = new double[dim_x][dim_y];
    for(int i = 0; i < dim_x; ++i) {
        for(int j = 0; j < dim_y; ++j) {
            data[i][j] = stream.readDouble();
        }
    }

    System.out.println("Client:");
    System.out.println("Dimensions: "+dim_x+" x "+dim_y);
    System.out.println("Data:");
    for(int i = 0; i < dim_x; ++i) {
        for(int j = 0; j < dim_y; ++j) {
            System.out.print(" "+data[i][j]);
        }
        System.out.println();
    }


} catch(IOException e) {
    System.err.println("Error reading input");
    System.err.println(e.getMessage());
    System.exit(1);
}

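Other tips

If you are writing a binary file, you should think of a good way to serialize the actual binary data (64 bits) of your double. This can range from writing the contents of the double directly to the file (minding endianness) to a more elaborate, normalizing serialization scheme (e.g. one with a well-defined representation of NaN). That is really up to you. If you expect to work among basically homogeneous architectures, a direct memory dump will probably suffice.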
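
As one possible reading of "a well-defined representation of NaN", the sketch below maps every NaN to a single quiet-NaN bit pattern before the value is written. The function name `canonical_bits` is hypothetical, and the code assumes an IEEE 754 binary64 double. The pattern 0x7ff8000000000000 is chosen because it is the value Java's Double.doubleToLongBits returns for any NaN.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not from the original answer: return the 64-bit
   pattern to serialize for d, collapsing all NaNs (any sign, any payload)
   into one canonical quiet NaN. Assumes IEEE 754 binary64. */
uint64_t canonical_bits(double d)
{
    uint64_t bits;

    if (isnan(d))
        return UINT64_C(0x7ff8000000000000);

    /* Every other value, including +/-0 and infinities, is kept bit for bit */
    memcpy(&bits, &d, sizeof bits);
    return bits;
}
```

The returned value is in host byte order, so it would still need to be converted to the byte order of the chosen format before being written out.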

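If you want to write to a text file and are looking for an ASCII representation, I would strongly advise against a decimal numerical representation. Instead, you can convert the 64 bits of raw data to ASCII using base64 or something similar.

You really want to keep all the precision you have in your double!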
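
To illustrate the "or something similar" option, here is a minimal sketch that uses hexadecimal rather than base64, since it is shorter to write by hand. The helper names `double_to_hex` and `hex_to_double` are hypothetical, and the code assumes an IEEE 754 binary64 double. Each value becomes exactly 16 ASCII characters, and because the text carries the raw bit pattern, decoding gives back the identical double.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helpers, not from the original answer. */

/* Write the bit pattern of d as 16 lowercase hex digits plus a NUL.
   buf must have room for at least 17 bytes. */
void double_to_hex(double d, char buf[17])
{
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);
    snprintf(buf, 17, "%016" PRIx64, bits);
}

/* Parse exactly 16 lowercase hex digits back into a double.
   Returns 0 on success, -1 on any other input. */
int hex_to_double(const char *hex, double *out)
{
    uint64_t bits = 0;

    for (int i = 0; i < 16; i++) {
        char c = hex[i];
        unsigned v;
        if (c >= '0' && c <= '9')
            v = (unsigned)(c - '0');
        else if (c >= 'a' && c <= 'f')
            v = (unsigned)(c - 'a' + 10);
        else
            return -1;          /* also catches a premature NUL */
        bits = (bits << 4) | v;
    }
    if (hex[16] != '\0')
        return -1;

    memcpy(out, &bits, sizeof *out);
    return 0;
}
```

For example, 0.1 is written as "3fb999999999999a". The fixed width also makes the buffer size easy to compute up front: 16 characters per element plus whatever delimiters are used.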

License: CC-BY-SA with attribution

Not affiliated with StackOverflow