What is the most efficient way to marshal C++ structs to C#?

https://stackoverflow.com/questions/878073

22-08-2019

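Question

I am about to start reading a large number of binary files, each with 1000 or more records. New files are added constantly, so I am writing a Windows service that monitors the directories and processes new files as they arrive. The files were written by a C++ program. I have recreated the struct definitions in C# and can read the data fine, but I am concerned that the way I am doing it will eventually kill my application.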

using (BinaryReader br = new BinaryReader(File.Open("myfile.bin", FileMode.Open)))
{
    long pos = 0L;
    long length = br.BaseStream.Length;

    CPP_STRUCT_DEF record;
    byte[] buffer = new byte[Marshal.SizeOf(typeof(CPP_STRUCT_DEF))];
    GCHandle pin;

    while (pos < length)
    {
        buffer = br.ReadBytes(buffer.Length);
        pin = GCHandle.Alloc(buffer, GCHandleType.Pinned);
        record = (CPP_STRUCT_DEF)Marshal.PtrToStructure(pin.AddrOfPinnedObject(), typeof(CPP_STRUCT_DEF));
        pin.Free();

        pos += buffer.Length;

        /* Do stuff with my record */
    }
}

I don't think I need to use GCHandle, because I am not actually communicating with the C++ app and everything is done from managed code, but I don't know of an alternative.

Solution

For your particular application, only one thing will give you a definitive answer: profiling it.

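That said, here are the lessons I have learned while working with large PInvoke solutions. The most effective way to marshal data is to marshal fields that are blittable, meaning the CLR can simply do what amounts to a memcpy to move the data between native and managed code. In simple terms, get all of the non-inline arrays and strings out of your structures. If they are present in the native structure, represent them with an IntPtr and marshal the values into managed code on demand.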
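As a rough illustration of the blittable rule, the sketch below contrasts a struct made only of primitive fields with one that embeds a string. Both struct layouts and the `CanPinArrayOf` helper are hypothetical (the real CPP_STRUCT_DEF is not shown in the question). The helper relies on the fact that `GCHandle.Alloc` with `GCHandleType.Pinned` throws `ArgumentException` for an array whose element type contains object references, so treat it as a quick sanity check rather than a complete blittability test.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record with primitive fields only: it can be copied as raw bytes.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct BlittableRecord
{
    public int Id;
    public double Value;
    public long Timestamp;
}

// Hypothetical record with an embedded string: it needs field-by-field marshalling.
[StructLayout(LayoutKind.Sequential, Pack = 1, CharSet = CharSet.Ansi)]
public struct TextRecord
{
    public int Id;
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 16)]
    public string Name;
}

public static class BlittableCheck
{
    // Returns true when an array of T can be pinned, which fails for
    // element types that hold references such as string fields.
    public static bool CanPinArrayOf<T>() where T : struct
    {
        try
        {
            GCHandle handle = GCHandle.Alloc(new T[1], GCHandleType.Pinned);
            handle.Free();
            return true;
        }
        catch (ArgumentException)
        {
            return false;
        }
    }
}
```

Calling `BlittableCheck.CanPinArrayOf<TextRecord>()` once at startup is a cheap way to catch a struct definition that has drifted away from a memcpy-friendly layout.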

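I have never profiled the difference between using Marshal.PtrToStructure and having a native API dereference the value. That is probably something worth investing in if profiling reveals PtrToStructure to be a bottleneck.

For large hierarchies, marshal on demand rather than pulling the entire structure into managed code at once. I have run into this issue most often when dealing with large tree structures. Marshalling an individual node is very fast if it is blittable, and performance-wise it works out best to marshal only what you need at that moment.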
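A minimal sketch of the on-demand idea, assuming a hypothetical native tree node whose child link is kept as a raw pointer. `NativeNode`, `ReadNode`, and `TryReadChild` are illustrative names, not part of the original answer. Only the node that is actually requested is copied into managed memory.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical native node: the child stays a raw pointer, so the struct is blittable.
[StructLayout(LayoutKind.Sequential)]
public struct NativeNode
{
    public int Value;
    public IntPtr FirstChild; // address of another NativeNode, or IntPtr.Zero
}

public static class OnDemandMarshal
{
    // Copies a single node from native memory into a managed struct.
    public static NativeNode ReadNode(IntPtr address)
    {
        return Marshal.PtrToStructure<NativeNode>(address);
    }

    // Follows the child pointer only when the caller asks for it.
    public static bool TryReadChild(NativeNode parent, out NativeNode child)
    {
        if (parent.FirstChild == IntPtr.Zero)
        {
            child = default(NativeNode);
            return false;
        }
        child = ReadNode(parent.FirstChild);
        return true;
    }
}
```

The rest of the tree stays in native memory until a caller walks to it, which avoids paying the marshalling cost for nodes that are never visited.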

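Other tips

Using Marshal.PtrToStructure is rather slow. I found the following CodeProject article, which compares (and benchmarks) different ways of reading binary data, very helpful:

Fast Binary File Reading with C#

In addition to JaredPar's comprehensive answer: you don't need to use GCHandle; you can use unsafe code instead.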

fixed(byte *pBuffer = buffer) {
     record = *((CPP_STRUCT_DEF *)pBuffer);
}

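The whole purpose of the GCHandle/fixed statement is to pin (fix) a particular memory segment so that the memory is immovable from the GC's point of view. If the memory were movable, any relocation would render your pointers invalid.

I am not sure which way is faster, though.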
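On runtimes that provide `System.Memory` (an assumption; the thread predates it), a third option avoids both GCHandle and unsafe code. This is a different technique from the two discussed above: `MemoryMarshal.Cast` reinterprets a span of bytes as a span of structs. `SpanRecord` is a hypothetical layout standing in for CPP_STRUCT_DEF.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical record layout; must contain no reference-type fields.
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct SpanRecord
{
    public int Id;
    public double Value;
}

public static class SpanRecordReader
{
    // Reinterprets the buffer as records and copies them into a new array.
    // Trailing bytes that do not fill a whole record are ignored by the cast.
    public static SpanRecord[] ReadAll(byte[] buffer)
    {
        ReadOnlySpan<byte> bytes = buffer;
        ReadOnlySpan<SpanRecord> records = MemoryMarshal.Cast<byte, SpanRecord>(bytes);
        return records.ToArray();
    }
}
```

Like the `fixed` version, this assumes the file uses the same byte order and packing as the managed struct.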

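This may be outside the bounds of your question, but I would be inclined to write a small assembly in Managed C++ that does an fread() or something similarly fast to read in the structs. Once they are read in, you can use C# to do everything else you need with them.

Here is a small class I wrote a while back while playing with structured files. It was the fastest method I could come up with at the time short of going unsafe (which was what I was trying to replace while keeping comparable performance).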

using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.InteropServices;

namespace PersonalUse.IO {

    // T must be a blittable struct: GCHandle can only pin arrays of such types
    public sealed class RecordReader<T> : IDisposable, IEnumerable<T> where T : struct {

        const int DEFAULT_STREAM_BUFFER_SIZE = 1 << 16; // default stream buffer (64k)
        const int DEFAULT_RECORD_BUFFER_SIZE = 100; // default record buffer (100 records)

        readonly long _fileSize; // size of the underlying file
        readonly int _recordSize; // size of the record structure
        byte[] _buffer; // the buffer itself, [record buffer size] * _recordSize
        FileStream _fs;

        T[] _structBuffer;
        GCHandle _h; // handle/pinned pointer to _structBuffer 

        int _recordsInBuffer; // how many records are in the buffer
        int _bufferIndex; // the index of the current record in the buffer
        long _recordPosition; // position of the record in the file

        /// <overloads>Initializes a new instance of the <see cref="RecordReader{T}"/> class.</overloads>
        /// <summary>
        /// Initializes a new instance of the <see cref="RecordReader{T}"/> class.
        /// </summary>
        /// <param name="filename">filename to be read</param>
        public RecordReader(string filename) : this(filename, DEFAULT_STREAM_BUFFER_SIZE, DEFAULT_RECORD_BUFFER_SIZE) { }

        /// <summary>
        /// Initializes a new instance of the <see cref="RecordReader{T}"/> class.
        /// </summary>
        /// <param name="filename">filename to be read</param>
        /// <param name="streamBufferSize">buffer size for the underlying <see cref="FileStream"/>, in bytes.</param>
        public RecordReader(string filename, int streamBufferSize) : this(filename, streamBufferSize, DEFAULT_RECORD_BUFFER_SIZE) { }

        /// <summary>
        /// Initializes a new instance of the <see cref="RecordReader{T}"/> class.
        /// </summary>
        /// <param name="filename">filename to be read</param>
        /// <param name="streamBufferSize">buffer size for the underlying <see cref="FileStream"/>, in bytes.</param>
        /// <param name="recordBufferSize">size of record buffer, in records.</param>
        public RecordReader(string filename, int streamBufferSize, int recordBufferSize) {
            _fileSize = new FileInfo(filename).Length;
            _recordSize = Marshal.SizeOf(typeof(T));
            _buffer = new byte[recordBufferSize * _recordSize];
            _fs = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.None, streamBufferSize, FileOptions.SequentialScan);

            _structBuffer = new T[recordBufferSize];
            _h = GCHandle.Alloc(_structBuffer, GCHandleType.Pinned);

            FillBuffer();
        }

        // fill the buffer, reset position
        void FillBuffer() {
            int bytes = _fs.Read(_buffer, 0, _buffer.Length);
            // copy only the bytes actually read, so stale data from an earlier fill is not re-copied
            Marshal.Copy(_buffer, 0, _h.AddrOfPinnedObject(), bytes);
            _recordsInBuffer = bytes / _recordSize;
            _bufferIndex = 0;
        }

        /// <summary>
        /// Read a record
        /// </summary>
        /// <returns>a record of type T</returns>
        public T Read() {
            if(_recordsInBuffer == 0)
                return new T(); //EOF
            if(_bufferIndex < _recordsInBuffer) {
                // update positional info
                _recordPosition++;
                return _structBuffer[_bufferIndex++];
            } else {
                // refill the buffer
                FillBuffer();
                return Read();
            }
        }

        /// <summary>
        /// Advances the record position without reading.
        /// </summary>
        public void Next() {
            if(_recordsInBuffer == 0)
                return; // EOF
            else if(_bufferIndex < _recordsInBuffer) {
                _bufferIndex++;
                _recordPosition++;
            } else {
                FillBuffer();
                Next();
            }
        }

        public long FileSize {
            get { return _fileSize; }
        }

        public long FilePosition {
            get { return _recordSize * _recordPosition; }
        }

        public long RecordSize {
            get { return _recordSize; }
        }

        public long RecordPosition {
            get { return _recordPosition; }
        }

        public bool EOF {
            get { return _recordsInBuffer == 0; }
        }

        public void Close() {
            Dispose(true);
        }

        void Dispose(bool disposing) {
            try {
                if(disposing && _fs != null) {
                    _fs.Close();
                }
            } finally {
                if(_fs != null) {
                    _fs = null;
                    _buffer = null;
                    _recordPosition = 0;
                    _bufferIndex = 0;
                    _recordsInBuffer = 0;
                }
                if(_h.IsAllocated) {
                    _h.Free();
                    _structBuffer = null;
                }
            }
        }

        #region IDisposable Members

        public void Dispose() {
            Dispose(true);
        }

        #endregion

        #region IEnumerable<T> Members

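        public IEnumerator<T> GetEnumerator() {
            while(true) {
                // refill a drained buffer first, so end of file ends the loop
                // instead of yielding one extra default(T) record
                if(_recordsInBuffer != 0 && _bufferIndex >= _recordsInBuffer)
                    FillBuffer();
                if(_recordsInBuffer == 0)
                    yield break;
                yield return Read();
            }
        }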

        #endregion

        #region IEnumerable Members

        System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator() {
            return GetEnumerator();
        }

        #endregion

    } // end class

} // end namespace

Usage:

using(RecordReader<CPP_STRUCT_DEF> reader = new RecordReader<CPP_STRUCT_DEF>(path)) {
    foreach(CPP_STRUCT_DEF record in reader) {
        // do stuff
    }
}

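(I am pretty new here, so I hope that wasn't too much to post. I just pasted in the class and didn't trim the comments or anything else to shorten it.)

This doesn't seem to have anything to do with either C++ or marshalling. You know the structure; what else do you need?

Obviously you need some simple code that reads the group of bytes representing one struct and then uses BitConverter to place the bytes into the corresponding C# fields.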
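A minimal sketch of that BitConverter approach, assuming a hypothetical record made of a 4-byte int followed by an 8-byte double with no padding. The real field list of CPP_STRUCT_DEF is not given in the question, so `BitRecord` and its offsets are placeholders.

```csharp
using System;

// Hypothetical managed record; field order mirrors the assumed file layout.
public struct BitRecord
{
    public int Id;
    public double Value;
}

public static class BitConverterParser
{
    // Assumed on-disk size: int (4 bytes) + double (8 bytes), no padding.
    public const int RecordSize = sizeof(int) + sizeof(double);

    // Decodes one record starting at the given byte offset.
    public static BitRecord Parse(byte[] buffer, int offset)
    {
        return new BitRecord
        {
            Id = BitConverter.ToInt32(buffer, offset),
            Value = BitConverter.ToDouble(buffer, offset + sizeof(int))
        };
    }
}
```

This needs no pinning and no interop attributes, at the cost of one explicit line per field. BitConverter uses the machine's byte order, so the sketch also assumes the files were written on a platform with the same endianness.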

License: CC-BY-SA with attribution

Not affiliated with StackOverflow