Question

I have an application that reads and writes files heavily (in a custom format), and I was told I could improve performance by calling unmanaged code directly. Before trying it in the real application I wrote a small test to see what the gain would be, but to my surprise the unmanaged version is about 8x slower than simply using FileStream.

Here is the managed function:

    private int length = 100000;
    private TimeSpan tspan;

    private void UsingManagedFileHandle()
    {
        DateTime initialTime = DateTime.Now;

        using (FileStream fileStream = new FileStream("data2.txt", FileMode.Create, FileAccess.ReadWrite))
        {
            string line = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890123";
            byte[] bytes = Encoding.Unicode.GetBytes(line);

            for (int i = 0; i < length; i++)
            {
                fileStream.Write(bytes, 0, bytes.Length);
            }

            fileStream.Close();
        }

        this.tspan = DateTime.Now.Subtract(initialTime);
        label2.Text = "" + this.tspan.TotalMilliseconds + " Milliseconds";
    }

Here is the unmanaged way:

    public void UsingAnUnmanagedFileHandle()
    {

        DateTime initialTime;
        IntPtr hFile;

        hFile = IntPtr.Zero;

        hFile = FileInteropFunctions.CreateFile("data1.txt",
            FileInteropFunctions.GENERIC_WRITE | FileInteropFunctions.GENERIC_READ,
            FileInteropFunctions.FILE_SHARE_WRITE,
            IntPtr.Zero,
            FileInteropFunctions.CREATE_ALWAYS,
            FileInteropFunctions.FILE_ATTRIBUTE_NORMAL, 
            0);

        uint lpNumberOfBytesWritten = 0;

        initialTime = DateTime.Now;

        if (hFile.ToInt64() > 0)
        {
            string line = "ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890123"; 
            byte[] bytes = Encoding.Unicode.GetBytes(line);
            uint bytesLen = (uint)bytes.Length;

            for (int i = 0; i < length; i++)
            {
                FileInteropFunctions.WriteFile(hFile,
                        bytes,
                        bytesLen,
                        out lpNumberOfBytesWritten,
                        IntPtr.Zero);
            }

            FileInteropFunctions.CloseHandle(hFile);

            this.tspan = DateTime.Now.Subtract(initialTime);
            label1.Text = "" + this.tspan.TotalMilliseconds + " Milliseconds";

        }
        else
            label1.Text = "Error";

    }

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern bool CloseHandle(IntPtr hObject);

    [DllImport("kernel32.dll", SetLastError = true)]
    public static extern unsafe IntPtr CreateFile(
        String lpFileName,              // Filename
        uint dwDesiredAccess,              // Access mode
        uint dwShareMode,              // Share mode
        IntPtr attr,                   // Security Descriptor
        uint dwCreationDisposition,           // How to create
        uint dwFlagsAndAttributes,           // File attributes
        uint hTemplateFile);               // Handle to template file


    [DllImport("kernel32.dll")]
    public static extern unsafe int WriteFile(IntPtr hFile,
        // byte[] lpBuffer,
        [MarshalAs(UnmanagedType.LPArray)] byte[] lpBuffer, // also tried this.
        uint nNumberOfBytesToWrite, 
        out uint lpNumberOfBytesWritten,
        IntPtr lpOverlapped);

The loop using FileStream takes about 70 ms on my computer. The one using WriteFile takes about 550 ms.

I tested several times with different iteration counts, and the difference in performance is consistent.

I have no idea why the unmanaged code is slower than the managed code.

EDIT

Thank you very much for your explanations, guys. I thought there was something "magical" going on inside FileStream, and you have explained it very well. So I now know there is no easy way to gain performance in this part, and I would like to ask for your opinions on other simple ways to gain speed. In the real application the file is accessed randomly, and its size can range from 1 MB to 1 GB.

Solution 2

Well, FileStream is just a wrapper around CreateFile/WriteFile, and it was written by a bunch of smart people. So I see no logical reason to assume that your own version should be faster :P.

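As already stated, FileStream buffers writes itself before calling WriteFile() (its default buffer is 4096 bytes), which minimizes the number of unmanaged calls. This is important: only make unmanaged calls when they are necessary, because each one has a cost. Your loop writes 78 bytes per iteration, so the raw version makes 100,000 WriteFile() calls while FileStream makes only a small fraction of that. Buffer sizes are usually a multiple of the disk sector size. You can experiment with different sizes, though the best value is OS dependent and will most likely yield different results on other computers.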
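To make the effect of buffering concrete, here is a minimal sketch that counts how many writes actually reach the underlying stream with and without a buffer in front of it. It uses BufferedStream as a stand-in for FileStream's internal buffer, and `CountingStream`, `BufferingDemo` and `CountSinkWrites` are hypothetical names invented for this illustration. It measures call counts only, not time.

```csharp
using System;
using System.IO;

// Hypothetical helper: a sink that only counts how often Write is called on it.
public sealed class CountingStream : Stream
{
    public int WriteCalls { get; private set; }
    public long BytesWritten { get; private set; }

    public override bool CanRead => false;
    public override bool CanSeek => false;
    public override bool CanWrite => true;
    public override long Length => BytesWritten;
    public override long Position
    {
        get => BytesWritten;
        set => throw new NotSupportedException();
    }

    public override void Write(byte[] buffer, int offset, int count)
    {
        WriteCalls++;          // one call here stands for one WriteFile() call
        BytesWritten += count;
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) => throw new NotSupportedException();
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
}

public static class BufferingDemo
{
    // Writes `iterations` chunks of `chunkSize` bytes and returns how many
    // Write calls reached the sink. bufferSize == 0 means "no buffering".
    public static int CountSinkWrites(int iterations, int chunkSize, int bufferSize)
    {
        var sink = new CountingStream();
        byte[] chunk = new byte[chunkSize];

        Stream target = bufferSize > 0 ? new BufferedStream(sink, bufferSize) : (Stream)sink;
        for (int i = 0; i < iterations; i++)
        {
            target.Write(chunk, 0, chunk.Length);
        }
        target.Flush(); // push out whatever is still sitting in the buffer

        return sink.WriteCalls;
    }
}
```

Calling `BufferingDemo.CountSinkWrites(100000, 78, 0)` and then the same with a buffer size of 4096 shows the gap: the unbuffered run hits the sink once per iteration, while the buffered run hits it far less often. The exact buffered count depends on the runtime's buffering strategy.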

It is also important to know that WriteFile() is buffered too, by the OS file cache. Calling WriteFile() does not mean the data is on disk when the call returns; it is flushed to the drive later, when the OS decides to.

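I also suspect some unnecessary byte[] marshaling overhead on each WriteFile() call. A byte[] is blittable, so the marshaler normally pins it rather than copying it, but the call still goes through the interop transition every time. Declaring the parameter as a pointer and using the unsafe keyword with a fixed block, plus a little bit of hacking, should keep that overhead to a minimum.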

There is also FILE_FLAG_SEQUENTIAL_SCAN, which FileStream exposes as FileOptions.SequentialScan. It tells the system that you are going to read or write the file only sequentially, which in theory might give some performance boost.
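For what it's worth, FileStream has a constructor overload that takes a FileOptions value, and FileOptions.SequentialScan and FileOptions.RandomAccess correspond to the Win32 FILE_FLAG_SEQUENTIAL_SCAN and FILE_FLAG_RANDOM_ACCESS flags. The sketch below shows one way to pass such a hint. `FileHints` and `OpenWithHint` are hypothetical names, and whether the hint helps at all is something to measure on the real workload.

```csharp
using System.IO;

public static class FileHints
{
    // Opens (and truncates) a file with an access-pattern hint for the OS cache.
    // sequential == true  -> FileOptions.SequentialScan
    // sequential == false -> FileOptions.RandomAccess
    public static FileStream OpenWithHint(string path, bool sequential, int bufferSize = 4096)
    {
        FileOptions hint = sequential ? FileOptions.SequentialScan : FileOptions.RandomAccess;
        return new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                              FileShare.None, bufferSize, hint);
    }
}
```

Since the real application accesses the file randomly, `OpenWithHint(path, sequential: false)` is probably the variant worth trying first.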

OTHER TIPS

Each of your unmanaged calls goes straight to the OS, while FileStream is buffered (i.e. it does most operations in memory and calls the underlying unmanaged functions much less often).

There are constructors on FileStream that let you control the buffer size if you want to tweak performance further.
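As a rough sketch of that tweak, the method below repeats the question's write loop with a caller-supplied buffer size and times it with Stopwatch. `BufferSizeDemo` and `TimedWrite` are hypothetical names, and any value you pass, such as 64 * 1024, is only a starting point for experiments, not a recommendation.

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Text;

public static class BufferSizeDemo
{
    // Writes the question's 78-byte line `iterations` times through a FileStream
    // created with the given buffer size, and returns the elapsed time.
    public static TimeSpan TimedWrite(string path, int iterations, int bufferSize)
    {
        byte[] bytes = Encoding.Unicode.GetBytes("ABCDEFGHIJKLMNOPQRSTUVWXYZ1234567890123");
        Stopwatch watch = Stopwatch.StartNew();

        using (var stream = new FileStream(path, FileMode.Create, FileAccess.ReadWrite,
                                           FileShare.None, bufferSize))
        {
            for (int i = 0; i < iterations; i++)
            {
                stream.Write(bytes, 0, bytes.Length);
            }
        } // Dispose flushes the remaining buffered bytes and closes the handle

        watch.Stop();
        return watch.Elapsed;
    }
}
```

Running it with a few sizes (4096, 16 * 1024, 64 * 1024) against the same path gives a quick feel for where the returns diminish on a given machine.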

The difference is because the calls to WriteFile are synchronous while the writes to the FileStream are not.

By default CreateFile creates a synchronous file handle, so calls to WriteFile do not return until the data is written. If you add FILE_FLAG_OVERLAPPED to the CreateFile call, the unmanaged implementation should take approximately the same time as the managed one. Note that WriteFile on an overlapped handle must be given a valid OVERLAPPED structure instead of IntPtr.Zero.

See the "Synchronous and Asynchronous I/O Handles" section of the CreateFile documentation.
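On the managed side, the counterpart of FILE_FLAG_OVERLAPPED is FileOptions.Asynchronous, combined with the async stream methods. The following is a minimal sketch of that combination; `AsyncWriteDemo` and `WriteAsync` are hypothetical names, and whether it changes the timing for many small sequential writes is something to measure rather than assume.

```csharp
using System.IO;
using System.Threading.Tasks;

public static class AsyncWriteDemo
{
    // Writes `chunk` to `path` `iterations` times using a FileStream that was
    // opened for asynchronous (overlapped) I/O.
    public static async Task WriteAsync(string path, byte[] chunk, int iterations)
    {
        using (var stream = new FileStream(path, FileMode.Create, FileAccess.Write,
                                           FileShare.None, 4096, FileOptions.Asynchronous))
        {
            for (int i = 0; i < iterations; i++)
            {
                // Small writes still land in the stream's buffer first.
                await stream.WriteAsync(chunk, 0, chunk.Length);
            }
            await stream.FlushAsync();
        }
    }
}
```

From synchronous code it can be driven with `AsyncWriteDemo.WriteAsync(path, bytes, length).GetAwaiter().GetResult()`, although awaiting it from an async method is the more usual pattern.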

Licensed under: CC-BY-SA with attribution