Question

I have been prototyping an approach to integrating Lucene.NET with SQL Server using SQL Server FileTable. The usage is very convenient and the code is simple - I don't have to do anything special to customize Lucene.NET. The advantage I seek is mostly operational and enterprisey - my company operates SQL Server 24/7, and keeping the search index in the same control space has a lot of advantages for us (...and I do realize I won't get precise transactional consistency; that is OK).

Problem: No matter what I do, there seems to be roughly 200 ms (±20-30 ms) of overhead when writing a file to the SQL FileTable UNC share via the Win32 API (through System.IO.FileStream). This is significant for Lucene.NET because an index write operation directly against my local file system takes ~50 ms, whereas the same operation against the FileTable takes ~2-3 seconds!

To sanity-check this, I created another experiment that creates and writes three new files at 10KB, 1MB and 10MB. I wrote these three files to:

  1. A local directory (c:\Search\)
  2. A non-FileTable share via UNC path (\\127.0.0.1\Search\)
  3. A FileTable UNC path (\\127.0.0.1\[instance-share]\search_index\)

Using System.Diagnostics.Stopwatch, writing to the local directory was fastest, as expected; writing to the non-FileTable share was slower but comparable; and the FileTable was an order of magnitude slower. What was interesting is that for the two larger file sizes, the gap between cases 2 and 3 stayed roughly constant, which leads me to believe a fixed per-file creation overhead dominates the timings.
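To separate the two costs, a small follow-up probe could time file creation and the data write independently (a sketch, not part of the test run above; `CreateVsWriteProbe` and its method name are mine):

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class CreateVsWriteProbe
{
    // Returns (createMs, writeMs) for one 1 MB file in the given directory,
    // so a fixed creation overhead shows up separately from write throughput.
    public static Tuple<long, long> Probe(string dir)
    {
        var payload = new byte[1024 * 1024];
        string file = Path.Combine(dir, Guid.NewGuid() + ".bin");

        var sw = Stopwatch.StartNew();
        using (File.Create(file)) { }              // time the creation alone
        sw.Stop();
        long createMs = sw.ElapsedMilliseconds;

        sw.Restart();
        using (var fs = new FileStream(file, FileMode.Open, FileAccess.Write))
        {
            fs.Write(payload, 0, payload.Length);  // time the data write alone
        }
        sw.Stop();

        return Tuple.Create(createMs, sw.ElapsedMilliseconds);
    }
}
```

Pointing this at each of the three directories would show directly whether the ~200 ms sits in the create call or in the writes.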

Question: Does anyone have deeper insight into why file creation with FileTable is so "slow"?

This is a development VM with little concurrent activity (4GB RAM, 2 vCPU, probably some IO contention, but these tests are meant as relative comparisons). An insert of a trivial row into SQL Server on this box barely hits 1 ms.

I don't have the code handy but will post with an edit soon (with exact timings) - it is very simple and just writes 4K chunks of a statically initialized byte array in a loop to the desired size.

I did implement the recommendations in the following article, and also tuned the SMB stack, with no difference in performance: http://blogs.msdn.com/b/blogdoezequiel/archive/2011/02/11/best-practices-on-filestream-implementations.aspx#.UkbEYtKshcZ

EDIT: Timings from output test console:

Writing files for directory: c:\Search
        Writing file size : 10240
        Writing file size : 1048576
        Writing file size : 10485760
Writing files for directory: \\127.0.0.1\Search
        Writing file size : 10240
        Writing file size : 1048576
        Writing file size : 10485760
Writing files for directory: \\127.0.0.1\Sql2012\FIndex\search_index
        Writing file size : 10240
        Writing file size : 1048576
        Writing file size : 10485760

Write Timings (milliseconds)
---------------------------------------------------------------
            10KB     1MB     10MB
Local:         3       2       17
Share:        28      31       64
FTable:      205     249      317

Source code (very simple, posted for completeness) - console Main, with the ASCII-art display code omitted:

private static readonly string[] paths = new string[] 
{
    @"c:\Search",
    @"\\127.0.0.1\Search",
    @"\\127.0.0.1\Sql2012\FIndex\search_index"
};

private static readonly int[] sizes = new int[] 
{
    1024 * 10,
    1024 * 1024,
    1024 * 1024 * 10
};

static void Main(string[] args)
{
    // timings[path index, size index]
    var timings = new long[3, 3];
    var stopwatch = new Stopwatch();
    for(var x = 0; x < 3; x++)
    {
        Console.WriteLine("Writing files for directory: {0}", paths[x]);
        for(var y = 0; y < 3; y++)
        {
            Console.WriteLine("\tWriting file size : {0}", sizes[y]);
            string fileName = Path.Combine(paths[x], Guid.NewGuid().ToString() + ".bin");
            stopwatch.Start();
            FileIOTestHelper.WriteFile(fileName, sizes[y]);
            stopwatch.Stop();
            timings[x, y] = stopwatch.ElapsedMilliseconds;
            stopwatch.Reset();
        }
    }

    // ASCII art display code omitted
}

Implementation class:

public static class FileIOTestHelper
{
    private static readonly byte[] testBuffer = CreateBuffer();

    private static byte[] CreateBuffer()
    {
        var buffer = new byte[4096];
        for (var i = 0; i < 4096; i++)
        {
            buffer[i] = (byte)(i % 256);
        }

        ForceIOJit(buffer);
        return buffer;
    }

    private static void ForceIOJit(byte[] initBuffer)
    {
        // Shouldn't matter, but eliminating any possible warm-up cost.
        // GetTempFileName() creates the file, so FileMode.Open succeeds;
        // delete it afterwards so the probe doesn't leak temp files.
        var tempFile = Path.GetTempFileName();
        try
        {
            using (var fs = new FileStream(tempFile, FileMode.Open))
            {
                fs.Write(initBuffer, 0, initBuffer.Length);
                fs.Flush();
            }
        }
        finally
        {
            File.Delete(tempFile);
        }
    }

    public static void WriteFile(string name, int sizeInBytes)
    {
        var count = sizeInBytes / 4096;
        var remainder = sizeInBytes % 4096;
        using (var fs = new FileStream(name, FileMode.Create))
        {
            for (int i = 0; i < count; i++)
            {
                fs.Write(testBuffer, 0, 4096);
            }

            if (remainder > 0)
            {
                fs.Write(testBuffer, 0, remainder);
            }   

            fs.Flush();
        }
    }
}

Solution

After a lot more testing, the latency I am experiencing is consistent and not sensitive to the buffer size. I am accepting this as a limitation of FileTable: it is less than ideal for very chatty IO.
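For anyone who still wants the operational benefits, one workaround worth sketching (untested against the setup above; `IndexPublisher` and both paths are hypothetical) is to build the index against the fast local disk and copy the finished files to the FileTable share in one pass, so the per-file creation cost is paid only at publish time rather than on every chatty index write:

```csharp
using System.IO;

static class IndexPublisher
{
    // Copies every file from the local index directory to the FileTable
    // share, overwriting any previous copy of the same segment file.
    public static void Publish(string localIndexDir, string fileTableDir)
    {
        Directory.CreateDirectory(fileTableDir);
        foreach (var source in Directory.GetFiles(localIndexDir))
        {
            var target = Path.Combine(fileTableDir, Path.GetFileName(source));
            File.Copy(source, target, overwrite: true);
        }
    }
}
```

This trades index freshness on the share for throughput, which may be acceptable given that precise transactional consistency was already out of scope.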

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow