Question

I am looking at a write optimization for CIFS/SMB that suppresses the writing of duplicate blocks. For example, I read a file from the remote share and modify a portion near the end of it. When I save the file, I only want to send write requests to the remote side for the portions that have actually changed. So basically: suppress all writes up to the point at which a non-duplicate write is encountered. From that point on, suppression is disabled and writes go through as usual. The problem is that I can't find any documentation, in MS-SMB, MS-SMB2, MS-CIFS or elsewhere, that indicates whether this is a valid thing to do. Does anyone know if it is?
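To make the idea concrete, here is a minimal local sketch of the suppression logic using cmp and dd. The helper names first_changed_offset and write_changed_tail are hypothetical, the sketch assumes an unmodified copy of the file as originally read is still available, and it does not handle a file that got shorter. It only illustrates the client-side decision and says nothing about what the protocol permits.

```shell
# first_changed_offset ORIGINAL MODIFIED [BLOCK_SIZE]   (hypothetical helper)
# Prints the byte offset, rounded down to a block boundary, of the first
# block that differs. Prints nothing if the two files are identical.
first_changed_offset() {
    fco_orig=$1; fco_new=$2; fco_bs=${3:-4096}
    # cmp -l prints one line per differing byte: "<1-based position> <octal> <octal>".
    # "|| true" only keeps strict shells (set -e, pipefail) from aborting on cmp's status.
    fco_pos=$( { cmp -l "$fco_orig" "$fco_new" 2>/dev/null || true; } | head -n 1 | awk '{print $1}')
    if [ -z "$fco_pos" ]; then
        fco_a=$(( $(wc -c < "$fco_orig") ))
        fco_b=$(( $(wc -c < "$fco_new") ))
        if [ "$fco_a" -eq "$fco_b" ]; then return 0; fi   # identical files
        # One file is a prefix of the other: the change starts at the shorter length.
        if [ "$fco_a" -lt "$fco_b" ]; then fco_pos=$((fco_a + 1)); else fco_pos=$((fco_b + 1)); fi
    fi
    echo $(( (fco_pos - 1) / fco_bs * fco_bs ))
}

# write_changed_tail ORIGINAL MODIFIED TARGET [BLOCK_SIZE]   (hypothetical helper)
# Skips every leading block that is unchanged, then writes the rest as usual.
write_changed_tail() {
    wct_bs=${4:-4096}
    wct_off=$(first_changed_offset "$1" "$2" "$wct_bs")
    if [ -z "$wct_off" ]; then return 0; fi   # nothing changed: every write is suppressed
    wct_blk=$((wct_off / wct_bs))
    dd if="$2" of="$3" bs="$wct_bs" skip="$wct_blk" seek="$wct_blk" conv=notrunc 2>/dev/null
}
```

A call could look like `write_changed_tail original.bin edited.bin /media/remote/test.bin 4194304`, where all three file names are placeholders.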

Solution

Dig into the sources of the Linux kernel: they contain documentation on CIFS, both in the code and as text files, e.g. http://www.mjmwired.net/kernel/Documentation/filesystems/cifs.txt

If you want to study how the CIFS protocol behaves, you can test it with the Unix command "dd". Mount a remote file system via CIFS, e.g. at /media/remote, and change into that folder:

cd /media/remote

Now create a file with some random content (e.g. from the kernel's random pool):

dd if=/dev/urandom of=test.bin bs=4M count=5

In this example you should see roughly 20 MB of traffic (5 blocks of 4 MB). Then create another, smaller file somewhere on your own machine, say in your home folder:

dd if=/dev/urandom of="$HOME/test_chunk.bin" bs=4M count=1

The interesting part is what happens when you write that chunk to a specific position in the remote test file:

dd if="$HOME/test_chunk.bin" of=test.bin bs=4M count=1 seek=3 conv=notrunc

This should change only block #4 of 5 in the target file: seek=3 skips the first three 4 MB blocks of the output, and conv=notrunc keeps the rest of the file intact. I guess you can adjust the block size ... I did this with 4 MB blocks. But it should help to understand what happens on the network.
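To check which blocks of the file actually ended up different after the overwrite, a small helper along these lines may be useful. It is only a sketch: changed_blocks is a made-up name, and it assumes you kept a local copy of test.bin from before the overwrite (called before.bin here, also made up). It inspects file contents only. To see the write requests on the wire you would still need a packet capture.

```shell
# changed_blocks FILE_A FILE_B [BLOCK_SIZE]   (hypothetical helper)
# Prints the 1-based number of every block in which the two files differ.
# The default block size is 4 MiB, matching bs=4M in the dd commands.
changed_blocks() {
    # cmp -l lists each differing byte as "<1-based position> <octal> <octal>".
    # awk maps each position to its block number and prints each block once.
    { cmp -l "$1" "$2" 2>/dev/null || true; } |
        awk -v bs="${3:-4194304}" '{ b = int(($1 - 1) / bs) + 1; if (b != last) { print b; last = b } }'
}
```

For the experiment above, `changed_blocks "$HOME/before.bin" /media/remote/test.bin` would be expected to print only 4.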

OTHER TIPS

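The CIFS protocol does allow applications to write back specific portions of a file. This is controlled by the Offset and DataLength parameters of the SMB WriteAndX (SMB_COM_WRITE_ANDX) request: Offset is the position in the file at which the write starts, and DataLength is the number of bytes to write. (DataOffset, by contrast, only locates the data within the SMB message itself.)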

Documentation for the same can be found here: http://msdn.microsoft.com/en-us/library/ee441954.aspx

The client can use these fields to write a specific length of data to specific offsets within the file.
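From the application side, such a positional write can be tried with dd on a mounted share. The following is a minimal sketch with a hypothetical helper name, write_range. How the kernel client splits or coalesces the resulting write requests is up to the client, so treat this as an illustration of offset plus length rather than a one-to-one picture of the packets.

```shell
# write_range SOURCE TARGET OFFSET LENGTH   (hypothetical helper)
# Copies LENGTH bytes starting at byte OFFSET of SOURCE to the same
# offset in TARGET and leaves every other byte of TARGET untouched.
write_range() {
    # bs=1 makes skip/seek/count byte-granular; conv=notrunc keeps the tail of TARGET.
    dd if="$1" of="$2" bs=1 skip="$3" seek="$3" count="$4" conv=notrunc 2>/dev/null
}
```

With bs=1 this is slow for large ranges. It is meant for experiments, not for bulk transfers.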

Similar support exists in more recent versions of the protocol as well ...

The SMB protocol already supports this kind of partial write. An append, for instance, works this way: the client reads the file's EOF position and then writes the new data with the offset set to that EOF value and the length set to the number of bytes being appended.
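The append pattern described above can be imitated locally as follows. This is a sketch only: append_at_eof is a hypothetical helper, and it assumes no other client changes the file between reading the EOF position and writing.

```shell
# append_at_eof DATA TARGET   (hypothetical helper)
# Reads the current EOF of TARGET, writes DATA at that offset, and
# prints "<offset> <length>" for the write it performed.
append_at_eof() {
    aae_eof=$(( $(wc -c < "$2") ))   # current end-of-file position of TARGET
    aae_len=$(( $(wc -c < "$1") ))   # number of bytes to append
    dd if="$1" of="$2" bs=1 seek="$aae_eof" conv=notrunc 2>/dev/null
    echo "$aae_eof $aae_len"
}
```

Nothing before the old EOF is rewritten, which is the same effect the question is after for unchanged leading blocks.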

Licensed under: CC-BY-SA with attribution