Question

I have a folder with about 45 million small zip files in subfolders, totalling about 1.5TB.

I need to transfer this folder to another Windows machine.

Does anybody have any good ideas on the fastest way to create a single archive or image file for transfer that I can restore on the target machine?


Solution

Do you really think a single 1.5TB file will be easier to copy over than the individual files, especially considering that you then have to split it up again? It also requires twice as much disk space at both ends to hold both the archive and the small files.

I recommend using a tool with backup and resume support such as robocopy to replicate the individual files to the target machine.

http://technet.microsoft.com/en-us/library/cc733145.aspx
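As a rough sketch of what such a robocopy run could look like when driven from a script: the Python helper below only assembles the command line and checks the exit code. The function names, the thread count, and the retry values are my own assumptions for illustration; the switches themselves (/E, /Z, /MT, /R, /W, /NP, /NFL, /NDL, /LOG) are standard robocopy options.

```python
import subprocess


def build_robocopy_command(source, destination, threads=16, log_file=None):
    """Return a robocopy argument list for a restartable, recursive copy.

    The defaults here are illustrative assumptions, not tuned values.
    """
    command = [
        "robocopy", source, destination,
        "/E",              # copy subdirectories, including empty ones
        "/Z",              # restartable mode: resume a partially copied file
        f"/MT:{threads}",  # copy several files in parallel
        "/R:3", "/W:5",    # retry a failed file 3 times, 5 seconds apart
        "/NP", "/NFL", "/NDL",  # suppress progress, file list, directory list
    ]
    if log_file:
        command.append(f"/LOG:{log_file}")
    return command


def run_robocopy(command):
    """Run robocopy and report success.

    robocopy exit codes below 8 mean that no file failed to copy.
    """
    result = subprocess.run(command)
    return result.returncode < 8
```

With hypothetical paths, usage would be something like run_robocopy(build_robocopy_command(r"D:\zips", r"\\target\share\zips")). If the run is interrupted, starting the same command again skips the files that already arrived.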

OTHER TIPS

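Use tar -cf filename.tar path/to/small/files to pack and tar -xf filename.tar to unpack. Passing the directory itself rather than path/to/small/files/* means you do not depend on wildcard expansion, which varies between Windows shells and tar ports.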

tar is available, for example, via MinGW, UnxUtils, or GnuWin32. PeaZip and 7z can create tar files too, but they are pretty pathetic performance-wise compared to the "genuine" tool. On my computer, the genuine tar utility runs 5-6 times faster on large numbers of small files (no idea why, it's just copying data from one file to another!).

Since your files are already ZIP files, it is unlikely that compression will further reduce the size. On the other hand, compression is usually in the low range of megabytes per second whereas disk read is in the hundreds. Thus, compression would considerably increase the time you spend creating that archive, and simply using tar is probably best.
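If you would rather script this than install a tar port, the same "store only, no compression" idea can be sketched with Python's standard tarfile module. The function names and paths below are made up for illustration, and I have not measured how this compares in speed to the native tar utility.

```python
import os
import tarfile


def pack_directory(source_dir, tar_path):
    """Pack source_dir into an uncompressed tar archive.

    Mode "w" writes a plain tar; "w:gz" or "w:bz2" would add compression,
    which is skipped here because the payload is already zipped.
    """
    top_level_name = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(tar_path, "w") as archive:
        # add() recurses into subdirectories by default.
        archive.add(source_dir, arcname=top_level_name)


def unpack_archive(tar_path, target_dir):
    """Extract every member of the archive below target_dir.

    Only use this on archives you created yourself.
    """
    with tarfile.open(tar_path, "r") as archive:
        archive.extractall(target_dir)
```

On the source machine you would call pack_directory on the folder, and on the target machine unpack_archive on the transferred file.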

Corruption should not be that much of an issue since typical transports (say, FTP) are reliable, and the underlying protocols and network layers are pretty good at checksumming and detecting bit errors.
Still, you might consider creating several smaller tar-files because if you only transfer one huge file and the FTP server at the other end crashes (or your internet connection gets a hiccup) after you have transferred 1.49 of your 1.5 TB, this will be pretty annoying. With somewhat smaller files, you don't need to resend that much.
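A minimal sketch of the "several smaller tar files" idea, again in Python: it walks the folder, starts a new uncompressed archive whenever a size budget would be exceeded, and computes a SHA-256 per archive so you can compare both ends after the transfer. The part-00000.tar naming scheme, the 50 GiB default budget, and the function names are assumptions of mine, not anything prescribed by tar or FTP.

```python
import hashlib
import os
import tarfile


def pack_in_batches(root, out_dir, max_bytes=50 * 1024**3):
    """Write part-00000.tar, part-00001.tar, ... into out_dir.

    Each archive holds roughly max_bytes of file payload. out_dir must not
    be inside root, or the walk would pick up the archives themselves.
    Returns the list of archive paths in creation order.
    """
    os.makedirs(out_dir, exist_ok=True)
    archives = []
    current = None
    used = 0
    try:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                size = os.path.getsize(path)
                # Start a new archive when the budget would be exceeded.
                if current is None or (used > 0 and used + size > max_bytes):
                    if current is not None:
                        current.close()
                    part = os.path.join(out_dir, f"part-{len(archives):05d}.tar")
                    current = tarfile.open(part, "w")  # "w" = no compression
                    archives.append(part)
                    used = 0
                current.add(path, arcname=os.path.relpath(path, root))
                used += size
    finally:
        if current is not None:
            current.close()
    return archives


def sha256_of(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Run sha256_of on each part before sending and again on the target machine; only a part whose digest differs needs to be resent.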

Use the (Unix) tool rsync; there are several Windows versions of it. Its big advantage is that it transfers only deltas, and it does so over a single TCP connection, so it always runs at full speed. If the transfer is interrupted, you just restart it.

Licensed under: CC-BY-SA with attribution