Once you load the file into a string, the text is held in memory as Unicode (.NET strings are UTF-16), regardless of the encoding of the original file. So what you are seeing is not the size of the file, but rather the size of the Unicode representation of that file's contents. Based on your results, it looks like the file that you are loading is an ASCII file (one byte per character), but when you get the byte count in Unicode (two bytes per character for this kind of text), the size doubles.
As others have said, if all you want is the file length, you can get it via the FileInfo.Length property, which is far more efficient. For instance:
Dim Test As New FileInfo("C:\Users\Blubb\Documents\TOS.txt")
MessageBox.Show("The file has a size of " & Test.Length / 1024 & " kilobytes.")
If, however, you really need to load the file first, the best approach is to read the bytes directly rather than decoding them into a Unicode string:
Dim Test() As Byte = System.IO.File.ReadAllBytes("C:\Users\Blubb\Documents\TOS.txt")
MessageBox.Show("The byte array 'Test' has a size of " & Test.Length / 1024 & " kilobytes.")
Notice that I used MessageBox.Show, which is preferable to the old VB6-style MsgBox function.
Or, if for some reason you really need to load it as a string, just make sure that you count the bytes with the same encoding as the original file:
Dim Test As String = System.IO.File.ReadAllText("C:\Users\Blubb\Documents\TOS.txt")
MessageBox.Show("The file loaded into the string 'Test' has a size of " & Encoding.ASCII.GetByteCount(Test) / 1024 & " kilobytes.")
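If you do not know the file's encoding in advance, one option is to let a StreamReader detect it from the byte-order mark and then count the bytes with whatever encoding the reader settled on. This is only a rough sketch: the module and function names are made up for illustration, and the fallback to UTF-8 for files without a byte-order mark is an assumption that may not match your file.

```vbnet
Imports System.IO
Imports System.Text

Module EncodingAwareSize
    ' Hypothetical helper: reads the whole file as text and returns the number
    ' of bytes that text occupies in the encoding the reader used to decode it.
    Function CountBytesInDetectedEncoding(path As String) As Integer
        ' True = look for a byte-order mark; UTF-8 is the assumed fallback.
        Using reader As New StreamReader(path, Encoding.UTF8, True)
            Dim text As String = reader.ReadToEnd()
            ' CurrentEncoding is only reliable after the first read.
            Return reader.CurrentEncoding.GetByteCount(text)
        End Using
    End Function
End Module
```

The byte-order mark itself is not part of the decoded text, so for a file that starts with one, the result will be a few bytes lower than FileInfo.Length. A file in a legacy code page with no byte-order mark will be decoded as UTF-8 here, which can give the wrong count.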
Edit
As another example, using the string that you gave in another comment:
Dim Test As String = "Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum."
'Displays 0,436... kilobytes
MessageBox.Show("ASCII size of 'Test': " & Encoding.ASCII.GetByteCount(Test) / 1024 & " kilobytes.")
'Displays 0,871... kilobytes
MessageBox.Show("Unicode size of 'Test': " & Encoding.Unicode.GetByteCount(Test) / 1024 & " kilobytes.")
As you can see, the Unicode encoding takes twice as many bytes. But both are representations of the same text, just in different byte formats.
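The exact factor of two only holds while every character fits in ASCII. As a small sketch (the helper name DescribeSizes and the sample word are made up for illustration), here is the same comparison for a string containing two non-ASCII characters:

```vbnet
Imports System
Imports System.Text

Module EncodingSizes
    ' Hypothetical helper: reports the byte count of the text in three encodings.
    Function DescribeSizes(text As String) As String
        Return String.Format("ASCII={0}, UTF-8={1}, UTF-16={2}",
                             Encoding.ASCII.GetByteCount(text),
                             Encoding.UTF8.GetByteCount(text),
                             Encoding.Unicode.GetByteCount(text))
    End Function

    Sub Main()
        ' "Größe", built with ChrW so the source file's own encoding does not matter.
        Dim sample As String = "Gr" & ChrW(&HF6) & ChrW(&HDF) & "e"
        ' Displays ASCII=5, UTF-8=7, UTF-16=10
        Console.WriteLine(DescribeSizes(sample))
    End Sub
End Module
```

The ASCII count still comes out as 5 because each character that ASCII cannot represent is replaced with a single "?" byte, so that number no longer describes the real text. This is why the encoding you count with has to match the encoding the file was actually saved in.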