Question

I can do this; I just don't know why it works. Using the MNIST database, which I downloaded from http://yann.lecun.com/exdb/mnist/, and the guidelines at the bottom of that page, I wrote the (as yet unfinished) method

// TRAINING SET IMAGE FILE (train-images-idx3-ubyte):
// [offset] [type]          [value]          [description] 
// 0000     32 bit integer  0x00000803(2051) magic number 
// 0004     32 bit integer  60000            number of images 
// 0008     32 bit integer  28               number of rows 
// 0012     32 bit integer  28               number of columns 
// 0016     unsigned byte   ??               pixel 
// 0017     unsigned byte   ??               pixel 
// ........ 
// xxxx     unsigned byte   ??               pixel

// TEST SET IMAGE FILE (t10k-images-idx3-ubyte):
// [offset] [type]          [value]          [description] 
// 0000     32 bit integer  0x00000803(2051) magic number 
// 0004     32 bit integer  10000            number of images 
// 0008     32 bit integer  28               number of rows 
// 0012     32 bit integer  28               number of columns 
// 0016     unsigned byte   ??               pixel 
// 0017     unsigned byte   ??               pixel 
// ........ 
// xxxx     unsigned byte   ??               pixel
let loadMnistImage file =
    use stream = File.Open(file, FileMode.Open)
    use reader = new BinaryReader(stream)
    let magicNumber = readInt(reader)
    let nImages = readInt(reader)
    let nRows = readInt(reader)
    let nColumns = readInt(reader)
    (magicNumber, nImages, nRows, nColumns);;

That was the easy part. The difficult part is the form of the readInt function. I can't just use BitConverter.ToInt32(); I found the answer on this page: https://code.google.com/p/aguaviva-libs/source/browse/c%23/NeuronalNetwork/sets/HandWriting.cs?spec=svn9ffdf444c6317be049572cea59170602c8f28bea&r=9ffdf444c6317be049572cea59170602c8f28bea.

Translating the method

int Read(BinaryReader b, int i)
{
   int res = 0;

   while (i-- > 0)
   {
      res <<= 8;
      res |= b.ReadByte();
   }
   return res;
}

into F# gives

let readInt (b : BinaryReader) =
    [1..4] |> List.fold (fun res _ -> (res <<< 8) ||| int (b.ReadByte())) 0

(assuming i = 4). This works: in F# interactive, the lines

loadMnistImage @"Data\t10k-images.idx3-ubyte"
loadMnistImage @"Data\train-images.idx3-ubyte"

give results of (2051, 10000, 28, 28) and (2051, 60000, 28, 28) respectively, which agree with the values in the comments from the first code snippet.

What I don't understand is why it works. What is the point of all this bit-shifting and folding with the bitwise OR operator? Why can't I just use BitConverter.ToInt32() instead?

Solution 2

Posting my comment as an answer

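As written, the method works regardless of the endianness of the machine running the code. Shifts and bitwise ORs operate on integer values rather than on their layout in memory, so the first byte read always ends up as the most significant byte, which is exactly the big-endian order the file uses.

Standard library methods such as BitConverter.ToInt32 return results that depend on the endianness of the machine running the code (see BitConverter.IsLittleEndian). This may differ from what you expect because the relative byte order is reversed: on a little-endian machine, the magic number bytes 00 00 08 03 would be read as 0x03080000 (50855936) instead of 0x00000803 (2051).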
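
To make the shifting concrete, here is a minimal sketch that runs the same fold over an in-memory copy of the magic number and records each intermediate value. The names header, foldBigEndian and steps are made up for this illustration; they do not come from the question or from any library.

```fsharp
open System

// The first four bytes of an MNIST image file: the magic number 2051, stored big-endian.
let header = [| 0x00uy; 0x00uy; 0x08uy; 0x03uy |]

// Same fold as readInt, but over a byte array (hypothetical helper for illustration).
let foldBigEndian (bytes : byte[]) =
    bytes |> Array.fold (fun res b -> (res <<< 8) ||| int b) 0

// Array.scan keeps every intermediate accumulator, starting with the seed 0.
// Each step shifts the previous result up by one byte and ORs in the next byte.
let steps =
    header |> Array.scan (fun res b -> (res <<< 8) ||| int b) 0

// BitConverter interprets the same bytes in the machine's own byte order.
let viaBitConverter = BitConverter.ToInt32(header, 0)

printfn "fold result:  %d" (foldBigEndian header)   // 2051 on any machine
printfn "steps:        %A" steps                    // [|0; 0; 0; 8; 2051|]
printfn "BitConverter: %d (IsLittleEndian = %b)" viaBitConverter BitConverter.IsLittleEndian
```

The fold never looks at how an int is laid out in memory, which is why its result does not change from machine to machine, while the BitConverter line depends on the value of IsLittleEndian.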

Other tips

The standard library method IPAddress.NetworkToHostOrder(Int32) takes the endianness of the executing platform into account when converting an int from network order, which by convention is big-endian. MNIST files follow that convention. BinaryReader.ReadInt32 is documented to always read little-endian data, so on a little-endian host (the usual case for .NET) the following pair of standard library methods can replace your readInt function:

let readInt (reader: System.IO.BinaryReader) =
    reader.ReadInt32() |> System.Net.IPAddress.NetworkToHostOrder

A more verbose variant using BitConverter, which reads in the machine's own byte order and therefore stays correct on big-endian hosts as well, would be

let readInt (reader: System.IO.BinaryReader) =
    (reader.ReadBytes(4),0)
    |> System.BitConverter.ToInt32
    |> System.Net.IPAddress.NetworkToHostOrder
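
As a quick sanity check that needs no MNIST download, the sketch below feeds a hand-built 16-byte header through the fold from the question and through the BitConverter-based variant. The byte array imitates the t10k header described in the question's comments; headerBytes, readIntFold, readIntBitConverter and readHeader are hypothetical names chosen for this example.

```fsharp
open System
open System.IO
open System.Net

// Hand-built header imitating t10k-images-idx3-ubyte (an assumption for testing,
// not bytes copied from the real file).
let headerBytes =
    [| 0x00uy; 0x00uy; 0x08uy; 0x03uy    // magic number 2051
       0x00uy; 0x00uy; 0x27uy; 0x10uy    // 10000 images
       0x00uy; 0x00uy; 0x00uy; 0x1Cuy    // 28 rows
       0x00uy; 0x00uy; 0x00uy; 0x1Cuy |] // 28 columns

// The shift-and-OR fold from the question.
let readIntFold (reader : BinaryReader) =
    [1..4] |> List.fold (fun res _ -> (res <<< 8) ||| int (reader.ReadByte())) 0

// BitConverter reads in machine byte order; NetworkToHostOrder then converts
// from big-endian (network order) to the machine's order.
let readIntBitConverter (reader : BinaryReader) =
    IPAddress.NetworkToHostOrder(BitConverter.ToInt32(reader.ReadBytes(4), 0))

// Reads the four header fields in order using whichever reader function is supplied.
let readHeader (readInt : BinaryReader -> int) =
    use stream = new MemoryStream(headerBytes)
    use reader = new BinaryReader(stream)
    let magic = readInt reader
    let nImages = readInt reader
    let nRows = readInt reader
    let nColumns = readInt reader
    (magic, nImages, nRows, nColumns)

printfn "%A" (readHeader readIntFold)          // (2051, 10000, 28, 28)
printfn "%A" (readHeader readIntBitConverter)  // (2051, 10000, 28, 28)
```

Passing the reader function as a parameter keeps the comparison honest: both variants see exactly the same bytes in the same order.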
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow