It took me a very long time to figure this one out, no thanks to Apple's Core Audio documentation, which says absolutely nothing about how to deal with this key, or any of the other keys for that matter. I had to examine an MP4 file with track information before I understood.
Answer
You need to assign an NSData containing the track information to the metadata item's value.
The data must consist of four 16-bit big-endian values: the 2nd is the track number and the 3rd is the total number of tracks in the collection. The 1st and 4th should be zero.
So basically you need to do this:
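#include <arpa/inet.h> // for htons
uint16_t trackNumber = 1; // track number
uint16_t tracksInCollection = 12; // total number of tracks in collection
// Each 16-bit field must be big-endian, so convert from host byte order.
uint16_t data[4] = { 0, htons(trackNumber), htons(tracksInCollection), 0 };
// metadataItem is an existing AVMutableMetadataItem.
metadataItem.keySpace = AVMetadataKeySpaceiTunes;
metadataItem.key = AVMetadataiTunesMetadataKeyTrackNumber;
metadataItem.value = [NSData dataWithBytes:data length:sizeof(data)];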
Note: The same approach applies to the AVMetadataiTunesMetadataKeyDiscNumber key.
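To make the byte layout of the track-number payload concrete, here is a minimal sketch in plain C that writes the four fields one byte at a time, so the result does not depend on the byte order of the machine. The helper name pack_track_info and the constant TRACK_INFO_SIZE are made up for illustration; they are not part of any Apple API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Size of the payload: four 16-bit fields. */
enum { TRACK_INFO_SIZE = 8 };

/* Hypothetical helper: writes { 0, number, total, 0 } into out as
   big-endian 16-bit values, byte by byte, so the output is identical
   on little- and big-endian hosts. */
void pack_track_info(uint16_t number, uint16_t total,
                     uint8_t out[TRACK_INFO_SIZE])
{
    assert(out != NULL);
    const uint16_t fields[4] = { 0, number, total, 0 };
    for (size_t i = 0; i < 4; i++) {
        out[2 * i]     = (uint8_t)(fields[i] >> 8);   /* high byte first */
        out[2 * i + 1] = (uint8_t)(fields[i] & 0xFF); /* then low byte */
    }
}
```

For track 1 of 12 this fills the buffer with the bytes 00 00 00 01 00 0C 00 00. The buffer could then be wrapped with [NSData dataWithBytes:buf length:TRACK_INFO_SIZE] and assigned to the item's value.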
A remark on endianness
If you don't want to worry about byte order, you can "borrow" a function from the Berkeley sockets API (it may be implemented as a macro), declared in <arpa/inet.h>. It works like this:
bigendianval = htons(val);
or
int16_t trackNumber = htons(myTrackNumberVariable);
htons (host to network short) converts your 16-bit numbers to big-endian, regardless of the endianness of your own system. Network byte order is big-endian, which is why htons can be reused here.
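If you want to convince yourself of what htons does to the bytes in memory, a small check along these lines should do. It is only a sketch; bytes_after_htons is a made-up helper name, and it assumes a POSIX system where <arpa/inet.h> is available.

```c
#include <arpa/inet.h>  /* htons */
#include <assert.h>
#include <stdint.h>
#include <string.h>     /* memcpy */

/* Hypothetical helper: copies the in-memory bytes of htons(val) into out,
   to show which byte is stored first after the conversion. */
void bytes_after_htons(uint16_t val, uint8_t out[2])
{
    assert(out != NULL);
    uint16_t be = htons(val);      /* host order -> big-endian */
    memcpy(out, &be, sizeof be);   /* inspect the raw bytes */
}
```

Calling it with 12 gives the bytes 00 0C on any host, which is exactly the order the track-number data expects.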