Conditional Binary Unpacking With Node.js

https://stackoverflow.com/questions/21535161

06-10-2022

Question

I have been tasked with creating a Node.js program that has three main components: listen for incoming data (which arrives as packed binary), unpack and parse that data, then post it to a PostgreSQL database. The issue I have is that none of the npm libraries I have found handle conditional (variable-length) unpack formats well.

The data comes in the general format:

A header. This piece is fixed at 20 bytes on every incoming data packet and is thus easy to parse.
The first information section. This has a variable length.
The second information section. This has a variable length.
The third information section. This has a variable length.

Luckily, I am given the length of each section as the first two bytes in each case.
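The layout described above can be walked with nothing more than Node's built-in Buffer methods. The following is a minimal sketch, not the asker's code: `readSections` is a hypothetical helper, and it assumes each length prefix is a 2-byte little-endian unsigned integer that counts only the payload bytes following it (which matches how the lengths are read elsewhere in this question). The sample packet is made up.

```javascript
// Hypothetical helper: reads `count` consecutive length-prefixed sections.
// Assumes each section starts with a 2-byte little-endian unsigned length
// that counts only the payload bytes that follow it.
function readSections(packet, offset, count) {
    const sections = [];
    for (let i = 0; i < count; i++) {
        const length = packet.readUInt16LE(offset);
        const start = offset + 2;
        const end = start + length;
        if (end > packet.length) {
            throw new RangeError('section ' + i + ' runs past the end of the packet');
        }
        // subarray() returns a view on the same memory, no copy is made.
        sections.push(packet.subarray(start, end));
        offset = end;
    }
    return sections;
}

// Usage with a made-up packet: a 20-byte header followed by three sections.
const HEADER_LENGTH = 20;
const packet = Buffer.concat([
    Buffer.alloc(HEADER_LENGTH),
    Buffer.from([3, 0]), Buffer.from('abc'),
    Buffer.from([0, 0]),
    Buffer.from([2, 0]), Buffer.from('de'),
]);
const sections = readSections(packet, HEADER_LENGTH, 3);
console.log(sections.map((s) => s.toString('latin1')));
```

Because each section's start is computed from the previous section's end, no fixed offset map is needed even though every section varies in length.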

For the header, I have a guide that tells me how long each piece of information is, so I use the bufferpack and binary npm libraries to parse it as shown below. For the first information section, I am not lucky enough to have a fixed guide:

var binary = require('binary');
var bufferpack = require('bufferpack');
//The data is received as a buffer in the main program.
exports.execute = function (data) {
    //The first piece of information is 8 bytes thus requiring
    //using binary npm. It is also big endian unsigned.
    //Bytes 0 - 7:
    var info1 = binary.parse(data).word64bu('data1').vars;
    //The next info packet is 1 unsigned byte. 
    var info2 = bufferpack.unpack('B', data, 8);
    //The next info packet is 9 bytes long, but is not collected 
    //because it is not required.

    //The last packet then is 2 bytes long, little endian unsigned
    //Bytes 18 - 19:
    var info3 = bufferpack.unpack('<H', data, 18);
    //End of header.

    //The above code runs fine and returns the expected values correctly. 

    //However, the next section of data comes in as conditional and presents
    //plenty of problems:

    //Luckily, I am given the length of the next piece of data.
    var firstSectionLength = bufferpack.unpack('<H', data, 20);
    //Next, three data packets will be sent, each containing 1-30 
    //Bytes of data. This is my current solution:
    //bufferpack.unpack returns an array, so the length is its first element.
    var firstSectionInfo = bufferpack.unpack(firstSectionLength[0].toString() + 's', data, 22);
    //The next two information sections follow the same pattern as the above.
    console.log(firstSectionInfo);
};

This code will log an array to the console which throws each piece of data into the first index, separated by '/u0000'. Since it is an array, I am unable to simply .split() it. However, if I .toString() the array it will return a string but deletes the '/u0000' parts leaving me with all the data strung together and no way to split them. Since they are all variable in length, I am unable to make a map to extract slices of the string as useful information.

Is there a way to either parse the data packets more effectively or parse the indexed array value reliably?

I'll add that Python code was previously used to do the parsing, like so:

def dataPacket(self, data, (host, port)):
    #I'll skip directly to how the first information section is parsed
    #...
    #Recall the header is fixed at 20 bytes.
    firstInfoSectionLength = unpack('H', data[20:22])
    firstInfoSectionBegin = 22
    firstInfoSectionEnd = firstInfoSectionBegin + firstInfoSectionLength[0]
    firstInfoSection = '%r' % data[firstInfoSectionBegin:firstInfoSectionEnd]
    firstInfoSection = firstInfoSection.strip("'")
    firstInfoSection = firstInfoSection.split('\\x00')
    data1 = firstInfoSection[0]
    data2 = firstInfoSection[1]
    data3 = firstInfoSection[2]
    data4 = firstInfoSection[3]
    data5 = firstInfoSection[4]

As you will notice, the Python program parses the variable-length section using different parameters than what I would be required to use in the Node.js program.

Thanks

Solution

If what you have in firstSectionInfo is this:

['ab\u0000cd\u0000']

I presume you want to split the only value in the array. So you can just do:

['ab\u0000cd\u0000'][0].split('\u0000')

You'll get the values 'ab', 'cd' and ''
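As a small sketch of the split above: the trailing empty string comes from the NUL at the end of the value. Whether that NUL is a terminator or a genuinely empty last field is an assumption about the protocol; the version below treats it as a terminator and drops only that one trailing entry. The sample value is made up.

```javascript
// Sample value shaped like the one in the answer; the field contents are made up.
const firstSectionInfo = ['ab\u0000cd\u0000'];

// Split the only array element on the NUL character.
const parts = firstSectionInfo[0].split('\u0000');

// A trailing NUL leaves one empty string at the end; drop just that one
// so that genuinely empty fields in the middle are preserved.
if (parts.length > 0 && parts[parts.length - 1] === '') {
    parts.pop();
}
console.log(parts); // [ 'ab', 'cd' ]
```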

(By the way, you keep writing /u0000 but this cannot correspond to Python's \x00.)
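To make the distinction concrete, here is a short check: in a JavaScript string literal, '\u0000' is a single character with code 0 (the same character as '\x00'), whereas '/u0000' is six ordinary characters and will never match a NUL byte.

```javascript
const nul = '\u0000';
console.log(nul.length);        // 1
console.log(nul.charCodeAt(0)); // 0
console.log(nul === '\x00');    // true

// With a forward slash there is no escape sequence at all.
console.log('/u0000'.length);   // 6
console.log('ab/u0000cd'.split('\u0000').length); // 1 (nothing to split on)
```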

Licensed under: CC-BY-SA with attribution