Question

I'm looking at using SuperCSV to check the contents of some files that I receive. The format of the file is such that it has a header record followed by the data records followed by a CRC32 checksum value.

e.g.

ABC|2|20130115150327|
1|1234567890123456|1234|20130109204710|21130109204710|
2|6543210987654321|1234|20130110043658|21130110043658|
1A345C7D

I have a few questions about the capabilities of SuperCSV in this situation.

  • Does it allow you to validate different lines against different definitions i.e. one for the header record and one for the data records?
  • Does it allow you to validate that the delimiter (pipe '|' in this case) must be appended to the end of the line?
  • Is there, or has anybody written, a CellProcessor that validates hexadecimal values?

Solution

  • Does it allow you to validate different lines against different definitions i.e. one for the header record and one for the data records?

Yes. Typically, you'd read the header with getHeader(), which doesn't use CellProcessors, but there's nothing stopping you from just reading the header as a normal line with read() using CellProcessors. Each call to read() lets you pass in the CellProcessors, so you can process/validate the header, data rows and checksum differently using 3 different CellProcessor arrays.

  • Does it allow you to validate that the delimiter (pipe '|' in this case) must be appended to the end of the line?

Because you're using | as the delimiter and every line ends with one, the last column will be read as an empty column (null). This means that your CellProcessor array for reading the header must have 4 elements (and 6 for the data rows); otherwise you will get an exception saying that the number of columns doesn't match the number of cell processors. By putting a new Equals(null) processor at the end, you can essentially validate that the line ends with a |.
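To see why the trailing | produces an extra empty column, here is a minimal plain-Java sketch. It does not use Super CSV at all (Super CSV does its own tokenizing and gives you null rather than an empty string); the class and method names are made up for illustration.

```java
import java.util.Arrays;

// Plain-Java illustration only; the names here are hypothetical and not part of Super CSV.
class TrailingDelimiterDemo {

    // Splits a pipe-delimited line, keeping trailing empty columns (limit -1).
    static String[] columns(String line) {
        return line.split("\\|", -1);
    }

    // True when the final column is empty, i.e. the line ends with '|'.
    static boolean endsWithDelimiter(String line) {
        String[] cols = columns(line);
        return cols.length > 1 && cols[cols.length - 1].isEmpty();
    }

    public static void main(String[] args) {
        // The header from the question: 3 values plus 1 empty trailing column.
        System.out.println(Arrays.toString(columns("ABC|2|20130115150327|")));
        // Without the trailing pipe there is no empty column to find.
        System.out.println(endsWithDelimiter("ABC|2|20130115150327"));
    }
}
```

The header line yields 4 columns and a data row yields 6, which matches the CellProcessor array sizes mentioned above.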

  • Is there, or has anybody written, a CellProcessor that validates hexadecimal values?

You can use the existing cell processor new StrRegex("[0-9A-F]+") to validate using a regular expression. You can even register a human-readable message for the validation error (e.g. "not a valid hex value") using StrRegex.registerMessage().
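If you want to try the pattern out before wiring it into a processor, a quick plain-Java check might look like this. It only exercises java.util.regex, not StrRegex itself, and the class name is made up.

```java
import java.util.regex.Pattern;

// Plain-Java check of the hex pattern; HexPatternCheck is a hypothetical name.
class HexPatternCheck {

    // Same character class as in the answer: one or more uppercase hex digits.
    private static final Pattern HEX = Pattern.compile("[0-9A-F]+");

    // True only if the whole value consists of uppercase hex digits.
    static boolean isHex(String value) {
        return value != null && HEX.matcher(value).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHex("1A345C7D")); // checksum from the question
        System.out.println(isHex("1a345c7d")); // lowercase is rejected by this pattern
        System.out.println(isHex("1A345C7G")); // 'G' is not a hex digit
    }
}
```

If lowercase digits should also be accepted, the character class would need to become [0-9A-Fa-f].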

If you want to parse the hex as a number (probably not, but just in case), then there's no existing ParseHex CellProcessor in Super CSV. If you write one and submit a patch, I'll include it in the upcoming release! Depending on how big the number is, maybe it's best to update ParseLong to have another constructor that accepts a radix (16 in this case)?
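For reference, parsing with a radix is a one-liner in plain Java, and the result can be compared directly with java.util.zip.CRC32. The sketch below is not a Super CSV processor, the names are made up, and which bytes your checksum actually covers is something I'm only guessing at, so treat the comparison as an illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

// Plain-Java sketch; HexChecksum and its methods are hypothetical names.
class HexChecksum {

    // Parses a hex string as an unsigned value; Long has room for all 32 bits of a CRC32.
    static long parseHex(String hex) {
        return Long.parseLong(hex, 16);
    }

    // Compares the CRC32 of the given bytes with a hex-encoded expected value.
    // ASSUMPTION: the caller knows which bytes the file's checksum covers.
    static boolean matchesCrc32(byte[] data, String expectedHex) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue() == parseHex(expectedHex);
    }

    public static void main(String[] args) {
        System.out.println(parseHex("1A345C7D") == 0x1A345C7DL);
        byte[] sample = "123456789".getBytes(StandardCharsets.US_ASCII);
        // CBF43926 is the standard CRC-32 check value for the ASCII string "123456789".
        System.out.println(matchesCrc32(sample, "CBF43926"));
    }
}
```

Long.parseLong throws NumberFormatException for anything that isn't valid hex, so a processor built this way would want to catch that and report it as a validation failure.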

I'd recommend keeping things simple and using CsvListReader (you could use the other readers, but you'll need to define a nameMapping array to provide column names for the header, data and checksum rows) as follows:

  1. Read the first line (I'm assuming the second column is the number of following data rows?) using the 'header' CellProcessor array.

  2. Read the data rows n times (where n is given by the 2nd header column) using the 'data' CellProcessor array.

  3. Read the checksum using the 'checksum' CellProcessor array (probably just a single ParseChecksum()).
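The three steps above can be sketched in plain Java to show the control flow. This deliberately leaves Super CSV out (each check would be replaced by a read() call with the matching CellProcessor array), the names are made up, and it rests on two assumptions: that the second header column really is the data row count, and that the checksum is always 8 uppercase hex digits.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of the header / data rows / checksum flow; names are hypothetical.
class FileLayoutCheck {

    // True if the line has exactly 'expected' columns and the last one is empty,
    // i.e. the line ends with the '|' delimiter.
    static boolean hasColumns(String line, int expected) {
        String[] cols = line.split("\\|", -1);
        return cols.length == expected && cols[expected - 1].isEmpty();
    }

    static boolean isValid(List<String> lines) {
        // Step 1: header with 3 values plus the empty trailing column.
        if (lines.isEmpty() || !hasColumns(lines.get(0), 4)) {
            return false;
        }
        int n;
        try {
            // ASSUMPTION: the 2nd header column is the number of data rows.
            n = Integer.parseInt(lines.get(0).split("\\|", -1)[1]);
        } catch (NumberFormatException e) {
            return false;
        }
        // Expect exactly: 1 header + n data rows + 1 checksum line.
        if (n < 0 || lines.size() != n + 2) {
            return false;
        }
        // Step 2: n data rows with 5 values plus the empty trailing column.
        for (int i = 1; i <= n; i++) {
            if (!hasColumns(lines.get(i), 6)) {
                return false;
            }
        }
        // Step 3: checksum line. ASSUMPTION: 8 uppercase hex digits (32 bits).
        return lines.get(n + 1).matches("[0-9A-F]{8}");
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "ABC|2|20130115150327|",
                "1|1234567890123456|1234|20130109204710|21130109204710|",
                "2|6543210987654321|1234|20130110043658|21130110043658|",
                "1A345C7D");
        System.out.println(isValid(sample));
    }
}
```

This only checks the layout. Validating individual columns (dates, numeric IDs and so on) is where the CellProcessors come in.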

Licensed under: CC-BY-SA with attribution