Using SuperCsv with multiple variable columns

https://stackoverflow.com/questions/18200826

csv
supercsv

24-06-2022

Question

I was looking at this example from the Super CSV website, which shows that birthDate is an optional column. What happens if I have more than one optional column? How would the code change then?

 private static void readVariableColumnsWithCsvListReader() throws Exception {

        final CellProcessor[] allProcessors = new CellProcessor[] { new UniqueHashCode(), // customerNo (must be unique)
                new NotNull(), // firstName
                new NotNull(), // lastName
                new ParseDate("dd/MM/yyyy") }; // birthDate

        final CellProcessor[] noBirthDateProcessors = new CellProcessor[] { allProcessors[0], // customerNo
                allProcessors[1], // firstName
                allProcessors[2] }; // lastName

        ICsvListReader listReader = null;
        try {
                listReader = new CsvListReader(new FileReader(VARIABLE_CSV_FILENAME), CsvPreference.STANDARD_PREFERENCE);

                listReader.getHeader(true); // skip the header (can't be used with CsvListReader)

                while( (listReader.read()) != null ) {

                        // use different processors depending on the number of columns
                        final CellProcessor[] processors;
                        if( listReader.length() == noBirthDateProcessors.length ) {
                                processors = noBirthDateProcessors;
                        } else {
                                processors = allProcessors;
                        }

                        final List<Object> customerList = listReader.executeProcessors(processors);
                        System.out.println(String.format("lineNo=%s, rowNo=%s, columns=%s, customerList=%s",
                                listReader.getLineNumber(), listReader.getRowNumber(), customerList.size(), customerList));
                }

        }
        finally {
                if( listReader != null ) {
                        listReader.close();
                }
        }
}

Also, what if the optional columns are not at the end but in the middle or somewhere else?

Solution

So the real issue here is that to apply the correct cell processors, you need to know what data is in each column. With a valid CSV file (the same number of columns on every line) that's not a problem, but with a variable-column CSV file it's tricky.

If, as in the example, only one column is optional, then you just need to count the number of columns read and use the appropriate array of cell processors. It doesn't matter where that optional column is: a short row can only be missing that one column, so the layout is still predictable.
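
As a minimal sketch of why the position does not matter, the plain-Java helper below (no Super CSV involved) pads a short row with null at the known optional position so that every row ends up with the same layout. The class name OptionalColumnAligner, the method align and its parameters are hypothetical, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Super CSV.
class OptionalColumnAligner {

    /**
     * Returns a copy of the row that always has fullWidth entries.
     * A row that is exactly one column short gets null inserted at optionalIndex.
     */
    static List<String> align(List<String> row, int fullWidth, int optionalIndex) {
        if (row.size() == fullWidth) {
            return new ArrayList<>(row); // nothing is missing
        }
        if (row.size() != fullWidth - 1) {
            throw new IllegalArgumentException("unexpected column count: " + row.size());
        }
        List<String> aligned = new ArrayList<>(row);
        aligned.add(optionalIndex, null); // the only column that can be absent
        return aligned;
    }
}
```

With a four-column layout whose second column is optional, align(row, 4, 1) leaves a four-value row untouched and turns a three-value row into four values with null in the second position.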

If, however, more than 1 column is optional you're in trouble. For example, if middleName and city are optional in the following CSV file:

firstName,middleName,lastName,city
Philip,Fry,New York

That can be read as:

firstName="Philip", middleName="Fry", lastName="New York", city=null

firstName="Philip", middleName=null, lastName="Fry", city="New York"
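
To make the two readings concrete, here is a minimal sketch in plain Java (no Super CSV involved) that enumerates every way a row can be mapped onto the header when some columns may be omitted. The class name VariableColumnReadings, its methods and the optional flags are hypothetical, introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Super CSV.
class VariableColumnReadings {

    /** Returns every mapping of values onto header columns in which only optional columns are skipped. */
    static List<Map<String, String>> readings(String[] header, boolean[] optional, String[] values) {
        List<Map<String, String>> out = new ArrayList<>();
        assign(header, optional, values, 0, 0, new LinkedHashMap<>(), out);
        return out;
    }

    private static void assign(String[] header, boolean[] optional, String[] values,
                               int col, int val, Map<String, String> current,
                               List<Map<String, String>> out) {
        if (col == header.length) {
            // A reading is valid only if every value in the row was consumed.
            if (val == values.length) {
                out.add(new LinkedHashMap<>(current));
            }
            return;
        }
        // Case 1: the column is present and takes the next value.
        if (val < values.length) {
            current.put(header[col], values[val]);
            assign(header, optional, values, col + 1, val + 1, current, out);
        }
        // Case 2: the column is optional and was left out of this row.
        if (optional[col]) {
            current.put(header[col], null);
            assign(header, optional, values, col + 1, val, current, out);
        }
        current.remove(header[col]);
    }
}
```

For the row Philip,Fry,New York with middleName and city flagged as optional, readings(...) returns two maps, matching the two interpretations listed above. With only middleName flagged as optional it returns a single map.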

It's no longer predictable. You may be able to inspect the data in a column to determine what that column represents (e.g. a date contains slashes), but that's not very robust, and even then you may have to read several lines in order to figure it out.
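
As a rough illustration of inspecting the data, the sketch below checks whether a value parses as a dd/MM/yyyy date, the pattern passed to ParseDate in the question. It uses java.time rather than Super CSV, and the class name ColumnGuess and the method looksLikeBirthDate are hypothetical.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical heuristic, not part of Super CSV.
class ColumnGuess {

    private static final DateTimeFormatter BIRTH_DATE = DateTimeFormatter.ofPattern("dd/MM/yyyy");

    /** Returns true if the value can be parsed as a dd/MM/yyyy date. */
    static boolean looksLikeBirthDate(String value) {
        if (value == null) {
            return false;
        }
        try {
            LocalDate.parse(value, BIRTH_DATE);
            return true;
        } catch (DateTimeParseException e) {
            return false; // not a date in the expected format
        }
    }
}
```

A check like this only helps when the optional columns hold visibly different kinds of data. It cannot tell a middleName from a city, which is the situation in the example above.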

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow