Question

What is the best way to sanitize fields from a CSV file in Super CSV? Take the First_Name column, for example: trim the field, capitalize the first letter, and remove various characters (quotes, commas, asterisks, etc.). Should I write a custom CellProcessor such as FmtName()? And perhaps another one, FmtEmail(), that lowercases everything and removes certain invalid characters?

Solution

I think the question you're trying to ask is:

"Is it better to write a custom cell processor that does all the conversions for a column, or to chain multiple reusable processors together?"

For example, with your first name example you could either:

a) write a custom cell processor that trims, capitalises and strips the unwanted characters, all in the one processor:

new ParseFirstName()

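b) chain together reusable processors (the existing Super CSV processors plus a new custom Capitalize cell processor that calls StringUtils.capitalize()). Processors in a chain run from the outermost inwards, so the unwanted characters should be stripped before capitalizing; otherwise a leading quote or asterisk would be "capitalized" instead of the first letter:

new Trim(new StrReplace("[\",\\*]", "", new Capitalize()))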

I think it's really up to personal preference. Defining cell processors as done in b) can be quite verbose, but it means you can see all of the conversions/validation for all columns in the one place.

On the other hand, defining a custom cell processor for each column makes your cell processor setup very clean, but you may end up with duplicated code (e.g. if you wanted to capitalize multiple columns) and you can't see all the conversions at once. You'll also have more classes (more code).
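To make the reuse argument concrete, here is a minimal sketch of the cleaning logic itself, written with plain java.util.function.Function so that it runs without Super CSV on the classpath. The class name FieldSanitizers and its constants are hypothetical, not part of Super CSV. In a real setup each step would live in the execute method of a cell processor, or be supplied by an existing processor such as Trim or StrReplace.

```java
import java.util.Locale;
import java.util.function.Function;
import java.util.regex.Pattern;

// Hypothetical helper: small reusable cleaning steps composed per column.
final class FieldSanitizers {
    // Same character class as in the answer: double quote, comma, asterisk.
    private static final Pattern UNWANTED = Pattern.compile("[\",*]");

    // Reusable single-purpose steps.
    static final Function<String, String> TRIM = String::trim;
    static final Function<String, String> STRIP_UNWANTED =
            s -> UNWANTED.matcher(s).replaceAll("");
    static final Function<String, String> CAPITALIZE =
            s -> s.isEmpty() ? s : s.substring(0, 1).toUpperCase(Locale.ROOT) + s.substring(1);
    static final Function<String, String> LOWERCASE =
            s -> s.toLowerCase(Locale.ROOT);

    // Per-column pipelines built from the shared steps.
    // Stripping happens before capitalizing so the first letter is the one that is upper-cased.
    static final Function<String, String> FIRST_NAME =
            TRIM.andThen(STRIP_UNWANTED).andThen(CAPITALIZE);
    static final Function<String, String> EMAIL =
            TRIM.andThen(STRIP_UNWANTED).andThen(LOWERCASE);

    private FieldSanitizers() {
    }
}
```

Both columns share TRIM and STRIP_UNWANTED, which is the duplication that approach b) avoids. With approach a), the same steps would be repeated inside each per-column class.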

Licensed under: CC-BY-SA with attribution