Question

I have a CSV file that I can open in Excel 2012 and it comes in perfectly. When I try to set up the metadata for this CSV file in Talend, the fields (columns) are not split the same way as Excel splits them. I suspect I am not setting the metadata properly.

The specific issue is that I have a column of string data that may contain commas within the string. For example, suppose I have a CSV file with three columns (ID, Name and Age) which looks like this:

ID,Name,Age
1,Ralph,34 
2,Sue,14
3,"Smith, John", 42

When Excel reads this CSV file, it treats the second element of the third data row ("Smith, John") as a single token and places it into a cell by itself.

Talend tries to break this same token into two, since there is a comma within the token. Apparently Excel ignores all delimiters within a quoted string, while Talend by default does not.

My question is: how do I get Talend to behave the same as Excel?

Solution

If you use the tFileInputDelimited component to read this CSV file, set the field separator to "," and, under the component's CSV options, enable the Text Enclosure option and set it to """. If you define the file through metadata instead, there is a corresponding string/text enclosure option; set it to """ there as well. With a text enclosure defined, a comma inside a quoted string is no longer treated as a delimiter, which should resolve your problem.
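To see what a text enclosure changes, here is a minimal sketch outside Talend, using Python's standard `csv` module on the sample data from the question. It only illustrates the parsing behaviour and is not Talend's implementation; the names `SAMPLE`, `split_naive` and `split_with_enclosure` are made up for this example.

```python
import csv
import io

# Sample data copied from the question (hypothetical variable name).
SAMPLE = 'ID,Name,Age\n1,Ralph,34\n2,Sue,14\n3,"Smith, John", 42\n'


def split_naive(text):
    """Split each line on every comma, ignoring quotes entirely."""
    return [line.split(",") for line in text.splitlines()]


def split_with_enclosure(text, enclosure='"'):
    """Parse with a text enclosure so delimiters inside quotes are kept."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar=enclosure)
    return [row for row in reader]


naive_rows = split_naive(SAMPLE)
quoted_rows = split_with_enclosure(SAMPLE)

# Without an enclosure the last row breaks into four fields;
# with one, "Smith, John" stays together as a single field.
print(len(naive_rows[3]), naive_rows[3])
print(len(quoted_rows[3]), quoted_rows[3])
```

The naive split is roughly what happens when no text enclosure is configured, and the second parse is the Excel-like behaviour the question asks for.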

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow