Question

Is there an existing application or library in Java which will allow me to convert a CSV data file to XML file?

The XML tags would be provided through possibly the first row containing column headings.

Was it helpful?

Solution

Maybe this might help: JSefa

You can read CSV file with this tool and serialize it to XML.

OTHER TIPS

As the others above, I don't know any one-step way to do that, but if you are ready to use very simple external libraries, I would suggest:

OpenCsv for parsing CSV (small, simple, reliable and easy to use)

Xstream to parse/serialize XML (very very easy to use, and creating fully human readable xml)

Using the same sample data as above, code would look like:

package fr.megiste.test;

import java.io.FileReader;
import java.io.FileWriter;
import java.util.ArrayList;
import java.util.List;

import au.com.bytecode.opencsv.CSVReader;

import com.thoughtworks.xstream.XStream;

public class CsvToXml {     

    public static void main(String[] args) {

        String startFile = "./startData.csv";
        String outFile = "./outData.xml";

        try {
            CSVReader reader = new CSVReader(new FileReader(startFile));
            String[] line = null;

            String[] header = reader.readNext();

            List out = new ArrayList();

            while((line = reader.readNext())!=null){
                List<String[]> item = new ArrayList<String[]>();
                    for (int i = 0; i < header.length; i++) {
                    String[] keyVal = new String[2];
                    String string = header[i];
                    String val = line[i];
                    keyVal[0] = string;
                    keyVal[1] = val;
                    item.add(keyVal);
                }
                out.add(item);
            }

            XStream xstream = new XStream();

            xstream.toXML(out, new FileWriter(outFile,false));

        } catch (Exception e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }
    }
}

Producing the following result: (Xstream allows very fine tuning of the result...)

<list>
  <list>
    <string-array>
      <string>string</string>
      <string>hello world</string>
    </string-array>
    <string-array>
      <string>float1</string>
      <string>1.0</string>
    </string-array>
    <string-array>
      <string>float2</string>
      <string>3.3</string>
    </string-array>
    <string-array>
      <string>integer</string>
      <string>4</string>
    </string-array>
  </list>
  <list>
    <string-array>
      <string>string</string>
      <string>goodbye world</string>
    </string-array>
    <string-array>
      <string>float1</string>
      <string>1e9</string>
    </string-array>
    <string-array>
      <string>float2</string>
      <string>-3.3</string>
    </string-array>
    <string-array>
      <string>integer</string>
      <string>45</string>
    </string-array>
  </list>
  <list>
    <string-array>
      <string>string</string>
      <string>hello again</string>
    </string-array>
    <string-array>
      <string>float1</string>
      <string>-1</string>
    </string-array>
    <string-array>
      <string>float2</string>
      <string>23.33</string>
    </string-array>
    <string-array>
      <string>integer</string>
      <string>456</string>
    </string-array>
  </list>
  <list>
    <string-array>
      <string>string</string>
      <string>hello world 3</string>
    </string-array>
    <string-array>
      <string>float1</string>
      <string>1.40</string>
    </string-array>
    <string-array>
      <string>float2</string>
      <string>34.83</string>
    </string-array>
    <string-array>
      <string>integer</string>
      <string>4999</string>
    </string-array>
  </list>
  <list>
    <string-array>
      <string>string</string>
      <string>hello 2 world</string>
    </string-array>
    <string-array>
      <string>float1</string>
      <string>9981.05</string>
    </string-array>
    <string-array>
      <string>float2</string>
      <string>43.33</string>
    </string-array>
    <string-array>
      <string>integer</string>
      <string>444</string>
    </string-array>
  </list>
</list>

I know you asked for Java, but this strikes me as a task well suited to a scripting language. Here is a quick (very simple) solution written in Groovy.

test.csv

string,float1,float2,integer
hello world,1.0,3.3,4
goodbye world,1e9,-3.3,45
hello again,-1,23.33,456
hello world 3,1.40,34.83,4999
hello 2 world,9981.05,43.33,444

csvtoxml.groovy

#!/usr/bin/env groovy

def csvdata = []
new File("test.csv").eachLine { line ->
    csvdata << line.split(',')
}

def headers = csvdata[0]
def dataRows = csvdata[1..-1]

def xml = new groovy.xml.MarkupBuilder()

// write 'root' element
xml.root {
    dataRows.eachWithIndex { dataRow, index ->
        // write 'entry' element with 'id' attribute
        entry(id:index+1) {
            headers.eachWithIndex { heading, i ->
                // write each heading with associated content
                "${heading}"(dataRow[i])
            }
        }
    }
}

Writes the following XML to stdout:

<root>
  <entry id='1'>
    <string>hello world</string>
    <float1>1.0</float1>
    <float2>3.3</float2>
    <integer>4</integer>
  </entry>
  <entry id='2'>
    <string>goodbye world</string>
    <float1>1e9</float1>
    <float2>-3.3</float2>
    <integer>45</integer>
  </entry>
  <entry id='3'>
    <string>hello again</string>
    <float1>-1</float1>
    <float2>23.33</float2>
    <integer>456</integer>
  </entry>
  <entry id='4'>
    <string>hello world 3</string>
    <float1>1.40</float1>
    <float2>34.83</float2>
    <integer>4999</integer>
  </entry>
  <entry id='5'>
    <string>hello 2 world</string>
    <float1>9981.05</float1>
    <float2>43.33</float2>
    <integer>444</integer>
  </entry>
</root>

However, the code does very simple parsing (not taking into account quoted or escaped commas) and it does not account for possible absent data.

I have an opensource framework for working with CSV and flat files in general. Maybe it's worth looking: JFileHelpers.

With that toolkit you can write code using beans, like:

@FixedLengthRecord()
public class Customer {
    @FieldFixedLength(4)
    public Integer custId;

    @FieldAlign(alignMode=AlignMode.Right)
    @FieldFixedLength(20)
    public String name;

    @FieldFixedLength(3)
    public Integer rating;

    @FieldTrim(trimMode=TrimMode.Right)
    @FieldFixedLength(10)
    @FieldConverter(converter = ConverterKind.Date, 
    format = "dd-MM-yyyy")
    public Date addedDate;

    @FieldFixedLength(3)
    @FieldOptional
    public String stockSimbol;  
}

and then just parse your text files using:

FileHelperEngine<Customer> engine = 
    new FileHelperEngine<Customer>(Customer.class); 
List<Customer> customers = 
    new ArrayList<Customer>();

customers = engine.readResource(
    "/samples/customers-fixed.txt");

And you'll have a collection of parsed objects.

Hope that helps!

This solution does not need any CSV or XML libraries and, I know, it does not handle any illegal characters and encoding issues, but you might be interested in it as well, provided your CSV input does not break the above mentioned rules.

Attention: You should not use this code unless you know what you do or don't have the chance to use a further library (possible in some bureaucratic projects)... Use a StringBuffer for older Runtime Environments...

So here we go:

BufferedReader reader = new BufferedReader(new InputStreamReader(
        Csv2Xml.class.getResourceAsStream("test.csv")));
StringBuilder xml = new StringBuilder();
String lineBreak = System.getProperty("line.separator");
String line = null;
List<String> headers = new ArrayList<String>();
boolean isHeader = true;
int count = 0;
int entryCount = 1;
xml.append("<root>");
xml.append(lineBreak);
while ((line = reader.readLine()) != null) {
    StringTokenizer tokenizer = new StringTokenizer(line, ",");
    if (isHeader) {
        isHeader = false;
        while (tokenizer.hasMoreTokens()) {
            headers.add(tokenizer.nextToken());
        }
    } else {
        count = 0;
        xml.append("\t<entry id=\"");
        xml.append(entryCount);
        xml.append("\">");
        xml.append(lineBreak);
        while (tokenizer.hasMoreTokens()) {
            xml.append("\t\t<");
            xml.append(headers.get(count));
            xml.append(">");
            xml.append(tokenizer.nextToken());
            xml.append("</");
            xml.append(headers.get(count));
            xml.append(">");
            xml.append(lineBreak);
            count++;
        }
        xml.append("\t</entry>");
        xml.append(lineBreak);
        entryCount++;
    }
}
xml.append("</root>");
System.out.println(xml.toString());

The input test.csv (stolen from another answer on this page):

string,float1,float2,integer
hello world,1.0,3.3,4
goodbye world,1e9,-3.3,45
hello again,-1,23.33,456
hello world 3,1.40,34.83,4999
hello 2 world,9981.05,43.33,444

The resulting output:

<root>
    <entry id="1">
        <string>hello world</string>
        <float1>1.0</float1>
        <float2>3.3</float2>
        <integer>4</integer>
    </entry>
    <entry id="2">
        <string>goodbye world</string>
        <float1>1e9</float1>
        <float2>-3.3</float2>
        <integer>45</integer>
    </entry>
    <entry id="3">
        <string>hello again</string>
        <float1>-1</float1>
        <float2>23.33</float2>
        <integer>456</integer>
    </entry>
    <entry id="4">
        <string>hello world 3</string>
        <float1>1.40</float1>
        <float2>34.83</float2>
        <integer>4999</integer>
    </entry>
    <entry id="5">
        <string>hello 2 world</string>
        <float1>9981.05</float1>
        <float2>43.33</float2>
        <integer>444</integer>
    </entry>
</root>

The big difference is that JSefa brings in is that it can serialize your java objects to CSV/XML/etc files and can deserialize back to java objects. And it's driven by annotations which gives you lot of control over the output.

JFileHelpers also looks interesting.

I don't understand why you would want to do this. It sounds almost like cargo cult coding.

Converting a CSV file to XML doesn't add any value. Your program is already reading the CSV file, so arguing that you need XML doesn't work.

On the other hand, reading the CSV file, doing something with the values, and then serializing to XML does make sense (well, as much as using XML can make sense... ;)) but you would supposedly already have a means of serializing to XML.

You can do this exceptionally easily using Groovy, and the code is very readable.

Basically, the text variable will be written to contacts.xml for each line in the contactData.csv, and the fields array contains each column.

def file1 = new File('c:\\temp\\ContactData.csv')
def file2 = new File('c:\\temp\\contacts.xml')

def reader = new FileReader(file1)
def writer = new FileWriter(file2)

reader.transformLine(writer) { line ->
    fields =  line.split(',')

    text = """<CLIENTS>
    <firstname> ${fields[2]} </firstname>
    <surname> ${fields[1]} </surname>
    <email> ${fields[9]} </email>
    <employeenumber> password </employeenumber>
    <title> ${fields[4]} </title>
    <phone> ${fields[3]} </phone>
    </CLIENTS>"""
}

You could use XSLT. Google it and you will find a few examples e.g. CSV to XML If you use XSLT you can then convert the XML to whatever format you want.

There is also good library ServingXML by Daniel Parker, which is able to convert almost any plain text format to XML and back.

The example for your case can be found here: It uses heading of field in CSV file as the XML element name.

There is nothing I know of that can do this without you at least writing a little bit of code... You will need 2 separate library:

  • A CSV Parser Framework
  • An XML Serialization Framework

The CSV parser I would recommend (unless you want to have a little bit of fun to write your own CSV Parser) is OpenCSV (A SourceForge Project for parsing CSV Data)

The XML Serialization Framework should be something that can scale in case you want to transform large (or huge) CSV file to XML: My recommendation is the Sun Java Streaming XML Parser Framework (See here) which allows pull-parsing AND serialization.

As far as I know, there's no ready-made library to do this for you, but producing a tool capable of translating from CSV to XML should only require you to write a crude CSV parser and hook up JDOM (or your XML Java library of choice) with some glue code.

Jackson processor family has backends for multiple data formats, not just JSON. This includes both XML (https://github.com/FasterXML/jackson-dataformat-xml) and CSV (https://github.com/FasterXML/jackson-dataformat-csv/) backends.

Conversion would rely on reading input with CSV backend, write using XML backend. This is easiest to do if you have (or can define) a POJO for per-row (CSV) entries. This is not a strict requirement, as content from CSV may be read "untyped" as well (a sequence of String arrays), but requires bit more work on XML output.

For XML side, you would need a wrapper root object to contain array or List of objects to serialize.

This may be too basic or limited of a solution, but couldn't you do a String.split() on each line of the file, remembering the result array of the first line to generate the XML, and just spit each line's array data out with the proper XML elements padding each iteration of a loop?

I had the same problem and needed an application to convert a CSV file to a XML file for one of my projects, but didn't find anything free and good enough on the net, so I coded my own Java Swing CSVtoXML application.

It's available from my website HERE. Hope it will help you.

If not, you can easily code your own like I did; The source code is inside the jar file so modify it as you need if it doesn't fill your requirement.

For the CSV Part, you may use my little open source library

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top