Question

This is my first question on this forum. I'm building a data-mining application in Java with the WEKA API. I first run a pre-processing stage, and when I save the ARFF file I would like to add a couple of lines (as comments) specifying the pre-processing tasks that I have applied to the file. The problem is that I don't know how to add comments to an ARFF file through the Java WEKA API. To save the file I use the ArffSaver class like this:

    try {
        ArffSaver saver = new ArffSaver();
        saver.setInstances(dataPost);
        saver.setFile(arffFile);
        saver.writeBatch();
        return true;
    } catch (IOException ex) {
        Logger.getLogger(Preprocesamiento.class.getName()).log(Level.SEVERE, null, ex);
        return false;
    }

I would be really grateful if someone could give me some ideas. Thanks!


Solution

You should AVOID writing comments in an .arff file, especially when writing it from Java. These files are very "parser-sensitive". The Weka API for creating these files is restrictive for this particular reason.

Even so, you can always add your comments manually with the % symbol. That said, I wouldn't recommend writing anything more than instances, attributes and values into an .arff file. ;-)

OTHER TIPS

I don't see a reason not to write comments into the header of an ARFF file. The specification clearly says:

Lines that begin with a % are comments.

So while it is technically valid, it can be difficult if you want to use the ArffSaver#setFile method. This method does a lot of (convenient, but somewhat arbitrary and unspecified) work internally, until it finally calls

    setDestination(new FileOutputStream(m_outputFile));

If going through setFile is not required, the easiest option is to write the comments directly to an OutputStream, which can then simply be set as the destination for the ArffSaver. This can be wrapped in a small helper method, for example, like this:

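    import java.io.BufferedWriter;
    import java.io.IOException;
    import java.io.OutputStream;
    import java.io.OutputStreamWriter;
    import java.util.List;

    import weka.core.Instances;
    import weka.core.converters.ArffSaver;

    // Writes the comment lines first, then lets ArffSaver append the ARFF content.
    static void writeArff(
        Instances instances,
        List<String> commentLines,
        OutputStream outputStream) throws IOException
    {
        ArffSaver saver = new ArffSaver();
        saver.setInstances(instances);
        if (commentLines != null && !commentLines.isEmpty())
        {
            BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(outputStream));
            for (String commentLine : commentLines)
            {
                // Each comment line must start with '%'
                bw.write("% " + commentLine + "\n");
            }
            bw.write("\n");
            // Flush, but do not close: closing the writer would also
            // close outputStream before the saver has written to it
            bw.flush();
        }
        saver.setDestination(outputStream);
        saver.writeBatch();
    }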

When calling it like this

    List<String> comments = Arrays.asList("A comment", "Another one");
    writeArff(instances, comments, outputStream);

then the given comments will be inserted at the top of the ARFF file.
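
If you do want to keep using setFile, one alternative is to let the ArffSaver write the file as usual and prepend the comment lines afterwards. The following is only a minimal sketch using plain java.nio: the class ArffComments and the method prependComments are hypothetical names, not part of the WEKA API, and the code assumes that the file is UTF-8 encoded and small enough to be read into memory. It also splits comments that contain line breaks, so that every physical line starts with a %.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper for illustration; not part of the WEKA API. */
final class ArffComments {

    private ArffComments() {}

    /**
     * Rewrites the given file so that it starts with the comments,
     * each on a "% "-prefixed line, followed by one blank line.
     */
    static void prependComments(Path arffFile, List<String> commentLines)
            throws IOException {
        if (commentLines == null || commentLines.isEmpty()) {
            return; // nothing to add, leave the file untouched
        }
        List<String> output = new ArrayList<>();
        for (String comment : commentLines) {
            // Split embedded line breaks so every physical line starts with '%'
            for (String line : comment.split("\\R", -1)) {
                output.add("% " + line);
            }
        }
        output.add(""); // blank separator line before the ARFF header
        // Assumption: the file is UTF-8 and fits comfortably in memory
        output.addAll(Files.readAllLines(arffFile, StandardCharsets.UTF_8));
        Files.write(arffFile, output, StandardCharsets.UTF_8);
    }
}
```

With the code from the question, this would be called after saver.writeBatch() has returned, for example as ArffComments.prependComments(arffFile.toPath(), comments), assuming arffFile is the java.io.File that was passed to setFile.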

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow