
Has anyone done this? Is there any documentation on how to use this parser module? I've looked through the code, but it's not clear to me how to actually use the data after it's been parsed.

The file src\main\java\weka\core\converters\ (which I assume is where the Arff parsing happens) has these instructions:

  Typical code for batch usage:

    BufferedReader reader = new BufferedReader(new FileReader("/some/where/file.arff"));
    ArffReader arff = new ArffReader(reader);
    Instances data = arff.getData();
    data.setClassIndex(data.numAttributes() - 1);

But what else can I do with 'data'? How do I access each row and the values in each row?

(By the way, I'm new to Java. If I run this code, is there some kind of introspection I could do on data to see what it offers? That's what I would do in Python.)

(I'm also open to suggestions for a simpler open source Arff parser to use in my project if one exists.)



It looks to me like your answer lies in the Instances class, since that is where the data is stored.

I would look up the API of the Instances class, either by locating or generating its Javadoc or by simply reading its source. The methods of this class should let you work with the data that has been loaded from the ARFF file.
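
On the introspection question: the closest thing Java has to Python's dir() is reflection. The following is a minimal sketch, not anything Weka-specific. ReflectionSketch and publicMethodNames are hypothetical names chosen for illustration, and ArrayList is only a stand-in so the example runs without Weka on the classpath.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

class ReflectionSketch {

    // Returns the sorted, de-duplicated names of all public methods of a class,
    // including the ones it inherits.
    static List<String> publicMethodNames(Class<?> type) {
        TreeSet<String> names = new TreeSet<>();
        for (Method m : type.getMethods()) {
            names.add(m.getName());
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        // ArrayList is a stand-in; with Weka available you would pass
        // data.getClass() for your loaded Instances object instead.
        Object target = new ArrayList<String>();
        for (String name : publicMethodNames(target.getClass())) {
            System.out.println(name);
        }
    }
}
```

This only lists names; the Javadoc is still the better place to learn what each method does and which arguments it takes.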


You can use Weka from Python, and get introspection. I've been successfully using Weka from JRuby to do the same thing. Google "Weka documentation" to find the page that links to the API for the stable and development versions. I don't have enough reputation to put a second link in my answer :)

The Weka parser is closely tied to Weka's internal data model, Instances.

The ARFF format is not that hard to parse, so you might be better off writing a custom parser that directly produces your desired data representation.
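
To give an idea of how small such a custom parser can be, here is a rough sketch. It makes several assumptions: SimpleArff and its fields are hypothetical names, values are kept as plain strings, and only the simple dense layout is handled (no quoted values containing commas or spaces, no sparse rows, no typed conversion). Treat it as a starting point, not a complete ARFF implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class SimpleArff {
    final List<String> attributeNames = new ArrayList<>();
    final List<String[]> rows = new ArrayList<>();

    // Parses the lines of an ARFF file. Assumes unquoted attribute names and
    // unquoted, comma-separated values; anything fancier is not supported.
    static SimpleArff parse(List<String> lines) {
        SimpleArff result = new SimpleArff();
        boolean inData = false;
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("%")) {
                continue; // skip blank lines and comments
            }
            if (!inData) {
                String lower = line.toLowerCase(Locale.ROOT);
                if (lower.startsWith("@attribute")) {
                    // "@attribute <name> <type>": keep only the name
                    String[] parts = line.split("\\s+", 3);
                    if (parts.length < 2) {
                        throw new IllegalArgumentException("Malformed attribute line: " + line);
                    }
                    result.attributeNames.add(parts[1]);
                } else if (lower.startsWith("@data")) {
                    inData = true;
                }
                // "@relation" and any other header line is ignored here
            } else {
                // -1 keeps trailing empty fields instead of dropping them
                String[] values = line.split(",", -1);
                for (int i = 0; i < values.length; i++) {
                    values[i] = values[i].trim();
                }
                result.rows.add(values);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "% tiny made-up example",
                "@relation weather",
                "@attribute temperature numeric",
                "@attribute play {yes,no}",
                "@data",
                "21.5,yes",
                "30.0,no");
        SimpleArff parsed = parse(sample);
        for (String[] row : parsed.rows) {
            for (int i = 0; i < row.length; i++) {
                System.out.print(parsed.attributeNames.get(i) + "=" + row[i] + " ");
            }
            System.out.println();
        }
    }
}
```

For a real file you could pass in the result of java.nio.file.Files.readAllLines. Missing values ("?") come through as the literal string, so the caller decides how to treat them.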

Once you have the Instances object data, you can use it like this:

data.get(index)            // get a single instance (data.instance(index) also works)
data.enumerateInstances()  // returns an enumeration of all instances in the dataset

You can see all the methods at: Instances JavaDoc

I used something like this:

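import java.io.File;
import java.io.IOException;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class Main {
    private static final String ARFF_FILE_PATH = "YOUR_ARFF_FILE_PATH";

    public static void main(String[] args) throws IOException {
        ArffLoader arffLoader = new ArffLoader();

        File datasetFile = new File(ARFF_FILE_PATH);
        // Point the loader at the file; without this, getDataSet() has no source to read.
        arffLoader.setFile(datasetFile);

        Instances dataInstances = arffLoader.getDataSet();

        for (Instance inst : dataInstances) {
            System.out.println("Instance:" + inst);
        }
    }
}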

import java.io.File;
import java.io.IOException;

import weka.core.Instance;
import weka.core.Instances;
import weka.core.converters.ArffLoader;

public class assign3 {
    public static void main(String[] args) throws IOException {
        ArffLoader arffloader = new ArffLoader();
        File filedata = new File("/home/cse611/Downloads/iris.arff");
        // Tell the loader which file to read before asking for the data set.
        arffloader.setFile(filedata);

        Instances data = arffloader.getDataSet();
        for (Instance inst : data) {
            System.out.println("Instance:" + inst);
        }
    }
}

Licensed under: CC-BY-SA with attribution