Question

I know how to convert a set of text or web page files into an ARFF file using TextDirectoryLoader.

I want to know how to convert a single text file into an ARFF file.

Any help will be highly appreciated.

Solution

Please be more specific. Anyway:

  • If the text in the file corresponds to a single document (that is, a single instance), all you need to do is replace every newline with the escape code \n so that the full text sits on a single line, then manually format it as an ARFF file with a single text (string) attribute and a single instance.

  • If the text corresponds to several instances (e.g. documents), I suggest writing a script to break it into several files and then applying TextDirectoryLoader. If there is any specific formatting (e.g. instances are enclosed in XML tags), you can either do the same (taking advantage of the XML format) or write a custom Loader class in WEKA that recognizes your format and builds an Instances object.
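Both cases can be sketched in a few lines of Python; this is only an illustration under some assumptions, not an official WEKA tool. The function names, the relation name, the `---` separator line and the `unlabeled` subdirectory are all hypothetical choices. The sketch also assumes that the ARFF reader accepts backslash escapes inside a single-quoted string, and that TextDirectoryLoader treats each subdirectory name as a class label.

```python
from pathlib import Path


def escape_arff_string(text: str) -> str:
    """Return text as a single-line, single-quoted ARFF string value."""
    replacements = [
        ("\\", "\\\\"),  # backslash first, so later escapes are not doubled
        ("'", "\\'"),
        ("\r", "\\r"),
        ("\n", "\\n"),
        ("\t", "\\t"),
    ]
    for raw, escaped in replacements:
        text = text.replace(raw, escaped)
    return "'" + text + "'"


def text_to_arff(text: str, relation: str = "single_document") -> str:
    """Build an ARFF file body with one string attribute and one instance."""
    lines = [
        f"@relation {relation}",
        "",
        "@attribute text string",
        "",
        "@data",
        escape_arff_string(text),
    ]
    return "\n".join(lines) + "\n"


def split_documents(text: str, separator: str = "---") -> list[str]:
    """Split text into documents at lines equal to the (assumed) separator."""
    documents, current = [], []
    for line in text.splitlines():
        if line.strip() == separator:
            documents.append("\n".join(current))
            current = []
        else:
            current.append(line)
    documents.append("\n".join(current))
    # Drop empty chunks, e.g. from a trailing separator.
    return [doc for doc in documents if doc.strip()]


def write_documents(documents: list[str], out_dir: str, label: str = "unlabeled") -> None:
    """Write one file per document into out_dir/label for TextDirectoryLoader."""
    target = Path(out_dir) / label
    target.mkdir(parents=True, exist_ok=True)
    for index, doc in enumerate(documents, start=1):
        (target / f"doc_{index:04d}.txt").write_text(doc, encoding="utf-8")
```

For the single-document case, `Path("out.arff").write_text(text_to_arff(Path("in.txt").read_text(encoding="utf-8")), encoding="utf-8")` would produce the file (both paths are placeholders). For the multi-document case, pass the result of `split_documents` to `write_documents` and point TextDirectoryLoader at the output directory.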

If you post an example, it would be easier to get a more precise suggestion.

Licensed under: CC-BY-SA with attribution