I know how to convert a set of text or web page files into an ARFF file using TextDirectoryLoader.

I want to know how to convert a single text file into an ARFF file.

Any help will be highly appreciated.


Solution

Please be more specific. Anyway:

  • If the text in the file corresponds to a single document (that is, a single instance), then all you need to do is replace every newline with the escape code \n so that the full text sits on a single line, then manually format it as an ARFF file with a single string attribute and a single instance.

  • If the text corresponds to several instances (e.g. documents), then I suggest writing a script that breaks it into several files and applying TextDirectoryLoader to them. If there is any specific formatting (e.g. instances are enclosed in XML tags), you can either do the same (by taking advantage of the XML format), or write a custom Loader class in WEKA that recognizes your format and builds an Instances object.
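As a rough illustration of the single-document case, here is a minimal Python sketch that escapes the text onto one line and wraps it in an ARFF header. The function names (`arff_quote`, `text_to_arff`, `convert_file`), the relation name `documents` and the attribute name `text` are all made up for this example, and the set of escaped characters (backslash, single quote, \r, \n, \t) is my assumption about what is needed for a single-quoted string value, so check the result by loading it in WEKA.

```python
def arff_quote(text):
    """Return text as a single-quoted ARFF string value on one line."""
    escaped = (
        text.replace("\\", "\\\\")  # escape backslashes first
        .replace("'", "\\'")        # then the quote character itself
        .replace("\r", "\\r")
        .replace("\n", "\\n")       # newlines become the two characters \n
        .replace("\t", "\\t")
    )
    return "'" + escaped + "'"


def text_to_arff(documents, relation="documents", attribute="text"):
    """Build ARFF content with one string attribute and one instance per document."""
    lines = [
        "@relation " + relation,
        "",
        "@attribute " + attribute + " string",
        "",
        "@data",
    ]
    lines.extend(arff_quote(doc) for doc in documents)
    return "\n".join(lines) + "\n"


def convert_file(src_path, dst_path):
    """Read one text file as a single instance and write it out as ARFF."""
    # newline="" keeps the original line endings so they are escaped as-is
    with open(src_path, encoding="utf-8", newline="") as src:
        content = src.read()
    with open(dst_path, "w", encoding="utf-8") as dst:
        dst.write(text_to_arff([content]))
```

Calling `convert_file("input.txt", "output.arff")` (both file names are placeholders) writes a file with a single instance. If the file holds several instances and you would rather not go through TextDirectoryLoader, you could split the text yourself and pass the pieces as a list to `text_to_arff`. Note that this is a different route from the one suggested above and gives you no class attribute.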

If you post an example, it would be easier to get a more precise suggestion.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow