Question

I am trying to use the Weka GUI to classify some textual data.

I am using the StringToWordVector filter, with its attribute indices option left at the default value of first-last.

However, when I change it to something like 1, 500-last, it gives me an "invalid range list" error.

Initially my ARFF file has only 2 attributes:

class
text

Is there anything I am doing wrong?

I am pretty sure there are a lot of words in the text file, and when I run the filter with the default first-last it gives me a full 10,000 attributes.

Solution

The attribute indices option takes the index, or indices, of the attributes whose values you wish to convert to a word vector. You have two attributes: class with index 1 and text with index 2. Setting first-last selects both; it very likely does nothing with class, since that usually holds a single value, and builds a word vector from the text attribute.

To cut to the chase, your only options in this case are 2 or first-last, and the result will be the same. 500 is out of range since you have only 2 attributes.
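
To see why 500 cannot work here, it may help to sketch how a 1-based range list could be checked against the number of attributes. This is only a simplified illustration in Python, not Weka's actual parser; the function name resolve_indices and its error messages are made up for this example.

```python
def resolve_indices(range_list, num_attributes):
    """Expand a 1-based range list such as "1,3-last" into sorted 0-based indices.

    Simplified illustration only; this is NOT Weka's real implementation.
    """
    def to_index(token):
        token = token.strip().lower()
        if token == "first":
            return 1
        if token == "last":
            return num_attributes
        return int(token)  # raises ValueError for non-numeric tokens

    selected = set()
    for part in range_list.split(","):
        bounds = part.split("-")
        if len(bounds) > 2:
            raise ValueError(f"invalid range: {part!r}")
        start = to_index(bounds[0])
        end = to_index(bounds[-1])
        # Every bound must refer to an attribute that exists in the input data.
        for bound in (start, end):
            if not 1 <= bound <= num_attributes:
                raise ValueError(
                    f"index {bound} is out of range for {num_attributes} attributes"
                )
        if start > end:
            raise ValueError(f"invalid range: {part!r}")
        selected.update(range(start - 1, end))
    return sorted(selected)


print(resolve_indices("first-last", 2))  # [0, 1]
print(resolve_indices("2", 2))           # [1]
try:
    resolve_indices("1, 500-last", 2)
except ValueError as err:
    print("rejected:", err)
```

With only two input attributes, the only bounds that pass the check are 1, 2, first and last, which is why 2 and first-last are the usable settings.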

PS. If you want to use only a range of words from the resulting word vector, you can apply the Remove filter and specify the indices of the columns (words) you wish to remove.
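
The point of the PS is that an index like 500 only makes sense after the word vector exists, because it then refers to a column of the new data set. A rough Python sketch of that second step, dropping a 1-based column range from one row, might look like the following; remove_columns is a hypothetical helper used for illustration, not the Weka Remove filter itself, and the sample row is invented.

```python
def remove_columns(row, first, last):
    """Drop the 1-based, inclusive column range first..last from one data row."""
    if not 1 <= first <= last <= len(row):
        raise ValueError(
            f"range {first}-{last} is invalid for a row with {len(row)} columns"
        )
    return row[:first - 1] + row[last:]


# Invented example row: a class label followed by five word counts.
row = ["spam", 1, 0, 2, 0, 1]

# Remove columns 4 through 6, keeping the class and the first two words.
print(remove_columns(row, 4, 6))  # ['spam', 1, 0]
```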

Licensed under: CC-BY-SA with attribution