Question

I am trying to use the Weka GUI to classify some textual data.

I am using the StringToWordVector filter, with the attribute indices option left at its default value of first-last.

However, when I change it to something like 1, 500-last, it gives me an "invalid range list" error.

Initially my ARFF file has only 2 attributes:

class
text

Is there anything I am doing wrong?

I am pretty sure there are a lot of words in the text file, and when I run the filter with the default first-last setting it gives me a full 10,000 attributes.

Solution

The attribute indices option takes the index (or indices) of the attributes whose values you want to convert to a word vector. It refers to the attributes of the input data, not to the words that the filter produces. You have two attributes: class with index 1 and text with index 2. Setting first-last selects both; the filter very likely does nothing with class, since it is usually a single value, and builds a word vector from the text attribute.

To cut to the chase, your only options in this case are 2 or first-last, and the result will be the same either way. 500 is out of range because you have only 2 attributes.
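To make the indexing rule concrete, here is a minimal sketch in plain Java of how a 1-based range list could be checked against the number of input attributes. It is only an illustration: the class RangeListSketch and its methods are hypothetical names, not Weka's own parser, and rejecting reversed ranges is a simplification chosen for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-in for 1-based range lists such as "first-last" or "1,2".
// Hypothetical helper, NOT Weka's implementation.
class RangeListSketch {

    // Resolves one token ("first", "last", or a positive integer) to a 1-based index.
    static int resolveIndex(String token, int numAttributes) {
        String t = token.trim();
        int index;
        if (t.equals("first")) {
            index = 1;
        } else if (t.equals("last")) {
            index = numAttributes;
        } else {
            try {
                index = Integer.parseInt(t);
            } catch (NumberFormatException e) {
                throw new IllegalArgumentException("Invalid range list: bad index '" + t + "'");
            }
        }
        // An index must point at an attribute that exists in the input data.
        if (index < 1 || index > numAttributes) {
            throw new IllegalArgumentException(
                "Invalid range list: index " + index + " is outside 1-" + numAttributes);
        }
        return index;
    }

    // Expands a comma-separated range list into the 1-based indices it selects.
    static List<Integer> resolve(String rangeList, int numAttributes) {
        List<Integer> selected = new ArrayList<>();
        for (String part : rangeList.split(",")) {
            String[] ends = part.trim().split("-");
            if (ends.length < 1 || ends.length > 2) {
                throw new IllegalArgumentException("Invalid range list: '" + part + "'");
            }
            int from = resolveIndex(ends[0], numAttributes);
            int to = ends.length == 2 ? resolveIndex(ends[1], numAttributes) : from;
            if (from > to) {
                // Simplification for this sketch: reversed ranges are rejected.
                throw new IllegalArgumentException("Invalid range list: '" + part + "'");
            }
            for (int i = from; i <= to; i++) {
                selected.add(i);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        int numAttributes = 2; // the two input attributes: class and text
        System.out.println(resolve("first-last", numAttributes));
        System.out.println(resolve("2", numAttributes));
        try {
            resolve("1, 500-last", numAttributes);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With two input attributes, first-last and 2 both resolve, while 1, 500-last fails because 500 does not refer to any existing attribute.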

P.S. If you want to keep only a range of words from the resulting word vector, you can apply the Remove filter afterwards and specify the indices of the columns (words) you want to remove.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow