Question

I am using RapidMiner to do Naive Bayes text classification. My training set is in an Excel sheet with two columns: the first column is the LABEL and the second is the TEXT.

I used the "Read Excel" operator to read the Excel sheet, and the "Set Role" operator to make sure that the LABEL column takes the role of label and the TEXT column is treated as text. I then used the "Data to Documents" operator and the "Process Documents" operator (tokenizing, stopword filtering, stemming, case transformation, etc.) to process the data. However, when I tried to pass the data to the "Naive Bayes" operator, an error message told me that the data is not labelled and asked me to use the "Set Role" operator. So I added another "Set Role" operator after "Process Documents", but only "text" is available under "attribute name"; the LABEL attribute has disappeared. I have no idea what went wrong.

Solution

If you set a breakpoint before the "Process Documents from Data" operator, you should see one attribute with the role label (probably of type polynominal) and another attribute with the role regular and type text. If that is what you see, the process should work.

Make sure the "add meta information" check box is ticked on the "Process Documents from Data" operator, so that the label is carried through to the resulting word vector.

If it still doesn't work, posting the process XML is the next step.
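
To illustrate why the label goes missing, here is a minimal plain-Python sketch of the idea. It is not RapidMiner code and does not reproduce its operators: the function names, the `keep_label` flag (standing in for the "add meta information" setting), and the two sample rows are all assumptions made up for this example. It only shows that a learner needs the label to travel alongside the word counts.

```python
from collections import Counter


def vectorize(rows, keep_label=True):
    """Turn (label, text) rows into word-count examples.

    keep_label is a hypothetical stand-in for the "add meta information"
    option: when False, only the word counts survive the text processing.
    """
    examples = []
    for label, text in rows:
        example = {"words": Counter(text.lower().split())}
        if keep_label:
            example["label"] = label
        examples.append(example)
    return examples


def train_word_counts(examples):
    """Collect per-label word counts, as a Naive Bayes learner would need.

    Raises ValueError when the examples carry no label, which mirrors the
    "data is not labelled" error described in the question.
    """
    if not all("label" in ex for ex in examples):
        raise ValueError("examples have no label attribute")
    counts = {}
    for ex in examples:
        counts.setdefault(ex["label"], Counter()).update(ex["words"])
    return counts


# Made-up training rows: (LABEL, TEXT)
rows = [("spam", "win money now"), ("ham", "meeting at noon")]

with_label = vectorize(rows, keep_label=True)
without_label = vectorize(rows, keep_label=False)

model = train_word_counts(with_label)  # works: labels were kept

try:
    train_word_counts(without_label)   # fails: labels were dropped
    label_error = False
except ValueError:
    label_error = True
```

In this sketch, adding a role afterwards cannot help in the `keep_label=False` case, because the label value is no longer present in the data at all; it has to be kept during the text processing step.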

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow