The StringToWordVector filter cannot be reversed. However, you have at least two possibilities:
- If you just want to see or show the original strings that are in each cluster, you can add an ID attribute, ensure it is not used during clustering (to avoid unexpected behavior), then use the IDs to recover the text from the original strings (ARFF file).
- If you want to show some meaningful summary of the contents of each cluster, you can just output the most frequent or highest-weighted words in each cluster. This is a rather common approach when clustering texts.
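
To make both ideas concrete, here is a minimal, library-independent sketch in Python. It does not use the Weka API. The `documents` and `assignments` dictionaries are made-up stand-ins for the original strings keyed by ID and for the cluster assigned to each ID, and the helper names `texts_by_cluster` and `top_words` are hypothetical. It uses raw word counts as the notion of "frequent"; a weighted score such as TF-IDF could be swapped in.

```python
from collections import Counter, defaultdict

# Hypothetical data: original strings keyed by the ID attribute,
# as they might be read back from the original ARFF file.
documents = {
    1: "cheap flights to paris",
    2: "paris hotel deals",
    3: "python list comprehension",
    4: "python dict to list",
}

# Hypothetical clustering result: ID -> cluster label.
# The ID itself was not used as a feature during clustering.
assignments = {1: 0, 2: 0, 3: 1, 4: 1}


def texts_by_cluster(documents, assignments):
    """Possibility 1: recover the original strings of each cluster via the ID."""
    clusters = defaultdict(list)
    for doc_id, cluster in sorted(assignments.items()):
        clusters[cluster].append(documents[doc_id])
    return dict(clusters)


def top_words(texts, n=3):
    """Possibility 2: summarize a cluster by its n most frequent words."""
    counts = Counter(word for text in texts for word in text.lower().split())
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [word for word, _ in ranked[:n]]


clusters = texts_by_cluster(documents, assignments)
summaries = {label: top_words(texts, n=2) for label, texts in clusters.items()}

for label in sorted(clusters):
    print(label, summaries[label], clusters[label])
```

In practice you would probably drop stop words (such as "to") before counting, otherwise they tend to dominate the summaries.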