Question

I am having difficulty getting help from the Mallet dev list, so I am trying here.

I have an InstanceList with a target alphabet of {A, B, C}, and for another analysis I need to change the target alphabet to {A, NOT_A}.

So far, I have the following code (adapted from other Mallet source code) but I keep getting:

Alphabets don't match: Instance: [5976, null], InstanceList: [5976, 2]

...
InstanceList iListCopy = (InstanceList) instances.clone();

Alphabet blank = new Alphabet();
Alphabet newAlpha = new Alphabet();

//A and NOT_A cannot be found in alphabet, so add them.
newAlpha.lookupIndex("A", true);
newAlpha.lookupIndex("NOT_A", true);

Noop pipe = new Noop(blank, newAlpha);
InstanceList newIList = new InstanceList(pipe);

//iterate through each instance and change the target based on the original value.
for (int i = 0; i < iListCopy.size(); i++) {
   Instance inst = iListCopy.get(i);

   FeatureVector original = (FeatureVector) inst.getData();
   Instance newInst = pipe.instanceFrom(
           new Instance(original, newAlpha, inst.getName(), inst.getSource()));
   if (inst.getLabeling().toString().equals("A")) {
       newInst.setTarget("A");
   } else {
       newInst.setTarget("NOT_A");
   }
   newIList.add(newInst);   //FAILS with "Alphabets do not match."
}
...

Does anyone have any suggestions on how I can change the target alphabet from {A, B, C} to {A, NOT_A}?


Solution

The target should be a Label, not a String. Try this:

..
newInst.setTarget(((LabelAlphabet) newIList.getTargetAlphabet()).lookupLabel("A"));
..
newInst.setTarget(((LabelAlphabet) newIList.getTargetAlphabet()).lookupLabel("NOT_A"));
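
To illustrate why a Label helps where a String does not, here is a minimal sketch in plain Java. It does not use Mallet: ToyLabelAlphabet, ToyLabel and TargetRemapSketch are hypothetical stand-ins written only for this example, and the matching check is an assumption about the general idea, not Mallet's actual implementation. The point it shows is that a label object carries a reference to the alphabet it came from, while a bare String carries none, which is one plausible reading of the `null` in the error message above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a label alphabet (NOT a Mallet class).
final class ToyLabelAlphabet {
    private final Map<String, ToyLabel> labels = new LinkedHashMap<>();

    // Returns the label for this entry, creating it on first use.
    ToyLabel lookupLabel(String entry) {
        ToyLabel label = labels.get(entry);
        if (label == null) {
            label = new ToyLabel(entry, labels.size(), this);
            labels.put(entry, label);
        }
        return label;
    }

    int size() {
        return labels.size();
    }
}

// Hypothetical stand-in for a label: it remembers which alphabet owns it.
final class ToyLabel {
    final String entry;
    final int index;
    final ToyLabelAlphabet alphabet;

    ToyLabel(String entry, int index, ToyLabelAlphabet alphabet) {
        this.entry = entry;
        this.index = index;
        this.alphabet = alphabet;
    }

    @Override
    public String toString() {
        return entry;
    }
}

final class TargetRemapSketch {
    // Collapses every target other than `keep` into a single "NOT_" + keep label.
    static List<ToyLabel> collapse(List<String> oldTargets, String keep,
                                   ToyLabelAlphabet newAlphabet) {
        ToyLabel positive = newAlphabet.lookupLabel(keep);
        ToyLabel negative = newAlphabet.lookupLabel("NOT_" + keep);
        List<ToyLabel> remapped = new ArrayList<>();
        for (String target : oldTargets) {
            remapped.add(target.equals(keep) ? positive : negative);
        }
        return remapped;
    }

    // Assumed form of an alphabet check: the target must carry the list's alphabet.
    static boolean targetMatches(Object target, ToyLabelAlphabet listAlphabet) {
        return target instanceof ToyLabel
                && ((ToyLabel) target).alphabet == listAlphabet;
    }

    public static void main(String[] args) {
        ToyLabelAlphabet newAlpha = new ToyLabelAlphabet();
        List<ToyLabel> remapped =
                collapse(Arrays.asList("A", "B", "C", "A"), "A", newAlpha);
        System.out.println("remapped targets: " + remapped);
        System.out.println("label matches:  " + targetMatches(remapped.get(0), newAlpha));
        System.out.println("string matches: " + targetMatches("A", newAlpha));
    }
}
```

One thing to check when applying this to the code in the question: the cast to LabelAlphabet can only succeed if the target alphabet was created as a LabelAlphabet. The question builds it with new Alphabet(), so that line would probably need to become new LabelAlphabet() for the cast to work.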
Licensed under: CC-BY-SA with attribution