Question

I have a situation where I don't know whether Weka classification can be used.

A large number of attributes describe each pricing plan, like this:

@attribute 'plan' {'Free', 'Basic', 'Premium', 'Enterprise'}
@attribute 'atr01' {TRUE, FALSE}
@attribute 'atr02' {TRUE, FALSE}
@attribute 'atr03' {TRUE, FALSE}
@attribute 'atr04' {TRUE, FALSE}
@attribute 'atr05' {TRUE, FALSE}
...
@attribute 'atr60' {TRUE, FALSE}

This list of attributes may grow in the future; we expect to have 120 attributes.

What we need is to provide a form where the user can check true or false for each attribute, and our recommendation system will then select the most appropriate plan for the user based on our training set.

The problem is that our training set contains only 1 row for each plan, like this:

'Free',FALSE,TRUE,TRUE,FALSE...[+many trues and falses]...TRUE
'Basic',TRUE,FALSE,FALSE,FALSE...[+many trues and falses]...TRUE
'Premium',FALSE,FALSE,FALSE,FALSE...[+many trues and falses]...FALSE
'Enterprise',FALSE,TRUE,FALSE,FALSE...[+many trues and falses]...FALSE

The decision should match as many of the user-selected options as possible. I can't use filters because filters can return zero results, and I need at least one result.

I don't know whether this is a machine learning problem or whether Weka can help us.

Thanks.


Solution

You don't have a machine-learning problem, because you do not have multiple different examples of each class to train on.

What you probably want instead is a similarity measure, so that you can score the suitability of the 4 plans. The most popular similarity measure that comes to mind is Euclidean distance. Your attributes form a vector in a Euclidean space. Given the user's vector, you can calculate its distance to the vector of each of the 4 plans and present the "nearest" plan.

See http://en.wikipedia.org/wiki/Euclidean_distance
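
As a rough illustration of the nearest-plan idea, here is a minimal Python sketch. It is not Weka code, and the plan vectors are hypothetical: they are shortened to 4 attributes with made-up values, since the real rows are elided in the question. The names PLANS, euclidean_distance and rank_plans are also just assumptions for the example.

```python
import math

# Hypothetical, shortened plan vectors (4 attributes instead of 60).
# The values are illustrative only, not the real plan definitions.
PLANS = {
    "Free":       [False, True,  True,  False],
    "Basic":      [True,  False, False, False],
    "Premium":    [False, False, False, False],
    "Enterprise": [False, True,  False, False],
}


def euclidean_distance(a, b):
    """Euclidean distance between two equal-length boolean vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    # True/False are treated as 1/0, so each mismatch adds 1 to the sum.
    return math.sqrt(sum((int(x) - int(y)) ** 2 for x, y in zip(a, b)))


def rank_plans(user_choices, plans=PLANS):
    """Return (plan, distance) pairs sorted from nearest to farthest."""
    scored = [
        (name, euclidean_distance(user_choices, vector))
        for name, vector in plans.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical answers from the user's form.
user = [False, True, False, True]
ranking = rank_plans(user)
best_plan, best_distance = ranking[0]
print(best_plan, best_distance)
```

Because some plan is always nearest, this approach never returns zero results, unlike a filter. For TRUE/FALSE attributes the squared Euclidean distance is simply the number of attributes on which the user and the plan disagree, so ranking by distance is the same as ranking by the count of mismatches.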

Licensed under: CC-BY-SA with attribution