Question

I am attempting to learn to use RapidMiner, and my boss wants me to perform a market basket analysis on a set of data. But when I use the given template, I get the following error:

Regular Attributes must be of type binomial.
The error is raised within the FP-Growth operator.

I have a customerID column (numbers only), a productName column (letters), and a Product Quantity column (numbers).

As I am a newbie with RM, I have no idea what is wrong.

Any input would be greatly appreciated.
Thank you in advance.


Solution

FP-Growth needs an ExampleSet as input in which all regular attributes are binominal, which here means boolean. Some binominal attributes have a predefined positive (true) and negative (false) value; otherwise, the positive value can be specified as a parameter of the FP-Growth operator. Furthermore, every example represents a transaction (one customer's basket) and every attribute represents an item in your complete product line; the value of the attribute indicates whether the item is in the basket or not.

To find association rules, you first need to find the frequent itemsets. This is the job of the FP-Growth operator. Your job is to transform the ExampleSet into a "transaction database", i.e. a table in which all attributes are binominal.
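To make "frequent itemsets" concrete, here is a minimal Python sketch. It does not use RapidMiner and it does not implement the FP-tree algorithm; it simply counts every candidate itemset exhaustively, which is only practical for tiny data. The product names, the `transactions` table and the `frequent_itemsets` helper are made up for illustration.

```python
from itertools import combinations

# Hypothetical transaction database: one row per basket, one boolean per product.
transactions = [
    {"bread": True,  "milk": True,  "beer": False},
    {"bread": True,  "milk": False, "beer": True},
    {"bread": True,  "milk": True,  "beer": True},
    {"bread": False, "milk": True,  "beer": False},
]

def frequent_itemsets(rows, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support.

    Exhaustive counting for illustration only, not the FP-Growth algorithm.
    """
    # Keep only the items that are present (True) in each basket.
    baskets = [frozenset(item for item, present in row.items() if present)
               for row in rows]
    items = sorted(set().union(*baskets))
    result = {}
    for size in range(1, len(items) + 1):
        for candidate in combinations(items, size):
            # Support = fraction of baskets that contain every item of the candidate.
            count = sum(1 for basket in baskets if basket.issuperset(candidate))
            support = count / len(baskets)
            if support >= min_support:
                result[candidate] = support
    return result

itemsets = frequent_itemsets(transactions, min_support=0.5)
for itemset, support in itemsets.items():
    print(itemset, support)
```

With this toy data, a minimum support of 0.5 keeps five itemsets, while raising it to 0.9 leaves none.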

Unfortunately, the template in RapidMiner is a little buggy. To fix the process, you have to add two operators before the FP-Growth operator. First, replace all missing values with 0 (operator 'Replace Missing Values', with the parameter 'default' set to zero). After the replacement, add the 'Numerical to Binominal' operator. Its default parameter values are sufficient to transform all attributes to binominal ones. The process should now run. Please note that you need a sufficiently small min-support to find frequent itemsets.
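The effect of the two preprocessing steps can be sketched outside RapidMiner in plain Python. This is only an illustration of the idea, under the assumption that 'Numerical to Binominal' with its default settings maps the value 0 to false and every other value to true. The `quantities` table and both helper functions are hypothetical names, not RapidMiner APIs.

```python
# Hypothetical quantity table: one row per customer, one column per product.
# None stands for a missing value (the customer never bought that product).
quantities = [
    {"bread": 2,    "milk": None, "beer": 1},
    {"bread": None, "milk": 3,    "beer": None},
]

def replace_missing(rows, default=0):
    """Mimic 'Replace Missing Values' with the default set to zero."""
    return [{name: (default if value is None else value)
             for name, value in row.items()}
            for row in rows]

def numerical_to_binominal(rows, low=0.0, high=0.0):
    """Assumed behaviour: values inside [low, high] become False, all others True."""
    return [{name: not (low <= value <= high) for name, value in row.items()}
            for row in rows]

baskets = numerical_to_binominal(replace_missing(quantities))
print(baskets)
```

The order matters in this sketch: the missing values have to be replaced first, because comparing `None` with a number would raise an error.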

OTHER TIPS

You will likely need to transform (fold) your data into a format with one row per customer (or, more precisely, per transaction) and one quantity column (maybe binary) for each product. Use sparse vectors to avoid storing all the 0s.
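As a rough Python illustration of this folding step (not a RapidMiner process), the sketch below turns rows in the question's layout into one basket per customer. The sample rows and the helper names `fold_to_baskets` and `to_dense` are invented for the example. A set per customer serves as the sparse form, since only the products actually bought are stored.

```python
from collections import defaultdict

# Hypothetical rows in the question's layout: (customerID, productName, quantity).
purchases = [
    (1001, "bread", 2),
    (1001, "milk", 1),
    (1002, "beer", 6),
    (1002, "bread", 1),
    (1003, "milk", 0),
]

def fold_to_baskets(rows):
    """Sparse form: map each customer to the set of products actually bought."""
    baskets = defaultdict(set)
    for customer_id, product, quantity in rows:
        if quantity > 0:  # a quantity of 0 means the product is not in the basket
            baskets[customer_id].add(product)
    return dict(baskets)

def to_dense(baskets):
    """Dense form: one boolean column per product, as FP-Growth expects."""
    products = sorted(set().union(*baskets.values()))
    return {customer_id: {product: product in items for product in products}
            for customer_id, items in baskets.items()}

sparse = fold_to_baskets(purchases)
dense = to_dense(sparse)
print(sparse)
print(dense)
```

Note that in this sketch a customer whose quantities are all 0 ends up with no basket at all, which may or may not be what you want.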

Licensed under: CC-BY-SA with attribution