Question

Assume an m x n binary matrix A records the purchases of m customers in a market where n items are available: A(i,j) is 1 if the jth item is present in the ith purchase and 0 if it is absent.

A = [ 1 0 0 1 0 1 ... 0;
      0 1 0 1 0 0 ... 1;
      . . .
      1 0 0 1 0 0 ... 0 ]

I would like to find how many customers have bought each of the item sets {1}, {1,2}, {1,2,3}, and so on.

How would I do this efficiently in MATLAB? I have not been able to get started on this.


Solution

The number of customers who have bought the item sets {1}, {1,2}, {1,2,3}, and so on can be found with this code:

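A = round(rand(5,10))   % random binary purchase matrix: 5 customers, 10 items

count = zeros(1,size(A,2));
count(1) = sum(A(:,1));   % customers who bought item 1
for k = 2:size(A,2)
    % count the rows whose first k entries are all ones
    count(k) = nnz(ismember(A(:,1:k),ones(1,k),'rows'));
end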

Example run:

A =

     0     1     1     0     1     1     1     0     1     1
     0     1     0     1     1     0     1     0     1     1
     1     1     1     0     1     1     0     0     0     0
     0     0     1     0     1     1     0     0     1     1
     1     0     1     0     0     1     1     1     1     0


count =

     2     1     1     0     0     0     0     0     0     0

Thus, for this example, we have 5 customers and 10 items. The leading counts 2, 1, 1 and 0 are the numbers of customers who bought item 1, items 1 and 2, items 1 to 3, and items 1 to 4, respectively.

EDIT 1: If you want the count of customers for every possible combination of items, try this code:

%%// Data
A = round(rand(5,4));

%%// Get the counts
combs = cell(1,size(A,2));
combs_counts = cell(1,size(A,2));
for k=1:size(A,2)

    c1 = combnk(1:size(A,2),k);   %// all k-item combinations (combnk needs the Statistics Toolbox)
    combs(k) = {c1};

    counts = zeros(size(c1,1),1);
    for k2 = 1:size(c1,1)
        m1 = A(:,c1(k2,:));
        counts(k2) = nnz(ismember(m1,repmat(1,1,k),'rows'));
    end

    combs_counts(k) = {counts};
end

%% Testing: Let us check for all possible combinations with two items by printing the values
item_count = 2;

A
cell2mat(combs(item_count))
cell2mat(combs_counts(item_count))

A sample run gives

A =

     1     1     1     0
     1     0     1     0
     0     0     0     1
     1     0     1     0
     1     0     1     0


ans =

     3     4
     2     4
     2     3
     1     4
     1     3
     1     2


ans =

     0
     0
     1
     0
     4
     1

Thus, with 2 as the number of items, there are 6 possible combinations, which are listed in the cell array combs, and the count of customers for each is listed in the cell array combs_counts.
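
If only one specific combination is of interest, its count can be read off directly without enumerating every combination. A minimal sketch, assuming the sample matrix from the run above; the variable names items and n_customers are only illustrative:

```matlab
% Sample purchase matrix from the run above (5 customers, 4 items)
A = [1 1 1 0;
     1 0 1 0;
     0 0 0 1;
     1 0 1 0;
     1 0 1 0];

items = [1 3];                      % item set to look up (illustrative)

% all(...,2) is true for customers who bought every item in the set
n_customers = nnz(all(A(:,items),2))
% n_customers = 4, the same value listed for combination [1 3] above
```

This is the same test that the ismember(...,'rows') call performs inside the loop, written for a single item set.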

OTHER TIPS

    n=size(A,2);               % number of items
    ItemsTot=sum(A,1)          % total purchases of each item

    for k=2:n
        itemsNum=combnk(1:n,k) % possible combinations of k items
        Cnline=size(itemsNum,1);    % = nchoosek(n,k);
        purchaseTot=sum(ItemsTot(itemsNum),2)
    end

EDIT: I thought you wanted the total number of items 1 and 2 bought, not the number of customers who bought both 1 and 2 :p

The code here does what you want, but it's pretty complicated to understand...

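    %% data
    m=5;                % m customers
    n=4;                % n items
    A = round(rand(m,n)) % purchase matrix (filled randomly)

    % one row per non-empty item combination (de2bi needs the Communications Toolbox)
    ItemComb=de2bi((1:2^n-1))

    % divide each row by its number of items, so that a customer who
    % bought every item of a combination scores 1 for it
    B=ItemComb./(ItemComb*ones(n,n));

    Result=sum(A*B'==1,1)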
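
A variant of the same idea that stays in integer arithmetic and uses only base MATLAB functions may be easier to follow. This is a sketch, not the answerer's code: it swaps de2bi for dec2bin (flipped so that column j corresponds to item j) and compares against the size of each combination instead of normalising, so it does not rely on a floating-point sum being exactly 1. The sample matrix is the one from the earlier run, and the names bought and needed are illustrative.

```matlab
% Sample purchase matrix (5 customers, 4 items)
A = [1 1 1 0;
     1 0 1 0;
     0 0 0 1;
     1 0 1 0;
     1 0 1 0];
n = size(A,2);

% Row r is the binary code of r, least significant bit first,
% so ItemComb(r,j) == 1 means item j belongs to combination r
ItemComb = fliplr(dec2bin(1:2^n-1) - '0');

% A*ItemComb' counts, per customer, how many items of each combination
% were bought; a customer qualifies when that equals the combination size
bought = A*ItemComb';
needed = sum(ItemComb,2)';
Result = sum(bsxfun(@eq, bought, needed), 1);

Result(5)   % combination 5 = binary 101 = items {1,3}
% ans = 4
```

As with the original, the number of combinations grows as 2^n - 1, so this is only practical for a small number of items.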

For the nested item sets {1}, {1,2}, {1,2,3} and so on, it can be solved very simply as:

sum(cumprod(A,2),1)
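
To see why this works: cumprod(A,2) takes the running product along each row, so column k stays 1 only while the customer has bought every one of items 1 to k. A small check, using the matrix from the example run in the Solution above (the variable names P and count are only for illustration):

```matlab
% Purchase matrix copied from the example run above (5 customers, 10 items)
A = [0 1 1 0 1 1 1 0 1 1;
     0 1 0 1 1 0 1 0 1 1;
     1 1 1 0 1 1 0 0 0 0;
     0 0 1 0 1 1 0 0 1 1;
     1 0 1 0 0 1 1 1 1 0];

% Running product along each row: entry (i,k) is 1 only if customer i
% bought all of items 1..k
P = cumprod(A,2);

% Column sums count the customers per nested item set; the explicit
% dimension 1 keeps this correct even when A has a single row
count = sum(P,1)
% count = 2 1 1 0 0 0 0 0 0 0, matching the loop-based result
```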
Licensed under: CC-BY-SA with attribution