Given a cell array of vectors, delete vectors that are subsets of any other vector

StackOverflow https://stackoverflow.com/questions/21919734

  •  14-10-2022

Question

I want to delete every vector in a cell array c that is a subset of another vector in c. Suppose I have six vectors: c{1}=[1 2 3]; c{2}=[2 3 4]; c{3}=[1 2 3 4 5 6]; c{4}=[2 3 4 7]; c{5}=[2 3 7]; c{6}=[4 5 6]; then I want to delete [1 2 3], [2 3 4] and [4 5 6]. I used two for loops to find these subsets, but that is too slow for large datasets. Is there a simpler way to do this?


Solution

The following code removes a vector if it's a subset of any other vector. The approach is very similar to my answer to this other question:

n = numel(c);
[i1, i2] = meshgrid(1:n); % all pairs of cell indices: i1 varies along columns, i2 along rows
issubset = arrayfun(@(k) all(ismember(c{i1(k)}, c{i2(k)})), 1:n^2); % is c{i1(k)} a subset of c{i2(k)}?
issubset = reshape(issubset, n, n) - eye(n); % zero the diagonal: every vector is a subset of itself
c = c(~any(issubset, 1)); % drop each vector that is a subset of some other vector

Note that, in your example, [2 3 7] should also be removed, since it is a subset of [2 3 4 7].
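
To check this on the data from the question, the same steps can be run end to end. This is only a sketch: the names `isSub` and `kept` are mine, and it assumes every cell holds a numeric vector and that no two cells hold the same set of values (identical vectors are subsets of each other, so both copies would be removed).

```matlab
% Example data from the question
c = {[1 2 3], [2 3 4], [1 2 3 4 5 6], [2 3 4 7], [2 3 7], [4 5 6]};

n = numel(c);
[i1, i2] = meshgrid(1:n);               % i1(r,k) = k, i2(r,k) = r

% isSub(r,k) is true when c{k} is a subset of c{r}
isSub = arrayfun(@(k) all(ismember(c{i1(k)}, c{i2(k)})), 1:n^2);
isSub = reshape(isSub, n, n) & ~eye(n); % ignore each vector's match with itself

% Keep the vectors whose column has no true entry
kept = c(~any(isSub, 1));

celldisp(kept)                          % kept is {[1 2 3 4 5 6], [2 3 4 7]}
```

Only c{3} and c{4} survive: [1 2 3], [2 3 4] and [4 5 6] are contained in c{3}, and [2 3 7] is contained in c{4}.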

OTHER TIPS

You could find the cells that are exact matches for a particular input vector s1 using the following approach:

indx = find(cellfun(@(x) strcmp(num2str(x), num2str(s1)), c)); % strcmp returns a scalar, so cell2mat is not needed

You can then loop over matches (which should now be a much smaller set), and remove them by setting their contents to an empty set:

for ii = 1:numel(indx)
    c{indx(ii)} = []; % index through indx, not the loop counter itself
end
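
Keep in mind that emptying a cell's contents leaves an empty cell in place, while deleting with parentheses shortens the cell array. A small sketch of the difference follows. The sample data and the names `emptied` and `deleted` are made up for illustration, and the match uses `isequal`, which is my substitution for the string comparison above.

```matlab
c  = {[1 2 3], [2 3 4], [1 2 3]};       % illustrative data, not from the question
s1 = [1 2 3];                           % hypothetical vector to match

indx = find(cellfun(@(x) isequal(x, s1), c));   % indices of exact matches

emptied = c;
emptied(indx) = {[]};                   % cells stay, their contents become empty

deleted = c;
deleted(indx) = [];                     % cells are removed from the array

disp(numel(emptied))                    % 3
disp(numel(deleted))                    % 1
```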
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow