How to programmatically generate a dataset object from the Cartesian product (aka "cross-join") of multiple one-dimensional cell arrays?

Question 1

This works for

arbitrary number of cell arrays, n;
arbitrary size of each cell array;
arbitrary type of each cell's contents.

It makes use of cellfun, arrayfun and comma-separated lists. The Cartesian product is computed on indices (not on actual elements) using ndgrid, with fliplr to yield the order you want (first column varies slowest, last column varies fastest).

The result is given as a cell array with n columns. If you need it in the form of a dataset, define appropriate names and use cell2dataset to convert.
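To see why the index grids are reversed with fliplr, here is a minimal two-set sketch, separate from the full solution that follows. The variable names (ranges, rev, grids, pairs) are made up for illustration. ndgrid makes its first output vary fastest, so feeding it the ranges in reverse order and flipping the outputs back gives rows in which the first column varies slowest.

```matlab
% Minimal sketch: index pairs for two sets of sizes 2 and 3.
ranges = {1:2, 1:3};             % one index range per set
rev    = fliplr(ranges);         % reverse, because ndgrid's first output varies fastest
grids  = cell(1, numel(ranges)); % preallocate so ndgrid returns one grid per set
[grids{:}] = ndgrid(rev{:});     % each grid is 3x2
grids  = fliplr(grids);          % restore the original set order
pairs  = [grids{1}(:), grids{2}(:)];  % one row per combination
disp(pairs)
```

Each row of pairs holds the index into the first set followed by the index into the second set; the first column stays fixed while the second cycles through 1, 2, 3.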

c1 = {'even','odd'}; %// example data
c2 = {'green','red','yellow'};
c3 = {'clubs','diamonds','hearts','spades'};
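sets = {c1, c2, c3}; %// can have an arbitrary number of c's

num = numel(sets); %// number of sets
nums = cellfun(@numel, sets); %// size of each set
inds = cell(1,num); %// one index grid per set
vec = fliplr(arrayfun(@(n) 1:n, nums, 'UniformOutput', false)); %// reversed index ranges
[inds{:}] = ndgrid(vec{:}); %// Cartesian product of the indices
inds = fliplr(inds); %// back to the original set order
factors = arrayfun(@(n) {sets{n}{inds{n}}}, 1:num, 'UniformOutput', false); %// pick the elements
factors = cat(1, factors{:}).'; %// one row per combination, one column per set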

Result:

>> factors
factors = 
    'even'    'green'     'clubs'   
    'even'    'green'     'diamonds'
    'even'    'green'     'hearts'  
    'even'    'green'     'spades'  
    'even'    'red'       'clubs'   
    'even'    'red'       'diamonds'
    'even'    'red'       'hearts'  
    'even'    'red'       'spades'  
    'even'    'yellow'    'clubs'   
    'even'    'yellow'    'diamonds'
    'even'    'yellow'    'hearts'  
    'even'    'yellow'    'spades'  
    'odd'     'green'     'clubs'   
    'odd'     'green'     'diamonds'
    'odd'     'green'     'hearts'  
    'odd'     'green'     'spades'  
    'odd'     'red'       'clubs'   
    'odd'     'red'       'diamonds'
    'odd'     'red'       'hearts'  
    'odd'     'red'       'spades'  
    'odd'     'yellow'    'clubs'   
    'odd'     'yellow'    'diamonds'
    'odd'     'yellow'    'hearts'  
    'odd'     'yellow'    'spades'
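
As a rough sketch of the dataset conversion mentioned above: the variable names below ('parity', 'color') are hypothetical, and a small hand-written factors array stands in for the full result so that the snippet runs on its own. It assumes the Statistics Toolbox is installed and relies on cell2dataset reading variable names from the first row of its input, which is its default behaviour.

```matlab
% Small stand-in for the factors cell array (4 combinations of 2 sets).
factors = {'even','green'; 'even','red'; 'odd','green'; 'odd','red'};
names   = {'parity', 'color'};        % hypothetical variable names, one per column

% Put the names on top as a header row and convert.
ds = cell2dataset([names; factors]);
disp(ds)
```

For the three-set example, names would need three entries, one for each column of factors.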

Question 2

This was fun to think about - here's what I came up with:

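function product = setjoin(sets, names)
product = {};                        % rows built so far
nrows = 1;                           % number of rows in product
for curset = sets(:)'                % iterate over the sets, one 1x1 cell at a time
    curset = curset{1}(:);           % unwrap and force a column vector
    n = numel(curset);
    setidx = repmat(1:n, nrows, 1);  % each index repeated nrows times
    setidx = setidx(:);              % MATLAB cannot index a call result directly
    product = [repmat(product, n, 1) curset(setidx)];
    nrows = nrows * n;
end
product = cell2dataset([names(:)'; product]);  % first row supplies the variable names
end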

where sets is a cell array of cell arrays {c1, c2, ..., cn} and names is a cell array of strings. As written, it's a bit hacky: coercing things into row/column vectors with (:) where required is concise but isn't necessarily obvious, especially in generating setidx. Hopefully it gives you an idea to build upon.
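
Since the setidx step is the least obvious part, here is one loop iteration unrolled by hand as a small sketch. The values (a 2-row product so far, a new set of 3 elements) are chosen only for illustration.

```matlab
% State after processing a first set with 2 elements.
product = {'even'; 'odd'};
nrows   = 2;

% Next set to join, as a column vector.
curset = {'green'; 'red'; 'yellow'};
n      = numel(curset);

setidx = repmat(1:n, nrows, 1);   % 2x3 matrix, every row is 1 2 3
setidx = setidx(:);               % read column-wise: 1 1 2 2 3 3

% Stack n copies of the existing rows next to the repeated new elements.
product = [repmat(product, n, 1) curset(setidx)];
disp(product)
```

Note that the first column cycles fastest here (even, odd, even, odd, ...), so the row order differs from the ndgrid-based answer even though the set of combinations is the same.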