Including all permutations when using data.table[,,by=...]

Question 1

I'd also go with a cross-join, but would use it in the i-slot of the original call to [.data.table:

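keycols <- c("g1", "g2", "g3")                       ## Grouping columns
setkeyv(dat, keycols)                                ## Set dat's key
ii <- do.call(CJ, lapply(dat[, ..keycols], unique))  ## CJ() to form index (lapply always returns a list)
## Aggregate once per row of ii; by=.EACHI is required in data.table >= 1.9.4
datCollapsed <- dat[ii, list(nv=.N), by=.EACHI]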

## Check that it worked
nrow(datCollapsed)
# [1] 625
table(datCollapsed$nv)
#   0   1   2   3   4   5   6 
# 135 191 162  82  39  13   3

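This approach is referred to as "by-without-by", or grouping by each row of i. Since data.table 1.9.4 it has to be requested explicitly with by=.EACHI; in earlier versions it happened implicitly whenever i was a data.table. As documented in ?data.table at the time, it is just as efficient and fast as passing the grouping instructions in via the by argument: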

Advanced: Aggregation for a subset of known groups is particularly efficient when passing those groups in i. When i is a data.table, DT[i,j] evaluates j for each row of i. We call this by without by or grouping by i. Hence, the self join DT[data.table(unique(colA)),j] is identical to DT[,j,by=colA].
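To make the difference from ordinary grouping concrete, here is a minimal sketch on toy data. The table toy and its columns are hypothetical stand-ins (not the dat from the question), and it assumes data.table 1.9.4 or later, where grouping by i is requested with by=.EACHI:

```r
library(data.table)

# Hypothetical toy data: the combination (g1 = "b", g2 = 1) never occurs
toy <- data.table(g1 = c("a", "a", "b"), g2 = c(1L, 2L, 2L), v = 1:3)
setkey(toy, g1, g2)

# Ordinary grouping only reports combinations that are present in the data
byGroups <- toy[, list(nv = .N), by = list(g1, g2)]

# Grouping by each row of i reports every combination supplied in i;
# combinations with no matching rows get .N = 0
allCombos <- CJ(g1 = unique(toy$g1), g2 = unique(toy$g2))
byEachI <- toy[allCombos, list(nv = .N), by = .EACHI]

nrow(byGroups)    # number of observed combinations
nrow(byEachI)     # number of possible combinations
byEachI[nv == 0]  # the combinations that by= would have dropped
```

The empty combinations only show up because they are rows of i; by= alone has no way of knowing about groups that are absent from the data.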

Question 2

Build a cross join (CJ) of the unique values of each grouping column, then join your aggregated results onto it:

dat.keys <- dat[,CJ(g1=unique(g1), g2=unique(g2), g3=unique(g3))]
setkey(datCollapsed, g1, g2, g3)
nrow(datCollapsed[dat.keys])  # effectively a left join of datCollapsed onto dat.keys
# [1] 625

Note that nv is NA right now for the combinations that never occur in dat, but you can easily change those to 0s if you want.
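One way to do that replacement is an update by reference on the joined result. This is only a sketch on hypothetical toy data (toy, toyCollapsed, toyKeys and full are illustrative names, not objects from the question):

```r
library(data.table)

# Hypothetical toy data standing in for dat; (g1 = "b", g2 = 1) is absent
toy <- data.table(g1 = c("a", "a", "b"), g2 = c(1L, 2L, 2L))

# Ordinary aggregation: only the observed combinations appear
toyCollapsed <- toy[, list(nv = .N), by = list(g1, g2)]
setkey(toyCollapsed, g1, g2)

# All combinations of the unique values, then the join described above
toyKeys <- toy[, CJ(g1 = unique(g1), g2 = unique(g2))]
full <- toyCollapsed[toyKeys]   # unmatched combinations have nv = NA

# Replace the NAs with zero counts, modifying full in place
full[is.na(nv), nv := 0L]
full
```

Using 0L rather than 0 keeps nv an integer column, matching the type that .N produces.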