If almost all of the counts are 1, then you could use two datastructures: a HashSet which contains all the id-pairs whose count is 1, and a Dictionary for the id-pairs whose count is greater than 1. This makes incrementing and checking counts a bit slower, but it should save some space. (I don't know how .Net datastructures are laid out internally, so I hesitate to make a guess, but if it were C++, I'd say it would reduce space consumption by something like 25-30%, depending on the value of "almost all".)
If that's not enough space savings, here's an outline of some possibilities, although it can be a lot of work for uncertain gain:
In general, the cost of a container datastructure is composed of the size of the data of an element, plus some per-element overhead, plus some per-datastructure overhead. Hash tables have a medium amount of per-element overhead (one link to the next element in the bucket, plus allocation/alignment overhead); and binary trees have a lot of per-element overhead (two or more commonly three links, plus allocation/alignment overhead). Vectors technically have no per-element overhead, but they are usually overallocated to reduce insertion time, so you should think of them as having 50-100% per-element overhead.
One consequence is that if you an figure out a way to reduce the number of elements, you can often save space. For example, you could use a HashSet of id-pairs, as I suggested above. But if there are a lot fewer individual id1 values than pairs -- i.e., if the ids repeat -- then you could replace that with a dictionary mapping id1 to a vector of id2, which might reduce overhead. There's a big downside to this: it makes lookup and insert much more expensive; furthermore, it only helps if the hash-table per-element overhead is more than the expected vector overallocation overhead.