The primary shard doesn't hold the complete dataset; it holds the data for all unsharded collections. For a collection that you have sharded, the data should be balanced across all shards (unless your shard key is a poor choice).
If your primary shard runs out of space because of your unsharded data, you have two options: either shard those unsharded collections as well, or get a bigger disk.
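For the first option, here is a minimal sketch of what sharding the remaining collections could look like. It only builds the `shardCollection` admin command documents; the database name, collection names, and shard key fields are hypothetical placeholders, and the choice of a hashed key is an assumption for illustration, not a recommendation for your data.

```python
def build_shard_commands(database, shard_keys):
    """Build one shardCollection admin command per collection.

    shard_keys maps a collection name to the field used as its shard key.
    All names passed in here are hypothetical examples.
    """
    commands = []
    for collection, key_field in shard_keys.items():
        commands.append({
            # The command name must be the first key in the document.
            "shardCollection": f"{database}.{collection}",
            # Assumption: a hashed shard key, to spread writes evenly.
            "key": {key_field: "hashed"},
        })
    return commands


# Hypothetical database and collections that are still unsharded.
commands = build_shard_commands("mydb", {"events": "user_id", "logs": "host"})

for cmd in commands:
    print(cmd)
    # With a live cluster and pymongo installed, each command would be
    # sent through a mongos router, for example:
    #   MongoClient("mongodb://mongos-host:27017").admin.command(cmd)
```

If a collection already contains data, an index that supports the shard key has to exist before the command will succeed. Once a collection is sharded, the balancer should start moving its chunks off the primary shard over time.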