Question

I have two shards, each a replica set of three machines (same specs).

The chunks are reasonably well-distributed:

Shard events at events/xxx:27018,yyy:27018
 data : 6.82GiB docs : 532402 chunks : 59
 estimated data per chunk : 118.42MiB
 estimated docs per chunk : 9023

Shard events2 at events2/zzz:27018,qqq:27018
 data : 7.3GiB docs : 618783 chunks : 66
 estimated data per chunk : 113.31MiB
 estimated docs per chunk : 9375

Totals
 data : 14.12GiB docs : 1151185 chunks : 125
 Shard events contains 48.29% data, 46.24% docs in cluster, avg obj size on shard : 13KiB
 Shard events2 contains 51.7% data, 53.75% docs in cluster, avg obj size on shard : 12KiB

However, the primary on one side has almost 4x the vmsize and a lock % close to 90% (vs. 2% on the other), as well as a much higher btree count. This results in a high number of cursors timing out on that machine.

Both shards should get similar types of queries, and opcounter values are pretty close.

(screenshots: sane host vs. problem host)

How can I diagnose this?

UPDATE: The underperforming side appears to be using a humongous amount of storage for the data, including roughly 100x the space for indexes:

    "ns" : "site_events.listen",
    "count" : 544213,
    "size" : 7500665112,
    "avgObjSize" : 13782.59084586366,
    "storageSize" : 9698657792,
    "numExtents" : 34,
    "nindexes" : 3,
    "lastExtentSize" : 1788297216,
    "paddingFactor" : 1.0009999991378065,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 4630807488,
    "indexSizes" : {
            "_id_" : 26845184,
            "uid_1" : 26664960,
            "list.i_1" : 4577297344
    },

vs

    "ns" : "site_events.listen",
    "count" : 621962,
    "size" : 7891599264,
    "avgObjSize" : 12688.233789202555,
    "storageSize" : 9305386992,
    "numExtents" : 24,
    "nindexes" : 2,
    "lastExtentSize" : 2146426864,
    "paddingFactor" : 1.0000000000917226,
    "systemFlags" : 1,
    "userFlags" : 1,
    "totalIndexSize" : 45368624,
    "indexSizes" : {
            "_id_" : 22173312,
            "uid_1" : 23195312
    },

Solution

Based on the updated stats, it seems clear that one shard has an index on your sharded collection that is not present on the other shard. This can happen when an index is built on the replica set members in a rolling fashion but someone forgets to build it on both shards, or when an index that was not supposed to be there was not dropped from all replica sets.

In your case, the extra index, "list.i_1", is 4.2GB in size and almost certainly contributes significantly to the performance difference.
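If the index is not supposed to exist it should be dropped everywhere, and if it is supposed to exist it needs to be built on the other shard too. A minimal sketch in the mongo shell - the key spec {"list.i": 1} is inferred from the index name and should be double-checked against getIndexes() first:

    use site_events
    db.listen.getIndexes()                  // run against each shard's primary and compare

    // Either build the missing index everywhere (issued through mongos):
    db.listen.ensureIndex({ "list.i": 1 })

    // ...or drop the unintended index everywhere (also through mongos):
    db.listen.dropIndex("list.i_1")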

The rest of my comments are more general, and some may not apply to your specific case.

Generally speaking, it is not uncommon that users start with one shard (or unsharded replica set) and then add a second shard to take on half of the load.

Unfortunately, the way data gets migrated to shard2 leaves shard1 with fragmented storage, both for data and indexes. Since MongoDB uses memory-mapped files, the larger files end up using more RAM, put more stress on the I/O subsystem, and are in general less performant than the more compact shard2, which received all of its data basically "at once" and is able to store a similar number of documents in less space.

To get shard1 back in shape, you can compact the affected collection(s), or even run repairDatabase() if there are multiple sharded collections. The latter returns the freed space to the OS; compact does not return space to the OS, but it keeps it on the free list to be reused as you insert more data, and the existing data ends up nicely co-located in the minimum possible amount of space.
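A sketch of both options, to be run against the affected shard's members rather than through mongos (both operations block the database, so do them off-peak, rolling through secondaries first):

    use site_events
    // Defragment the collection and rebuild its indexes; freed space stays on the free list:
    db.runCommand({ compact: "listen" })

    // Or rewrite the whole database and return freed space to the OS (needs enough free disk):
    db.repairDatabase()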

Notice that within the affected replica set, even though the primary is much larger than the other shard's primary, its own secondary is significantly smaller. This would happen if the secondary re-synced all of its data at one time, well after the balancing between shards happened. In that case, you can step down your primary and let the more compact secondary take over - it should perform better, and meanwhile you can compact or repair the former primary (now a secondary). Generally, three replica set nodes are recommended so that you are not running without a safety net while doing this sort of maintenance.
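A sketch of that failover step, assuming you are connected to the oversized primary:

    // Ask the bloated primary to step down so the more compact secondary can be elected:
    rs.stepDown(120)      // stay out of elections for ~120 seconds

    // Verify the new topology, then compact/repair the former primary:
    rs.status()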

Another observation I will make is that even though the sharded collection is split more or less evenly across both shards, you have a number of additional collections that live on the primary shard for this database, which is the larger shard. The difference in index sizes is also partly due to the extra indexes for those extra collections, which exist on shard1 and not on shard2. That's normal for a database where only some collections are sharded.

There is not much that can be done about that imbalance except to:

  1. shard the larger of the unsharded collections, or
  2. move half of the unsharded collections into a different database which will have shard2 as its primary shard. This will split the unsharded collections between the two shards more "evenly" (see the sketch after this list).
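Hedged sketches of both options; "site_events.big_unsharded" and "other_db" are placeholder names, and the shard key must be chosen to match your access pattern:

    // Option 1: shard one of the large unsharded collections (placeholder name and key):
    sh.shardCollection("site_events.big_unsharded", { uid: 1 })

    // Option 2: make events2 the primary shard of another database, then move
    // some of the unsharded collections into it:
    db.adminCommand({ movePrimary: "other_db", to: "events2" })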