First of all, the documentation for string.GetHashCode specifically says not to use string hash codes in any application where they need to be stable over time, because they are not stable. You should be using string hash codes for one purpose only, and that is to put strings in a dictionary.
Second, hash codes are not unique. There are only about four billion possible hash codes (because the hash code is a 32 bit integer) but obviously there are more than four billion strings, so there must be many strings that have the same hash code. Collisions also show up far sooner than that number suggests: even with a well-distributed hash, a collection of only about 77,000 strings has roughly a 50% chance of containing two strings with the same hash code, and the probability climbs quickly from there. A graph of the probability is here:
http://blogs.msdn.com/b/ericlippert/archive/2010/03/22/socks-birthdays-and-hash-collisions.aspx
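To get a feel for the numbers, here is a rough sketch of the standard birthday-problem approximation, written in Python only because it is compact. The function name collision_probability is made up for this illustration, and the calculation assumes hash codes are spread uniformly over all 2^32 values, which a real string hash only approximates.

```python
import math

def collision_probability(n_items: int, hash_bits: int = 32) -> float:
    """Approximate chance that at least two of n_items share a hash code.

    Assumes hash codes are uniformly distributed over 2**hash_bits values
    (an idealization) and uses the usual birthday-problem approximation
    1 - exp(-n(n-1) / (2N)).
    """
    possible_codes = 2 ** hash_bits
    exponent = n_items * (n_items - 1) / (2 * possible_codes)
    # -expm1(-x) equals 1 - exp(-x) but is more accurate for small x.
    return -math.expm1(-exponent)

if __name__ == "__main__":
    for n in (1_000, 10_000, 77_163, 100_000, 1_000_000):
        print(f"{n:>9} strings: {collision_probability(n):.4%}")
```

Under that uniformity assumption the probability passes one half at around 77,000 items, so a 32 bit hash cannot serve as a unique identifier for any data set of realistic size.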
So you might wonder how the dictionary works at all then, if it is using GetHashCode but there can be collisions. The answer is: when you put two things X and Y in a dictionary that have the same hash code, they go in the same "bucket". When you search for X the dictionary uses the hash code to go to the right bucket, and then does the more expensive equality check on each element in the bucket until it finds the right one. Since each bucket usually holds only a few elements, this check is still fast enough most of the time.
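The bucket idea can be sketched in a few lines. This is a toy illustration in Python, not how the .NET Dictionary is actually implemented; the names BucketDict and CollidingKey are hypothetical, and CollidingKey deliberately returns the same hash code for every instance to force X and Y into the same bucket.

```python
class BucketDict:
    """Toy hash table: the hash code picks a bucket, equality picks the entry."""

    def __init__(self, bucket_count=8):
        self._buckets = [[] for _ in range(bucket_count)]

    def _bucket_for(self, key):
        # The hash code only narrows the search down to one bucket.
        return self._buckets[hash(key) % len(self._buckets)]

    def put(self, key, value):
        bucket = self._bucket_for(key)
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:  # equality check, not a hash comparison
                bucket[i] = (key, value)
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self._bucket_for(key):
            if existing_key == key:
                return value
        raise KeyError(key)


class CollidingKey:
    """Hypothetical key type whose instances all share one hash code."""

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # every instance collides on purpose

    def __eq__(self, other):
        return isinstance(other, CollidingKey) and self.name == other.name


if __name__ == "__main__":
    d = BucketDict()
    x, y = CollidingKey("X"), CollidingKey("Y")
    d.put(x, "value for X")
    d.put(y, "value for Y")
    print(d.get(x), "|", d.get(y))
```

Both keys land in the same bucket, yet each lookup still returns the right value, because the final decision is made by the equality check rather than by the hash code. That is why a collision costs a dictionary only a little time, whereas treating the hash code itself as an identity gives wrong answers.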
I don't know how to solve your problem, but using a 32 bit hash is clearly not the right way to do it, so try something else. My suggestion would be to start using a database rather than CSV files if you have a lot of data to manage. That's what a database is for.
I have written many articles on string hashing that might interest you:
http://ericlippert.com/2011/02/28/guidelines-and-rules-for-gethashcode/