Question

I have a very large csv file where every field is the same width (hence every line is the same width). I need to find the differences in a specific column.

When I open the 2 files in vimdiff, most lines are marked as a diff, because there is a regularly changing datetime field (say, columns 10-15). This field is correctly coloured red for a diff. But I am interested in, say, columns 50-60, in which there will only be a few diffs throughout the entire file.

My only solution so far is to delete the parts of the file that I don't care about with :%s/^.\{49} but this is very laggy because the files are so big.

Is there a better solution without needing to modify the files?


Solution

Would comparing specific fields be suitable? For example, to compare only the third comma-separated field of each file:

vimdiff <(awk -F',' '{print $3}' a.csv) <(awk -F',' '{print $3}' b.csv)

Or, if you require a comparison among multiple fields:

vimdiff <(awk -F',' '{print $2","$3}' a.csv) <(awk -F',' '{print $2","$3}' b.csv)
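
Since the question describes fixed-width lines and character positions rather than field numbers, the same idea might be applied by character column with cut. This is only a sketch: extract_cols is a hypothetical helper name, the range 50-60 is taken from the question, and it assumes plain single-byte (ASCII) data and a shell with process substitution such as bash or zsh.

```shell
# Hypothetical helper: print only character columns 50-60 of each line.
# Assumes single-byte characters; the range comes from the question.
extract_cols() {
    cut -c50-60 "$1"
}
```

With the helper defined, vimdiff <(extract_cols a.csv) <(extract_cols b.csv) should show only that column range, and the original files are not modified.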
Licensed under: CC-BY-SA with attribution