Question

I need to compare two text files using only Linux command-line tools, to get the list of added lines, removed lines, and modified lines. I currently use diff --context=0, but if a line is 'modified' and the following line is 'added', both lines are marked as 'modified' in a single change set (instead of two change sets: 'modified' for the first and 'added' for the second).

Here is an example of files:

File#1:

foo line1

File#2:

fooX line1
bar line2

Currently used command:

diff --minimal --context=0 file1 file2

Actual result:

! foo line1
--- 1 ----
! fooX line1
! bar line2

Needed result:

! foo line1
--- 1 ----
! fooX line1

+ bar line2

I expected --minimal to do the job, but it does not. It seems to minimize the number of change sets rather than their size, which is what I need. If I insert a new line containing only "--" just after the first line of each file, it does work, but the files I have to compare are very big and I don't really want to insert "--" between every pair of lines...


Solution

Since I found no suitable solution with diff alone, I generated three files:

  • a file listing all new lines (produced by separate internal processing)
  • another file listing all deleted lines (also produced by separate internal processing)
  • the result of diff --context=0 file1 file2 | grep '^!'

Then I filtered the third file, dropping every line that also appears in the added or removed lists, so that only the modified lines remain:

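# Keep only the lines of diff_output that appear in neither list.
while IFS= read -r line
do
        # -F: treat "$line" as a fixed string, not as a regular expression
        if ! grep -q -F -- "$line" "$ADDED_LINES_FILE" "$REMOVED_LINES_FILE" ; then
                printf '%s\n' "$line"
        fi
done < diff_output > "$TMP_FILE"
mv "$TMP_FILE" diff_output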
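
For illustration, here is a minimal sketch of the same filtering idea applied to the two example files from the question. The file names added.txt, removed.txt, known.txt, raw.diff and changed.txt are hypothetical, and the content of added.txt is written by hand as a stand-in for the internal processing mentioned above, which is not shown here. The sketch also assumes that the added/removed lists hold the raw line text, so the leading "! " is stripped from the diff output before comparing.

```shell
#!/bin/sh
# Sketch only: work in a scratch directory with the question's example files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'foo line1\n' > file1
printf 'fooX line1\nbar line2\n' > file2

# Hypothetical stand-ins for the lists produced by the separate processing.
printf 'bar line2\n' > added.txt     # assumed: raw text of purely added lines
: > removed.txt                      # assumed: no purely removed lines
cat added.txt removed.txt > known.txt

# diff exits with status 1 when the files differ; that is expected here.
diff --context=0 file1 file2 > raw.diff || true

# Keep only the "!" lines of the context diff and drop the "! " prefix.
grep '^!' raw.diff | sed 's/^! //' > changed.txt

# -F fixed strings, -x whole-line match, -v invert, -f read patterns from file.
modified=$(grep -F -x -v -f known.txt changed.txt)
printf '%s\n' "$modified"
```

With these inputs, the lines "foo line1" and "fooX line1" remain as modified, while "bar line2" is filtered out because it is listed as added. Using a single grep with -f avoids spawning one grep per line, which may matter for very big files; whole-line matching with -x is a design choice of this sketch, not something the loop above does.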
Licensed under: CC-BY-SA with attribution