Question

How can 2 unsorted text files of different lengths be display side by side (in columns) in a shell

Given one.txt and two.txt:

$ cat one.txt
apple
pear
longer line than the last two
last line

$ cat two.txt
The quick brown fox..
foo
bar 
linux

skipped a line

Display:

apple                               The quick brown fox..
pear                                foo
longer line than the last two       bar 
last line                           linux

                                    skipped a line

paste one.txt two.txt almost does the trick but doesn't align the columns nicely as it just prints one tab between column 1 and 2. I know how to this with emacs and vim but want the output displayed to stdout for piping ect.

The solution I came up with uses sdiff and then pipes to sed to remove the output sdiff adds.

sdiff one.txt two.txt | sed -r 's/[<>|]//;s/(\t){3}//'

I could create a function and stick it in my .bashrc but surely a command for this exists already (or a cleaner solution potentially)?

Était-ce utile?

La solution

You can use pr to do this, using the -m flag to merge the files, one per column, and -t to omit headers, eg.

pr -m -t one.txt two.txt

outputs:

apple                               The quick brown fox..
pear                                foo
longer line than the last two       bar
last line                           linux

                                    skipped a line

See Also:

Autres conseils

To expand a bit on @Hasturkun's answer: by default pr uses only 72 columns for its output, but it's relatively easy to make it use all available columns of your terminal window:

pr -w $COLUMNS -m -t one.txt two.txt

Most shell's will store (and update) your terminal's screenwidth in the $COLUMNS environment variable, so we're just passing that value on to pr to use for its output's width setting.

This also answers @Matt's question:

Is there a way for pr to auto-detect screen width?

So, no: pr itself can't detect the screenwidth, but we're helping out a bit by passing in the terminal's width via the -w option.

If you know the input files have no tabs, then using expand simplifies @oyss's answer:

paste one.txt two.txt | expand --tabs=50

If there could be tabs in the input files, you can always expand first:

paste <(expand one.txt) <(expand two.txt) | expand --tabs=50
paste one.txt two.txt | awk -F'\t' '{
    if (length($1)>max1) {max1=length($1)};
    col1[NR] = $1; col2[NR] = $2 }
    END {for (i = 1; i<=NR; i++) {printf ("%-*s     %s\n", max1, col1[i], col2[i])}
}'

Using * in a format specification allows you to supply the field length dynamically.

There is a sed way:

f1width=$(wc -L <one.txt)
f1blank="$(printf "%${f1width}s" "")"
paste one.txt two.txt |
    sed "
        s/^\(.*\)\t/\1$f1blank\t/;
        s/^\(.\{$f1width\}\) *\t/\1 /;
    "

Under , you could use printf -v:

f1width=$(wc -L <one.txt)
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
         s/^\(.\{$f1width\}\) *\t/\1 /;"

(Of course @Hasturkun 's solution pr is the most accurate!):

Advantage of sed over pr

You can finely choose separation width and or separators:

f1width=$(wc -L <one.txt)
(( f1width += 4 ))         # Adding 4 spaces
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
         s/^\(.\{$f1width\}\) *\t/\1 /;"

Or, for sample, to mark lines containing line:

f1width=$(wc -L <one.txt)
printf -v f1blank "%${f1width}s"
paste one.txt two.txt |
    sed "s/^\(.*\)\t/\1$f1blank\t/;
  /line/{s/^\(.\{$f1width\}\) *\t/\1 |ln| /;ba};
         s/^\(.\{$f1width\}\) *\t/\1 |  | /;:a"

will render:

apple                         |  | The quick brown fox..
pear                          |  | foo
longer line than the last two |ln| bar 
last line                     |ln| linux
                              |  | 
                              |ln| skipped a line

remove dynamically field length counting from Barmar's answer will make it a much shorter command....but you still need at least one script to finish the work which could not be avoided no matter what method you choose.

paste one.txt two.txt |awk -F'\t' '{printf("%-50s %s\n",$1,$2)}'

If you want to know the actual difference between two files side by side, use diff -y:

diff -y file1.cf file2.cf

You can also set an output width using the -W, --width=NUM option:

diff -y -W 150 file1.cf file2.cf

and to make diff's column output fit your current terminal window:

diff -y -W $COLUMNS file1.cf file2.cf

Find below a python based solution.

import sys

# Specify the number of spaces between the columns
S = 4

# Read the first file
l0 = open( sys.argv[1] ).read().split('\n')

# Read the second file
l1 = open( sys.argv[2] ).read().split('\n')

# Find the length of the longest line of the first file
n = len(max(l0, key=len))

# Print the lines
for i in  xrange( max( len(l0), len(l1) ) ):

    try:
        print l0[i] + ' '*( n - len(l0[i]) + S) + l1[i]
    except:
        try:
            print ' ' + ' '*( n - 1 + S) + l1[i]
        except:
            print l0[i]

Example

apple                            The quick brown fox..
pear                             foo
longer line than the last two    bar 
last line                        linux

                                 skipped a line
diff -y <file1> <file2>


[root /]# cat /one.txt
apple
pear
longer line than the last two
last line
[root /]# cat /two.txt
The quick brown fox..
foo
bar
linux
[root@RHEL6-64 /]# diff -y one.txt two.txt
apple                                                         | The quick brown fox..
pear                                                          | foo
longer line than the last two                                 | bar
last line                                                     | linux
Licencié sous: CC-BY-SA avec attribution
Non affilié à StackOverflow
scroll top