Question

I have a folder of 24 different files that all have the same tab-separated format. This is an example:

zinc-n  with-iodide-n   8.0430  X
zinc-n  with-amount-of-supplement-n 12.7774 X
zinc-n  with-value-of-horizon-n 14.5585 X
zirconium-n as-valence-n    11.3255 X
zirconium-n for-form-of-norm-n  15.4607 X

I want to join the files in every possible combination of 2.

For instance, I want to join File 1 and File 2, File 1 and File 3, File 1 and File 4, and so on, until EACH file has been joined with EACH other file, considering all the UNIQUE combinations (276 output files if the order of the pair does not matter, 552 if File1File2 and File2File1 both count).

I know this can be done in the Terminal with cat, for example:

cat File1 File2 > File1File2
cat File1 File3 > File1File3

... and so on.

But doing this by hand for each unique combination would be extremely laborious.

Is it possible to automate this process and join all of the unique combinations from the command line in Terminal, with grep for instance? Or is there perhaps a more efficient solution than cat?


Solution

You can try with Python. I use the combinations() function from the itertools module and join the contents of each pair of files. Note that I use a cache to avoid reading each file many times, but with large files this could exhaust your memory, so use whichever approach suits you best:

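import sys
import itertools

# Cache of file contents, keyed by file name, so each input is read only once.
# Every file stays in memory, which may be too much for very large inputs.
seen = {}

def read_cached(path):
    """Return the contents of path, reading it from disk only the first time."""
    if path not in seen:
        with open(path, 'r') as fh:
            seen[path] = fh.read()
    return seen[path]

# combinations() yields each unordered pair exactly once: (a, b) but never (b, a).
for files in itertools.combinations(sys.argv[1:], 2):
    # The output name is the two input names stuck together, e.g. file1file2.
    outfile = ''.join(files)
    with open(outfile, 'w') as oh:
        # Write both contents back to back, as `cat file1 file2` would.
        oh.write(read_cached(files[0]))
        oh.write(read_cached(files[1]))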
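
The question mentions 552 output files, so it may help to check how many pairs combinations() actually produces. The following is only a small sketch; the names File1 to File24 are placeholders, not the real file names. It compares unordered pairs (combinations) with ordered pairs (permutations):

```python
import itertools

# Placeholder names standing in for the 24 real files (assumed, not from the question).
names = [f"File{i}" for i in range(1, 25)]

# Unordered pairs: (File1, File2) appears, (File2, File1) does not.
pairs = list(itertools.combinations(names, 2))

# Ordered pairs: both (File1, File2) and (File2, File1) appear.
ordered = list(itertools.permutations(names, 2))

print(len(pairs))    # 276
print(len(ordered))  # 552
print(pairs[:3])     # [('File1', 'File2'), ('File1', 'File3'), ('File1', 'File4')]
```

So 552 corresponds to 24 × 23 ordered pairs, while the unique combinations number 276. If both orders are really wanted, swapping combinations() for permutations() in the script should be enough.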

A test:

Assuming the following content in three files:

==> file1 <==
file1 one
f1 two

==> file2 <==
file2 one
file2 two

==> file3 <==
file3 one
f3 two
f3 three

Run the script like this:

python3 script.py file[123]

It will create three new files with the following content:

==> file1file2 <==
file1 one
f1 two
file2 one
file2 two


==> file1file3 <==
file1 one
f1 two
file3 one
f3 two
f3 three


==> file2file3 <==
file2 one
file2 two
file3 one
f3 two
f3 three
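
If the memory use of the cache is a concern, one possible alternative is to skip the cache and stream each file into the output in chunks. This is a sketch under assumptions rather than a drop-in replacement: the function name join_pairs and the separate output directory are illustrative choices, not part of the original script.

```python
import itertools
import shutil
import tempfile
from pathlib import Path


def join_pairs(inputs, out_dir):
    """Concatenate every unordered pair of input files into out_dir.

    join_pairs is a hypothetical helper name. Files are copied in chunks,
    so no file is ever held in memory as a whole.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for first, second in itertools.combinations([Path(p) for p in inputs], 2):
        # Output name: the two base names stuck together, e.g. file1file2.
        target = out_dir / (first.name + second.name)
        with open(target, "wb") as out:
            for src in (first, second):
                with open(src, "rb") as fh:
                    shutil.copyfileobj(fh, out)  # chunked copy, like cat
        written.append(target)
    return written


# Small demonstration on throwaway files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    for name, text in [("file1", "a\n"), ("file2", "b\n"), ("file3", "c\n")]:
        (tmp / name).write_bytes(text.encode())
    results = join_pairs(sorted(tmp.glob("file?")), tmp / "joined")
    print([p.name for p in results])  # ['file1file2', 'file1file3', 'file2file3']
```

Each input is read once per pair it takes part in, so this trades extra disk reads for a small, constant memory footprint. Writing to a separate directory also keeps the joined files from being picked up as inputs if the script is run again with the same glob.
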
Licensed under: CC-BY-SA with attribution