with open("file1") as f1:
    with open("file2") as f2:
        for line1, line2 in zip(f1, f2):
            items1 = line1.split()
            items2 = line2.split()
            sums = ["({}+{})".format(i1, i2) for i1, i2 in zip(items1, items2)]
            print(" ".join(sums))
Sum two files column by column in Python
23-07-2023
Question
I have two files of exactly the same size and with the same number of columns. I want to add the i-th column of the 1st file to the i-th column of the 2nd file. Is there a neat way of doing this with Python?
file1
a a a a a a
a a a a a a
a a a a a a
file2
b b b b b b
b b b b b b
b b b b b b
I want:
(a+b) (a+b) (a+b) (a+b) (a+b) (a+b)
(a+b) (a+b) (a+b) (a+b) (a+b) (a+b)
(a+b) (a+b) (a+b) (a+b) (a+b) (a+b)
EDIT: The above is just a simplification of a more complicated problem of mine. Each file has thousands of rows and I have many files (~100) to perform this kind of operation on.
Solution
Other tips
A pandas DataFrame can be a good choice for this kind of operation. It supports operations on whole data frames (matrices), e.g. df_one.add(df_two).
1. Read the data from the files into data frames: http://pandas.pydata.org/pandas-docs/version/0.13.1/generated/pandas.DataFrame.from_csv.html (example: http://www.econpy.org/tutorials/general/csv-pandas-dataframe). Note that DataFrame.from_csv was deprecated and later removed from pandas; pandas.read_csv is its replacement.
2. Add the two data frames as shown in this SO answer: Adding two pandas dataframes
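The two steps above could look roughly like the following. This is only a sketch: it assumes the files hold whitespace-separated numbers with no header row, and it uses pandas.read_csv rather than the older from_csv. The sample strings and the names text_one, text_two and total are made up for illustration; in practice you would pass the file paths to read_csv instead of StringIO objects.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the contents of file1 and file2.
text_one = "1 2 3\n4 5 6\n"
text_two = "10 20 30\n40 50 60\n"

# sep=r"\s+" splits on runs of whitespace; header=None treats every line as data.
df_one = pd.read_csv(StringIO(text_one), sep=r"\s+", header=None)
df_two = pd.read_csv(StringIO(text_two), sep=r"\s+", header=None)

# Element-wise sum of two frames that have the same shape.
total = df_one.add(df_two)

# Print without index and header so the output resembles the input files.
print(total.to_string(index=False, header=False))
```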
I think this will help you:
# Assumes both files contain integers; written for Python 3.
with open("file1") as a, open("file2") as b:
    x = [[int(i) for i in u.split()] for u in a]
    y = [[int(i) for i in v.split()] for v in b]
n = len(x)     # number of rows
m = len(x[0])  # number of columns
for i in range(n):
    # Add the matching entries of row i and print them space-separated.
    print(" ".join(str(x[i][j] + y[i][j]) for j in range(m)))
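Since the question mentions many files with thousands of rows each, the same idea can be extended to any number of files while reading them line by line instead of loading everything into memory. The following is a sketch under assumptions: the helper name sum_files is invented here, the files are assumed to contain numbers, and all files are assumed to have the same number of rows and columns.

```python
from contextlib import ExitStack


def sum_files(paths):
    """Yield one list per row holding the column-wise sums across all files."""
    with ExitStack() as stack:
        # Open every file; ExitStack closes them all when the block exits.
        handles = [stack.enter_context(open(p)) for p in paths]
        # zip(*handles) reads one line from each file at a time.
        for lines in zip(*handles):
            rows = [[float(item) for item in line.split()] for line in lines]
            # zip(*rows) groups the values that share a column.
            yield [sum(column) for column in zip(*rows)]
```

It could be used as `for row in sum_files(["file1", "file2"]): print(" ".join(str(v) for v in row))`. Because zip stops at the shortest input, files of unequal length would be truncated silently rather than raising an error.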
You can use numpy.loadtxt():
import numpy as np
a = np.loadtxt('a.txt', dtype=object)
b = np.loadtxt('b.txt', dtype=object)
which will accept the element-wise string concatenation that you want, and even more:
print('('+a+'+'+b+')')
#array([['(a+b)', '(a+b)', '(a+b)', '(a+b)'],
# ['(a+b)', '(a+b)', '(a+b)', '(a+b)'],
# ['(a+b)', '(a+b)', '(a+b)', '(a+b)']], dtype=object)
print(a+b)
#array([['ab', 'ab', 'ab', 'ab'],
# ['ab', 'ab', 'ab', 'ab'],
# ['ab', 'ab', 'ab', 'ab']], dtype=object)
print(3*a)
#array([['aaa', 'aaa', 'aaa', 'aaa'],
# ['aaa', 'aaa', 'aaa', 'aaa'],
# ['aaa', 'aaa', 'aaa', 'aaa']], dtype=object)
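If the real files hold numbers rather than the placeholder letters, the dtype=object argument is not needed, and the same + operator performs numeric addition. Here is a minimal sketch, assuming whitespace-separated numeric data; the in-memory strings stand in for a.txt and b.txt, and the name total is chosen only for this example.

```python
from io import StringIO

import numpy as np

# Hypothetical numeric stand-ins for a.txt and b.txt.
a = np.loadtxt(StringIO("1 2 3\n4 5 6\n"))
b = np.loadtxt(StringIO("10 20 30\n40 50 60\n"))

# With the default float dtype, + is numeric element-wise addition.
total = a + b

# Write the result as whitespace-separated text (here to an in-memory buffer;
# a file name such as "sum.txt" would work in place of `out`).
out = StringIO()
np.savetxt(out, total, fmt="%g")
print(out.getvalue())
```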