You need to decide along which axis you want to append your files. Pandas will always try to do the right thing by:
- Assuming that each column from each file is different, and appending digits to columns with similar names across files if necessary, so that they don't get mixed;
- Placing items that belong to the same row index across files side by side, under their respective columns.
The trick to appending efficiently is to tip the files sideways, so that the behaviour you want matches what pandas.concat will be doing. This is my recipe:
from pandas import concat, read_csv
files = !ls *.csv # IPython magic
d = concat([read_csv(f, index_col=0, header=None) for f in files], axis=1, keys=files)
Notice that concat is called with axis=1, so the files are concatenated along the column axis, each keeping its own column names under its file key. If you need to, you can transpose the resulting DataFrame back with d.T.
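To see what the recipe produces without touching the disk, here is a minimal sketch. The two file names and their contents are made up for illustration, and io.StringIO stands in for real files; the only assumption is that the files have no header and use their first column as the row label.

```python
import io
import pandas as pd

# Two hypothetical header-less CSV files, held in memory for illustration.
sources = {
    "a.csv": "r1,1,2\nr2,3,4\n",
    "b.csv": "r1,5,6\nr3,7,8\n",
}

frames = [
    pd.read_csv(io.StringIO(text), index_col=0, header=None)
    for text in sources.values()
]

# axis=1 places the files side by side; keys labels each block of columns
# with the name of the file it came from.
d = pd.concat(frames, axis=1, keys=list(sources))

print(d)    # rows are the union of the row labels, missing cells are NaN
print(d.T)  # transposed back: one row per (file, column) pair
```

Rows that appear in only one file (r2 and r3 here) are kept, with NaN in the columns of the file that lacks them.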
EDIT:
If the source files have different numbers of columns, you'll need to supply a header. I understand you don't have a header in your source files, so let's create one with a simple function:
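def reader(f):
    # Read a header-less file, using its first column as the row index
    d = read_csv(f, index_col=0, header=None)
    # Number the remaining columns 0..n-1 to act as a header
    d.columns = range(d.shape[1])
    return d
df = concat([reader(f) for f in files], axis=1, keys=files)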
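As a quick check of the different-width case, here is a small sketch along the same lines. The file names and data are invented, and this reader takes an in-memory buffer instead of a path so the example runs on its own.

```python
import io
import pandas as pd

def reader(buf):
    # Same idea as the reader above, but reading from an in-memory buffer.
    d = pd.read_csv(buf, index_col=0, header=None)
    d.columns = range(d.shape[1])
    return d

# Hypothetical files: one data column in the first, three in the second.
sources = {
    "narrow.csv": "r1,1\nr2,2\n",
    "wide.csv": "r1,3,4,5\nr2,6,7,8\n",
}

df = pd.concat(
    [reader(io.StringIO(text)) for text in sources.values()],
    axis=1,
    keys=list(sources),
)

print(df)
print(df["wide.csv"])  # select all the columns that came from one file
```

Each file keeps its own width, and the file name in the top column level tells you where every column came from.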