Question

I'm trying to convert the following string to a Pandas dataframe:

'2477\t1974\t89.104.195.179\tDK\t17\t212711\x00\n1974\t2370\t212.10.164.160\tDK\t19\t213017\x00\n1974\t2370\t87.50.40.214\tDK\t17\t56743\x00\n'

The problem I'm encountering is that pandas puts each value in its own column instead of producing the 6 columns and 3 rows I want.

pd.read_csv(StringIO(data), sep='\t', lineterminator='\n', names=['a','b','c','d','e','f'])

I've tried playing around with some of the other read_csv parameters with no success. What am I doing wrong?


Solution

Passing sep and lineterminator as raw strings works:

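from io import StringIO  # Python 3; on Python 2 use "from StringIO import StringIO"
import pandas as pd

data = '2477\t1974\t89.104.195.179\tDK\t17\t212711\x00\n1974\t2370\t212.10.164.160\tDK\t19\t213017\x00\n1974\t2370\t87.50.40.214\tDK\t17\t56743\x00\n'
# A sep longer than one character is treated as a regular expression,
# which makes pandas use its Python parsing engine instead of the C engine.
df = pd.read_csv(StringIO(data), sep=r'\t', lineterminator=r'\n', names=['a','b','c','d','e','f'])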
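Another option, if the trailing \x00 (NUL) byte on each line is what confuses the parser, is to strip those bytes before parsing. This is a minimal sketch, not a confirmed diagnosis: it assumes the NUL bytes carry no information and can be dropped. The variable name clean is only for illustration.

```python
from io import StringIO

import pandas as pd

data = ('2477\t1974\t89.104.195.179\tDK\t17\t212711\x00\n'
        '1974\t2370\t212.10.164.160\tDK\t19\t213017\x00\n'
        '1974\t2370\t87.50.40.214\tDK\t17\t56743\x00\n')

# Assumption: the NUL bytes are padding, so remove them before parsing.
clean = data.replace('\x00', '')

# With the NUL bytes gone, a plain tab separator and the default
# line handling are enough; names supplies the six column labels.
df = pd.read_csv(StringIO(clean), sep='\t', names=['a', 'b', 'c', 'd', 'e', 'f'])

print(df.shape)  # (3, 6)
```

Because the cleaned string is ordinary tab-separated text, this version needs neither a regex separator nor a custom lineterminator.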
Licensed under: CC-BY-SA with attribution