Insert tab-delimited values into database

https://stackoverflow.com/questions/8841205

27-10-2019

Problem

I have a tab-delimited txt file with values separated by tabs and rows separated by newlines. This is what it actually looks like:

476502291\t\tLF3139812164\t\tTitle 1\tKids & Family\nGRAV_2011\t\tThe Full Picture\tIndependent\n [...etc...]

Note that sometimes values are separated by two tabs instead of one.

I need to insert this into a MySQL table, which should result in the following:

ID             title               genre
476502291      Title 1             Kids & Family
GRAV_2011      The Full Picture    Independent

How would I read a tab-separated txt file and run a for loop in order to insert values into a table named vendor using MySQLdb?

>>> import MySQLdb
>>> conn = MySQLdb.connect (host = "localhost",
                             user = "me",
                             passwd = "password",
                             db = "my-db")
>>> cursor = conn.cursor ()
>>> # for loop  # how to read from the txt file to insert it as required?
>>>     # cursor.execute (INSERT...)
>>> conn.commit()
>>> conn.close()

Solution

Step 1. Read up on the csv module: http://docs.python.org/library/csv.html. It does what you want.

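import csv

with open('your_data_file.dat', 'r', newline='') as source:
    rdr = csv.reader(source, delimiter='\t', quoting=csv.QUOTE_NONE)
    for row in rdr:
        # Doubled tabs produce empty strings; drop them to get the columns.
        columns = [field for field in row if field]
        # Run your INSERT with these columns here.
conn.commit()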

Step 2. Read up on context managers as well.

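import csv
from contextlib import closing

with open('your_data_file.dat', 'r', newline='') as source:
    rdr = csv.reader(source, delimiter='\t', quoting=csv.QUOTE_NONE)
    with closing(conn.cursor()) as cursor:
        for row in rdr:
            # Doubled tabs produce empty strings; drop them to get the columns.
            columns = [field for field in row if field]
            # Run your INSERT with these columns via cursor.execute(...).
conn.commit()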

This ensures that cursors and files are properly closed.
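
Putting the two steps together, a minimal sketch might look like the following. The helper names read_rows and insert_rows are hypothetical, and the column names ID, title and genre are assumptions taken from the expected table in the question. The %s placeholders follow MySQLdb's parameter style; other DB-API drivers may use a different one.

```python
import csv
from contextlib import closing


def read_rows(source):
    """Yield the non-empty fields of each tab-delimited line as a tuple."""
    reader = csv.reader(source, delimiter='\t', quoting=csv.QUOTE_NONE)
    for row in reader:
        # Doubled tabs show up as empty strings, so filter them out.
        fields = tuple(field for field in row if field)
        if fields:
            yield fields


def insert_rows(conn, rows):
    """Insert (ID, title, genre) tuples into the vendor table and commit."""
    # Assumed column names; adjust them to the real table definition.
    sql = "INSERT INTO vendor (ID, title, genre) VALUES (%s, %s, %s)"
    with closing(conn.cursor()) as cursor:
        # Let the driver substitute the values instead of formatting the SQL.
        cursor.executemany(sql, list(rows))
    conn.commit()
```

With an open connection, usage would be along the lines of opening the file with newline='' and calling insert_rows(conn, read_rows(source)). Note that read_rows drops every empty field, so it cannot tell a doubled delimiter from a genuinely blank column.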

Other tips

As long as tabs are only used as delimiters in your file, you should be able to do something like this:

import re

# connect to MySQLdb

with open(file_name) as f:
    for line in f:
        # Strip the trailing newline so it does not end up in genre.
        row_id, title, genre = re.split(r'\t+', line.rstrip('\n'))
        # execute INSERT statement

The idea is that you will always have two groups of tabs, one between ID and title, and the other between title and genre. By using re.split() on \t+ (one or more tabs) you will get a list of length 3 with the fields you are interested in.

If there are any lines in your file that do not match this format you should add some additional checking, maybe something along the lines of data = re.split(r'\t+', line) and if len(data) == 3: before the tuple unpacking.
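
The length check described above could be sketched roughly as follows. The function name parse_line is hypothetical, and returning None for malformed lines is just one possible choice; logging or raising an error would work too.

```python
import re


def parse_line(line):
    """Return (ID, title, genre) for a well-formed line, otherwise None."""
    # Drop the newline first so it does not become part of the last field.
    data = re.split(r'\t+', line.rstrip('\n'))
    if len(data) == 3:
        return tuple(data)
    # Anything with more or fewer than three fields is treated as malformed.
    return None
```

The caller can then skip lines for which parse_line returns None. Note that the first sample row in the question splits into four values, so this check would skip it as written.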

edit: This solution assumes that you do not have blank fields, so if it is legal for a line to have just an ID and a genre but no title, this will not work. It will still work if a line has a title but no ID or genre, as long as there are leading tabs when the ID is missing and trailing tabs when the genre is missing.
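
To see why a blank field at either end survives but a blank field in the middle does not, here is a small illustration. The helper split_fields and the sample lines are made up for the demonstration.

```python
import re


def split_fields(line):
    """Split one line on runs of tabs, ignoring the trailing newline."""
    return re.split(r'\t+', line.rstrip('\n'))


# Missing ID with a leading tab: an empty string holds its place.
print(split_fields("\tThe Full Picture\tIndependent\n"))
# ['', 'The Full Picture', 'Independent']

# Missing genre with a trailing tab: again three fields.
print(split_fields("GRAV_2011\tThe Full Picture\t\n"))
# ['GRAV_2011', 'The Full Picture', '']

# Missing title: the two tabs collapse into one delimiter, leaving two fields.
print(split_fields("GRAV_2011\t\tIndependent\n"))
# ['GRAV_2011', 'Independent']
```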

License: CC-BY-SA with attribution

Not affiliated with StackOverflow