Question

I'm trying to divide a very big text file into two parts and dump the two parts into two different MySQL tables. I do this in Python: I inspect the text line by line and categorize each line by a category code.

Now, after I divide the text, how do I pipe each part into a separate FIFO so I can feed those FIFOs to the MySQL client tools?

Was it helpful?

Solution 2

I think you're looking to create pipes for two separate, and apparently simultaneous, MySQL imports from the same Python script?

While it's not impossible to do this via shell redirection, it's going to be painful. Your Python script has to somehow pass the file descriptors of its pipes to the shell, so your shell script can redirect those file descriptors to the MySQL commands.

A much easier solution is to do it in Python, with the subprocess module.

I don't know the tool and syntax you hope to use for doing the bulk load; all you've told us is that you want to give it a "pipe". So, I'll just assume that it's the mysqlimport command mentioned in hbristow's answer, and that it handles stdin via the usual Unix convention of giving it - as a filename; since this is just for demonstrating the actual interesting part, it doesn't matter very much anyway.

So:

from subprocess import Popen, PIPE

args = ['mysqlimport', my_db_name, '-']
# universal_newlines=True makes the pipes text-mode, so we can write
# str (not bytes) on Python 3
with Popen(args, stdin=PIPE, universal_newlines=True) as import1, \
     Popen(args, stdin=PIPE, universal_newlines=True) as import2:
    with open('giantfile.txt') as f:
        for line in f:
            data = parse(line)
            if belongs_in_import2(data):
                import2.stdin.write(make_sql(data))
            else:
                import1.stdin.write(make_sql(data))

We've created two separate child processes, each with its own separate stdin pipe, and we can write to them the same way we can to any other files.

You may need to import1.stdin.close() and import2.stdin.close() if the mysqlimport tool expects you to close/EOF the input file before actually waiting on it to exit.
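That close/EOF-then-wait pattern can be sketched as follows; `cat` is used here purely as a stand-in for mysqlimport, so the example runs without a MySQL server:

```python
from subprocess import Popen, PIPE

# 'cat' plays the role of mysqlimport: it reads stdin until EOF,
# then exits, just like a bulk-load tool fed via a pipe.
proc = Popen(['cat'], stdin=PIPE, stdout=PIPE, universal_newlines=True)
proc.stdin.write("INSERT INTO t VALUES (1);\n")
proc.stdin.close()          # send EOF so the child stops reading
out = proc.stdout.read()    # drain the child's output
proc.wait()                 # reap the child after it exits
```

Without the `close()`, the child never sees EOF and `wait()` would block forever.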

If you're using Python 2.4-2.7, you should install and use the subprocess32 backport. If you can't do that for some reason (or if you're using Python 3.0-3.1 and can't upgrade for some reason), you can't use a with statement here; instead, you need to explicitly close the pipes and wait for the processes.
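The explicit close-and-wait variant might look like this (again with `cat` standing in for mysqlimport so the sketch is runnable anywhere):

```python
from subprocess import Popen, PIPE

# Sketch for Python versions where Popen is not a context manager:
# close each pipe and wait for each child explicitly.
import1 = Popen(['cat'], stdin=PIPE, stdout=PIPE, universal_newlines=True)
import2 = Popen(['cat'], stdin=PIPE, stdout=PIPE, universal_newlines=True)
try:
    import1.stdin.write("row for table 1\n")
    import2.stdin.write("row for table 2\n")
finally:
    import1.stdin.close()   # EOF for child 1
    import2.stdin.close()   # EOF for child 2
    import1.wait()
    import2.wait()
```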

Other suggestions

I assume what you want to do is call the MySQL command

LOAD DATA INFILE

without actually creating the INFILE. You could try using the mysqlimport command-line client and, provided it is happy to accept a pipe, do something like:

python categorize.py --code x big_text_file.txt | mysqlimport db_name /dev/stdin

where your Python script splits the text file by the code input on the command-line and outputs the result as a string, which is piped to mysqlimport.
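A minimal sketch of that categorize.py is below; the tab-separated line layout and the `--code` flag are assumptions for illustration, not part of the original question:

```python
import sys

def filter_lines(lines, code):
    """Yield only the lines whose first tab-separated field equals code."""
    for line in lines:
        if line.split('\t', 1)[0] == code:
            yield line

# Minimal CLI, invoked as: python categorize.py --code x big_text_file.txt
if __name__ == '__main__' and len(sys.argv) == 4 and sys.argv[1] == '--code':
    with open(sys.argv[3]) as f:
        sys.stdout.writelines(filter_lines(f, sys.argv[2]))
```

Everything the script writes to stdout flows straight into mysqlimport through the shell pipe.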

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow