Question

I'm looking for tools/libraries that allow fast (easy) data import into existing database tables. For example, phpMyAdmin allows data import from .csv, .xml, etc. In Hadoop's Hue, via Beeswax for Hive, we can create a table from a file. I'm looking for tools I can use with PostgreSQL, or for libraries that make such imports fast and easy - I want to avoid coding it manually, from reading the file to inserting into the DB via JDBC.

Solution

You can do all that with standard tools in PostgreSQL, without additional libraries.

For .csv files you can use the built-in COPY command. COPY is fast and simple. The source file has to reside on the machine running the database server for that. If it doesn't, you can use the very similar \copy meta-command of psql, which reads the file on the client machine instead.
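A minimal sketch of both variants, assuming a table my_table whose columns match the CSV layout; the table name and file paths are placeholders:

    -- server-side: the file must be readable by the database server process
    COPY my_table FROM '/var/lib/postgresql/import.csv' CSV HEADER;

    -- client-side, from within psql: the file is read on the client machine
    \copy my_table FROM 'import.csv' CSV HEADER

CSV HEADER tells COPY to expect comma-separated values and to skip the first line of column headers.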

For .xml files (or any format, really) you can use the built-in pg_read_file() inside a plpgsql function. However, I quote:

Only files within the database cluster directory and the log_directory can be accessed.

So you have to put your source file there or create a symbolic link to your actual file/directory. Then you can parse it with unnest() and xpath() and friends. You need at least PostgreSQL 8.4 for that.
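A minimal sketch of that approach, assuming a file staff.xml placed in the data directory and a target table staff(id int, name text); the function name, file name, table and the XML layout (a document of &lt;row&gt; elements with &lt;id&gt; and &lt;name&gt; children) are all hypothetical:

    CREATE OR REPLACE FUNCTION import_staff_xml()
      RETURNS void AS
    $func$
    DECLARE
       -- pg_read_file(filename, offset, length); requires superuser privileges
       x xml := pg_read_file('staff.xml', 0, 100000)::xml;
    BEGIN
       INSERT INTO staff (id, name)
       SELECT (xpath('//id/text()',   node))[1]::text::int
            , (xpath('//name/text()', node))[1]::text
       FROM   unnest(xpath('//row', x)) AS node;  -- one row per <row> element
    END
    $func$ LANGUAGE plpgsql;

Then run it with SELECT import_staff_xml(); and adjust the XPath expressions and the length argument of pg_read_file() to your actual file.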

For a kick start on parsing XML, see this blog post by Scott Bailey.
