Question

How do I import whole file as one string into PostgreSQL?

create table text_files (
  id serial primary key,
  file_content text
);

I've tried \copy text_files (file_content) from /home/test.txt, but this creates one row per line of the text file.

I have hundreds of small text files and would like to use a bash loop with \copy inside.

Update: If bash and \copy are not the best tools for this task, I can use another programming language. Maybe Python has something to offer.


Solution

If you really must do it in bash, you'll need to do it somewhat by hand:

psql regress -c "insert into text_files(file_content) values ('$(sed "s/'/''/g" test.txt)');"

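This is a bit fragile, though: command substitution strips trailing newlines from the file, a large file can exceed the maximum command-line length, and the whole file is loaded into memory at least a couple of times over. Personally, I recommend using a more sophisticated scripting language.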
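To make the quoting step concrete, here is a minimal Python sketch of what the sed expression in the one-liner does: it doubles every single quote so the file content can sit inside a SQL string literal. The helper names quote_literal and build_insert are made up for illustration, and the sketch assumes standard_conforming_strings is on, so backslashes need no escaping.

```python
def quote_literal(text):
    """Wrap text in single quotes, doubling embedded quotes (like sed "s/'/''/g")."""
    return "'" + text.replace("'", "''") + "'"


def build_insert(content):
    """Build the same INSERT statement that the psql one-liner sends."""
    return "insert into text_files(file_content) values (%s);" % quote_literal(content)


if __name__ == "__main__":
    # Hypothetical sample content with an embedded quote and a newline.
    print(build_insert("it's one string\nwith two lines"))
```

In a real script, letting the database driver bind the value as a parameter is usually safer than building the literal by hand.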

psql has \lo_import, but that imports files into pg_largeobject, not a text field.

OTHER TIPS

This is a basic example in Python, adapted from the documentation.

Note that no try/except blocks are used (which is bad), but it should work. You could run into UTF-8 errors, I/O errors, or other problems I didn't account for (I will revise the code if necessary). Save the code below to a file (say, "myfile.py"), fill in the connection details for your database, replace "/path/to/files/" with a real path, and run "python myfile.py" in your console.

If you have lots of files this could take a while, so keep an eye on your system's memory. Each file is read into memory in full. If a file is larger than the available memory, the script will probably crash. If the files are small, you are fine.
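One way to act on the memory warning is to look for unusually large files before loading anything. The sketch below is only an illustration: the function name files_over_limit and the 10 MB default are arbitrary assumptions, not part of the script that follows.

```python
import os


def files_over_limit(directory, max_bytes=10 * 1024 * 1024):
    """Return (path, size) pairs for files under directory larger than max_bytes."""
    too_big = []
    for root, _dirs, files in os.walk(directory):
        for filename in files:
            path = os.path.join(root, filename)
            size = os.path.getsize(path)
            if size > max_bytes:
                too_big.append((path, size))
    return too_big
```

An empty result means every file is below the chosen limit.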

Test it first!

Requirements: Python with psycopg2 installed

import os
import psycopg2

# Fill in the connection details for your database.
connection = psycopg2.connect(database='my_db', user='my_postgres_user', password='my_postgres_pass')
cursor = connection.cursor()
# Warning: this drops and recreates text_files on every run.
cursor.execute('DROP TABLE IF EXISTS text_files; CREATE TABLE text_files (id SERIAL PRIMARY KEY, file_name TEXT, file_content TEXT);')

directory = os.path.normpath('/path/to/files/')

# Walk the directory tree and insert each file as a single row.
for root, dirs, files in os.walk(directory):
  for filename in files:
    print(filename)
    # Text mode, so the content is sent as a string; assumes UTF-8 files.
    with open(os.path.join(root, filename), 'r', encoding='utf-8') as f:
      cursor.execute('INSERT INTO text_files (file_name, file_content) VALUES (%s, %s);', (filename, f.read()))

connection.commit()
cursor.close()
connection.close()
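Since the script has no error handling, here is a hedged sketch of how the loop could skip unreadable or wrongly encoded files instead of crashing. The function name load_files and the choice to skip and report are assumptions for illustration. It takes any DB-API cursor, so the database connection stays outside the function.

```python
import os


def load_files(cursor, directory, encoding="utf-8"):
    """Insert each file under directory as one row; return the paths that were skipped."""
    skipped = []
    for root, _dirs, files in os.walk(directory):
        for filename in files:
            path = os.path.join(root, filename)
            try:
                with open(path, "r", encoding=encoding) as f:
                    content = f.read()
            except (OSError, UnicodeDecodeError) as exc:
                # Unreadable or wrongly encoded file: report it and keep going.
                print("skipping %s: %s" % (path, exc))
                skipped.append(path)
                continue
            cursor.execute(
                "INSERT INTO text_files (file_name, file_content) VALUES (%s, %s);",
                (filename, content),
            )
    return skipped
```

With psycopg2 you would pass connection.cursor() as the first argument and call connection.commit() afterwards.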

For example, suppose you have this file:

test
test
create table text_files (
  id serial primary key,
  file_content text
);
test
create table text_files (
  id serial primary key,
  file_content text
);

Run this sed command:

sed '/(/{:a;N;/)/!ba};s/\n/ /g' file

test
test
create table text_files (   id serial primary key,   file_content text );
test
create table text_files (   id serial primary key,   file_content text );

It merges the lines of each create table statement into one line. Is this what you are looking for?
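For readers who do not use sed, the same merging logic can be sketched in Python. This is an illustrative equivalent, not a replacement for the command above, and the function name join_parenthesized is made up. Like the sed loop, it always appends at least one following line once it sees "(".

```python
def join_parenthesized(lines):
    """Merge each run of lines from one containing "(" up to one containing ")"."""
    result = []
    it = iter(lines)
    for line in it:
        if "(" in line:
            # Like sed's N: append the next line, then keep appending
            # until the merged text contains ")".
            for nxt in it:
                line += " " + nxt
                if ")" in line:
                    break
        result.append(line)
    return result
```

Calling it on the lines of the sample file, with the trailing newlines removed, gives the merged lines shown in the sed output.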

I ended up using a temporary table in which the file is stored row by row, one row per line.

Table design:

drop table if exists text_files_temp;
create table text_files_temp (
  id serial primary key,
  file_content text
);

drop table if exists text_files;
create table text_files (
  id serial primary key,
  file_name text,
  file_content text
);

Bash script:

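#!/bin/sh

for f in /home/tgr/tmp/*
do
  # Empty the staging table, then load the file one line per row.
  psql -c"delete from text_files_temp;"
  psql -c"\copy text_files_temp (file_content) from '$f' delimiter '$'"
  # Join the rows back into one string, in their original order.
  psql -c"insert into text_files (file_content) select array_to_string(array_agg(file_content order by id),E'\n') from text_files_temp;"
  # Label the row that was just inserted with its file name.
  psql -c"update text_files set file_name = '$f' where file_name is null;"
done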

This works only for files that do not contain the $ character. I chose it as the delimiter because it is the only character that never occurs in my files.
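Because the approach depends on the delimiter never appearing in a file, it may be worth checking the files first. The sketch below is an assumption-laden illustration: the function name unsafe_for_copy is hypothetical and the files are assumed to be UTF-8. It also flags backslashes, which is my addition, since COPY's text format treats the backslash as an escape character.

```python
def unsafe_for_copy(path, delimiter="$"):
    """Return which problem characters (delimiter, backslash) occur in the file."""
    with open(path, "r", encoding="utf-8") as f:
        content = f.read()
    return sorted(ch for ch in (delimiter, "\\") if ch in content)
```

An empty list means neither character was found in that file.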

Licensed under: CC-BY-SA with attribution