Question

I'm trying to copy a whole database from SQL Server to Postgres. I've been writing a script that can run in SSMS with a Postgres instance set up as a linked server. This will not be a one-off operation.

I've managed to create a script that generates most of the schema, i.e. tables, constraints, indexes, etc. I do this by querying the information_schema views in SQL Server, formatting the results into valid SQL for Postgres, and running an EXEC(@sql) AT POSTGRES statement, where POSTGRES is the linked server and @sql is a variable containing my statement. This is working fine.
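A simplified sketch of what the script does for a single table (the real script also maps SQL Server types to Postgres types; sourcetable is a placeholder name, and EXEC ... AT requires the linked server's RPC Out option to be enabled):

DECLARE @sql nvarchar(max);

-- Build a bare-bones CREATE TABLE statement from the metadata views.
SELECT @sql = 'CREATE TABLE ' + t.TABLE_NAME + ' (' +
    STUFF((SELECT ', ' + c.COLUMN_NAME + ' ' + c.DATA_TYPE
           FROM INFORMATION_SCHEMA.COLUMNS c
           WHERE c.TABLE_NAME = t.TABLE_NAME
           ORDER BY c.ORDINAL_POSITION
           FOR XML PATH('')), 1, 2, '') + ')'
FROM INFORMATION_SCHEMA.TABLES t
WHERE t.TABLE_NAME = 'sourcetable';

-- Run the generated DDL on the linked Postgres server.
EXEC (@sql) AT POSTGRES;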

Now I'm trying to insert the data using a statement like this:

INSERT INTO OPENQUERY(POSTGRES,'SELECT * FROM targettable')
SELECT *
FROM sourcetable

The statements are slightly modified in some cases to deal with different data types, but this is the idea. The problem is that when the table is particularly large, this statement fails with the error:

Msg 109, Level 20, State 0, Line 0
A transport-level error has occurred when receiving results from the server. (provider: Named Pipes Provider, error: 0 - The pipe has been ended.)

I think the error is caused by either Postgres or SQL Server using too much memory to process the large statement. I've found that if I manually select only part of the data to insert at a time, it works; for instance, the top 10,000 rows. But I don't know a general way to select only x rows at a time that isn't specific to the table I'm referencing.

Perhaps someone can suggest a better way of doing this, though. Keep in mind I do need to change some of the data before inserting it into Postgres, e.g. geospatial information is transformed to a string so Postgres will be able to interpret it.
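For example, a geometry column can be turned into well-known text with STAsText() (GeomCol and id are placeholder names here):

-- Convert a SQL Server geometry column to well-known text (WKT);
-- on the Postgres side PostGIS can parse it with ST_GeomFromText().
SELECT id, GeomCol.STAsText() AS geom_wkt
FROM sourcetable;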

Thanks!


Solution 2

I ended up using the OFFSET X ROWS FETCH NEXT Y ROWS ONLY syntax introduced in SQL Server 2012, so the complete statement looked like this:

INSERT INTO OPENQUERY(POSTGRES,'SELECT * FROM targettable')
SELECT *
FROM sourcetable
ORDER BY 1
OFFSET 0 ROWS FETCH NEXT 10000 ROWS ONLY

And everything works and no error appears! I actually iterate through the OFFSET value, adding 10,000 to it on every iteration using dynamic SQL, as sketched below.
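A simplified sketch of that loop (table names are placeholders and error handling is omitted):

DECLARE @offset int = 0,
        @batch  int = 10000,
        @rows   int = (SELECT COUNT(*) FROM sourcetable),
        @sql    nvarchar(max);

WHILE @offset < @rows
BEGIN
    -- Build the batched INSERT ... OPENQUERY statement for this slice.
    SET @sql = N'
        INSERT INTO OPENQUERY(POSTGRES, ''SELECT * FROM targettable'')
        SELECT *
        FROM sourcetable
        ORDER BY 1
        OFFSET ' + CAST(@offset AS nvarchar(20)) + N' ROWS
        FETCH NEXT ' + CAST(@batch AS nvarchar(20)) + N' ROWS ONLY;';

    EXEC sp_executesql @sql;

    SET @offset += @batch;
END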

Not the cleanest or nicest solution. I think most people would be better off writing something in another language, as mentioned by Michal Niklas, but this worked for me.

OTHER TIPS

I have transferred some large databases, and for PostgreSQL I see two ways:

  1. Export the data into a CSV file, convert the CSV file into PostgreSQL COPY format (see https://wiki.postgresql.org/wiki/COPY) and load it with COPY (the wiki page shows more alternatives); a sketch of the COPY step follows this list.
  2. Write a Jython program that connects to both databases (Python is easy and Jython can work with JDBC drivers), SELECT from the source database (if you have a lot of data then use setFetchSize()), use a PreparedStatement with INSERT into the destination database, and then dest_insert_stmt.setObject(i, src_rs.getObject(i)).
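For the first way, the load on the PostgreSQL side is a single statement (a sketch; the file path and CSV options depend on how you export the file):

-- Load a CSV file exported from SQL Server into the target table.
COPY targettable FROM '/tmp/sourcetable.csv' WITH (FORMAT csv, HEADER true);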