Question

I have a large amount of spatial data that I need to analyze and use in an application. The original data is in WKT format, and I'm wrapping it in INSERT SQL statements to upload it.

INSERT INTO sp_table (ID_Info, "shape") VALUES ('California', ST_GeomFromText('POLYGON((49153 4168, 49154 4168, 49155 4168, 49155 4167, 49153 4168))'));

However, this approach takes too much time because the data set is large (10 million rows). Is there another way to upload a large amount of spatial data?

Any speedup hacks and tricks are welcome.

Solution

Insert your text file into a table (with proper columns) using COPY

Add a SERIAL PRIMARY KEY to this table if it doesn't have one

VACUUM

Spawn one process per CPU, each running this:

INSERT INTO sp_table ( ID_Info, "shape")
SELECT state_name, ST_GeomFromText( geom_as_text )
FROM temp_table
WHERE id % number_of_cpus = x

Use a different value of "x" for each process, from 0 to one less than the number of CPUs, so the entire table is processed. This lets each core work on its own share of the slow ST_GeomFromText calls.

Create the GiST index after insertion.
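The per-CPU split above can be sketched as a small Python helper that generates one statement per worker. This is only an illustration, not part of the original answer: the function name build_worker_statements is hypothetical, and the table and column names (temp_table, id, state_name, geom_as_text) are taken from the query above and assumed to exist.

```python
import os


def build_worker_statements(num_workers):
    """Return one INSERT ... SELECT per worker, partitioned by id modulo num_workers.

    Hypothetical helper: it only builds SQL text and does not connect to a database.
    """
    if num_workers < 1:
        raise ValueError("num_workers must be at least 1")
    template = (
        'INSERT INTO sp_table (ID_Info, "shape")\n'
        "SELECT state_name, ST_GeomFromText(geom_as_text)\n"
        "FROM temp_table\n"
        "WHERE id % {n} = {x};"
    )
    # x runs from 0 to num_workers - 1, so every id lands in exactly one partition.
    return [template.format(n=num_workers, x=x) for x in range(num_workers)]


if __name__ == "__main__":
    # Print one statement per CPU; each is meant to run in its own session.
    for statement in build_worker_statements(os.cpu_count() or 1):
        print(statement)
        print()
```

Each generated statement would then be run in its own database session (for example, one psql invocation per statement) so that the sessions execute concurrently.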

OTHER TIPS

Here you can find some general performance tips. You probably have the fsync setting enabled, so every INSERT command is forced to be physically written to disk, which is why it takes so much time.

Turning off fsync is not recommended (especially in production environments), because fsync is what lets you safely recover data after an unexpected OS crash. According to the documentation:

Thus it is only advisable to turn off fsync if you can easily recreate your entire database from external data.
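A way to reduce the per-INSERT flush cost while leaving fsync on is to group many INSERT statements into one transaction, so the forced write happens at each COMMIT rather than after every statement. The following Python sketch is an assumption-laden illustration, not something from the answer above: the names quote_literal and batched_insert_script are hypothetical, the batch size of 1000 is arbitrary, and the quoting assumes PostgreSQL's standard_conforming_strings setting is on.

```python
def quote_literal(value):
    """Quote a string as a SQL literal by doubling single quotes.

    Assumes standard_conforming_strings is on, so backslashes need no escaping.
    """
    return "'" + value.replace("'", "''") + "'"


def batched_insert_script(rows, batch_size=1000):
    """Yield SQL lines, wrapping every batch_size INSERTs in BEGIN/COMMIT.

    rows is an iterable of (name, wkt) pairs; nothing is sent to a database.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    count = 0
    for name, wkt in rows:
        if count % batch_size == 0:
            yield "BEGIN;"
        yield (
            'INSERT INTO sp_table (ID_Info, "shape") '
            "VALUES ({}, ST_GeomFromText({}));".format(
                quote_literal(name), quote_literal(wkt)
            )
        )
        count += 1
        if count % batch_size == 0:
            yield "COMMIT;"
    # Close the last, partially filled batch.
    if count % batch_size != 0:
        yield "COMMIT;"


if __name__ == "__main__":
    sample = [("California", "POLYGON((0 0, 1 0, 1 1, 0 0))")]
    print("\n".join(batched_insert_script(sample)))
```

The output can be written to a file and run as a script. For the 10 million rows in the question, COPY into a staging table is still likely to be faster than any form of INSERT.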

Licensed under: CC-BY-SA with attribution