Question

I have to transfer RDF data from a triplestore to property tables. An example of a triplestore and the resulting property tables is given below.

triplestore

Subject  Property  Object
Sub1     prop1     hello
Sub2     prop1     hello1
Sub3     prop1     hello2
Sub1     prop2     world
Sub2     prop2     world1
Sub3     prop2     world2
Sub4     prop3     random

Property Table 1

Subject  prop1   prop2
Sub1     hello   world
Sub2     hello1  world1
Sub3     hello2  world2

Property Table 2

Subject  prop3
Sub4     random

This is a very simplified version of the dataset I am using. There are around a million records in the triplestore table. More than one property table has to be created, depending on how the properties and objects are grouped. I have already identified and created the property tables. The properties that make up a property table are chosen so that each subject is fully contained in a single property table.

The problem I am facing is inserting the data from the triplestore into the property tables. Is there a way to insert the data for a particular subject into a row of a property table in a single INSERT statement? If it can't be done in a single query, what is the most efficient way to do it?

I am using Python to create a dump of SQL queries, which I later run on a PostgreSQL server.

Solution

This is easy if you have a known, fixed set of properties. If you do not, you have to generate dynamic SQL, either from your app, from PL/pgSQL, or using the crosstab function from the tablefunc extension.

For fixed property sets you can self-join:

http://sqlfiddle.com/#!12/391b7/6

SELECT p1."Subject", p1."Object" AS "prop1", p2."Object" AS "prop2"
FROM triplestore p1
INNER JOIN triplestore p2 ON (p1."Subject" = p2."Subject")
WHERE p1."Property" = 'prop1'
  AND p2."Property" = 'prop2'
ORDER BY p1."Subject";

SELECT p1."Subject", p1."Object" AS "prop3"
FROM triplestore p1
WHERE p1."Property" = 'prop3'
ORDER BY p1."Subject";

To turn these into INSERTs, simply use INSERT ... SELECT, e.g.:

INSERT INTO "Property Table 2"
SELECT p1."Subject", p1."Object" AS "prop3"
FROM triplestore p1
WHERE p1."Property" = 'prop3'
ORDER BY p1."Subject";
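
Since the question mentions generating the SQL from Python, the INSERT ... SELECT pattern above could be produced for each property table by a small generator along these lines. This is only a sketch: the helper names (quote_ident, quote_literal, build_insert) are made up for illustration, and it assumes the source table is called triplestore with "Subject", "Property" and "Object" columns as in the queries above.

```python
def quote_ident(name):
    """Double-quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def quote_literal(value):
    """Single-quote an SQL string literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(table, props, source="triplestore"):
    """Build one INSERT ... SELECT that fills `table` from `source`.

    The first property anchors the query; each further property is
    self-joined on "Subject", mirroring the hand-written query.
    `table`, `props` and `source` are illustrative parameter names.
    """
    if not props:
        raise ValueError("at least one property is required")
    aliases = [f"p{i}" for i in range(1, len(props) + 1)]
    columns = ", ".join(quote_ident(c) for c in ["Subject", *props])
    selected = ", ".join(
        [f'{aliases[0]}."Subject"']
        + [f'{a}."Object" AS {quote_ident(p)}' for a, p in zip(aliases, props)]
    )
    lines = [
        f"INSERT INTO {quote_ident(table)} ({columns})",
        f"SELECT {selected}",
        f"FROM {quote_ident(source)} {aliases[0]}",
    ]
    # One self-join per additional property, all tied to the first alias.
    for a in aliases[1:]:
        lines.append(
            f'INNER JOIN {quote_ident(source)} {a} '
            f'ON ({aliases[0]}."Subject" = {a}."Subject")'
        )
    conditions = [
        f'{a}."Property" = {quote_literal(p)}' for a, p in zip(aliases, props)
    ]
    lines.append("WHERE " + "\n  AND ".join(conditions))
    return "\n".join(lines) + ";"


if __name__ == "__main__":
    print(build_insert("Property Table 1", ["prop1", "prop2"]))
    print(build_insert("Property Table 2", ["prop3"]))
```

Each call produces one statement per property table, so the whole table is filled in a single query rather than one INSERT per subject. Note that the inner joins skip any subject that lacks one of the listed properties; if that can happen in your data, a LEFT JOIN with the property test moved into the ON clause would be needed instead.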

OTHER TIPS

Generally speaking, what you are trying to do smells a bit like the EAV (Entity-Attribute-Value) pattern, which is widely considered an antipattern. In addition, I don't think I really understand what you are trying to achieve, so sorry if my answer doesn't suit your needs.

If your problem is storing data of a previously unknown format under a certain key (in your example this seems to be the subject), I would suggest using the PostgreSQL contrib hstore extension. This would allow you to create a table like

create table foo (
  id serial not null primary key,
  subject character varying not null,
  properties hstore
);

in which the properties field is essentially what Ruby, for instance, calls a "Hash". You can insert key/value pairs into this store (from your example above, for instance, 'prop1=>hello') and select them with equivalent syntax.

Inserting is fairly straightforward:

insert into foo (subject, properties) values ('Sub1', 'prop1=>Hello'::hstore);
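
If the triples are already available in Python, one insert per subject could be generated by grouping the properties first. The following is a rough sketch rather than a tested loader: hstore_quote, sql_literal and hstore_inserts are hypothetical helper names, the table foo is the one defined above, and the quoting assumes PostgreSQL's default setting standard_conforming_strings = on.

```python
from collections import defaultdict


def hstore_quote(text):
    """Double-quote an hstore key or value, escaping backslashes and quotes."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def sql_literal(text):
    """Single-quote an SQL string literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def hstore_inserts(triples, table="foo"):
    """Yield one insert statement per subject.

    `triples` is an iterable of (subject, property, object) strings.
    All properties of a subject are packed into a single hstore value.
    """
    by_subject = defaultdict(dict)
    for subject, prop, obj in triples:
        # Assumption: a property repeated for a subject keeps the last value.
        by_subject[subject][prop] = obj
    for subject, props in by_subject.items():
        pairs = ", ".join(
            f"{hstore_quote(k)}=>{hstore_quote(v)}" for k, v in props.items()
        )
        yield (
            f"insert into {table} (subject, properties) "
            f"values ({sql_literal(subject)}, {sql_literal(pairs)}::hstore);"
        )


if __name__ == "__main__":
    sample = [
        ("Sub1", "prop1", "hello"),
        ("Sub2", "prop1", "hello1"),
        ("Sub1", "prop2", "world"),
    ]
    for statement in hstore_inserts(sample):
        print(statement)
```

This keeps every subject in a single row regardless of which properties it has, at the cost of holding the grouped triples in memory while the dump is written.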

The advantage over other methods is that hstore supports btree, GIN and GiST indexes (each under certain preconditions). In your case, where you mostly do direct matches searching for a certain value in a property, even btree works, since it supports the equality operator for hstore.

Licensed under: CC-BY-SA with attribution