Question

I have downloaded the yago.n3 dataset.

However, for testing I want to work on a smaller version of the dataset: the file is 2 GB, so even a small change takes a long time to debug.

So I tried copying a small portion of the data into a separate file, but that did not work and threw lexical errors.

I have seen the earlier posts, but they are about big datasets, whereas I am looking for a smaller one.

Is there any way to obtain a smaller subset of the same dataset?

Solution

If you have an RDF parser at hand that can read your yago.n3 file, you can parse it and write as many RDF triples as you need to a separate file, which gives you a smaller dataset to run your experiments with.

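If you can find the data in N-Triples format (i.e. one RDF triple per line), you can simply take as many lines as you want and make your dataset as small as you like: head -n 10 filename.nt would give you a tiny dataset of 10 triples. This works because every line of an N-Triples file is a complete statement on its own, whereas an N3 file can rely on prefix declarations and statements that span several lines, so an arbitrary slice of it may not parse.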
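If head is not available, the same line-based cut can be sketched in Python. This is only a minimal sketch, assuming the input really is N-Triples (one complete triple per line); the function name take_first_triples and the file names are hypothetical, and the function skips blank lines and comment lines, which head would count.

```python
def take_first_triples(src_path, dst_path, n):
    """Copy the first n triple lines of an N-Triples file to dst_path.

    Assumes one complete triple per line (N-Triples). Blank lines and
    comment lines (starting with '#') are skipped and not counted.
    Returns the number of lines actually written.
    """
    written = 0
    with open(src_path, encoding="utf-8") as src, \
         open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            if written >= n:
                break  # stop early so the rest of the big file is never read
            stripped = line.strip()
            if not stripped or stripped.startswith("#"):
                continue
            dst.write(stripped + "\n")
            written += 1
    return written


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python subset.py filename.nt small.nt 10
    if len(sys.argv) == 4:
        count = take_first_triples(sys.argv[1], sys.argv[2], int(sys.argv[3]))
        print(f"wrote {count} triples")
```

Because the file is read line by line and the loop stops as soon as n triples are written, this stays fast even on a multi-gigabyte input.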

Licensed under: CC-BY-SA with attribution