Question

I'm trying to load the resource "http://dbpedia.org/resource/Brazil" from DBpedia into Sesame 2.7.1, but Sesame stops with the following error:

'1000000000000' is not a valid value for datatype http://www.w3.org/2001/XMLSchema#int

The same error occurs with other resources, and I don't know how to fix it. In my Java code (based on the answers to my other questions), I tried:

RepositoryConnection con = repository.getConnection();

con.getParserConfig().addNonFatalError( BasicParserSettings.VERIFY_DATATYPE_VALUES );
con.getParserConfig().addNonFatalError( BasicParserSettings.FAIL_ON_UNKNOWN_DATATYPES );

con.add(uri.toURL(), null, RDFFormat.RDFXML);
con.close();

But Sesame still fails with the same error.

I didn't have this problem with Sesame 2.6.9 (it accepted the same resource without problems).

Is there any reason for this to happen? Is there a way to fix it, or should I go back to Sesame 2.6.9?

Thanks!


Solution

The reason this happens is that in Sesame 2.7, datatype verification has been made more strict. It turns out that, unfortunately, DBPedia contains a lot of data with invalid datatypes.
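To see why this particular value is rejected: xsd:int is a 32-bit signed integer, so its largest legal value is 2147483647, and 1000000000000 is far beyond that. The value would be valid as xsd:long or xsd:integer. The following is a minimal, standalone sketch of such a range check. It is not Sesame's actual validation code, and the class and method names (XsdIntCheck, fitsXsdInt) are made up for illustration:

```java
import java.math.BigInteger;

// Hypothetical helper, not part of Sesame: a simplified range check
// for xsd:int lexical values.
final class XsdIntCheck {
    // xsd:int is a 32-bit signed integer, the same range as Java's int.
    private static final BigInteger MIN = BigInteger.valueOf(Integer.MIN_VALUE);
    private static final BigInteger MAX = BigInteger.valueOf(Integer.MAX_VALUE);

    // Returns true if the string parses as an integer within the xsd:int range.
    static boolean fitsXsdInt(String lexical) {
        try {
            BigInteger value = new BigInteger(lexical.trim());
            return value.compareTo(MIN) >= 0 && value.compareTo(MAX) <= 0;
        } catch (NumberFormatException e) {
            // Not an integer at all (e.g. "12.5" or "abc").
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsXsdInt("1000000000000")); // false: exceeds 2147483647
        System.out.println(fitsXsdInt("2147483647"));    // true: the largest xsd:int
    }
}
```

A check like this could be used to find the offending literals in a downloaded file before loading it.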

The best fix would be to push the DBPedia maintainers to clean up their data, but of course that may be more easily said than done :) In the meantime, you can edit the DBPedia files yourself to fix these problems as Sesame reports them.

I am assuming that you are using an HTTPRepository (or a SPARQLRepository) to try and load the file into a repository running on a Sesame server. In this case, configuring the parser to ignore errors (using addNonFatalError) has no effect, because the parser you are configuring is the client-side parser, which is not the one that actually parses the data. When you upload a file from a URL via an HTTPRepository connection, the data is parsed by a parser on the server, not by the parser on the client side.

In Sesame 2.7.1, there is unfortunately no easy way around this: the parser configuration used in a Sesame Server is fixed. We are looking into a mechanism to make this configurable for an upcoming 2.7.2 release, though.

OTHER TIPS

Try:

con.getParserConfig().setNonFatalErrors(
        new HashSet<RioSetting<?>>(
                Arrays.asList(BasicParserSettings.FAIL_ON_UNKNOWN_DATATYPES)));
Licensed under: CC-BY-SA with attribution