Currently in a locked region: Fuseki + full text search + inference

https://stackoverflow.com/questions/18385738

26-06-2022

Question

I've recently started playing with the full text search in the Fuseki 0.2.8 snapshot.

I have an InfModel backed by a TDB dataset, which I've added a Lucene text index to. I have tested it out with some search queries like this:

prefix text: <http://jena.apache.org/text#>
select distinct ?s where { ?s text:query ('stu' 16) }

This works well until two or more queries hit Fuseki simultaneously; then I occasionally get:

Error 500: Currently in a locked region Fuseki - version 0.2.8-SNAPSHOT (Build date: 20130820-0755).

I've tested the endpoint with 10 concurrent users sending queries at random intervals. Over a two-minute period, around 30% of the queries return the 500 error above.

I have also tried disabling inference by replacing this section (full assembler file below):

<#dataset_fulltext> rdf:type     text:TextDataset ;
  text:dataset   <#dataset_inf> ;
  ##text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .

with this:

<#dataset_fulltext> rdf:type     text:TextDataset ;
  ##text:dataset   <#dataset_inf> ;
  text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .

No exceptions are generated when the TextDataset uses #tdbDataset rather than #dataset_inf.

Is there a problem with my setup, or is this a bug in Fuseki?

Here is my current assembler file:

@prefix :        <#> .
@prefix fuseki:  <http://jena.apache.org/fuseki#> .
@prefix rdf:     <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs:    <http://www.w3.org/2000/01/rdf-schema#> .
@prefix tdb:     <http://jena.hpl.hp.com/2008/tdb#> .
@prefix ja:      <http://jena.hpl.hp.com/2005/11/Assembler#> .
@prefix text:    <http://jena.apache.org/text#> .
@prefix dc:      <http://purl.org/dc/terms/> .

[] rdf:type fuseki:Server ;
  # Timeout - server-wide default: milliseconds.
  # Format 1: "1000" -- 1 second timeout
  # Format 2: "10000,60000" -- 10s timeout to first result, then 60s timeout for the rest of the query.
  # See java doc for ARQ.queryTimeout
  ja:context [ ja:cxtName "arq:queryTimeout" ;  ja:cxtValue "12000,50000" ] ;

  fuseki:services (
    <#service1>
  ) .

# Custom code.
[] ja:loadClass "com.hp.hpl.jena.tdb.TDB" .

# TDB
tdb:DatasetTDB  rdfs:subClassOf  ja:RDFDataset .
tdb:GraphTDB    rdfs:subClassOf  ja:Model .

## Initialize text query
[] ja:loadClass       "org.apache.jena.query.text.TextQuery" .
# A TextDataset is a regular dataset with a text index.
text:TextDataset      rdfs:subClassOf   ja:RDFDataset .
# Lucene index
text:TextIndexLucene  rdfs:subClassOf   text:TextIndex .

## ---------------------------------------------------------------
## Service with only SPARQL query on an inference model.
## Inference model base data in TDB.

<#service1>  rdf:type fuseki:Service ;
  rdfs:label               "TDB/text service" ;
  fuseki:name              "dataset" ;         # http://host/dataset
  fuseki:serviceQuery      "query" ;
  fuseki:serviceUpdate     "update" ;
  fuseki:serviceUpload     "upload" ;
  fuseki:serviceReadWriteGraphStore "data" ;
  fuseki:serviceReadGraphStore "get" ;
  fuseki:dataset           <#dataset_fulltext> ;
    .

<#dataset_inf> rdf:type ja:RDFDataset ;
  ja:defaultGraph       <#model_inf> .

<#model_inf> rdf:type ja:Model ;
  ja:baseModel <#tdbGraph> ;
  ja:reasoner [ ja:reasonerURL <http://jena.hpl.hp.com/2003/OWLMicroFBRuleReasoner> ] .

<#tdbDataset> rdf:type tdb:DatasetTDB ;
  tdb:location "Data" .
<#tdbGraph> rdf:type tdb:GraphTDB ;
  tdb:dataset <#tdbDataset> .

# Dataset with full text index.
<#dataset_fulltext> rdf:type     text:TextDataset ;
  text:dataset   <#dataset_inf> ;
  ##text:dataset   <#tdbDataset> ;
  text:index     <#indexLucene> .

# Text index description
<#indexLucene> a text:TextIndexLucene ;
  text:directory <file:Lucene> ;
  ##text:directory "mem" ;
  text:entityMap <#entMap> ;
  .

# Mapping in the index
# URI stored in field "uri"
# dc:title and dc:description are mapped to field "text"
<#entMap> a text:EntityMap ;
  text:entityField      "uri" ;
  text:defaultField     "text" ;
  text:map (
    [ text:field "text" ; text:predicate dc:title ]
    [ text:field "text" ; text:predicate dc:description ]
  ) .

And here is the full stack trace for one of the exceptions in Fuseki's log:

16:27:01 WARN  Fuseki               :: [2484] RC = 500 : Currently in a locked region
com.hp.hpl.jena.sparql.core.DatasetGraphWithLock$JenaLockException: Currently in a locked region
    at com.hp.hpl.jena.sparql.core.DatasetGraphWithLock.checkNotActive(DatasetGraphWithLock.java:72)
    at com.hp.hpl.jena.sparql.core.DatasetGraphTrackActive.begin(DatasetGraphTrackActive.java:44)
    at org.apache.jena.query.text.DatasetGraphText.begin(DatasetGraphText.java:102)
    at org.apache.jena.fuseki.servlets.HttpAction.beginRead(HttpAction.java:117)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.execute(SPARQL_Query.java:236)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.executeWithParameter(SPARQL_Query.java:195)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.perform(SPARQL_Query.java:80)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeLifecycle(SPARQL_ServletBase.java:185)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.executeAction(SPARQL_ServletBase.java:166)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.execCommonWorker(SPARQL_ServletBase.java:154)
    at org.apache.jena.fuseki.servlets.SPARQL_ServletBase.doCommon(SPARQL_ServletBase.java:73)
    at org.apache.jena.fuseki.servlets.SPARQL_Query.doGet(SPARQL_Query.java:61)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:735)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
    at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1448)
    at org.eclipse.jetty.servlets.UserAgentFilter.doFilter(UserAgentFilter.java:82)
    at org.eclipse.jetty.servlets.GzipFilter.doFilter(GzipFilter.java:294)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:370)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:949)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1011)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:644)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.nio.BlockingChannelConnector$BlockingChannelEndPoint.run(BlockingChannelConnector.java:298)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:722)

Any advice appreciated.

Thanks, Stuart.

Answer

This looks like a bug, which I have filed as JENA-522. If you have further details on the bug, please add a comment there.

The issue is that the dataset with inference implicitly uses ARQ's standard in-memory Dataset implementation, and this does not support transactions.

However, text datasets, which correspond to DatasetGraphText internally (and in your stack trace), require the wrapped dataset to support transactions; where it does not, they wrap it with DatasetGraphWithLock. It is this wrapper that appears to be encountering the problem with the lock. The documentation states that it should support multiple readers, but having followed the logic of the code, I'm not sure that it actually allows this.
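To make the suspected failure mode concrete, here is a toy sketch in plain Java. It is not Jena's code: ReadGuard, SingleActiveGuard, MultiReaderGuard and LockDemo are hypothetical names invented for illustration, and the sketch only assumes that the wrapper keeps a single shared "active" flag, which would match the checkNotActive frame in the stack trace but has not been confirmed here. It contrasts a guard that rejects a second reader while one is active with a guard that admits multiple readers.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical interface: anything that brackets a read with begin/end. */
interface ReadGuard {
    void beginRead();
    void endRead();
}

/** Toy guard (an assumption, not Jena's implementation): one active region at a time, fail fast otherwise. */
final class SingleActiveGuard implements ReadGuard {
    private boolean active = false;

    @Override
    public synchronized void beginRead() {
        if (active) {
            throw new IllegalStateException("Currently in a locked region");
        }
        active = true;
    }

    @Override
    public synchronized void endRead() {
        active = false;
    }
}

/** Toy guard that admits any number of concurrent readers via a read/write lock. */
final class MultiReaderGuard implements ReadGuard {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    @Override
    public void beginRead() {
        lock.readLock().lock();
    }

    @Override
    public void endRead() {
        lock.readLock().unlock();
    }
}

final class LockDemo {
    /** Opens one read, then lets a second thread attempt a read; returns true if the second was admitted. */
    static boolean overlappingReadsSucceed(ReadGuard guard) {
        AtomicBoolean secondAdmitted = new AtomicBoolean(false);
        guard.beginRead(); // first request enters its read region
        try {
            Thread second = new Thread(() -> {
                try {
                    guard.beginRead(); // second request arrives while the first is still open
                    secondAdmitted.set(true);
                    guard.endRead();
                } catch (IllegalStateException e) {
                    // the guard rejected the overlapping read
                }
            });
            second.start();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            guard.endRead(); // first request leaves its read region
        }
        return secondAdmitted.get();
    }

    public static void main(String[] args) {
        System.out.println("single-active guard: " + overlappingReadsSucceed(new SingleActiveGuard()));
        System.out.println("multi-reader guard:  " + overlappingReadsSucceed(new MultiReaderGuard()));
    }
}
```

Running LockDemo reports false for the single-active guard and true for the multi-reader guard. If the real wrapper behaves like the first guard, that would be consistent with the intermittent 500 errors seen only under concurrent load.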

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow