How does the Titan Twitter Example work?

https://stackoverflow.com/questions/22821095

26-06-2023

Question

I'm trying to wrap my mind around graph data right now. I'm finding it difficult to think in terms of property graphs. On the vertex-centric indices docs page, there is an example involving Twitter data. The Gremlin code is:

g = TitanFactory.open(conf)
// graph schema construction
g.makeKey('name').dataType(String.class).indexed(Vertex.class).make()
time = g.makeKey('time').dataType(Long.class).make()
if(useVertexCentricIndices)
  g.makeLabel('tweets').sortKey(time).make()
else 
  g.makeLabel('tweets').make()
g.commit()

// graph instance construction
g.addVertex([name:'v1000']);
g.addVertex([name:'v10000']);
g.addVertex([name:'v100000']);
g.addVertex([name:'v1000000']);

for(i=1000; i<1000001; i=i*10) {
  v = g.V('name','v' + i).next();
  (1..i).each {
    v.addEdge('tweets',g.addVertex(),[time:it])
    if(it % 10000 == 0) g.commit()
  }; g.commit()
}

The explanation is that each edge represents someone tweeting a tweet vertex. This doesn't make sense to me as a schema. Why should any two nodes be connected? If the answer is that the edge connects different tweets that a user has tweeted, then one edge connects more than one node. This would mean that Titan is a hypergraph, which I thought it wasn't.

In short, can someone explain this example better than the docs?

Solution

The example in the wiki is a bit over-simplified and is designed to convey the concept of vertex-centric indices. On its own, it might not be the best thing to use for understanding how to model a schema in general. That said, I think the model still makes basic sense (at least in that light).
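
To make the point of the example concrete, here is a minimal sketch of the idea behind a sort key on the tweets label. It is plain Python, not Titan's API or storage format: the list `edge_times` and the helpers `scan_range` and `sorted_range` are hypothetical names invented for illustration. The assumption is simply that edges kept sorted by time can be range-queried with a binary search instead of a full scan of the vertex's edges.

```python
from bisect import bisect_left, bisect_right

# Toy stand-in for the 'tweets' edges of one user vertex (NOT Titan's API).
# Each edge is represented only by its 'time' property, as in the example.
edge_times = list(range(1, 1001))  # times 1..1000, like vertex v1000


def scan_range(times, lo, hi):
    """No sort key: every incident edge has to be inspected."""
    return [t for t in times if lo <= t <= hi]


def sorted_range(sorted_times, lo, hi):
    """Edges kept sorted by time: binary-search the two slice bounds."""
    start = bisect_left(sorted_times, lo)
    end = bisect_right(sorted_times, hi)
    return sorted_times[start:end]


result_scan = scan_range(edge_times, 10, 20)
result_sorted = sorted_range(edge_times, 10, 20)
```

Both helpers return the same edges; the difference is that the second one does not have to touch all of a vertex's edges, which matters most for the vertex with a million tweets.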

"If the answer is that the edge connects different tweets that a user has tweeted, then one edge connects more than one node."

I'm not sure where you see that in the code. I see four user vertices doing the tweeting (v1000, v10000, etc.). The for loop iterates over each user and adds tweets edges for it. Each time an edge is created, a new vertex is created to represent the tweet. Perhaps I'm misunderstanding you, but in that sense no edge connects more than two vertices: each edge goes out of one user vertex and into one tweet vertex, so this is an ordinary property graph, not a hypergraph.
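
If it helps, the shape of the data can be mimicked with a small in-memory model. This is only a sketch in plain Python, not Titan code: the `Edge` tuple, the `vertices` dict and the `add_vertex` helper are hypothetical, and the tweet counts are scaled down from the original 1000 to 1000000 so that it runs quickly.

```python
from collections import namedtuple

# Hypothetical in-memory model of the example graph (NOT Titan's API).
# An edge has exactly one out-vertex and one in-vertex.
Edge = namedtuple("Edge", ["out_vertex", "label", "in_vertex", "time"])

vertices = {}  # vertex id -> property map
edges = []


def add_vertex(props=None):
    """Create a vertex with optional properties and return its id."""
    vid = len(vertices)
    vertices[vid] = dict(props or {})
    return vid


# Assumption: scaled-down counts; the original loop uses 1000 .. 1000000.
for count in (10, 100, 1000):
    user = add_vertex({"name": "v%d" % count})
    for t in range(1, count + 1):
        tweet = add_vertex()  # a fresh tweet vertex for every edge
        edges.append(Edge(user, "tweets", tweet, t))

# Only user vertices carry a 'name' property in this model.
user_ids = {vid for vid, props in vertices.items() if "name" in props}

# Count how many edges point into each tweet vertex.
in_degree = {}
for e in edges:
    in_degree[e.in_vertex] = in_degree.get(e.in_vertex, 0) + 1
```

Every edge starts at a user vertex and ends at its own tweet vertex, and every tweet vertex has exactly one incoming edge. A user with many tweets simply has many separate edges, rather than one edge that touches many vertices.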

Licensed under: CC-BY-SA with attribution
