This solution might get you closer to what you need, though I'm not sure it's an exact fit as it only let's you calculate the the average at a selected level of the hierarchy (i.e. at "A" or "A1"). Here's my Gremlin session:
gremlin> g = new TinkerGraph()
==>tinkergraph[vertices:0 edges:0]
gremlin> a = g.addVertex("a")
==>v[a]
gremlin> a1 = g.addVertex("a1")
==>v[a1]
gremlin> a2 = g.addVertex("a2")
==>v[a2]
gremlin> a.addEdge('has',a1)
==>e[0][a-has->a1]
gremlin> a.addEdge('has',a2)
==>e[1][a-has->a2]
gremlin> a1.addEdge('relationship',g.addVertex("a11",[points:20]))
==>e[2][a1-relationship->a11]
gremlin> a1.addEdge('relationship',g.addVertex("a12",[points:20]))
==>e[3][a1-relationship->a12]
gremlin> a2.addEdge('relationship',g.addVertex("a21",[points:100]))
==>e[4][a2-relationship->a21]
gremlin> a2.addEdge('relationship',g.addVertex("a22",[points:0]))
==>e[5][a2-relationship->a22]
gremlin> p=g.v("a").out.loop(1){it.loops<10}{true}.path.filter{it.last().getProperty("points")!=null}.toList()
==>[v[a], v[a2], v[a22]]
==>[v[a], v[a2], v[a21]]
==>[v[a], v[a1], v[a12]]
==>[v[a], v[a1], v[a11]]
gremlin> p.collect{[it, it.last().getProperty("points")]}._().groupBy{it[0][0]}{it[1]}{it.sum()/it.size()}.cap.next()
==>v[a]=35
gremlin> p.collect{[it, it.last().getProperty("points")]}._().groupBy{it[0][1]}{it[1]}{it.sum()/it.size()}.cap.next()
==>v[a1]=20
==>v[a2]=50
So, this line gets us the paths that matter (i.e. those that end with a leaf node that has points
:
p=g.v("a").out.loop(1){it.loops<10}{true}.path.filter{it.last().getProperty("points")!=null}.toList()
I store those in p
for later usage. Note that this will explore the tree to a depth of 10 as controlled by it.loops<10
. From there it's pretty simple to use p
to calculate the averages. Here's the example for calculating it for A:
p.collect{[it, it.last().getProperty("points")]}._().groupBy{it[0][0]}{it[1]}{it.sum()/it.size()}.cap.next()
The above basically says, for each path, convert that to a new List where the first item is the path and the second item is the points at the leaf node. Convert that list into a pipeline with the identity function and group on the first item in the path (identified by it[0][0]
) and grab the points value (the second closure to groupBy
) for that path. The third closure to groupBy
is a reducer function which sums the points and calculates the average from that.
Another alternative, if you only need to calculate the average for a single vertex would be with this approach:
gremlin> g.v("a").out.loop(1){it.loops<10}{true}.path{it.points}.filter{it.last()!=null}
.transform{it.last()}.gather.transform{it.sum()/it.size()}
==>35
Note that the traversal is largely the same at the start, but uses a closure when grabbing the path. That closure converts the vertex to the value of the points
property (note that using it.getProperty("points")
is much more efficient than it.points
). From there I again filter out paths that have a null value the last item in the path (i.e. as leaf nodes are the only ones with a points property this should leave us with paths that end with leaves). I then transform those paths to grab the points, gather them to a list and convert the list to an average points for "A".