Question

I'm looking at running graph aggregate queries (groupBy, groupCount) across edges in a Titan graph database over two data sets:

  1. About 10,000 nodes and about 1 million edges

  2. About 200,000 nodes and about 1 billion edges

Does anyone know at what point I need to put in the effort to install Faunus to run this type of Gremlin query within, say, 1 minute?


Solution

At 10,000 nodes and 1M edges, you shouldn't have problems with plain Gremlin (no Faunus needed). See the code below, where I generate a graph of approximately that size using Furnace:

gremlin>  g = TitanFactory.open('/tmp/titan/generated')
==>titangraph[local:/tmp/titan/generated]
gremlin> import com.tinkerpop.furnace.generators.*
==>import com.tinkerpop.gremlin.*
==>import com.tinkerpop.gremlin.java.*
...
==>import com.tinkerpop.furnace.generators.*
gremlin> for (int i=0;i<10000;i++) g.addVertex(i)
==>null
gremlin> r = new java.util.Random()
==>java.util.Random@137f0ced
gremlin> generator = new DistributionGenerator("knows", { it.setProperty("weight", r.nextInt(100)) } as EdgeAnnotator)
==>com.tinkerpop.furnace.generators.DistributionGenerator@111a3ce4
gremlin> generator.setOutDistribution(new PowerLawDistribution(2.1))
==>null
gremlin> generator.generate(g,1000000)
==>1042671

Recalling your post here on aggregates, I basically execute the same query on this data set.

gremlin> start=System.currentTimeMillis();m=g.E.groupBy{it.getProperty("weight")}{it}.cap.next();System.currentTimeMillis()-start
==>1415
gremlin> m.size()
==>100

As you can see, this traversal takes about 1.5 seconds (it's about 500 ms on TinkerGraph, which is entirely in memory).
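Since your question also mentions groupCount, the same pattern applies. Here's a sketch of the groupCount variant (I haven't timed this one on the generated data set, so no numbers):

gremlin> start=System.currentTimeMillis();m=g.E.groupCount{it.getProperty("weight")}.cap.next();System.currentTimeMillis()-start
gremlin> m.size()

groupCount only tallies edges per weight value rather than collecting them into lists, so it should be no slower than the groupBy above.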

At 1B edges you will likely need Faunus. I don't think you could iterate over all of those edges in under a minute even if you could somehow fit them all in memory. Note that even with Faunus you might not get one-minute query/answer times; you will need to experiment a bit.
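For reference, the equivalent aggregate in Faunus is expressed in much the same Gremlin syntax, but compiles down to MapReduce jobs. A rough sketch, assuming a Faunus installation configured against your Titan cluster (the properties file name here is hypothetical; use whichever input configuration matches your storage backend):

gremlin> g = FaunusFactory.open('bin/titan-input.properties')
gremlin> g.E.groupCount{it.weight}

Because each such query spins up Hadoop jobs, expect latencies measured in minutes rather than seconds, which is why experimentation against your actual cluster is the only way to know if a 1-minute target is reachable.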

Licensed under: CC-BY-SA with attribution