Question

Please help me with this beginner question. I want to show a network with a large number of nodes (70,000) and 2.1 million links in a force layout, and I am looking for a good, scalable way to do this.

  1. How do we show such a large number of nodes in practice? Can we use some kind of approximation and show a semantically equivalent network (e.g. http://www.visualcomplexity.com/vc/project.cfm?id=76)?
  2. How do we reduce this much data on the back end (say, using KDE)? We cannot afford to use science.js on the front end because the volume is too large.
  3. The initial view can be the network with predetermined locations for the nodes or clusters. How do we predetermine the locations on the back end before sending the data to d3js? Do we have to use topojson?

Are there any examples of this using d3js with a back end (say Java, Python, etc.)?


Solution

Sorry to ask, but do you really need to show all that information at once?

If you really do, first have a look at the network in Gephi to see what it looks like, then move on to the next step.

If you find that you can start from specific nodes or patterns and then let the user explore outward from there, that is probably the best solution from a performance point of view. If this discovery approach works but you still have trouble with many items on the screen, limit the force layout with a time-based threshold, i.e. stop the simulation once a fixed amount of time has passed. It's not perfect, but it will work for hundreds of nodes.
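The time-based threshold can be sketched as follows. In d3 this would amount to stopping the simulation after a fixed delay; the version below only illustrates the control pattern in Python with a naive O(n²) force step. The function name, the `budget_s` and `max_iters` parameters, and the force constants are all assumptions made for illustration, not part of d3 or any other library.

```python
import math
import random
import time


def layout_with_time_budget(nodes, links, budget_s=0.05, max_iters=300, seed=0):
    """Naive force-directed layout that stops once the time budget is spent.

    nodes is a list of ids, links a list of (source, target) pairs.
    Returns (positions, iterations_run). Hypothetical helper, O(n^2) per step,
    so only meant for the small subgraph currently on screen.
    """
    rng = random.Random(seed)
    pos = {n: [rng.uniform(-1, 1), rng.uniform(-1, 1)] for n in nodes}
    k = 1.0 / math.sqrt(max(len(nodes), 1))  # assumed ideal link length
    deadline = time.monotonic() + budget_s
    iters = 0
    while iters < max_iters and time.monotonic() < deadline:
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                d = math.hypot(dx, dy) or 1e-9
                f = k * k / d
                disp[a][0] += dx / d * f
                disp[a][1] += dy / d * f
                disp[b][0] -= dx / d * f
                disp[b][1] -= dy / d * f
        # Attraction along each link.
        for a, b in links:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            f = d * d / k
            disp[a][0] -= dx / d * f
            disp[a][1] -= dy / d * f
            disp[b][0] += dx / d * f
            disp[b][1] += dy / d * f
        # Move each node, with the step capped by a cooling temperature.
        temp = 0.1 * (1 - iters / max_iters)
        for n in nodes:
            dx, dy = disp[n]
            d = math.hypot(dx, dy) or 1e-9
            step = min(d, temp)
            pos[n][0] += dx / d * step
            pos[n][1] += dy / d * step
        iters += 1
    return pos, iters
```

Whatever positions exist when the budget runs out are drawn as they are, which is why the result is rougher than a fully converged layout.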

Next step

If you decide to go down this path anyway, I would recommend the following:

  • Aggregate: this is probably the most useful thing you can do here. Show a summarized view and let the user interact with the data and drill down to see more detail. It is also the best solution if you have to serve many clients.

  • Do not run the force-directed layout on the front end with the entire network as is: it will eat all the browser's resources for at least tens of minutes in any case.

  • Compute the layout on the back end (e.g. using JUNG or the Gephi core itself in Java, or NetworkX in Python) and then just display the result.

  • Cache the result of the previous point as well: the computation is heavy even for the server, especially if you have many clients.

  • When the user drags the network, hide the links: it should speed up the computation (sigmajs uses this trick).
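As a rough illustration of the aggregate, precompute and cache points, here is a minimal back-end sketch using only the Python standard library. It assumes each node already has a cluster id (how you cluster is up to you), and the names `aggregate`, `circular_layout` and `cached_layout`, as well as the JSON shape, are hypothetical. `circular_layout` is only a placeholder for a real layout call into JUNG, Gephi or NetworkX.

```python
import hashlib
import json
import math
import os
import tempfile
from collections import Counter


def aggregate(links, cluster_of):
    """Collapse a node-level link list into weighted cluster-level links."""
    sizes = Counter(cluster_of.values())  # number of nodes per cluster
    weights = Counter()
    for a, b in links:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca != cb:  # links inside one cluster vanish in the summary view
            weights[tuple(sorted((ca, cb)))] += 1
    return sizes, weights


def circular_layout(clusters):
    """Placeholder layout: put the clusters evenly on a unit circle."""
    n = len(clusters)
    return {c: [math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n)]
            for i, c in enumerate(sorted(clusters))}


def cached_layout(sizes, weights, cache_dir, layout_fn=circular_layout):
    """Compute positions once per distinct graph, then reuse them from disk.

    Returns (data, cache_hit), where data is a JSON-ready dict.
    """
    payload = json.dumps([sorted(sizes.items()), sorted(weights.items())])
    key = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, key + ".json")
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            return json.load(fh), True
    pos = layout_fn(list(sizes))
    data = {
        "nodes": [{"id": c, "size": sizes[c], "x": pos[c][0], "y": pos[c][1]}
                  for c in sorted(sizes)],
        "links": [{"source": a, "target": b, "weight": w}
                  for (a, b), w in sorted(weights.items())],
    }
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(data, fh)
    return data, False


if __name__ == "__main__":
    demo_links = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]
    demo_clusters = {"a": "A", "b": "A", "c": "A", "d": "B", "e": "B"}
    s, w = aggregate(demo_links, demo_clusters)
    data, hit = cached_layout(s, w, tempfile.mkdtemp())
    print(json.dumps(data), hit)
```

The front end then only receives fixed coordinates for a handful of clusters and can draw them directly; expanding a cluster would trigger another (cached) request for that cluster's members.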

Licensed under: CC-BY-SA with attribution