Already answered here. For sake of completeness, I'm copying my answer here as well:
Vertex lookups are usually O(|V|) since vertex attributes are not indexed by default - except the
name
vertex attribute, which is indexed. Howeverg.vs.find
is using this index only if you do this:g.vs.find(url)
but not if you do this:g.vs.find(name=url)
. This is sort of a bug as the index could be used in both cases. Also see yesterday's thread from the mailing list.However, note that igraph's data structures are optimized for static graphs, so
g.add_vertex
(and I presume you also useg.add_edge
) could also be a bottleneck. Internally, igraph uses an indexed edge list to store the graph and the index has to be re-built every time you mutate the graph, so it is much more efficient to do vertex and edge additions in batches where possible.Since you already seem to have an iterator that yields the edges of your graph in
(subject, predicate, object)
form, maybe it's easier to useGraph.DictList
to construct the graph at once because it also takes care of storing the vertex IDs in thename
attribute, adding edges in batches where it makes sense, and also adding thepredicate
attribute from your triplets:>>> g = Graph.DictList(vertices=None, edges=({"source": subject, ... "target": object, "predicate": predicate} ... for subject, predicate, object in your_iterator))
Graph.DictList
processes 100000 pre-generated random triplets in 1.63 seconds on my machine so I guess that improves things a little bit.