Loading very large RDF triples into iGraph -> Fast vertices lookup?

Question

Already answered here. For sake of completeness, I'm copying my answer here as well:

Vertex lookups are usually O(|V|) since vertex attributes are not indexed by default - except the name vertex attribute, which is indexed. However g.vs.find is using this index only if you do this: g.vs.find(url) but not if you do this: g.vs.find(name=url). This is sort of a bug as the index could be used in both cases. Also see yesterday's thread from the mailing list.

However, note that igraph's data structures are optimized for static graphs, so g.add_vertex (and I presume you also use g.add_edge) could also be a bottleneck. Internally, igraph uses an indexed edge list to store the graph and the index has to be re-built every time you mutate the graph, so it is much more efficient to do vertex and edge additions in batches where possible.

Since you already seem to have an iterator that yields the edges of your graph in (subject, predicate, object) form, maybe it's easier to use Graph.DictList to construct the graph at once because it also takes care of storing the vertex IDs in the name attribute, adding edges in batches where it makes sense, and also adding the predicate attribute from your triplets:
>>> g = Graph.DictList(vertices=None, edges=({"source": subject,
...         "target": object, "predicate": predicate}
...         for subject, predicate, object in your_iterator))
Graph.DictList processes 100000 pre-generated random triplets in 1.63 seconds on my machine so I guess that improves things a little bit.