Question

When using the write_adjlist function from Python's NetworkX library, I run into the following problem:

The output looks like this:

164021756 15579697
 836289488
 268525305
527465237 1514162604
460419343
 317218275
397533608
 37880000
39066509
 1146692844

When it should look like this:

164021756 15579697 836289488 268525305
527465237 1514162604
460419343 317218275
397533608 37880000
39066509 1146692844

I can't really give you the data, because it's millions of nodes (which might be a factor here, although I don't think so) but this is basically how I'm getting there:

import networkx as nx

G = nx.DiGraph()
graph_file = open(filename, 'r')

for line in graph_file.readlines():
    try:
        x, y = line.replace('\n', '').split(',')
    except ValueError:
        print "didn't work"
        continue
    # The graph is undirected, but each relationship must be
    # listed under both nodes, so add the edge in both directions.
    G.add_edge(x, y)
    G.add_edge(y, x)

nx.write_adjlist(G, outfilename)

graph_file is presented in the form userid1,userid2\n

This code worked fine for a 2k-node graph and a 16k-node graph.

The error might be due to the generate_adjlist function in the source code, but I'm not really sure. I appreciate all help and recommendations for other methods to create an adjacency list as well.

Specs: Ubuntu 14.04 64bit, 32GB of RAM, SSD, AMD FX(tm)-8350 Eight-Core Processor

EDIT: This is what graph_file looks like:

212127041,218628098
840686875,2278293507
1854227586,2278293507
2266167497,2278293507
2254676097,2278293507
2240955304,2278293507
2226709709,2278293507
1859242609,2278293507
341722764,2278293507
1270686055,2278293507
1049821634,2278293507
1003015644,2278293507
616403983,2278293507
556471190,2278293507
27260086,2278293507
714928003,2278293507
1270696736,2278293507
586671909,2278293507
34507480,2278293507

Solution

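Your graph_file probably uses line endings other than a single '\n', for example '\r\n' or '\n\r'. In that case line.replace('\n','') leaves a '\r' at the end of y, so the node label itself ends in a carriage return, and that shows up as a stray line break when write_adjlist writes the label out. Instead of line.replace('\n',''), try line.strip(), which removes all leading and trailing whitespace.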
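
To illustrate the difference, here is a minimal sketch. The helper names parse_edge_buggy and parse_edge are made up for this example, and the sample line assumes the file really does use '\r\n' endings, which I can't confirm from the question.

```python
def parse_edge_buggy(line):
    # Mirrors the parsing in the question: only '\n' is removed,
    # so a '\r' from a '\r\n' line ending stays attached to y.
    x, y = line.replace('\n', '').split(',')
    return x, y


def parse_edge(line):
    # strip() removes all leading and trailing whitespace,
    # including both '\r' and '\n'.
    x, y = line.strip().split(',')
    return x, y


# Hypothetical input line with a Windows-style line ending.
sample = '212127041,218628098\r\n'

buggy = parse_edge_buggy(sample)
fixed = parse_edge(sample)

print(repr(buggy))  # ('212127041', '218628098\r')
print(repr(fixed))  # ('212127041', '218628098')
```

If the repr of your own node labels shows a trailing '\r' like the first result, switching to strip() in the loop should give the one-line-per-node output you expect.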

Licensed under: CC-BY-SA with attribution