Question

I'm planning on running a machine learning algorithm that learns node values and edge weights. The algorithm is very similar to the value iteration algorithm here. Each node represents a location and each edge is a path to a new location. Each node and edge maintains a value that represents how desirable it is. These values are updated every iteration based on the values of the previous iteration.

I plan on using neo4j as the backend. There will be around 600,000 nodes and 100,000,000 edges for now, though more may be added later (the graph won't fit in memory). What is the best way to retain the values from the previous iteration? Two approaches come to mind:

  1. Clone the current database and use one copy as the "current iteration" copy and the other as the "previous iteration" copy.
  2. At the end of every iteration, copy the value of each node and edge into a "previous iteration" property.

How do people usually do this? Is there a better way?

Solution

What is the size of the graph, and how many iterations are you doing? With smaller graphs, I would duplicate the graph per version, either on disk as a new graph or as a subgraph within the same database.

For graphs of your size, I think the property-archiving approach is the better one.
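A minimal sketch of that property-archiving step, assuming the official neo4j Python driver; the label `Location`, the relationship type `PATH`, the property names `value`/`prev_value`, and the connection details are all hypothetical placeholders for your own schema:

```python
from neo4j import GraphDatabase

# Hypothetical connection details -- adjust for your setup.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

def archive_values(session):
    """Copy the current value of every node and relationship into a
    prev_value property before the next iteration overwrites value."""
    # With 100M relationships you would batch these updates (for example
    # with apoc.periodic.iterate) rather than run one huge transaction.
    session.run("MATCH (n:Location) SET n.prev_value = n.value")
    session.run("MATCH ()-[r:PATH]->() SET r.prev_value = r.value")

with driver.session() as session:
    archive_values(session)

driver.close()
```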

OTHER TIPS

The MDP article you reference gives a good overview of value and policy iteration, but stops short of mentioning asynchronous dynamic programming. In fact, you don't need to perform full, isolated sweeps of the graph for the values to converge [1]; you can update the vertices one at a time, in any order, as long as you don't neglect any of them. In other words, you don't need to make a copy of the graph at all.
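A minimal in-memory sketch of that asynchronous (Gauss-Seidel style) update, assuming a simple dict-based graph where each state maps to a hypothetical list of `(reward, next_state)` choices; the same single-copy idea applies when each node is read from and written back to the database one at a time:

```python
# Asynchronous value iteration: update each state in place, reusing the
# freshest neighbour values instead of a snapshot from the previous sweep.
def async_value_iteration(transitions, gamma=0.9, tol=1e-6, max_sweeps=1000):
    values = {s: 0.0 for s in transitions}          # single copy of the values
    for _ in range(max_sweeps):
        max_delta = 0.0
        for s, choices in transitions.items():      # any visiting order works,
            best = max(r + gamma * values[s2]       # as long as no state is
                       for r, s2 in choices)        # neglected forever
            max_delta = max(max_delta, abs(best - values[s]))
            values[s] = best                        # overwrite in place
        if max_delta < tol:
            break
    return values
```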


[1] Bertsekas, D. P., 1982. Distributed dynamic programming. IEEE Transactions on Automatic Control, 27(3), pp. 610-616.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow