PyBrain Reinforcement Learning - Maze and Graph

https://stackoverflow.com/questions/11980051

26-06-2021

Question

I am trying to implement something similar to a maze problem in PyBrain. It is closer to a building with an emergency exit: an agent is placed in one of the rooms and has to find the exit. To model this, a bidirectional (undirected) graph could be used, with the rooms as nodes and weighted edges representing the paths between rooms.
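To make the graph idea concrete, here is a minimal sketch in plain Python (it does not use PyBrain). The room names, doors, and weights are hypothetical and chosen only for illustration. It stores each door in both directions and uses Dijkstra's algorithm to find the cheapest route to the exit. This is planning over a known map rather than learning, so it serves mainly as a baseline for what a learning agent should eventually find.

```python
import heapq

# Hypothetical layout: rooms A-E plus "EXIT". Each weight is an assumed
# walking cost between two rooms joined by a door.
DOORS = [
    ("A", "B", 2),
    ("B", "C", 1),
    ("B", "EXIT", 6),
    ("C", "D", 2),
    ("D", "E", 1),
    ("E", "EXIT", 1),
]


def build_graph(doors):
    """Build an undirected weighted graph as a dict of dicts."""
    graph = {}
    for a, b, cost in doors:
        # A door can be crossed either way, so store both directions.
        graph.setdefault(a, {})[b] = cost
        graph.setdefault(b, {})[a] = cost
    return graph


def cheapest_route(graph, start, goal):
    """Dijkstra's algorithm: return (total_cost, rooms_on_route) or None."""
    queue = [(0, start, [start])]
    visited = set()
    while queue:
        cost, room, path = heapq.heappop(queue)
        if room == goal:
            return cost, path
        if room in visited:
            continue
        visited.add(room)
        for neighbour, step_cost in graph[room].items():
            if neighbour not in visited:
                heapq.heappush(
                    queue, (cost + step_cost, neighbour, path + [neighbour])
                )
    return None


graph = build_graph(DOORS)
print(cheapest_route(graph, "A", "EXIT"))
# -> (7, ['A', 'B', 'C', 'D', 'E', 'EXIT'])
```

With these assumed weights, the direct door from B to the exit costs more than the detour through C, D, and E, so the cheapest route is the longer one.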

I tried to implement a new environment, but I am not sure what each part should contain. For example, based on the abstract Environment class, I came up with this:

#!/usr/bin/python2.7

from pybrain.rl.environments.environment import Environment


class RoomEnv(Environment):
    # number of action values acceptable by the environment
    # Two events: go forward and go back through the door
    # (but how do we know which room is connected to which?)
    indim = 2
    # Maybe a matrix where 0 is no connection and 1 is a connection(?)
    #                 A,B,C,D,E,F
    # indim = array([[0,0,0,0,0,0],  # A
    #                [0,0,0,0,0,1],  # B
    #                [0,0,0,0,0,0],  # C
    #                [0,0,0,0,0,0],  # D
    #                [0,0,0,0,0,1],  # E
    #                [0,0,0,0,0,1],  # F
    #               ])

    # the number of sensors is the number of rooms
    outdim = 6

    def getSensors(self):
        # Initial state:
        # Could be any room, maybe something random(?)
        pass

    def performAction(self, action):
        # We should look at all the possible states to learn which option is best for reaching the outside state.
        # Maybe a for loop that goes through all the paths and uses some weight to find the best option?
        print("Action performed: %s" % action)

    def reset(self):
        # Most environments will implement this optional method that allows for reinitialization.
        pass


Solution

In PyBrain, you can define the room layout as an array and then pass that structure to Maze to create a new environment. For example:

from numpy import array
from pybrain.rl.environments.mazes import Maze

structure = array([[1, 1, 1, 1, 1, 1, 1, 1, 1],
                   [1, 0, 0, 1, 0, 0, 0, 0, 1],
                   [1, 0, 0, 1, 0, 0, 1, 0, 1],
                   [1, 0, 0, 1, 0, 0, 1, 0, 1],
                   [1, 0, 0, 1, 0, 1, 1, 0, 1],
                   [1, 0, 0, 0, 0, 0, 1, 0, 1],
                   [1, 1, 1, 1, 1, 1, 1, 0, 1],
                   [1, 0, 0, 0, 0, 0, 0, 0, 1],
                   [1, 1, 1, 1, 1, 1, 1, 1, 1]])

# define the environment; (7, 7) is the goal cell (row, column)
environment = Maze(structure, (7, 7))

In the above example, the 1s are walls and the 0s are the cells the agent can walk on. The second argument, (7, 7), is the goal position. You can modify the structure to build your own layout.
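To illustrate how an agent can learn a route through such a grid, here is a minimal tabular Q-learning sketch in plain Python. It does not use PyBrain's API. The helper names (step, train, greedy_path), the reward of 1 at the goal, and the hyperparameters are all assumptions made for this example. The grid and the goal are copied from the structure above, and a move into a wall leaves the agent where it is.

```python
import random

# Same layout as the structure above: 1 = wall, 0 = free cell.
GRID = [
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 0, 1, 0, 1],
    [1, 0, 0, 1, 0, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
]
GOAL = (7, 7)
MOVES = [(1, 0), (0, 1), (-1, 0), (0, -1)]  # (row, column) offsets
FREE = [(r, c) for r, row in enumerate(GRID) for c, v in enumerate(row) if v == 0]


def step(state, action):
    """Apply a move; walking into a wall leaves the agent in place."""
    r, c = state[0] + MOVES[action][0], state[1] + MOVES[action][1]
    nxt = (r, c) if GRID[r][c] == 0 else state
    reward = 1.0 if nxt == GOAL else 0.0  # assumed reward scheme
    return nxt, reward


def train(episodes=2000, max_steps=200, alpha=1.0, gamma=0.9, seed=0):
    """Tabular Q-learning with uniformly random exploration.

    alpha=1.0 is reasonable here only because every move is deterministic.
    """
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in FREE for a in range(4)}
    starts = [s for s in FREE if s != GOAL]
    for _ in range(episodes):
        state = rng.choice(starts)  # random start cell for each episode
        for _ in range(max_steps):
            action = rng.randrange(4)
            nxt, reward = step(state, action)
            # The goal is terminal, so nothing is bootstrapped from it.
            best_next = 0.0 if nxt == GOAL else max(q[(nxt, a)] for a in range(4))
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            if nxt == GOAL:
                break
            state = nxt
    return q


def greedy_path(q, start, limit=100):
    """Follow the highest-valued action from start until the goal or limit."""
    path = [start]
    while path[-1] != GOAL and len(path) <= limit:
        state = path[-1]
        action = max(range(4), key=lambda a: q[(state, a)])
        path.append(step(state, action)[0])
    return path


q_table = train()
route = greedy_path(q_table, (1, 1))
print(len(route) - 1, "moves:", route)
```

In this layout the shortest route from (1, 1) to (7, 7) takes 20 moves, so the greedy path should settle on a route of that length once the table has converged. Because Q-learning is off-policy, purely random exploration is enough in a grid this small. An epsilon-greedy policy would be the more usual choice for larger problems.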

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow