So the basic operation of a depth first search is:
This works the same for any arbitrary graph as for a tree. A tree is just a special case. You can even visualize your maze as a tree:
#########
#123# #
#4### # #
#5# # #
# # # ###
# # # #
# ### # #
# # #
#########
The only difference between using this algorithm on a tree vs. an arbitrary graph is that it's implicitly known which nodes in a tree have been visited, due to the hierarchical structure of a tree. With an arbitrary graph you might have a structure like:
#####
#187#
#2#6#
#345#
#####
And when examining node eight you don't want to treat node one as a new place to visit.
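As a rough sketch of that idea (not the only way to do it), here is a depth first search over the ring above that remembers visited nodes in a set. The names `dfs_order`, `ring` and `visited` are made up for this illustration, and the adjacency dictionary is just one assumed way to represent the graph.

```python
def dfs_order(graph, start):
    """Return the nodes in the order a depth first search first reaches them."""
    visited = set()
    order = []

    def visit(node):
        visited.add(node)
        order.append(node)
        for neighbour in graph[node]:
            # Skip anything already seen; without this check the search
            # would walk around the ring forever.
            if neighbour not in visited:
                visit(neighbour)

    visit(start)
    return order


# The ring from the diagram: every cell touches the next one, and 8 touches 1.
ring = {n: [n % 8 + 1, (n - 2) % 8 + 1] for n in range(1, 9)}

print(dfs_order(ring, 1))
```

When the search reaches node 8 its neighbours are 1 and 7, both already in `visited`, so it stops there instead of starting over at node 1.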
With your maze, one way to remember which nodes have been visited would be to fill them in with '#' as soon as you encounter them.
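A minimal sketch of the fill-in trick, assuming the maze is stored as a list of lists of characters and is completely surrounded by walls (so no bounds checks are needed). The function name `solve` and the two small mazes are invented for this example.

```python
def solve(maze, row, col):
    """Return True if 'X' can be reached from (row, col).

    Assumes the maze is fully enclosed by '#'. Modifies the maze in place.
    """
    if maze[row][col] == 'X':
        return True
    if maze[row][col] == '#':
        return False  # a wall, or a cell we already filled in
    maze[row][col] = '#'  # mark as visited by turning the cell into a wall
    for d_row, d_col in ((0, 1), (1, 0), (0, -1), (-1, 0)):
        if solve(maze, row + d_row, col + d_col):
            return True
    return False


open_maze = [list(r) for r in ["#####",
                               "#O  #",
                               "# #X#",
                               "#####"]]
blocked_maze = [list(r) for r in ["#####",
                                  "#O#X#",
                                  "#####"]]

print(solve(open_maze, 1, 1))     # start on the 'O'
print(solve(blocked_maze, 1, 1))  # 'X' is walled off
```

Note that this destroys the maze as it goes. If you need the original layout afterwards, work on a copy or keep a separate grid of visited flags.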
I have pretty much figured out how to represent the position of the agent, how to move him around and such but my problem mostly is in how to use the stack for keeping track of which places the agent has been. By what I've found in google some keep a stack of the visited places but I never really understood when to remove positions from the stack, that's my biggest confusion
The stack itself is not used to keep track of which places have been visited. It only keeps track of the current 'path' taken through the maze. When you reach a dead end, nodes get removed from the stack; those nodes must remain marked as visited even though they are no longer on the stack. If removing a node from the stack also 'unvisits' it, your search may keep trying and retrying the same dead ends.
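To make the two roles concrete, here is a sketch with an explicit stack. It assumes a walled maze given as a list of strings and a start position given as a `(row, column)` tuple; `find_path` and the sample maze are hypothetical names and data for illustration only. The stack holds the current path, and the '#' marks record what has been visited.

```python
def find_path(rows, start):
    """Return the list of cells from start to 'X', or None if there is none."""
    maze = [list(r) for r in rows]      # work on a copy we can mark up
    stack = [start]                     # the current path, nothing more
    maze[start[0]][start[1]] = '#'      # visited mark lives in the grid
    while stack:
        row, col = stack[-1]
        for d_row, d_col in ((0, 1), (1, 0), (0, -1), (-1, 0)):
            n_row, n_col = row + d_row, col + d_col
            if maze[n_row][n_col] == 'X':
                return stack + [(n_row, n_col)]
            if maze[n_row][n_col] != '#':
                maze[n_row][n_col] = '#'    # mark when first encountered
                stack.append((n_row, n_col))
                break
        else:
            # Dead end: leave the path, but the cell stays marked as visited.
            stack.pop()
    return None


sample = ["#####",
          "#O  #",
          "# ###",
          "#  X#",
          "#####"]

print(find_path(sample, (1, 1)))
```

With this move order the search first walks right into the dead-end corridor, pops those two cells off the stack, and then goes down. Because the popped cells are still filled in, it never wanders back into that corridor.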
I recommend that you first draw out a few little mazes and walk through them by hand using the flow chart above. This will give you a better understanding of the algorithm. Here are some example mazes:
Start at O, finish at X
####  #####  #####  #####
#OX#  #O X#  #O#X#  #O  #
####  #####  #####  # #X#
                    #####
Then consider each box in the flow chart and think about what data it uses, how to represent that data, and how to implement the box's action in code using that data.