Have a token at the starting station. When a train passes through the station, it "picks up" a copy of the token and then "drops off" copies of it at each stop.
Similarly, when any train passes through a station with a token, it picks up a copy if it doesn't have one already.
Simulate this until the destination station receives a token.
There are N+M
trains to keep track of, and at most O(N+M)
steps before the destination station receives a token. Hence this should run in O((N+M)^2)
time.
You can simulate this implicitly by having a counter for each station and propagating a newly-updated counter's value along its row and column with arithmetic, but it's still going to be quadratic time because you've got a quadratic number of stations to deal with, so you may as well do it explicitly and keep it simple.