A* uses a heuristic to guess the minimal cost of a node to the target. So when choosing a node, you don't only analyze the cost from the start to this node but also the probable cost from the node to the target. This allows to ignore paths that probably lead in the wrong direction.
If the heuristic in your example uses the geographic distance of the nodes, then roads leading north will be examined first (because they have a small estimated overall cost). But it is possible that during the execution of the algorithm those roads aggregate a very large actual distance from the start (maybe because there are lots of curves). Then also roads leading south can be examined. So it is possible to go south for a bit and go straight north if this is shorter than all the curvy roads leading north (not taking into account the actual road map of GB).
The nature of the heuristic defines the quality of the algorithm. If the heuristic gives a lower bound for the distance (i.e. it is not possible to get to the target with a cheaper cost than the heuristic says), then the path is really the shortest path. If the heuristic is not a lower bound, other paths could be allowed as well (which might be a tradeoff between algorithm runtime and path quality). The better the heuristic estimates the minimal cost, the less wrong directions you will examine. I.e. if the heuristic is constant, you'll end up with a plain Dijkstra's algorithm. If the heuristic calculates the actual cost to the target, you will only follow the shortest path and no other nodes will be examined.