The simple formula is valid if and only if the motion from left camera to right one is a pure translation (in particular, parallel to the horizontal image axis).
In practice this is hardly ever the case. It is common, for example, to perform the matching after rectifying the images, i.e. after warping them using a known Fundamental Matrix, so that corresponding pixels are constrained to belong to the same row. Once you have matches on the rectified images, you can remap them onto the original images using the inverse of the rectifying warp, and then triangulate into 3D space to reconstruct the scene. OpenCV has a routine to do that: reprojectImageTo3d