Your statement " But when I check the error e = Ax-By, its not zero always. " almost always will be true, regardless of your technique, or what weighting you choose. When you have an over-described system, you are basically trying to fit a straight line to a slew of points. Unless, by chance, all the points can be placed exactly on a single perfectly straight line, there will be some error. So no matter what technique you use to choose the line, (weights and so on) you will always have some error if the points are not colinear. The alternative would be to use some kind of spline, or in higher dimensions to allow for warping. In those cases, you can choose to fit all the points exactly to a more complicated shape, and hence result with 0 error.
So the choice of a weight matrix simply changes which straight line you will use by giving each point a slightly different weight. So it will not ever completely remove the error. But if you had a few particular points that you care more about than the others, you can give the error on those points higher weight when choosing the least square error fit.
For spline fitting see:
http://en.wikipedia.org/wiki/Spline_interpolation
For the really nicest spline curve interpolation you can use Centripital Catmull-Rom, which in addition to finding a curve to fit all the points, will prevent unnecessary loops and self intersections that can sometimes come up during abrupt changes in the data direction.