Derive Sobel's operators?

Question 1

The formulation was proposed by Irwin Sobel a long time ago. I think about 1974. There is a great page on the subject here.

The main advantage of convolving the 9 pixels surrounding one at which gradients are to be detected is that this simple operator is really fast and can be done with shifts and adds in low-cost hardware.

They are not the greatest edge detectors in the world - Google Canny edge detectors for something better, but they are fast and suitable for a lot of simple applications.

Question 2

So spatial filters, like the Sobel kernels, are applied by "sliding" the kernel over the image (this is called convolution). If we take this kernel:

[-1 0 1]
[-2 0 2]
[-1 0 1]

After applying the Sobel operator, each result pixel gets a:

high (positive) value if the pixels on the right side are bright and pixels on the left are dark
low (negative) value if the pixels on the right side are dark and pixels on the left are bright.

This is because in discrete 2D convolution, the result is the sum of each kernel value multiplied by the corresponding image pixel. Thus a vertical edge causes the value to have a large negative or positive value, depending on the direction of the edge gradient. We can then take the absolute value and scale to interval [0, 1], if we want to display the edges as white and don't care about the edge direction.

This works identically for the other kernel, except it finds horizontal edges.