A pixel shader is a program for exactly one fragment. However, pixels are always shaded in 2x2 blocks (quads). The pixel shader runs simultaneously for these 4 pixels, so derivatives can be computed for mipmapping etc. When you call tex.Sample, it is executed for all pixels in the respective quad, so the gradient can be determined from the differences between neighboring pixels. This is the reason why gradient functions cannot be used inside ifs or loops that branch differently per pixel: the program flow could then differ within the fragment quad, and the neighboring values needed for the difference would be missing.
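To make the quad idea concrete, here is a rough CPU-side sketch in Python (not shader code) of how a gradient and a mip level could be derived from the four values in one quad. The names `quad_derivatives` and `mip_level` are made up for illustration, and the differencing scheme and LOD formula are a simplified assumption; real hardware may use finer differences, anisotropy, and LOD bias.

```python
import math

def quad_derivatives(values):
    """Approximate d/dx and d/dy of a per-pixel value inside one 2x2 quad.

    values is [[top_left, top_right], [bottom_left, bottom_right]].
    Assumption: coarse differences taken from the top-left pixel.
    """
    (tl, tr), (bl, _br) = values
    ddx = tr - tl  # change when moving one pixel to the right
    ddy = bl - tl  # change when moving one pixel down
    return ddx, ddy

def mip_level(u_quad, v_quad, tex_size):
    """Pick a mip level from the texture coordinates of the four quad pixels."""
    du_dx, du_dy = quad_derivatives(u_quad)
    dv_dx, dv_dy = quad_derivatives(v_quad)
    # Footprint of one screen pixel measured in texels (larger axis wins).
    rho = tex_size * max(math.hypot(du_dx, dv_dx), math.hypot(du_dy, dv_dy))
    if rho <= 0.0:
        return 0.0
    # Magnification (rho < 1) is clamped to the base level.
    return max(0.0, math.log2(rho))

if __name__ == "__main__":
    # Each screen pixel steps 2 texels across a 256-texel-wide texture.
    step = 2 / 256
    u = [[0.0, step], [0.0, step]]
    v = [[0.0, 0.0], [step, step]]
    print(mip_level(u, v, 256))
```

The point is only that neither function can work with a single pixel: all four values of the quad have to exist at the same point in the program, which is why the shader runs in lockstep for the whole quad.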
A good description of how mipmapping works, for example, can be found here: http://www.arcsynthesis.org/gltut/Texturing/Tut15%20How%20Mipmapping%20Works.html
@ Your edit, which should be answered by my link, but anyway: I'm not sure how it's implemented, but in the best case your shader would be run 16 times. The pixels are grouped into 2x2 quads. There are cases at edges where pixels are computed and then discarded, but maybe only with skewed edges.
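As a back-of-the-envelope model of the above (again Python, not shader code), you could count invocations by assuming that every 2x2 quad touched by at least one covered pixel runs all four of its pixels. `count_invocations` is a hypothetical helper, and the assumption that quads are aligned to even pixel coordinates is mine; actual hardware behavior may differ.

```python
def count_invocations(covered):
    """Estimate shader runs for a set of covered (x, y) pixel coordinates.

    Assumption: quads are aligned to even coordinates and every touched
    quad runs all 4 pixels; uncovered ones are computed and then discarded.
    Returns (total_invocations, discarded_pixels).
    """
    quads = {(x // 2, y // 2) for x, y in covered}
    invocations = 4 * len(quads)
    discarded = invocations - len(covered)
    return invocations, discarded

if __name__ == "__main__":
    # A 4x4 block aligned to the quad grid: the best case.
    aligned = {(x, y) for x in range(4) for y in range(4)}
    print(count_invocations(aligned))

    # The same 4x4 block shifted by one pixel straddles more quads.
    shifted = {(x, y) for x in range(1, 5) for y in range(1, 5)}
    print(count_invocations(shifted))
```

Under this model the aligned block needs exactly 16 runs with nothing discarded, while coverage that cuts through quads (as at a skewed edge) pays for extra pixels that are thrown away.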