Based on @hpaulj's comment, I created an IPython Notebook, which can be seen on nbviewer. It shows that, of all the methods mentioned, the following is the fastest (assume that mat is a square sparse CSR matrix of shape (one_dim, one_dim)):
mat - scipy.sparse.dia_matrix((mat.diagonal()[scipy.newaxis, :], [0]), shape=(one_dim, one_dim))
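
For anyone who wants to try this, here is a minimal self-contained sketch of the expression above. The function name `remove_diagonal` is my own, `one_dim` is taken from `mat.shape[0]` (assuming a square matrix), and I use `np.newaxis` in place of `scipy.newaxis` on the assumption that your SciPy version may not re-export NumPy names. It is only an illustration of the expression, not the benchmark code from the notebook.

```python
import numpy as np
import scipy.sparse as sp


def remove_diagonal(mat):
    """Return a CSR copy of `mat` with its main diagonal set to zero.

    Hypothetical helper: assumes `mat` is a square scipy.sparse matrix.
    """
    one_dim = mat.shape[0]
    # Build a sparse matrix holding only the main diagonal of `mat`.
    diag = sp.dia_matrix(
        (mat.diagonal()[np.newaxis, :], [0]),
        shape=(one_dim, one_dim),
    )
    # Subtracting it cancels the diagonal entries.
    result = (mat - diag).tocsr()
    # Drop any explicitly stored zeros left on the diagonal.
    result.eliminate_zeros()
    return result


if __name__ == "__main__":
    mat = sp.csr_matrix(np.array([[1, 2, 0],
                                  [0, 3, 4],
                                  [5, 0, 6]]))
    print(remove_diagonal(mat).toarray())
```

The call to `eliminate_zeros()` is a precaution so that the cancelled diagonal entries do not remain as stored zeros; the off-diagonal entries are left untouched.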