I have mostly only a general answer, I never read any small sample Monte Carlo studies for M-estimators.
To 1.
In many models, like M-estimators, RLM, or generalized linear models, GLM, we have only asymptotic results, except for maybe a few special cases. Asymptotic results provide conditions that the estimator is normally distributed. Given this, statsmodels defaults to using normal distribution for all models outside of the linear regression model, OLS, and similar, and chisquare instead of the F distribution for Wald tests with joint hypothesis.
There is some evidence that in many cases using the t or F distribution with appropriate choice of degrees of freedom provides a better small sample approximation to the distribution of the test statistic. This relies on Monte Carlo results and is not directly justified by the theory, as far as I know.
In the next release, and in the current development version, of statsmodels users can choose to use the t and F distribution for the results, instead of the normal and chisquare distribution. The defaults stay the same as they are now.
There are other cases where it is not clear whether the t-distribution, and which small sample degrees of freedom should be used. In many cases, statsmodels tries to follow the lead of STATA, for example in cluster robust standard errors after OLS. Another consequence is that sometimes equivalent models that are special cases of different models use different default assumptions on the distribution, both in Stata and in statsmodels.
I recently read the SAS documentation for M-estimators, and SAS is using the chisquare distribution, i.e. also the normal assumption, for the significance of the parameter estimates and for the confidence intervals.
To 2.
(see first sentence)
I think the same as for linear models also applies here. If the data is highly non-normal, then the test statistics could have incorrect coverage in small samples. This can also be the case with some robust sandwich covariance estimators. On the other hand, if we don't use heteroscedasticity or correlation robust covariance estimators, then the tests can also be strongly biased.
For robust estimation methods like M-estimators, RLM, the effective sample size also depends on the number of inliers, or the weights assigned to the observations, not just the total number of observations.
For your case, I think the z-values and the sample size are large enough that, for example, using the t-distribution would not make them much less significant. Comparing M-estimators with different norms and scale estimates would provide an additional check on the robustness on the assumption on the outliers and for the choice of robust estimator. Another cross check: Does OLS with dropped outliers (observations with small weights in the RLM estimate) give a similar answer.
Finally as general caution: The references on robust methods often warn that we should not use (outlier-)robust methods blindly. Using robust methods estimates a relationship based on "inliers". But is our discarding or down-weighting of outliers justified? Or, do we have missing non-linearities, missing variables, a mixture distribution or different regimes?