To answer your question, Lucas, I think you want weights=(1/population). R parameterizes the weights as inversely proportional to the variances, so specifying the weights this way amounts to assuming that the variance of the error term is proportional to the population of the city, which is a common assumption in this setting.
But check the assumption! If the variance of the error term is indeed proportional to the population size, then if you divide each residual by the square root of its corresponding sample size, the residuals should have constant variance. Remember, dividing a random variable by a constant results in the variance being divided by the square of that constant.
Here's how you can check this: Obtain residuals from the regression by
residuals = lm(..., weights = 1/population)$residuals
Then divide the residuals by the square roots of the population variances:
standardized_residuals = residuals/sqrt(population)
Then compare the sample variance among the residuals corresponding to the bottom half of population sizes:
variance1 = var(standardized_residuals[population < median(population)])
to the sample variance among the residuals corresponding to the upper half of population sizes:
variance2 = var(standardized_residuals[population > median(population)])
If these two numbers, variance1
and variance2
are similar, then you're doing something right. If they are drastically different, then maybe your assumption is violated.