How to set a weighted least-squares in r for heteroscedastic data?

Question 1

To answer your question, Lucas, I think you want weights=(1/population). R parameterizes the weights as inversely proportional to the variances, so specifying the weights this way amounts to assuming that the variance of the error term is proportional to the population of the city, which is a common assumption in this setting.

But check the assumption! If the variance of the error term is indeed proportional to the population size, then if you divide each residual by the square root of its corresponding sample size, the residuals should have constant variance. Remember, dividing a random variable by a constant results in the variance being divided by the square of that constant.

Here's how you can check this: Obtain residuals from the regression by

residuals = lm(..., weights = 1/population)$residuals

Then divide the residuals by the square roots of the population variances:

standardized_residuals = residuals/sqrt(population)

Then compare the sample variance among the residuals corresponding to the bottom half of population sizes:

variance1 = var(standardized_residuals[population < median(population)])

to the sample variance among the residuals corresponding to the upper half of population sizes:

variance2 = var(standardized_residuals[population > median(population)])

If these two numbers, variance1 and variance2 are similar, then you're doing something right. If they are drastically different, then maybe your assumption is violated.

Question 2

From ?lm: "weights: an optional vector of weights to be used in the fitting process. Should be NULL or a numeric vector. If non-NULL, weighted least squares is used with weights weights (that is, minimizing sum(w*e^2)); otherwise ordinary least squares is used." R doesn't do any further interpretation of the weights argument.

So, if what you want to minimize is the sum of (the squared distance from each point to the fit line * 1/sqrt(population) then you want ...weights=(1/sqrt(population)). If you want to minimize the sum of (the squared distance from each point to the fit line * 1/population) then you want ...weights=1/population.

As to which of those is most appropriate... that's a question for CrossValidated!