How to use a multiple linear regression model built from normalized data
Question
I built a multiple linear regression model from data normalized to the interval [0, 1]. The data was not normalized initially; I normalized it myself (both the independent and the dependent variables). I want to use this model to make predictions from newly received data: I get the values of the independent variables and want to predict the value of the dependent variable. The problem is that the new data arrives in raw, unnormalized form.
- How can I normalize newly arriving data if only one "observation" is received?
- What if I want to get the real values of the dependent variable using my model, and not the normalized ones?
Solution
So, the question asks:
- How to normalise incoming individual observations.
- How to get predictions on the real scale rather than the normalised one.
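When normalising with scikit-learn, the key is to keep the fitted scaler. Call `.fit()` (or the very handy `fit_transform()`) on your original observations only, and then apply `.transform()` with that same fitted scaler to each newly observed value. A new observation is never fitted on its own: it is scaled with the minimum and maximum learned from the original data, which is why a single observation is enough. A scikit-learn scaler such as `MinMaxScaler` already scales each feature (column) independently, so one scaler covers all the independent variables. A separate scaler for the dependent variable is convenient, though, because it has its own range and has to be inverted on its own.

To get real-valued predictions, scikit-learn scalers also provide an `inverse_transform()` method, which reverts normalised values back to the original scale.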
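
The fit, transform and inverse-transform steps can be sketched as follows. This is only a minimal illustration, not the asker's actual setup: the toy data, the choice of `MinMaxScaler` for the [0, 1] range, and the names `x_scaler`, `y_scaler` and `x_new` are all assumptions made for the example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler

# Hypothetical raw training data: 5 observations, 2 independent variables.
X_train = np.array([[10.0, 200.0],
                    [20.0, 400.0],
                    [30.0, 100.0],
                    [40.0, 300.0],
                    [50.0, 500.0]])
# Hypothetical raw dependent variable, generated as y = 2*x1 + 0.1*x2 + 5.
y_train = 2.0 * X_train[:, 0] + 0.1 * X_train[:, 1] + 5.0

# One scaler for the features (each column is scaled independently)
# and a separate one for the target, so it can be inverted on its own.
x_scaler = MinMaxScaler()
y_scaler = MinMaxScaler()

# Fit the scalers on the training data only and keep them around.
X_scaled = x_scaler.fit_transform(X_train)
y_scaled = y_scaler.fit_transform(y_train.reshape(-1, 1))

model = LinearRegression().fit(X_scaled, y_scaled)

# A single new raw observation; the scaler expects shape (1, n_features).
x_new = np.array([[25.0, 250.0]])
x_new_scaled = x_scaler.transform(x_new)  # transform only, no refitting

# Predict on the normalised scale, then map back to the original scale.
y_pred_scaled = model.predict(x_new_scaled)
y_pred = y_scaler.inverse_transform(y_pred_scaled)

print(y_pred[0, 0])  # close to 80, because the toy data is exactly linear
```

Note that a new observation lying outside the range seen during fitting will be scaled to a value outside [0, 1]; the model still produces a prediction, but it is then extrapolating.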