Recommended AI/machine learning: profiles input, income prediction

https://stackoverflow.com/questions/12751546

05-07-2021

Question

My project looks like this: my data set is a bunch of profiles of people, with various attributes, e.g. boolean hasJob and int healthScore, and their income. Using this data, I'm trying to predict their income for the future. Each profile also has a history: e.g., what their attributes and income were in the past.

So in essence I'm trying to map multiple sets of (x booleans, y numbers) to a number (salary in the coming year).

I've considered neural networks, Bayes nets, and genetic algorithms for function-fitting. Any suggestions or input?

Thanks in advance! --Emily

Answer

What you want to do is called "time series modeling". However, you probably have very little data per series (that is, per person). I think it will be difficult to find one model that fits every person, because a single model makes general assumptions, e.g. that everyone is equally career-oriented.

Income is also a very noisy target. For example, you might have to take into account whether someone is a smooth talker or not, and how would you measure such a thing? I'm pretty sure your current attributes are noisy enough to make it difficult to predict anything. When you mention a health score, do you mean physical health only, or mental health as well? Different things matter in different businesses. What about the business or industry they work in, and its health and growth potential? I would assume this strongly influences their income.

I also think that your variables depend on one another, and that the attributes could be (and likely are) influenced by your target variable. E.g. people with higher income tend to have better health.

It sounds like a very complex and difficult problem, and definitely not one where "I naively grouped my data and tried a bunch of methods" is going to give meaningful results. I would suggest learning more about time series modeling, and especially about the data that you have. Maybe start by clustering persons by their initial attributes and see how they develop. Are there any variables that correlate with this development?
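As a rough illustration of that last suggestion, here is a minimal sketch in plain Python. Everything in it is an assumption made for the example: the toy `profiles` data is invented, `healthScore` is assumed to range from 0 to 100, the helper names are hypothetical, and a small hand-rolled k-means stands in for whatever clustering method you would actually pick. It groups people by their first-year attributes, then compares income growth per group and checks how initial health correlates with that growth.

```python
from statistics import mean

# Hypothetical toy data: one history per person, oldest year first.
# Each entry is (hasJob, healthScore, income); all values are invented.
profiles = {
    "a": [(1, 80, 40000), (1, 82, 42000), (1, 85, 45000)],
    "b": [(1, 75, 38000), (1, 76, 40000), (1, 78, 43000)],
    "c": [(0, 40, 12000), (0, 42, 12500), (1, 45, 15000)],
    "d": [(0, 35, 10000), (0, 30, 10000), (0, 33, 10500)],
}

def initial_features(history):
    """Attributes in the first recorded year, scaled to roughly [0, 1]."""
    has_job, health, _income = history[0]
    return (float(has_job), health / 100.0)  # assumes healthScore is 0..100

def income_growth(history):
    """Relative income change from the first to the last recorded year."""
    return history[-1][2] / history[0][2] - 1.0

def sq_dist(p, q):
    """Squared Euclidean distance between two feature tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans(points, k, iters=20):
    """Plain k-means, seeded with the first k points to stay deterministic."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c]))
                  for p in points]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(mean(dim) for dim in zip(*members))
    return labels

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

histories = list(profiles.values())
points = [initial_features(h) for h in histories]
growth = [income_growth(h) for h in histories]

labels = kmeans(points, k=2)
growth_by_cluster = {
    c: mean(g for g, lab in zip(growth, labels) if lab == c)
    for c in set(labels)
}
health_vs_growth = pearson([p[1] for p in points], growth)

print("cluster labels:", labels)
print("mean income growth per cluster:", growth_by_cluster)
print("correlation(initial health, growth):", health_vs_growth)
```

With only four made-up people the numbers mean nothing; the point is the workflow. On real data you would want many more profiles, a deliberate choice of feature scaling and of `k`, and some caution before reading a correlation as a cause.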

What is your research question?

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow