Question

I would like to predict for a given user (on a website) if he/she logs out from the website within ten minutes.

In terms of data, I have a user ID and timestamp of the latest post on the website.

example of an id: 54a47e7a9cd118513

It would be great to get advice on how to use this data to achieve the goal. Should I use the user ID as a feature during training? If so, how would I use this?

Many thanks in advance


Solution

As I understand it, you are asking how to model the task of predicting whether a user will log out within the next ten minutes. It would be worth confirming this reading and clarifying what data you have available.

If so, then as you suggest, this can be framed as a binary classification problem (logs out within ten minutes / does not log out).
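To make the binary framing concrete, here is a minimal sketch of how a training label could be derived. It assumes you also record a logout timestamp per session, which the question does not mention; `logout_label` and its arguments are hypothetical names used only for illustration.

```python
from datetime import datetime, timedelta

def logout_label(post_time, logout_time, window=timedelta(minutes=10)):
    """Return 1 if the user logged out within `window` after `post_time`, else 0.

    Assumption: `logout_time` is None when no logout was observed,
    and that case is treated as class 0.
    """
    if logout_time is None:
        return 0
    delta = logout_time - post_time
    return int(timedelta(0) <= delta <= window)

post = datetime(2020, 1, 1, 12, 0, 0)
label_soon = logout_label(post, datetime(2020, 1, 1, 12, 7, 0))    # logout 7 minutes later
label_late = logout_label(post, datetime(2020, 1, 1, 12, 30, 0))   # logout 30 minutes later
label_none = logout_label(post, None)                              # no logout observed
print(label_soon, label_late, label_none)
```

Without some record of when users actually log out, there is nothing to train the classifier against, so this is the first thing to check in your data.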

How to model the input is the next question, and the answer depends on the nature of your data.

If you have the full history of post timestamps for each user, you can frame this as a sequence-to-one problem: feed the timestamps into a sequential encoder (e.g. an RNN such as an LSTM), which compresses the sequence into a "hidden representation". That hidden representation is then decoded by a neural network whose final layer is a 2-node softmax, producing a probability distribution over the two classes (logs out / does not log out).
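The sequence-to-one idea can be sketched with a plain (Elman) RNN forward pass in NumPy. This is only an illustration of the data flow, not a trained model: the weights are random, the post times are made up, and using log-scaled gaps between posts as the input is my assumption. In practice you would likely use an LSTM from a deep learning library and train it on labelled sessions.

```python
import numpy as np

def softmax(z):
    z = z - z.max()              # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def rnn_classify(gaps, params):
    """Encode a 1-D sequence with a plain RNN, then decode to 2 class probabilities."""
    W_x, W_h, b_h, W_out, b_out = params
    h = np.zeros(W_h.shape[0])   # initial hidden state
    for g in gaps:
        x = np.array([g])        # one scalar input per time step
        h = np.tanh(W_x @ x + W_h @ h + b_h)
    # h is now the "hidden representation" of the whole sequence
    return softmax(W_out @ h + b_out)

rng = np.random.default_rng(0)
hidden = 8                       # hidden size chosen arbitrarily
params = (
    rng.normal(scale=0.1, size=(hidden, 1)),       # W_x
    rng.normal(scale=0.1, size=(hidden, hidden)),  # W_h
    np.zeros(hidden),                              # b_h
    rng.normal(scale=0.1, size=(2, hidden)),       # W_out
    np.zeros(2),                                   # b_out
)

# Made-up post times for one user, in seconds since their first post.
post_times = np.array([0.0, 40.0, 95.0, 400.0, 430.0])
gaps = np.log1p(np.diff(post_times))   # assumed feature: log-scaled time between posts
probs = rnn_classify(gaps, params)     # [P(logs out), P(does not log out)]
print(probs)
```

Only the final hidden state is passed to the output layer, which is what makes this sequence-to-one rather than sequence-to-sequence.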

As for representing the input: to make the model user-specific, it would be worth concatenating the user ID (as a one-hot encoded vector) with the timestamp information.
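As a rough sketch of that concatenation: the example below one-hot encodes the user ID and appends two time-of-day features. The extra user IDs (`user_b`, `user_c`) are placeholders, and the sine/cosine time-of-day encoding is my own assumption about how to turn a timestamp into numbers, not something prescribed above.

```python
import math
from datetime import datetime

def one_hot(user_id, vocabulary):
    """Return a vector with a single 1.0 at the index assigned to `user_id`."""
    vec = [0.0] * len(vocabulary)
    vec[vocabulary[user_id]] = 1.0
    return vec

def time_features(ts):
    """Encode time of day on a circle so 23:59 and 00:01 end up close together."""
    frac = (ts.hour * 3600 + ts.minute * 60 + ts.second) / 86400
    return [math.sin(2 * math.pi * frac), math.cos(2 * math.pi * frac)]

# Hypothetical set of known users; only the first ID comes from the question.
known_users = ["54a47e7a9cd118513", "user_b", "user_c"]
vocabulary = {uid: i for i, uid in enumerate(known_users)}

last_post = datetime(2020, 1, 1, 6, 0, 0)
features = one_hot("54a47e7a9cd118513", vocabulary) + time_features(last_post)
print(len(features), features)
```

Note that the one-hot part grows with the number of users, so the input vector has one slot per known user plus the time features.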

If you only have the most recent post timestamp for each user, you could use a simple feedforward neural network instead: feed the user ID together with the timestamp through the model (with a final softmax layer, as explained above) to predict whether the user will log out within ten minutes.
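A minimal sketch of that feedforward variant, again as an untrained NumPy forward pass. The input layout (a 3-user one-hot vector followed by 2 time features), the hidden size of 16, and the ReLU activation are all assumptions made for the example.

```python
import numpy as np

def feedforward_probs(x, W1, b1, W2, b2):
    """One hidden ReLU layer followed by a 2-node softmax output."""
    hidden = np.maximum(0.0, W1 @ x + b1)
    logits = W2 @ hidden + b2
    exp = np.exp(logits - logits.max())   # stabilised softmax
    return exp / exp.sum()

# Assumed input: one-hot user ID (3 users) followed by 2 timestamp features.
x = np.array([1.0, 0.0, 0.0, 1.0, 0.0])

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(16, x.size))   # random, untrained weights
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(2, 16))
b2 = np.zeros(2)

ff_probs = feedforward_probs(x, W1, b1, W2, b2)   # [P(logs out), P(does not log out)]
print(ff_probs)
```

After training, you would predict "logs out" whenever the first probability exceeds a threshold of your choosing (0.5 being the usual default).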

License: CC-BY-SA with attribution
Not affiliated with datascience.stackexchange