Question

I want to build an analytics engine on top of an article publishing platform. More specifically, I want to track users' reading behaviour (e.g. number of views of an article, time spent with the article open, rating, etc.), as well as statistics on the articles themselves (e.g. number of paragraphs, author, etc.).

This will have two purposes:

  1. Present insights about users and articles
  2. Provide recommendations to users

For the data analysis part I've been looking at Cubes, pandas and PyTables. There is a lot of data, and it is stored in MySQL tables; I'm not sure which of these packages would handle such a backend best.

For the recommendation part, I'm simply thinking about feeding data from the data analysis engine to a clustering model.

Any recommendations on how to put all this together, or on Python projects that could help? Please let me know if I should give more information.

Thank you

Solution

scikit-learn should cover the data processing (clustering) part.
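
As a rough illustration of what that could look like, here is a minimal sketch that clusters users by reading behaviour with `KMeans`. The feature columns, the user ids, the choice of two clusters and the `peers_of` helper are all made up for the example; in practice the features would come from your MySQL tables, and the number of clusters would need tuning.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-user features (not from the question's real data):
# [articles viewed, average seconds per article, average rating given]
user_ids = ["u1", "u2", "u3", "u4", "u5", "u6"]
features = np.array([
    [120, 300.0, 4.5],
    [110, 280.0, 4.2],
    [130, 320.0, 4.8],
    [5, 20.0, 2.0],
    [8, 35.0, 2.5],
    [3, 15.0, 1.5],
])

# Put all columns on a comparable scale so no single feature dominates
# the Euclidean distances that k-means relies on.
scaled = StandardScaler().fit_transform(features)

# n_clusters=2 is an assumption for this toy data set.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = model.fit_predict(scaled)

# Map each user id to its cluster label.
clusters = dict(zip(user_ids, labels.tolist()))


def peers_of(user_id):
    """Return the other users that landed in the same cluster (hypothetical helper)."""
    return [u for u, c in clusters.items() if c == clusters[user_id] and u != user_id]


print(peers_of("u1"))
```

A simple recommender could then suggest to a user the articles that their cluster peers rated highly. That is only one possible design; clustering on its own does not rank articles.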

Other tips

For the analysis side, you have Cubes, as you mentioned. For visualization I use CubesViewer, which I wrote.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow