Question

I want to build an analytics engine on top of an article publishing platform. More specifically, I want to track users' reading behaviour (e.g. number of views of an article, time spent with the article open, rating, etc.), as well as statistics on the articles themselves (e.g. number of paragraphs, author, etc.).

This will have two purposes:

  1. Present insights about users and articles
  2. Provide recommendations to users

For the data analysis part I've been looking at Cubes, pandas and PyTables. There is a lot of data, and it is stored in MySQL tables; I'm not sure which of these packages would best handle such a backend.

For the recommendation part, I'm simply thinking about feeding data from the data analysis engine to a clustering model.

Any recommendations about how to put all this together, as well as cool Python projects out there that can help me out? Please let me know if I should give more information.

Thank you

Solution

Scikit-learn should make you happy for the data processing (clustering) part.
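
As a rough illustration of the clustering step, here is a minimal sketch that aggregates a reading log into per-user features with pandas and groups users with scikit-learn's KMeans. The column names, the sample data, and the helper functions `user_features` and `cluster_users` are all hypothetical; in practice the log would probably be loaded from the MySQL tables (for example with `pandas.read_sql`), and the number of clusters would need tuning.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical reading log; real data would come from the MySQL tables.
events = pd.DataFrame({
    "user_id":      [1, 1, 2, 2, 3, 3, 4, 4],
    "article_id":   [10, 11, 10, 12, 11, 12, 10, 11],
    "seconds_open": [300, 280, 20, 15, 310, 290, 25, 10],
    "rating":       [5, 4, 2, 1, 5, 5, 1, 2],
})


def user_features(events):
    """Aggregate raw reading events into one feature row per user."""
    return events.groupby("user_id").agg(
        views=("article_id", "count"),
        mean_seconds=("seconds_open", "mean"),
        mean_rating=("rating", "mean"),
    )


def cluster_users(features, n_clusters=2, seed=0):
    """Standardize the features and assign each user to a KMeans cluster."""
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    labels = model.fit_predict(scaled)
    return pd.Series(labels, index=features.index, name="cluster")


features = user_features(events)
clusters = cluster_users(features)
print(features.join(clusters))
```

Users that land in the same cluster could then be used as a simple basis for recommendations, e.g. suggesting articles that were rated highly by other users in the same group.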

Other tips

For the analysis and visualization side, you have Cubes, as you mentioned. For visualization I use CubesViewer, which I wrote.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow