Question

Imagine you're writing a web analytics system: you record raw page hits along with some extras, such as tagging cookies, and then produce stats such as:

  • Which pages got most traffic over a time period
  • Which referers sent most traffic
  • Goals completed (goal being a view of a particular page)
  • And more advanced things, like which referers sent the most visitors who later hit a goal.

The naive way of approaching this would be to throw it all in a relational database and run queries over it - but that won't scale.
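
To make the naive approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. The `hits` table, its columns, and the two helper functions are hypothetical names chosen for illustration, and "the referer that sent a visitor" is assumed to be the referer on that visitor's earliest hit. Every report scans the raw table, which is why this is unlikely to hold up at real traffic volumes.

```python
import sqlite3

# Hypothetical schema: one row per raw page hit.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hits (visitor TEXT, page TEXT, referer TEXT, ts INTEGER)"
)
conn.executemany(
    "INSERT INTO hits VALUES (?, ?, ?, ?)",
    [
        ("v1", "/home", "google.com", 1),
        ("v1", "/signup", "", 2),
        ("v2", "/home", "bing.com", 3),
        ("v3", "/home", "google.com", 4),
        ("v3", "/signup", "", 5),
    ],
)


def top_pages(conn, start, end):
    """Pages ordered by hit count within [start, end]."""
    return conn.execute(
        "SELECT page, COUNT(*) AS n FROM hits "
        "WHERE ts BETWEEN ? AND ? "
        "GROUP BY page ORDER BY n DESC, page",
        (start, end),
    ).fetchall()


def referers_leading_to_goal(conn, goal_page):
    """Referers ranked by visitors who later viewed goal_page.

    Assumption: a visitor's referer is the one on their earliest hit.
    """
    return conn.execute(
        "SELECT entry.referer, COUNT(DISTINCT entry.visitor) AS n "
        "FROM hits AS entry "
        "JOIN hits AS goal "
        "  ON goal.visitor = entry.visitor "
        " AND goal.page = ? "
        " AND goal.ts > entry.ts "
        "WHERE entry.ts = (SELECT MIN(ts) FROM hits h "
        "                  WHERE h.visitor = entry.visitor) "
        "GROUP BY entry.referer ORDER BY n DESC, entry.referer",
        (goal_page,),
    ).fetchall()


print(top_pages(conn, 1, 5))
print(referers_leading_to_goal(conn, "/signup"))
```

The second query is the expensive one: it self-joins the raw hits table and runs a correlated subquery per row.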

You could pre-calculate everything (have a queue of incoming 'hits' and use it to update report tables) - but what if you later change a goal? How could you efficiently re-calculate just the data that would be affected?
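
The pre-calculation idea might look roughly like the sketch below. It is only one possible shape, not a known-good design: the `Reports` class, its method names, and the in-memory counters standing in for report tables are all hypothetical. The assumption is that the raw hits are kept in an append-only log, so that changing a goal only rebuilds that goal's counters and leaves every other report untouched.

```python
from collections import Counter, defaultdict


class Reports:
    """Hypothetical in-memory stand-in for pre-aggregated report tables."""

    def __init__(self, goals):
        self.raw_hits = []            # append-only raw log, kept for replays
        self.goals = dict(goals)      # goal_id -> page path
        self.page_views = Counter()   # (day, page) -> number of hits
        self.goal_hits = defaultdict(Counter)  # goal_id -> {day: completions}

    def record(self, day, page):
        """Consume one hit from the queue and update every report."""
        self.raw_hits.append((day, page))
        self.page_views[(day, page)] += 1
        for goal_id, goal_page in self.goals.items():
            if page == goal_page:
                self.goal_hits[goal_id][day] += 1

    def change_goal(self, goal_id, new_page):
        """Redefine one goal and rebuild only that goal's counters."""
        self.goals[goal_id] = new_page
        rebuilt = Counter()
        for day, page in self.raw_hits:   # replay the raw log for this goal
            if page == new_page:
                rebuilt[day] += 1
        self.goal_hits[goal_id] = rebuilt


reports = Reports({"signup": "/signup"})
for day, page in [("d1", "/home"), ("d1", "/signup"),
                  ("d2", "/thanks"), ("d2", "/thanks")]:
    reports.record(day, page)

reports.change_goal("signup", "/thanks")
print(dict(reports.goal_hits["signup"]))
```

The replay still walks the whole raw log, so this only narrows what gets rewritten, not what gets read. For goals as simple as "a view of a particular page", the new counters could presumably be derived from the existing per-page aggregates instead of the raw hits.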

Obviously this has been done before ;) so does anyone have tips on where to start: methods and examples, architecture, technologies, etc.?

No correct solution

Licensed under: CC-BY-SA with attribution