Question

I am going to ask my users random questions drawn from a pool of hundreds, and each answer can only be "true" or "false". I will record those answers in my database, then compare them to find relations between users, such as "User A" and "User B" have 20 answers in common, and notify those users.

New users can join the service, and existing users can answer more questions over time, so I need to run the comparison frequently.

What is the best business logic or approach here? How can I do this in a scalable way? What keywords could I search for to find solutions or tutorials on this problem?

This is what my database looks like for this process:

I have three tables in my database:

  • The first is called "users" and holds information about the registered users, such as username, password, etc.

  • The second is called "questions" and holds the questions to be asked to users.
    (Two columns: question id and question text)

  • The last is called "answers" and holds the answers given by the users.
    (Two columns: question id and answer (TRUE/FALSE))


Solution

The following query:

SELECT a.user_id AS a_user_id,
       b.user_id AS b_user_id,
       a.question_id,
       a.answer
FROM
    answers AS a INNER JOIN answers AS b
        ON a.question_id = b.question_id AND a.answer = b.answer
WHERE
    a.user_id <> b.user_id;

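will give you a long, long table with one row for every identical answer shared by any two different users. Note that it assumes the "answers" table also has a user_id column, and that the <> condition lists each pair twice (A with B and B with A); using < instead lists each pair once. You could then process this table to find the answers that your users have in common, or you could use COUNT() and GROUP BY to get only the counts.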
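The COUNT() and GROUP BY variant could look roughly like the sketch below. It is only an illustration: it runs against an in-memory SQLite database from Python, the answers table is assumed to have a user_id column (as the query above implies), the sample rows are made up, and the helper name common_answer_counts and its min_common threshold are hypothetical. It uses < rather than <> so that each pair of users is reported once.

```python
import sqlite3

# Assumed schema: "answers" carries a user_id column in addition to
# question_id and answer, as the self-join query requires.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE answers (
        user_id     INTEGER NOT NULL,
        question_id INTEGER NOT NULL,
        answer      INTEGER NOT NULL,  -- 1 = TRUE, 0 = FALSE
        PRIMARY KEY (user_id, question_id)
    )
    """
)

# Made-up sample data: three users answering the same three questions.
sample_rows = [
    (1, 1, 1), (1, 2, 0), (1, 3, 1),
    (2, 1, 1), (2, 2, 0), (2, 3, 0),
    (3, 1, 0), (3, 2, 0), (3, 3, 1),
]
conn.executemany("INSERT INTO answers VALUES (?, ?, ?)", sample_rows)


def common_answer_counts(connection, min_common):
    """Return (a_user_id, b_user_id, common_answers) for every pair of users
    sharing at least min_common identical answers (hypothetical helper)."""
    query = """
        SELECT a.user_id AS a_user_id,
               b.user_id AS b_user_id,
               COUNT(*)  AS common_answers
        FROM answers AS a
        INNER JOIN answers AS b
            ON a.question_id = b.question_id AND a.answer = b.answer
        WHERE a.user_id < b.user_id          -- each pair only once
        GROUP BY a.user_id, b.user_id
        HAVING COUNT(*) >= ?
        ORDER BY common_answers DESC, a_user_id, b_user_id
    """
    return connection.execute(query, (min_common,)).fetchall()


for pair in common_answer_counts(conn, 2):
    print(pair)
```

With the sample rows above, users 1 and 2 share two answers, users 1 and 3 share two, and users 2 and 3 share one, so a threshold of 2 returns the first two pairs. In the real service the threshold would be something like the 20 common answers mentioned in the question.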

Try this first; make some money while your users number in the thousands and not in the millions. If this does not perform well enough by the time your users number in the millions, you will already be a millionaire, so you will be able to afford a team of Ph.D.s to work on the problem and provide a more optimal, more scalable solution.

Licensed under: CC-BY-SA with attribution