Question

I have a table correlation with three columns.

correlation

user1   user2   corr 

This table contains the correlation values for all pairs of users.

I need to update the corr values for all pairs from another table.

The query I am using is:

UPDATE correlation
SET corr = (SELECT ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                   (SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) * SQRT(COUNT(*) 
                     * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating)))), 2) 
            FROM            users AS x INNER JOIN
                     users AS y ON x.itemID = y.itemID
            WHERE        (x.userID = @user1) AND (y.userID = @user2))
WHERE user1 = @user1 and user2 = @user2
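
For reference, the expression inside ROUND(...) is the computational form of the Pearson correlation coefficient, taken over the n items that both users have rated. The symbols below are introduced here for illustration only: x_i and y_i are the two users' ratings of shared item i, and n corresponds to COUNT(*).

```latex
r_{xy} = \frac{n \sum_i x_i y_i - \sum_i x_i \sum_i y_i}
              {\sqrt{n \sum_i x_i^2 - \left(\sum_i x_i\right)^2}\;
               \sqrt{n \sum_i y_i^2 - \left(\sum_i y_i\right)^2}}
```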

How can I execute this query with a procedure?

I am retrieving all pairs of users from the correlation table first.

SELECT user1, user2 from correlation 

How can I use the results of this query and, for each row returned, execute the update query?

There are ~2 million rows in the correlation table.

I tried doing this within C# code with SqlDataReader (without any stored procedure), but it was taking too long. The SqlDataReader would read all rows, and for each row read it would execute the update query.


Solution

Do you want to update all records in the correlation table? Like this:

UPDATE correlation
   SET corr = (SELECT ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                      (SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) * SQRT(COUNT(*) 
                      * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating)))), 2) 
                 FROM users AS x
           INNER JOIN users AS y
                   ON x.itemID = y.itemID
                WHERE (x.userID = correlation.user1) AND (y.userID = correlation.user2))

Or

    UPDATE c
       SET corr = z.corr
      FROM correlation c
INNER JOIN (SELECT ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                          (SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) * SQRT(COUNT(*) 
                          * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating)))), 2) AS corr,
                   x.userID AS user1, y.userID AS user2
              FROM users AS x
        INNER JOIN users AS y
                ON x.itemID = y.itemID
          GROUP BY x.userID, y.userID) AS z
        ON z.user1 = c.user1 AND z.user2 = c.user2
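
One caveat with either form: the denominator is 0 whenever a user's ratings over the shared items have no variance (for example, when two users share only one item), and SQL Server then raises a divide-by-zero error. The following is a minimal sketch of the join form with the denominator wrapped in NULLIF, so that such pairs get a NULL corr instead of failing the whole statement. It assumes only the tables and columns already used in the question.

```sql
-- Sketch only: same tables and columns as in the question.
-- NULLIF(..., 0) turns a zero denominator into NULL, so corr becomes NULL
-- for that pair instead of the statement failing with a divide-by-zero error.
UPDATE c
   SET corr = z.corr
  FROM correlation AS c
 INNER JOIN (SELECT x.userID AS user1,
                    y.userID AS user2,
                    ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                          NULLIF(SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) *
                                 SQRT(COUNT(*) * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating))), 0), 2) AS corr
               FROM users AS x
              INNER JOIN users AS y
                 ON x.itemID = y.itemID
              GROUP BY x.userID, y.userID) AS z
    ON z.user1 = c.user1 AND z.user2 = c.user2;
```

Note that the two forms differ for pairs with no shared items: the correlated subquery sets corr to NULL for them, while the join form leaves their existing corr untouched.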

OTHER TIPS

If you are retrieving data from a database table only to send it back to the database, I would suggest creating a stored procedure (SP) that performs the same operation on the server.

    CREATE PROCEDURE [Procedure_Name]
    AS
    BEGIN
        DECLARE @user1 int
        DECLARE @user2 int

        -- Cursor over every (user1, user2) pair in correlation
        DECLARE cur CURSOR LOCAL FOR
            SELECT user1, user2 FROM correlation

        OPEN cur
        FETCH NEXT FROM cur INTO @user1, @user2

        WHILE @@FETCH_STATUS = 0
        BEGIN
            -- Recompute corr for the current pair
            UPDATE correlation
               SET corr = (SELECT ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                                  (SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) * SQRT(COUNT(*)
                                  * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating)))), 2)
                             FROM users AS x
                            INNER JOIN users AS y ON x.itemID = y.itemID
                            WHERE (x.userID = @user1) AND (y.userID = @user2))
             WHERE user1 = @user1 AND user2 = @user2

            -- If you already have an SP that performs the update above,
            -- call it here for each row instead, for example:
            -- EXEC uspYourSP @user1, @user2

            FETCH NEXT FROM cur INTO @user1, @user2
        END

        CLOSE cur
        DEALLOCATE cur
    END

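Once the SP is in place, you only need to call it from C#, without any parameters. This should improve performance because the loop runs on the server, so there is no network round trip per row. Note that the cursor still executes one UPDATE per pair, so with ~2 million rows a single set-based UPDATE is likely to be faster still.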
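
As an alternative to the cursor, the UPDATE ... FROM statement from the solution can itself be wrapped in a procedure, which answers the "execute this query with a procedure" part of the question with a single statement. This is a sketch, not a tuned implementation: the procedure name dbo.usp_UpdateCorrelations and the dbo schema are hypothetical, and GO is the batch separator used by SSMS and sqlcmd rather than a T-SQL statement.

```sql
-- Hypothetical procedure name; tables and columns are those from the question.
CREATE PROCEDURE dbo.usp_UpdateCorrelations
AS
BEGIN
    SET NOCOUNT ON;

    -- Same statement as the join form in the solution:
    -- one set-based UPDATE instead of one UPDATE per (user1, user2) pair.
    UPDATE c
       SET corr = z.corr
      FROM correlation AS c
     INNER JOIN (SELECT ROUND((COUNT(*) * SUM(x.rating * y.rating) - SUM(x.rating) * SUM(y.rating)) /
                              (SQRT(COUNT(*) * SUM(SQUARE(x.rating)) - SQUARE(SUM(x.rating))) *
                               SQRT(COUNT(*) * SUM(SQUARE(y.rating)) - SQUARE(SUM(y.rating)))), 2) AS corr,
                        x.userID AS user1,
                        y.userID AS user2
                   FROM users AS x
                  INNER JOIN users AS y
                     ON x.itemID = y.itemID
                  GROUP BY x.userID, y.userID) AS z
        ON z.user1 = c.user1 AND z.user2 = c.user2;
END
GO

-- Run it from SSMS/sqlcmd, or from C# as a parameterless stored procedure call.
EXEC dbo.usp_UpdateCorrelations;
```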

You can do three things:

  1. Use Bulk Insert/Update, or other techniques from here.

  2. Use a CLR stored procedure: you write it in C#, deploy it to the database as a stored procedure, and run it there. See more on MSDN.

  3. Use plain old SQL. Since I'm not an SQL master, I'll leave this option for someone else to answer.

Licensed under: CC-BY-SA with attribution