I have a table of Twitter data in MySQL in which the columns is_retweet and is_reply hold binary values (1 = yes, 0 = no). If a user retweeted multiple times in a day, there are multiple rows with a 1 in the is_retweet column for that user on that day.
account_id, datetime, user_screenname, is_retweet, is_reply, followers_count
'9', '2008-06-11 20:06:35','Access2', '1', '0', '811'
'9', '2008-06-11 23:06:35','Access2', '1', '1', '812'
'9', '2008-06-12 20:01:21','Access2', '0', '1', '813'
'7', '2008-06-11 17:01:00','actingparty', '1', '1', '2000'
I rearrange my SQL output into the table below, which tells me, for each user on each day, the total number of retweets, the total number of replies, and the highest follower count.
account_id, date, user_screenname, sum_retweet, sum_reply, followers_count
'9', '2008-06-11', 'Access2', '2', '1', '812'
'9', '2008-06-12', 'Access2', '0', '1', '813'
Here is my SQL code:
CREATE VIEW `tweet_sum` AS
select
`tweets`.`account_id` AS `account_id`,
`tweets`.`user_screenname` AS `user_screenname`,
CAST(`tweets`.`datetime` as date) AS `period`,
MAX(`tweets`.`followers_count`) AS `followers_count`,
SUM(`tweets`.`is_reply`) AS `sum_reply`,
SUM(`tweets`.`is_retweet`) AS `sum_retweet`
from
`tweets`
group by `tweets`.`account_id`, `tweets`.`user_screenname`, cast(`tweets`.`datetime` as date)
Ultimately, I want one more column, Reach, equal to followers_count multiplied by the number of summed columns (sum_retweet, sum_reply) that are greater than zero.
For example, in the output table below, the sum_retweet and sum_reply columns are both greater than zero for 2008-06-11, so I need followers_count * 2 = 812 * 2 = 1624 for the Reach column.
How can I structure my SQL code to do that?
account_id, date, user_screenname, sum_retweet, sum_reply, followers_count, **Reach**
'9', '2008-06-11', 'Access2', '2', '1', '812', '1624'
'9', '2008-06-12', 'Access2', '0', '1', '813', '813'
I thought of doing it this way:
1. Create a new view.
2. Count the number of columns that have values > 0.
3. Multiply that number by the followers count for that day.
Here is my attempt at the code:
CREATE VIEW tweet_reach AS
SELECT
COUNT(t.sum_reply,t.sum_retweet,t.sun_mention,t.sum_direct,t.sum_mytweet)*t.followers_count AS Reach
FROM information_schema.columns
WHERE table_name='tweet_sum' t AND
t.sum_reply>0 OR
t.sum_retweet>0 OR
t.sun_mention>0 OR
t.sum_direct>0 OR
t.sum_mytweet>0;
This code is wrong, but I am hoping to do something like this. Is it possible?
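For what it's worth, a simpler shape I have been wondering about is sketched below. It assumes that MySQL evaluates a comparison such as sum_reply > 0 to 1 or 0, so the comparisons can be added together and multiplied by followers_count. It uses only the two summed columns that exist in tweet_sum as shown above; the other columns in my attempt (mention, direct, mytweet) are an assumption and would have to be added to that view first. I have not verified this against my data:

```sql
-- Sketch only: assumes the tweet_sum view above exists with the columns
-- account_id, user_screenname, period, followers_count, sum_reply, sum_retweet.
-- In MySQL a comparison such as (x > 0) yields 1 or 0, so adding the
-- comparisons counts how many of the summed columns are greater than zero.
CREATE VIEW tweet_reach AS
SELECT
    t.account_id,
    t.period,
    t.user_screenname,
    t.sum_retweet,
    t.sum_reply,
    t.followers_count,
    ((t.sum_retweet > 0) + (t.sum_reply > 0)) * t.followers_count AS Reach
FROM tweet_sum AS t;
```

With the sample rows, the 2008-06-11 row would get (1 + 1) * 812 and the 2008-06-12 row (0 + 1) * 813. Further columns could be counted by adding one more (column > 0) term each.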
Thanks,
J