Question

We're creating a twitter-like database using SQL. I'm having trouble creating a stored procedure for:

Allowing a user to retrieve top-2 most frequently used hash tags in his/her tweets.

UPDATE: This is my stored procedure

create or replace 
procedure TOP_2_FREQUENT_HASHTAGS is
first_hashtag varchar2(255);
second_hashtag varchar2(255);

begin

with TopTwoHashtags 
AS (
SELECT 
  t.userID,
  th.HASHTAGID,
  ROW_NUMBER() OVER  (ORDER BY COUNT(th.TWEETID) DESC) r

FROM 
  Tweet_Hashtag th
  INNER JOIN Tweets t
  ON th.TWEETID = t.TWEETID
WHERE 
  userID = t.userid
GROUP BY
  t.userID,
  th.HASHTAGID
)
SELECT 
  ht.TOPIC
 into first_hashtag
FROM
  Hashtag ht
  INNER JOIN TopTwoHashtags tt
  ON ht.HASHTAGID = tt.HASHTAGID
WHERE
  r < 3;

 dbms_output.put_line('Top 2 most frequent hashtags: '|| first_hashtag);

 exception 
  when no_data_found then
  dbms_output.put_line('This user does not exist');
  return;

end;

We have the following tables:

  • Admin
  • Follow
  • Hashtag

    • HASHTAGID
    • TOPIC
  • Tweet_Hashtag

    • TWEETHASHID
    • TWEETID
    • HASHTAGID
  • Tweets
    • TWEETID
    • USERID
    • TWEETDATE
    • TWEETTIME
    • TWEETTEXT
  • TwitUser

This is our stored procedure for searching for tweets:

create or replace 
procedure search_tweets(ttwitext in tweets.tweettext%type, tuserID in tweets.userid%type )
is
  twit_user  tweets.userid%type;
  twit_id tweets.tweetid%type;
  twit_text tweets.tweettext%type;

begin

  select tweettext into twit_text from tweets where userid = tuserid and tweettext like '%' ||ttwitext || '%';
  if twit_text is not null then

  dbms_output.put_line(twit_text);

 end if; 


exception
  when no_data_found then
    dbms_output.put_line('kersplat' );
    return;

end;
Was it helpful?

Solution 2

This is how I would write it. Using a WITH block I would calculate the ROW_NUMBER Over the counts using ROW_NUMBER.

Note: this does not deal well with ties and will pull back arbitrary rows when they occur. You could use RANK if you want to show more than two hashtags in this case

with TopTwoHashtags
AS (

SELECT 
  t.userID,
  th.HASHTAGID,
  ROW_NUMBER() OVER  (ORDER BY COUNT(th.TWEETID) DESC) r
FROM 
  Tweet_Hashtag th
  INNER JOIN Tweets t
  ON th.TWEETID = t.TWEETID
WHERE 
  userID = 3
GROUP BY
  t.userID,
  th.HASHTAGID
)
SELECT
  ht.TOPIC
FROM
  Hashtag ht
  INNER JOIN TopTwoHashtags tt
  ON ht.HASHTAGID = tt.HASHTAGID
WHERE
  r < 3;

DEMO

OTHER TIPS

In SQL you can get the most frequent hashtags using SQL such as this:

select tagid, COUNT(*) as cnt
from tweet_hashtag ht
where userid = tuserid
group by userid, tagid
order by cnt desc
limit 2

If you want the actual tag, you'd need to join in the tag name from the hashtag table.

If the tweet_hashtag table doesn't have the userid, then you might need to join in the tweets table:

select ht.tagid, COUNT(*) as cnt
from tweet_hashtag ht join
     tweets t
     on ht.tweetId = t.tweetId
where ht.userid = tuserid
group by t.userid, ht.tagId
order by cnt desc
limit 2

If you are using Oracle, replace the limit 2 with 'rownum <= 2'. If you are using SQL Server or Sybase, replace it with top 2 in the select clause.

Conrad is absolutely right. The correct format for rownum is:

select t.* from (select ht.tagid, COUNT(*) as cnt from tweet_hashtag ht join tweets t on ht.tweetId = t.tweetId where ht.userid = tuserid group by t.userid, ht.tagId order by cnt desc ) t where rownum <= 2

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top