Question

This might look like a silly question with no research behind it, but trust me, it is not. I have done some research; one source was the following link: http://www.quora.com/Twitter-1/How-does-Twitter-implement-hashtags

Also, I am not looking for a complete solution here. I will do the hard work myself; I just need some guidance on which approach to take.

I want to implement Twitter-style (and now Facebook-style) hashtags in my application, so that users can post messages with hashtags and others can search on them, for example to see what is trending and what is relevant.

We are using MySQL, MongoDB and Elasticsearch in our storage stack. Any ideas on how I could start implementing this? Would I need another data store? One option is to store the hashtags in the database and then run a text search for them in Elasticsearch.

What would people with more experience in this field suggest?


Solution

A starting point with MongoDB would be to parse each message for the hashtags the user included and store them in an array field of the document. Example status update:

Peter

April 29th 2014 12:28:34

Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee

This message would look like this in MongoDB:

{
    author: "Peter",
    date: ISODate("2014-04-29 12:28:34"),
    text: "Hello friends, I visited the #tradeshow in #washington and drank a delicious #coffee",
    hashtags: [
        "tradeshow",
        "washington",
        "coffee"
    ]
}
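
The parsing step could be sketched roughly as follows. This is only a minimal sketch: the function names extractHashtags and buildMessage are hypothetical, and the rules it applies (a hashtag is "#" followed by letters, digits or underscores; tags are lowercased and de-duplicated) are assumptions you would adapt to your own application.

```javascript
// Hypothetical helper: pull the hashtags out of a message text.
// Assumption: a hashtag is "#" followed by letters, digits or underscores.
function extractHashtags(text) {
  const matches = text.match(/#[\p{L}\p{N}_]+/gu) || [];
  // Drop the leading "#" and lowercase so "#Coffee" and "#coffee" are one tag.
  const tags = matches.map((m) => m.slice(1).toLowerCase());
  // Remove duplicates while keeping the order of first appearance.
  return [...new Set(tags)];
}

// Hypothetical helper: build the document shape shown above.
function buildMessage(author, date, text) {
  return { author, date, text, hashtags: extractHashtags(text) };
}
```

The object returned by buildMessage can then be inserted into the collection as-is.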

When you then create an index on the hashtags field of the collection, you can quickly search for all messages which include a given hashtag (MongoDB indexes each element of an array field separately). You likely want to order the results by date and limit them so the user sees the most recent results first. If you make it a compound index on hashtags and date, that sorting is sped up as well.
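
As a rough sketch of the index and query just described: the helper names below are hypothetical, and "collection" stands for whatever collection holds your messages. The calls used (createIndex, find, sort, limit) exist both in the mongo shell and in the Node.js driver.

```javascript
// Compound index: equality match on hashtags, then newest-first by date.
function ensureHashtagIndex(collection) {
  return collection.createIndex({ hashtags: 1, date: -1 });
}

// Most recent messages containing the given hashtag.
// Matching a scalar against an array field finds documents whose array contains it.
function recentByHashtag(collection, tag, limit = 20) {
  return collection.find({ hashtags: tag }).sort({ date: -1 }).limit(limit);
}
```

In the mongo shell, assuming a collection named messages, this could be used as ensureHashtagIndex(db.messages) followed by recentByHashtag(db.messages, "coffee").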

How to implement "trending" topics is quite a complex question. It is also subjective, depending on what you consider "trending". The exact algorithms Twitter and Facebook use to determine which topics are trending are not public. According to various social media analysts, they also change frequently, so we can assume they are quite complex by now.

That means we cannot help you come up with an algorithm of your own. But if you already have an algorithm in mind for calculating the "trendiness" of a hashtag, we could help you find a good implementation.
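
Purely to illustrate what such an implementation could look like once you have picked a definition, here is a naive sketch. It assumes that "trending" means a hashtag was used more often in the last hour than its hourly average over the preceding 24 hours. That definition, the function name trendingHashtags and the window sizes are all assumptions made for the example, not what Twitter or Facebook do.

```javascript
// Naive trending score: recent usage compared with a smoothed hourly baseline.
// messages: array of { date: Date, hashtags: string[] }; now: timestamp in ms.
function trendingHashtags(messages, now, windowMs = 3600e3, baselineMs = 24 * 3600e3, top = 10) {
  const recent = new Map();   // tag -> uses inside the recent window
  const baseline = new Map(); // tag -> uses inside the older baseline period
  for (const msg of messages) {
    const age = now - msg.date.getTime();
    if (age < 0 || age >= windowMs + baselineMs) continue; // outside both periods
    const bucket = age < windowMs ? recent : baseline;
    for (const tag of msg.hashtags) {
      bucket.set(tag, (bucket.get(tag) || 0) + 1);
    }
  }
  const windowsInBaseline = baselineMs / windowMs;
  const scored = [];
  for (const [tag, count] of recent) {
    // Expected uses per window based on the baseline; +1 avoids division by zero
    // and dampens tags that have almost no history.
    const expected = (baseline.get(tag) || 0) / windowsInBaseline;
    scored.push({ tag, count, score: count / (expected + 1) });
  }
  // Highest score first; ties broken by raw recent count.
  scored.sort((a, b) => b.score - a.score || b.count - a.count);
  return scored.slice(0, top);
}
```

This scans messages in application code to keep the example short; with real data volumes you would more likely compute the counts in the database or keep running counters.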

Licensed under: CC-BY-SA with attribution