Question

I have created a blog table with a views_count field, but I have heard that updating views_count on every page view puts a strain on the database. So I have now created a separate table for view counts, as below:

views:
id,
blog_id,
ip_address,
counter

Now I'm storing unique visits in the views table. Whenever I save a record in the views table, I also update the blog table's views_count field. Is this a good approach, or is there a better alternative?

Full CREATE schema:

CREATE TABLE `video_blog` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `category_id` int(11) UNSIGNED DEFAULT NULL,
  `title` varchar(255) NOT NULL,
  `sub_title` varchar(255) DEFAULT NULL,
  `slug` varchar(255) NOT NULL,
  `video_embed_code` text,
  `video_thumbnail` varchar(255) DEFAULT NULL,
  `video_thumbnail_alt` varchar(255) DEFAULT NULL,
  `description` text,
  `views` int(11) UNSIGNED NOT NULL,
  `is_active` tinyint(1) UNSIGNED NOT NULL DEFAULT '1',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

-- Table structure for table `video_blog_category`

CREATE TABLE `video_blog_category` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `description` varchar(255) DEFAULT NULL,
  `meta_title` varchar(255) DEFAULT NULL,
  `meta_description` varchar(255) DEFAULT NULL,
  `order_by` int(11) UNSIGNED DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

-- Table structure for table `video_blog_views_tracker`

CREATE TABLE `video_blog_views_tracker` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `video_blog_id` int(11) UNSIGNED DEFAULT NULL,
  `user_ip_address` varchar(255) DEFAULT NULL,
  `counter` int(11) UNSIGNED DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

Note: Our website blog is getting millions of visitors daily. So the new table will get updated frequently.

Solution

At a mere 10 updates per second, this is a non-issue.

When you get to 100/sec and are still using HDD, then we can discuss further. Or 1000/sec with SSD.
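As a rough check on where a site sits relative to these rates, the daily total can be converted to an average per second. This sketch assumes about one million views spread evenly over a day; that figure is an assumption, and real traffic is burstier, so peak rates will be higher than the average.

```sql
-- Average views per second for 1,000,000 views/day (86,400 seconds in a day).
SELECT 1000000 / 86400 AS avg_views_per_sec;   -- about 11.6
```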

At higher rates, yes, violate the textbook principles and have the view-counter in a separate table (with only the count, plus page_id as the PK). The reason is to avoid conflicts with non-counter accesses to the main table.

If you are keeping track of each 'view' as in a table of "who viewed what, when", the problem gets messier. On the one hand, there are INSERTs into that table (again, 10/sec is not a problem). On the other hand, SELECT COUNT(*) ... will have extremes -- a count of 100 is no problem, but a count of a million can be.

"Likes" have similar issues.

For more extreme traffic, you need to gather up the updates/inserts, consolidate them, then apply them. This might get you another 10x speedup, at the expense of some complexity and a few seconds delay in updating the counters.

But, by that time, you will have outgrown a single server, and other solutions will be needed for all your problems. Sharding is a likely part of this next level of design.

For any system that starts small, and grows to be huge, you must expect to do a major redesign every so often. For you (today), moving the counters out is premature. However, doing so could forestall (for a while) the next major redesign.

Rehash

Plan A: (all in one)

CREATE TABLE Blog (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,   -- stands in for lots of meta info: title, etc
    view_ct INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (id)
);

Plan B: (split out just the counter)

CREATE TABLE Blog (
    id INT UNSIGNED NOT NULL AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,   -- stands in for lots of meta info: title, etc
    PRIMARY KEY (id)
);
CREATE TABLE BlogViews (
    blog_id INT UNSIGNED NOT NULL,   -- not A_I; for joining to Blog
    view_ct INT UNSIGNED NOT NULL DEFAULT 0,
    ts TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
        ON UPDATE CURRENT_TIMESTAMP,   -- optional -- time of last viewing
    PRIMARY KEY (blog_id)
);

Discussion of Plan A:

  • Simpler
  • Good enough for "low traffic" website -- say 10 views/sec.
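With Plan A, counting a view can be a single in-place increment. A minimal sketch, where 123 is a placeholder blog id:

```sql
-- Plan A: bump the counter in place on each page view.
-- Doing the arithmetic inside the UPDATE avoids a separate read-then-write.
UPDATE Blog
   SET view_ct = view_ct + 1
 WHERE id = 123;
```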

Advantages of B:

  • Needs JOIN, but only when both meta and count are needed together. This JOIN is NOT a big burden.
  • Updating the count hits only BlogViews, thereby not interfering with any queries that need only meta info, especially UPDATEs to such.
  • Needed for busy website -- say peak loading of 1000 views/sec.
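A minimal sketch of how Plan B might be used, assuming the BlogViews table as defined above (including the optional ts column) and 123 as a placeholder blog id:

```sql
-- Record one view: creates the row on the first view, increments afterwards.
INSERT INTO BlogViews (blog_id, view_ct, ts)
VALUES (123, 1, NOW())
ON DUPLICATE KEY UPDATE
    view_ct = view_ct + 1,
    ts = NOW();

-- Fetch meta info and the count together, only when both are needed.
-- LEFT JOIN + COALESCE reports 0 for a blog that has never been viewed.
SELECT b.id, COALESCE(v.view_ct, 0) AS view_ct
  FROM Blog AS b
  LEFT JOIN BlogViews AS v ON v.blog_id = b.id
 WHERE b.id = 123;
```

The counter writes touch only BlogViews, so queries and UPDATEs against Blog are not held up by them.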

When to use C:

  • Thousands of views/sec.
  • C involves collecting views, consolidating them, then updating a structure like Plan B.
  • This further isolates both Blogs and BlogViews from interference.
  • View counts may be slightly delayed (seconds).
  • (Further details can be discussed elsewhere)
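One possible shape for Plan C is sketched below. The ViewStaging table, its swap via RENAME TABLE, and the "every few seconds" schedule are all assumptions made for illustration, not something the plan above prescribes; BlogViews is the Plan B table, including its optional ts column.

```sql
-- Hypothetical staging table: one cheap INSERT per page view.
CREATE TABLE ViewStaging (
    blog_id INT UNSIGNED NOT NULL
) ENGINE=InnoDB;

-- On each page view (123 is a placeholder blog id):
INSERT INTO ViewStaging (blog_id) VALUES (123);

-- Every few seconds, from a background job:
-- 1. Swap in an empty staging table so incoming views are not lost.
CREATE TABLE ViewStagingNew LIKE ViewStaging;
RENAME TABLE ViewStaging    TO ViewStagingOld,
             ViewStagingNew TO ViewStaging;

-- 2. Consolidate the captured views and apply them in one statement.
--    VALUES() is deprecated in recent MySQL 8.0 releases but still accepted.
INSERT INTO BlogViews (blog_id, view_ct, ts)
    SELECT blog_id, COUNT(*), NOW()
      FROM ViewStagingOld
     GROUP BY blog_id
ON DUPLICATE KEY UPDATE
    view_ct = view_ct + VALUES(view_ct),
    ts = VALUES(ts);

-- 3. Discard the processed batch.
DROP TABLE ViewStagingOld;
```

BlogViews then sees one write per blog per batch instead of one per view, at the cost of counts lagging by up to one batch interval.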

Plans A2, B2, C2:

  • These are modifications to the other plans, wherein you keep track of 'who' views the blogs 'when'.
  • This needs to worry about SELECT COUNT(*) instead of merely SELECT view_ct.
  • SELECT COUNT(*) can be costly if you have a million views.
  • These extensions are best handled with "Summary Table" design concepts, which I cover here.
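To make the summary-table idea concrete, here is a rough sketch. The table names BlogViewLog and BlogViewsDaily, their columns, and the nightly schedule are hypothetical and chosen only for illustration.

```sql
-- Hypothetical "who viewed what, when" log.
CREATE TABLE BlogViewLog (
    blog_id   INT UNSIGNED NOT NULL,
    viewer_ip VARCHAR(45)  NOT NULL,
    viewed_at DATETIME     NOT NULL,
    INDEX (viewed_at)
);

-- Hypothetical summary table: one row per blog per day.
CREATE TABLE BlogViewsDaily (
    blog_id INT UNSIGNED NOT NULL,
    dy      DATE         NOT NULL,
    view_ct INT UNSIGNED NOT NULL,
    PRIMARY KEY (blog_id, dy)
);

-- Nightly job: summarize yesterday's log rows.
INSERT INTO BlogViewsDaily (blog_id, dy, view_ct)
    SELECT blog_id, DATE(viewed_at), COUNT(*)
      FROM BlogViewLog
     WHERE viewed_at >= CURDATE() - INTERVAL 1 DAY
       AND viewed_at <  CURDATE()
     GROUP BY blog_id, DATE(viewed_at)
ON DUPLICATE KEY UPDATE
    view_ct = VALUES(view_ct);   -- re-running the job overwrites rather than double-counts

-- Total views now reads a few summary rows instead of counting log rows.
SELECT SUM(view_ct) AS total_views
  FROM BlogViewsDaily
 WHERE blog_id = 123;
```

In this sketch the total excludes views from the current day until the next run of the job.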

Plan E (Now that the actual schemas are presented, I'll call that E):

For video_blog_views_tracker, get rid of id, and have

PRIMARY KEY(video_blog_id, user_ip_address)  -- should be unique

That should be optimal for the counter query:

SELECT SUM(counter) FROM video_blog_views_tracker
    WHERE video_blog_id = ?
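A sketch of the change and of recording a view under Plan E. It assumes the existing rows contain no NULLs and no duplicate (video_blog_id, user_ip_address) pairs, since otherwise the ALTER will fail; the blog id and IP address are placeholders.

```sql
-- Replace the surrogate id with the natural key.
-- Primary key columns must be NOT NULL.
ALTER TABLE video_blog_views_tracker
    DROP COLUMN id,
    MODIFY video_blog_id   INT UNSIGNED NOT NULL,
    MODIFY user_ip_address VARCHAR(255) NOT NULL,
    MODIFY counter         INT UNSIGNED NOT NULL DEFAULT 0,
    ADD PRIMARY KEY (video_blog_id, user_ip_address);

-- Record a view: first visit from this IP inserts a row, repeat visits increment it.
INSERT INTO video_blog_views_tracker
    (video_blog_id, user_ip_address, counter, created_at, updated_at)
VALUES (123, '203.0.113.7', 1, NOW(), NOW())
ON DUPLICATE KEY UPDATE
    counter    = counter + 1,
    updated_at = NOW();
```

Because the primary key starts with video_blog_id, the SUM(counter) query above reads one contiguous range of rows for the blog.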

Yes, rolling that into video_blog.views via a TRIGGER or CRON job is possible. But I would not do it until the need is determined.
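If that need does arise, a periodic roll-up could look roughly like the following. This is only a sketch of the cron-style option: it recomputes every blog's total on each run, which may be too heavy for a large tracker table.

```sql
-- Periodic job: copy per-blog totals from the tracker into video_blog.views.
UPDATE video_blog AS b
  JOIN ( SELECT video_blog_id, SUM(counter) AS total
           FROM video_blog_views_tracker
          GROUP BY video_blog_id ) AS t
    ON t.video_blog_id = b.id
   SET b.views = t.total;
```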

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange