Is it good practice to create a separate table for views count?

https://dba.stackexchange.com/questions/224390

17-01-2021

Question

I have created a blog table with a field called views_count, but I have heard that updating views_count on every page view puts strain on the database. So I have now created a separate table for the view counts, as below:

views:
id,
blog_id,
ip_address,
counter

Now I'm storing unique visits in the views table. Whenever I save a record in the views table, I also update the views_count field in the blog table. Is this a good approach, or is there a better alternative?

Full CREATE schema:

CREATE TABLE `video_blog` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `category_id` int(11) UNSIGNED DEFAULT NULL,
  `title` varchar(255) NOT NULL,
  `sub_title` varchar(255) DEFAULT NULL,
  `slug` varchar(255) NOT NULL,
  `video_embed_code` text,
  `video_thumbnail` varchar(255) DEFAULT NULL,
  `video_thumbnail_alt` varchar(255) DEFAULT NULL,
  `description` text,
  `views` int(11) UNSIGNED NOT NULL,
  `is_active` tinyint(1) UNSIGNED NOT NULL DEFAULT '1',
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

-- Table structure for table `video_blog_category`

CREATE TABLE `video_blog_category` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL,
  `description` varchar(255) DEFAULT NULL,
  `meta_title` varchar(255) DEFAULT NULL,
  `meta_description` varchar(255) DEFAULT NULL,
  `order_by` int(11) UNSIGNED DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

-- Table structure for table `video_blog_views_tracker`

CREATE TABLE `video_blog_views_tracker` (
  `id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
  `video_blog_id` int(11) UNSIGNED DEFAULT NULL,
  `user_ip_address` varchar(255) DEFAULT NULL,
  `counter` int(11) UNSIGNED DEFAULT NULL,
  `created_at` datetime DEFAULT NULL,
  `updated_at` datetime DEFAULT NULL,
  PRIMARY KEY (`id`)
);

Note: Our website blog is getting millions of visitors daily. So the new table will get updated frequently.

Solution

At a mere 10 updates per second, this is a non-issue.

When you get to 100/sec and are still using HDD, then we can discuss further. Or 1000/sec with SSD.

At higher rates, yes, violate the textbook principles and have the view-counter in a separate table (with only the count, plus page_id as the PK). The reason is to avoid conflicts with non-counter accesses to the main table.

If you are keeping track of each 'view' as in a table of "who viewed what, when", the problem gets messier. On the one hand, there are INSERTs into that table (again, 10/sec is not a problem). On the other hand, SELECT COUNT(*) ... will have extremes -- a count of 100 is no problem, but a count of a million can be.

"Likes" have similar issues.

For more extreme traffic, you need to gather up the updates/inserts, consolidate them, then apply them. This might get you another 10x speedup, at the expense of some complexity and a few seconds delay in updating the counters.

But, by that time, you will have outgrown a single server, and other solutions will be needed for all your problems. Sharding is a likely part of this next level of design.

For any system that starts small, and grows to be huge, you must expect to do a major redesign every so often. For you (today), moving the counters out is premature. However, doing so could forestall (for a while) the next major redesign.

Rehash

Plan A: (all in one)

CREATE TABLE Blog (
    id INT UNSIGNED AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,   -- plus lots of other meta info
    view_ct INT UNSIGNED NOT NULL DEFAULT '0',
    PRIMARY KEY (id)
);

Plan B: (split out just the counter)

CREATE TABLE Blog (
    id INT UNSIGNED AUTO_INCREMENT,
    title VARCHAR(255) NOT NULL,   -- plus lots of other meta info
    PRIMARY KEY (id)
);
CREATE TABLE BlogViews (
    blog_id INT UNSIGNED,   -- not A_I; for joining to Blog
    view_ct INT UNSIGNED NOT NULL DEFAULT '0',
    ts TIMESTAMP NOT NULL,   -- optional -- time of last viewing??
    PRIMARY KEY(blog_id)
);

Discussion of Plan A:

Simpler
Good enough for "low traffic" website -- say 10 views/sec.
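With Plan A, counting a view is one single-row update on the main table. A minimal sketch, assuming the Blog table above and a prepared-statement placeholder (?) for the blog id:

```sql
-- Plan A: bump the counter in place on the main table.
-- The increment is done inside the statement, so concurrent
-- viewers do not overwrite each other's counts.
UPDATE Blog
SET view_ct = view_ct + 1
WHERE id = ?;
```

Doing the arithmetic in SQL (rather than reading the value, adding one in the application, and writing it back) avoids lost updates when two page views arrive at the same time.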

Advantages of B:

Needs JOIN, but only when both meta and count are needed together. This JOIN is NOT a big burden.
Updating the count hits only BlogViews, thereby not interfering with any queries that need only meta info, especially UPDATEs to such.
Needed for busy website -- say peak loading of 1000 views/sec.
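As a rough illustration of Plan B, the statements below show one way to bump the counter and to read it back together with the meta info. This is a sketch against the Blog and BlogViews tables above. The ? placeholders stand for the blog id, and setting ts explicitly is an assumption made so the statement does not depend on the server's TIMESTAMP default settings.

```sql
-- Count one view. The first view of a blog inserts the row;
-- later views hit the primary key and only increment the counter.
INSERT INTO BlogViews (blog_id, view_ct, ts)
VALUES (?, 1, NOW())
ON DUPLICATE KEY UPDATE
    view_ct = view_ct + 1,
    ts = NOW();

-- Fetch meta info and the count together (only when both are needed).
-- LEFT JOIN + COALESCE covers blogs that have never been viewed.
SELECT b.id,
       b.title,
       COALESCE(v.view_ct, 0) AS view_ct
FROM Blog AS b
LEFT JOIN BlogViews AS v ON v.blog_id = b.id
WHERE b.id = ?;
```

Queries that need only the meta info never touch BlogViews, which is the point of the split.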

When to use C:

Thousands of views/sec.
C involves collecting views, consolidating them, then updating a structure like Plan B.
This further isolates both Blogs and BlogViews from interference.
View counts may be slightly delayed (seconds).
(Further details can be discussed elsewhere)
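One possible shape for Plan C, sketched in SQL. The staging table BlogViewsPending and the @max_id variable are hypothetical names introduced here for illustration, not part of the original answer. Consolidation could equally happen in the application or in a queue before anything reaches the database.

```sql
-- Hypothetical staging table: one cheap INSERT per page view.
CREATE TABLE BlogViewsPending (
    id BIGINT UNSIGNED AUTO_INCREMENT,
    blog_id INT UNSIGNED NOT NULL,
    PRIMARY KEY (id)
);

-- On every page view:
INSERT INTO BlogViewsPending (blog_id) VALUES (?);

-- Every few seconds (e.g. from a scheduled job), fold the pending
-- rows into BlogViews and clear only the rows that were folded in.
START TRANSACTION;

SELECT MAX(id) INTO @max_id FROM BlogViewsPending;

INSERT INTO BlogViews (blog_id, view_ct, ts)
    SELECT blog_id, COUNT(*), NOW()
    FROM BlogViewsPending
    WHERE id <= @max_id
    GROUP BY blog_id
ON DUPLICATE KEY UPDATE
    view_ct = view_ct + VALUES(view_ct),   -- add the batch total
    ts = NOW();

DELETE FROM BlogViewsPending WHERE id <= @max_id;

COMMIT;
```

Each blog gets one counter update per batch instead of one per view, which is where the speedup and the few seconds of delay both come from. A production version would need more care around rows that are still uncommitted when the job runs.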

Plans A2, B2, C2, D2:

These are modifications to the other plans, wherein you keep track of 'who' views the blogs 'when'.
This needs to worry about SELECT COUNT(*) instead of merely SELECT view_ct.
SELECT COUNT(*) can be costly if you have a million views.
These extensions are best handled with "Summary Table" design concepts, which I cover here.
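To give a feel for the summary-table idea, here is a minimal sketch. The table name BlogViewsDaily, its columns, and the daily granularity are all assumptions chosen for illustration. The detailed "who viewed what, when" table would still exist alongside it.

```sql
-- Hypothetical summary table: one row per blog per day.
CREATE TABLE BlogViewsDaily (
    blog_id INT UNSIGNED NOT NULL,
    dy DATE NOT NULL,
    view_ct INT UNSIGNED NOT NULL DEFAULT 0,
    PRIMARY KEY (blog_id, dy)
);

-- Maintain it as views arrive (or in batches).
INSERT INTO BlogViewsDaily (blog_id, dy, view_ct)
VALUES (?, CURDATE(), 1)
ON DUPLICATE KEY UPDATE view_ct = view_ct + 1;

-- Total views: sums at most one row per day
-- instead of counting every individual view.
SELECT SUM(view_ct) AS total_views
FROM BlogViewsDaily
WHERE blog_id = ?;
```

The total now scans a few hundred or thousand small rows for a long-lived blog, rather than running SELECT COUNT(*) over a million detail rows.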

Plan E (Now that the actual schemas are presented, I'll call that E):

For video_blog_views_tracker, get rid of id, and have

PRIMARY KEY(video_blog_id, user_ip_address)  -- should be unique

That should be optimal for the counter query:

SELECT SUM(counter) FROM video_blog_views_tracker
    WHERE video_blog_id = ?
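A sketch of how Plan E might be applied to the schema in the question. Making the key columns NOT NULL is an assumption on my part (primary key columns cannot be NULL). The ALTER will fail if the table already holds NULLs or duplicate (video_blog_id, user_ip_address) pairs, so those would need cleaning up first.

```sql
-- Replace the surrogate id with the natural composite key.
ALTER TABLE video_blog_views_tracker
    DROP COLUMN id,
    MODIFY video_blog_id INT UNSIGNED NOT NULL,
    MODIFY user_ip_address VARCHAR(255) NOT NULL,
    MODIFY counter INT UNSIGNED NOT NULL DEFAULT 0,
    ADD PRIMARY KEY (video_blog_id, user_ip_address);

-- Record a view: first visit from an IP inserts a row,
-- repeat visits from the same IP increment its counter.
INSERT INTO video_blog_views_tracker
    (video_blog_id, user_ip_address, counter, created_at, updated_at)
VALUES (?, ?, 1, NOW(), NOW())
ON DUPLICATE KEY UPDATE
    counter = counter + 1,
    updated_at = NOW();
```

With video_blog_id leading the primary key, all rows for one blog sit together in the index, so the SUM(counter) query reads one contiguous range.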

Yes, rolling that into video_blog.views via a TRIGGER or CRON job is possible. But I would not do it until the need is determined.
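If the roll-up does become necessary, the cron variant could look roughly like the statement below. It is a sketch against the question's tables, meant to be run periodically, and it recomputes every blog's total on each run, so it is only reasonable while the tracker table stays moderately sized.

```sql
-- Periodic job: copy the per-blog totals into video_blog.views.
UPDATE video_blog AS vb
JOIN (
    SELECT video_blog_id, SUM(counter) AS total
    FROM video_blog_views_tracker
    GROUP BY video_blog_id
) AS t ON t.video_blog_id = vb.id
SET vb.views = t.total;
```

Blogs with no tracker rows are left untouched by the inner join, so their views column keeps whatever value it already had.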

Licensed under: CC-BY-SA with attribution
