Is it good practice to create a separate table for views count?
17-01-2021
Question
I have created a blog table with a field called views_count, but I have heard that updating the views_count field on every page view puts a strain on the database. So I have now created a separate table for view counts, as below:
views:
id,
blog_id,
ip_address,
counter
Now I'm storing unique visits in the views table. When I save a record in the views table, I also update the views_count field in the blog table. Is this a good approach, or is there a better alternative?
Full CREATE schema:
CREATE TABLE `video_blog` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`category_id` int(11) UNSIGNED DEFAULT NULL,
`title` varchar(255) NOT NULL,
`sub_title` varchar(255) DEFAULT NULL,
`slug` varchar(255) NOT NULL,
`video_embed_code` text,
`video_thumbnail` varchar(255) DEFAULT NULL,
`video_thumbnail_alt` varchar(255) DEFAULT NULL,
`description` text,
`views` int(11) UNSIGNED NOT NULL,
`is_active` tinyint(1) UNSIGNED NOT NULL DEFAULT '1',
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
);
-- Table structure for table `video_blog_category`
CREATE TABLE `video_blog_category` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`name` varchar(255) NOT NULL,
`description` varchar(255) DEFAULT NULL,
`meta_title` varchar(255) DEFAULT NULL,
`meta_description` varchar(255) DEFAULT NULL,
`order_by` int(11) UNSIGNED DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
);
-- Table structure for table `video_blog_views_tracker`
CREATE TABLE `video_blog_views_tracker` (
`id` int(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`video_blog_id` int(11) UNSIGNED DEFAULT NULL,
`user_ip_address` varchar(255) DEFAULT NULL,
`counter` int(11) UNSIGNED DEFAULT NULL,
`created_at` datetime DEFAULT NULL,
`updated_at` datetime DEFAULT NULL,
PRIMARY KEY (`id`)
);
Note: Our website blog is getting millions of visitors daily. So the new table will get updated frequently.
Solution
At a mere 10 updates per second, this is a non-issue.
When you get to 100/sec, and are still using HDD, then we can discuss further. Or 1000/sec with SSD.
At higher rates, yes, violate the textbook principles and have the view-counter in a separate table (with only the count, plus page_id as the PK). The reason is to avoid conflicts with non-counter accesses to the main table.
If you are keeping track of each 'view' as in a table of "who viewed what, when", the problem gets messier. On the one hand, there are INSERTs into that table (again, 10/sec is not a problem). On the other hand, SELECT COUNT(*) ... will have extremes -- a count of 100 is no problem, but a count of a million can be.
"Likes" have similar issues.
For more extreme traffic, you need to gather up the updates/inserts, consolidate them, then apply them. This might get you another 10x speedup, at the expense of some complexity and a few seconds' delay in updating the counters.
But, by that time, you will have outgrown a single server, and other solutions will be needed for all your problems. Sharding is a likely part of this next level of design.
For any system that starts small, and grows to be huge, you must expect to do a major redesign every so often. For you (today), moving the counters out is premature. However, doing so could forestall (for a while) the next major redesign.
Rehash
Plan A: (all in one)
CREATE TABLE Blog (
id INT UNSIGNED AUTO_INCREMENT,
-- lots of meta info: title, etc.
view_ct INT UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (id)
);
Plan B: (split out just the counter)
CREATE TABLE Blog (
id INT UNSIGNED AUTO_INCREMENT,
-- lots of meta info: title, etc.
PRIMARY KEY (id)
);
CREATE TABLE BlogViews (
blog_id INT UNSIGNED, -- not A_I; for joining to Blog
view_ct INT UNSIGNED NOT NULL DEFAULT '0',
ts TIMESTAMP NOT NULL, -- optional -- time of last viewing??
PRIMARY KEY(blog_id)
);
Discussion of Plan A:
- Simpler
- Good enough for "low traffic" website -- say 10 views/sec.
Advantages of B:
- Needs JOIN, but only when both meta and count are needed together. This JOIN is NOT a big burden.
- Updating the count hits only BlogViews, thereby not interfering with any queries that need only meta info, especially UPDATEs to such.
- Needed for busy website -- say peak loading of 1000 views/sec.
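To make Plan B concrete, here is a minimal sketch of the two access paths: the counter bump that touches only BlogViews, and the JOIN used when meta info and count are needed together. It uses Python's built-in sqlite3 module as a stand-in for MySQL so that it runs anywhere; that substitution, the sample row, and the helper names record_view and blog_with_count are assumptions for illustration, not part of the original schema.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch is self-contained.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Blog (
        id    INTEGER PRIMARY KEY,
        title TEXT NOT NULL              -- stands in for "lots of meta info"
    );
    CREATE TABLE BlogViews (
        blog_id INTEGER PRIMARY KEY,     -- same value as Blog.id, not auto-increment
        view_ct INTEGER NOT NULL DEFAULT 0
    );
    INSERT INTO Blog (id, title) VALUES (1, 'First post');
    INSERT INTO BlogViews (blog_id) VALUES (1);
""")


def record_view(conn, blog_id):
    """Hypothetical helper: bump the counter in place; only BlogViews is written."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "UPDATE BlogViews SET view_ct = view_ct + 1 WHERE blog_id = ?",
            (blog_id,),
        )


def blog_with_count(conn, blog_id):
    """Hypothetical helper: the JOIN is needed only when meta and count go together."""
    return conn.execute(
        "SELECT b.title, v.view_ct "
        "FROM Blog AS b JOIN BlogViews AS v ON v.blog_id = b.id "
        "WHERE b.id = ?",
        (blog_id,),
    ).fetchone()


for _ in range(3):
    record_view(conn, 1)
row = blog_with_count(conn, 1)
print(row)
```

The UPDATE increments the column in a single statement instead of reading the value and writing it back, which avoids lost updates when two views arrive at once. The same statement text should work unchanged in MySQL.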
When to use C:
- Thousands of views/sec.
- C involves collecting views, consolidating them, then updating a structure like Plan B.
- This further isolates both Blog and BlogViews from interference.
- View counts may be slightly delayed (seconds).
- (Further details can be discussed elsewhere)
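As a rough illustration of the "collect, consolidate, apply" idea behind Plan C, the sketch below buffers views in memory and writes one UPDATE per blog at flush time. The ViewBuffer class is hypothetical, and sqlite3 again stands in for MySQL as an assumption. A real deployment would also need a timer or size threshold to trigger the flush, plus locking or per-process buffers, none of which is shown here.

```python
import sqlite3
from collections import Counter


class ViewBuffer:
    """Hypothetical helper: collects views in memory and applies them in batches."""

    def __init__(self, conn):
        self.conn = conn
        self.pending = Counter()  # blog_id -> views seen since the last flush

    def record(self, blog_id):
        # Collect: no database access on the page-view path.
        self.pending[blog_id] += 1

    def flush(self):
        # Consolidate: one (increment, blog_id) pair per blog, however many views.
        batch = [(n, blog_id) for blog_id, n in self.pending.items()]
        if not batch:
            return 0
        # Apply: a single transaction containing one UPDATE per blog.
        with self.conn:
            self.conn.executemany(
                "UPDATE BlogViews SET view_ct = view_ct + ? WHERE blog_id = ?",
                batch,
            )
        self.pending.clear()  # cleared only after the write succeeded
        return len(batch)


# Assumption: SQLite stands in for MySQL; the table mirrors Plan B's BlogViews.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BlogViews ("
    "blog_id INTEGER PRIMARY KEY, view_ct INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO BlogViews (blog_id) VALUES (?)", [(1,), (2,)])

buf = ViewBuffer(conn)
for blog_id in [1, 1, 2, 1, 1, 2, 1]:  # seven page views
    buf.record(blog_id)
statements = buf.flush()  # number of UPDATEs issued for those seven views
counts = dict(conn.execute("SELECT blog_id, view_ct FROM BlogViews"))
print(statements, counts)
```

The trade-off is the one described above: counters lag by up to one flush interval, and any views still in the buffer are lost if the process dies before flushing.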
Plans A2, B2, C2, D2:
- These are modifications to the other plans, wherein you keep track of 'who' views the blogs 'when'.
- This needs to worry about SELECT COUNT(*) instead of merely SELECT view_ct. SELECT COUNT(*) can be costly if you have a million views.
- These extensions are best handled with "Summary Table" design concepts, which I cover here.
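One way to picture the summary-table idea: keep the raw "who viewed what, when" rows, but periodically roll them up into a small per-day table, then answer count queries from the small table. The sketch below is only an assumption about how that could look. The names view_log, daily_views, rollup_day and total_views are hypothetical, and sqlite3 stands in for MySQL, where INSERT ... ON DUPLICATE KEY UPDATE would play the role of INSERT OR REPLACE.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; both tables are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE view_log (              -- raw detail: one row per view
        blog_id   INTEGER NOT NULL,
        viewer_ip TEXT    NOT NULL,
        viewed_at TEXT    NOT NULL       -- 'YYYY-MM-DD HH:MM:SS'
    );
    CREATE TABLE daily_views (           -- summary: one row per blog per day
        blog_id INTEGER NOT NULL,
        day     TEXT    NOT NULL,
        view_ct INTEGER NOT NULL,
        PRIMARY KEY (blog_id, day)
    );
""")


def rollup_day(conn, day):
    """Recompute the summary rows for one day; safe to run more than once."""
    with conn:
        conn.execute(
            """
            INSERT OR REPLACE INTO daily_views (blog_id, day, view_ct)
            SELECT blog_id, date(viewed_at), COUNT(*)
            FROM view_log
            WHERE date(viewed_at) = ?
            GROUP BY blog_id, date(viewed_at)
            """,
            (day,),
        )


def total_views(conn, blog_id):
    """Sum a handful of summary rows instead of counting every raw row."""
    row = conn.execute(
        "SELECT COALESCE(SUM(view_ct), 0) FROM daily_views WHERE blog_id = ?",
        (blog_id,),
    ).fetchone()
    return row[0]


conn.executemany(
    "INSERT INTO view_log (blog_id, viewer_ip, viewed_at) VALUES (?, ?, ?)",
    [
        (1, "203.0.113.5", "2021-01-16 09:00:00"),
        (1, "203.0.113.9", "2021-01-16 17:30:00"),
        (1, "203.0.113.5", "2021-01-17 08:15:00"),
        (2, "203.0.113.5", "2021-01-17 08:20:00"),
    ],
)
for day in ("2021-01-16", "2021-01-17"):
    rollup_day(conn, day)
summary = sorted(conn.execute("SELECT blog_id, day, view_ct FROM daily_views"))
print(summary, total_views(conn, 1))
```

The count query then reads one row per day per blog rather than one row per view, so its cost grows with the age of the blog instead of with its traffic.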
Plan E (Now that the actual schemas are presented, I'll call that E):
For video_blog_views_tracker, get rid of id, and have
PRIMARY KEY(video_blog_id, user_ip_address) -- should be unique
That should be optimal for the counter query:
SELECT SUM(counter) FROM video_blog_views_tracker
WHERE video_blog_id = ?
Yes, rolling that into video_blog.views via a TRIGGER or CRON job is possible. But I would not do it until the need is determined.
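For what it is worth, here is a hedged sketch of how Plan E's composite primary key might be used: one row per (video_blog_id, user_ip_address), with the counter bumped by an upsert when the same address returns. It runs on sqlite3 as a stand-in for MySQL, which is an assumption. SQLite spells the upsert ON CONFLICT ... DO UPDATE (available from SQLite 3.24), while MySQL's equivalent is INSERT ... ON DUPLICATE KEY UPDATE. The helpers track_view, total_views and unique_visitors are hypothetical, and the key columns are declared NOT NULL here because a primary key requires it.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; only the columns Plan E needs are kept.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE video_blog_views_tracker (
        video_blog_id   INTEGER NOT NULL,
        user_ip_address TEXT    NOT NULL,
        counter         INTEGER NOT NULL DEFAULT 1,
        PRIMARY KEY (video_blog_id, user_ip_address)
    )
""")


def track_view(conn, video_blog_id, ip):
    """Insert a first visit, or bump the counter for a returning address."""
    with conn:
        conn.execute(
            """
            INSERT INTO video_blog_views_tracker
                (video_blog_id, user_ip_address, counter)
            VALUES (?, ?, 1)
            ON CONFLICT (video_blog_id, user_ip_address)
            DO UPDATE SET counter = counter + 1
            """,
            (video_blog_id, ip),
        )


def total_views(conn, video_blog_id):
    """The counter query from Plan E; the PK prefix keeps one blog's rows together."""
    row = conn.execute(
        "SELECT COALESCE(SUM(counter), 0) FROM video_blog_views_tracker "
        "WHERE video_blog_id = ?",
        (video_blog_id,),
    ).fetchone()
    return row[0]


def unique_visitors(conn, video_blog_id):
    """One row per address, so COUNT(*) gives the distinct addresses for a blog."""
    row = conn.execute(
        "SELECT COUNT(*) FROM video_blog_views_tracker WHERE video_blog_id = ?",
        (video_blog_id,),
    ).fetchone()
    return row[0]


track_view(conn, 7, "203.0.113.5")
track_view(conn, 7, "203.0.113.5")  # same address again: counter becomes 2
track_view(conn, 7, "203.0.113.9")
track_view(conn, 8, "203.0.113.5")
print(total_views(conn, 7), unique_visitors(conn, 7))
```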