Should I create a slug on the fly or store in DB?

https://stackoverflow.com/questions/807195

slug

03-07-2019

Question

A slug is the part of a URL that describes or titles a page; it is usually keyword-rich for that page, which improves SEO. For example, in this URL, PHP/JS - Create thumbnails on the fly or store as files, the last section "php-js-create-thumbnails-on-the-fly-or-store-as-files" is the slug.

Currently I am storing the slug for each page with the page's record in the DB. The slug is generated from the Title field when the page is generated and stored with the page. However, I'm considering generating the slug on the fly in case I want to change it. I'm trying to work out which is better and what others have done.

So far I've come up with these pro points for each one:

Store slug:
- "Faster": the processor doesn't need to generate it each time (it is generated once).

Generate on the fly:
- Flexible (can adjust the slug algorithm without regenerating slugs for the whole table).
- Uses less space in the DB.
- Less data transferred from the DB to the app.

What else have I missed and how do/would you do it?

EDIT:

I'd just like to clarify what looks like a misunderstanding in the answers. The slug has no effect on landing on the correct page. To understand this just chop off or mangle any part of the slug on this site. e.g.:

PHP/JS - Create thumbnails on the fly or store as files

will all take you to the same page. The slug is never indexed.

You wouldn't need to save the old slugs. If a visitor lands on a page with an "old slug", you can detect that and do a 301 redirect to the correctly "slugged" URL. In the examples above, if Stack Overflow implemented it, then when you landed on any of the links with truncated slugs, it would compare the slug in the URL to the one generated by the current slug algorithm and, if they differed, do a 301 redirect to the same page with the new slug.

Remember that all internally generated links would immediately be using the new algorithm and only links from outside pointing in would be using the old slug.
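The compare-and-redirect idea described above can be sketched roughly as follows. This is only an illustration, not how Stack Overflow actually does it: `slugify` and `resolve` are hypothetical names, the slug rules are one simple possibility, and the `/questions/<id>/<slug>` path shape is borrowed from this page's own URL.

```python
import re
import unicodedata
from typing import Tuple


def slugify(title: str) -> str:
    """One possible slug algorithm: drop accents, lowercase, hyphenate."""
    ascii_title = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    return re.sub(r"[^a-z0-9]+", "-", ascii_title.lower()).strip("-")


def resolve(page_id: int, requested_slug: str, title: str) -> Tuple[int, str]:
    """Return (HTTP status, path). The page is found by ID alone; the
    slug in the URL only decides whether a 301 redirect is needed."""
    current = slugify(title)
    path = f"/questions/{page_id}/{current}"
    if requested_slug == current:
        return 200, path
    return 301, path  # old, truncated, or mangled slug
```

With this shape, changing the body of `slugify` is enough to change every internally generated link, while inbound links carrying an older slug get redirected.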

Solution

There is another thing to consider: what if you want users (or yourself) to be able to define their own slugs? The algorithm may not always be sufficient.

If so, you more or less need to store the slug in the database anyway.

If not, I don't think it matters much. You could generate slugs on the fly, but if you are unsure whether you will want to change them, keep them in the database. In my eyes there is no real performance issue with either method (unless the on-the-fly generation is very slow).

Choose the one which is the most flexible.

OTHER TIPS

Wouldn't changing the slugs for existing pages be a really bad idea? It would break all your inlinks for a start.

Edit, following Guy's clarification in the question: You still need to take old slugs into account. For instance: if you change your slug algorithm Google could start to see multiple versions of each page, and you could suffer a duplicate content penalty, or at best end up sharing PR and SERPs between multiple versions of the same page. To avoid that, you'd need a canonical version of the page that any non-canonical slugs redirected to - and hence you'd need the canonical slug in the database anyway.

For slug generation I don't think that generation time should be an issue, unless your slug algorithm is insanely complicated! Similarly, storage space won't be an issue.

I would store the slug in the database for the simple reason that slugs usually form part of a permalink and once a permalink is out in the wild it should be considered immutable. Having the ability to change a slug for published data seems like a bad idea.

The best way to handle slugs is to store only the speaking part of the slug in the database and to generate the routing part, with the unique identifier, dynamically. Otherwise (if you store the whole URL or URI in the database), rewriting all the stored slugs could become a massive task if you change your mind about how to name your routes.

Let's take this question's SO URL as an example:

/questions/807195/should-i-create-a-slug-on-the-fly-or-store-in-db

it's:

/route/unique-ID/the-speaking-part-thats-not-so-important

The dynamic part is obviously:

/route/unique-ID/

And the one I would store in the database is the speaking part:

the-speaking-part-thats-not-so-important

This allows you to change your mind about the route's name at any time and set up the proper redirects without having to look inside the database first, and you're not forced to make DB changes. The unique ID is always the record's unique ID in your database, so you can identify the page correctly, and you of course know what your routes are.
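As a rough sketch of that split, assuming a hypothetical `ROUTE` constant and helper names that are not from the answer: the route lives in code, the ID identifies the record, and only the speaking part comes from the database.

```python
import re
from typing import NamedTuple, Optional

ROUTE = "questions"  # hypothetical route name; renaming it needs no DB change


class ParsedPath(NamedTuple):
    route: str
    page_id: int
    slug: str  # speaking part; may be empty or stale, never used for lookup


def build_path(page_id: int, stored_slug: str, route: str = ROUTE) -> str:
    """Combine the dynamic route and ID with the stored speaking part."""
    return f"/{route}/{page_id}/{stored_slug}"


def parse_path(path: str) -> Optional[ParsedPath]:
    """Split /route/unique-ID/speaking-part; only the ID must be valid."""
    parts = path.strip("/").split("/")
    if len(parts) < 2 or not re.fullmatch(r"[0-9]+", parts[1]):
        return None
    slug = parts[2] if len(parts) > 2 else ""
    return ParsedPath(parts[0], int(parts[1]), slug)
```

For instance, `parse_path("/questions/807195/anything")` yields the ID 807195 whatever the speaking part says, and `build_path(807195, slug, route="q")` shows that a renamed route needs no change to the stored data.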

And don't forget to set the canonical tag. If you look at this page's source, it's there:

<link rel="canonical" href="http://stackoverflow.com/questions/807195/should-i-create-a-slug-on-the-fly-or-store-in-db" />

This allows search engines to identify the correct page link and ignore others in case you have duplicate content.
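A minimal sketch of emitting such a tag from the same three pieces (route, ID, stored speaking part) might look like this. The function name `canonical_link` and its parameters are made up for illustration.

```python
from html import escape


def canonical_link(base_url: str, route: str, page_id: int, slug: str) -> str:
    """Render a canonical <link> tag for the one preferred URL of a page."""
    href = f"{base_url.rstrip('/')}/{route}/{page_id}/{slug}"
    # Escape the URL so it is safe inside a double-quoted HTML attribute.
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'
```

Every variant of the URL (missing, old, or mangled slug) would render the same tag, since the tag is built from the stored data rather than from the requested path.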

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow