Question

Someone can add articles on my ASPX webpage.

For example they type in something like this as Title:

Öçöçü

This is saved in the database as:

&Ouml;&ccedil;&ouml;&ccedil;&uuml;

But when I read it from the database to create a URL for it, I convert it to this format:

Ococu

So I send people to this URL to read the article: www.test.com/ococu, where Ococu is the query string.

I need to retrieve the content of this article with the title Öçöçü, but I can't do something like

SELECT * FROM Article WHERE Title = 'Ococu' (where Ococu is the query string), because the title has been saved to the database as &Ouml;&ccedil;&ouml;&ccedil;&uuml;

What can I do to solve this? I need to execute that query, but I can't, because the query string doesn't contain the foreign characters that are saved in the database.

Or should I create two attributes in the database, one for the title and the other for the URL? Then I could write the WHERE clause as: WHERE URL = 'Ococu'

Please I need your help guys.


Solution

It's hard to say how to fix this without knowing how the columns in your MySQL table are declared and how your characters are stored. This kind of multilingual text is easiest to handle if your columns are declared something like this:

  Title VARCHAR(50) CHARACTER SET utf8 COLLATE utf8_turkish_ci

This allows the table to contain your Turkish characters (and Greek and Hungarian, for that matter) without having to be entity-coded (&uuml;, etc.).

If in fact your tables are coded this way, try the following SELECT statement:

  SELECT *
    FROM Article
   WHERE Title = 'ococu' COLLATE utf8_general_ci

Being ignorant of Turkish, I don't know the reasons for this, but it's clear that the Turkish collation treats ö as a different letter from o, and likewise ç from c and ü from u. The utf8_general_ci collation, however, treats each of those pairs as the same letter. That's why the SELECT statement above works.

If the data in your table is stored entity-coded (&uuml;, etc.), you really ought to convert it to utf8 so you can use this kind of search.
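As a rough illustration of that conversion step, the sketch below turns entity-coded text back into real characters before it is written to a utf8 column. It is written in Python only to keep it short. `decode_title` is a hypothetical helper name, and the sketch assumes the stored values use standard HTML named entities.

```python
import html


def decode_title(stored: str) -> str:
    """Convert entity-coded text (e.g. '&uuml;') into real Unicode characters.

    Hypothetical helper: assumes the stored value uses standard HTML entities.
    """
    return html.unescape(stored)


if __name__ == "__main__":
    # Prints the decoded title with its real Turkish letters.
    print(decode_title("&Ouml;&ccedil;&ouml;&ccedil;&uuml;"))
```

In ASP.NET, `HttpUtility.HtmlDecode` plays the same role as `html.unescape` here.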

Finally, the fragment of URL you're mentioning with the value ococu is often called a slug in the trade. Your title Öçöçü needs to be converted to the slug value for searching. My suggestion above employs the collation to do that. It's worth mentioning that content management systems often store an article's title and slug in separate columns in the database table. This allows the creation of the slug from the title to be done explicitly at the time the article is created.

Here's a Stack Overflow item explaining how to use C# to turn a Unicode phrase into a slug: "URL Slugify algorithm in C#?"
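For readers who just want the general shape of such a conversion, here is a minimal sketch in Python (not the code from the linked item). `slugify` is a hypothetical function name. It assumes a slug should be lowercase ASCII letters and digits separated by hyphens, and it handles the Turkish dotless ı explicitly because that letter has no accent to strip.

```python
import re
import unicodedata


def slugify(title: str) -> str:
    """Turn a title into a lowercase ASCII slug (illustrative sketch)."""
    # Split accented letters into a base letter plus combining marks (NFD).
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks (Unicode category "Mn"), keeping the base letters.
    stripped = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # The Turkish dotless i does not decompose, so map it by hand (assumption).
    stripped = stripped.replace("\u0131", "i")
    # Lowercase, then collapse every run of other characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", stripped.lower())
    return slug.strip("-")


if __name__ == "__main__":
    print(slugify("\u00d6\u00e7\u00f6\u00e7\u00fc"))  # the title Öçöçü
```

The slug would be computed once, when the article is saved, and stored next to the title. In C#, `string.Normalize(NormalizationForm.FormD)` provides the same decomposition step.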

OTHER TIPS

The best approach would be to have two separate fields in the DB, one for the title and another for the URL. Sometimes you will want to change the title but keep the URL intact, so that existing links are not broken. You probably also don't want spaces and other special characters in your URL, so you'd want to shorten it and replace symbols (just as you describe in your question).

This also mitigates some of the security risks that arise when third parties create articles on your site, such as nasty scripts being executed from a URL parameter.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow