I added "IF message_table.message_text_id != 0" and don't know if something like that is possible.
Unless there actually is a row with text_id = 0 in your text_table, there is no need to do this. Simply omit the IF and use the following query:
SELECT IFNULL(text_table.text_body, message_table.message_short_body) AS body,
…
FROM message_table
LEFT JOIN text_table ON text_table.text_id = message_table.message_text_id
WHERE message_table.message_to_id = $user_id
In terms of performance, the engine might be able to optimize the query better if you add your condition to the join conditions:
SELECT IFNULL(text_table.text_body, message_table.message_short_body) AS body,
…
FROM message_table
LEFT JOIN text_table ON text_table.text_id = message_table.message_text_id
AND message_table.message_text_id != 0
WHERE message_table.message_to_id = $user_id
You could also try an approach using a subquery:
SELECT IF(message_text_id = 0, message_short_body, (
SELECT text_table.text_body
FROM text_table
WHERE text_table.text_id = message_table.message_text_id)) AS body,
…
FROM message_table
WHERE message_table.message_to_id = $user_id
This has the benefit of not executing the search in text_table if none is required, but the drawback of performing a separate subquery for each long message. I would expect the join-based queries above to be superior, but I'm not sure.
As a general rule is it possible to tell if this would reduce the size of the database / speed up queries ?
You'll have to benchmark, as it depends on the use case. If most of your queries retrieve data from fields other than the text, then the smaller table will make those queries faster, yielding a performance gain. If, on the other hand, you usually want the body along with the rest of the message, then you'll likely end up with worse performance, because every such query needs the extra join.
You should also use benchmarks to distinguish between the different alternatives described above.
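Before timing anything, it can help to look at the execution plans. The following is only a sketch: it assumes MySQL, uses the literal 42 as a stand-in for $user_id, and selects only the body column. Prefixing each variant with EXPLAIN shows which indexes the engine intends to use and whether the subquery is evaluated per row.

```sql
-- Join variant with the extra condition in the ON clause.
EXPLAIN
SELECT IFNULL(text_table.text_body, message_table.message_short_body) AS body
FROM message_table
LEFT JOIN text_table ON text_table.text_id = message_table.message_text_id
                    AND message_table.message_text_id != 0
WHERE message_table.message_to_id = 42;  -- 42 is a placeholder for $user_id

-- Subquery variant.
EXPLAIN
SELECT IF(message_text_id = 0, message_short_body, (
         SELECT text_table.text_body
         FROM text_table
         WHERE text_table.text_id = message_table.message_text_id)) AS body
FROM message_table
WHERE message_table.message_to_id = 42;  -- same placeholder
```

The plans only describe what the optimizer intends to do; actual timings on realistic data are still needed to pick a winner.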
In terms of size of the database, you'll likely see an increase: the storage requirements for the text data are about the same, but the indices for the extra table will cost you.
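To check the size effect on your own data, you could compare the figures MySQL reports for each table before and after the split. This is a sketch that assumes MySQL and the table names used above; the reported values are approximate, particularly for InnoDB.

```sql
-- Approximate data and index sizes (in bytes) for the two tables
-- in the currently selected database.
SELECT table_name,
       data_length,
       index_length,
       data_length + index_length AS total_bytes
FROM information_schema.TABLES
WHERE table_schema = DATABASE()
  AND table_name IN ('message_table', 'text_table');
```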
I guess if this were my schema, I'd drop the message_text_id and instead have the primary key of the text_table match that of the message_table. I.e. each key occurs either only in the message table or in both tables, and rows with the same key belong together. Whether or not the body is stored in the other table could be encoded by setting message_table.message_short_body to NULL for those messages.
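A rough sketch of that layout, assuming MySQL: the column name message_id, the column types, and the index name are made up for illustration, and 42 again stands in for $user_id.

```sql
-- Hypothetical schema: both tables share the same primary key value.
CREATE TABLE message_table (
    message_id         INT UNSIGNED NOT NULL PRIMARY KEY,
    message_to_id      INT UNSIGNED NOT NULL,
    message_short_body VARCHAR(255) NULL,  -- NULL means the body lives in text_table
    KEY idx_message_to (message_to_id)
);

CREATE TABLE text_table (
    message_id INT UNSIGNED NOT NULL PRIMARY KEY,  -- same value as message_table.message_id
    text_body  TEXT NOT NULL,
    FOREIGN KEY (message_id) REFERENCES message_table (message_id)
);

-- The short body wins when present; otherwise fall back to the long text.
SELECT IFNULL(m.message_short_body, t.text_body) AS body
FROM message_table AS m
LEFT JOIN text_table AS t ON t.message_id = m.message_id
WHERE m.message_to_id = 42;
```

With this layout there is no separate message_text_id column to keep in sync, and the join runs on the primary keys of both tables.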