Question

Referencing:

CREATE TABLE words ( word_id int(11) NOT NULL, word varchar(25) NOT NULL, PRIMARY KEY (word) )

CREATE TABLE synonyms ( source_index int(10) unsigned NOT NULL, destination_index int(10) unsigned NOT NULL )

Querying:

SELECT w.word, z.word
FROM words w
INNER JOIN synonyms y ON w.word_id = y.source_index
INNER JOIN words z ON z.word_id = y.destination_index
WHERE w.word = 'kind'

The problem is that a query on a table with fewer than 120,000 entries takes 400+ seconds. I was hoping this would be more efficient than having a second table with a similar word list for the synonyms, but so far it is proving otherwise. I have no qualms about keeping synonyms in a separate word table, as it isn't quite a duplicate of words. I am not finding anything applicable online on tuning such queries for a lower run time. Is there a way to tune this for reasonable speed (<100 msec), or am I better off without the 'split self reference'?


Solution

Right now the only index you have is the one on words for word (the primary key). The lookup of 'kind' can use it, but neither join column (word_id or source_index) is indexed. Your query will therefore need at least one full table scan on each of words and synonyms, and probably more; that depends on your DBMS and its query optimizer's abilities.

Try adding an index on words for word_id and on synonyms for at least source_index. This way the joins can use the indexes instead of doing full table scans.
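As a rough sketch, those two indexes could be created as follows. This assumes MySQL (the int(11) column types suggest it), and the index names idx_words_word_id and idx_synonyms_source are made up for illustration.

```sql
-- Index names are illustrative; use whatever fits your naming scheme.

-- Supports the join back to words: z.word_id = y.destination_index
CREATE INDEX idx_words_word_id ON words (word_id);

-- Supports the join from words to synonyms: w.word_id = y.source_index
CREATE INDEX idx_synonyms_source ON synonyms (source_index);

-- Check the plan afterwards to see whether the optimizer picks them up.
EXPLAIN
SELECT w.word, z.word
FROM words w
INNER JOIN synonyms y ON w.word_id = y.source_index
INNER JOIN words z ON z.word_id = y.destination_index
WHERE w.word = 'kind';
```

In the EXPLAIN output, the key column should name an index for each of the three table references if the optimizer is using them.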

You could probably improve on that by using covering indexes, e.g. (word_id, word) on words and (source_index, destination_index) on synonyms.
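A minimal sketch of the covering variant, again assuming MySQL and using made-up index names:

```sql
-- Index names are illustrative.

-- Holds both the join column and the selected column of words,
-- so z.word can be read from the index without touching the table rows.
CREATE INDEX idx_words_id_word ON words (word_id, word);

-- Holds both columns of synonyms that the query uses.
CREATE INDEX idx_synonyms_src_dst ON synonyms (source_index, destination_index);
```

Each composite index starts with the same column as the single-column index suggested earlier, so it can serve the same lookups and the single-column one becomes redundant. In MySQL's EXPLAIN output, "Using index" in the Extra column indicates that a table reference is being answered from the index alone.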

Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange