Question

I have a database with 60+ million records. The current setup is one table with 30+ million records and a couple of small tables with roughly 5 million in each. The data structure is the same for every table. The person who built our search the first time (3-4 years ago, before I was here) used multiple small tables. We are using MATCH ... AGAINST on each table and joining the results together. My boss and he were under the impression that using multiple tables lets MySQL search each table simultaneously. Everything I have read says that one big table would be better, but as the 30+ million table gets bigger it seems to be slowing down significantly, sometimes taking 40+ seconds. Is this slower than it should be?

The select statement:

SELECT $stuff FROM table1 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table2 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table3 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table4 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table5 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table6 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table7 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table8 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table9 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE) UNION ALL
SELECT $stuff FROM table10 WHERE MATCH (Name) AGAINST ('+john +smith' IN BOOLEAN MODE)

The tables are MyISAM and there is a full-text index on the Name column. Table3 is the one that has 30+ million records (approx. 10 GB). Would putting everything in one table, or splitting it up further, make much of a performance difference? Am I missing something else? Or is 60+ million records too big to get a quick response from a full-text search?

Solution

A small note first: there is no real answer other than "change it and try". That said:

If you always query all of your tables and you mostly do reads, I'm pretty sure it would be faster to use one big table.

Using UNION (or UNION ALL) will generally put the individual results in a temporary table (MySQL 5.7 and later can skip it for a plain UNION ALL without a global ORDER BY), and if that table is big enough it will be created on disk. With one big table, the result can be returned directly to the client.

If you do a lot of inserts, they will be faster against a smaller table (as the index to traverse is smaller).

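However, if you could determine which tables could possibly return results and use only those in the query, you could gain a lot from splitting them. This can also be done with partitioning, although MySQL partitioned tables do not support FULLTEXT indexes, so that option would not work with MATCH ... AGAINST as it stands.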
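As a rough illustration of querying only the tables that can match, here is a minimal Python sketch that builds the UNION ALL over a subset of tables. The `TABLES_BY_SCOPE` mapping and the "east"/"west" scopes are purely hypothetical, since the question does not say how the data is divided between tables. The `%s` placeholder assumes a MySQL driver that uses that parameter style.

```python
# Hypothetical mapping from a search scope to the tables that can hold
# matching rows; the real split criterion is not given in the question.
TABLES_BY_SCOPE = {
    "east": ["table1", "table2"],
    "west": ["table3"],
}


def build_union(tables, columns="*"):
    """Return (sql, placeholder_count) for a UNION ALL over the given tables."""
    if not tables:
        raise ValueError("at least one table is required")
    # Table names cannot be bound as parameters, so they must come from a
    # fixed, trusted list such as TABLES_BY_SCOPE, never from user input.
    part = "SELECT {c} FROM {t} WHERE MATCH (Name) AGAINST (%s IN BOOLEAN MODE)"
    sql = " UNION ALL ".join(part.format(c=columns, t=t) for t in tables)
    return sql, len(tables)


def query_for(scope, search):
    """Build the statement and its parameters for one scope."""
    sql, count = build_union(TABLES_BY_SCOPE[scope])
    # The same search string is bound once per SELECT in the union.
    return sql, (search,) * count
```

With this mapping, `query_for("west", "+john +smith")` produces a single SELECT against table3 instead of a ten-way union. Whether that helps depends entirely on being able to rule tables out before querying.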

Also, if you could issue the queries from your application, execute them in parallel, and combine the results outside MySQL, you might gain some performance. But again, you need to try it and measure to really know.
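A minimal sketch of the parallel approach in Python, using a thread pool from the standard library. The `run_query` callable is an assumption: it stands for whatever function in your application opens its own connection, executes one statement, and returns a list of rows. The table names mirror the question, and the `%s` placeholder assumes a driver that uses that parameter style.

```python
from concurrent.futures import ThreadPoolExecutor

# Table names taken from the question (table1 .. table10).
TABLES = [f"table{i}" for i in range(1, 11)]


def build_query(table, columns="*"):
    """Build the per-table full-text query with a placeholder for the search string."""
    # Table names cannot be bound as parameters, so only use fixed, trusted names.
    return (
        f"SELECT {columns} FROM {table} "
        "WHERE MATCH (Name) AGAINST (%s IN BOOLEAN MODE)"
    )


def search_all(run_query, search, tables=TABLES, max_workers=4):
    """Run one query per table concurrently and concatenate the rows.

    run_query(sql, params) is an assumed callable that opens its own
    connection, executes the statement, and returns a list of rows.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_query, build_query(t), (search,)) for t in tables]
        rows = []
        for future in futures:
            # result() waits for the query and re-raises any worker exception.
            rows.extend(future.result())
    return rows
```

Each worker needs its own connection, because a single MySQL connection executes one statement at a time. The total time is then bounded by the slowest table (here the 30+ million row one), so timing this against the single UNION ALL is the only way to know if it is a win.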

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow