How do I add composite MySQL Indexes with denormalization and JOINS?
14-03-2021
Question
I have the following tables:
CREATE TABLE base_event (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    created_by ... -- some columns
);
CREATE TABLE transaction_events (
    event_id BIGINT UNSIGNED NOT NULL,
    transaction_time TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
    merchant_id BIGINT UNSIGNED NULL DEFAULT NULL,
    merchant_city VARCHAR(...) NULL DEFAULT NULL, -- Denormalized
    customer_id BIGINT UNSIGNED NULL DEFAULT NULL,
    customer_ip_address VARCHAR(...) NULL DEFAULT NULL, -- Denormalized
    ...
    FOREIGN KEY (event_id) REFERENCES base_event(id),
    FOREIGN KEY (customer_id) REFERENCES customers(id),
    FOREIGN KEY (merchant_id) REFERENCES merchants(id)
);
CREATE TABLE customers (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    customer_ip_address VARCHAR(...) NULL DEFAULT NULL,
    ...
);
CREATE TABLE merchants (
    id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
    ...
);
And my SELECT:
(SELECT t.*, c.name AS customer_name ...
    FROM transaction_events t
    JOIN customers c ON t.customer_id = c.id
    JOIN merchants m ON t.merchant_id = m.id
    WHERE t.customer_ip_address = 'abc' AND t.transaction_time > 'abc')
UNION DISTINCT
(SELECT t.*, c.name AS customer_name ...
    FROM transaction_events t
    JOIN customers c ON t.customer_id = c.id
    JOIN merchants m ON t.merchant_id = m.id
    WHERE t.merchant_city = 'abc' AND t.transaction_time > 'abc')
And my indexes are:
ALTER TABLE transaction_events
    ADD INDEX index_1 (customer_ip_address, transaction_time),
    ADD INDEX index_2 (merchant_city, transaction_time);
- My query is in this form to avoid OR.
- I've denormalized to a degree for the sake of the indexes.
- I do not need to reference my base_event table for this query.
- The relation of transaction_events to customers and merchants is not 1-to-1 but 1-to-0-or-1.
My questions:
- I can get rid of the wildcard, but transaction_events has around 20 columns. Would it help to create any further indexes to speed up the query?
- Do I need to add any other composite indexes (that potentially reference my FKs) to further improve this query?
Solution
The WHERE clauses refer to t, so it is very likely that the Optimizer will start with t in each SELECT. You have the optimal indexes for them.
Then it needs to reach into the other two tables (merchants and customers) and get 1 (or 0) row from them. Those tables have the optimal index for the JOIN, namely PRIMARY KEY(id) in each case. (The FKs do not play any role in this query.)
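One way to check this reasoning is to run EXPLAIN on a single branch of the UNION. The sketch below reuses the query from the question. The timestamp literal is a placeholder, and the plan described afterwards is what one would expect, not a guaranteed result.

```sql
-- Inspect the plan for the first branch of the UNION.
-- 'abc' and the timestamp are placeholder values.
EXPLAIN
SELECT t.*, c.name AS customer_name
FROM transaction_events t
JOIN customers c ON t.customer_id = c.id
JOIN merchants m ON t.merchant_id = m.id
WHERE t.customer_ip_address = 'abc'
  AND t.transaction_time > '2021-01-01 00:00:00';
```

If the Optimizer behaves as described, the row for t should be listed first with key = index_1, and the rows for c and m should show key = PRIMARY with type = eq_ref. The actual plan depends on the data distribution and the MySQL version.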
t.* might slow things down if it is fetching large TEXT columns that you then ignore.
Since you need all the columns, the only possible inefficiency is that both SELECTs may fetch the same row, only for UNION DISTINCT to deduplicate it. I think that problem is not worth fixing. (The fix would be to have the UNION find and deduplicate only the row identifier, event_id in your schema, and then join back to t to get the other 19 columns. The cost of the extra work may outweigh the benefit; I cannot tell.)
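For reference, the rewrite mentioned in the parenthesis could be sketched roughly as follows. It assumes that event_id uniquely identifies a row in transaction_events (the question's schema has no id column on that table, and does not show event_id declared as a key). The timestamp literal is a placeholder.

```sql
-- Deduplicate on the narrow identifier first, then fetch the wide rows once.
-- Assumption: event_id is unique in transaction_events.
SELECT t.*, c.name AS customer_name
FROM (
    SELECT event_id
    FROM transaction_events
    WHERE customer_ip_address = 'abc'
      AND transaction_time > '2021-01-01 00:00:00'
    UNION DISTINCT
    SELECT event_id
    FROM transaction_events
    WHERE merchant_city = 'abc'
      AND transaction_time > '2021-01-01 00:00:00'
) AS ids
JOIN transaction_events t ON t.event_id = ids.event_id
JOIN customers c ON t.customer_id = c.id
JOIN merchants m ON t.merchant_id = m.id;
```

If event_id were the PRIMARY KEY of an InnoDB table, index_1 and index_2 would implicitly carry it, so each inner SELECT could probably be answered from the index alone. Whether that outweighs the extra join back to t is something only a measurement on real data can settle.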
Other suggestions
Do you need all 20 columns from transaction_events? If not, replacing * with only the columns you need reduces the amount of data you pull back at one time and also reduces the chance of a sub-optimal query plan, since the plan that is generated can vary with the columns in your SELECT clause.
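As an illustration, the first branch with an explicit column list might look like this. Which columns are actually needed is an assumption here; the ones listed are simply those visible in the question's schema, and the timestamp is a placeholder.

```sql
-- Name only the columns the application uses instead of t.*.
SELECT t.event_id,
       t.transaction_time,
       t.merchant_city,
       t.customer_ip_address,
       c.name AS customer_name
FROM transaction_events t
JOIN customers c ON t.customer_id = c.id
JOIN merchants m ON t.merchant_id = m.id
WHERE t.customer_ip_address = 'abc'
  AND t.transaction_time > '2021-01-01 00:00:00';
```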
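You could test adding an index on the customer_id column and another on the merchant_id column for your JOIN clauses and see whether it improves performance and produces a better query plan. (If the table uses InnoDB, an index on each foreign key column is created automatically when no suitable one exists, so check which indexes are already present first.) This will require testing and comparing the EXPLAIN output for each case.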
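A possible way to run that comparison is sketched below. The index names are hypothetical, and the timestamp is a placeholder. Capture the EXPLAIN output before adding the indexes as well, so there is something to compare against.

```sql
-- 1. See which indexes already exist; FK columns may already be indexed.
SHOW INDEX FROM transaction_events;

-- 2. Add the candidate indexes (hypothetical names; skip any already covered).
ALTER TABLE transaction_events
    ADD INDEX idx_customer_id (customer_id),
    ADD INDEX idx_merchant_id (merchant_id);

-- 3. Re-check the plan and compare it with the earlier one.
EXPLAIN
SELECT t.*, c.name AS customer_name
FROM transaction_events t
JOIN customers c ON t.customer_id = c.id
JOIN merchants m ON t.merchant_id = m.id
WHERE t.merchant_city = 'abc'
  AND t.transaction_time > '2021-01-01 00:00:00';
```

If the plan and timings do not change, the extra indexes only add write overhead and can be dropped again.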