Domanda

Non ho alcun controllo su cose come tmp_table_size e max_heap_table_size, e così come i nostri tavoli crescere il tempo impiegato dalla query che necessitano di tabelle temporanee sta crescendo geometricamente.

Mi chiedo se c'è un modo per evitare che MySQL da utilizzare tabelle temporanee per queste ricerche? Quale sarebbe l'approccio migliore in questa situazione:

Ecco un esempio del più grande autore del reato:

SELECT `skills`.`id`
FROM (`jobs_skills`)
JOIN `jobs` ON (`jobs`.`id` = `jobs_skills`.`job_id`)
JOIN `skills` ON (`skills`.`id` = `jobs_skills`.`skill_id`)
WHERE `jobs`.`job_visibility_id` = 1
AND `jobs`.`active` = 1
AND `skills`.`valid` = 1
AND `jobs_skills`.`skill_id` IN (96,101,103,108,121,2610,99,119,2607,102,104,112,113,122,1032,1488,2608,109,126,1438,2310,2318,2622,118,1046,1387,2609,100,116,123,2611,2612,2616,2618,114,127,1562,1587,1608,2276,2615,125,1070,1071,1161,1658,2613,2614,2617,105,110,111,120,1394,1435)
GROUP BY `jobs_skills`.`job_id`

per il quale copying to temp table ha preso 107 secondi, il 99% del tempo totale di query.

nonostante i timori di tl; dr sindrome, sto offrendo. . .

maggiori informazioni

Ecco la dichiarazione EXPLAIN per la query:

+----+-------------+-------------+--------+----------------------+--------------+---------+----------------------------------+--------+----------------------------------------------+
| id | select_type | table       | type   | possible_keys        | key          | key_len | ref                              | rows   | Extra                                        |
+----+-------------+-------------+--------+----------------------+--------------+---------+----------------------------------+--------+----------------------------------------------+
|  1 | SIMPLE      | jobs        | ref    | PRIMARY,active_index | active_index | 1       | const                            | 468958 | Using where; Using temporary; Using filesort |
|  1 | SIMPLE      | jobs_skills | ref    | PRIMARY              | PRIMARY      | 4       | 557574_prod.jobs.id              |      1 | Using where; Using index                     |
|  1 | SIMPLE      | skills      | eq_ref | PRIMARY              | PRIMARY      | 4       | 557574_prod.jobs_skills.skill_id |      1 | Using where                                  |
+----+-------------+-------------+--------+----------------------+--------------+---------+----------------------------------+--------+----------------------------------------------+

e qui ci sono le dichiarazioni CREATE TABLE per le relative tabelle:

| jobs  | CREATE TABLE `jobs` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `user_id` int(10) unsigned NOT NULL,
  `title` varchar(40) NOT NULL,
  `description` text NOT NULL,
  `address_id` int(10) unsigned NOT NULL,
  `proximity` smallint(3) unsigned NOT NULL default '15',
  `job_payrate_id` tinyint(1) unsigned NOT NULL default '1',
  `payrate` int(10) unsigned NOT NULL,
  `start_date` int(10) unsigned NOT NULL,
  `job_start_id` tinyint(1) unsigned NOT NULL default '1',
  `duration` tinyint(1) unsigned NOT NULL COMMENT 'Full-time, Part-time, Flexible',
  `posting_date` int(10) unsigned NOT NULL,
  `revision_date` int(10) unsigned NOT NULL,
  `expiration` int(10) unsigned NOT NULL,
  `active` tinyint(1) unsigned NOT NULL default '1',
  `team_size` tinyint(2) unsigned NOT NULL default '1',
  `job_type_id` tinyint(1) unsigned NOT NULL default '1',
  `job_shift_id` tinyint(1) unsigned NOT NULL default '1',
  `job_visibility_id` tinyint(1) unsigned NOT NULL default '1',
  `position_count` smallint(5) unsigned NOT NULL default '1',
  `impressions` int(10) unsigned NOT NULL default '0',
  `clicks` int(10) unsigned NOT NULL default '0',
  `employer_email` varchar(100) NOT NULL default '',
  `job_source_id` smallint(6) unsigned NOT NULL default '0',
  `job_password` varchar(50) NOT NULL default '',
  PRIMARY KEY  (`id`),
  KEY `active_index` (`active`),
  KEY `user_id_index` (`user_id`),
  KEY `address_id_index` (`address_id`),
  KEY `posting_date_index` USING BTREE (`posting_date`)
) ENGINE=InnoDB AUTO_INCREMENT=875013 DEFAULT CHARSET=utf8

-

| jobs_skills | CREATE TABLE `jobs_skills` (
  `job_id` int(10) unsigned NOT NULL,
  `skill_id` int(10) unsigned NOT NULL,
  `required` tinyint(1) unsigned NOT NULL,
  PRIMARY KEY  (`job_id`,`skill_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 |

-

| skills | CREATE TABLE `skills` (
  `id` int(10) unsigned NOT NULL auto_increment,
  `parent_id` int(10) unsigned NOT NULL,
  `name` varchar(35) NOT NULL default '',
  `description` varchar(250) NOT NULL,
  `valid` tinyint(1) unsigned NOT NULL default '0',
  `is_category` tinyint(1) unsigned NOT NULL default '0',
  `last_edited` int(10) unsigned NOT NULL default '0',
  `impressions` int(10) unsigned NOT NULL default '0',
  `clicks` int(10) unsigned NOT NULL default '0',
  `jobs` int(10) unsigned NOT NULL default '0',
  PRIMARY KEY  (`id`),
  KEY `name` (`name`),
  KEY `parent` (`parent_id`)
) ENGINE=InnoDB AUTO_INCREMENT=2657 DEFAULT CHARSET=utf8 |

Come ho detto, questo non è l'unico query con questo problema, in modo da tutto il consiglio generale sarebbe più utile, anche se non voglio rifiutare qualsiasi consulenza specifica a questa domanda.

È stato utile?

Soluzione

Il tuo subquery originale:

SELECT `skills`.`id`
FROM (`jobs_skills`)
JOIN `jobs` ON (`jobs`.`id` = `jobs_skills`.`job_id`)
JOIN `skills` ON (`skills`.`id` = `jobs_skills`.`skill_id`)
WHERE `jobs`.`job_visibility_id` = 1
AND `jobs`.`active` = 1
AND `skills`.`valid` = 1
AND `jobs_skills`.`skill_id` IN (96,101,103,108,121,2610,99,119,2607,102,104,112,113,122,1032,1488,2608,109,126,1438,2310,2318,2622,118,1046,1387,2609,100,116,123,2611,2612,2616,2618,114,127,1562,1587,1608,2276,2615,125,1070,1071,1161,1658,2613,2614,2617,105,110,111,120,1394,1435)
GROUP BY `jobs_skills`.`job_id`

È necessario rafactor la query in modo tale che una di controllare e microgestire le tabelle temporanee in fase di creazione e le loro dimensioni. Basata esclusivamente sulla JOIN, WHERE e clausole GROUP BY, è necessario implementare le seguenti modifiche:

lavori deve essere indicizzato su job_visibility_id, attiva, id

Necessario sottoquery

(SELECT id job_id FROM jobs WHERE job_visibility_id=1 AND active=1 ORDER BY id)

competenze deve essere indicizzato su valida, id

Necessario sottoquery

(SELECT id skill_id FROM skills WHERE valid=1 ORDER BY id)

jobs_skills deve essere indicizzato su skill_id, job_id

Necessario sottoquery

(SELECT job_id FROM jobs_skills WHERE skill_id IN (96,101,103,108,121,2610,99,119,2607,102,104,112,113,122,1032,1488,2608,109,126,1438,2310,2318,2622,118,1046,1387,2609,100,116,123,2611,2612,2616,2618,114,127,1562,1587,1608,2276,2615,125,1070,1071,1161,1658,2613,2614,2617,105,110,111,120,1394,1435) ORDER BY skill_id,job_id)

SQL per creare gli indici necessari

ALTER TABLE jobs ADD INDEX (job_visibility_id,active,id);
ALTER TABLE skills ADD INDEX (valid,id);
ALTER TABLE jobs_skills ADD INDEX (skill_id,job_id);

Ora combinare le subquery per formare VOLTRON

SELECT skill_id
FROM (SELECT JS.*
FROM (SELECT skill_id,job_id FROM jobs_skills WHERE skill_id IN (96,101,103,108,121,2610,99,119,2607,102,104,112,113,122,1032,1488,2608,109,126,1438,2310,2318,2622,118,1046,1387,2609,100,116,123,2611,2612,2616,2618,114,127,1562,1587,1608,2276,2615,125,1070,1071,1161,1658,2613,2614,2617,105,110,111,120,1394,1435) ORDER BY skill_id,job_id) JS
INNER JOIN
(SELECT id job_id FROM jobs WHERE job_visibility_id=1 AND active=1 ORDER BY id) J
USING (job_id) INNER JOIN
(SELECT id skill_id FROM skills WHERE valid=1 ORDER BY id) S USING (skill_id)
) A
GROUP BY job_id;

fare un tentativo !!!

A proposito, se la sintassi è corretta, cercherò di regolarlo !!!

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a dba.stackexchange
scroll top