Question

I'm working with PostgreSQL (I'm a rookie in the database world) and I'd like your opinion on the efficiency of a kind of query I found in the code I'm working with. These queries have a lot of JOINs, and one of the joined tables (marked /* here */) has many rows per request. This forces us to GROUP BY request.id in order to get one row per request, plus a field (also marked /* here */) that combines the data from all of those rows.

I suspect this kind of query wastes a lot of time computing all these maximums, but I can't figure out an alternative way of doing it. Any ideas on its efficiency and how to improve it?

SELECT
  request.id AS id,
  max(request_type.name) AS request_type,
  to_char(max(request.timestamp),'DD/mm/YYYY HH24:mi') AS timestamp,
  to_char(max(request.timestamp),'YYYY-mm-DD') AS timestamp_filtering,
  max(state.name) AS request_state,
  max(users.name || ' ' || COALESCE(users.surname,'')) AS create_user,
  max(request.id_create_user) AS id_create_user,
  max(enterprise.name) AS enterprise,
  max(cause_issue.name) AS cause,
  max(request_movements.id_request_state) AS id_state,
  array_to_string(array_agg(DISTINCT act_code.name || '/' || req_res.act_code), ', ') AS act_code, /* here */
  max(revised.code) AS state_revised, 
  max(request_shipment.warehouse) AS warehouse,
  max(req_res.id_warehouse) AS id_warehouse
FROM
  request
  LEFT JOIN users
    ON users.id=request.id_create_user
  LEFT JOIN enterprise
    ON users.id_enterprise=enterprise.id
  LEFT JOIN request_movements
    ON request_movements.id=request.id_request_movement
  LEFT JOIN request_versions
    ON request_versions.id = request_movements.id_version
  LEFT JOIN state
    ON request_movements.id_request_state=state.id
  INNER JOIN request_type
    ON request.id_request_type=request_type.id
  LEFT JOIN cause_issue
    ON request.id_cause_issue=cause_issue.id
  LEFT JOIN request_reserve req_res
    ON req_res.id_request = request.id /* here */
  LEFT JOIN act_code
    ON req_res.id_act_code=act_code.id
  LEFT JOIN request_shipment
    ON (request_shipment.id_request=request.id)
  LEFT JOIN warehouse_enterprise
    ON (warehouse_enterprise.id = request_shipment.id_warehouse_enterprise)
  LEFT JOIN revised
    ON (revised.id = request_shipment.id_revised)
WHERE
  request.id_request_type = "any_type"  
GROUP BY
  request.id

The EXPLAIN returns this.


Solution

You can simplify this query considerably by aggregating the values from request_reserve and act_code in a subquery before you join the result to the rest of the tables. This removes the need for aggregate functions on all the other columns and should generally be much faster for a larger number of rows.

SELECT r.id
      ,rt.name AS request_type
      ,to_char(r.timestamp, 'DD/mm/YYYY HH24:mi') AS timestamp
      ,to_char(r.timestamp, 'YYYY-mm-DD') AS timestamp_filtering
      ,s.name AS request_state
      ,u.name || COALESCE(' ' || u.surname, '') AS create_user
      ,r.id_create_user
      ,e.name AS enterprise
      ,c.name AS cause
      ,rm.id_request_state AS id_state
      ,rr.act_code
      ,rd.code AS state_revised
      ,rs.warehouse
      ,rr.id_warehouse
FROM      request              r
LEFT JOIN users                u  ON u.id = r.id_create_user
LEFT JOIN enterprise           e  ON e.id = u.id_enterprise
LEFT JOIN request_movements    rm ON rm.id = r.id_request_movement
LEFT JOIN request_versions     rv ON rv.id = rm.id_version
LEFT JOIN state                s  ON s.id = rm.id_request_state
     JOIN request_type         rt ON rt.id = r.id_request_type
LEFT JOIN cause_issue          c  ON c.id = r.id_cause_issue
LEFT JOIN request_shipment     rs ON rs.id_request = r.id
LEFT JOIN warehouse_enterprise w  ON w.id = rs.id_warehouse_enterprise
LEFT JOIN revised              rd ON rd.id = rs.id_revised
LEFT JOIN (
   SELECT rr.id_request, rr.id_warehouse
         ,array_to_string(array_agg(
             DISTINCT a.name || '/' || rr.act_code), ', ') AS act_code
   FROM   request_reserve rr
   LEFT   JOIN act_code   a ON rr.id_act_code = a.id
   GROUP  BY rr.id_request, rr.id_warehouse
   )  rr ON rr.id_request = r.id
WHERE  r.id_request_type = 'any_type';  -- use single quotes for values!

For big queries it is essential that you have a format the human eye can easily parse. Therefore I reformatted before I improved the query. I use table aliases to avoid unwieldy identifiers as much as possible.
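The aggregate-before-join idea is easier to see in isolation. The following is only a minimal sketch under stated assumptions: the tables req and req_item and their columns are hypothetical stand-ins for request and request_reserve, and string_agg() (available in PostgreSQL 9.0 or later) is swapped in as a shorter alternative to array_to_string(array_agg(...)), not something the original query uses.

```sql
-- Hypothetical miniature schema; names are illustrative only.
CREATE TEMP TABLE req      (id int PRIMARY KEY, label text);
CREATE TEMP TABLE req_item (id_request int, code text);

INSERT INTO req      VALUES (1, 'first'), (2, 'second');
INSERT INTO req_item VALUES (1, 'A'), (1, 'B'), (1, 'A');

-- Collapse the "many" side to one row per request first, then join.
-- The outer query needs neither GROUP BY nor max() on its columns.
SELECT r.id, r.label, i.codes
FROM   req r
LEFT   JOIN (
   SELECT id_request
        , string_agg(DISTINCT code, ', ') AS codes  -- duplicates removed
   FROM   req_item
   GROUP  BY id_request
   ) i ON i.id_request = r.id
ORDER  BY r.id;
```

Request 2 has no rows in req_item, so the LEFT JOIN keeps it with a NULL in codes. The subquery must group only by the join key for the outer query to stay at one row per request; grouping by additional columns, as the subquery in the full query does with id_warehouse, can return several rows for a request that has more than one distinct value in them.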

Minor improvement to create_user: no trailing space. If either part of the name can be NULL, I suggest this to avoid a dangling space:

COALESCE(u.name || ' ' || u.surname, u.name, u.surname)

In PostgreSQL 9.1 or later you could use concat_ws().
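As a quick illustration of how concat_ws() behaves here, the sketch below runs it over a few hypothetical sample rows (the VALUES list stands in for the users table). The function skips NULL arguments, so no stray separator is left behind:

```sql
-- Hypothetical sample rows standing in for users(name, surname).
SELECT name, surname
     , concat_ws(' ', name, surname) AS create_user  -- NULL parts are skipped
FROM  (VALUES ('Ada', 'Lovelace')
            , ('Ada', NULL)
            , (NULL , 'Lovelace')
      ) AS u(name, surname);
```

In the query above, the equivalent expression would be concat_ws(' ', u.name, u.surname). Note that concat_ws() only skips NULL values; an empty string still counts as a part and keeps its separator.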

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow