Question

Is there a way to set values in specific positions inside an array, based on information from other columns? (Postgres 9.3 or later.)

For example, I would like to select an item and its stock information from the following tables:

Table item:

CREATE TABLE item (
  id integer NOT NULL
);

INSERT INTO item VALUES
 (1), (2), (3), (4);

Table item_stock (containing shop-specific information like stock and prices):

CREATE TABLE item_stock (
    item_id integer NOT NULL,
    shop_id integer NOT NULL,
    stock integer,
    cost numeric(19,3)
);

INSERT INTO item_stock VALUES
  (1, 1, 2, 10),
  (1, 2, 0, 9),
  (2, 2, 0, 9),
  (3, 1, 3, 22);

SQLFiddle

I am looking for a query that produces the following result, where the array in column stock holds the stock for specific shops. In the example, array position 1 is the stock for shop_id=1 and array position 2 is the stock for shop_id=2, with 0 instead of NULL where no data is found:

id | stock
---+-------
1  | {2, 0}
2  | {0, 0}
3  | {3, 0}
4  | {0, 0}

Solution

Your answer basically gets the job done:

SELECT b.id, array_agg(b.stock) AS stock
FROM  (
   SELECT i.id, COALESCE(i_s.stock, 0) AS stock
   FROM   item i
   CROSS  JOIN unnest('{1,2}'::int[]) n
   LEFT   JOIN item_stock i_s ON i.id = i_s.item_id AND n.n = i_s.shop_id
   ORDER  BY i.id, n.n
   ) b
GROUP  BY b.id;

Two notable changes:

  1. Order is not guaranteed without ORDER BY in the subquery or as an additional clause in array_agg() (typically more expensive). And that's the core element of your question.

  2. unnest('{1,2}'::int[]) instead of generate_series(1,2) as requested shop IDs will hardly be sequential all the time.
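Point 1 mentions ORDER BY as an additional clause in the aggregate call. A minimal sketch of that variant against the tables from the question; it replaces the sorted subquery and, as noted, is typically the more expensive option:

```sql
-- Sort inside the aggregate instead of relying on a pre-sorted subquery.
-- Array position follows the order of the requested shop IDs (n.n).
SELECT i.id
     , array_agg(COALESCE(i_s.stock, 0) ORDER BY n.n) AS stock
FROM   item i
CROSS  JOIN unnest('{1,2}'::int[]) n
LEFT   JOIN item_stock i_s ON i_s.item_id = i.id
                          AND i_s.shop_id = n.n
GROUP  BY i.id
ORDER  BY i.id;  -- only sorts the result rows, not the array elements
```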

I also moved the set-returning function from the SELECT list to a separate table expression attached with CROSS JOIN. That is the standard SQL form, but it is just a matter of clarity and taste, not a necessity, at least in Postgres 10 or later.

Doing the same with CROSS JOIN LATERAL and an ARRAY constructor might be a bit faster, as we don't need the outer GROUP BY and the ARRAY constructor is typically faster, too:

SELECT i.id, s.stock
FROM   item i
CROSS  JOIN LATERAL (
   SELECT ARRAY(
      SELECT COALESCE(i_s.stock, 0)
      FROM   unnest('{1,2}'::int[]) n
      LEFT   JOIN item_stock i_s ON i_s.shop_id = n.n
                                AND i_s.item_id = i.id
      ORDER  BY n.n
      ) AS stock
   ) s;

And if you have more than just the two shops, a nested crosstab() should provide top performance:

SELECT i.id, COALESCE(stock, '{0,0}') AS stock
FROM   item i
LEFT   JOIN (
   SELECT id, ARRAY[COALESCE(shop1, 0), COALESCE(shop2, 0)] AS stock
   FROM   crosstab(
     $$SELECT item_id, shop_id, stock
       FROM   item_stock
       WHERE  shop_id = ANY ('{1,2}'::int[])
       ORDER  BY 1,2$$

     , $$SELECT unnest('{1,2}'::int[])$$
      ) AS ct (id int, shop1 int, shop2 int)
   ) i_s USING (id);

Needs to be adapted in more places to cater for different shop IDs.
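To illustrate which places those are, here is a sketch of the same query adapted for three shops. The shop IDs 1, 2 and 5 are an assumption for the sake of the example (shop 5 does not exist in the sample data, so its position is filled with 0). The numbered comments mark the five spots that have to stay in sync. crosstab() comes with the additional module tablefunc, which has to be installed once per database:

```sql
-- crosstab() is provided by the additional module tablefunc
CREATE EXTENSION IF NOT EXISTS tablefunc;

SELECT i.id, COALESCE(stock, '{0,0,0}') AS stock           -- (1) one 0 per shop
FROM   item i
LEFT   JOIN (
   SELECT id
        , ARRAY[COALESCE(shop1, 0)
              , COALESCE(shop2, 0)
              , COALESCE(shop5, 0)] AS stock               -- (2) one element per shop
   FROM   crosstab(
     $$SELECT item_id, shop_id, stock
       FROM   item_stock
       WHERE  shop_id = ANY ('{1,2,5}'::int[])             -- (3) filter on the shop IDs
       ORDER  BY 1,2$$
   , $$SELECT unnest('{1,2,5}'::int[])$$                   -- (4) categories, same order
      ) AS ct (id int, shop1 int, shop2 int, shop5 int)    -- (5) one output column per shop
   ) i_s USING (id);
```

The column names shop1, shop2 and shop5 are arbitrary. Only their number and order have to match the list of shop IDs.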

db<>fiddle here

Index

Make sure you have at least an index on item_stock (shop_id, item_id), typically provided by a PRIMARY KEY on those columns. For the crosstab query, it also matters that shop_id comes first.

Adding stock as another index column may allow faster index-only scans. In Postgres 11 or later, consider adding an INCLUDE clause to the PK like so:

PRIMARY KEY (shop_id, item_id) INCLUDE (stock)

But only if you need it a lot, as it makes the index a bit bigger and possibly more susceptible to bloat from updates.
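As a rough sketch, the options above could be spelled out like this for the table from the question. These are alternatives, so pick one. The index name is made up for the example, and adding a primary key assumes there are no duplicate (shop_id, item_id) pairs:

```sql
-- Option 1: plain primary key, shop_id first
ALTER TABLE item_stock
   ADD PRIMARY KEY (shop_id, item_id);

-- Option 2 (Postgres 11 or later): primary key carrying stock as payload column
-- ALTER TABLE item_stock
--    ADD PRIMARY KEY (shop_id, item_id) INCLUDE (stock);

-- Option 3 (older versions): separate multicolumn index with stock appended
-- CREATE INDEX item_stock_shop_item_stock_idx
--    ON item_stock (shop_id, item_id, stock);
```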

OTHER TIPS

This is the query I was able to come up with (using some brute force):

SELECT b.id, array_agg(b.stock) FROM (
  SELECT a.*, COALESCE(i_s.stock, 0) as stock FROM (
    SELECT id, generate_series(1, 2) as n FROM item
  ) as a
  LEFT OUTER JOIN item_stock i_s ON a.id = i_s.item_id AND a.n = i_s.shop_id
) as b GROUP by b.id;
Licensed under: CC-BY-SA with attribution
Not affiliated with dba.stackexchange