Question

Consider a database with these three tables:

category:

cat_id  name        parent_id
-----------------------
1       drinks      0
2       carbonated  1
3       cola        2
4       water       1
5       rc-cola     3

product:

prod_id  name           default_cat
-----------------------------------
1        cola-zero      2
2        mineral water  4

cat_prod:

cat_id  prod_id
---------------
1       1
2       1
3       1
4       2

We have a category hierarchy and products, each of which may belong to several categories.

Also, each product has a default category. In this case the cola-zero product has default category 2 (carbonated), which is a mistake. The default category has to be 3 (cola), i.e., the lowest category in the category tree. However, only a subset of the category tree should be considered: the categories that the product belongs to.

I need to update the default category of each product in the product table and ensure that each product's default category is the most "defined" one, i.e., the lowest in the tree among that product's categories.

I could write a script that retrieves all categories, builds the tree in memory, and then checks each product's default category against this tree. But I hope there is a smarter way to do this in SQL only.

Is it even possible to do it in pure SQL?

Thanks.


Solution 2

I finally solved it. The solution is a bit dirty, since I have to create a scratch table to hold intermediate results, but it works.

Here is the full code:

-- schema
CREATE TABLE category
(
  cat_id INT NOT NULL,
  name VARCHAR(255) NOT NULL,
  parent_id INT NOT NULL,
  PRIMARY KEY (cat_id)
);

GO

CREATE TABLE product
(
  prod_id INT NOT NULL,
  name VARCHAR(255) NOT NULL,
  default_cat INT NOT NULL,
  PRIMARY KEY (prod_id)
);

GO

CREATE TABLE cat_prod
(
  cat_id INT NOT NULL,
  prod_id INT NOT NULL,
  PRIMARY KEY (cat_id, prod_id),
  FOREIGN KEY (cat_id) REFERENCES category(cat_id),
  FOREIGN KEY (prod_id) REFERENCES product(prod_id)
);

GO

-- data
INSERT INTO category (cat_id, name, parent_id)
VALUES
  (1, 'drinks', 0),
  (2, 'carbonated', 1),
  (3, 'cola', 2),
  (4, 'water', 1),
  (5, 'rc-cola', 3)
;

GO

INSERT INTO product (prod_id, name, default_cat)
VALUES
  (1, 'cola-zero', 2), -- this is a mistake! must be 3
  (2, 'mineral water', 4) -- this one should stay intact
;

GO

INSERT INTO cat_prod (cat_id, prod_id)
VALUES
  (1, 1),
  (2, 1),
  (3, 1),
  (4, 2),
  (4, 1)
;

GO

-- stored proc
CREATE PROCEDURE iterate_products()
BEGIN
    DECLARE prod_id INT;
    DECLARE default_cat INT;
    DECLARE new_default_cat INT;
    DECLARE done INT DEFAULT FALSE;
    DECLARE cur CURSOR FOR SELECT p.prod_id, p.default_cat FROM product p;
    DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;

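    -- scratch table to hold the category subtree for a given product
    -- (a regular table, because MySQL cannot reference a TEMPORARY table
    -- twice in one query, and the leaf lookup below self-joins it)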
    CREATE TABLE IF NOT EXISTS tmp_category_sub_tree
    (
        cat_id INT NOT NULL,
        parent_id INT NOT NULL
    );

    OPEN cur;

    UPDATE_LOOP: LOOP
        FETCH cur INTO prod_id, default_cat;
        IF done THEN
            LEAVE UPDATE_LOOP;
        END IF;

        TRUNCATE TABLE tmp_category_sub_tree;

        -- select all categories this product belongs to
        INSERT INTO tmp_category_sub_tree (cat_id, parent_id)
            SELECT category.cat_id, category.parent_id
            FROM category
            INNER JOIN cat_prod
              ON category.cat_id = cat_prod.cat_id
            WHERE
              cat_prod.prod_id = prod_id;

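        -- select a leaf (only one; which one is arbitrary if there are several).
        -- Reset first so a product without categories does not reuse
        -- the value found for the previous product.
        SET new_default_cat = NULL;
        SELECT t1.cat_id FROM
            tmp_category_sub_tree AS t1 LEFT JOIN tmp_category_sub_tree AS t2
        ON t1.cat_id = t2.parent_id
        WHERE
            t2.cat_id IS NULL
        LIMIT 1
        INTO new_default_cat;
        -- an empty result also fires the NOT FOUND handler; undo that
        -- so the cursor loop does not stop early
        SET done = FALSE;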

        -- update product record, if required
        IF default_cat != new_default_cat THEN
            UPDATE product
            SET default_cat = new_default_cat
            WHERE
                product.prod_id = prod_id;
        END IF;

    END LOOP;

    CLOSE cur;

    DROP TABLE tmp_category_sub_tree;
END;

GO

Here is the SQLFiddle link: http://sqlfiddle.com/#!2/98a45/1
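
For comparison, the same idea can probably be written as one set-based statement, without the cursor and the scratch table. This is only a sketch against the schema above, not something taken from the fiddle. It treats a category as a leaf for a product when none of the product's other categories has it as parent. The MIN() tie-break is my own assumption for products that have several leaves.

```sql
-- Sketch: point each product at a leaf of its own category subset.
UPDATE product AS p
JOIN (
    SELECT cp.prod_id,
           MIN(cp.cat_id) AS leaf_cat      -- assumed tie-break if several leaves exist
    FROM cat_prod AS cp
    WHERE NOT EXISTS (
        -- is there another category of the same product
        -- whose parent is this category?
        SELECT 1
        FROM cat_prod AS cp2
        JOIN category AS child ON child.cat_id = cp2.cat_id
        WHERE cp2.prod_id = cp.prod_id
          AND child.parent_id = cp.cat_id
    )
    GROUP BY cp.prod_id
) AS leaf ON leaf.prod_id = p.prod_id
SET p.default_cat = leaf.leaf_cat
WHERE p.default_cat <> leaf.leaf_cat;      -- touch only rows that actually change
```

With the sample data above, cola-zero ends up with default_cat 3 and mineral water stays at 4. Products with no rows in cat_prod are left untouched, because the join finds no leaf for them.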

OTHER TIPS

If you store the hierarchy in a Closure Table, it's really easy to find the lowest node(s) in the tree:

SELECT c.descendant FROM closure c
JOIN (SELECT MAX(pathlength) AS pathlength FROM closure) x USING (pathlength); 
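
The closure table itself is not shown here, so the following is a hedged sketch of how it could be built from the category table in the question. The table definition is an assumption that only reuses the column names from the query above (ancestor, descendant, pathlength), and WITH RECURSIVE needs MySQL 8.0 or later.

```sql
-- Hypothetical closure table; column names follow the query above.
CREATE TABLE closure
(
  ancestor   INT NOT NULL,
  descendant INT NOT NULL,
  pathlength INT NOT NULL,
  PRIMARY KEY (ancestor, descendant)
);

-- Fill it from category(cat_id, parent_id).
INSERT INTO closure (ancestor, descendant, pathlength)
WITH RECURSIVE paths (ancestor, descendant, pathlength) AS (
    -- every category reaches itself with a path of length 0
    SELECT cat_id, cat_id, 0
    FROM category
    UNION ALL
    -- extend each known path by one child of its current end point
    SELECT p.ancestor, c.cat_id, p.pathlength + 1
    FROM paths AS p
    JOIN category AS c ON c.parent_id = p.descendant
)
SELECT ancestor, descendant, pathlength
FROM paths;
```

For the five sample categories, the longest path is drinks to rc-cola with pathlength 3, so the query above would return descendant 5.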

To find the lowest node of a subtree, you just need to be specific about the starting node of the branch you want to search. The maximum path length has to be computed within that branch as well:

SELECT c.descendant FROM closure c
JOIN (SELECT MAX(pathlength) AS pathlength FROM closure WHERE ancestor = 2) x USING (pathlength)
WHERE c.ancestor = 2;
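
To tie this back to the original question, here is a rough sketch of how a closure table could drive the update of product.default_cat. It assumes the product and cat_prod tables from the question and a closure table with the columns used above. The depth of a category is taken as the longest path that ends at it, and the ORDER BY on cat_id is just an assumed tie-break.

```sql
-- Sketch: pick, per product, the deepest category it belongs to.
UPDATE product AS p
SET p.default_cat = (
    SELECT cp.cat_id
    FROM cat_prod AS cp
    JOIN closure AS c ON c.descendant = cp.cat_id
    WHERE cp.prod_id = p.prod_id
    GROUP BY cp.cat_id
    -- MAX(pathlength) over all ancestors = distance from the root
    ORDER BY MAX(c.pathlength) DESC, cp.cat_id
    LIMIT 1
)
-- skip products without categories, since default_cat is NOT NULL
WHERE EXISTS (
    SELECT 1
    FROM cat_prod AS cp0
    WHERE cp0.prod_id = p.prod_id
);
```

Unlike a plain leaf lookup, this prefers the deepest category when a product sits in more than one branch of the tree.
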
Licensed under: CC-BY-SA with attribution