Question

I have a somewhat complex database structure running that tracks products. Here is a diagram of it generated by MySQL Workbench:

[Image: Products Diagram]

With this structure I have added three products. All three have the attribute color with the option red. I have a SQL Fiddle set up here: http://sqlfiddle.com/#!2/68470/4 showing the query I'm running. I want the opt_count column to say 3 on rows where the attribute column is color and the option column is red.

Nearly all the other opt_count values are wrong too, so I suspect I'm either not grouping by the correct column or approaching the whole problem incorrectly.

How can I get the correct opt_count to show for each row?


Solution

As others have said, your schema is the root of the problem: products and options are in a many-to-many relationship (many products may have many options), which makes queries like this more difficult.

Here is a query that gives you the output you asked for. It shows each option, the number of unique products that option is assigned to (the COUNT(DISTINCT product_id)), and a comma-separated list of the product_id values it is assigned to.

SELECT pvo.option, 
       count(distinct product_id), 
       group_concat(distinct product_id) products
  FROM (`products`)
  JOIN `product_variant_combinations` pvc using(`product_id`)
  JOIN `product_variants` pv using(`combination_id`)
  JOIN `product_variant_ao_relation` pv_ao using(`ao_id`)
  JOIN `product_variant_options` pvo using(`option_id`)
  JOIN `product_variant_attributes` pva using(`attribute_id`)
 group by pvo.option;

This is the output for red:

red 3 111026,111025,111024

See here: http://sqlfiddle.com/#!2/68470/133
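Why DISTINCT matters here can be shown on a reduced data set. The following is only a sketch: it uses Python's sqlite3 module as a stand-in for MySQL, and the flattened variant_rows table, its option_name column, and the sample rows are made up for illustration (they are not the schema from the question). Because one product can appear in several joined rows, a plain COUNT counts rows, while COUNT(DISTINCT product_id) counts products.

```python
import sqlite3

# Hypothetical, flattened stand-in for the joined tables in the question:
# one row per (product, combination, option). SQLite replaces MySQL here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant_rows (
        product_id     INTEGER,
        combination_id INTEGER,
        option_name    TEXT
    );
    INSERT INTO variant_rows VALUES
        (111024, 1, 'red'),
        (111025, 4, 'red'),
        (111025, 4, 'wood'),
        (111026, 6, 'red'),
        (111026, 7, 'red'),
        (111026, 8, 'red');
""")

# COUNT(product_id) counts joined rows; COUNT(DISTINCT ...) counts products.
rows = conn.execute("""
    SELECT option_name,
           COUNT(product_id)          AS row_count,
           COUNT(DISTINCT product_id) AS product_count
      FROM variant_rows
     GROUP BY option_name
     ORDER BY option_name
""").fetchall()

for option_name, row_count, product_count in rows:
    print(option_name, row_count, product_count)
```

In this sample, red appears in five rows but belongs to only three products, which is the same kind of inflation the original opt_count suffered from.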

You asked how to add attribute:

SELECT pva.attribute, pvo.option, count(distinct product_id), group_concat(distinct product_id)
FROM (`products`)
JOIN `product_variant_combinations` pvc using(`product_id`)
JOIN `product_variants` pv using(`combination_id`)
JOIN `product_variant_ao_relation` pv_ao using(`ao_id`)
JOIN `product_variant_options` pvo using(`option_id`)
JOIN `product_variant_attributes` pva using(`attribute_id`)
group by pva.attribute, pvo.option;

You must GROUP BY each non-aggregate expression in the SELECT clause. In this case the two aggregate expressions are COUNT and GROUP_CONCAT, so the remaining expressions are the ones to group by: GROUP BY pva.attribute, pvo.option.

You probably want to find a good SQL tutorial on GROUP BY.
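As a small illustration of that rule, here is a sketch that groups by both non-aggregate columns. It again uses Python's sqlite3 module instead of MySQL, and the variant_rows table, the option_name column, and the sample rows are hypothetical simplifications rather than the real schema.

```python
import sqlite3

# Hypothetical flattened data: one row per (product, attribute, option).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant_rows (
        product_id  INTEGER,
        attribute   TEXT,
        option_name TEXT
    );
    INSERT INTO variant_rows VALUES
        (111024, 'color',    'red'),
        (111025, 'color',    'red'),
        (111025, 'material', 'wood'),
        (111026, 'color',    'red'),
        (111026, 'material', 'cotton'),
        (111026, 'size',     'small'),
        (111026, 'size',     'medium');
""")

# Both non-aggregate SELECT expressions appear in GROUP BY, so each
# (attribute, option) pair yields exactly one row with its own count.
counts = conn.execute("""
    SELECT attribute, option_name, COUNT(DISTINCT product_id) AS opt_count
      FROM variant_rows
     GROUP BY attribute, option_name
     ORDER BY attribute, option_name
""").fetchall()

for row in counts:
    print(row)
```

Each output row corresponds to one distinct (attribute, option) pair, and only the aggregate column is computed across the rows of that group.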

OTHER TIPS

See if this helps:

SELECT products.product_name
, products.product_id
, pvc.combination_id
, pvc.combination
, pva.attribute
, pvo.option
, COUNT(pvo.option) as opt_count
FROM (`products`)
JOIN `product_variant_combinations` pvc ON `products`.`product_id` = `pvc`.`product_id`
JOIN `product_variants` pv ON `pv`.`combination_id` = `pvc`.`combination_id`
JOIN `product_variant_ao_relation` pv_ao ON `pv_ao`.`ao_id` = `pv`.`ao_id`
JOIN `product_variant_options` pvo ON `pvo`.`option_id` = `pv_ao`.`option_id`
JOIN `product_variant_attributes` pva ON `pva`.`attribute_id` = `pv_ao`.`attribute_id`
GROUP BY 1

Returns:

| PRODUCT_NAME | PRODUCT_ID | COMBINATION_ID |                                        COMBINATION | ATTRIBUTE | OPTION | OPT_COUNT |
|--------------|------------|----------------|----------------------------------------------------|-----------|--------|-----------|
|         Desk |     111025 |              4 |                  {"color":"Red","material":"Wood"} |     color |    red |         4 |
|         Lamp |     111024 |              1 |                                    {"color":"Red"} |     color |    red |         3 |
|      T shirt |     111026 |              6 | {"color":"Red","size":"Small","material":"Cotton"} |     color |    red |        18 |

When using a GROUP BY clause, every field that is not grouped needs to be wrapped in an aggregate function (such as MAX, AVG or SUM) to tell the database how to collapse it for the grouped rows.

Because you grouped by a single field and specified an aggregate for only one other, your results are inherently unreliable and 'messy': for each group the server may return the values from any row, so you are essentially getting arbitrary results, in practice often determined by the stored order of the rows found.

Older MySQL versions do not enforce this requirement by default (the documentation even presents it as an extension to standard SQL), whereas other common databases such as SQL Server, PostgreSQL and Oracle would reject the query you wrote with a hard error. Strict checking can be enabled through the ONLY_FULL_GROUP_BY SQL mode, which has been on by default since MySQL 5.7.5, but turning it on will break many badly written legacy applications.
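The effect of a non-aggregated, non-grouped column can be reproduced in a few lines. This sketch uses Python's sqlite3 module, chosen only because SQLite also accepts such queries. The combos table and its values are invented for the example.

```python
import sqlite3

# Hypothetical table: one product with three different combination values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE combos (product_id INTEGER, combination TEXT);
    INSERT INTO combos VALUES
        (111026, 'red/small'),
        (111026, 'red/medium'),
        (111026, 'red/large');
""")

# "combination" is neither grouped nor aggregated: the engine returns the
# value from some row of the group, and which row is not guaranteed.
bare = conn.execute(
    "SELECT product_id, combination, COUNT(*) FROM combos GROUP BY product_id"
).fetchone()

# Explicit aggregates state how each column should be collapsed.
explicit = conn.execute(
    "SELECT product_id, MIN(combination), COUNT(DISTINCT combination), COUNT(*) "
    "FROM combos GROUP BY product_id"
).fetchone()

print(bare)
print(explicit)
```

The first query returns a single combination value even though the group contains three different ones. The second query makes the choice explicit and therefore repeatable.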

There is some relationship between MySQL's GROUP BY handling and your question, but the problem really originates in the combination column.

The GROUP BY includes the combination column, which holds a concatenated list. GROUP BY means that the database engine builds sets of rows in which the values of every grouped column are identical.

Your combination column contains:

{"color":"Red","material":"Wood"}
{"color":"Red"}
{"color":"Red","size":"Small","material":"Cotton"}
{"color":"Red","size":"Medium","material":"Cotton"}
{"color":"Red","size":"Large","material":"Cotton"}

Those are all unique values, and thus cause the opt_count to be 1.

To get around this, compute an opt_count based only on color = red in a derived table, and then join that back to the tables that hold the rest of the data you are interested in.

I believe this query returns what you want. Note that you don't get exactly one row back per product, because a product can have several different values in the combination column.

-- Outer query to return the product/option/attribute information, plus the count from the derived table
SELECT
 products.product_name, products.product_id,
 product_variant_combinations.combination_id, product_variant_combinations.combination, 
 product_variant_attributes.attribute, 
 product_variant_options.option,
 red_products.option_count

FROM product_variant_combinations

INNER JOIN product_variants
  ON product_variant_combinations.combination_id = product_variants.combination_id

INNER JOIN product_variant_ao_relation
  ON product_variants.ao_id = product_variant_ao_relation.ao_id

INNER JOIN product_variant_options
  ON product_variant_ao_relation.option_id = product_variant_options.option_id

INNER JOIN product_variant_attributes
  ON product_variant_ao_relation.attribute_id = product_variant_attributes.attribute_id

INNER JOIN products
  ON product_variant_combinations.product_id = products.product_id

INNER JOIN
(
    -- Inner table to count the distinct products with the color "red"
    SELECT COUNT(DISTINCT product_variant_combinations.product_id) AS option_count,
      product_variant_attributes.attribute_id,
      product_variant_options.option_id
    FROM product_variant_attributes

    INNER JOIN product_variant_ao_relation
      ON product_variant_attributes.attribute_id = product_variant_ao_relation.attribute_id

    INNER JOIN product_variant_options
      ON product_variant_ao_relation.option_id = product_variant_options.option_id

    INNER JOIN product_variants
      ON product_variant_ao_relation.ao_id = product_variants.ao_id

    INNER JOIN product_variant_combinations
      ON product_variants.combination_id = product_variant_combinations.combination_id

    WHERE product_variant_options.option = 'red'
    AND product_variant_attributes.attribute = 'color'

    GROUP BY product_variant_attributes.attribute_id, product_variant_options.option_id
) AS red_products
    ON product_variant_attributes.attribute_id = red_products.attribute_id
    AND product_variant_options.option_id = red_products.option_id
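
The derived-table idea above can be tried out on a reduced data set. This is a sketch under assumptions, not the query itself: it runs on Python's sqlite3 module instead of MySQL, the flattened variant_rows table, its option_name column, and the sample rows are hypothetical, and the derived table counts every (attribute, option) pair rather than filtering to color and red only.

```python
import sqlite3

# Hypothetical flattened data standing in for the joined tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE variant_rows (
        product_id     INTEGER,
        combination_id INTEGER,
        attribute      TEXT,
        option_name    TEXT
    );
    INSERT INTO variant_rows VALUES
        (111024, 1, 'color',    'red'),
        (111025, 4, 'color',    'red'),
        (111025, 4, 'material', 'wood'),
        (111026, 6, 'color',    'red'),
        (111026, 7, 'color',    'red');
""")

# The derived table "c" computes one count per (attribute, option) pair;
# joining it back attaches that count to every detail row without
# grouping the detail rows themselves.
detail = conn.execute("""
    SELECT v.product_id, v.combination_id, v.attribute, v.option_name,
           c.opt_count
      FROM variant_rows AS v
      JOIN (
            SELECT attribute, option_name,
                   COUNT(DISTINCT product_id) AS opt_count
              FROM variant_rows
             GROUP BY attribute, option_name
           ) AS c
        ON c.attribute = v.attribute
       AND c.option_name = v.option_name
     ORDER BY v.product_id, v.combination_id, v.attribute
""").fetchall()

for row in detail:
    print(row)
```

Every color/red row carries an opt_count of 3 regardless of how many combinations its product has, because the count is computed separately from the detail rows.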
Licensed under: CC-BY-SA with attribution