Frage

I figured out that you can get around the 61 table join limit in MySQL by using subqueries. https://stackoverflow.com/a/20134402/2843690

I'm trying to figure out how to easily use this in a program I'm working on to get a detailed product list from Magento (but I think the answer to this question could apply to a lot of situations where eav is involved). The tables that need to be joined look something like this:

catalog_product_entity
+-----------+----------------+
| entity_id | entity_type_id |
+-----------+----------------+
|         1 |              4 |
|         2 |              4 |
|         3 |              4 |
|         4 |              4 |
|         5 |              4 |
|         6 |              4 |
|         7 |              4 |
|         8 |              4 |
|         9 |              4 |
+-----------+----------------+

catalog_product_entity_int
+----------+----------------+--------------+-----------+-------+
| value_id | entity_type_id | attribute_id | entity_id | value |
+----------+----------------+--------------+-----------+-------+
|        1 |              4 |            2 |         1 |   245 |
|        2 |              4 |            3 |         1 |   250 |
|        3 |              4 |            4 |         1 |   254 |
|        4 |              4 |            2 |         2 |   245 |
|        5 |              4 |            3 |         2 |   249 |
|        6 |              4 |            4 |         2 |   253 |
|        7 |              4 |            2 |         3 |   247 |
|        8 |              4 |            3 |         3 |   250 |
|        9 |              4 |            4 |         3 |   254 |
+----------+----------------+--------------+-----------+-------+

eav_attribute
+--------------+----------------+----------------+--------------+
| attribute_id | entity_type_id | attribute_code | backend_type |
+--------------+----------------+----------------+--------------+
|            1 |              4 | name           | varchar      |
|            2 |              4 | brand          | int          |
|            3 |              4 | color          | int          |
|            4 |              4 | size           | int          |
|            5 |              4 | price          | decimal      |
|            6 |              4 | cost           | decimal      |
|            7 |              4 | created_at     | datetime     |
|            8 |              3 | name           | varchar      |
|            9 |              3 | description    | text         |
+--------------+----------------+----------------+--------------+

eav_attribute_option
+-----------+--------------+
| option_id | attribute_id |
+-----------+--------------+
|       245 |            2 |
|       246 |            2 |
|       247 |            2 |
|       248 |            3 |
|       249 |            3 |
|       250 |            3 |
|       251 |            4 |
|       252 |            4 |
|       253 |            4 |
|       254 |            4 |
+-----------+--------------+

eav_attribute_option_value
+----------+-----------+-------------------+
| value_id | option_id |       value       |
+----------+-----------+-------------------+
|       15 |       245 | Fruit of the Loom |
|       16 |       246 | Hanes             |
|       17 |       247 | Jockey            |
|       18 |       248 | White             |
|       19 |       249 | Black             |
|       20 |       250 | Gray              |
|       21 |       251 | Small             |
|       22 |       252 | Medium            |
|       23 |       253 | Large             |
|       24 |       254 | Extra Large       |
+----------+-----------+-------------------+

The program that I'm writing generated sql queries that looked something like this:

SELECT cpe.entity_id
, brand_int.value as brand_int, brand.value as brand
, color_int.value as color_int, color.value as color
, size_int.value as size_int, size.value as size

FROM catalog_product_entity as cpe

LEFT JOIN catalog_product_entity_int as brand_int
ON (cpe.entity_id = brand_int.entity_id
AND brand_int.attribute_id = 2)
LEFT JOIN eav_attribute_option as brand_option
ON (brand_option.attribute_id = 2
AND brand_int.value = brand_option.option_id)
LEFT JOIN eav_attribute_option_value as brand
ON (brand_option.option_id = brand.option_id)

LEFT JOIN catalog_product_entity_int as color_int
ON (cpe.entity_id = color_int.entity_id
AND color_int.attribute_id = 3)
LEFT JOIN eav_attribute_option as color_option
ON (color_option.attribute_id = 3
AND color_int.value = color_option.option_id)
LEFT JOIN eav_attribute_option_value as color
ON (color_option.option_id = color.option_id)

LEFT JOIN catalog_product_entity_int as size_int
ON (cpe.entity_id = size_int.entity_id
AND size_int.attribute_id = 4)
LEFT JOIN eav_attribute_option as size_option
ON (size_option.attribute_id = 4
AND size_int.value = size_option.option_id)
LEFT JOIN eav_attribute_option_value as size
ON (size_option.option_id = size.option_id)
;

It was relatively easy to write the code to generate the query, and the query was fairly easy to understand; however, it's pretty easy to hit the 61 table join limit, which I did with my real-life data. I believe the math says 21 integer-type attributes would go over the limit, and that is before I even start adding varchar, text, and decimal attributes.

So the solution I came up with was to use subqueries to overcome the 61 table limit.

One way to do it is to group the joins in subqueries of 61 joins. And then all of the groups would be joined. I think I can figure out what the sql queries should look like, but it seems difficult to write the code to generate the queries. There is a further (albeit theoretical) problem in that one could again violate the 61 table limit if there were enough attributes. In other words, if I have 62 groups of 61 tables, there will be a MySQL error. Obviously, one could get around this by then grouping the groups of groups into 61. But that just makes the code even more difficult to write and understand.

I think the solution I want is to nest subqueries within subqueries such that each subquery is using a single join of 2 tables (or one table and one subquery). Intuitively, it seems like the code would be easier to write for this kind of query. Unfortunately, thinking about what these queries should look like is making my brain hurt. That's why I need help.

What would such a MySQL query look like?

War es hilfreich?

Lösung

You're right that joining too many attributes through an EAV design is likely to exceed the limit of joins. Even before that, there's probably a practical limit of joins because the cost of so many joins gets higher and higher geometrically. How bad this is depends on your server's capacity, but it's likely to be quite a bit lower than 61.

So querying an EAV data model to produce a result as if it were stored in a conventional relational model (one column per attribute) is problematic.

Solution: don't do it with a join per attribute, which means you can't expect to produce the result in a conventional row-per-entity format purely with SQL.

I'm not intimately familiar with the Magento schema, but I can infer from your query that something like this might work:

SELECT cpe.entity_id
, o.value AS option
, v.value AS option_value
FROM catalog_product_entity AS cpe
INNER JOIN catalog_product_entity_int AS i 
  ON cpe.entity_id = i.entity_id AND i.attribute_id IN (2,3,4)
INNER JOIN eav_attribute_option AS o 
  ON i.value = o.option_id AND i.attribute_id = o.attribute_id
INNER JOIN eav_attribute_option_value AS v
  ON v.option_id = o.option_id;

The IN(2,3,4,...) predicate is where you specify multiple attributes. There's no need to add more joins to get more attributes. They're simply returned as rows rather than columns.

This means you have to write application code to fetch all the rows of this result set and map them into fields of a single object.

From comments by @Axel, is sounds like Magento provides helper functions to do this consuming of a result set and mapping it into an object.

Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow
scroll top